
ML Systems Engineer: Cloud‑Scale Training Infra

Basis Research Institute

A nonprofit AI research organization in New York City seeks a full-time ML Systems Engineer. This role involves managing distributed training infrastructure, debugging complex issues, and optimizing cloud resources to enhance operational efficiency. Ideal candidates will have expertise in ML systems and cloud administration. Join a team focused on solving impactful problems through advanced AI infrastructure.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the ML Systems Engineer: Cloud‑Scale Training Infra in New York, NY vacancy
  • $180k - $230k

     ...Harnham is looking for a Senior Machine Learning Engineer to join their AI-driven technology company, focused on building large-scale ML systems. The role emphasizes production ML over research, involving designing, training, and deploying ML models. Responsibilities... 
    Training
    Remote work

    Harnham

    New York, NY
    6 days ago
  •  ...problems . This means expanding the scale, complexity, and breadth of...  ...human values first. About the Role ML Systems Engineers at Basis ensure training and evaluation infrastructure is fast...  ...distributed training frameworks through cloud administration, making it possible... 
    Training
    Full time
    Work at office

    Basis Research Institute

    New York, NY
    1 day ago
  • Prsala is looking for a reliable Systems Administrator to manage and maintain their infrastructure and IT systems. This role supports a...  ...stable, secure, and monitored. Responsibilities include managing cloud infrastructure, handling IAM, and implementing security best practices... 
    Suggested
    Remote job
    Flexible hours

    Prsala

    New York, NY
    1 day ago
  • $250k - $350k

     ...Applied ML Systems Engineer  - Finance - NEW YORK - UNITED...  ...GPU kernels trying to shave training time. Other weeks you'll be...  ...machine" and "it works at scale, reliably, for months" - I must...  ...Brain, DeepMind, Ads ML, Infra); Meta (FAIR, Infra, Recsys)... 
    Training
    Permanent employment
    Full time
    Work experience placement
    Internship
    Immediate start
    Remote work
    Relocation
    Relocation package
    New York, NY
    2 days ago
  • Gritt Robotics Inc is seeking a Software - ML & Cloud Infrastructure Engineer to design scalable cloud infrastructure for AI and data pipelines. Join...  ...product evolution and develop high-performance ML systems. The ideal candidate has 4+ years of experience in deploying... 
    Suggested

    Gritt Robotics Inc

    Brooklyn, NY
    1 day ago
  • $216.7k - $303.4k

     ...Senior Machine Learning Systems Engineer Remote - United States...  ...You’ll Do As a Senior ML Infrastructure...  ...a platform for large scale ML models at Reddit. Design...  ...including improving model training time, efficiency, and...  ...Deep experience with cloud‑based technologies for... 
    Training
    Remote work

    Reddit

    New York, NY
    1 day ago
  •  ...Machine Learning Systems Engineer, RL Engineering San Francisco, CA | New...  ...cutting-edge systems that train AI models like Claude. You're...  ...reliable and steerable AI. As an ML Systems Engineer on our...  ...High performance, large scale distributed systems Large... 
    Training
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    New York, NY
    5 days ago
  • $320k - $405k

     ...Machine Learning Systems Engineer, Research Tools San Francisco, CA | New...  ...more efficient and effective training of our AI systems while...  ...systems, data pipelines, or ML infrastructure Are proficient...  ...cohesive team on just a few large-scale research efforts. And we... 
    Training
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    New York, NY
    5 days ago
  • A dynamic technology firm in New York is seeking a talented Senior/Staff level Systems Engineer to develop and scale a dedicated cloud for CI workloads. The role offers an opportunity to solve complex systems problems and build a new CI cloud from the ground up. Candidates... 

    Crossing Hurdles

    New York, NY
    3 days ago
  •  ...partner with research and infra to prototype, train, and deploy state-of-the-art...  .... Squeeze silicon — scale training and inference for...  ...PyTorch. Proven software engineer who loves ML; comfortable writing...  ...especially user-facing, online ML systems—despite shifting... 
    Training
    Full time
    Contract work
    Flexible hours
    Shift work

    SESAME

    New York, NY
    4 days ago
  •  .... You will own key infrastructure that powers clinical decision-making. Responsibilities include scaling AWS infrastructure, designing backend services, and ensuring system reliability. Candidates should have experience with distributed architectures and proficiency in... 

    Anterior

    New York, NY
    4 days ago
  • Modal Labs is seeking strong engineers to train production machine learning models and contribute to open-source projects. Candidates should have experience with high-performance code and ML training optimization, working in our NYC or San Francisco offices. Ideal applicants... 
    Training

    Modal Labs

    New York, NY
    1 day ago
  • $189.6k - $237k

     ...Scale's ML platform (RLXF) team builds our internal distributed framework...  ...for large language model training and inference. The platform...  ...to optimize our ML system Ideally you'd have:...  ...systems Strong software engineering skills, proficient in frameworks... 
    Training
    Full time

    Scale AI

    New York, NY
    2 days ago
  • $175k - $250k

     ...Senior Machine Learning Engineer (ML Infrastructure & Data Systems) Our client is an early-stage robotics and...  ...environments and is now entering a rapid scaling phase. Their approach emphasizes...  ...loops between deployment and model training. They are building toward large-... 
    Training

    Right Hand Talent

    Brooklyn, NY
    5 days ago
  • $250k - $350k

     ...function of our society. At Scale, our mission is to...  ...state of the art post-training algorithms to reach the...  ...The Enterprise ML Research Lab works on the...  ...As an ML Sys Research Engineer, you'll work on building...  ...technologies to optimize our ML system. Your customer will be... 
    Training
    Full time

    Scale AI

    New York, NY
    5 days ago
  •  ...a Senior Machine Learning Engineer - Training Platform in Australia. You...  ...building the foundational systems that power large-scale model training across a global...  ...research scientists, ML engineers, and product teams...  ...across infrastructure, cloud, and applied AI teams to solve... 
    Training
    Remote work
    Flexible hours

    Jobgether

    New York, NY
    1 day ago
  • Reflection, based in New York, is seeking an experienced professional to build and scale distributed training systems for frontier model pre-training. You will work closely with research teams to design large-scale training runs and optimize training efficiency across... 
    Training

    Reflection

    New York, NY
    3 days ago
  •  ...Lightning AI is seeking a Platform Support Engineer to support ML engineers running large-scale workloads. This role involves diagnosing complex systems issues and providing guidance to...  ...a strong background in Kubernetes and cloud infrastructure. The position is remote... 
    Remote work

    Lightning AI

    New York, NY
    6 days ago
  •  ...Senior GPU Systems / AI Infrastructure Engineer (NYC) Location: New York City (Hybrid...  ...A-C / high-growth AI infra) About the Role We’re...  ...infrastructure powering large-scale model training and inference. This role...  ...Collaborate closely with ML researchers and infra... 
    Training
    Permanent employment
    New York, NY
    a month ago
  •  ...construction of large-scale infrastructure around the globe. Gritt’s systems are already deployed commercially...  ...VCs. Role: Software - ML & Cloud Infrastructure Location...  ...& Cloud Infrastructure Engineer to join our team. As an...  ...and deploy scalable AI training and validation... 
    Training

    Gritt Robotics Inc

    Brooklyn, NY
    1 day ago
  • $152k - $272.25k

     ...Principal Machine Learning Engineer, ML Platform and Systems Architecture POSITION...  ...design and evolution of large-scale machine learning platforms...  ...capabilities across training, evaluation, deployment, and...  ...distributed data processing, and cloud-native platform... 
    Training
    Remote work

    Autodesk

    New York, NY
    1 day ago
  • $140k - $210k

    A technology company in New York is seeking a skilled engineer to develop state-of-the-art machine learning solutions. The role involves training and deploying models that influence energy infrastructure management. Candidates should have strong Python skills and experience... 
    Training
    Full time

    Treeswift Inc

    New York, NY
    3 days ago
  •  ...Role: AI/ML Azure Engineer Duration: Full Time / Contract W2...  ..., including data ingestion, training, evaluation, and deployment....  ...infrastructure needs and ensure AI systems are robust, scalable, and...  ...Work on optimizing and scaling existing models and algorithms... 
    Training
    Full time
    Work experience placement

    ACI Infotech

    New York, NY
    2 days ago
  •  ...with multiple database systems (Teradata, HIVE, SQL Server...  ...Snowflake) including Cloud system, both on prem...  ...candidate to work on data engineering pipelines using Spark...  ...design to implementation to training to deployment of models...  ...machine learning tools ML Flow, Databricks,... 
    Training
    Worldwide

    TriOptus LLC

    New York, NY
    1 day ago
  •  ...Machinify is looking for a Sr/Director of Engineering to lead our AI/ML Engineering team in the United States. You will oversee a team of engineers...  ...the core AI/ML platform and ensure its reliability at scale. The ideal candidate will have extensive experience in backend... 
    Remote work

    Machinify, Inc.

    New York, NY
    4 days ago
  • $141.1k - $262.1k

    F. Hoffmann-La Roche AG is seeking a motivated ML Engineer for its Genentech team in New York. The role focuses on designing and maintaining ML infrastructure to support drug discovery initiatives. The ideal candidate will have a strong background in AWS, Python, and C++... 

    F. Hoffmann-La Roche AG

    New York, NY
    3 days ago
  •  ...Lead Systems Engineer (Rust) - AI Platform About the Role What if...  ...infrastructure that runs at scale. This is a fully remote contract...  ...teams to support model training and evaluation workflows Lead...  ...Familiarity with AI/ML workflows, model training, or... 
    Training
    Hourly pay
    Ongoing contract
    Contract work
    Remote work

    Alignerr

    New York, NY
    1 day ago
  • $156.5k - $181k

     ...FinTech) is seeking an experienced Lead Cloud Systems Engineer (Microsoft 365, AWS, Collaboration...  ...executes transactions on an extraordinary scale which has bolstered liquidity in the...  ...MAM Solutions. Strong communication and training skills for helpdesk enablement and... 
    Training
    Full time
    H1b
    Work at office
    Local area
    Remote work

    U.S. Financial Technology

    New York, NY
    1 day ago
  • $197.3k - $225.1k

     ...Lead AI/ML Engineer (Platform, kubeflow) Overview...  ...responsible and reliable AI systems, changing banking for...  ...foundation model training, large language model...  ...throughput — of large scale production AI systems....  ...responsible AI solutions on cloud platforms (e.g. AWS,... 
    Training
    Full time
    Part time
    Local area

    Capital One

    New York, NY
    3 days ago
  •  ...based in Ann Arbor, Michigan, is looking for Machine Learning Engineers to enhance its machine learning capabilities. The ideal...  ...2 years of relevant experience. Responsibilities include training and optimizing ML models, working with cross-functional teams to ensure quality... 
    Training
    Flexible hours

    May Mobility

    New York, NY
    1 day ago
