ML Systems Engineer: Cloud‑Scale Training Infra
Basis Research Institute
A nonprofit AI research organization in New York City seeks a full-time ML Systems Engineer. This role involves managing distributed training infrastructure, debugging complex issues, and optimizing cloud resources to enhance operational efficiency. Ideal candidates will have expertise in ML systems and cloud administration. Join a team focused on solving impactful problems through advanced AI infrastructure.
$180k - $230k
...Harnham is looking for a Senior Machine Learning Engineer to join their AI-driven technology company, focused on building large-scale ML systems. The role emphasizes production ML over research, involving designing, training, and deploying ML models. Responsibilities... (Training, Remote work)
- ...problems. This means expanding the scale, complexity, and breadth of... ...human values first. About the Role ML Systems Engineers at Basis ensure training and evaluation infrastructure is fast... ...distributed training frameworks through cloud administration, making it possible... (Training, Full time, Work at office)
- Prsala is looking for a reliable Systems Administrator to manage and maintain their infrastructure and IT systems. This role supports a... ...stable, secure, and monitored. Responsibilities include managing cloud infrastructure, handling IAM, and implementing security best practices... (Suggested, Remote job, Flexible hours)
$250k - $350k
...Applied ML Systems Engineer - Finance - NEW YORK - UNITED... ...GPU kernels trying to shave training time. Other weeks you'll be... ...machine" and "it works at scale, reliably, for months" - I must... ...Brain, DeepMind, Ads ML, Infra); Meta (FAIR, Infra, Recsys)... (Training, Permanent employment, Full time, Work experience placement, Internship, Immediate start, Remote work, Relocation, Relocation package)
- Gritt Robotics Inc is seeking a Software - ML & Cloud Infrastructure Engineer to design scalable cloud infrastructure for AI and data pipelines. Join... ...product evolution and develop high-performance ML systems. The ideal candidate has 4+ years of experience in deploying... (Suggested)
$216.7k - $303.4k
...Senior Machine Learning Systems Engineer Remote - United States... ...You’ll Do As a Senior ML Infrastructure... ...a platform for large scale ML models at Reddit. Design... ...including improving model training time, efficiency, and... ...Deep experience with cloud‑based technologies for... (Training, Remote work)
- ...Machine Learning Systems Engineer, RL Engineering San Francisco, CA | New... ...cutting-edge systems that train AI models like Claude. You're... ...reliable and steerable AI. As an ML Systems Engineer on our... ...High performance, large scale distributed systems Large... (Training, Work at office, Visa sponsorship, Flexible hours)
$320k - $405k
...Machine Learning Systems Engineer, Research Tools San Francisco, CA | New... ...more efficient and effective training of our AI systems while... ...systems, data pipelines, or ML infrastructure Are proficient... ...cohesive team on just a few large-scale research efforts. And we... (Training, Work at office, Visa sponsorship, Flexible hours)
- A dynamic technology firm in New York is seeking a talented Senior/Staff level Systems Engineer to develop and scale a dedicated cloud for CI workloads. The role offers an opportunity to solve complex systems problems and build a new CI cloud from the ground up. Candidates...
- ...partner with research and infra to prototype, train, and deploy state-of-the-art... .... Squeeze silicon — scale training and inference for... ...PyTorch. Proven software engineer who loves ML; comfortable writing... ...especially user-facing, online ML systems—despite shifting... (Training, Full time, Contract work, Flexible hours, Shift work)
- .... You will own key infrastructure that powers clinical decision-making. Responsibilities include scaling AWS infrastructure, designing backend services, and ensuring system reliability. Candidates should have experience with distributed architectures and proficiency in...
- Modal Labs is seeking strong engineers to train production machine learning models and contribute to open-source projects. Candidates should have experience with high-performance code and ML training optimization, working in our NYC or San Francisco offices. Ideal applicants... (Training)
$189.6k - $237k
...Scale's ML platform (RLXF) team builds our internal distributed framework... ...for large language model training and inference. The platform... ...to optimize our ML system Ideally you'd have:... ...systems Strong software engineering skills, proficient in frameworks... (Training, Full time)
$175k - $250k
...Senior Machine Learning Engineer (ML Infrastructure & Data Systems) Our client is an early-stage robotics and... ...environments and is now entering a rapid scaling phase. Their approach emphasizes... ...loops between deployment and model training. They are building toward large-... (Training)
$250k - $350k
...function of our society. At Scale, our mission is to... ...state of the art post-training algorithms to reach the... ...The Enterprise ML Research Lab works on the... ...As an ML Sys Research Engineer, you'll work on building... ...technologies to optimize our ML system. Your customer will be... (Training, Full time)
- ...a Senior Machine Learning Engineer - Training Platform in Australia. You... ...building the foundational systems that power large-scale model training across a global... ...research scientists, ML engineers, and product teams... ...across infrastructure, cloud, and applied AI teams to solve... (Training, Remote work, Flexible hours)
- Reflection, based in New York, is seeking an experienced professional to build and scale distributed training systems for frontier model pre-training. You will work closely with research teams to design large-scale training runs and optimize training efficiency across... (Training)
- ...Lightning-Ai is seeking a Platform Support Engineer to support ML engineers running large-scale workloads. This role involves diagnosing complex systems issues and providing guidance to... ...a strong background in Kubernetes and cloud infrastructure. The position is remote... (Remote work)
- ...Senior GPU Systems / AI Infrastructure Engineer (NYC) Location: New York City (Hybrid... ...A-C / high-growth AI infra) About the Role We’re... ...infrastructure powering large-scale model training and inference. This role... ...Collaborate closely with ML researchers and infra... (Training, Permanent employment)
- ...construction of large-scale infrastructure around the globe. Gritt’s systems are already deployed commercially... ...VCs. Role: Software - ML & Cloud Infrastructure Location... ...& Cloud Infrastructure Engineer to join our team. As an... ...and deploy scalable AI training and validation... (Training)
$152k - $272.25k
...Principal Machine Learning Engineer, ML Platform and Systems Architecture POSITION... ...design and evolution of large-scale machine learning platforms... ...capabilities across training, evaluation, deployment, and... ...distributed data processing, and cloud-native platform... (Training, Remote work)
$140k - $210k
A technology company in New York is seeking a skilled engineer to develop state-of-the-art machine learning solutions. The role involves training and deploying models that influence energy infrastructure management. Candidates should have strong Python skills and experience... (Training, Full time)
- ...Role: AI/ML Azure Engineer Duration : Full Time / Contact W2... ..., including data ingestion, training, evaluation, and deployment.... ...infrastructure needs and ensure AI systems are robust, scalable, and... ...Work on optimizing and scaling existing models and algorithms... (Training, Full time, Work experience placement)
- ...with multiple database systems (Teradata, HIVE, SQL Server... ...Snowflake) including Cloud system, both on prem... ...candidate to work on data engineering pipelines using Spark... ...design to implementation to training to deployment of models... ...machine learning tools ML Flow, Databricks,... (Training, Worldwide)
- ...Machinify is looking for a Sr/Director of Engineering to lead our AI/ML Engineering team in the United States. You will oversee a team of engineers... ...the core AI/ML platform and ensure its reliability at scale. The ideal candidate will have extensive experience in backend... (Remote work)
$141.1k - $262.1k
F. Hoffmann-La Roche AG is seeking a motivated ML Engineer for its Genentech team in New York. The role focuses on designing and maintaining ML infrastructure to support drug discovery initiatives. The ideal candidate will have a strong background in AWS, Python, and C++...
- ...Lead Systems Engineer (Rust) - AI Platform About the Role What if... ...infrastructure that runs at scale. This is a fully remote contract... ...teams to support model training and evaluation workflows Lead... ...Familiarity with AI/ML workflows, model training, or... (Training, Hourly pay, Ongoing contract, Contract work, Remote work)
$156.5k - $181k
...FinTech) is seeking an experienced Lead Cloud Systems Engineer (Microsoft 365, AWS, Collaboration... ...executes transactions on an extraordinary scale which has bolstered liquidity in the... ...MAM Solutions. Strong communication and training skills for helpdesk enablement and... (Training, Full time, H1b, Work at office, Local area, Remote work)
$197.3k - $225.1k
...Lead AI/ML Engineer (Platform, kubeflow) Overview... ...responsible and reliable AI systems, changing banking for... ...foundation model training, large language model... ...throughput — of large scale production AI systems.... ...responsible AI solutions on cloud platforms (e.g. AWS,... (Training, Full time, Part time, Local area)
- ...based in Ann Arbor, Michigan, is looking for Machine Learning Engineers to enhance its machine learning capabilities. The ideal... ...2 years of relevant experience. Responsibilities include training and optimizing ML models, working with cross-functional teams to ensure quality... (Training, Flexible hours)
- senior ml engineer New York, NY
- data scientist machine learning engineer New York, NY
- machine learning ai engineer New York, NY
- junior machine learning research engineer New York, NY
- computer vision machine learning engineer New York, NY
- graduate machine learning engineer New York, NY
- machine learning software engineer New York, NY
- ai ml engineer New York, NY
- machine learning engineer New York, NY
- healthcare systems engineer New York, NY

