ML Infrastructure Engineer — Large-Scale AI Systems
Causal Labs
A leading AI research organization in San Francisco seeks an Infrastructure Engineer to design and maintain large distributed ML training and inference clusters. The ideal candidate will have a strong grasp of optimizing training workloads and experience with distributed training frameworks like FSDP and DeepSpeed. Proficiency in cloud platforms and containerization is essential. Join us to tackle unsolved problems and shape the future of AI.
$320k - $405k
...interpretable, and steerable AI systems. We want AI to be... ...researchers, engineers, policy experts,... ...Machine Learning Infrastructure Engineer to join... ...you'll build and scale the critical infrastructure... ...machine learning, large-scale distributed... ...and implement ML infrastructure...
Work at office, Visa sponsorship, Flexible hours
- ...We're looking for an experienced HPC infrastructure engineer to lead bringup, administration, and... ...on is probably the largest anime AI training cluster in the world. You... ...get to combine your love of anime and large-scale GPU systems. You're familiar with the modern...
  Work at office, Visa sponsorship
$100k - $200k
Coval Simulation & Evaluation that scales voice and chat AI agents ML‑Infrastructure Engineer Salary $100K - $200K Equity 0.2... ...foundations, the queuing systems, the monitoring patterns are in... ...tasks. You'll develop and architect large parts of our compute infrastructure...
Full time, Live in, Work at office
- ...advanced hardware engineering and AI solutions. Our... ...Machine Learning Infrastructure Engineer to join... ...design, build, and scale infrastructure to... ...production-grade ML ecosystem to support... ...Design and build systems ML cloud... ...software engineering, large-scale data infrastructure...
  Flexible hours
- A leading AI technology firm in San Francisco is seeking an experienced ML Systems Engineer focused on developing and optimizing machine learning pipelines for robotics and... ...offers competitive compensation and a supportive work environment. Scale AI, Inc.
- A leading AI company in San Francisco is seeking a skilled ML Infrastructure Engineer to manage and optimize large-scale training systems. In this role, you will design and maintain infrastructure for model training, ensuring efficient GPU/TPU utilization while working...
- ...are seeking a Data Infrastructure Engineer to build and... ...bar for production systems. You will define clear... ...and product usage scale. What You'll Do... ...scalable data and ML infrastructure on... ...search, indexing, and large-scale querying... ...sensing, data, and AI systems with real-...
  Permanent employment, Full time
- ...ML Engineer At Krea, we are building next-generation AI creative tools. We are dedicated to making AI intuitive... ...and recommendation systems from scratch. You'll... ...points Experience with large-scale data systems and production ML infrastructure Prior work on or...
  H1b
- A pioneering AI firm based in San Francisco is seeking a Research Engineer, Distributed Data Systems. In this role, you will design and maintain infrastructure for large-scale multimodal training, ensuring scalability and reliability of data systems. Candidates should...
  Work at office, Relocation package
- A leading AI research company in San Francisco seeks Senior/Staff Engineers skilled in distributed systems and large-scale ML training. Responsibilities include designing systems optimized for low-bandwidth conditions and implementing robust training strategies. Ideal candidates...
  Remote job
- A leading AI technology company in San Francisco is... ...looking for a Senior Software Engineer to build scalable infrastructure for large‑scale training and fine-tuning... ...distributed training systems and optimize GPU... ...years of experience in ML infrastructure and a strong...
$248.8k - $311k
...Scale's Physical AI business unit is dedicated to solving the... ...AI and developing ML pipelines for processing... ...Role As an ML Systems Engineer on the Physical AI team... ...experience building large-scale, high-... ...in machine learning infrastructure. Algorithm Optimization...
Full time
$250k - $325k
...runs on the same infrastructure: agreements between... ...We're building the AI that finally... ...last 12 months. Engineering at Ivo Engineers... ...agentic RAG [2023] • Large-scale LLM-based legal... ...strategies to isolate ML vs API workloads... ...in distributed systems Experience managing...
Contract work, Work at office, Remote work
- ...is looking for skilled engineers to work on autonomous R&D systems in machine learning. You... ...design experiments, build infrastructure, and implement systems that perform reliably in large-scale ML settings. The ideal... ...research background in AI or related fields. This...
  Full time
$189.6k - $237k
...Scale's ML platform (RLXF) team builds our internal distributed framework for large language model training and inference.... ...heart of the field of AI as an indispensable... ...to optimize our ML system Ideally you'd have... ...Strong software engineering skills, proficient in...
Full time
- Xterraai is looking for an ML Software Engineer to help build innovative AI agents capable of tackling complex scientific challenges. The position involves designing and developing systems that support cutting-edge research in geospatial and geophysics intelligence. The...
- ...Build the data infrastructure for robots operating... ...be improved, engineers rely on data to... ...and autonomous systems teams to ingest... ...looking for a ML Platform... ...design, deploy, and scale the systems that... ...pipelines over large, heterogeneous... ...vehicles, or physical AI workflows...
  Remote work
- Reducto, Inc. is seeking an Infrastructure Engineer to design, build, and maintain scalable infrastructure for AI and ML workloads. The role involves automating cloud infrastructure and implementing robust monitoring systems to ensure reliability. With a requirement of...
$131.76k - $161.06k
...Software Engineer ESnet delivers high-bandwidth,... ...s Integrated Research Infrastructure. As part of ESnet's Pilots... ...into production systems, and independently delivering... ...issues and simulating large-scale deployments.... ...experience in applying AI tools and agentic workflows...
Full time, Work at office, Remote work
$250k - $380k
...time Department Scaling Compensation $2... ...and inference infrastructure that powers frontier... ...scale. Our systems unify how... ...looking for an engineer to design and implement... ...across large fleets of machines... ...glamorous) part of the ML stack. Bonus... ...OpenAI is an AI research and...
Full time, Work at office, Local area, Relocation package, Flexible hours
- A decentralized AI platform company in the United... ...an experienced ML Training Platform Engineer to design and build robust infrastructure for ML training. The... ...deployments and distributed systems. Responsibilities... ...essential for enabling large-scale, collaborative AI...
- ...ML Infrastructure Engineer Spectral Labs is a spatial intelligence company building... ...for engineering physical systems. Our model SGS-1 is state-of... ...the cutting edge of applied AI at Meta, Autodesk Research and... ...multi-node training at scale Deep understanding of profiler...
- ...blockchain analytics and AI solutions to help law... ...build a safer financial system for billions of people around... ...at an unprecedented scale. As a Senior Software Engineer, ML Infrastructure at TRM Labs, you will collaborate... ...serving patterns for large-scale models. Implement...
  Worldwide
- Whatnot is seeking an AI/ML Platform Engineer to shape the future of machine learning within a fast-growing livestream shopping platform. In this role, you'll design and scale systems that support various business functions, prototype novel architectures, and build robust...
  Remote job
- Andiamo is seeking a Member of Technical Staff specializing in AI/ML Engineering in San Francisco. This role involves building intelligent systems to modernize financial operations, developing machine learning applications, and collaborating with cross-functional teams....
  Flexible hours
$117.2k - $313.7k
...Category Software Engineering Job Details About... ...Salesforce is the #1 AI CRM, where humans with... ...Salesforce. Distributed Systems Software Engineer -... ...innovating and maintaining a large scale distributed systems... ...Deliver cloud infrastructure automation tools, frameworks...
$100k - $200k
Voiceflow is seeking a skilled ML-Infrastructure Engineer in San Francisco to architect and operate auto-scaling systems for our voice AI simulation platform. The role includes optimizing GPU and compute infrastructure, ensuring high performance and reliability. Ideal...
Work at office
$50 - $80 per hour
...Are you a network engineer who's curious about data... ...at the intersection of infrastructure and intelligence? At... ...integrated networking systems and now we're using the... ...are built and used in AI systems. In this... ...schemas and structure for large-scale data pipelines You'...
Hourly pay, Contract work
- A healthcare technology firm in San Francisco is seeking an ML Infrastructure Engineer, Model Inference to build and optimize AI-driven solutions. You will design scalable Kubernetes clusters, enhance ML model serving infrastructure, and collaborate with cross-functional...
$250k
...Consulting Ltd is looking for a talented ML/AI Research Engineer to join their San Francisco team. You... ...responsible for designing and managing the infrastructure that powers training, deployment, and governance of large-scale AI systems. The ideal candidate has a strong...

