Software Engineer, ML Systems & Training Architecture

$295k - $380k

United States Digital Space LLC

About the Team

The company Robotics team is focused on unlocking general-purpose robotics and pushing towards AGI-level intelligence in dynamic, real-world settings. Working across the entire model stack, we integrate cutting‑edge hardware and software to explore a broad range of robotic form factors. We strive to seamlessly blend high‑level AI capabilities with the constraints of physical systems to improve people's lives.

About the Role

As a Senior Software Engineer, ML Systems & Training Infrastructure, you will be a deeply hands‑on engineering force multiplier for the robotics team. You will help keep the training framework and surrounding infrastructure healthy, review and improve code quickly, debug failures across ML systems and infrastructure, and unblock researchers and engineers when the path from idea to working training job gets rough.

We’re looking for people who love writing, reading, reviewing, and fixing code; who can get productive quickly in unfamiliar systems; and who bring strong practical judgment without a lot of ego or process overhead.

This role is based in San Francisco, CA. We expect you to be in the office 5 days per week, and we offer relocation assistance to new employees.

In this role, you will:
  • Review, improve, and clean up code across training frameworks and adjacent infrastructure.
  • Identify risky or low‑quality changes before they land, and raise the code quality bar without slowing the team down.
  • Debug issues across ML training systems, GPUs, clusters, networking, and related infrastructure.
  • Help researchers and engineers unblock broken training jobs, flaky workflows, and brittle internal tooling.
  • Improve the reliability, maintainability, and usability of the robotics team’s training framework.
  • Move quickly on practical engineering problems that directly affect team velocity.

You might thrive in this role if you:
  • Have strong software engineering fundamentals and excellent code review judgment.
  • Have experience with ML systems, training frameworks, GPUs, distributed systems, infrastructure, or similarly complex technical environments.
  • Read and debug unfamiliar codebases quickly, and enjoy getting to root cause.
  • Ship high‑quality code with strong velocity and pragmatic judgment.
  • Are low‑ego, responsive, and motivated by helping researchers and engineers move faster.
  • Prefer being a highly effective hands‑on IC over driving broad process‑heavy initiatives.
  • Have experience reviewing messy, fast‑moving, or AI‑generated codebases.

Compensation Range

$295K – $380K USD

Equal Opportunity

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

Vacancy posted 3 days ago