ML Platform Engineer
Foxglove Technologies, Inc
Build the data infrastructure that powers physical AI.
Physical AI is moving from research labs into production fleets across industries. As robots scale across the real world, from factories to vehicles to defense, every workflow from product development to deployment becomes a data problem: what happened, when, on which robot, and why? At Foxglove, we built the unified data platform for physical AI that developer and engineering teams use to answer those questions. We help teams make vast quantities of robotics data actionable, creating the data flywheel they need to develop, test, train, deploy, and operate robots with confidence.
About the Role
We're looking for an ML Platform Engineer with deep infrastructure instincts to help design, deploy, and scale the systems that power Foxglove's data platform. This is a platform-first role: you'll own the infrastructure layer that makes ML possible in production, not just the models that run on top of it. You'll be responsible for the reliability, scalability, and performance of the ML platform itself, from inference serving and pipeline orchestration to training infrastructure and evaluation frameworks. The problems are real and urgent: petabyte-scale multimodal robotics data, high-throughput retrieval and embedding pipelines, and the internal ML flywheel that lets our team ship fast. This is a hands-on infrastructure role, not research.
Key Responsibilities
- Design, deploy, and operate production inference infrastructure - including model serving, autoscaling, load balancing, and cost optimization across cloud environments
- Own the platform architecture for embedding and retrieval pipelines that power semantic search over multimodal robotics data (image, video, point cloud, and timeseries)
- Build and maintain the training and evaluation infrastructure that enables rapid iteration on model performance - including job orchestration, experiment tracking, and dataset versioning
- Drive cloud infrastructure decisions (AWS/GCP) that directly impact latency, throughput, reliability, and cost at scale
- Define platform abstractions and internal tooling that let product engineers ship ML-powered features without needing to manage infrastructure themselves
- Evaluate, integrate, and operationalize third-party ML infrastructure components; establish clear build vs. buy frameworks for the team
- Deep, hands-on experience owning production ML infrastructure: inference serving, model optimization (e.g., vLLM, Triton, TorchServe), orchestration, and cloud cost management
- Strong foundation in distributed systems and cloud infrastructure (AWS/GCP) - you think in terms of system reliability, failure modes, and operational burden, not just model accuracy
- Experience architecting and operating retrieval systems at scale, including vector databases (e.g., Pinecone, Lance, turbopuffer, pgvector) and embedding pipelines over large, heterogeneous datasets
- A platform engineer's mindset: you build systems that other engineers depend on, and you take that responsibility seriously
- Proven ability to operate with high ownership - you can make hard infrastructure tradeoffs independently and move fast without breaking things
- Strong communication skills; you can explain infrastructure tradeoffs clearly to both ML and non-ML engineers
- Familiarity with fine-tuning and domain adaptation techniques for LLMs or embedding models (e.g., SFT, PEFT)
- Familiarity with data mining or hybrid search workflows, especially as applied in robotics, autonomous vehicles, or physical AI
- Prior experience building ML platforms, evaluation frameworks, or data management tooling from the ground up
- $300 monthly budget towards commuter benefits or building your personal workspace (remote only)
- Competitive equity grant in a Series B company
- Medical, Dental, Vision, and Term Life insurance coverage at 100% for employees and 75% for dependents
- 401(k) matching up to 4%
- 4 weeks vacation, plus holidays and winter break
- All-expenses-paid company off-sites 2× per year
- Work on real robotics problems. Robot data is large, messy, multimodal, time-sensitive, and tied to physical-world behavior. The problems we work on span ingestion, indexing, search, visualization, replay, connectivity, collaboration, evaluation, and operations.
- Build tools engineers rely on. Foxglove is used by robotics teams investigating failures, validating changes, reviewing field behavior, curating datasets, and operating production fleets. The work you do helps teams understand what their robots saw, what they did, and why they behaved the way they did.
- High-leverage product surface area. A better query path, visualization workflow, Fleet connection, UI primitive, API, onboarding flow, or customer deployment can change how an entire robotics team works.
- Ownership and autonomy. We're a small team, and people at Foxglove own meaningful work end-to-end. You'll have real influence over product direction, technical architecture, customer outcomes, and how we operate as a company.
- Strong peers and high standards. You'll work with people who care about correctness, performance, craft, product judgment, and building software that technical users trust under pressure.
- A mission grounded in production software. We accelerate robotics and physical AI by building the infrastructure teams use every day to connect to robots, inspect live telemetry, manage multimodal data, replay runs, investigate failures, and improve real systems.
- Competitive equity grant in a Series B company.
- Medical, dental, vision, and term life insurance coverage at 100% for employees and 75% for dependents, for U.S. full-time employees.
- 401(k) matching up to 4%, for U.S. full-time employees.
- 4 weeks of vacation, plus holidays and winter break.
- All-expenses-paid company offsites 1-2× per year.
- $300 monthly budget toward commuter benefits or building your personal workspace, depending on role/location.
Vacancy posted 5 days ago