Machine Learning Infra Engineer

Reducto

About Reducto

Reducto helps AI teams ingest real-world enterprise data with state-of-the-art accuracy. The vast majority of enterprise data, from financial statements to health records, is locked in unstructured file formats like PDFs and spreadsheets. We train vision models to read those documents the way a human would, and make it possible to build products, train models, and automate processes at scale.

We’ve grown quickly, increasing revenue 7x year over year, and now work with hundreds of companies ranging from leading AI teams (Harvey, Vanta, Scale) to enterprises (FAANG, a top 3 trading firm). We’ve raised over $100M from world-class investors like a16z, Benchmark, and First Round Capital, and are hiring a Machine Learning Infra Engineer to help us train and deploy the models critical to the performance of our core product.

The Opportunity

As an ML Infra Engineer, you’ll play a key role in building the inference and training frameworks that make it possible to deliver results at scale. You’ll collaborate closely with our ML and Platform teams to scale training across nodes, develop faster and more efficient serving, and create observability across the stack. This is a high-impact role where you’ll help define what high-performance ML training and inference look like at Reducto.

What You’ll Do

- Build and maintain our training and inference stack, with an emphasis on fast iteration and flexibility to explore new methods in training, and on high performance in inference.
- Develop benchmarks for both stacks to identify bottlenecks.
- Explore SOTA advances in training and inference and work to apply them.
- Design systems for scaling model training across multi-node, multi-GPU environments with strong reliability and observability.
- Scale distributed training and inference workloads across large GPU clusters while improving utilization, reliability, and cost efficiency.
- Build the tooling, abstractions, and observability that help ML engineers move faster from experiment to production.

You’ll Thrive Here If You:

- Hold yourself to a high bar for quality and precision.
- Enjoy solving complex problems and building from first principles.
- Have strong Python skills and a background in systems engineering.
- Are comfortable with Kubernetes and distributed training frameworks.
- Love getting your hands dirty with real-world implementation challenges.
- Operate well in fast-changing, high-growth environments.
- Collaborate effectively across technical and non-technical teams.
- Take full ownership from strategy through execution.

Bonus points if you:

- Have experience at an early-stage or high-growth startup.
- Have contributed to open source training/inference stacks in a meaningful way.
- Are excited to set up distributed inference across 100s-1000s of GPUs.
- Care deeply about combining technical excellence with business impact.

This is an in-person role at our office in SF. We’re an early-stage company, which means that the role requires working hard and moving quickly. Please only apply if that excites you.

Benefits at Reducto

- Unlimited PTO: We believe great work requires recharging.
- Lunch: Receive a free lunch to eat with your teammates daily at the office.
- Reimbursed Transportation: Provide us with your receipts and we’ll take care of the costs.
- Insurance: Generous health insurance covering medical, dental, and vision.
- Health and Wellness Budget: We provide up to $150/mo reimbursement for health and wellness spending, such as gym memberships, fitness classes, or similar.
- Parental Leave: Work with us to build a leave schedule that works for you and your family.

Reducto is an Equal Opportunity Employer committed to diversity and inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to sex, race, color, age, national origin, religion, physical and mental disability, genetic information, marital status, sexual orientation, gender identity/assignment, citizenship, pregnancy or maternity, protected veteran status, or any other status prohibited by applicable national, federal, state or local law.

Vacancy posted 1 day ago
