ML Platform Engineer: Training & Inference Engine

A Medium Corporation

Saviynt, located in San Francisco, is seeking an AI Platform Engineer to manage and optimize the training and inference of AI models. You will lead efforts in operating the Ray ecosystem and distributed training on advanced GPU clusters. The ideal candidate has a solid foundation in ML engineering, particularly with Ray, LLMs, and experience in production-level MLOps. Competitive salary and opportunities for professional growth are offered.

Vacancy posted 4 days ago

Similar jobs that could be interesting for you
Based on the ML Platform Engineer: Training & Inference Engine in San Francisco, CA vacancy

ML Infra Engineer — Scalable GPU Training & Inference (SF)
Reducto, Inc. is hiring a Machine Learning Infra Engineer in San Francisco to build and maintain ML training and inference frameworks. The role focuses on high performance and scaling across multiple nodes and GPUs. The ideal candidate will have strong Python skills and...
Training
Reducto, Inc.
San Francisco, CA
16 hours ago
ML Infra Engineer: Scale GPU Training & Inference
Reducto, a fast-growing AI company in San Francisco, is hiring a Machine Learning Infra Engineer. This role involves building and maintaining the training and inference frameworks necessary for optimal performance. Ideal candidates should possess strong Python skills,...
Training
Reducto
San Francisco, CA
16 hours ago
ML Platform & Infrastructure Engineer
...read on. What You'll Do Training Automation: Design and... ...utilization and cluster health Inference cost and unit economics Build... ...degree in Computer Science, Engineering, or equivalent practical... ...Software Engineering, MLOps, or ML Infrastructure ~ Strong Python...
Training
Immediate start
Relocation package
Night shift
AGI
San Francisco, CA
4 days ago
ML Platform Engineer
...unexpectedly, or need to be improved, engineers rely on data to understand what... ...Role We're looking for a ML Platform Engineer with deep... ...of the ML platform itself, from inference serving and pipeline orchestration to training infrastructure and evaluation frameworks...
Training
Remote work
Foxglove Technologies, Inc
San Francisco, CA
4 days ago
Staff ML Platform Engineer - Large Scale Training (LLMOps/MLOps)
...Staff ML Platform Engineer – Large Scale Training (LLMOps/MLOps) We're TrueFoundry, and we're building the foundational infrastructure for production... ...infrastructure and code that enables high-throughput, low-latency inference pipelines for state-of-the-art models. Build...
Training
Flexible hours
TrueFoundry
San Francisco, CA
2 days ago
ML Platform & Infrastructure Engineer
...Type On‑site What You’ll Do Training Automation: Design and implement... ...and cluster health Inference cost and unit economics Build... ...degree in Computer Science, Engineering, or equivalent practical experience... ...Software Engineering, MLOps, or ML Infrastructure Experience...
Training
Full time
Immediate start
Relocation package
Night shift
AGI Inc
San Francisco, CA
2 days ago
Staff ML Infrastructure Engineer: Scale Training & Inference
$300k - $430k
...leading conversational AI platform empowering every brand... .... About the Team The ML Infrastructure team builds... ...platforms for model training, the infrastructure for... ...routing layer that manages inference across multiple... ...Staff ML Infrastructure Engineer to own the platforms powering...
Training
Work at office
Decagon
San Francisco, CA
4 days ago
Machine Learning Engineer, Inference & Serving (Speech LLM) - San Francisco
$200k
...experience building and deploying high-throughput, ultra-low-latency inference engines for large language models or foundational speech models.... ...you will sit at the critical intersection between the core ML training team and the backend infrastructure team. Thrive in fast-...
Training
Full time
Work at office
Worldwide
Plaud
San Francisco, CA
1 day ago
ML Training Platform Engineer | Multi-Cloud & Decentralized
A decentralized AI platform company in the United States is seeking an experienced ML Training Platform Engineer to design and build robust infrastructure for ML training. The ideal candidate has over 5 years in infrastructure and platform engineering, with expertise in...
Training
Pluralis Research
San Francisco, CA
1 day ago
Senior GPU ML Infra Engineer — Mid-Training & Inference
...specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on experience with modern inference frameworks and a solid...
Training
Reflection AI
San Francisco, CA
1 day ago
ML Infra Engineer: Scale Training & Inference (Hybrid)
A leading technology company is looking for an ML Infrastructure Engineer in San Francisco. The successful candidate will build and maintain ML training pipelines and ensure low-latency model serving. Candidates should have over 4 years of experience in ML engineering,...
Training
Work at office
Lattice, Inc.
San Francisco, CA
16 hours ago
AI Platform Engineer, Training and Inference
AI Platform Engineer - Training & Inference Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization... ..., and serves every AI model at Saviynt. We need an ML Platform Engineer to own distributed training on Ray +H1...
Training
Saviynt Inc.
San Francisco, CA
4 days ago
ML Platform / MLOps Engineer
$180k - $250k
...ML Platform / MLOps Engineer Emeryville, California, United States; Hybrid (2-3 days on-site) Profluent... ...reliable, scalable platforms for training, evaluating, and deploying large-... ...researchers to run large-scale ML training and inference workloads reliably and efficiently on...
Training
Profluent
Emeryville, CA
3 days ago
Staff Machine Learning Platform Engineer
$246.5k - $339k
...is a technology wholesale platform built on the belief that... ...Machine Learning Platform Engineer, you will help design,... ..., and operate a scalable ML platform to accelerate model training, deployment, and governance... ...cost across training and inference workloads Configure Identity...
Training
Work experience placement
Work at office
Local area
Remote work
Monday to Friday
Flexible hours
3 days per week
Faire Inc
San Francisco, CA
2 days ago
Staff ML Engineer, AI Platform
$250k - $300k
...building the AI intelligence platform that restores humanity... ..., every quarter. Our engineering roles are hybrid in... ...reconstructs exact inference inputs (retrieved chart... ...and convert it into training signal. End-to-end latency... ..., 3+ focused on ML infrastructure, platform...
Training
Work at office
Immediate start
Remote work
Flexible hours
Ambience Healthcare
San Francisco, CA
3 days ago
Lead AI/ML Engineer (Platform, kubeflow)
$197.3k - $225.1k
...Lead AI/ML Engineer (Platform, kubeflow) Overview At Capital One, we are creating responsible and reliable AI systems... ...AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation...
Training
Full time
Part time
Local area
Capital One Financial Corp
San Francisco, CA
16 hours ago
Machine Learning Infrastructure Engineer - Model Inference
...healthcare. Our AI-powered platform was purpose-built for... ...creatives, technologists, and engineers working together to... .... The Role As an ML Infrastructure Engineer, Model Inference at Abridge, you’ll play... ...AI model inference and training Develop, optimize, and...
Training
Hourly pay
Full time
Flexible hours
Abridge
San Francisco, CA
4 days ago
Staff ML Engineer, AI Platform
$250k - $300k
...Healthcare is the leading AI platform for documentation,... ..., every quarter. Our engineering roles are hybrid in... ...that reconstructs exact inference inputs (retrieved... ...usage and convert it into training signal. End‑to‑end... ...engineering, 3+ focused on ML infrastructure,...
Training
Work at office
Immediate start
Ambience
San Francisco, CA
4 days ago
ML Infra Engineer
...ML Infrastructure Engineer In this role you will help scale and optimize our training systems and core model code. You'll own critical infrastructure... ...with research, data, and platform engineers to ensure models... ...Will Own training/inference infrastructure: Design,...
Training
Physical Intelligence
San Francisco, CA
16 hours ago
ML Engineer
$250k - $400k
...discovery are actually built, trained, and run in production.... ..., and experimentation platforms that make long-horizon... .... It's building the engine that research runs on.... ...model deployment and inference for complex reasoning systems... ...building and scaling ML systems in production...
Training
Remote work
techire ai
San Francisco, CA
1 day ago
ML Engineer
...ML Engineer San Francisco, California, United States Or refer someone Job Openings ML Engineer About the Job Our client... ..., including data acquisition, preprocessing, model training, deployment, inference, and monitoring in production environments. Participate...
Training
Full time
Catalyst Labs, LLC
San Francisco, CA
1 day ago
ML Engineer LLM Privacy
...responsibility in mind. Our ML team comes from a... ...to work on the premier platform for private and personalized... ...detection, and/or membership inference attacks. Collaborate with our engineering team to deliver real-... ...high quality synthetic training data, train LLMs, and...
Training
Local area
Shift work
Dynamo AI
San Francisco, CA
3 days ago
Sr. ML Engineer - ML & Applied AI
...Senior Machine Learning Engineer with 10+ years of... ...focused on end-to-end ML system ownership, including... ...engineering, model training, deployment, monitoring... ...development of scalable ML platforms, drive best practices... ...-performance model inference in both batch and real...
Training
Gap Inc.
San Francisco, CA
3 days ago
Distributed Systems Engineer, Data & Inference Platform
...compute into useful intelligence - the inference services that serve LLMs at scale and the... ...you honest about both. Researchers and ML engineers will hand you workloads that barely run... ..., and curate the datasets behind training and evaluation. The bottleneck is rarely...
Training
Flexible hours
Adaption
San Francisco, CA
1 day ago
ML Engineer
...with research and infra to prototype, train, and deploy state-of-the-art voice models... ...Squeeze silicon — scale training and inference for LLM-class workloads; chase latency... ...-level PyTorch. Proven software engineer who loves ML; comfortable writing production code across...
Training
Full time
Contract work
Flexible hours
Shift work
SESAME
San Francisco, CA
3 days ago
Founding ML Engineer
...Founding ML Engineer Skills: Python, PyTorch, NLP, LLMs, Information... ...core intelligence layer. Our platform indexes hundreds of millions... ...role. You will be researching, training, and shipping models - from... ...records Given raw people data, infer the org chart — who reports...
Training
Crustdata (YC F24)
San Francisco, CA
16 hours ago
ML Ops Engineer Agentic AI Lab (Founding Team)
...ML Ops Engineer — Agentic AI Lab (Founding Team) Location: San Francisco... ...for automating the model training, deployment, versioning, and... ...conversion, quantization, and inference rollout Manage hybrid... ...~4+ years in MLOps, ML platform engineering, or infra-focused...
Training
Full time
Fabrion
San Francisco, CA
3 days ago
ML Infra Engineer (Supercomputing)
...for the physical world. Training our models requires... ...The Team The ML Infrastructure team supports... ...training systems), data platform, and research teams to... .... - Support Inference and Robot Deployment... ...: - Strong software engineering fundamentals - Experience...
Training
Flexible hours
Physical Intelligence
San Francisco, CA
16 hours ago
Senior ML Engineer
$152k - $228k
...Description Job Description Senior ML Engineer About Invoca Invoca is an AI-powered revenue execution platform that brings together marketing,... ...ML lifecycle at Invoca, from model training and fine-tuning through inference optimization and production APIs. We...
Training
Currently hiring
Remote work
Flexible hours
Invoca
San Francisco, CA
3 days ago
ML Inference Engineer
Job Overview Department: Engineering Location: San Francisco We're looking for an ML Inference Engineer with deep expertise in high-performance ML engineering. This... ...partner teams to integrate their models into our platform Required Skills Bachelor’s degree in Computer...
Visa sponsorship
Relocation package
Reactor
San Francisco, CA
1 day ago
