Machine Learning Engineer - Inference / Serving

$180k - $275k

YOBI, LLC

Yobi is a rapidly growing Behavioral AI company on a mission to ethically democratize the benefits of data and AI.

Since 2019, we have built one of the largest consented behavioral datasets in the United States, extending far beyond the walled gardens of Big Tech. Unlike traditional LLM companies, Yobi builds foundation models of human behavior grounded in real-world actions such as purchases and store visits.

Our private-by-design modeling enables state-of-the-art personalization and decisioning for leading brands and agencies while protecting privacy, safety, and ethics.

Today, we are focused on bringing the performance of closed-web user acquisition to the open web and connected TV, giving brands walled-garden results without the walls.

At our core, Yobi is building the behavioral intelligence layer for any system that makes a personalization decision.

Working at Yobi

We're at an inflection point: customer adoption is accelerating, but there's still room to shape the architecture and culture from the ground up. Engineers here own major surface areas, build 0→1 systems in large-scale data and model infrastructure, and help define how Behavioral AI scales ethically and effectively.

Highlights:

Well-funded with 5+ years of runway. At the same time, we are scaling revenue quickly and project to break even in 2026.
Partnerships with Microsoft and Databricks
Fully remote or hybrid from several hubs (SF Bay Area, Seattle, NYC)
World-class team of Machine Learning experts who worked on cutting-edge infra and recommender systems at Amazon, Uber, Twitter, Meta, etc.
Product and Go-To-Market teams who have taken ideas from concept to 9-figure revenue streams

Benefits:

Competitive Base Salary
Meaningful equity & financial upside - a real % of the company
Annual bonus target based on personal and company performance
Health, Dental, Vision available
Unlimited PTO - we care about impact, not tracking days you're out
401k with company match %

About The Role

As a Machine Learning Engineer focused on Inference and Serving at Yobi, you'll design, optimize, and operate the systems that bring our Behavioral AI models to life in real time. You'll work at the core of our production environment, turning trained models into performant, reliable, and continuously improving services that power our open-web and CTV products.

This is an applied ML systems role: equal parts engineering depth, deployment craft, and model intuition. You'll shape how models are packaged, versioned, rolled out, and observed across environments, ensuring every prediction is fast, accurate, and accountable.

What it takes to succeed in this role:

Deep expertise in model deployment. You've built or scaled production ML serving systems, handling versioning, rollouts, rollback strategies, and live experimentation.
Low-latency mindset. You understand what makes inference fast: model graph optimization, quantization, caching, batching, and efficient feature retrieval.
Systems fluency. You write robust, high-performance code in Go, Rust, C++, or Java, and are comfortable bridging to Python for model integration and analysis.
Operational maturity. You treat inference as a living system: monitoring drift, tracking model lineage, and ensuring observability from input to outcome.
Infrastructure intuition. You know how to make serving systems reproducible and portable without over-engineering them, whether that's through custom runtime design, model registries, or lightweight orchestration.
Applied ML understanding. You can reason about model performance, interpret trade-offs, and work with researchers to make models more deployable.

A reasonable estimate of the current base salary range at the time of posting is below. Base salary does not include other forms of compensation or benefits. Actual base salary within the specified range depends on several factors, including but not limited to the applicant's skills, prior relevant experience, specific degrees and certifications, job responsibilities, market considerations, and the location of the position.

Base salary range: $180,000-$275,000

We prioritize attitude, culture, and general (technical) fit over matching perfectly into one of our job descriptions. If our mission and work resonate with you, we encourage you to apply. Tell us how you can help drive our products forward, even if you don't feel like you are a perfect fit for some of the listings.

Vacancy posted 14 hours ago
