
Machine Learning Engineer - Inference / Serving

$180k - $275k

YOBI, LLC

Yobi is a rapidly growing Behavioral AI company on a mission to ethically democratize the benefits of data and AI.

Since 2019, we have built one of the largest consented behavioral datasets in the United States, extending far beyond the walled gardens of Big Tech. Unlike traditional LLM companies, Yobi builds foundation models of human behavior grounded in real-world actions such as purchases and store visits.

Our private-by-design modeling enables state-of-the-art personalization and decisioning for leading brands and agencies while protecting privacy, safety, and ethics.

Today, we are focused on bringing the performance of closed-web user acquisition to the open web and connected TV, giving brands walled-garden results without the walls.

At our core, Yobi is building the behavioral intelligence layer for any system that makes a personalization decision.

Working at Yobi

We're at an inflection point: customer adoption is accelerating, but there's still room to shape the architecture and culture from the ground up. Engineers here own major surface areas, build 0→1 systems in large-scale data and model infrastructure, and help define how Behavioral AI scales ethically and effectively.

Highlights:
  • Well-funded with 5+ years of runway. At the same time, we are scaling revenue quickly and project to break even in 2026.
  • Partnerships with Microsoft and Databricks
  • Fully remote or hybrid from several hubs (SF Bay Area, Seattle, NYC)
  • World-class team of Machine Learning experts who worked on cutting-edge infra and recommender systems at Amazon, Uber, Twitter, Meta, etc.
  • Product and Go-To-Market teams who have taken ideas from concept to nine-figure revenue streams

Benefits:
  • Competitive Base Salary
  • Meaningful equity & financial upside - a real % of the company
  • Annual bonus target based on personal and company performance
  • Health, Dental, Vision available
  • Unlimited PTO - we care about impact, not tracking days you're out
  • 401k with company match %

About The Role

As a Machine Learning Engineer focused on Inference and Serving at Yobi, you'll design, optimize, and operate the systems that bring our Behavioral AI models to life in real time. You'll work at the core of our production environment, turning trained models into performant, reliable, and continuously improving services that power our open-web and CTV products.

This is an applied ML systems role: equal parts engineering depth, deployment craft, and model intuition. You'll shape how models are packaged, versioned, rolled out, and observed across environments, ensuring every prediction is fast, accurate, and accountable.

What it takes to succeed in this role:
  • Deep expertise in model deployment. You've built or scaled production ML serving systems, handling versioning, rollouts, rollback strategies, and live experimentation.
  • Low-latency mindset. You understand what makes inference fast: model graph optimization, quantization, caching, batching, and efficient feature retrieval.
  • Systems fluency. You write robust, high-performance code in Go, Rust, C++, or Java, and are comfortable bridging to Python for model integration and analysis.
  • Operational maturity. You treat inference as a living system: monitoring drift, tracking model lineage, and ensuring observability from input to outcome.
  • Infrastructure intuition. You know how to make serving systems reproducible and portable without over-engineering them, whether that's through custom runtime design, model registries, or lightweight orchestration.
  • Applied ML understanding. You can reason about model performance, interpret trade-offs, and work with researchers to make models more deployable.

A reasonable estimate of the current base salary range at the time of posting is below. Base salary does not include other forms of compensation or benefits. Actual base salary within the specified range depends on several factors, including but not limited to the applicant's skills, prior relevant experience, specific degrees and certifications, job responsibilities, market considerations, and the location of the position.


Base salary range: $180,000-$275,000

We prioritize attitude, culture, and general (technical) fit over a perfect match with one of our job descriptions. If our mission and work resonate with you, we encourage you to apply. Tell us how you can help drive our products forward, even if you don't feel like a perfect fit for any of the listings.
Vacancy posted 14 hours ago