
Real-Time ML Inference Engineer for Scalable Serving

Yobi

A Behavioral AI company is seeking a Machine Learning Engineer to design and optimize systems for bringing their models to life. The role involves ensuring ML models are efficient and reliable, requiring experience in model deployment and robust coding skills. Candidates should be familiar with low-latency techniques and operational maturity in ML systems. This position can be remote or hybrid from several hubs.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Real-Time ML Inference Engineer for Scalable Serving in New York, NY vacancy
  •  ...Reddit, Inc. is seeking a Staff Machine Learning Engineer to lead the development of a large-scale ML Inference Platform. Responsibilities include designing cloud-based ML systems on Kubernetes and ensuring reliable, low-latency performance. Candidates should have 7+... 
    Suggested

    Reddit

    New York, NY
    4 days ago
  • A leading Behavioral AI company is seeking a Machine Learning Engineer focused on inference and serving. In this role, you will design and optimize systems to operationalize AI models. The ideal candidate has deep expertise in model deployment, a strong low-latency mindset... 
    Suggested
    Remote work

    Yobi AI

    New York, NY
    1 day ago
  • $150k - $200k

     ...Affirm is seeking a Senior Machine Learning Engineer (Fraud) to lead the development of fraud prediction models. You will be working...  ...and will collaborate with cross-functional teams to build ML systems for real-time transaction decisions. Candidates should have 6+ years of... 
    Suggested
    Remote work

    Affirm

    New York, NY
    4 days ago
  •  ...Summary: We are seeking an ML Engineer to bridge the gap between...  ...This role focuses on building scalable, reliable systems to serve machine learning models in real-time and batch environments, ensuring...  ...models Optimize model inference for speed, scalability, and efficiency... 
    Suggested

    Compunnel

    Jersey City, NJ
    3 days ago
  •  ...Machine Learning Engineer - Inference / Serving Join to apply for the Machine Learning...  ...human behavior grounded in real‑world actions such as...  ...Behavioral AI models to life in real time. You’ll work at the core of...  .... This is an applied ML systems role—equal parts engineering... 
    Suggested
    Full time
    Remote work

    Yobi AI

    New York, NY
    1 day ago
  •  ...healthcare technology company is looking for a Sr. Machine Learning Engineer to optimize bidding strategies and improve machine learning...  ...with remote work options and requires a strong background in real-time bidding auction technologies. Join a dynamic team focused on... 
    Remote work

    PulsePoint

    New York, NY
    4 days ago
  •  ...can be a little time consuming and you...  ...for developing scalable infrastructure and...  ...Canva. Our Inference Platform team sits...  ...mission—ensuring that ML models are deployed, served, and optimised...  ...that support real-time AI features...  ...Machine Learning Engineer, you’ll focus on... 
    Work at office
    Remote work
    Flexible hours

    Canva

    New York, NY
    4 days ago
  • $128.7k - $261.3k

     ...General Motors seeks a skilled professional to develop its ML deployment platform within the autonomous vehicle sector. This role...  ...involves automating model deployment from training to on-vehicle inference and enhancing developer experience through robust tooling. Candidates... 

    General Motors

    New York, NY
    4 days ago
  • $200k - $250k

     ...quality, and trust. Our ML models power the...  ...Senior MLOps Engineer to take ownership...  ...for a custom-built inference platform powering...  ...engines handling real-time inference for high...  ...Define and enforce serving-layer SLAs – latency...  ...cost-efficient, and scalable, partnering with... 
    Remote work
    Flexible hours

    Wizard

    New York, NY
    4 days ago
  • $128.7k - $261.3k

     ...The Model Deployment & Inference Solutions team in GM AV...  ...is two-fold: build the ML deployment platform that...  ...so they meet the real-time latency and memory budgets...  ...makes deployment self-serve for every ML model development...  ...performed manually by engineers. Build the developer... 
    Flexible hours
    Shift work

    General Motors

    New York, NY
    4 days ago
  •  ...Description: Principal ML Engineer. 3M Health Care is...  ..., and automated scalability over hype. While many...  ...reproducible experimentation. Inference at Scale: Architect high-performance serving layers for both LLMs...  ...-quality features for real-time and batch inference... 
    H1b
    Remote work

    Solventum

    New York, NY
    2 days ago
  • $180k - $220k

     ...New York is seeking a Sr. Machine Learning Engineer to join their Applied Data Science group....  ...machine learning solutions for real-time ad optimization. The ideal candidate has...  ...of experience, a strong background in AI/ML technologies, and proficiency in Java, Python... 

    Nexxen International Ltd.

    New York, NY
    4 days ago
  •  ...systems design for real-world impact. Our mission...  ...capital in record time, and are scaling...  ...Principal ML Engineer (Applied / Systems)...  ...them into robust, scalable production systems that serve real-world needs....  ...optimize, and scale inference pipelines and model... 

    Soris

    New York, NY
    1 day ago
  • $150k - $300k

     ...Senior AI Engineering Expert At Goldman Sachs...  ...that build massively scalable software and...  ...Performance: Optimize inference latency and manage...  ...scale deployments serving thousands of...  ...Java processes to real-time, event-driven AI architectures...  ...focused on AI/ML integration in... 
    Full time
    Temporary work
    Part time
    Immediate start

    The Goldman Sachs Group, Inc.

    New York, NY
    4 days ago
  • The New York Times is seeking a Senior Data Engineer in New York City to contribute to the Customer-Facing Data Products team. This role involves developing real-time data pipelines and APIs that serve customer needs. The ideal candidate has over 5 years of experience... 

    The New York Times

    New York, NY
    5 days ago
  •  ...Machine Learning and Computer Algorithm Engineer. In this hands-on role, you will develop...  ...8+ years of experience and expertise in ML/algorithm development. Strong coding skills...  ...are essential, alongside experience with real-time computer vision pipelines. This is an exciting... 

    Peskind Executive Search

    New York, NY
    3 days ago
  • $300k - $400k

     ...Principal AI/ML Engineer - AdTech New York, New York...  ...advertising ecosystem (e.g., real-time bidding and digital...  ...highly performant, scalable, and reliable. You will...  ...training to real-time inference, for our real-time...  ...integrate with our ad serving architecture and handle... 

    Zeta Global

    New York, NY
    2 days ago
  • $148.7k - $199.4k

     ...Machine Learning Engineer - News Technology...  ..., innovation, and scalability for our businesses...  ...touch points serving millions of people...  ...identity. The News ML team is responsible...  ...models to enable real-time content personalization...  ...learning, inference, and monitoring, conduct... 
    Work experience placement
    Local area
    Day shift

    Disney

    New York, NY
    3 days ago
  •  ...Machine Learning Engineer - News Technology...  ..., innovation, and scalability for our businesses...  ...touch points serving millions of people...  ...identity. The News ML team is responsible...  ...models to enable real-time content personalization...  ...learning, inference, and monitoring, conduct... 
    Work experience placement
    Local area
    Day shift

    Walt Disney Company

    New York, NY
    3 days ago
  •  ...agency and precision engineering partner. For over 20...  ...technical depth to build scalable digital products for...  ...guidance of a Staff ML Architect, you will...  ...daily model training and inference tasks. Build and...  ...Skills Familiarity with real-time model serving and infrastructure (e... 
    Temporary work
    Remote work

    Halo Media

    New York, NY
    4 days ago
  • $30 - $60 per hour

     ...couldn't answer the real question: who...  ...that compound over time. If you have spent...  ...Machine Learning Engineer Interns. You will...  ...Hyper-scale training & inference infrastructure Pre...  ...decode disaggregation serving to decouple long-prompt...  ...seeking frontier ML systems and... 
    Hourly pay
    Full time
    Internship
    Work at office
    Shift work
    3 days per week

    PATHOS

    New York, NY
    3 days ago
  • $152k - $228k

     ...Job Description Senior ML Engineer About Invoca...  ...and fine-tuning through inference optimization and production...  ...'s ML stack — model serving, inference optimization...  ..., and build robust, scalable APIs for internal and...  ...regulations. Flexible Time Off – We encourage a... 
    Currently hiring
    Remote work
    Flexible hours

    Invoca

    New York, NY
    3 days ago
  • $111.24k - $222.48k

     ...Machine Learning Engineer We're...  ...community at a time. At CVS Health...  ...making to better serve millions of customers...  ...expertise in ML to work on...  ...technologies into real business...  ...code quality, and scalable architecture. Influence...  ..., causal inference, LLM, MCP ~ Experience... 
    Hourly pay
    Full time
    Temporary work

    Oak St. Health

    New York, NY
    1 day ago
  • $190k - $260k

     ...Machine Learning Engineer – Search, Ranking...  ...Employment Type: Full-Time Experience Level:...  ...will join the ML team to design,...  ...across a platform serving hundreds of...  ...accuracy and system scalability. Contribute to product...  ...processing for real‑time inference. Strong backend... 
    Full time
    H1b
    Remote work
    Relocation
    Visa sponsorship

    Fuku

    New York, NY
    1 day ago
  •  ...Job Description: ML Engineer. 3M Health Care is now...  ...work reliably in the real world. You will help...  ...services are secure and scalable. Key Responsibilities...  ...or Git). Model Serving: Deploy ML models as...  ...for model training and inference. Feature Management... 
    H1b
    Remote work

    Solventum

    New York, NY
    2 days ago
  • $205k - $316.4k

     ...Machine Learning Engineer At Quizlet, our mission...  ...and systems that drive real-time product decisions—balancing...  ...ability to deliver scalable ML systems that drive...  ...for real-time and batch inference Build end-to-end ML...  ...data pipelines, model serving, and scalable systems... 
    Work at office
    3 days per week

    Quizlet

    New York, NY
    2 days ago
  • $158.1k - $213.8k

     ...for a Machine Learning Engineer II who can drive...  ...shape the future of ad serving on Amazon search. You...  ...deep learning, AWS, Auto ML, real-time ML serving systems....  ...production software to support scalable offline machine-...  ...architecture, training/inference lifecycles, and optimization... 
    Internship
    Flexible hours

    Amazon

    New York, NY
    3 days ago
  •  ...helps contractors, engineering firms, and utilities...  ...of our training and inference pipelines, fortifying...  ...Design and maintain scalable architectures for serving deep learning models...  ...computer vision and time-series models on large...  ...and scaling ML applications. Infrastructure... 
    For contractors

    SewerAI Corporation

    New York, NY
    4 days ago
  • $148.7k - $199.4k

     ...Machine Learning Engineer Technology...  ...innovation, and scalability for our...  ...media touch points serving millions of people...  ...direction of the News ML Platform. You...  ...learning, inference, and monitoring...  ...impact and most time-sensitive...  ...methods to solve real-world engineering... 
    Work experience placement

    The Walt Disney Studios

    New York, NY
    3 days ago
  • $97k - $166.75k

     ...Sr. ML Data Engineer, Relevancy Sciences – Personalization & Loyalty Strategy...  ...data generation. This role serves as the bridge between ML...  ...production-grade reliability and scalability. What You'll Do Feature...  ...using Apache Beam/Dataflow for real time feature computation and low-... 

    84.51

    New York, NY
    4 days ago
