Senior ML Serving Engineer for LLMs & Inference

Alldus

A tech company in AI/ML is seeking a Senior Software Engineer specializing in ML Serving to build robust infrastructure for ML models. The ideal candidate has 5+ years of experience in software engineering, with a focus on ML serving. Proficiency in Python and knowledge of various serving frameworks are essential. This full-time role is located in San Jose, California and offers a competitive salary.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Senior ML Serving Engineer for LLMs & Inference in San Jose, CA vacancy
  • $128.7k - $261.3k

    The Model Deployment & Inference Solutions team in GM AV deploys machine...  ...is two-fold: build the ML deployment platform that makes...  ...layer that makes deployment self-serve for every ML model...  ...equivalent) as part of your engineering workflow. Experience designing... 
    Senior
    Flexible hours

    General Motors

    Sunnyvale, CA
    4 days ago
  •  ...leading training and inference speeds and...  ...effortlessly run large-scale ML applications,...  ...The Inference ML Engineering team at Cerebras Systems...  .... As a Senior Software Engineer...  ...Maintain our scalable serving backend for...  ...inference systems for LLMs or multimodal models... 
    Suggested

    CEREBRAS SYSTEMS INC.

    Sunnyvale, CA
    3 days ago
  •  ...innovative AI startup is seeking a Founding ML Infrastructure Engineer to take charge of deploying and...  ...responsible for building and managing a full ML serving stack, working closely with product...  ...ML infrastructure, particularly with LLMs, and will be proficient in relevant... 
    Suggested

    Realm Labs LLC

    Sunnyvale, CA
    2 days ago
  • $151.8k - $265.35k

     ...We are hiring a Senior Machine Learning Engineer to build the pipelines...  ...including finetuned LLMs, image and video generation...  ..., all while ensuring served quality matches the...  ...environment. ML Engineering leadership...  ...of production ML or inference services at scale.... 
    Senior
    Temporary work
    Local area
    Worldwide

    Adobe

    San Jose, CA
    22 hours ago
  • $147.4k - $272.1k

     ...data platform, and the primary inference platform that enable next...  ...and driven Machine Learning Engineer who has a robust understanding...  ...performing systems and a model serving stack that can be practically...  ...emphasis on Large Language Models (LLMs) and Generative AI.... 
    Suggested
    Relocation

    Apple Inc.

    Cupertino, CA
    2 days ago
  •  ...automotive company is seeking a Staff ML Infrastructure Engineer to build robust compute platforms for...  ...engineers to ensure efficient model serving, leading technical decision-making, and...  ..., Python or C++, and expertise in ML inference. The position offers a hybrid work... 

    General Motors

    Sunnyvale, CA
    2 days ago
  • $100k

    Netflix, Inc. is seeking exceptional applied machine learning engineers to advance state-of-the-art Search and Recommendation experiences...  ...will have strong software development skills, expertise in LLMs, and experience with large-scale recommender systems. Netflix offers... 
    Senior

    Netflix, Inc.

    Los Gatos, CA
    1 day ago
  • $152k - $287.5k

    NVIDIA Gruppe is seeking a Senior Machine Learning Applications and Compiler Engineer in Santa Clara, California. This role involves developing algorithms for their LPX inference and compiler stack, optimizing the performance of neural network workloads on NVIDIA platforms... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  •  ...Deep Learning, and Engineering. We tackle complex...  ..., and model serving. We take pride in...  ...the-art GenAI and ML models to identify...  ...at Walmart. As a Senior Data Scientist (Machine...  ...batch and real-time inference pipelines using...  ...AI technologies: LLMs, multimodal models... 
    Senior

    Relha LLC

    Sunnyvale, CA
    11 hours ago
  • $246.5k

     ...core of this is our Machine Learning and Inference Platform that powers the entire...  ...technical leader with deep experience in ML serving, high-performance computing, and industry...  ...frameworks - someone excited to mentor engineers, innovate at scale, and shape the future... 
    Work at office
    Local area
    Remote work
    Monday to Thursday
    Flexible hours

    Roku

    San Jose, CA
    1 day ago
  • $174k - $252k

    Senior Software ML Engineer, AI/ML GenAI, Gemini Enterprise, Google, Sunnyvale, CA, USA...  ...(e.g., Large Language Models (LLMs), Retrieval-Augmented Generation (RAG)...  ...that is representative of the users we serve, creating a culture of belonging, and... 
    Senior
    Full time

    Google Inc.

    Sunnyvale, CA
    1 day ago
  •  ...Senior Staff AI/ML System Software Engineer At d-Matrix, we are focused on unleashing the potential of generative AI to power the transformation...  ...ONNX Runtime, TensorRT,...). Experience with inference servers/model serving frameworks (such as Triton, TFServ, KubeFlow, …)... 
    Senior
    Work experience placement
    3 days per week

    D-Matrix

    Santa Clara, CA
    4 days ago
  • General Motors is seeking a Senior ML Infrastructure Engineer to build and scale a robust platform for machine learning inference workflows. You will design backend software components, collaborate with ML engineers, and lead initiatives across GM's ML ecosystem. With over... 
    Senior
    Remote job

    General Motors

    Sunnyvale, CA
    3 days ago
  • $148.75k - $361k

     ...Learning, Experimentation and Inference Platform that powers the...  ...a talented and experienced Senior Software Engineer, MLOps/DevOps to join the Advertising...  ...platforms that accelerate ML experimentation and...  ...for critical ML training and serving infrastructure Partner with... 
    Senior
    Work at office
    Local area
    Remote work
    Monday to Thursday
    Flexible hours

    Roku

    San Jose, CA
    1 day ago
  • $147.4k - $272.1k

    A leading technology company is searching for a Machine Learning Engineer in Cupertino, California. The role involves working with Large Language Models and Generative AI to enhance user experiences across Apple's platforms. Candidates should have extensive experience... 

    Apple Inc.

    Cupertino, CA
    2 days ago
  •  ...that powers GM’s AV efforts. We’re proud to serve as the infrastructure platform for teams...  ...development by prioritizing high-impact, ML-centric use cases. About the Role: We are seeking a Senior ML Infrastructure engineer to help build and scale robust Compute... 
    Senior
    Local area
    Work from home

    General Motors

    Sunnyvale, CA
    8 days ago
  • $184k - $287.5k

     ...assess driving behavior using LLMs, VLMs, and multimodal models Develop...  ...workflows that chain model inference, retrieval, and structured...  ...analyzers that are candidates for ML replacement and build the...  ...in Computer Science, Computer Engineering, or a related technical field.... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  •  ...supercomputer — feel like one seamless engine. Developers can write once,...  ...the Role We're looking for a Senior ML Performance Engineer to...  ...platform for evaluating LLM inference workloads across GPU clusters...  ...transformer‑based models and LLMs Hands‑on experience with GPU... 
    Senior

    Lemurian Labs

    Santa Clara, CA
    3 days ago
  • $190.2k - $345.65k

     ...Applied Machine Learning Engineer We're looking for an Applied Machine...  ..., debug, and operationalize ML systems for layout, generative...  .... Optimize training and inference (mixed precision, quantization...  ...building data pipelines and model serving infrastructure. Strong... 
    Senior
    Temporary work
    Local area
    Worldwide

    Adobe

    San Jose, CA
    22 hours ago
  • $181.1k - $318.4k

    Sr. ML Engineer, Siri User Experience Metrics and Data Cupertino, California...  ...We’re looking for a Senior Machine Learning Engineer to...  ...applying large language models (LLMs) for downstream tasks (classification...  ...operations, including model serving, distributed training, CI/CD... 
    Senior
    Relocation

    Apple Inc.

    Cupertino, CA
    3 days ago
  • $152k - $241.5k

    Senior Software Engineer, Quantized Inference...  ...efficient inference recipes for LLMs. A recipe defines which...  ...correctly for downstream serving...  ...Build prototypes and...  ...tooling...  ...Experience with ML accelerators with a basic... 
    Senior

    NVIDIA

    Santa Clara, CA
    2 days ago
  • $195k - $298k

     ...relocation assistance. About the Team The ML Inference Platform is part of the AI Compute...  ...powers GM’s AI efforts. We’re proud to serve as the AI infrastructure platform for teams...  ...We are seeking a Staff ML Infrastructure engineer to help build and scale robust Compute platforms... 
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    3 days ago
  • $154k - $220k

     ...for a Sr. Staff Software Engineer to join our Zscaler...  ...multitenant architecture that serves over 15 million users....  ...features, utilizing LLMs, various machine...  ...processing, fine-tuning, and inference optimization Work with...  ...problems using AI/ML and distributed systems... 
    Senior
    Full time
    Work at office
    Local area
    Worldwide
    3 days per week

    Zscaler

    San Jose, CA
    2 days ago
  • $207k - $300k

    Google Inc. is seeking a Software Engineer in Sunnyvale, CA, to develop cutting-edge technologies for serving Large Language Models. This critical role focuses on performance...  ...extensive experience in software development, ML infrastructure, and performance profiling. The... 
    Senior
    Full time

    Google Inc.

    Sunnyvale, CA
    2 days ago
  • $152k - $241.5k

     ...NVIDIA Gruppe is seeking a Senior Software Engineer – AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • NVIDIA Gruppe is seeking a Senior System Software Engineer in Santa Clara, California, to develop world-class GPU-accelerated AI inference serving software. This role involves contributing to feature development and optimizing software for deployment in production environments... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  • $181.1k - $318.4k

    Apple Inc. in Santa Clara, California, is looking for an experienced Machine Learning engineer to optimize and build production-grade solutions serving millions in real time. You will work closely with product teams and utilize advanced machine learning technologies, contributing... 

    Apple Inc.

    Santa Clara, CA
    3 days ago
  • $212k - $386.3k

    Apple Inc. in Cupertino is seeking a Senior Engineer for the Health AI team to design innovative machine learning solutions that impact millions. The ideal candidate will have over 10 years of software development experience, expertise in machine learning, and a strong... 
    Senior

    Apple Inc.

    Cupertino, CA
    4 days ago
  •  ...Machine Learning Engineer, LLM Fine‑Tuning...  ...Design privacy‑first ML pipelines on AWS:...  ...dependable model serving: Bedrock model invocation...  ...self‑hosted inference (vLLM/TensorRT‑LLM...  ...productization: integrate LLMs with internal...  ...engineers. Seniority Level Mid‑Senior... 
    Full time

    FIRST SOFTSOLUTIONS INC

    San Jose, CA
    2 days ago
  •  ...An innovative AI company in San Jose is seeking a skilled Machine Learning Engineer with expertise in developing LLMs and VLMs. The ideal candidate will have a strong education background and proven experience in natural language processing and computer vision. This role... 

    Tensor

    San Jose, CA
    2 days ago
