Senior ML Serving Engineer for LLMs & Inference

Alldus

A tech company in AI/ML is seeking a Senior Software Engineer specializing in ML Serving to build robust infrastructure for ML models. The ideal candidate has 5+ years of experience in software engineering, with a focus on ML serving. Proficiency in Python and knowledge of various serving frameworks are essential. This full-time role is located in San Jose, California and offers a competitive salary.

Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the Senior ML Serving Engineer for LLMs & Inference in San Jose, CA vacancy

Senior ML Inference Engineer - Platform
$128.7k - $261.3k
The Model Deployment & Inference Solutions team in GM AV deploys machine... ...is two-fold: build the ML deployment platform that makes... ...layer that makes deployment self-serve for every ML model... ...equivalent) as part of your engineering workflow. Experience designing...
Senior
Flexible hours
General Motors
Sunnyvale, CA
4 days ago
Staff Inference ML Runtime Engineer
...leading training and inference speeds and... ...effortlessly run large-scale ML applications,... ...The Inference ML Engineering team at Cerebras Systems... .... As a Senior Software Engineer... ...Maintain our scalable serving backend for... ...inference systems for LLMs or multimodal models...
Suggested
CEREBRAS SYSTEMS INC.
Sunnyvale, CA
3 days ago
Founding ML Infra Engineer Production-Grade LLMs
...innovative AI startup is seeking a Founding ML Infrastructure Engineer to take charge of deploying and... ...responsible for building and managing a full ML serving stack, working closely with product... ...ML infrastructure, particularly with LLMs, and will be proficient in relevant...
Suggested
Realm Labs LLC
Sunnyvale, CA
2 days ago
Senior Machine Learning Engineer, Firefly Foundry
$151.8k - $265.35k
...We are hiring a Senior Machine Learning Engineer to build the pipelines... ...including finetuned LLMs, image and video generation... ..., all while ensuring served quality matches the... ...environment. ML Engineering leadership... ...of production ML or inference services at scale....
Senior
Temporary work
Local area
Worldwide
Adobe
San Jose, CA
22 hours ago
Machine Learning Engineer - Large Language Models & Generative AI Inference
$147.4k - $272.1k
...data platform, and the primary inference platform that enable next... ...and driven Machine Learning Engineer who has a robust understanding... ...performing systems and a model serving stack that can be practically... ...emphasis on Large Language Models (LLMs) and Generative AI....
Suggested
Relocation
Apple Inc.
Cupertino, CA
2 days ago
Staff ML Infra Engineer: Scalable Inference Platform (Hybrid)
...automotive company is seeking a Staff ML Infrastructure Engineer to build robust compute platforms for... ...engineers to ensure efficient model serving, leading technical decision-making, and... ..., Python or C++, and expertise in ML inference. The position offers a hybrid work...
General Motors
Sunnyvale, CA
2 days ago
Senior ML Engineer - LLMs, RLHF & Recommender Systems
$100k
Netflix, Inc. is seeking exceptional applied machine learning engineers to advance state-of-the-art Search and Recommendation experiences... ...will have strong software development skills, expertise in LLMs, and experience with large-scale recommender systems. Netflix offers...
Senior
Netflix, Inc.
Los Gatos, CA
1 day ago
Senior ML Compiler & Inference Systems Engineer
$152k - $287.5k
NVIDIA Gruppe is seeking a Senior Machine Learning Applications and Compiler Engineer in Santa Clara, California. This role involves developing algorithms for their LPX inference and compiler stack, optimizing the performance of neural network workloads on NVIDIA platforms...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior, Data Scientist (Machine Learning Engineer)
...Deep Learning, and Engineering. We tackle complex... ..., and model serving. We take pride in... ...the-art GenAI and ML models to identify... ...at Walmart. As a Senior Data Scientist (Machine... ...batch and real-time inference pipelines using... ...AI technologies: LLMs, multimodal models...
Senior
Relha LLC
Sunnyvale, CA
11 hours ago
Lead ML Inference Engineer, Advertising
$246.5k
...core of this is our Machine Learning and Inference Platform that powers the entire... ...technical leader with deep experience in ML serving, high-performance computing, and industry... ...frameworks - someone excited to mentor engineers, innovate at scale, and shape the future...
Work at office
Local area
Remote work
Monday to Thursday
Flexible hours
Roku
San Jose, CA
1 day ago
Senior Software ML Engineer, AI/ML GenAI, Gemini Enterprise
$174k - $252k
Senior Software ML Engineer, AI/ML GenAI, Gemini Enterprise, Google, Sunnyvale, CA, USA... ...(e.g., Large Language Models (LLMs), Retrieval-Augmented Generation (RAG)... ...that is representative of the users we serve, creating a culture of belonging, and...
Senior
Full time
Google Inc.
Sunnyvale, CA
1 day ago
Senior Staff AI/ML System Software Engineer
...Senior Staff AI/ML System Software Engineer At d-Matrix, we are focused on unleashing the potential of generative AI to power the transformation... ...ONNX Runtime, TensorRT,...). Experience with inference servers/model serving frameworks (such as Triton, TFServ, KubeFlow, …)...
Senior
Work experience placement
3 days per week
D-Matrix
Santa Clara, CA
4 days ago
Remote Senior ML Inference Platform Engineer
General Motors is seeking a Senior ML Infrastructure Engineer to build and scale a robust platform for machine learning inference workflows. You will design backend software components, collaborate with ML engineers, and lead initiatives across GM's ML ecosystem. With over...
Senior
Remote job
General Motors
Sunnyvale, CA
3 days ago
Senior Machine Learning Engineer, DevOps/SRE
$148.75k - $361k
...Learning, Experimentation and Inference Platform that powers the... ...a talented and experienced Senior Software Engineer, MLOps/DevOps to join the Advertising... ...platforms that accelerate ML experimentation and... ...for critical ML training and serving infrastructure Partner with...
Senior
Work at office
Local area
Remote work
Monday to Thursday
Flexible hours
Roku
San Jose, CA
1 day ago
ML Engineer: LLMs & Generative AI Inference
$147.4k - $272.1k
A leading technology company is searching for a Machine Learning Engineer in Cupertino, California. The role involves working with Large Language Models and Generative AI to enhance user experiences across Apple's platforms. Candidates should have extensive experience...
Apple Inc.
Cupertino, CA
2 days ago
Senior ML Infrastructure Engineer (Compute)
...that powers GM’s AV efforts. We’re proud to serve as the infrastructure platform for teams... ...development by prioritizing high-impact, ML-centric use cases. About the Role: We are seeking a Senior ML Infrastructure engineer to help build and scale robust Compute...
Senior
Local area
Work from home
General Motors
Sunnyvale, CA
8 days ago
Senior ML Evaluation Engineer - Autonomous Vehicles
$184k - $287.5k
...assess driving behavior using LLMs, VLMs, and multimodal models Develop... ...workflows that chain model inference, retrieval, and structured... ...analyzers that are candidates for ML replacement and build the... ...in Computer Science, Computer Engineering, or a related technical field....
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Senior ML Performance Engineer
...supercomputer — feel like one seamless engine. Developers can write once,... ...the Role We're looking for a Senior ML Performance Engineer to... ...platform for evaluating LLM inference workloads across GPU clusters... ...transformer‑based models and LLMs Hands‑on experience with GPU...
Senior
Lemurian Labs
Santa Clara, CA
3 days ago
Senior Applied Machine Learning Engineer
$190.2k - $345.65k
...Applied Machine Learning Engineer We're looking for an Applied Machine... ..., debug, and operationalize ML systems for layout, generative... .... Optimize training and inference (mixed precision, quantization... ...building data pipelines and model serving infrastructure. ~ Strong...
Senior
Temporary work
Local area
Worldwide
Adobe
San Jose, CA
22 hours ago
Sr. ML Engineer, Siri User Experience Metrics and Data
$181.1k - $318.4k
Sr. ML Engineer, Siri User Experience Metrics and Data Cupertino, California... ...We’re looking for a Senior Machine Learning Engineer to... ...applying large language models (LLMs) for downstream tasks (classification... ...operations, including model serving, distributed training, CI/CD...
Senior
Relocation
Apple Inc.
Cupertino, CA
3 days ago
Senior Software Engineer, Quantized Inference
$152k - $241.5k
Senior Software Engineer, Quantized Inference... ...efficient inference recipes for LLMs. A recipe defines which... ...correctly for downstream serving* Build prototypes and... ...tooling* Experience with ML accelerators with a basic...
Senior
NVIDIA
Santa Clara, CA
2 days ago
Staff ML Engineer, Inference Platform
$195k - $298k
...relocation assistance. About the Team The ML Inference Platform is part of the AI Compute... ...powers GM’s AI efforts. We’re proud to serve as the AI infrastructure platform for teams... ...We are seeking a Staff ML Infrastructure engineer to help build and scale robust Compute platforms...
Relocation package
Flexible hours
General Motors
Sunnyvale, CA
3 days ago
Sr. Staff Machine Learning Engineer - Data Lake, Anomaly Detection
$154k - $220k
...for a Sr. Staff Software Engineer to join our Zscaler... ...multitenant architecture that serves over 15 million users.... ...features, utilizing LLMs, various machine... ...processing, fine-tuning, and inference optimization Work with... ...problems using AI/ML and distributed systems...
Senior
Full time
Work at office
Local area
Worldwide
3 days per week
Zscaler
San Jose, CA
2 days ago
Senior ML Systems Engineer - LLM Serving & GPU Performance
$207k - $300k
Google Inc. is seeking a Software Engineer in Sunnyvale, CA, to develop cutting-edge technologies for serving Large Language Models. This critical role focuses on performance... ...extensive experience in software development, ML infrastructure, and performance profiling. The...
Senior
Full time
Google Inc.
Sunnyvale, CA
2 days ago
Senior AI Inference Engineer - High-Performance LLM Serving
$152k - $241.5k
...NVIDIA Gruppe is seeking a Senior Software Engineer – AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building...
Senior
NVIDIA Gruppe
Santa Clara, CA
2 days ago
Senior System Software Engineer — GPU AI Inference (Triton)
NVIDIA Gruppe is seeking a Senior System Software Engineer in Santa Clara, California, to develop world-class GPU-accelerated AI inference serving software. This role involves contributing to feature development and optimizing software for deployment in production environments...
Senior
NVIDIA Gruppe
Santa Clara, CA
1 day ago
Foundation Model Services ML Engineer for Scale Inference
$181.1k - $318.4k
Apple Inc. in Santa Clara, California, is looking for an experienced Machine Learning engineer to optimize and build production-grade solutions serving millions in real time. You will work closely with product teams and utilize advanced machine learning technologies, contributing...
Apple Inc.
Santa Clara, CA
3 days ago
Senior Health AI ML Engineer: Generative Models & LLMs
$212k - $386.3k
Apple Inc. in Cupertino is seeking a Senior Engineer for the Health AI team to design innovative machine learning solutions that impact millions. The ideal candidate will have over 10 years of software development experience, expertise in machine learning, and a strong...
Senior
Apple Inc.
Cupertino, CA
4 days ago
Machine Learning Engineer, LLM Fine-Tuning
...Machine Learning Engineer, LLM Fine‑Tuning... ...Design privacy‑first ML pipelines on AWS:... ...dependable model serving: Bedrock model invocation... ...self‑hosted inference (vLLM/TensorRT‑LLM... ...productization: integrate LLMs with internal... ...engineers. Seniority Level Mid‑Senior...
Full time
FIRST SOFTSOLUTIONS INC
San Jose, CA
2 days ago
ML Engineer: LLMs, VLMs & Reasoning AI | Equity
...An innovative AI company in San Jose is seeking a skilled Machine Learning Engineer with expertise in developing LLMs and VLMs. The ideal candidate will have a strong education background and proven experience in natural language processing and computer vision. This role...
Tensor
San Jose, CA
2 days ago
