Senior ML Performance Engineer: LLM Benchmarking & GPU

Amadeus Search

A leading AI infrastructure company is seeking a Senior ML Performance Engineer to design a comprehensive performance testing platform for large language models. This role requires a minimum of 7 years in performance engineering and strong experience with GPU programming and ML inference workloads. Candidates should have expertise in Python and C/C++. The position offers competitive compensation, equity, and wellness benefits in a hybrid work environment.

Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the Senior ML Performance Engineer: LLM Benchmarking & GPU in San Francisco, CA vacancy

Senior ML Accelerator Engineer - GPU
$128.7k - $261.3k
...export, kernel development, and performance engineering so that every cycle on our... ...builds high‑performance GPU kernels and custom libraries... ...at the heart of on‑vehicle ML inference for ADAS and autonomous... .... Hands‑on experience benchmarking, profiling, debugging and optimizing...
Senior
Performance
Local area
Flexible hours
Israelvcforum
San Francisco, CA
2 days ago
Senior AI/ML Engineer LLM & Agent Stack
...Senior AI/ML Engineer — LLM & Agent Stack Every production AI system, whether it's powering customer... .... Build and improve tracing, benchmarking and observability for LLMs and agents... ...orchestration, service meshes, and performance tuning. ~ Proven track record building...
Senior
Performance
TrueFoundry
San Francisco, CA
8 hours ago
Senior GPU ML Infra Engineer — Mid-Training & Inference
...San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on experience with modern...
Senior
Performance
Reflection AI
San Francisco, CA
3 days ago
Senior ML Performance Engineer
...Position: Senior ML Performance Engineer Location: SF Bay Area (US) or Toronto (Canada... ...compiler optimization on modern GPU architectures. This role... ...performance testing platform for LLM inference workloads across GPU clusters Define benchmarking methodologies, metrics, and...
Senior
Performance
Full time
Amadeus Search
San Francisco, CA
2 days ago
Senior ML Systems Engineer, LLM Infra & AI Ops
TRM Labs is looking for a Senior or Staff ML Systems Engineer to focus on building and scaling the technical... ...versioning to ensure compliance and performance. Ideal candidates will have strong Python... ..., and experience deploying LLM workflows. Join us at TRM Labs to help...
Senior
Performance
TRM Labs
San Francisco, CA
3 days ago
Senior / Staff ML Onboard Optimization Engineer
$141k - $249k
...with autonomy and algorithm engineers to scale safe self-driving systems... ...on the truck. - Create and benchmark new CUDA kernels for... ...runtime and memory to pinpoint performance bottlenecks. Qualifications... ...Skilled in profiling CPU and GPU code using tools such as...
Senior
Performance
Work at office
Work from home
Flexible hours
Waabi
San Francisco, CA
26 days ago
Senior AI/ML Infra & SRE Engineer
...Senior Infrastructure Engineer – Bland As a Senior Infrastructure Engineer... .... Lead – AI/ML Stack Infrastructure... ...technology refresh and benchmark proprietary tools against... ...AI/ML workloads with GPU support, implementing... ...monitoring for model performance and drift. Responsibilities...
Senior
Performance
Temporary work
AI Chopping Block, Inc.
San Francisco, CA
2 days ago
Senior ML Engineer Whole-Body Control & Simulation
$200k - $350k
...train whole-body policies, build simulation environments, and run GPU training experiments. Ideal candidates should have strong... ...compensation range of $200K to $350K, and you’ll work with a small, elite team in a dynamic, high-performance environment. ...
Senior
Performance
Pantera Capital
San Francisco, CA
2 days ago
Senior AI/ML Engineer
...building production-grade ML infrastructure used... ...are looking for a Senior AI/ML Engineer to own model training... ...deployment Own model performance, latency, and cost... ...harnesses and offline benchmarks for fast iteration... ...distributed training, GPU optimization, or inference...
Senior
Performance
Full time
Clera
San Francisco, CA
8 days ago
Senior Engineer 2: GPU Kernel and Performance
$167.2k - $209k
...DigitalOcean is seeking a Senior Engineer 2 to play a key... ...the industry-leading performance for our inference services... ...strategy for benchmarking and performance optimizations... ...inference engine and GPU kernel layers, ensuring... ...familiarity with the Gen AI (LLM, VLM, LMM) landscape,...
Senior
Performance
Local area
Remote work
Worldwide
Flexible hours
DigitalOcean
San Francisco, CA
4 days ago
Gentoro | Senior ML Engineer
...Role We are looking for a visionary Senior ML Engineer who will bridge the gap between high-... ...agent reasoning paths, tool usage, and performance in real-time Develop and enforce technical... ..., specifically training or fine-tuning LLM models, embeddings; building clustering...
Senior
Performance
Shift work
Palm Venture Studios
San Francisco, CA
4 days ago
Senior ML Inference Engineer Production Systems
...MakerMaker.AI is looking for a Senior Machine Learning Systems Engineer in San Francisco. In this role, you will... ...inference systems, optimizing for performance and reliability. The ideal candidate... ..., and have strong knowledge in GPU-accelerated inference. Excellent communication...
Senior
Performance
MakerMaker.AI
San Francisco, CA
3 days ago
Sr. Machine Learning Engineer (LLM)
...for a Sr. MLE with AI/ML expertise to build cutting... ...of software engineering and applied AI, turning... ...test, and improve AI performance Turn the latest advancements... ..., leveraging modern LLM's (strong plus for exp... ...in our portfolio. Seniority level ~ Seniority...
Senior
Performance
Full time
Immediate start
Greylock Partners
San Francisco, CA
2 days ago
Senior ML Engineer
...Highlight AI We're a small, senior team building the intelligent... ...We're hiring a Senior ML Engineer to help build the AI systems... ...measure and improve ML system performance Investigate alternative models... ...engineering org Stay current on LLM advances, retrieval...
Senior
Performance
Work at office
Relocation
Relocation package
Flexible hours
Highlight AI
San Francisco, CA
2 days ago
Senior ML Compiler Engineer
$128.7k - $261.3k
...export, kernel development, and performance engineering so that every cycle on our... ...deep compiler, systems, and GPU engineers who enjoy working on... ...automated driving. The Role As a Senior Compiler Engineer on the AI... ...reliable, and effortless for ML engineers across the AV...
Senior
Performance
Local area
Flexible hours
Israelvcforum
San Francisco, CA
2 days ago
Senior Machine Learning Engineer, Voice AI
$200k - $260k
...Senior Machine Learning Engineer, Voice AI San Francisco About the Role... ...looking for a Senior ML Engineer to drive the... ...engines like TRT-LLM and SGLang to optimize... ...frontier. You'll profile GPU utilization, design... ...Optimize inference performance for voice models (STT...
Senior
Performance
Full time
Together AI
San Francisco, CA
4 days ago
ML Infra Engineer Scale AI (SF On-site)
$250k - $350k
...them actually work. We’re hiring ML Infrastructure Engineers to tackle a hard, real-world... ..., and AI. This isn’t clean benchmark data. It’s messy, continuous,... ...inference systems for multimodal / LLM-based models GPU infrastructure and performance optimisation Hybrid...
Performance
Trades Workforce Solutions
San Francisco, CA
3 days ago
Senior Machine Learning Engineer - VLM/LLM Evaluation
$204k - $259k
...Senior Machine Learning Engineer – VLM/LLM Evaluation Waymo is an autonomous driving technology... ...evaluation systems and benchmarks for Waymo Foundation... ...experience Experience in ML engineering and applied... ...location or, if the role can be performed remote, the specific...
Senior
Full time
Temporary work
Remote work
Waymo
San Francisco, CA
1 day ago
Senior Machine Learning Engineer, Animation Integration
$180k - $270k
...have an AI persona. Senior Machine Learning Engineer to join our Avatar Technology... ...owning the applied ML work required to make... ...ML-driven animation performs reliably and at high... ...with Behavior and LLM teams to integrate predictive... ...across CPU, GPU, and memory constraints...
Senior
Performance
Full time
Work experience placement
Work at office
Cerebras
San Francisco, CA
3 days ago
Senior Machine Learning Engineer
...Valley, a small team of engineers is working on what could... ...wear many hats (building ML platforms, MLOps tools, data/LLM infrastructure). You... ...paced environment. As a Senior ML Engineer, you will lead... ...observability, and lead performance benchmarking. You’re comfortable...
Senior
Performance
Work at office
Flexible hours
2 days per week
3 days per week
Sailplane
San Francisco, CA
2 days ago
Senior Machine Learning Engineer
$161.93k - $227.33k
...Senior Machine Learning Engineer Brisbane, California At Freenome,... ...machine learning (AI/ML) systems in a cloud... ...efficient training, and performing model optimizations.... ..., optimization, and benchmarking. Implement efficient... ...data. Experience GPU/Accelerator...
Senior
Performance
Work at office
Local area
Remote work
2 days per week
3 days per week
Freenome
Brisbane, CA
4 days ago
Senior Machine Learning Engineer, Public Sector
$240.45k - $300.3k
...The goal of a Senior Machine Learning Engineer at Scale is to leverage techniques... ...vision. On the LLM side, we are... ...evaluation tools to benchmark and refine agent behavior... ...while preserving core performance characteristics... ...identify and prototype ML-driven product enhancements...
Senior
Performance
Full time
Scale AI
San Francisco, CA
9 days ago
ML Infra Engineer: Scale GPU Compute & Models
$100k - $200k
...Voiceflow is seeking a skilled ML-Infrastructure Engineer in San Francisco to architect and operate auto-scaling systems... ...platform. The role includes optimizing GPU and compute infrastructure, ensuring high performance and reliability. Ideal candidates have hands-on...
Performance
Work at office
Voiceflow
San Francisco, CA
2 days ago
Machine Learning Engineer, Speech LLM Training - San Francisco
$180k - $270k
...the intersection of research and engineering, eager to design novel sequence... ...distributed training runs, managing GPU memory utilization, and resolving complex performance bottlenecks. Thrive in a fast‑... ...(e.g., vLLM, TensorRT‑LLM, SGLang) to minimize latency for...
Performance
Full time
Work at office
Plaud
San Francisco, CA
2 days ago
Senior/Staff ML Engineer, Performance Optimization
...and bleeding-edge part of our engine. You'll be working on making AI... ...PyTorch code that pushes performance boundaries You love diving deep... ...You think the current state of ML deployment could be way better... ...you've worked with diffusion/LLM models before or built custom...
Senior
Performance
Comfy
San Francisco, CA
2 days ago
LLM/ML Engineer (Inference)
...memory management, networking, storage, performance, and scale. You're experienced with modern... ...inference systems like TGI, vLLM, TensorRT-LLM, and Optimum, and comfortable creating... ...source contributions and staying current with ML infrastructure developments Bring...
Performance
Work at office
Reducto, Inc.
San Francisco, CA
2 days ago
Senior ML Engineer, Autonomous Driving Evaluation
...in Python and standard ML frameworks (e.g., JAX,... ...leadership, influencing senior stakeholders, and driving... ...and software engineers who are passionate about... ...driver to improve the performance of our technology stack... ...models and Generative AI (LLM/VLM) solutions. These solutions...
Senior
Performance
Waymo
San Francisco, CA
1 day ago
Machine Learning Engineer, Inference & Serving (Speech LLM) - San Francisco
$180k - $270k
...elevate productivity and performance through note-taking... ...-low-latency inference engines for large language models... ...deep understanding of GPU architectures (NVIDIA Ampere... ...between the core ML training team and the backend... ...with modern LLM serving frameworks like...
Performance
Full time
Work at office
Worldwide
Plaud
San Francisco, CA
3 days ago
ML Ops Engineer Agentic AI Lab (Founding Team)
...About the Role ML Ops Engineer — Agentic AI Lab (Founding Team... ...orchestration, GPU infrastructure, fine-tuned... ...automated pipelines for: LLM fine-tuning, SFT, LoRA... ...manage evaluation and benchmarking frameworks (e.g.... ...latency, token usage, performance metrics, error tracing...
Performance
Full time
Fabrion
San Francisco, CA
6 days ago
Senior ML Training Systems Engineer - Distributed GPU Infra
...company in San Francisco is looking for a Senior Software Engineer to build scalable infrastructure for... ...distributed training systems and optimize GPU utilization while collaborating with... ...candidates have over 5 years of experience in ML infrastructure and a strong background...
Senior
BaseTen
San Francisco, CA
2 days ago
