
Senior ML Performance Engineer: LLM Benchmarking & GPU

Amadeus Search

A leading AI infrastructure company is seeking a Senior ML Performance Engineer to design a comprehensive performance testing platform for large language models. This role requires a minimum of 7 years in performance engineering and strong experience with GPU programming and ML inference workloads. Candidates should have expertise in Python and C/C++. The position offers competitive compensation, equity, and wellness benefits in a hybrid work environment.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Senior ML Performance Engineer: LLM Benchmarking & GPU in San Francisco, CA vacancy
  • $128.7k - $261.3k

     ...export, kernel development, and performance engineering so that every cycle on our...  ...builds high‑performance GPU kernels and custom libraries...  ...at the heart of on‑vehicle ML inference for ADAS and autonomous...  .... Hands‑on experience benchmarking, profiling, debugging and optimizing... 
    Senior
    Performance
    Local area
    Flexible hours

    Israelvcforum

    San Francisco, CA
    2 days ago
  •  ...Senior AI/ML Engineer — LLM & Agent Stack Every production AI system, whether it's powering customer...  .... Build and improve tracing, benchmarking and observability for LLMs and agents...  ...orchestration, service meshes, and performance tuning. ~ Proven track record building... 
    Senior
    Performance

    TrueFoundry

    San Francisco, CA
    8 hours ago
  •  ...San Francisco is seeking a specialist to design and operate large-scale GPU infrastructure. This role requires expertise in deploying GPU systems for high-throughput inference and model performance optimization. The ideal candidate will have hands-on experience with modern... 
    Senior
    Performance

    Reflection AI

    San Francisco, CA
    3 days ago
  •  ...Position: Senior ML Performance Engineer Location: SF Bay Area (US) or Toronto (Canada...  ...compiler optimization on modern GPU architectures. This role...  ...performance testing platform for LLM inference workloads across GPU clusters Define benchmarking methodologies, metrics, and... 
    Senior
    Performance
    Full time

    Amadeus Search

    San Francisco, CA
    2 days ago
  • TRM Labs is looking for a Senior or Staff ML Systems Engineer to focus on building and scaling the technical...  ...versioning to ensure compliance and performance. Ideal candidates will have strong Python...  ..., and experience deploying LLM workflows. Join us at TRM Labs to help... 
    Senior
    Performance

    TRM Labs

    San Francisco, CA
    3 days ago
  • $141k - $249k

     ...with autonomy and algorithm engineers to scale safe self-driving systems...  ...on the truck. - Create and benchmark new CUDA kernels for...  ...runtime and memory to pinpoint performance bottlenecks.   Qualifications...  ...Skilled in profiling CPU and GPU code using tools such as... 
    Senior
    Performance
    Work at office
    Work from home
    Flexible hours

    Waabi

    San Francisco, CA
    26 days ago
  •  ...Senior Infrastructure Engineer – Bland As a Senior Infrastructure Engineer...  .... Lead – AI/ML Stack Infrastructure...  ...technology refresh and benchmark proprietary tools against...  ...AI/ML workloads with GPU support, implementing...  ...monitoring for model performance and drift. Responsibilities... 
    Senior
    Performance
    Temporary work

    AI Chopping Block, Inc.

    San Francisco, CA
    2 days ago
  • $200k - $350k

     ...train whole-body policies, build simulation environments, and run GPU training experiments. Ideal candidates should have strong...  ...compensation range of $200K to $350K, and you’ll work with a small, elite team in a dynamic, high-performance environment. ... 
    Senior
    Performance

    Pantera Capital

    San Francisco, CA
    2 days ago
  •  ...building production-grade ML infrastructure used...  ...are looking for a Senior AI/ML Engineer to own model training...  ...deployment Own model performance, latency, and cost...  ...harnesses and offline benchmarks for fast iteration...  ...distributed training, GPU optimization, or inference... 
    Senior
    Performance
    Full time

    Clera

    San Francisco, CA
    8 days ago
  • $167.2k - $209k

     ...DigitalOcean is seeking a Senior Engineer 2 to play a key...  ...the industry-leading performance for our inference services...  ...strategy for benchmarking and performance optimizations...  ...inference engine and GPU kernel layers, ensuring...  ...familiarity with the Gen AI (LLM, VLM, LMM) landscape,... 
    Senior
    Performance
    Local area
    Remote work
    Worldwide
    Flexible hours

    DigitalOcean

    San Francisco, CA
    4 days ago
  •  ...Role We are looking for a visionary Senior ML Engineer who will bridge the gap between high-...  ...agent reasoning paths, tool usage, and performance in real-time Develop and enforce technical...  ..., specifically training or fine-tuning LLM models, embeddings; building clustering... 
    Senior
    Performance
    Shift work

    Palm Venture Studios

    San Francisco, CA
    4 days ago
  •  ...MakerMaker.AI is looking for a Senior Machine Learning Systems Engineer in San Francisco. In this role, you will...  ...inference systems, optimizing for performance and reliability. The ideal candidate...  ..., and have strong knowledge in GPU-accelerated inference. Excellent communication... 
    Senior
    Performance

    MakerMaker.AI

    San Francisco, CA
    3 days ago
  •  ...for a Sr. MLE with AI/ML expertise to build cutting...  ...of software engineering and applied AI, turning...  ...test, and improve AI performance Turn the latest advancements...  ..., leveraging modern LLM's (strong plus for exp...  ...in our portfolio. Seniority level ~ Seniority... 
    Senior
    Performance
    Full time
    Immediate start

    Greylock Partners

    San Francisco, CA
    2 days ago
  •  ...Highlight AI We're a small, senior team building the intelligent...  ...We're hiring a Senior ML Engineer to help build the AI systems...  ...measure and improve ML system performance Investigate alternative models...  ...engineering org Stay current on LLM advances, retrieval... 
    Senior
    Performance
    Work at office
    Relocation
    Relocation package
    Flexible hours

    Highlight AI

    San Francisco, CA
    2 days ago
  • $128.7k - $261.3k

     ...export, kernel development, and performance engineering so that every cycle on our...  ...deep compiler, systems, and GPU engineers who enjoy working on...  ...automated driving. The Role As a Senior Compiler Engineer on the AI...  ...reliable, and effortless for ML engineers across the AV... 
    Senior
    Performance
    Local area
    Flexible hours

    Israelvcforum

    San Francisco, CA
    2 days ago
  • $200k - $260k

     ...Senior Machine Learning Engineer, Voice AI San Francisco About the Role...  ...looking for a Senior ML Engineer to drive the...  ...engines like TRT-LLM and SGLang to optimize...  ...frontier. You'll profile GPU utilization, design...  ...Optimize inference performance for voice models (STT... 
    Senior
    Performance
    Full time

    Together AI

    San Francisco, CA
    4 days ago
  • $250k - $350k

     ...them actually work. We’re hiring ML Infrastructure Engineers to tackle a hard, real-world...  ..., and AI. This isn’t clean benchmark data. It’s messy, continuous,...  ...inference systems for multimodal / LLM-based models GPU infrastructure and performance optimisation Hybrid... 
    Performance

    Trades Workforce Solutions

    San Francisco, CA
    3 days ago
  • $204k - $259k

     ...Senior Machine Learning Engineer – VLM/LLM Evaluation Waymo is an autonomous driving technology...  ...evaluation systems and benchmarks for Waymo Foundation...  ...experience Experience in ML engineering and applied...  ...location or, if the role can be performed remote, the specific... 
    Senior
    Full time
    Temporary work
    Remote work

    Waymo

    San Francisco, CA
    1 day ago
  • $180k - $270k

     ...have an AI persona. Senior Machine Learning Engineer to join our Avatar Technology...  ...owning the applied ML work required to make...  ...ML-driven animation performs reliably and at high...  ...with Behavior and LLM teams to integrate predictive...  ...across CPU, GPU, and memory constraints... 
    Senior
    Performance
    Full time
    Work experience placement
    Work at office

    Cerebras

    San Francisco, CA
    3 days ago
  •  ...Valley, a small team of engineers is working on what could...  ...wear many hats (building ML platforms, MLOps tools, data/LLM infrastructure). You...  ...paced environment. As a Senior ML Engineer, you will lead...  ...observability, and lead performance benchmarking. You’re comfortable... 
    Senior
    Performance
    Work at office
    Flexible hours
    2 days per week
    3 days per week

    Sailplane

    San Francisco, CA
    2 days ago
  • $161.93k - $227.33k

     ...Senior Machine Learning Engineer Brisbane, California At Freenome,...  ...machine learning (AI/ML) systems in a cloud...  ...efficient training, and performing model optimizations....  ..., optimization, and benchmarking. Implement efficient...  ...data. Experience GPU/Accelerator... 
    Senior
    Performance
    Work at office
    Local area
    Remote work
    2 days per week
    3 days per week

    Freenome

    Brisbane, CA
    4 days ago
  • $240.45k - $300.3k

     ...The goal of a Senior Machine Learning Engineer at Scale is to leverage techniques...  ...vision. On the LLM side, we are...  ...evaluation tools to benchmark and refine agent behavior...  ...while preserving core performance characteristics...  ...identify and prototype ML-driven product enhancements... 
    Senior
    Performance
    Full time

    Scale AI

    San Francisco, CA
    9 days ago
  • $100k - $200k

     ...Voiceflow is seeking a skilled ML-Infrastructure Engineer in San Francisco to architect and operate auto-scaling systems...  ...platform. The role includes optimizing GPU and compute infrastructure, ensuring high performance and reliability. Ideal candidates have hands-on... 
    Performance
    Work at office

    Voiceflow

    San Francisco, CA
    2 days ago
  • $180k - $270k

     ...the intersection of research and engineering, eager to design novel sequence...  ...distributed training runs, managing GPU memory utilization, and resolving complex performance bottlenecks. Thrive in a fast‑...  ...(e.g., vLLM, TensorRT‑LLM, SGLang) to minimize latency for... 
    Performance
    Full time
    Work at office

    Plaud

    San Francisco, CA
    2 days ago
  •  ...and bleeding-edge part of our engine. You'll be working on making AI...  ...PyTorch code that pushes performance boundaries You love diving deep...  ...You think the current state of ML deployment could be way better...  ...you've worked with diffusion/LLM models before or built custom... 
    Senior
    Performance

    Comfy

    San Francisco, CA
    2 days ago
  •  ...memory management, networking, storage, performance, and scale. You're experienced with modern...  ...inference systems like TGI, vLLM, TensorRT-LLM, and Optimum, and comfortable creating...  ...source contributions and staying current with ML infrastructure developments Bring... 
    Performance
    Work at office

    Reducto, Inc.

    San Francisco, CA
    2 days ago
  •  ...in Python and standard ML frameworks (e.g., JAX,...  ...leadership, influencing senior stakeholders, and driving...  ...and software engineers who are passionate about...  ...driver to improve the performance of our technology stack...  ...models and Generative AI (LLM/VLM) solutions. These solutions... 
    Senior
    Performance

    Waymo

    San Francisco, CA
    1 day ago
  • $180k - $270k

     ...elevate productivity and performance through note-taking...  ...-low-latency inference engines for large language models...  ...deep understanding of GPU architectures (NVIDIA Ampere...  ...between the core ML training team and the backend...  ...with modern LLM serving frameworks like... 
    Performance
    Full time
    Work at office
    Worldwide

    Plaud

    San Francisco, CA
    3 days ago
  •  ...About the Role ML Ops Engineer — Agentic AI Lab (Founding Team...  ...orchestration, GPU infrastructure, fine-tuned...  ...automated pipelines for: LLM fine-tuning, SFT, LoRA...  ...manage evaluation and benchmarking frameworks (e.g....  ...latency, token usage, performance metrics, error tracing... 
    Performance
    Full time

    Fabrion

    San Francisco, CA
    6 days ago
  •  ...company in San Francisco is looking for a Senior Software Engineer to build scalable infrastructure for...  ...distributed training systems and optimize GPU utilization while collaborating with...  ...candidates have over 5 years of experience in ML infrastructure and a strong background... 
    Senior

    BaseTen

    San Francisco, CA
    2 days ago
