Senior ML Performance Engineer

Amadeus Search

Position: Senior ML Performance Engineer
Location: SF Bay Area (US) or Toronto (Canada) - Hybrid
Employment Type: Full-Time
Industry: AI Infrastructure / Compiler Systems

Overview

A venture-backed AI infrastructure company is building a high-performance, portable compiler designed to let developers "build once, deploy anywhere." This includes cloud, edge, and hybrid environments - all optimized for resource efficiency, scalability, and sustainable AI development.

The team is looking for a Senior ML Performance Engineer to architect and lead a Performance Testing Platform from the ground up, measuring and optimizing the performance of large language models (LLMs) before and after compiler optimization on modern GPU architectures.

This role sits at the intersection of ML systems, GPU architecture, and performance engineering, with high visibility into product quality and customer impact.
Key Responsibilities
  • Design and implement a comprehensive performance testing platform for LLM inference workloads across GPU clusters
  • Define benchmarking methodologies, metrics, and test suites (latency, throughput, memory utilization, power consumption, and model accuracy)
  • Establish baseline performance for unoptimized models and validate post-optimization improvements
  • Build automated pipelines for continuous performance validation across compiler releases and model updates
  • Investigate performance bottlenecks using GPU profilers and system-level monitoring
  • Collaborate with compiler engineers, ML engineers, and DevOps to integrate performance testing into development workflows
  • Create dashboards and reporting to track performance trends, regressions, and wins
  • Document best practices for GPU-based ML performance testing
Required Qualifications
  • 7+ years in performance engineering, benchmarking, or systems engineering roles
  • Strong knowledge of ML inference workloads, particularly transformer-based LLMs
  • Hands-on GPU programming and optimization experience (CUDA, ROCm, or similar)
  • Strong programming skills in Python and C/C++
  • Proven experience building performance testing infrastructure or benchmarking platforms from scratch
  • Experience with ML frameworks: PyTorch, TensorFlow, ONNX Runtime, vLLM, TensorRT-LLM
  • Proficiency with profiling and debugging GPU workloads
  • Experience with CI/CD systems and test automation frameworks
  • Strong analytical skills with the ability to design experiments, analyze results, and communicate findings clearly
Nice to Have
  • AMD GPU experience (MI200/MI300) and ROCm ecosystem
  • Compiler optimization knowledge
  • Distributed inference and multi-GPU workloads
  • ML model quantization, pruning, and optimization techniques
  • High-performance computing or systems-level optimization
  • Infrastructure-as-code experience: Kubernetes, Docker, Terraform
  • Contributions to open-source ML or systems projects
Personal Attributes
  • Detail-oriented - able to spot subtle regressions
  • Self-driven and accountable
  • Collaborative and team-oriented
  • Passionate about sustainable AI
  • Clear and effective communicator
Compensation & Benefits
  • Competitive salary, dependent on experience and location
  • Equity and bonus opportunities
  • Medical, dental, and vision coverage
  • Retirement savings plan
  • Additional wellness benefits
Why This Role Is Unique
  • Build the infrastructure that validates high-performance ML models
  • Influence core product quality and customer outcomes
  • Work in a highly technical, high-impact environment at the forefront of AI systems
  • Collaborate across a globally distributed team
Vacancy posted 2 days ago