Member of Technical Staff, AMD GPU Performance Engineering

$200k - $400k
Full-time

Inferact

Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. Founded by the creators and core maintainers of vLLM, we sit at the intersection of models and hardware, a position that took years to build.

About the Role

We're looking for an AMD GPU performance engineer to make vLLM a first-class inference engine across the AMD accelerator ecosystem. You'll build and optimize AMD GPU backends, kernels, runtime paths, and benchmarking infrastructure using ROCm, HIP, Triton, CK, AITER, and related tooling so vLLM can deliver frontier inference performance on AMD GPUs.

You'll work at the boundary of inference systems, kernels, compilers, and hardware architecture, improving performance-critical paths such as attention, GEMM, sampling, KV cache, and communication-heavy operations. Your work will help make AMD GPU support in vLLM usable, fast, benchmarked, and maintainable.

Skills and Qualifications

Minimum qualifications:
  • Bachelor's degree or equivalent experience in computer science, engineering, systems, machine learning, or similar.
  • Hands-on experience optimizing AMD GPU workloads using ROCm, HIP, Triton, CK, AITER, or similar AMD ecosystem tools.
  • Deep understanding of AMD GPU execution, memory behavior, toolchains, kernel performance, and backend-specific performance constraints.
  • Experience optimizing ML kernels or inference paths such as attention, GEMM, sampling, KV cache, fused kernels, or communication-heavy runtime paths.
  • Strong performance profiling and benchmarking skills, with the ability to use measurements, hardware counters, correctness tests, and reproducible benchmarks to guide optimization work.

Preferred qualifications:
  • Experience with vLLM, SGLang, TensorRT-LLM, ROCm-based serving, or other LLM inference systems.
  • Familiarity with batching, KV cache, decoding, serving tradeoffs, and backend performance constraints in production inference systems.
  • Experience with compiler and kernel technologies such as Triton, MLIR, LLVM, CK, AITER, HIP, or other kernel DSLs and backend libraries.
  • Knowledge of quantization methods such as INT8, FP8, mixed precision, or AMD hardware-specific numeric formats, including accuracy and performance tradeoffs.

Bonus points if you have:
  • Contributed to vLLM, ROCm, HIP, Triton, CK, AITER, PyTorch, compiler projects, or other open-source ML infrastructure.
  • Built AMD GPU benchmarking infrastructure or automated performance regression detection for accelerator workloads.
  • Worked directly with AMD, accelerator platform teams, or early-access programs to ship backend, compiler, or inference performance improvements.

Logistics

  • Location: This role is based in San Francisco, California. Will consider remote in the US for exceptional candidates.
  • Compensation: Depending on background, skills, and experience, the expected annual salary range for this position is $200,000 - $400,000 USD + equity.
  • Visa sponsorship: We sponsor visas on a case-by-case basis.
  • Benefits: Inferact offers generous health, dental, and vision benefits as well as 401(k) company match.

Vacancy posted 2 days ago
