Member of Technical Staff - Inference

$180k

xAI

Palo Alto, CA

About xAI

xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills and to share knowledge with their teammates concisely and accurately.

About The Role
  • We are building the high-performance inference platform that serves Grok to millions of users every day with lightning speed and perfect reliability.
  • As a Member of Technical Staff - Inference, you will design and optimize large-scale model serving systems end-to-end. You will own everything from distributed infrastructure (global KV cache, continuous batching, load balancing, auto-scaling) to deep low-level optimizations (GPU kernels, quantization, speculative decoding, tail latency).
  • This is a high-impact role where your work directly determines how fast and reliably users interact with Grok at massive scale.

Responsibilities

  • Architect and implement scalable distributed infrastructure for model serving (load balancing, auto-scaling, batch scheduling, global KV cache).
  • Optimize latency and throughput of model inference under real production workloads.
  • Build reliable, high-concurrency serving systems that serve billions of users with 100% uptime, 0% error rate, and excellent tail latency.
  • Benchmark, fine-tune, and accelerate inference engines (including low-level GPU kernel work and code generation).
  • Develop custom tools to trace, replay, and fix issues across the full stack — from orchestration down to GPU kernels.
  • Create robust CI/CD infrastructure for seamless endpoint deployment, image publishing, and inference engine updates.
  • Accelerate research on scaling test-time compute, RL rollout, and model-hardware co-design for next-generation systems.

Basic Qualifications
  • Deep low-level systems programming (C/C++ or Rust).
  • Experience with large-scale, high-concurrency production serving.
  • Experience with GPU inference engines (vLLM, SGLang, Triton, TensorRT-LLM, etc.).
  • Strong background in system optimizations: batching, caching, load balancing, parallelism.
  • Low-level inference optimizations: GPU kernels, code generation.
  • Algorithmic inference optimizations: quantization, speculative decoding, distillation, low-precision numerics.
  • Experience with testing, benchmarking, and reliability of inference services.
  • Experience designing and implementing CI/CD infrastructure for inference.

Compensation And Benefits

$180,000 - $440,000 USD

Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.

Vacancy posted 3 days ago