Member of Technical Staff - Inference

$180k

xAI

Palo Alto, CA

About xAI

xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

About The Role
  • We are building the high-performance inference platform that serves Grok to millions of users every day with lightning speed and perfect reliability.
  • As a Member of Technical Staff - Inference, you will design and optimize large-scale model serving systems end-to-end. You will own everything from distributed infrastructure (global KV cache, continuous batching, load balancing, auto-scaling) to deep low-level optimizations (GPU kernels, quantization, speculative decoding, tail latency).
  • This is a high-impact role where your work directly determines how fast and reliably users interact with Grok at massive scale.

Responsibilities:

  • Architect and implement scalable distributed infrastructure for model serving (load balancing, auto-scaling, batch scheduling, global KV cache).
  • Optimize latency and throughput of model inference under real production workloads.
  • Build reliable, high-concurrency serving systems that serve billions of users with 100% uptime, 0% error rate, and excellent tail latency.
  • Benchmark, fine-tune, and accelerate inference engines (including low-level GPU kernel work and code generation).
  • Develop custom tools to trace, replay, and fix issues across the full stack — from orchestration down to GPU kernels.
  • Create robust CI/CD infrastructure for seamless endpoint deployment, image publishing, and inference engine updates.
  • Accelerate research on scaling test-time compute, RL rollout, and model-hardware co-design for next-generation systems.

Basic Qualifications

  • Deep low-level systems programming (C/C++ or Rust).
  • Experience with large-scale, high-concurrency production serving.
  • Experience with GPU inference engines (vLLM, SGLang, Triton, TensorRT-LLM, etc.).
  • Strong background in system optimizations: batching, caching, load balancing, parallelism.
  • Low-level inference optimizations: GPU kernels, code generation.
  • Algorithmic inference optimizations: quantization, speculative decoding, distillation, low-precision numerics.
  • Experience with testing, benchmarking, and reliability of inference services.
  • Experience designing and implementing CI/CD infrastructure for inference.

Compensation And Benefits

$180,000 - $440,000 USD

Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice.

Vacancy posted 10 hours ago