Member of Technical Staff - Inference
Sail Research
Optimize token processing down to the lowest layers of the stack. You'll optimize kernel performance, develop new scheduling and parallelism strategies, and help us squeeze every FLOP out of our hardware.

What you’ll do
- Modify and extend state-of-the-art inference engines like vLLM and SGLang.
- Understand every microsecond of GPU time spent during a forward pass. You'll be able to explain every kernel launch on an NSys profile.
- Design and implement exotic parallelism schemes to work with "interesting" hardware topologies.
- Write custom GPU kernels to excel in specific regimes, such as cascade attention.

What we’re looking for
- Strong understanding of LLM mechanics, such as the KV cache, mixture-of-experts, and the prefill vs. decode phases.
- Interest in MLSys research: great ideas like speculative decoding and sparse attention come from research that we need to follow closely.
- Familiarity with modern, tile-based GPU programming (e.g. Triton, CUTLASS, ThunderKittens), or an interest in learning these!

Benefits
- Meals are provided.
- Every employee receives a Studio Display.
- About the Role As a Member of Technical Staff, Inference at Radical Numerics, you will build and optimize the systems that bring frontier biological AI models into production. Your work will focus on delivering state-of-the-art inference performance for large-scale genome... (Local area)
- Member of Technical Staff - ML Systems & Inference Employment Type: Full-time Workplace: On-site About the Company We are building the execution layer for the next generation of AI infrastructure. As AI workloads scale and hardware architectures diversify, the bottleneck... (Full time)
$150k - $300k
...position spanning cloud LLM serving, LLM inference optimization and RL systems. You will be... ...into our RL training stack. Core Technical Responsibilities LLM Serving Multi‑tenant... ...in open development and encourage team members to contribute to the broader AI community... (Work at office, Remote work, Visa sponsorship, Relocation package, Flexible hours, Shift work)
- ...power real production workloads built to scale to gigawatt-class AI datacenters. Gimlet Labs is seeking a Member of Technical Staff focused on ML systems and inference. In this role, you will design and build the inference systems that execute full models end-to-end under...
$350k
...engineers from Anthropic, Google DeepMind, xAI, OpenAI, Microsoft, Apple, and MIT. The Role We are looking for an engineer to own the inference systems that power our models in production and research. You'll work across the full inference stack, from serving infrastructure...
$225k
About the role As a Software Engineer on the Inference & RL Systems team, you will design and operate the distributed systems that serve our models in production and power large-scale post-training workflows. This role sits at the boundary between model execution and distributed... (Relocation, Visa sponsorship)
- ...optimizations for model serving, such as batching, caching, load balancing, and parallelism, Worked on low-level optimizations for inference, such as GPU kernels and code generation, Worked on algorithmic optimizations for inference, such as quantization, distillation,...
- ...Member Of Technical Staff We're looking for a member of technical staff to build and deploy production-grade AI systems. In this role, you... ...world applications Design scalable pipelines for training, inference, and data processing Improve latency, throughput, cost...
- ...exceptional people to help us get there. The Opportunity Our Edge Inference team compiles Liquid Foundation Models into optimized machine... ...on-device AI possible. You will work directly with the technical lead on problems that require deep understanding of both ML architectures...
$150k - $280k
...Member of Technical Staff (Backend) San Francisco, CA Compensation: $150,000 – $280,000 + Competitive Equity Type: Full-Time Visa... ...millions of transactions on AWS, including: - Distributed inference - Caching - Queue orchestration - Self-healing... (Full time, Temporary work, H1b, Work at office, Visa sponsorship, Relocation package)
- ...uses Shapes every single day, and everyone talks to users. Member of Technical Staff is the title we use for engineers who own hard problems... ...have experience with LLM training, fine-tuning, evaluation, inference, or RAG at scale High-performance Python backends at scale...
$200k
...pre-training, domain-specific RL, ultra-long context, and inference-time compute to achieve this goal. About The Role Evals builds... ...of many of the company's most important decisions. As a Member of Technical Staff on Evals, you will build both the platform and the... (Visa sponsorship, Relocation package)
- Pixeltable Inc. Member of Technical Staff San Francisco, CA · Full time As a founding member of the engineering... ...ingestion, transformation, training/fine-tuning, and inference? You will also: Find opportunities to go deep into a wide... (Full time, Part time, Work at office, Work from home, Flexible hours, 2 days per week)
- Member of Technical Staff - Kernels & GPU Performance Employment Type: Full-time Workplace: On-site About the Company We are building the execution... ...behavior, and execution characteristics across the inference stack Partner with compiler, runtime, and distributed systems... (Full time)
$170k - $220k
Member of Technical Staff - Infrastructure & LLMs Location: San Francisco, CA (Hybrid) Compensation: $170,000 - $220,000 base + 1-3% equity... ...join a lean, high-performance team building next-generation inference infrastructure for LLMs. This is an opportunity to own the... (Full time, Temporary work, Immediate start, Visa sponsorship, Work visa)
- ...companies running some of the most demanding inference workloads in the world. About the Role... ...early hire changes the company. As an early member of the engineering team, you will help define the systems, standards, and technical culture behind a new class of AI...
- ...recognize parts of inputs that are unimportant, reducing inference costs for scale-ups and enterprises that integrate LLMs into... ...team is 5 people with a research and product focus. As a Member of Technical Staff on our infrastructure team, you'll own the cloud systems... (Visa sponsorship)
$250k
...leaves their servers. The team is small, technical, and moving fast, with strong early... ...· Industry: AI Tools. The Role Member of Technical Staff who can handle everything from modeling... ...Design scalable pipelines for training, inference, and data processing Improve latency,... (Full time)
- ...pointing ours at the frontier of science. Role Overview As a Member of Technical Staff you will shape Conductor's core offerings: AI software... ...Build back‑end services for data collection, labelling, and inference. Integrate with external systems for secure, reliable...
$300k
Member of Technical Staff - RL Infrastructure About V max V max is an applied research lab developing AI capable of open-ended learning. We are... ...at scale: distributed rollouts, training orchestration, inference, evals, data pipelines, observability, and reliability. You... (Work at office, Local area)
- Founding Member of Technical Staff, AI Infrastructure Location: San Francisco / Bay Area preferred. Remote exceptional for the right person... ...make AI workloads cheaper and easier to own by turning inference behavior, traces, workload replay, GPU signals, and task-path... (Full time, Remote work)
- ...design and the responsibility to defend. About the Role As a Member of Technical Staff focused on statistical genetics, you will help us turn... ...with colocalization, Mendelian randomization, TWAS, causal inference, cross-ancestry genetics, admixed populations, or privacy-... (Local area)
- ...contributions to developer tools or AI/ML repositories (Desirable) Inference & Hardware Knowledge: Interest in the hardware side of AI—... ...end‑to‑end What the job involves We are seeking a Member of Technical Staff, Evals & Post‑Training Product to help define how...
- About the Role As a Member of Technical Staff, AI Supercomputing at Radical Numerics, you will design, build, and operate the GPU supercomputing environment that powers our large-scale training and inference. You will deliver high-performance, reliable, and cost-efficient... (Local area)
$150k - $300k
...infrastructure that runs the jobs. Core Technical Responsibilities Hosted Training... ...and operate Kubernetes-based training and inference orchestration across multi-cluster, multi... ...in open development and encourage team members to contribute to the broader AI community... (Work at office, Local area, Remote work, Visa sponsorship, Relocation package, Flexible hours)
- The opportunity We are looking for a Member of Technical Staff with deep expertise in generative modelling to work at the interface between our... ...of generative model architectures, training dynamics and inference behaviour. You are a skilful ML developer. You write ML code... (Flexible hours)
- ...design and the responsibility to defend. About the Role As a Member of Technical Staff, Infrastructure & Training Systems at Radical Numerics,... ...only strong research ideas, but exceptional training and inference systems: infrastructure that makes large-scale experimentation... (Local area)
$150k
...pioneers to lead key initiatives in robotic intelligence. As a Member of Technical Staff, you'll spearhead the development of breakthrough... ...end‑to‑end vision‑language‑action models, efficient model inference, and video tokenization Design and implement novel deep learning... (Local area)
- Member of Technical Staff, ML Systems Mirendil Mirendil is a tech-first company focused on solving core bottlenecks that unlock step-change acceleration... ...on (not limited to): Building and scaling training and inference infrastructure (potentially for various chips across...
- Member of Technical Staff - Post‑Training Join to apply for the Member of Technical Staff - Post‑Training role at Reflection AI. Our Mission... ...pipelines, reward models, reinforcement learning algorithms, and inference‑time scaling techniques. Collaborate across pre‑training... (Full time, Relocation package)

