
Staff + Senior Software Engineer, Inference Deployment

$320k

United States Digital Space LLC

About the company

The company’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the role

The company serves Claude to millions of users across GPUs, TPUs, and Trainium, and every model update must reach production safely, quickly, and without disrupting service. The Launch Engineering team's mandate is to make inference deployment boring and unattended.

As a Software Engineer on Launch Engineering, you'll design and build the deployment infrastructure that moves inference code from merge to production. This is a resource-constrained optimization problem at its core: validation and deployment consume the same accelerator chips that serve customer traffic, so your deploys compete with live user requests for the same hardware. Every model brings different fleet sizes, startup times, and correctness requirements, and the system must adapt continuously. You'll build systems that navigate these constraints: orchestrating validation, scheduling deployments intelligently, and driving down cycle time from merge to production.

Key responsibilities
  • Own deployment orchestration that continuously moves validated inference builds into production across GPU, TPU, and Trainium fleets, unattended under normal conditions
  • Improve capacity-aware deployment scheduling to maximize deployment throughput against constrained accelerator budgets and variable fleet sizes
  • Extend deployment observability: dashboards and tooling that answer "what code is running in production," "where is my commit," and "what validation passed for this deploy"
  • Drive down cycle time from code merge to production with pipeline architectures that minimize serial dependencies and maximize parallelism
  • Optimize fleet rollout strategies for large-scale deployments across thousands of accelerator chips, minimizing disruption to serving capacity
  • Evolve self-service model onboarding so new models can be added to the continuous deployment pipeline without Launch Engineering involvement
  • Partner across the Inference organization with teams owning validation, autoscaling, and model routing to integrate deployment automation with their systems

Minimum qualifications
  • Strong software engineering skills, including experience designing systems that manage complex state machines and multi-stage pipelines
  • Proficiency with Kubernetes-based deployments, rolling update mechanics, and container orchestration
  • Experience building deployment, release, or delivery infrastructure where resource constraints (fleet capacity, network bandwidth, hardware availability, coordinated rollout windows) shape the design
  • A track record of building automation that measurably improves deployment velocity and reliability
  • Comfort working across the stack, from backend services and databases to CLI tools and web UIs
  • Strong communication skills and the ability to work closely with oncall engineers, model teams, and infrastructure partners

Preferred qualifications
  • 5+ years of experience building deployment, release, or delivery infrastructure at scale
  • Experience with Python and/or Rust in production systems
  • Experience with ML inference or training infrastructure deployment, particularly across multiple accelerator types (GPU, TPU, Trainium)
  • Background in capacity planning or resource-constrained scheduling (e.g., bin-packing, fleet management, job scheduling with hardware affinity)
  • Experience with progressive delivery in systems with long validation cycles: canary/soak testing, blue-green deployments, traffic shifting, automated rollback
  • Experience at companies with large-scale release engineering challenges (mobile release trains, monorepo deployments, multi-datacenter rollouts)

The annual compensation range for this role is listed below. For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.

Annual Salary: $320,000 - $485,000 USD

Logistics
  • Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience
  • Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience
  • Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position
  • Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
  • Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we ret

Vacancy posted 2 days ago
