Staff + Sr. Software Engineer, Cloud Inference
$320k
United States Digital Space LLC
About the Role

The Cloud Inference team scales and optimizes Claude to serve the massive audiences of developers and enterprise companies across AWS, GCP, Azure, and future cloud service providers (CSPs). We own the end‑to‑end product of Claude on each cloud platform, from API integration and intelligent request routing to inference execution, capacity management, and day‑to‑day operations.

Our engineers are extremely high leverage: we simultaneously drive multiple major revenue streams while optimizing one of the company’s most precious resources: compute. As we expand to more cloud platforms, the complexity of managing inference efficiently across providers with different hardware, networking stacks, and operational models grows significantly. We need product‑minded backend engineers who can navigate these platform differences, design the services and abstractions that work across providers, and make architectural decisions that keep us reliable and cost‑effective at massive scale.

Your work will increase the scale at which our services operate, accelerate our ability to reliably launch new frontier models and innovative features to customers across all platforms, and ensure our LLMs meet rigorous safety, performance, and security standards.

What You’ll Do

- Design, build, and own backend services and infrastructure that serve Claude across multiple CSPs, accounting for differences in compute hardware, networking, APIs, and operational models.
- Work cross‑functionally with internal inference, product API, systems, and security teams, among others, and with CSP partners to stand up the full serving stack on new cloud platforms, resolve operational issues, and influence provider roadmaps.
- Build and evolve CI/CD automation systems, including validation and deployment pipelines, that reliably ship new model versions to millions of users across cloud platforms without regressions.
- Design interfaces and tooling abstractions across CSPs that enable cost‑effective inference management, scale across providers, and reduce per‑platform complexity.
- Contribute to capacity planning, autoscaling, and workload routing strategies that match supply with demand and direct requests to the most cost‑effective accelerator and region.
- Analyze observability data across providers to identify performance bottlenecks, cost anomalies, and regressions, and drive remediation based on real‑world production workloads.

You May Be a Good Fit If You

- Have significant software engineering experience, with a strong background in high‑performance, large‑scale distributed systems serving millions of users.
- Have experience building or operating services on at least one major cloud platform (AWS, GCP, or Azure), with exposure to Kubernetes, Infrastructure as Code, or container orchestration.
- Are curious about LLM serving; prior inference or ML experience is not required.
- Thrive in cross‑functional collaboration with both internal teams and external partners.
- Are a fast learner who can quickly ramp up on new technologies, hardware platforms, and provider ecosystems.
- Are highly autonomous and take ownership of problems end‑to‑end, including work that falls outside your job description.

Strong Candidates May Also Have Experience With

- Working directly with CSPs to scale infrastructure or products across multiple platforms, navigating differences in networking, security, privacy, billing, and managed service offerings.
- Working with external partners to align goals and deliver impact.
- Hands‑on capacity management, cost optimization, or resource planning at scale across heterogeneous environments.
- Multi‑region deployments, geographic routing, and global traffic management.
- Python or Rust.

Annual Salary
$320,000 – $485,000 USD
Logistics

- Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience.
- Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience.
- Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position.
- Location‑based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
- Visa sponsorship: We do sponsor visas! However, we aren’t able to successfully sponsor visas for every role and every candidate. We will make every reasonable effort to get you a visa if we make you an offer.

