Software Engineer, Inference Deployment

$320k

Full-time

Anthropic

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the Role

Our mandate is to make inference deployment boring and unattended.

Anthropic serves Claude to millions of users across GPUs, TPUs, and Trainium — and every model update must reach production safely, quickly, and without disrupting service. We're building the systems that make inference deployment continuous and unattended.

As a Software Engineer on the Launch Engineering team, you'll design and build the deployment infrastructure that moves inference code from merge to production. This is a resource-constrained optimization problem at its core: validation and deployment consume the same accelerator chips that serve customer traffic — your deploys compete with live user requests for the same hardware. Every model brings different fleet sizes, startup times, and correctness requirements, so the system must adapt continuously. You'll build systems that navigate these constraints — orchestrating validation, scheduling deployments intelligently, and driving down cycle time from merge to production.

If you've built deployment systems at scale and gravitate toward the hardest problems at the intersection of automation and resource management, this team will give you an outsized scope to work on them.

Responsibilities

Own deployment orchestration that continuously moves validated inference builds into production across GPU, TPU, and Trainium fleets, unattended under normal conditions

Improve capacity-aware deployment scheduling to maximize deployment throughput against constrained accelerator budgets and variable fleet sizes

Extend deployment observability — dashboards and tooling that answer "what code is running in production," "where is my commit," and "what validation passed for this deploy"

Drive down cycle time from code merge to production with pipeline architectures that minimize serial dependencies and maximize parallelism

Optimize fleet rollout strategies for large-scale deployments across thousands of GPU, TPU, and Trainium chips, minimizing disruption to serving capacity

Evolve self-service model onboarding so that new models can be added to the continuous deployment pipeline without Launch Engineering involvement

Partner across the Inference organization with teams owning validation, autoscaling, and model routing to integrate deployment automation with their systems

You May Be a Good Fit If You Have

5+ years of experience building deployment, release, or delivery infrastructure at scale

Strong software engineering skills with experience designing systems that manage complex state machines and multi-stage pipelines

Experience with deployment systems where resource constraints shape the design — whether that's fleet capacity, network bandwidth, hardware availability, or coordinated rollout windows

A track record of building automation that measurably improves deployment velocity and reliability

Proficiency with Kubernetes-based deployments, rolling update mechanics, and container orchestration

Comfort working across the stack — from backend services and databases to CLI tools and web UIs

Strong communication skills and the ability to work closely with oncall engineers, model teams, and infrastructure partners

Strong Candidates May Also Have

Experience with ML inference or training infrastructure deployment, particularly across multiple accelerator types (GPU, TPU, Trainium)

Background in capacity planning or resource-constrained scheduling (e.g., bin-packing, fleet management, job scheduling with hardware affinity)

Experience with progressive delivery in systems with long validation cycles: canary/soak testing, blue-green deployments, traffic shifting, automated rollback

Experience at companies with large-scale release engineering challenges (mobile release trains, monorepo deployments, multi-datacenter rollouts)

Experience with Python and/or Rust in production systems

The annual compensation range for this role is listed below.

For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.

Annual Salary:

$320,000 - $485,000 USD

Logistics

Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience.

Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.

Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you're ever unsure about a communication, don't click any links; visit anthropic.com/careers directly for confirmed position openings.

How we're different

We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.

Come work with us!

Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.

Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.


Vacancy posted 23 hours ago
