Senior Engineer II, AI Inference Optimization

$167.2k - $209k

Full-time

DigitalOcean

Dive in and do the best work of your career at DigitalOcean. Journey alongside a strong community of top talent who are relentless in their drive to build the simplest scalable cloud. If you have a growth mindset, naturally like to think big and bold, and are energized by the fast-paced environment of a true industry disruptor, you’ll find your place here. We value winning together while learning, having fun, and making a profound difference for the dreamers and builders in the world.

We are seeking a Senior Engineer II to implement and contribute to the design and optimization of our Serverless Inference infrastructure and APIs. In this role, you will tackle the challenges of large-scale AI workloads, focusing on throughput, GPU utilization, and fault tolerance to support the next-generation inference needs of AI-native enterprises.

WHAT YOU'LL DO:

* Design and build scalable, multi-tenant services that power AI inference and intelligent routing workloads.
* Strengthen platform resiliency through improved observability, capacity management, automation, and operational tooling.
* Partner closely with platform, GPU infrastructure, and product engineering teams to deliver production-grade systems and highly available APIs.
* Raise the engineering bar through strong software design, operational discipline, incident management, and continuous improvement practices.
* Create leverage for the broader engineering organization through reusable patterns, technical standards, documentation, and coaching.
* Provide technical leadership on projects by breaking down ambiguous problems and guiding engineers toward clear execution plans.
* Support the growth of junior and mid-level engineers through coaching, pairing, and constructive feedback.
* Foster a culture of ownership, accountability, and continuous learning across the team.
* Lead by example in incident response, operational readiness, and production-quality engineering practices.

WHAT YOU'LL BRING:

REQUIRED

* 8+ years of experience building and operating multi-tenant platforms or distributed backend systems
* Strong experience operating high-scale distributed services in production environments
* Deep understanding of SRE principles, including observability, incident management, reliability engineering, capacity planning, and operational automation
* 2+ years of hands-on experience with Go / Golang in production systems
* 2+ years of experience with Kubernetes
* Strong understanding of cloud-native architectures, microservices, and distributed systems fundamentals
* Experience debugging performance, scalability, and reliability issues in production systems
* Observability Proficiency: Experience tracking infrastructure and inference metrics like Time To First Token (TTFT), Time Per Output Token (TPOT), and GPU utilization.

BONUS

* AI/ML Framework Knowledge: Understanding of modern LLM serving architectures and familiarity with engines like vLLM or Triton.
* Experience with API gateways, traffic routing, or service mesh technologies
* Familiarity with LLM serving stacks such as vLLM, TensorRT-LLM, or similar technologies
* Experience building systems for inference optimization, rate limiting, routing, or workload orchestration

COMPENSATION RANGE:

$167,200 - $209,000
This is a remote role

JR: 2026-7593


WHY YOU’LL LIKE WORKING FOR DIGITALOCEAN

* We innovate with purpose. You’ll be a part of a cutting-edge technology company with an upward trajectory that is proud to simplify cloud and AI so builders can spend more time creating software that changes the world. As a member of the team, you will be a Shark who thinks big, bold, and scrappy, like an owner with a bias for action and a powerful sense of responsibility for customers, products, employees, and decisions.
* We prioritize career development. At DO, you’ll do the best work of your career. You will work with some of the smartest and most interesting people in the industry. We are a high-performance organization that will always challenge you to think big. Our organizational development team will provide you with resources to ensure you keep growing. We provide employees with reimbursement for relevant conferences, training, and education. All employees have access to LinkedIn Learning's 10,000+ courses to support their continued growth and development.
* We care about your well-being. Regardless of your location, we will provide you with a competitive array of benefits to support you, from our Employee Assistance Program to Local Employee Meetups to a flexible time-off policy, to name a few. While the philosophy around our benefits is the same worldwide, specific benefits may vary based on local regulations and preferences.
* We reward our employees. The salary range for this position is based on market data, relevant years of experience, and skills. You may qualify for a bonus in addition to base salary; bonus amounts are determined based on company and individual performance. We also provide equity compensation to eligible employees, including equity grants upon hire and the option to participate in our Employee Stock Purchase Program.
* DigitalOcean is an equal-opportunity employer. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

Application Limit: You may apply to a maximum of 3 positions within any 180-day period. This policy promotes better role-candidate matching and encourages thoughtful applications where your qualifications align most strongly.


Vacancy posted 4 days ago

