Staff Software Engineer, Inference

$188k - $275k

CoreWeave

Job Description

CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025.

What You'll Do:

Inference Platform Team
The Inference team builds and operates CoreWeave's Kubernetes-native inference platform, powering low-latency, high-throughput AI workloads at massive scale. The team is responsible for request routing, scheduling, GPU resource management, and system-wide optimizations that drive performance, efficiency, and reliability across real-time inference systems.

About the role:
As a Staff Software Engineer (IC5) on the Inference team, you will act as a technical leader driving architecture, performance, and reliability across multiple services and teams. Your day-to-day will involve leading cross-team design initiatives, optimizing inference performance (latency, throughput, and GPU utilization), and improving system reliability at scale. You will work deeply in distributed systems and Kubernetes-based infrastructure, focusing on areas like scheduling, batching, and memory optimization. This role requires hands-on technical leadership and the ability to influence engineering direction across the organization.

Who You Are:

8–12+ years of experience building and operating large-scale distributed systems or cloud platforms
Proven experience leading cross-team technical initiatives impacting multiple services or organizations
Strong programming skills in Go, Python, or C++
Deep expertise in Kubernetes at production scale, including orchestration, scheduling, and service design
Strong understanding of distributed systems, networking, and performance optimization
Experience designing and operating low-latency, high-throughput systems with strict P95/P99 latency requirements
Hands-on experience with inference systems, including batching or micro-batching strategies, caching, and memory optimization
Experience improving system performance using metrics-driven approaches (e.g., latency, throughput, utilization)
Familiarity with mixed precision (BF16, FP8) and streaming inference workloads

Preferred:

Experience with inference frameworks such as vLLM, Triton, TensorRT-LLM, Ray Serve, or TorchServe
Experience with GPU systems and performance optimization (CUDA, NCCL, RDMA, NUMA, GPU interconnects)
Experience leading multi-team or org-level technical initiatives
Exposure to large-scale AI/ML infrastructure or hyperscale cloud environments

Wondering if you're a good fit?
We believe in investing in our people, and value candidates who can bring their own diverse experiences to our teams – even if you aren't a 100% skill or experience match. Here are a few qualities we've found compatible with our team. If some of this describes you, we'd love to talk.

You love to design and optimize high-performance distributed systems at scale
You're curious about AI inference, GPU systems, and emerging performance techniques
You're an expert in building reliable, low-latency infrastructure and driving system-wide improvements

Why CoreWeave?
At CoreWeave, we work hard, have fun, and move fast! We're in an exciting stage of hyper-growth that you will not want to miss out on. We're not afraid of a little chaos, and we're constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:

Be Curious at Your Core
Act Like an Owner
Empower Employees
Deliver Best-in-Class Client Experiences
Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and enables the development of innovative solutions to complex problems. As we get set for takeoff, the organization's growth opportunities are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too. Come join us!

The base salary range for this role is $188,000 to $275,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).

What We Offer

The range we've posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate, which depends on a variety of factors, including qualifications, experience, interview performance, and location.

In addition to a competitive salary, we offer a variety of benefits to support your needs. The benefits below reflect our US-based offerings; for roles in other locations, benefits vary and are shared during the hiring process. These include:

Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to Participate in Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption

Equal Opportunity & Accommodations

CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.

As part of this commitment and consistent with the Americans with Disabilities Act (ADA), CoreWeave will ensure that qualified applicants and candidates with disabilities are provided reasonable accommodations for the hiring process, unless such accommodation would cause an undue hardship. If reasonable accommodation is needed, please contact: View email address on ziprecruiter.com.

Export Control Compliance

This position requires access to export controlled information. To conform to U.S. Government export regulations applicable to that information, applicant must either be (A) a U.S. person, defined as a (i) U.S. citizen or national, (ii) U.S. lawful permanent resident (green card holder), (iii) refugee under 8 U.S.C. § 1157, or (iv) asylee under 8 U.S.C. § 1158, (B) eligible to access the export controlled information without a required export authorization, or (C) eligible and reasonably likely to obtain the required export authorization from the applicable U.S. government agency. CoreWeave may, for legitimate business reasons, decline to pursue any export licensing process.


Vacancy posted 26 days ago
