Staff Software Engineer, Inference

$188k - $275k

Full-time

CoreWeave

CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025.

What You’ll Do:

Inference Platform Team

The Inference team builds and operates CoreWeave’s Kubernetes-native inference platform, powering low-latency, high-throughput AI workloads at massive scale. The team is responsible for request routing, scheduling, GPU resource management, and system-wide optimizations that drive performance, efficiency, and reliability across real-time inference systems.

About the role:

As a Staff Software Engineer (IC5) on the Inference team, you will act as a technical leader driving architecture, performance, and reliability across multiple services and teams. Your day-to-day will involve leading cross-team design initiatives, optimizing inference performance (latency, throughput, and GPU utilization), and improving system reliability at scale. You will work deeply in distributed systems and Kubernetes-based infrastructure, focusing on areas like scheduling, batching, and memory optimization. This role requires hands-on technical leadership and the ability to influence engineering direction across the organization.

Who You Are:

8–12+ years of experience building and operating large-scale distributed systems or cloud platforms
Proven experience leading cross-team technical initiatives impacting multiple services or organizations
Strong programming skills in Go, Python, or C++
Deep expertise in Kubernetes at production scale, including orchestration, scheduling, and service design
Strong understanding of distributed systems, networking, and performance optimization
Experience designing and operating low-latency, high-throughput systems with strict P95/P99 latency requirements
Hands-on experience with inference systems, including batching or micro-batching strategies, caching, and memory optimization
Experience improving system performance using metrics-driven approaches (e.g., latency, throughput, utilization)
Familiarity with mixed precision (BF16, FP8) and streaming inference workloads

Preferred:

Experience with inference frameworks such as vLLM, Triton, TensorRT-LLM, Ray Serve, or TorchServe
Experience with GPU systems and performance optimization (CUDA, NCCL, RDMA, NUMA, GPU interconnects)
Experience leading multi-team or org-level technical initiatives
Exposure to large-scale AI/ML infrastructure or hyperscale cloud environments

Wondering if you’re a good fit? We believe in investing in our people, and value candidates who can bring their own diversified experiences to our teams – even if you aren't a 100% skill or experience match. Here are a few qualities we’ve found compatible with our team. If some of this describes you, we’d love to talk.

You love to design and optimize high-performance distributed systems at scale
You’re curious about AI inference, GPU systems, and emerging performance techniques
You’re an expert in building reliable, low-latency infrastructure and driving system-wide improvements

Why CoreWeave?

At CoreWeave, we work hard, have fun, and move fast! We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:

Be Curious at Your Core
Act Like an Owner
Empower Employees
Deliver Best-in-Class Client Experiences
Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and enables the development of innovative solutions to complex problems. As we get set for takeoff, the organization's growth opportunities are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too. Come join us!

The base salary range for this role is $188,000 to $275,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).

What We Offer

The range we’ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate, which can include a variety of factors. These include qualifications, experience, interview performance, and location.

In addition to a competitive salary, we offer a variety of benefits to support your needs. The benefits below reflect our US-based offerings; for roles in other locations, benefits vary and are shared during the hiring process. These include:

Medical, dental, and vision insurance - 100% paid for by CoreWeave
Company-paid Life Insurance
Voluntary supplemental life insurance
Short and long-term disability insurance
Flexible Spending Account
Health Savings Account
Tuition Reimbursement
Ability to Participate in Employee Stock Purchase Program (ESPP)
Mental Wellness Benefits through Spring Health
Family-Forming support provided by Carrot
Paid Parental Leave
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our office and data center locations
A casual work environment
A work culture focused on innovative disruption

California Applicants

California Consumer Privacy Act

Equal Opportunity & Accommodations

CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.

As part of this commitment and consistent with the Americans with Disabilities Act (ADA), CoreWeave will ensure that qualified applicants and candidates with disabilities are provided reasonable accommodations for the hiring process, unless such accommodation would cause an undue hardship. If reasonable accommodation is needed, please contact: View email address on click.appcast.io.

Export Control Compliance

This position requires access to export controlled information. To conform to U.S. Government export regulations applicable to that information, applicant must either be (A) a U.S. person, defined as a (i) U.S. citizen or national, (ii) U.S. lawful permanent resident (green card holder), (iii) refugee under 8 U.S.C. § 1157, or (iv) asylee under 8 U.S.C. § 1158, (B) eligible to access the export controlled information without a required export authorization, or (C) eligible and reasonably likely to obtain the required export authorization from the applicable U.S. government agency. CoreWeave may, for legitimate business reasons, decline to pursue any export licensing process.


Vacancy posted 3 days ago

