Staff Software Engineer, Inference
$188k - $275k
CoreWeave
CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at [

What You’ll Do:

Inference Platform Team

The Inference team builds and operates CoreWeave’s Kubernetes-native inference platform, powering low-latency, high-throughput AI workloads at massive scale. The team is responsible for request routing, scheduling, GPU resource management, and system-wide optimizations that drive performance, efficiency, and reliability across real-time inference systems.

About the role:

As a Staff Software Engineer (IC5) on the Inference team, you will act as a technical leader driving architecture, performance, and reliability across multiple services and teams. Your day-to-day will involve leading cross-team design initiatives, optimizing inference performance (latency, throughput, and GPU utilization), and improving system reliability at scale. You will work deeply in distributed systems and Kubernetes-based infrastructure, focusing on areas like scheduling, batching, and memory optimization. This role requires hands-on technical leadership and the ability to influence engineering direction across the organization.

Who You Are:
- 8–12+ years of experience building and operating large-scale distributed systems or cloud platforms
- Proven experience leading cross-team technical initiatives impacting multiple services or organizations
- Strong programming skills in Go, Python, or C++
- Deep expertise in Kubernetes at production scale, including orchestration,
- Experience leading multi-team or org-level technical initiatives
- Exposure to large-scale AI/ML infrastructure or hyperscale cloud environments
- You love to design and optimize high-performance distributed systems at scale
- You’re curious about AI inference, GPU systems, and emerging performance
- Be Curious at Your Core
- Act Like an Owner
- Empower Employees
- Deliver Best-in-Class Client Experiences
- Achieve More Together
- Medical, dental, and vision insurance - 100% paid for by CoreWeave
- Company-paid Life Insurance
- Voluntary supplemental life insurance
- Short and long-term disability insurance
- Flexible Spending Account
- Health Savings Account
- Tuition Reimbursement
- Ability to Participate in Employee Stock Purchase Program (ESPP)
- Mental Wellness Benefits through Spring Health
- Family-Forming support provided by Carrot
- Paid Parental Leave
- Flexible, full-service childcare support with Kinside
- 401(k) with a generous employer match
- Flexible PTO
- Catered lunch each day in our office and data center locations
- A casual work environment
- A work culture focused on innovative disruption

