Staff Software Engineer, Inference Platform

Cerebras Systems, Inc.

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. This architecture allows Cerebras to deliver industry-leading training and inference speeds, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation. Cerebras works with the leading model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra high-speed inference.

About the Role

We're hiring a Staff Engineer to help lead, drive, and contribute to projects on our Inference Platform team. Our team primarily owns the orchestration layer that runs inference on our datacenter clusters, connecting cloud components with machine learning services. We are often the first team to face problems that haven't been solved yet, leading solutions across Kubernetes operators, service security policies, and CI/CD. If you're interested in building the next-generation architecture of a globally distributed inference platform, we'd like to talk.

Responsibilities

Design, develop, test, and maintain production software, with responsibilities spanning testing, continuous development, observability, security, networking, debugging, and productionization.

Raise the effectiveness of senior engineers through design feedback, pairing, and clear technical standards.

Platform Direction. Help shape the technical direction for the Inference Platform, Kubernetes custom resource definitions, failure domains, service boundaries, and system evolution over time, and own the roadmap for major technical areas.

Reliability & Performance. Architect active-active systems with rapid failover, graceful degradation, and clear SLOs. Drive system-level improvements in latency, throughput, capacity efficiency, and resilience under unpredictable demand.

Execution on Critical Paths. Write and review production code in the most important parts of the platform. Make high-consequence architectural decisions within your area and set the technical bar through design reviews, code reviews, and sound engineering judgment.

Production Leadership. Lead on the hardest production issues and cross-system bottlenecks. Drive observability, incident response, capacity planning, and post-incident improvement with a high standard for operational rigor.

Technical Influence. Partner with ML, Product, Infrastructure, and Cloud teams to translate product and business requirements into scalable system designs, and drive alignment on shared technical decisions within your domain and adjacent platform surfaces.

Skills & Qualifications

8+ years of experience in software engineering, with substantial individual contributor experience building and operating large-scale distributed systems or cloud infrastructure.

Deep expertise in distributed systems architecture, ideally with Kubernetes.

Strong track record of making sound architectural decisions for highly available, latency-sensitive systems at scale.

Experience with security (certificates, TLS, mTLS).

Experience optimizing latency, throughput, and efficiency in high-QPS systems. Experience with TTFT and tail-latency reduction is a strong plus.

Strong proficiency in backend or systems languages such as Go or C++, with the expectation that you can contribute production code directly.

Experience designing observability and reliability practices, including metrics, logging, tracing, alerting, incident response, and SLO-driven operations.

Ability to influence senior engineers and cross-functional partners through technical credibility, communication, and judgment, especially within your domain and adjacent systems.

Preferred Skills & Qualifications

Experience with ML inference infrastructure, model serving systems, or GPU-accelerated workloads.

Why Join Cerebras

People who are serious about software make their own hardware. At Cerebras, we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:

Build a breakthrough AI platform beyond the constraints of the GPU.
Publish and open source their cutting-edge AI research.
Work on one of the fastest AI supercomputers in the world.
Enjoy job stability with startup vitality.
Our simple, non-corporate work culture that respects individual beliefs.

Find out more about what it's like to work at Cerebras here! Apply today and become part of the forefront of groundbreaking advancements in AI!

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around us.

Vacancy posted 8 hours ago
