Distributed LLM Inference Engineer - Scale High-Throughput AI

Cerebras

Anyscale is seeking a Distributed LLM Inference Engineer in Palo Alto, California. The role focuses on pushing the boundaries of performance for AI inference at large scale, collaborating closely with product teams and open source communities. The ideal candidate should have experience running ML inference, familiarity with leading deep learning frameworks such as PyTorch, and a strong grasp of distributed systems. An attractive benefits and compensation plan is included.

Vacancy posted 5 days ago

Similar jobs that could be interesting for you
Based on the Distributed LLM Inference Engineer - Scale High-Throughput AI in Palo Alto, CA vacancy

Distributed LLM Inference Engineer
...on a mission to democratize distributed computing and make it accessible... ...accelerate the progress of AI applications out into the... ...developer or data scientist can scale an ML application from their... ...the role As a Distributed LLM Inference Engineer, you will help systems and...
Work at office
Cerebras
Palo Alto, CA
5 days ago
Principal Software Engineer - Large-Scale LLM Memory and Storage Systems
$272k - $425.5k
Principal Software Engineer – Large-Scale LLM Memory and Storage Systems... ...Dynamo is a high-throughput, low-latency inference framework for serving generative AI and reasoning models across multi-node distributed environments. Built in Rust for...
Local area
Remote work
NVIDIA Corporation
Santa Clara, CA
1 day ago
ML Systems Engineer: Production-Scale LLM Inference
ScOp Venture Capital is looking for an ML Systems Engineer to optimize LLM inference systems crucial for their AI platform. The role focuses on enhancing performance and efficiency via low-level systems optimization, directly impacting industry leader processes in semiconductor...
ScOp Venture Capital
Santa Clara, CA
2 days ago
Backend Engineer - Distributed Systems
...Department: Backend Engineer · Work type: On-... ...About Archetype AI Archetype AI is developing... ...-time multimodal LLM for real life,... ..., and resilient distributed systems. You’ll work... ...production—at scale, with reliability,... ...-latency AI model inference and data services....
Full time
Neara
Palo Alto, CA
2 days ago
Senior Software Engineer, AI Inference Systems
$184k - $287.5k
...skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme... ...architecture, parallel programming, distributed systems, deep learning theories... ...building and optimizing LLM inference engines (e.g., vLLM...
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Engineering Manager AI Inference Platform, Distributed Cloud
$262k - $365k
Senior Engineering Manager AI Inference Platform, Distributed Cloud Location: Sunnyvale, CA, USA Pay US: $262,000 - $365... ...experience optimizing, profiling, and scaling production‑grade systems on GPU... ...experience implementing advanced LLM serving architectures and...
Google Inc.
Sunnyvale, CA
2 days ago
Staff ML Systems Engineer — Distributed Training at Scale
A leading AI infrastructure company in California seeks a Member of Technical Staff — Training to design and optimize large-scale distributed training systems for frontier AI models. Candidates should have 5+ years of experience in ML systems and be proficient in Python...
RadixArk
Palo Alto, CA
4 days ago
SR Principal Software Engineer - LLM Engineering
...Principal Software Engineer at JPMorganChase... ...services, enabling scale across teams and functions... ...using Model Inference servers such as... ...production operations for AI workloads,... ...architecting and deploying LLM & GNN solutions on... ...optimization and distributed systems for large...
TwinThread LLC
Palo Alto, CA
5 days ago
Application Software Engineer, Inference
$135k - $160k
Application Software Engineer, Inference SpaceX was founded under... ...a high-performance AI inference platform that... ...design and optimize large-scale model serving systems... ...everything from distributed infrastructure to deep... ...SGLang, vLLM, TensorRT-LLM) Develop custom tools...
Permanent employment
Temporary work
Remote work
Worldwide
Weekend work
SPACE EXPLORATION TECHNOLOGIES CORP
Palo Alto, CA
1 day ago
Senior Software Engineer - Distributed Data Systems
$166k - $225k
...running the world's best data and AI infrastructure platform so... ...their business. Founded by engineers — and customer obsessed — we... ...for interfacing with data to scaling our services and infrastructure... ...building the next generation distributed data storage and processing...
Local area
Worldwide
Databricks Inc.
Mountain View, CA
2 days ago
Senior Software Engineer, Distributed Systems - NIM Factory
$168k - $270.25k
Senior Software Engineer, Distributed Systems - NIM Factory... ...upon which every new AI-powered application is built... ...infrastructure and automation for NVIDIA Inference Microservices (NIMs). The... ...in working with large scale full stack development We are...
Remote work
NVIDIA Corporation
Santa Clara, CA
2 days ago
Senior Software Engineer, Distributed Compute System
$160.36k - $240.54k
...driver, combining cutting-edge AI with automotive-grade... ...clear path to AVs at commercial scale, empowering a safer, richer,... ...Role We’re looking for senior engineers to build/scale Nuro's large-scale... ...and developing large-scale distributed applications (e.g. Kubernetes...
Icehouseventures
Mountain View, CA
3 days ago
Senior Software Engineer, Graph DB & Distributed Systems
$180k - $220k
black.ai is looking for a Senior Software Engineer, Calibration & Control in Palo Alto, CA. In this role, you will... ...the control systems for utility-scale quantum computers. You will be responsible... ...in Python or C++, with a focus on distributed storage and graph databases. The...
black.ai
Palo Alto, CA
3 days ago
Staff Software Engineer - Distributed Data Systems
$192k - $260k
...running the world's best data and AI infrastructure platform, so... ...companies in the world. Our engineering teams build highly technical... ...the resilience, security and scale that is critical to making... ...Optional: MS or PhD in databases, distributed systems. Comfortable working...
Work at office
Local area
Menlo Ventures
Mountain View, CA
5 days ago
Senior AI Systems Performance Engineer San Jose, California, United States
Senior AI Systems Performance Engineer Palo Alto, California, United States... ...and operations at scale. SambaNova Suite™... ...for large‑scale AI inference. Responsibilities... ...both single‑node and distributed systems. Basic Qualifications... ...‑on experience with LLM or multimodal model...
Full time
Temporary work
Local area
Flexible hours
SambaNova
Palo Alto, CA
1 day ago
System Software Engineer, Distributed Systems
...unlimited potential of AI to define the next era... ...supports 1,000+ chip design engineers by building tools and... ...with an emphasis on distributed systems and operational... ...concurrency, and reliability at scale. Responsibilities... ...language (including LLM‑generated code) to implement...
NVIDIA Corporation
Santa Clara, CA
4 days ago
Inference Optimization Engineer United States - Remote · Remote
$198k - $286k
...mission to revolutionize AI infrastructure by... ...Modular, we optimize inference from kernel to cloud on... ...makes this possible at scale. We continuously apply... ...kernels, the inference engine, and distributed systems so that customer... ...Cloud, delivering LLM performance on the Pareto...
Remote job
Work experience placement
Work at office
Local area
Flexible hours
Modular Mailing Systems, Inc.
Los Altos, CA
2 days ago
Senior Software Engineer - AI Inference
$152k - $241.5k
...platform upon which every new AI‑powered application is... ...a Senior Software Engineer - AI Inference to advance open‑source LLM serving by contributing... ...low‑latency inference at scale. This is a hands‑on role... ...mindset. Familiarity with distributed systems concepts and concurrency...
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Compiler Engineer - AI Inference
$152k - $241.5k
...learning ignited modern AI — the next era of... ...seeking top‑tier AI Compiler Engineers to drive innovation... ...tangible impact on a global scale. What you’ll be doing:... ...for AI workloads (both inference and training) and... ...accelerator architectures. LLM Knowledge: Deep understanding...
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Software Development Engineer - SGLang and Inference Stack
...computing experiences—from AI and data centers, to PCs,... ...enabling RL training and SOTA LLM and Multimodal inference at scale across multi-GPU and multi... .... THE PERSON: Skilled engineer with strong technical and... ...serving and RL‑training. Distributed System Optimization: Tune...
Advanced Micro Devices, Inc.
Santa Clara, CA
5 days ago
Senior Software Engineer, Inference
...the Role We are seeking a Senior Inference Engineer to accelerate the performance of Pika's AI-driven products. In this highly... ...‑leading user experiences at scale. You will design and optimize inference... ...computing kernels and distributed workloads using CUDA and NCCL....
Work at office
3 days per week
Pika
Palo Alto, CA
2 days ago
Senior Staff Software Engineer - High Performance GPU Inference Systems
$248.71k - $292.6k
...Groq delivers fast, efficient AI inference. Our LPU-based system powers... ...developers the speed and scale they need. Headquartered in... ...Build fast. Sr. Staff Software Engineer - High Performance GPU... ...opportunities in this role Distributed Systems Engineering: Design...
I did my part and supported the Regular Toilet
Palo Alto, CA
3 days ago
Principal Software Engineer - AI Inference
$272k - $431.25k
...platform for every new AI-powered application. We... ...seek a Principal Software Engineer - AI Inference to advance open-source LLM serving. This role involves... ...-latency inference at scale. This is a hands‑on,... ...performance engineering, and distributed systems. You will...
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Inference Systems Engineer — Large-Scale GPUs
A leading AI infrastructure company in California is seeking a Member of Technical Staff — Inference to design and optimize large-scale AI inference systems. The role demands 5+ years in systems engineering and expertise in large-scale inference systems. Successful candidates...
Flexible hours
RadixArk
Palo Alto, CA
1 day ago
Senior AI Inference Engineer - High-Performance LLM Serving
$152k - $241.5k
NVIDIA Gruppe is seeking a Senior Software Engineer - AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building...
NVIDIA Gruppe
Santa Clara, CA
3 days ago
QA Engineer IV
$154.4k - $212.3k
...one of the largest B2B AI‑native companies—decades‑proven, built‑for‑scale and designed for the enterprise... ...Overview As a Staff QA Engineer at Uniphore, you’ll... ...thrives in fast‑paced, distributed environments and is passionate... ...testing frameworks, LLM workflows, or chatbot...
Uniphore Technologies North America Inc
Palo Alto, CA
3 days ago
AI Inference Performance Engineer
$152k - $241.5k
...and benchmark GenAI inference on NVIDIA's latest... ...within TensorRT-LLM, SGLang, and vLLM,... ...serving performance at scale. This team sits at... ...GPU performance engineering and public... ...memory management, and distributed inference across TensorRT... ...other emerging AI use cases....
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Software Engineer - Distributed Systems
$147.4k - $272.1k
Senior Software Engineer - Distributed Systems Cupertino, California, United States Machine Learning and AI Our team is on a mission to build innovative infrastructure and tools... ...performance through algorithm design and testing Scale services to ever-increasing problem sizes...
Relocation
Apple Inc.
Cupertino, CA
3 days ago
Software Engineer - Distributed Build Systems
$126.8k - $220.9k
Software Engineer - Distributed Build Systems Cupertino, California, United States Software and Services... ...ships to billions of customers — a scale that has few peers in the industry. This... ...monitoring, or SRE practices Leveraging AI-assisted development tools to improve...
Relocation
Apple Inc.
Cupertino, CA
5 days ago
Software Engineer Agentic AI Systems Moveworks
...Role Are you a software engineer who has honed your... ...at the cutting edge of AI agents? This may be the... ...to perform reliably at scale. You will have the opportunity... ..., agent memory, LLM self-reflection and improvement... ...We approach our distributed world of work with flexibility...
Work at office
Immediate start
Remote work
Flexible hours
Centaur Labs
Mountain View, CA
3 days ago
