Software Engineer, ML Inference Performance

SambaNova Systems

Palo Alto, California, United States

The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, and drive efficiency and innovation, fundamentally transforming their businesses and operations at scale.

SambaNova Suite™ is the first full-stack, generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once a model is adapted with customer data, customers retain ownership of it in perpetuity, so they can turn generative AI into one of their most valuable assets.

About The Role

The Principal Compiler Engineer - ML Systems position will be responsible for working with the different layers of the compiler stack and coordinating with other development teams here at SambaNova. It is a critical role responsible for driving innovation in compiler infrastructure and optimization algorithms that enable state-of-the-art ML model performance on the SambaNova platform. This can involve anything from digging through PyTorch and machine learning models to determining how to map operations onto our underlying hardware.

Responsibilities

Lead compiler engineering by ensuring standard methodologies, enterprise product insertion, and process evolution.
Work with peers, domain experts, developers, and customers across the enterprise to find optimal solutions.
Develop, integrate, and implement products.
Provide support for proposals in key areas aligned with core team competencies.

Basic Qualifications

Bachelor's or Master's Degree in Computer Science, Computer Engineering, or equivalent with 5-10 years of industry experience.

Additional Qualifications

Deep theoretical understanding of compiler fundamentals.
Experience building and deploying software products.
Experience with one or more deep learning frameworks (e.g., TensorFlow, PyTorch) is a plus.
Experience with common compiler development practices and methodologies.
Excitement about high-performance systems engineering and performance debugging.
An appreciation for process and developing cross-disciplinary collaboration.

Preferred Qualifications

Experience with MLIR.
Familiarity with machine learning models and frameworks.
Familiarity with accelerated computing.
Exposure to dataflow architectures.

Submission Guidelines

Please note that in order to be considered an applicant for any position at SambaNova Systems, you must submit an application form for each position for which you believe you are qualified.

EEO Policy

SambaNova Systems is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to age (40 and over), color, disability, gender identity, genetic information, marital status, military or veteran status, national origin/ancestry, race, religion, creed, sex (including pregnancy, childbirth, breastfeeding), sexual orientation, or any other applicable status protected by federal, state, or local laws.

Benefits Summary for US-Based, Full-Time Employment Positions

SambaNova offers a competitive total rewards package, including base salary, equity, and benefits. We cover 95% of the premium for employee medical insurance and 77% of the premium for dependents, and we offer a Health Savings Account (HSA) with employer contribution. We also offer Dental, Vision, Short/Long-Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans, in addition to Flexible Spending Account (FSA) options such as Health Care, Limited Purpose, and Dependent Care. Our library of well-being benefits available to you and your dependents includes a full subscription to Headspace, Gympass+ membership with access to physical gyms, One Medical membership, counseling services with an Employee Assistance Program, and much more.

Vacancy posted 1 day ago
