Software Engineer, ML Inference Performance
SambaNova Systems
Palo Alto, California, United States
The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, drive efficiency and innovation to fundamentally transform their businesses and operations at scale.
SambaNova Suite™ is the first full-stack, generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once the models are adapted with customer data, customers retain model ownership in perpetuity, so they can turn generative AI into one of their most valuable assets.
About The Role
The Principal Compiler Engineer - ML Systems position will be responsible for working with the different layers of the compiler stack and coordinating with other development teams here at SambaNova. It is a critical role responsible for driving innovation in compiler infrastructure and optimization algorithms that enable state-of-the-art ML model performance on the SambaNova platform. This can involve anything from digging through PyTorch and machine learning models to determining how to map operations onto our underlying hardware.
Responsibilities
- Lead compiler engineering through ensuring standard methodologies, enterprise product insertion and process evolution.
- Work with peers, domain experts, developers, and customers across the enterprise to find optimal solutions.
- Develop, integrate, and implement products.
- Provide support for proposals in key areas aligned with core team competencies.
Basic Qualifications
- Bachelor's or Master's Degree in Computer Science, Computer Engineering, or equivalent with 5-10 years of industry experience.
Additional Qualifications
- Deep theoretical understanding of compiler fundamentals.
- Experience building and deploying software products.
- Experience with one or more deep learning frameworks (e.g., TensorFlow, PyTorch) is a plus.
- Experience with common compiler development practices and methodologies.
- Excitement about high-performance systems engineering and performance debugging.
- An appreciation for process and developing cross-disciplinary collaboration.
Preferred Qualifications
- Experience with MLIR.
- Familiarity with machine learning models and frameworks.
- Familiarity with accelerated computing.
- Exposure to dataflow architectures.
Submission Guidelines
Please note that in order to be considered an applicant for any position at SambaNova Systems, you must submit an application form for each position for which you believe you are qualified.
EEO Policy
SambaNova Systems is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to age (40 and over), color, disability, gender identity, genetic information, marital status, military or veteran status, national origin/ancestry, race, religion, creed, sex (including pregnancy, childbirth, breastfeeding), sexual orientation, or any other applicable status protected by federal, state, or local laws.
Benefits Summary for US-Based, Full-Time Employment Positions
SambaNova offers a competitive total rewards package, including base salary, equity, and benefits. We cover 95% of the premium for employee medical insurance and 77% of the premium for dependents, and we offer a Health Savings Account (HSA) with employer contribution. We also offer Dental, Vision, Short/Long Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans, in addition to Flexible Spending Account (FSA) options such as Health Care, Limited Purpose, and Dependent Care. Our library of well-being benefits available to you and your dependents includes a full subscription to Headspace, Gympass+ membership with access to physical gyms, One Medical membership, counseling services with an Employee Assistance Program, and much more.
$100k - $150k



