Software Engineer, ML Inference Performance

SambaNova Systems

Palo Alto, California, United States

The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, and drive efficiency and innovation, fundamentally transforming their businesses and operations at scale.

SambaNova Suite™ is the first full-stack generative AI platform, from chip to model, optimized for enterprise and government organizations. Powered by the intelligent SN40L chip, the SambaNova Suite is a fully integrated platform, delivered on-premises or in the cloud, combined with state-of-the-art open-source models that can be easily and securely fine-tuned using customer data for greater accuracy. Once a model has been adapted with customer data, customers retain ownership of it in perpetuity, so they can turn generative AI into one of their most valuable assets.

About The Role

The Principal Compiler Engineer - ML Systems works across the different layers of the compiler stack and coordinates with other development teams at SambaNova. It is a critical role responsible for driving innovation in compiler infrastructure and optimization algorithms that enable state-of-the-art ML model performance on the SambaNova platform. The work can range from digging through PyTorch and machine learning models to determining how to map operations onto our underlying hardware.

Responsibilities
  • Lead compiler engineering by ensuring standard methodologies, enterprise product insertion, and process evolution.
  • Work with peers, domain experts, developers, and customers across the enterprise to find optimal solutions.
  • Develop, integrate, and implement products.
  • Provide support for proposals in key areas aligned with core team competencies.
Basic Qualifications
  • Bachelor's or Master's Degree in Computer Science, Computer Engineering, or equivalent with 5-10 years of industry experience.
Additional Qualifications
  • Deep theoretical understanding of compiler fundamentals.
  • Experience building and deploying software products.
  • Experience with one or more deep learning frameworks (e.g., TensorFlow, PyTorch) is a plus.
  • Experience with common compiler development practices and methodologies.
  • Excitement about high-performance systems engineering and performance debugging.
  • An appreciation for process and developing cross-disciplinary collaboration.
Preferred Qualifications
  • Experience with MLIR.
  • Familiarity with machine learning models and frameworks.
  • Familiarity with accelerated computing.
  • Exposure to dataflow architectures.

Submission Guidelines

Please note that in order to be considered an applicant for any position at SambaNova Systems, you must submit an application form for each position for which you believe you are qualified.

EEO Policy

SambaNova Systems is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to age (40 and over), color, disability, gender identity, genetic information, marital status, military or veteran status, national origin/ancestry, race, religion, creed, sex (including pregnancy, childbirth, breastfeeding), sexual orientation, or any other applicable status protected by federal, state, or local laws.

Benefits Summary for US-Based, Full-Time Employment Positions

SambaNova offers a competitive total rewards package, including base salary, equity, and benefits. We cover 95% of the premium for employee medical insurance and 77% of the premium for dependents, and we offer a Health Savings Account (HSA) with employer contribution. We also offer Dental, Vision, Short/Long Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans, in addition to Flexible Spending Account (FSA) options such as Health Care, Limited Purpose, and Dependent Care. Our library of well-being benefits, available to you and your dependents, includes a full subscription to Headspace, a Gympass+ membership with access to physical gyms, a One Medical membership, counseling services through an Employee Assistance Program, and much more.

Vacancy posted 1 day ago
