ML Runtime Optimization Engineer

$159.05k - $199.3k

Decisive Point

About the role

We are looking for a software engineer with deep experience in optimizing ML models and deploying them on production-grade embedded runtime environments. You'll work across the entire ML framework stack (e.g., PyTorch, JAX, ONNX, TensorRT, CUDA, XLA, Triton).

At Applied Intuition, you will:

- Drive ML performance optimization on multiple technologies for on-road and off-road ADAS / AD stacks targeting deployment on a variety of embedded compute platforms
- Develop compute usage strategies to optimize efficiency and latency of model inference for compute boards selected by our customers
- Work on model pruning and quantization, and support deployment on memory-constrained platforms
- Collaborate closely with ML engineers and software developers on technical efforts to find and optimize efficient model architecture solutions
- Set up methodologies to profile model performance on target embedded compute platforms and identify performance bottlenecks as part of stack integration

We're looking for someone who has:

- Bachelor's in Electrical Engineering or Computer Science, or B.Sc. in Computer Science, Mathematics, Physics or a related field
- 3+ years of experience with ML accelerators, GPU, CPU, SoC architecture and micro-architecture
- Strong software development skills with a focus on embedded programming
- Experience profiling and optimizing model performance on embedded compute platforms
- Experience working with deep learning frameworks (e.g., PyTorch, JAX, ONNX)

Nice to have:

- M.Sc. or PhD in an ML-related area
- Built an ML optimization framework from scratch
- Deployed ML solutions to embedded chips for real-time robotics applications

Compensation at Applied Intuition for eligible roles includes base salary, equity, and benefits. Base salary is a single component of the total compensation package, which may also include equity in the form of options and/or restricted stock units, comprehensive health, dental, vision, life and disability insurance coverage, 401k retirement benefits with employer match, learning and wellness stipends, and paid time off. Note that benefits are subject to change and may vary based on jurisdiction of employment.

Applied Intuition pay ranges reflect the minimum and maximum intended target base salary for new hires in the position. The actual base salary offered to a successful candidate will also be influenced by a variety of factors, including experience, credentials and certifications, educational attainment, skill level requirements, interview performance, and the level and scope of the position. For pay transparency purposes, the base salary range for this full-time position in the location listed is: $159,053 - $199,295 USD annually.

Applied Intuition is an equal opportunity employer and federal contractor or subcontractor. Consequently, the parties agree that, as applicable, they will abide by the requirements of 41 CFR 60-1.4(a), 41 CFR 60-300.5(a) and 41 CFR 60-741.5(a) and that these regulations prohibit discrimination against qualified individuals based on their status as protected veterans or individuals with disabilities, and prohibit discrimination against all individuals based on race, color, religion, sex, sexual orientation, gender identity or national origin. These regulations require that covered prime contractors and subcontractors take affirmative action to employ and advance in employment individuals without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status or disability.

The parties also agree that, as applicable, they will abide by the requirements of Executive Order 13496 (29 CFR Part 471, Appendix A to Subpart A), relating to the notice of employee rights under federal labor laws.

Vacancy posted 2 days ago

