AI Runtime Engineer

Oho Group

Runtime Engineer – AI Runtime & Execution

About the Role

We're looking for a Runtime Engineer to help build and optimise the execution layer that powers next-generation AI workloads. Working at the intersection of systems software, compiler technology, and hardware acceleration, you'll play a key role in ensuring that compiled models execute with maximum performance, scalability, and reliability across a range of computing architectures.

This is an exciting opportunity to work on low-level runtime systems, execution engines, and hardware-aware optimisation, collaborating closely with compiler, hardware, and product teams to shape the future of AI infrastructure.

Key Responsibilities

You will help design, build, and evolve a high-performance execution engine capable of supporting multiple hardware platforms and accelerator architectures.
You will optimise workload execution through advanced scheduling, partitioning, and parallelisation strategies that maximise hardware utilisation.
You will work directly with compiled workloads and binaries, profiling execution behaviour and identifying opportunities for performance improvements.
You will develop internal tooling, telemetry systems, and diagnostic frameworks that help uncover execution bottlenecks and system inefficiencies.
You'll be responsible for analysing runtime performance across physical hardware, ensuring models achieve optimal throughput, latency, and resource utilisation.
You will contribute to the development and evaluation of experimental runtime features, prototypes, and execution strategies that influence future platform capabilities.
You'll collaborate closely with compiler, hardware, and product teams to translate machine learning requirements into scalable runtime solutions.

Required Qualifications

You'll need strong experience developing runtime systems, execution engines, systems software, or hardware-facing infrastructure.
You should be highly proficient in modern C++ and comfortable working within large-scale performance-critical codebases.
You must have a strong understanding of concurrent programming, multi-threaded architectures, asynchronous execution, and workload scheduling.
You'll need a solid understanding of computer architecture, including memory hierarchies, cache behaviour, processor execution models, and low-level performance considerations.
Experience working close to operating system primitives, drivers, kernel-level functionality, or low-level systems programming is highly desirable.
You should be comfortable profiling, debugging, and optimising software running directly on physical hardware platforms.

Preferred Qualifications

Experience working with GPU computing technologies such as CUDA, ROCm, or other accelerator programming frameworks.
Exposure to machine learning frameworks and compiler technologies including Triton, PyTorch, JAX, MLIR, or similar ecosystems.
Understanding of distributed computing systems, HPC environments, or large-scale parallel processing architectures.
Experience building performance analysis, telemetry, or observability tooling for complex software systems.
Strong interest in compiler technology, hardware acceleration, and AI infrastructure.

Education

You should be educated to BS, MS, or PhD level in Computer Science, Computer Engineering, Electrical Engineering, or a related technical discipline, or possess equivalent industry experience.

Vacancy posted 9 hours ago