AI Runtime Engineer

Oho Group

Runtime Engineer – AI Runtime & Execution

About the Role

We're looking for a Runtime Engineer to help build and optimise the execution layer that powers next-generation AI workloads. Working at the intersection of systems software, compiler technology, and hardware acceleration, you'll play a key role in ensuring that compiled models execute with maximum performance, scalability, and reliability across a range of computing architectures.

This is an exciting opportunity to work on low-level runtime systems, execution engines, and hardware-aware optimisation, collaborating closely with compiler, hardware, and product teams to shape the future of AI infrastructure.

Key Responsibilities

  • You will help design, build, and evolve a high-performance execution engine capable of supporting multiple hardware platforms and accelerator architectures.
  • You will optimise workload execution through advanced scheduling, partitioning, and parallelisation strategies that maximise hardware utilisation.
  • You will work directly with compiled workloads and binaries, profiling execution behaviour and identifying opportunities for performance improvements.
  • You will develop internal tooling, telemetry systems, and diagnostic frameworks that help uncover execution bottlenecks and system inefficiencies.
  • You'll be responsible for analysing runtime performance across physical hardware, ensuring models achieve optimal throughput, latency, and resource utilisation.
  • You will contribute to the development and evaluation of experimental runtime features, prototypes, and execution strategies that influence future platform capabilities.
  • You'll collaborate closely with compiler, hardware, and product teams to translate machine learning requirements into scalable runtime solutions.

Required Qualifications

  • You'll need strong experience developing runtime systems, execution engines, systems software, or hardware-facing infrastructure.
  • You should be highly proficient in modern C++ and comfortable working within large-scale performance-critical codebases.
  • You must have a strong understanding of concurrent programming, multi-threaded architectures, asynchronous execution, and workload scheduling.
  • You'll need a solid understanding of computer architecture, including memory hierarchies, cache behaviour, processor execution models, and low-level performance considerations.
  • Experience working close to operating system primitives, drivers, kernel-level functionality, or low-level systems programming is highly desirable.
  • You should be comfortable profiling, debugging, and optimising software running directly on physical hardware platforms.

Preferred Qualifications

  • Experience working with GPU computing technologies such as CUDA, ROCm, or other accelerator programming frameworks.
  • Exposure to machine learning frameworks and compiler technologies including Triton, PyTorch, JAX, MLIR, or similar ecosystems.
  • Understanding of distributed computing systems, HPC environments, or large-scale parallel processing architectures.
  • Experience building performance analysis, telemetry, or observability tooling for complex software systems.
  • Strong interest in compiler technology, hardware acceleration, and AI infrastructure.

Education

You should be educated to BS, MS, or PhD level in Computer Science, Computer Engineering, Electrical Engineering, or a related technical discipline, or possess equivalent industry experience.

Vacancy posted 9 hours ago
Location: Sunnyvale, CA