Staff AI Runtime Engineer

$180k - $225k

FlexAI

About FlexAI

Build and Deploy AI the right way, anywhere.

The FlexAI Compute Infrastructure Platform provides an "end-to-end AI compute layer" for running and managing workloads across any cloud, any GPU, and any deployment model (public, hybrid, or on-prem). It brings together "1-click simplicity" for users with "enterprise-grade orchestration, security, and automation" under the hood.

Founded by Brijesh Tripathi, who brings experience from Nvidia, Apple, Tesla, Intel and Zoox, FlexAI is not just building a product - we're shaping the future of AI. Our teams are strategically distributed across Silicon Valley and Bengaluru, united by a shared mission: to deliver more compute with less complexity.

If you're passionate about shaping the future of artificial intelligence, driving innovation, and contributing to a sustainable and inclusive AI ecosystem, FlexAI is the place for you!

Role Overview

At FlexAI, we're building a high-performance, cloud-agnostic AI compute platform designed for next-generation training and inference workloads. As a Staff AI Runtime Engineer, you'll play a pivotal role in the design, development, and optimization of the core runtime infrastructure that powers distributed training and deployment of large AI models (LLMs and beyond).

This is a hands-on leadership role - perfect for a systems-minded software engineer who thrives at the intersection of AI workloads, runtimes, and performance-critical infrastructure. You'll own critical components of our PyTorch-based stack, lead technical direction, and collaborate across engineering, research, and product to push the boundaries of elastic, fault-tolerant, high-performance model execution.

What You'll Do

Lead Runtime Design & Development:
  • Own the core runtime architecture supporting AI training and inference at scale.
  • Design resilient and elastic runtime features (e.g. dynamic node scaling, job recovery) within our custom PyTorch stack.
  • Optimize distributed training reliability, orchestration, and job-level fault tolerance.
Drive Performance at Scale:
  • Profile and enhance low-level system performance across training and inference pipelines.
  • Improve packaging, deployment, and integration of customer models in production environments.
  • Ensure consistent throughput, latency, and reliability metrics across multi-node, multi-GPU setups.
Build Internal Tooling & Frameworks:
  • Design and maintain libraries and services that support model lifecycle: training, checkpointing, fault recovery, packaging, and deployment.
  • Implement observability hooks, diagnostics, and resilience mechanisms for deep learning workloads.
  • Champion best practices in CI/CD, testing, and software quality across the AI Runtime stack.
Collaborate & Mentor:
  • Work cross-functionally with Research, Infrastructure, and Product teams to align runtime development with customer and platform needs.
  • Guide technical discussions, mentor junior engineers, and help scale the AI Runtime team's capabilities.

What You'll Need to Be Successful

  • 8+ years of experience in systems/software engineering, with deep exposure to AI runtime, distributed systems, or compiler/runtime interaction.
  • Experience delivering platform-as-a-service (PaaS) products.
  • Proven experience optimizing and scaling deep learning runtimes (e.g. PyTorch, TensorFlow, JAX) for large-scale training and/or inference.
  • Strong programming skills in Python and C++ (Go or Rust is a plus).
  • Familiarity with distributed training frameworks, low-level performance tuning, and resource orchestration.
  • Experience working with multi-GPU, multi-node, or cloud-native AI workloads.
  • Solid understanding of containerized workloads, job scheduling, and failure recovery in production environments.

Nice to Have

  • Contributions to PyTorch internals or open-source DL infrastructure projects.
  • Familiarity with LLM training pipelines, checkpointing, or elastic training orchestration.
  • Experience with Kubernetes, Ray, TorchElastic, or custom AI job orchestrators.
  • Background in systems research, compilers, or runtime architecture for HPC or ML.
  • Previous startup experience.

This position is in-person and located at our Santa Clara, CA office.


What We Offer
  • A competitive salary and benefits package
  • Work on cutting-edge AI infrastructure
  • Build products used by developers and enterprises
  • High ownership, fast execution, real impact
  • Collaborative, high-caliber team

The pay range for this role is:

180,000 - 225,000 USD per year (US)
Vacancy posted 5 days ago