Staff AI Runtime Engineer
$180k - $225k
FlexAI
About FlexAI
Build and Deploy AI the right way, anywhere. The FlexAI Compute Infrastructure Platform provides an "end-to-end AI compute layer" for running and managing workloads across any cloud, any GPU, and any deployment model (public, hybrid, or on-prem). It brings together "1-click simplicity" for users with "enterprise-grade orchestration, security, and automation" under the hood.
Founded by Brijesh Tripathi, who brings experience from Nvidia, Apple, Tesla, Intel and Zoox, FlexAI is not just building a product - we're shaping the future of AI. Our teams are strategically distributed across Silicon Valley and Bengaluru, united by a shared mission: to deliver more compute with less complexity.
If you're passionate about shaping the future of artificial intelligence, driving innovation, and contributing to a sustainable and inclusive AI ecosystem, FlexAI is the place for you!
Role Overview
At FlexAI, we're building a high-performance, cloud-agnostic AI compute platform designed for next-generation training and inference workloads. As a Staff AI Runtime Engineer, you'll play a pivotal role in the design, development, and optimization of the core runtime infrastructure that powers distributed training and deployment of large AI models (LLMs and beyond).
This is a hands-on leadership role - perfect for a systems-minded software engineer who thrives at the intersection of AI workloads, runtimes, and performance-critical infrastructure. You'll own critical components of our PyTorch-based stack, lead technical direction, and collaborate across engineering, research, and product to push the boundaries of elastic, fault-tolerant, high-performance model execution.
What You'll Do
Lead Runtime Design & Development:
- Own the core runtime architecture supporting AI training and inference at scale.
- Design resilient and elastic runtime features (e.g. dynamic node scaling, job recovery) within our custom PyTorch stack.
- Optimize distributed training reliability, orchestration, and job-level fault tolerance.
- Profile and enhance low-level system performance across training and inference pipelines.
- Improve packaging, deployment, and integration of customer models in production environments.
- Ensure consistent throughput, latency, and reliability metrics across multi-node, multi-GPU setups.
- Design and maintain libraries and services that support model lifecycle: training, checkpointing, fault recovery, packaging, and deployment.
- Implement observability hooks, diagnostics, and resilience mechanisms for deep learning workloads.
- Champion best practices in CI/CD, testing, and software quality across the AI Runtime stack.
- Work cross-functionally with Research, Infrastructure, and Product teams to align runtime development with customer and platform needs.
- Guide technical discussions, mentor junior engineers, and help scale the AI Runtime team's capabilities.
- 8+ years of experience in systems/software engineering, with deep exposure to AI runtime, distributed systems, or compiler/runtime interaction.
- Experience delivering PaaS offerings.
- Proven experience optimizing and scaling deep learning runtimes (e.g. PyTorch, TensorFlow, JAX) for large-scale training and/or inference.
- Strong programming skills in Python and C++ (Go or Rust is a plus).
- Familiarity with distributed training frameworks, low-level performance tuning, and resource orchestration.
- Experience working with multi-GPU, multi-node, or cloud-native AI workloads.
- Solid understanding of containerized workloads, job scheduling, and failure recovery in production environments.
- Contributions to PyTorch internals or open-source DL infrastructure projects.
- Familiarity with LLM training pipelines, checkpointing, or elastic training orchestration.
- Experience with Kubernetes, Ray, TorchElastic, or custom AI job orchestrators.
- Background in systems research, compilers, or runtime architecture for HPC or ML.
- Previous startup experience.
What We Offer
- A competitive salary and benefits package
- Work on cutting-edge AI infrastructure
- Build products used by developers and enterprises
- High ownership, fast execution, real impact
- Collaborative, high-caliber team
Vacancy posted 5 days ago

