Senior AI Runtime Engineer: Distributed Training & Scale

FlexAI

A forward-thinking AI infrastructure company is seeking a Staff AI Runtime Engineer to lead the design and optimization of their AI compute platform. In this leadership role, you'll enhance AI training and inference capabilities. Successful candidates will have over 8 years of experience in systems engineering, expertise with PyTorch and TensorFlow, and strong programming skills in Python and C++. This role is based in Santa Clara, CA, and offers a competitive salary along with the chance to work on cutting-edge technology.

Vacancy posted 19 hours ago
Similar jobs that could be interesting for you
Based on the Senior AI Runtime Engineer: Distributed Training & Scale in Santa Clara, CA vacancy
  • $180k - $225k

     ...Build and Deploy AI the right way, anywhere...  ...teams are strategically distributed across Silicon Valley...  ...designed for next-generation training and inference workloads. As a Staff AI Runtime Engineer , you'll play a...  ...training and inference at scale. Design resilient... 
    Training
    Work at office

    FlexAI

    Santa Clara, CA
    19 hours ago
  • A leading technology company is seeking a Fellow/Sr. Fellow Machine Learning Engineer to join the Training At Scale team in San Jose, CA. The candidate will work on distributed training of large models and improve training efficiency. Responsibilities include enhancing... 
    Senior
    Training

    Advanced Micro Devices

    San Jose, CA
    4 days ago
  • $180k

    A cutting-edge AI research firm in California seeks a Member of Technical Staff specializing...  ...hands-on experience with multimodal pre-training and a strong proficiency in Python, JAX,...  ...Responsibilities include designing large-scale systems and developing data pipelines to push... 
    Senior
    Training

    x.ai

    Palo Alto, CA
    19 hours ago
  • $124.09k - $210k

     ...Senior AI Data Infrastructure Engineer Santa Clara, CA XPENG is a leading smart technology...  ...We don't just process EB-scale perception data from tens...  ...and data versioning. Training Throughput Optimization:...  ...of building large-scale distributed systems. Work... 
    Senior
    Training
    Full time
    Work experience placement

    XPENG

    Santa Clara, CA
    19 hours ago
  • $184k - $287.5k

     ...NVIDIA's DGX Cloud AI Efficiency Team...  ...AI workloads - pre-training, post-training, inference...  ...resources and scale to foster...  ...infrastructure software engineer to join our team....  ...systems. As a senior DGX Cloud AI Infrastructure...  ...large-scale distributed systems. Experience... 
    Senior
    Training

    NVIDIA

    Santa Clara, CA
    4 days ago
  • $170.6k - $261.3k

     ...Job Description Senior AI/ML Engineer, AV ML Infra We're General Motors...  ...by running large-scale simulation workloads and managing...  ...and optimizes large-scale ML training and inference across cloud...  ...implement, and test scalable distributed computing and data processing... 
    Senior
    Training
    Local area
    Work from home
    Flexible hours

    General Motors

    Sunnyvale, CA
    1 day ago
  • $155.42k - $395.9k

     ...supports the end-to-end AI lifecycle of ML...  ...experimentation and large-scale training to evaluation, lineage...  ...interfaces, enabling ML engineers and researchers to...  ...The Role: As a Senior AI/ML Engineer, you will...  ...implement, and test scalable distributed computing and data... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Sunnyvale, CA
    1 day ago
  • $170.6k - $261.3k

     ...world! The Data Labeling Engineering team designs, builds, and operates...  ..., data engineering, and AI/ML, defining the strategies...  ...that create reliable training data at scale. Our tools and platform are...  ...experience building robust distributed platforms and applications.... 
    Senior
    Training
    Local area
    Remote work
    Work from home
    Flexible hours

    General Motors

    Sunnyvale, CA
    19 days ago
  • $170.5k - $240.71k

     ...Role We are looking for a Senior AI Software Engineer — Agentic AI System to help...  ...AI systems operating at scale. Key Responsibilities Design...  ...workflows for distributed AI systems Build data pipelines...  ...and relevant education or training. Your recruiter can share... 
    Senior
    Training
    Local area
    Immediate start
    Remote work
    Shift work

    Intel Corporation

    Santa Clara, CA
    1 day ago
  •  ...minds and talent in AI and machine...  ...hear from you. Senior AI Systems Performance Engineer Palo Alto, California...  ...operations at scale. SambaNova Suite...  ...compiler, runtime, and hardware layers...  ...single-node and distributed systems. Basic Qualifications...  ...model training and inference.... 
    Senior
    Training
    Full time
    Temporary work
    Local area
    Flexible hours

    SambaNova

    Palo Alto, CA
    19 hours ago
  • $133k - $254k

     ...Us 42dot is a mobility AI company committed to solving...  ...Our AI Data Pipeline Engineers build up the core data...  .... We develop the distributed system of a scalable data pipeline for large‑scale dataset (millions of scenes...  ...serving SDKs for ML model training / evaluation. The data... 
    Senior
    Training
    Full time
    Work experience placement

    42dot Inc.

    Sunnyvale, CA
    4 days ago
  • $200k - $400k

     ...designs and operates ultra-scale GPU supercomputing systems to train next-generation...  ...communication systems, runtime, and hardware topology....  ...communication performance, distributed reliability, and cross-layer...  ...for a deeply technical engineer to co-design and optimize... 
    Senior
    Training
    Full time
    Visa sponsorship

    Institute Of Foundation Models

    Sunnyvale, CA
    19 hours ago
  • $180k - $240k

     ...About the role We are seeking a Senior AI Infrastructure Engineer to design, build, and scale the high-performance AI...  ...infrastructure that enables distributed training, experiment tracking, and seamless...  ...using TensorRT, ONNX Runtime, and Triton Inference Server,... 
    Senior
    Training
    Odd job
    Full time
    Work at office

    Gatik AI

    Mountain View, CA
    11 hours ago
  •  ...Alto seeks a Staff/Principal ML Systems Engineer to enhance training performance for their innovative humanoid robots. You will optimize distributed training systems and engage closely...  ...paced environments, and possess strong debugging skills.
    Senior
    Training

    Rhoda AI

    Palo Alto, CA
    3 days ago
  • $160k - $253k

    Senior Technical Marketing Engineer - Data Center Scale Out...  ...software to power AI at scale. To help customers...  ...-leading inference and training performance and power efficiency...  ...cabling, power distribution, and thermal scaling... 
    Senior
    Training

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  • $184k - $287.5k

     ...We're looking for outstanding AI systems engineers to develop groundbreaking technologies in the...  ...kernel implementations, new LLM inference runtimes components, and kernel code generators...  ...solutions for LLM inference and training (e.g. FlashInfer, Flash Attention)... 
    Senior
    Training
    Remote work

    NVIDIA

    Santa Clara, CA
    1 day ago
  • $166k - $225k

     ...world's best data and AI infrastructure...  ...business. Founded by engineers — and customer...  ...interfacing with data to scaling our services and...  ...engineer on the Runtime team at Databricks...  ...next generation distributed data storage and processing...  ...and training, and specific work... 
    Senior
    Training
    Local area
    Worldwide

    Databricks Inc.

    Mountain View, CA
    2 days ago
  • $140k - $215k

     ...Software Development Engineer As a global...  ...world's most advanced AI-native platform....  ...role on the Cloud Runtime Protection team...  ...workloads deployed at scale Design and...  ...effectively in a distributed team Benefits...  ...recruitment, selection, training, compensation,... 
    Senior
    Training
    Work experience placement
    Work at office
    Local area
    2 days per week
    3 days per week

    CrowdStrike

    Sunnyvale, CA
    7 days ago
  •  ...Senior AI Engineer Gauss Labs is looking for a passionate and talented...  ...for data processing, model training, evaluation, and deployment....  ...inferencing structures for large scale ML/DL models. Experience...  ...). Experience in distributed/parallel systems, information... 
    Senior
    Training

    Gauss Labs

    Palo Alto, CA
    2 days ago
  • $110k - $190k

     ...Role Overview We are hiring a Senior Software & AI Engineer to build production-grade AI systems...  ...the right solution: data preparation, training, evaluation, deployment, and monitoring...  ...core to how we create value, scale operations, and differentiate in the... 
    Senior
    Training

    Covalent

    Sunnyvale, CA
    4 days ago
  • $166k - $244k

     ...About the job Google's software engineers develop the next-generation...  ...information at massive scale, and extend well beyond web search...  ...including information retrieval, distributed computing, large-scale system...  ..., and relevant education or training. Your recruiter can share... 
    Senior
    Training
    Full time

    Google Inc.

    Sunnyvale, CA
    1 day ago
  • $174k - $252k

    Senior Software Engineer, Google Distributed Cloud Hosted, Infrastructure Google Sunnyvale, CA, USA Bachelor’s degree...  ...experience with developing large-scale infrastructure, distributed systems...  ...experience, and relevant education or training. Your recruiter can share more... 
    Senior
    Training
    Full time

    Google Inc.

    Sunnyvale, CA
    19 hours ago
  • $176.8k - $265.2k

     ...is building an enterprise-scale Agentic AI platform to enable secure,...  ...Principal Software Development Engineer to serve as the technical...  ...ideal candidate has strong distributed systems expertise, deep familiarity...  ..., promotion, benefits, training, discipline, and... 
    Senior
    Training
    Local area

    F5

    San Jose, CA
    3 days ago
  • $209k

     ...data preprocessing, feature engineering, and dataset versioning....  ...downtime. • Enable support for distributed model training and hyperparameter...  ...GPU utilization for large-scale training workloads, ensuring...  ...tolerant, and resource-efficient AI workloads across multi-node... 
    Senior
    Training
    Work at office
    Remote work
    1 day per week

    Zoom Video Communications

    San Jose, CA
    2 days ago
  • $244.14k - $413.16k

     ...Senior Staff AI Engineer Santa Clara, CA XPENG is a leading smart technology company at the forefront...  ...Senior Staff AI Engineer to build and scale production-grade AI systems that drive...  ...experience, and relevant education or training. We are an Equal Opportunity Employer.... 
    Senior
    Training
    Full time

    XPENG

    Santa Clara, CA
    2 days ago
  • $223k - $306.5k

     ...Integrity, and Inclusion. We weave AI into the fabric of everything...  ...As a Sr Principal AI Engineer, you will join a dynamic team...  ...behavioral analysis, and adversarial training to protect model instructions...  ...environments, delivering large-scale implementations with... 
    Senior
    Training
    Full time
    Work at office

    Palo Alto Networks

    Santa Clara, CA
    3 days ago
  • $123k - $215.25k

     ...Senior AI Engineer II - Agentic AI New York, NY, United States Sunrise...  ...operate responsibly and at scale across the enterprise. Our...  ...and services: REST, gRPC Distributed systems: event-driven...  ...~ Career development and training opportunities For a full... 
    Senior
    Training
    Full time
    Work at office
    Local area
    Remote work
    Visa sponsorship
    Flexible hours
    Shift work

    American Express

    Palo Alto, CA
    3 days ago
  • $188k - $237.5k

     ...driving the transformation to AI-enabled software-defined...  ...fast-growing company with the scale and impact of an established...  ...seeking a highly motivated Senior AI Engineer to join our team and help us...  ..., including modeling, training, tuning, validating, deploying... 
    Senior
    Training
    Work at office
    Local area
    Worldwide
    Flexible hours
    Shift work

    Sonatus

    Sunnyvale, CA
    2 days ago
  • $100k

     ...Software Engineer, TT-Distributed Tenstorrent is leading...  ...on cutting-edge AI technology, revolutionizing...  ...of all seniorities. As our TT-Distributed...  ...inference and training infrastructure....  ...expert in large-scale distributed AI...  ...with compilers, runtimes, and AI frameworks... 
    Training

    Tenstorrent

    Santa Clara, CA
    2 days ago
  • $227.5k - $300k

     ...driving the transformation to AI-enabled software-defined...  ...fast-growing company with the scale and impact of an established...  ...Summary: We are seeking a Senior Staff AI Engineer with a combination of architectural...  ..., including modeling, training, tuning, validating, deploying... 
    Senior
    Training
    Work at office
    Worldwide
    Flexible hours
    Shift work

    Sonatus

    Sunnyvale, CA
    2 days ago
