Senior AI Runtime Engineer: Distributed Training & Scale
FlexAI
A forward-thinking AI infrastructure company is seeking a Staff AI Runtime Engineer to lead the design and optimization of their AI compute platform. In this leadership role, you'll enhance AI training and inference capabilities. Successful candidates will have over 8 years of experience in systems engineering, expertise with PyTorch and TensorFlow, and strong programming skills in Python and C++. This role is based in Santa Clara, CA, and offers a competitive salary along with the chance to work on cutting-edge technology.
$180k - $225k
...Build and Deploy AI the right way, anywhere... ...teams are strategically distributed across Silicon Valley... ...designed for next-generation training and inference workloads. As a Staff AI Runtime Engineer, you'll play a... ...training and inference at scale. Design resilient...
Training, Work at office
A leading technology company is seeking a Fellow/Sr. Fellow Machine Learning Engineer to join the Training At Scale team in San Jose, CA. The candidate will work on distributed training of large models and improve training efficiency. Responsibilities include enhancing...
Senior, Training
$180k
A cutting-edge AI research firm in California seeks a Member of Technical Staff specializing... ...hands-on experience with multimodal pre-training and a strong proficiency in Python, JAX,... ...Responsibilities include designing large-scale systems and developing data pipelines to push...
Senior, Training
$124.09k - $210k
...Senior AI Data Infrastructure Engineer Santa Clara, CA XPENG is a leading smart technology... ...We don't just process EB-scale perception data from tens... ...and data versioning. Training Throughput Optimization:... ...of building large-scale distributed systems. Work...
Senior, Training, Full time, Work experience placement
$184k - $287.5k
...NVIDIA's DGX Cloud AI Efficiency Team... ...AI workloads - pre-training, post-training, inference... ...resources and scale to foster... ...infrastructure software engineer to join our team.... ...systems. As a senior DGX Cloud AI Infrastructure... ...large-scale distributed systems. Experience...
Senior, Training
$170.6k - $261.3k
...Job Description Senior AI/ML Engineer, AV ML Infra We're General Motors... ...by running large-scale simulation workloads and managing... ...and optimizes large-scale ML training and inference across cloud... ...implement, and test scalable distributed computing and data processing...
Senior, Training, Local area, Work from home, Flexible hours
$155.42k - $395.9k
...supports the end-to-end AI lifecycle of ML... ...experimentation and large-scale training to evaluation, lineage... ...interfaces, enabling ML engineers and researchers to... ...The Role: As a Senior AI/ML Engineer, you will... ...implement, and test scalable distributed computing and data...
Senior, Training, Local area, Remote work, Work from home, Relocation, Relocation package, Flexible hours
$170.6k - $261.3k
...world! The Data Labeling Engineering team designs, builds, and operates... ..., data engineering, and AI/ML, defining the strategies... ...that create reliable training data at scale. Our tools and platform are... ...experience building robust distributed platforms and applications....
Senior, Training, Local area, Remote work, Work from home, Flexible hours
$170.5k - $240.71k
...Role We are looking for a Senior AI Software Engineer - Agentic AI System to help... ...AI systems operating at scale. Key Responsibilities Design... ...workflows for distributed AI systems Build data pipelines... ...and relevant education or training. Your recruiter can share...
Senior, Training, Local area, Immediate start, Remote work, Shift work
...minds and talent in AI and machine... ...hear from you. Senior AI Systems Performance Engineer Palo Alto, California... ...operations at scale. SambaNova Suite... ...compiler, runtime, and hardware layers... ...single-node and distributed systems. Basic Qualifications... ...model training and inference....
Senior, Training, Full time, Temporary work, Local area, Flexible hours
$133k - $254k
...Us 42dot is a mobility AI company committed to solving... ...Our AI Data Pipeline Engineers build up the core data... .... We develop the distributed system of a scalable data pipeline for large-scale dataset (millions of scenes... ...serving SDKs for ML model training / evaluation. The data...
Senior, Training, Full time, Work experience placement
$200k - $400k
...designs and operates ultra-scale GPU supercomputing systems to train next-generation... ...communication systems, runtime, and hardware topology.... ...communication performance, distributed reliability, and cross-layer... ...for a deeply technical engineer to co-design and optimize...
Senior, Training, Full time, Visa sponsorship
$180k - $240k
...About the role We are seeking a Senior AI Infrastructure Engineer to design, build, and scale the high-performance AI... ...infrastructure that enables distributed training, experiment tracking, and seamless... ...using TensorRT, ONNX Runtime, and Triton Inference Server,...
Senior, Training, Odd job, Full time, Work at office
...Alto seeks a Staff/Principal ML Systems Engineer to enhance training performance for their innovative humanoid robots. You will optimize distributed training systems and engage closely... ...paced environments, and possess strong debugging skills.
Rhoda AI
Senior, Training
$160k - $253k
Senior Technical Marketing Engineer - Data Center Scale Out... ...software to power AI at scale. To help customers... ...-leading inference and training performance and power efficiency... ...cabling, power distribution, and thermal scaling....
Senior, Training
$184k - $287.5k
...We're looking for outstanding AI systems engineers to develop groundbreaking technologies in the... ...kernel implementations, new LLM inference runtimes components, and kernel code generators... ...solutions for LLM inference and training (e.g. FlashInfer, Flash Attention)...
Senior, Training, Remote work
$166k - $225k
...world's best data and AI infrastructure... ...business. Founded by engineers - and customer... ...interfacing with data to scaling our services and... ...engineer on the Runtime team at Databricks... ...next generation distributed data storage and processing... ...and training, and specific work...
Senior, Training, Local area, Worldwide
$140k - $215k
...Software Development Engineer As a global... ...world's most advanced AI-native platform.... ...role on the Cloud Runtime Protection team... ...workloads deployed at scale Design and... ...effectively in a distributed team Benefits... ...recruitment, selection, training, compensation,...
Senior, Training, Work experience placement, Work at office, Local area, 2 days per week, 3 days per week
...Senior AI Engineer Gauss Labs is looking for a passionate and talented... ...for data processing, model training, evaluation, and deployment.... ...inferencing structures for large scale ML/DL models. Experience... ...). Experience in distributed/parallel systems, information...
Senior, Training
$110k - $190k
...Role Overview We are hiring a Senior Software & AI Engineer to build production-grade AI systems... ...the right solution: data preparation, training, evaluation, deployment, and monitoring... ...core to how we create value, scale operations, and differentiate in the...
Senior, Training
$166k - $244k
...About the job Google's software engineers develop the next-generation... ...information at massive scale, and extend well beyond web search... ...including information retrieval, distributed computing, large-scale system... ..., and relevant education or training. Your recruiter can share...
Senior, Training, Full time
$174k - $252k
Senior Software Engineer, Google Distributed Cloud Hosted, Infrastructure Google Sunnyvale, CA, USA Bachelor’s degree... ...experience with developing large-scale infrastructure, distributed systems... ...experience, and relevant education or training. Your recruiter can share more...
Senior, Training, Full time
$176.8k - $265.2k
...is building an enterprise-scale Agentic AI platform to enable secure,... ...Principal Software Development Engineer to serve as the technical... ...ideal candidate has strong distributed systems expertise, deep familiarity... ..., promotion, benefits, training, discipline, and...
Senior, Training, Local area
$209k
...data preprocessing, feature engineering, and dataset versioning.... ...downtime. • Enable support for distributed model training and hyperparameter... ...GPU utilization for large-scale training workloads, ensuring... ...tolerant, and resource-efficient AI workloads across multi-node...
Senior, Training, Work at office, Remote work, 1 day per week
$244.14k - $413.16k
...Senior Staff AI Engineer Santa Clara, CA XPENG is a leading smart technology company at the forefront... ...Senior Staff AI Engineer to build and scale production-grade AI systems that drive... ...experience, and relevant education or training. We are an Equal Opportunity Employer....
Senior, Training, Full time
$223k - $306.5k
...Integrity, and Inclusion. We weave AI into the fabric of everything... ...As a Sr Principal AI Engineer, you will join a dynamic team... ...behavioral analysis, and adversarial training to protect model instructions... ...environments, delivering large-scale implementations with...
Senior, Training, Full time, Work at office
$123k - $215.25k
...Senior AI Engineer II - Agentic AI New York, NY, United States Sunrise... ...operate responsibly and at scale across the enterprise. Our... ...and services: REST, gRPC Distributed systems: event-driven... ...~ Career development and training opportunities For a full...
Senior, Training, Full time, Work at office, Local area, Remote work, Visa sponsorship, Flexible hours, Shift work
$188k - $237.5k
...driving the transformation to AI-enabled software-defined... ...fast-growing company with the scale and impact of an established... ...seeking a highly motivated Senior AI Engineer to join our team and help us... ..., including modeling, training, tuning, validating, deploying...
Senior, Training, Work at office, Local area, Worldwide, Flexible hours, Shift work
$100k
...Software Engineer, TT-Distributed Tenstorrent is leading... ...on cutting-edge AI technology, revolutionizing... ...of all seniorities. As our TT-Distributed... ...inference and training infrastructure.... ...expert in large-scale distributed AI... ...with compilers, runtimes, and AI frameworks...
Training
$227.5k - $300k
...driving the transformation to AI-enabled software-defined... ...fast-growing company with the scale and impact of an established... ...Summary: We are seeking a Senior Staff AI Engineer with a combination of architectural... ..., including modeling, training, tuning, validating, deploying...
Senior, Training, Work at office, Worldwide, Flexible hours, Shift work


