AIML - Staff ML Infrastructure Engineer, ML Platform & Technology - Pre-training Infrastructure
$181.1k - $318.4k
Apple Oakbrook
San Francisco Bay Area, California, United States
Machine Learning and AI

Description

As an engineer on the ML Compute team, you will:

- Drive large‑scale pre‑training initiatives to support cutting‑edge foundation models, focusing on resiliency, efficiency, scalability, and resource optimization.
- Enhance distributed training techniques for foundation models.
- Research and implement new patterns and technologies to improve system performance, maintainability, and design.
- Optimize execution and performance of workloads built with JAX, PyTorch, XLA, and CUDA on large distributed systems.
- Leverage high‑performance networking technologies such as NCCL for GPU collectives and TPU interconnect (ICI/Fabric) for large‑scale distributed training.
- Architect a robust MLOps platform to streamline and automate pre‑training operations.
- Operationalize large‑scale ML workloads on Kubernetes, ensuring distributed training jobs are robust, efficient, and fault‑tolerant.
- Lead complex technical projects, defining requirements and tracking progress with team members.
- Collaborate with cross‑functional engineers to solve large‑scale ML training challenges.
- Mentor engineers in areas of your expertise, fostering skill growth and knowledge sharing.
- Cultivate a team centered on collaboration, technical excellence, and innovation.

Minimum Qualifications

- Bachelor's degree in Computer Science, engineering, or a related field
- 6+ years of hands‑on experience building scalable backend systems for training and evaluation of machine learning models
- Proficiency in relevant programming languages, such as Python or Go
- Strong expertise in distributed systems, reliability and scalability, containerization, and cloud platforms
- Proficiency in cloud computing infrastructure and tools: Kubernetes, Ray, PySpark
- Ability to clearly and concisely communicate technical and architectural problems, while working with partners to iteratively find

Preferred Qualifications

- Advanced degree in Computer Science, engineering, or a related field
- Proficiency in working with and debugging accelerators, such as GPU, TPU, and AWS Trainium
- Proficiency in ML training and deployment frameworks, such as JAX, TensorFlow, PyTorch, TensorRT, and vLLM

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $181,100 and $318,400, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

At Apple, we believe accessibility is a fundamental human right. You’ll find that idea reflected in everything here: in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong.

Apple accepts applications to this posting on an ongoing basis.