Staff AI/ML Engineer - Large-Scale Systems
PrismML
We build high-performance foundation models designed to run efficiently across a wide range of environments, from edge devices to large-scale deployments. Our work spans models from ~1B to 100B+ parameters across LLMs, diffusion models, and other modalities, with a strong focus on scalable training, efficient inference, and real-world deployment.

Role Overview

We are seeking a Staff-level (or higher) AI/ML engineer to lead large-scale model training efforts. This role combines hands-on ownership of large training runs with responsibility for setting technical direction, mentoring engineers, and improving model quality and system performance across the organization.

Responsibilities

You will design, implement, and optimize distributed training systems for large-scale models across all major training phases. Core responsibilities include:

- Leading model development across pretraining, fine-tuning, and post-training stages
- Designing and improving data pipelines, including curation, filtering, deduplication, and dataset composition
- Improving training efficiency, scalability, and reliability across large distributed systems
- Optimizing model performance with respect to convergence, throughput, memory usage, and stability
- Translating cutting-edge research into robust, production-ready systems
- Providing technical leadership through mentoring, design reviews, and cross-functional collaboration

Basic Qualifications

You bring deep experience in large-scale AI/ML systems and strong fundamentals in modern model training:

- 8–10+ years of experience in machine learning or AI, or a strong publication record
- Strong Python programming skills with production-quality code
- Hands-on experience training large-scale models (multi-billion parameters)
- Solid understanding of optimization, distributed training, and training dynamics
- Experience with modern model training workflows (e.g., pretraining, fine-tuning, reinforcement learning approaches)
- Proven ability to mentor and lead other AI/ML engineers

Preferred Qualifications

You have additional experience aligned with large-scale, high-performance AI/ML systems:

- Experience training very large models (tens to hundreds of billions of parameters)
- Familiarity with modern accelerator hardware (e.g., GPUs or TPUs) and distributed training frameworks
- Experience improving system performance, resource utilization, and training efficiency
- Exposure to deployment environments with real-world constraints (e.g., latency, cost, or hardware limitations)
- Experience with advanced optimization techniques and scaling strategies
- Contributions to research, publications, or open-source AI/ML systems

Ideal Candidate Profile

You have led or significantly contributed to training large models end-to-end, understand common failure modes in large-scale training systems, and know how to debug and improve them. You care about building efficient, reliable systems that work in real-world settings, enjoy mentoring others, and thrive at the intersection of research, engineering, and product.

