Machine Learning Engineer, Training Infrastructure
Intellipro Group
Job Title: Machine Learning Engineer, Training Infrastructure
Position Type: Full time
Location: San Francisco, CA, USA
Salary Range: $150,000 - $250,000 (USD)
Job ID#: 158135
Job Description:
We are looking for an ML Engineer with 3+ years of experience in high-performance computing systems to manage and optimize our computational infrastructure for training and deploying our machine learning models. The ideal candidate has diverse experience managing ML workloads at scale and will support our 3DVAE and video diffusion models. We encourage you to apply even if you don't meet every requirement. We value curiosity, creativity, and the drive to solve hard problems.
Founded in 2009, IntelliPro is a global leader in talent acquisition and HR solutions. Our commitment to delivering unparalleled service to clients, fostering employee growth, and building enduring partnerships sets us apart. We continue leading global talent solutions with a dynamic presence in over 160 countries, including the USA, China, Canada, Singapore, Japan, Philippines, UK, India, Netherlands, and the EU.
IntelliPro, a global leader connecting individuals with rewarding employment opportunities, is dedicated to understanding your career aspirations. As an Equal Opportunity Employer, IntelliPro values diversity and does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, or any other legally protected group status. Moreover, our Inclusivity Commitment emphasizes embracing candidates of all abilities and ensures that our hiring and interview processes accommodate the needs of all applicants. Learn more about our commitment to diversity and inclusivity at
Compensation: The pay offered to a successful candidate will be determined by various factors, including education, work experience, location, job responsibilities, certifications, and more. Additionally, IntelliPro provides a comprehensive benefits package, all subject to eligibility.
Responsibilities
- Design, implement, and maintain scalable computing solutions for training and deploying ML models, ensuring infrastructure can handle large video datasets.
- Manage and optimize the performance of our computing clusters or cloud instances, such as AWS or Google Cloud, to support distributed training.
- Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.
- Monitor system performance and implement improvements to maximize efficiency and utilization, using tools like Airflow for orchestration.
- Collaborate across research teams to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.
Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field, with a focus on system administration.
- Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, essential for managing large-scale ML workloads.
- This role is vital for ensuring the computational backbone supports the company's ML efforts, focusing on deployment and scalability.
- Values sound engineering processes, including version control and CI/CD.
- Knowledge of containerization and orchestration technologies such as Docker and Kubernetes, required for deployments at scale.
- Understanding of distributed training techniques and how to scale models across multi-node clusters, in line with video generation needs.
- Strong problem-solving and communication skills, given the need to collaborate with diverse teams.
Vacancy posted 17 hours ago


