Machine Learning Engineer, Training Infrastructure
Ipro Networks Pte. Ltd.
Job Title: Machine Learning Engineer, Training Infrastructure
Position Type: Full time
Location: San Francisco, CA, USA
Job ID#: 158135

Job Description
We are looking for an ML Engineer with 3+ years of experience in high-performance computing systems to manage and optimize our computational infrastructure for training and deploying our machine learning models. The ideal candidate has diverse experience managing ML workloads at scale and will support our 3DVAE and video diffusion models. We encourage you to apply even if you don't meet every requirement; we value curiosity, creativity, and the drive to solve hard problems.

Responsibilities
- Design, implement, and maintain scalable computing solutions for training and deploying ML models, ensuring the infrastructure can handle large video datasets.
- Manage and optimize the performance of our computing clusters or cloud instances, such as AWS or Google Cloud, to support distributed training.
- Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.
- Monitor system performance and implement improvements to maximize efficiency and utilization, using tools like Airflow for orchestration.
- Collaborate across research teams to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.

Requirements
- Bachelor’s degree in Computer Science, Information Technology, or a related field, with a focus on system administration.
- Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, which is essential for managing large-scale ML workloads.
- Values engineering processes such as version control and CI/CD.
- Knowledge of containerization technologies like Docker and Kubernetes, required for deployments at scale.
- Understanding of distributed training techniques and how to scale models across multi-node clusters, in line with video generation needs.
- Strong problem-solving and communication skills, given the need to collaborate with diverse teams.

This role is vital for ensuring the computational backbone supports the company’s ML efforts, focusing on deployment and scalability.

About Us
Founded in 2009, IntelliPro is a global leader in talent acquisition and HR solutions. Our commitment to delivering unparalleled service to clients, fostering employee growth, and building enduring partnerships sets us apart. We continue leading global talent solutions with a dynamic presence in over 160 countries, including the USA, China, Canada, Singapore, Japan, Philippines, UK, India, Netherlands, and the EU.

IntelliPro, a global leader connecting individuals with rewarding employment opportunities, is dedicated to understanding your career aspirations. As an Equal Opportunity Employer, IntelliPro values diversity and does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, or any other legally protected group status. Moreover, our Inclusivity Commitment emphasizes embracing candidates of all abilities and ensures that our hiring and interview processes accommodate the needs of all applicants. Learn more about our commitment to diversity and inclusivity at .

Compensation
The pay offered to a successful candidate will be determined by various factors, including education, work experience, location, job responsibilities, certifications, and more. Additionally, IntelliPro provides a comprehensive benefits package, all subject to eligibility.


