ML Platform Engineer - GPU Infrastructure
Optimal
Job Title: ML Platform Engineer - GPU Infrastructure

Support the team by designing, implementing, and maintaining the automation and ML workload enablement layer of the GPU cluster platform. This role focuses on optimizing GPU compute environments for AI/ML training and Isaac Sim simulation workloads, integrating GPU jobs into CI/CD pipelines, standardizing runtime environments, and supporting reliable storage and artifact management.

Required Experience
- 3+ years of experience in ML Platform Engineering, DevOps, Infrastructure Engineering, or a related field
- Bachelor's or Master's degree in Systems Engineering, Computer Science, Computer Engineering, or a related discipline

Responsibilities
- Support GPU cluster platforms for AI/ML and simulation workloads
- Optimize GPU compute environments for ML training and Isaac Sim execution
- Integrate GPU workload execution into CI/CD pipelines
- Standardize runtime environments using containers and automation tools
- Manage storage, artifacts, and workload outputs
- Troubleshoot and improve platform reliability, scalability, and performance
- Collaborate with ML, infrastructure, and engineering teams

Required Skills
- Experience with Linux, Kubernetes, Docker, and GPU infrastructure
- Knowledge of CI/CD tools and automation scripting (Python/Bash)
- Experience supporting AI/ML workloads and distributed systems
- Familiarity with NVIDIA GPU technologies and containerized environments
- Strong troubleshooting and performance optimization skills

Preferred Skills
- Experience with Isaac Sim or simulation workloads
- Exposure to cloud platforms (AWS, Azure, or GCP)
- Knowledge of monitoring and observability tools such as Grafana or Prometheus

Optimal is seeking an ML Platform Engineer focusing on GPU Infrastructure in Kentucky. You will be responsible for optimizing GPU compute environments for AI/ML workloads and integrating them into CI/CD pipelines. This role requires a strong background in ML Platform Engineering...

