Staff Machine Learning Systems Engineer (MLOps)
Hims & Hers
Hims & Hers is the leading health and wellness platform, on a mission to help the world feel great through the power of better health. We are redefining healthcare by putting the customer first and delivering access to care that is affordable, accessible, and personal, from diagnosis to treatment to delivery. No two people are the same, so we provide access to personalized care designed for results. By normalizing health & wellness challenges and innovating on their solutions, we’re making better health outcomes easier to achieve.

Hims & Hers is a public company, traded on the NYSE under the ticker symbol “HIMS.” To learn more about the brand and offerings, you can visit hims.com/about and hims.com/how-it-works. For information on the company’s outstanding benefits, culture, and its talent-first flexible/remote work approach, see below and visit

About the Role:

We're hiring a Staff ML Systems Engineer to design, build, and operate the production infrastructure that powers AI across Hims & Hers. This is a deeply technical, hands-on infrastructure role focused on the systems underneath AI: the Kubernetes platform, CI/CD and GitOps pipelines, infrastructure-as-code, inference and model-serving infrastructure, and the observability and tracing stack that keeps AI services reliable, debuggable, and compliant in production. You won't just deploy models; you'll own the machinery that lets every AI team ship and operate safely.

You'll own critical systems like our EKS clusters, deployment and autoscaling infrastructure, IAM and secrets management, LLM tracing/observability pipelines (Langfuse, Datadog, OpenTelemetry), and the developer platform that AI and product engineers rely on daily. You'll partner with ML engineers, product engineers, and clinical teams to ensure our AI systems are reliable, observable, secure, and trustworthy in a regulated healthcare environment.

This role is ideal for someone who thinks in systems and infrastructure, cares deeply about reliability, security, and cost, and wants to define how AI runs in production at a company where it directly impacts patient outcomes.

You Will:

Own and scale the AI compute and deployment platform
- Own and evolve our containerized application deployment platform and related systems for AI workloads, encompassing general process and job orchestration (e.g. Kubernetes): cluster operations, node lifecycle, autoscaling (Karpenter), storage (EBS CSI), and workload isolation across staging and production.
- Build and maintain GitOps-based deployment pipelines (Helm/Kustomize overlays, environment promotion) that let teams ship AI services safely and repeatably.
- Design ephemeral/preview environments, feature-branched deployments, and nightly release pipelines so teams can validate AI changes in production-like conditions before release.
- Drive efficiency and cost management across compute, autoscaling, and inference infrastructure.

Build inference and model-serving infrastructure
- Operate and scale inference infrastructure and a multi-provider LLM AI gateway (e.g. Bedrock, Vertex, and other providers), including credentials, rate limits, and failover.
- Build reliable serving patterns for LLM-powered workflows: routing, grounding, tool execution, and context assembly at the platform level.
- Create reusable infrastructure abstractions and contracts that standardize how AI services are deployed, configured, and consumed across the company.

Own observability, tracing, and reliability
- Own the LLM/AI observability and tracing stack, provisioning and scaling systems like Langfuse, Datadog (dd-trace), OpenTelemetry tracing (OTLP), and the underlying datastores (e.g. ClickHouse), so AI behavior is auditable and debuggable in production.
- Build analytics and monitoring pipelines that surface latency, error, quality, and regression signals to engineering and clinical stakeholders.
- Define SLOs, alerting, on-call runbooks, and incident response for AI infrastructure; lead troubleshooting and continuously raise platform reliability.

Scale the AI developer platform and CI/CD
- Own and improve the monorepo build system and CI/CD pipelines for AI workloads, including eval workflows, Docker image builds, automated PR checks and convention enforcement, and cross-platform test execution.
- Own shared infrastructure tooling, CLIs, and IaC modules (Terraform, Scalr) that AI and product engineers use daily.
- Identify and eliminate platform bottlenecks (reducing CI/CD cycle times, build latency, and deployment friction) to improve developer velocity across the Applied AI organization.

Drive security, compliance, and governance at the systems level
- Build IAM, OIDC, and secrets management as first-class infrastructure: scoped, least-privilege roles, write-only secret rotation, and cross-account access audits.
- Encode security-by-default, scope boundaries, and access controls into the platform so AI services are HIPAA-compliant and privacy-first.
- Partner with clinical, legal, security, and data platform teams (including Databricks/Unity Catalog access governance) to enforce compliant, auditable data access.

Set technical direction and raise the bar
- Drive multi-quarter infrastructure initiatives, from cluster and deployment architecture to inference platform, GPU compute strategy, and observability evolution.
- Write and lead technical design documents and design reviews, define infrastructure standards and development-workflow conventions, and contribute to technical governance across AI engineering.
- Mentor engineers on reliability engineering, infrastructure-as-code, and MLOps best practices, and bridge the gap between prototypes and production-grade systems.

You Have:
- 8+ years of professional experience in infrastructure, platform, DevOps, or SRE engineering, with at least 3 years focused on ML/AI systems in production.
- Deep, hands-on experience with Kubernetes (ideally EKS) and the cloud-native ecosystem: autoscaling, GitOps, Helm/Kustomize, operating clusters at scale, and general process/job orchestration.
- Strong infrastructure-as-code skills (Terraform) and experience designing secure cloud architectures: IAM, OIDC, secrets management, and least-privilege access.
- Strong proficiency in Python, with experience building production infrastructure tooling, CLIs, and data/observability pipelines.
- 2+ years of experience operating LLM-based systems in production (LLMOps): inference routing, serving, tracing, and the reliability patterns needed to run them at scale.
- Hands-on experience with observability/tracing stacks (Datadog, OpenTelemetry, Langfuse, or equivalent) and metrics/log/trace pipelines.
- Experience designing and maintaining CI/CD pipelines, build systems, and developer tooling for fast-moving engineering teams.
- A systems-and-operations mindset: you think about failure modes, SLOs, observability, security, and long-term maintainability before shipping.
- Experience writing and leading technical design documents (TDDs/RFCs) for infrastructure-scale initiatives.
- Strong collaboration skills across engineering, ML, product, security, and clinical teams.
- A deep appreciation for safety, privacy, and security, ideally with experience in a regulated domain such as healthcare, fintech, or life sciences.

Nice to Have:
- Experience with AWS (EKS, Bedrock, S3, CloudFront, IAM) and multi-cloud (GCP/Vertex AI) inference routing.
- Experience with Databricks (MLflow, Unity Catalog, Spark, Delta) and data platform access governance.
- Experience provisioning LLM observability infrastructure (Langfuse, ClickHouse, OpenTelemetry/OTLP tracing, LogFire) and LLM behavior monitoring.
- Experience with Karpenter, cluster autoscaling, and cost optimization for ML compute.
- Experience with monorepo build systems (Pants, Bazel) and large-scale CI/CD.
- Experience building automated PR-review / convention-enforcement pipelines and developer-workflow standards.
- Familiarity with Vertex AI Agent Builder, Vertex AI Model Registry, or GCP managed AI/ML services as a stretch growth area.
- Contributions to open-source infrastructure, IaC modules, SDKs, or developer tooling projects.

Why Join Us

At Hims & Hers, you'll be part of a small, high-impact team defining how AI infrastructure runs in production for healthcare. The platform you build (compute, deployment, inference, observability, and security) is the foundation that every AI-powered experience depends on. Reliability, security, and developer velocity aren't afterthoughts here; they're the product. Join us in building the infrastructure that makes healthcare AI smarter, safer, and more trustworthy.

Our Benefits (there are more but here are some highlights):
- Competitive salary & equity compensation for full-time roles
- Unlimited PTO, company holidays, and quarterly mental health days
- Comprehensive health benefits including medical, dental & vision, and parental leave
- Employee Stock Purchase Program (ESPP)
- 401k benefits with employer matching contribution
- Offsite team retreats

We are committed to building a workforce that reflects diverse perspectives and prioritizes ethics, wellness, and a strong sense of belonging. If you're excited about this role, we encourage you to apply, even if you're not sure if your background or experience is a perfect match.

Hims considers all qualified applicants for employment, including applicants with arrest or conviction records, in accordance with the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance, the California Fair Chance Act, and any similar state or local fair chance laws.

It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.

Hims & Hers is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, please contact us at View email address on click.appcast.io and describe the needed accommodation. Your privacy is important to us, and any information you share will only be used for the legitimate purpose of considering your request for accommodation. Hims & Hers gives consideration to all qualified applicants without regard to any protected status, including disability. Please do not send resumes to this email address. To learn more about how we collect, use, retain, and disclose Personal Information, please visit our Global Candidate Privacy Statement.