MLOps Engineer

VBeyond

Job Description:


Required Skills & Experience

  • Strong experience with production ML systems, incident management, and on-call operations.
  • Deep hands-on expertise with AWS, Kubernetes/EKS, and infrastructure-as-code.
  • Experience executing platform or environment migrations with minimal production risk.
  • Familiarity with CI/CD pipelines, observability, and secure service operations.
  • Ability to work independently, own deliverables end-to-end, and leave high-quality documentation.
Key Responsibilities


Realtime-ML Operations & Playground Migration (~45%)
  • Act as primary on-call and maintenance owner for the Realtime-ML production stack during the Auriga migration window.
  • Monitor system health, triage and resolve incidents, and address data/model-serving issues.
  • Apply security updates, dependency patches, and ensure SLA continuity for downstream consumers.
  • Lead migration of the Realtime-ML Playground environment, including infra parity checks, configuration migration, integration testing, and documentation.
EKS Migration - HarperCollins Bundles (~40%)
  • Execute end-to-end migration of HarperCollins service bundles to AWS EKS.
  • Author Kubernetes manifests, configure IAM and networking, and update CI/CD pipelines.
  • Validate in staging and perform a controlled production cutover.
  • Produce rollback plans and operational runbooks.
Buildings Production Pipeline (Supporting, ~15%)
  • Contribute to the design and initial build-out of a pipeline that streams ML-detected missing buildings into the Basemap data flow.
  • Deliver pipeline scaffolding, integration patterns with upstream ML outputs, and schema inputs.
  • Leave a documented, partially implemented pipeline with clear handoff notes for post-engagement completion.
Key Deliverables (by End of Engagement)
  • Stable Realtime-ML production environment throughout Auriga migration, with documented incidents and resolutions.
  • Fully migrated Realtime-ML Playground with handoff documentation.
  • HarperCollins bundles live on EKS with completed cutover and operational runbooks.
  • Partially implemented Buildings pipeline with documentation enabling seamless handoff.
  • All code, IaC, and documentation checked into team repositories.
Vacancy posted more than 2 months ago