Executive Director, AI Ops Engineering

$175.1k - $334.75k

CVS Health

We're building a world of health around every individual - shaping a more connected, convenient and compassionate health experience. At CVS Health®, you'll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold ourselves accountable and prioritize safety and quality in everything we do. Join us and be part of something bigger - helping to simplify health care one person, one family and one community at a time.

Executive Director, AI Platform SRE

About the Role

CVS Health is seeking an Executive Director, AI Ops Engineering to build and lead a team of professionals responsible for the continuous operation, monitoring, and optimization of CVS's Enterprise AI environment. This is first and foremost an engineering leadership role - your core accountability is ensuring the platform is always on, always performing, and always improving.

CVS Health's AI platform is a critical enterprise asset powering clinical, operational, and consumer capabilities at scale across one of the nation's largest healthcare organizations. Keeping it reliable, observable, and continuously improving is the mission. Reporting to the Global Head of Infrastructure/AI Operations and Service Delivery, you will establish and maintain operational baselines across the full infrastructure stack, ensure all changes are continuously monitored, observed, and adjusted, and drive the highest levels of availability, reliability, and scalability across every layer of the environment.

This is a greenfield organizational build - the person in this role will define the operating model, shape the team culture, and establish the engineering standards that will govern CVS's AI infrastructure for years ahead. If you thrive on building from the ground up, this role was designed for you.

Teams You Will Lead

You will build and lead a multi-disciplinary SRE organization structured across nine functional areas spanning core platform operations and innovation. The team is organized to ensure full-spectrum coverage of the AI environment - from hardware and network through platform reliability, security, observability, and 24/7 operations - while continuously developing advanced automation and self-healing capabilities.

Core operational teams cover the following domains:
  • Platform Reliability - SLO/SLI/error budget management, availability baseline enforcement, cluster administration, GPU quota governance, and infrastructure-as-code
  • Infrastructure - Compute, storage, and hardware lifecycle management, including compliance controls and data isolation
  • Network - High-performance GPU networking, fabric management, security segmentation, and continuous network baseline enforcement
  • Observability - End-to-end monitoring strategy, alerting pipelines, SLI/SLO dashboards, and the feedback loops that connect operational data to improvement
  • Security SRE - Security posture, access controls, audit logging, vulnerability management, and regulatory compliance (HIPAA, NIST AI RMF)
  • 24/7 Operations Center - Round-the-clock incident response, on-call protocols, escalation management, and shift-level change execution, structured for sustainable coverage with no mandatory overtime
  • Change & Release Management - Change lifecycle governance, ITIL process management, compliance frameworks, ModelOps boundary definition, and platform knowledge base
  • FinOps - GPU cost governance, utilization optimization, tenant quota enforcement, and chargeback models in partnership with Finance

In addition to core operations, you will oversee three Innovation PODs - focused on AI-driven automation, infrastructure-as-code and self-service capabilities, and chaos engineering and resilience testing - with the goal of continuously reducing manual toil and building a self-healing, self-optimizing platform over time.

What You'll Do

Leadership
  • Own the SRE vision, strategy, and long-range roadmap with availability (>99.99%), reliability, and scalability as the primary measures of success
  • Lead, develop, and integrate all functional teams into a cohesive, always-on operations organization - setting clear ownership, accountability, and performance expectations for each team and each engineer
  • Establish and enforce operational baselines across all platform components; ensure deviations are detected, escalated, and resolved within defined SLAs
  • Drive end-to-end observability with continuous feedback loops connecting monitoring data to incident response, change decisions, and improvement cycles
  • Oversee change management ensuring every modification is risk-assessed, monitored during rollout, and baseline-validated post-deployment
  • Ensure configuration consistency and drift detection across all platform components to prevent baseline degradation over time
  • Build and sustain a high-performing 24/7 operations model - zero mandatory overtime, zero burnout attrition, and measurable team health and retention
  • Empower the Security SRE Lead to implement and maintain a world-class security posture, minimizing risk and ensuring robust compliance with frameworks like HIPAA and NIST AI RMF
  • Direct Innovation POD strategy to develop self-healing and autonomous capabilities that proactively prevent degradation before it impacts availability
  • Lead GPU FinOps governance - utilization optimization, tenant quota enforcement, and cost reduction - in partnership with the Finance organization
  • Manage vendor relationships and performance accountability

Program Governance
  • Lead the structured transition of operational ownership from the incumbent managed services provider to CVS's internal SRE organization, governing phased handoffs, competency validation, and milestone sign-offs, ensuring a seamless transition with minimal disruption to platform availability and business operations
  • Establish and lead the long-term operating model by institutionalizing key technical, architectural, and delivery leadership capabilities into permanent CVS roles, ensuring the organization is fully self-sustaining at program close

What You'll Bring
  • 10+ years in SRE, platform operations, or DevOps engineering leadership with a demonstrated focus on availability and reliability outcomes
  • 5+ years leading multiple technical teams simultaneously, including 24/7 operations organizations - with measurable team health, retention, and performance outcomes
  • Proven success establishing and enforcing operational baselines, SLO/SLI/error budget frameworks, and observability-driven continuous improvement in complex environments
  • Deep expertise in Kubernetes/OpenShift, IaC, GPU computing, and AI/ML infrastructure
  • Experience managing large-scale MSP transitions or platform operational handoffs while ensuring business continuity and minimizing disruption
  • Demonstrated FinOps and GPU cost optimization experience in cloud or on-premises environments
  • Security framework implementation and compliance program management in regulated industries (HIPAA, NIST AI RMF)
  • Track record building sustainable 24/7 operations models with measurable retention and no burnout-related attrition
  • Executive stakeholder communication, vendor negotiation, and budget ownership
  • Background in innovation programs, POD structures, or centers of excellence
  • Willingness to travel and work off hours as required. Our 24/7 model is designed for sustainable, predictable coverage that eliminates mandatory overtime. As a leader, you will be an escalation point for critical incidents, but our goal is a resilient system and culture that protects our team's time.

Preferred Qualifications
  • NVIDIA AI Enterprise, Run:AI, or GPU orchestration platform experience
  • Healthcare or regulated industry background
  • Certifications: ITIL Expert, PMP, AWS/Azure/GCP, CISSP
  • Familiarity with Cisco UCS, VAST storage, EVPN-VXLAN, and RDMA/RoCE protocols
  • Chaos engineering and AI-driven operations experience
  • Thought leadership: published work or speaking at industry conferences

Education

Required: Bachelor's in Computer Science, Engineering, or related field | Preferred: Master's degree

Pay Range

The typical pay range for this role is:

$175,100.00 - $334,750.00

This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company's equity award program.


Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve, and we are committed to fostering a workplace where every colleague feels valued and knows they belong.

Great benefits for great people

We take pride in offering a comprehensive and competitive mix of pay and benefits that reflects our commitment to our colleagues and their families.

This full-time position is eligible for a comprehensive benefits package designed to support the physical, emotional, and financial well-being of colleagues and their families. The benefits for this position include medical, dental, and vision coverage, paid time off, retirement savings options, wellness programs, and other resources, based on eligibility.

Additional details about available benefits are provided during the application process and on Benefits Moments.

We anticipate the application window for this opening will close on: 05/31/2026

Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.
Vacancy posted 2 days ago