Manager Site Reliability Engineering

$51.90 per hour

Highmark Health

Company:

Allegheny Health Network

Job Description:

GENERAL OVERVIEW:

This job is responsible for the reliability, availability, and performance of critical healthcare IT systems, principally in the Environment of Care (EOC), enabling seamless access to essential services for patients, providers, and the people we serve. Proactively identifies and mitigates potential disruptions to maintain the highest standards of care and operational efficiency. This role blends software engineering, clinical engineering, and security principles with a deep understanding of healthcare operations to minimize downtime, improve system resilience, and support clinical workflows and continuity of hospital operations. Works cross-functionally with AHN site leaders and teams to navigate, monitor, and support building automation and facility systems, clinical engineering / IoT, healthcare delivery technology architecture, infrastructure and platform operations, and cybersecurity. Fosters a culture of automation, continuous improvement, collaboration, and patient safety. Develops core metrics for monitoring and maintaining system health for SRE practitioners (e.g., latency, traffic, errors, and saturation) leveraging industry practices, manufacturer guidance, and other service delivery metrics.

ESSENTIAL RESPONSIBILITIES

  • Performs management responsibilities including, but not limited to: involvement in hiring and termination decisions, coaching and development, rewards and recognition, performance management, and staff productivity. Plans, organizes, staffs, directs, and controls the day-to-day operations of the department; develops and implements policies and programs as necessary; may have budgetary responsibility and authority. (25%)

  • Oversees the partnership with clinical engineering, cybersecurity, device manufacturers, suppliers, and Information Technology SMEs to oversee and to implement strategies for managing, monitoring, and securing a diverse range of clinical devices and other technology equipment (e.g., IoT), ensuring compliance with HIPAA and other relevant regulations (e.g., FDA, TJC, PCI). Keeps current on healthcare IT trends, including AI, security patching, and best practices for device hardening. Oversees and assists with network segmentation and access controls to isolate and to protect clinical and other critical devices. Automates monitoring tasks to improve efficiency and reduce errors. Identifies and remediates vulnerabilities in clinical devices and related infrastructure. Manages and reports issues with assets, devices, integration services, and other equipment. Engages the appropriate parties to develop and deploy a fix/solution or oversees ownership of resolution actions. Utilizes observability practices to gain deep insights into system behavior, enabling faster identification and resolution of issues. (15%)

  • Oversees the SRE partnership with Clinical Engineering and Cybersecurity Engineering to troubleshoot technical issues related to medical equipment and systems. Participates in the medical device technology lifecycle, from product/device evaluation and discovery to implementation, maintenance, and retirement. Develops the framework and structure to maintain documentation related to the IT infrastructure supporting clinical and other critical devices. Participates in the planning and oversees the execution of preventative maintenance activities. Provides direction and guidance to team members on how to analyze complex problems and develop effective solutions, how to troubleshoot system outages and performance issues, and how to work collaboratively with other IT, cybersecurity, facility, AI and application teams to resolve issues and to conduct root cause analyses. (15%)

  • Oversees the SRE partnership with facility leaders to optimize the performance and monitoring of building automation systems (BAS), including HVAC, lighting, fire suppression, security systems, etc. Manages processes and procedures used to monitor BAS performance metrics and proactively identifies potential issues. Works with facilities management to implement improvements to the BAS infrastructure. Works with cybersecurity, vendors/manufacturers, et al. to ensure the security of building automation systems and oversees monitoring of performance, service delivery, and support. (15%)

  • Oversees the SRE partnership with IT teams including, but not limited to, platform / product management, disaster recovery services, infrastructure and architecture, storage management, and release management. Participates in the planning and execution of downtime drills and system / device recovery exercises. Supports other emergency preparedness drills and exercises, as needed. Leads or participates in post-incident reviews to identify root causes and implement corrective actions. Works with cross-functional stakeholders to implement and maintain redundant systems and failover mechanisms to minimize downtime. Reviews and provides feedback on emergency operations plans and other materials which are used to respond to emergency situations (e.g., Continuity of Operations Plans, Incident Response Guides, Downtime Procedures). Manages team members who are supporting the planning and execution of system migrations, releases, and upgrades to ensure minimal disruption to clinical operations. Oversees detailed migration or installation plans, including risk assessments, rollback procedures, and communication strategies. Assists local site leaders with navigating shared services (e.g., AI, IT, Information Security, Clinical Engineering, Platform Operations, Technology Acquisition). (15%)

  • Establishes core metrics for monitoring and maintaining system health for SRE practitioners (e.g., latency, traffic, errors, and saturation). Manages the processes and procedures used for documentation and knowledge sharing, including maintaining detailed documentation of systems, device inventories, processes, and procedures. Leads by example by sharing knowledge and best practices with other staff and cross-functional teams. Provides training and mentorship to junior or less experienced team members. Stays current with the latest technologies and trends in site reliability engineering. Leads or participates in briefings with cross-functional stakeholders to manage priorities and team assignments, support ticket queues, etc. (10%)

  • Other duties as assigned or requested. (5%)

QUALIFICATIONS:

Required

  • Bachelor's degree in Computer Science, Engineering, Management Information Systems, IT, or related field or relevant experience and/or education as determined by the company in lieu of bachelor's degree.

  • 3 years in a management or leadership role

Preferred

  • Master's degree in Computer Science, Engineering, Management Information Systems, IT, or related field

  • 5 years of experience with Site Reliability Engineering (SRE), Systems Administration, or DevOps, particularly in healthcare IT

  • 5 years of experience in medical device lifecycle management, network / device segmentation, and vulnerability and patch management

  • 5 years of healthcare IT experience in architecture, automation, IoT, telemetry, telehealth, security, system development lifecycle, capacity planning, networking, continuous integration / continuous delivery (CI/CD) pipelines, incident management, scripting, metrics, monitoring, redundancy, etc.

  • 3 years of experience working in highly regulated environments

  • 3 years of experience in progressive leadership roles, preferably in a clinical engineering, IT, business continuity, backup and storage management, building automation, or cybersecurity discipline in healthcare

SKILLS:

  • Problem-Solving: Excellent analytical and troubleshooting skills; high capacity to think analytically, interpret information / observations, apply judgment, and assist with making effective, strategic decisions.

  • Collaboration: Ability to work effectively in a team environment; demonstrated ability to support multiple sites and locations while maintaining consistency in service delivery processes and procedures.

  • Communication: Strong written and verbal communication skills.

  • Flexibility: Willingness to participate in activities or incidents which may occur outside of regular work schedules.

  • Leadership: Demonstrated resource and project planning capabilities, decision making skills, history of results-oriented delivery, and effective team building across multiple locations and a diverse team of staff, partners, and stakeholders.

  • Security Awareness: Understanding of security best practices and how to apply them in a healthcare IT environment.

  • Delivery and Execution: Demonstrated competency in the execution of multiple projects, including managing resources across multiple projects to meet goals.

  • Relationships: Strong relationship building skills and ability to influence with and without authority in a matrixed organization.

Disclaimer: The job description has been designed to indicate the general nature and essential duties and responsibilities of work performed by employees within this job title. It may not contain a comprehensive inventory of all duties, responsibilities, and qualifications required of employees to do this job.

Compliance Requirement: This job adheres to the ethical and legal standards and behavioral expectations as set forth in the code of business conduct and company policies.

As a component of job responsibilities, employees may have access to covered information, cardholder data, or other confidential customer information that must be protected at all times. In connection with this, all employees must comply with both the Health Insurance Portability and Accountability Act of 1996 (HIPAA) as described in the Notice of Privacy Practices and Privacy Policies and Procedures as well as all data security guidelines established within the Company's Handbook of Privacy Policies and Practices and Information Security Policy.

Furthermore, it is every employee's responsibility to comply with the company's Code of Business Conduct. This includes but is not limited to adherence to applicable federal and state laws, rules, and regulations as well as company policies and training requirements.

Pay Range Minimum:

$51.90

Pay Range Maximum:

$83.84

Base pay is determined by a variety of factors including a candidate's qualifications, experience, and expected contributions, as well as internal peer equity, market, and business considerations. The displayed salary range does not reflect any geographic differential Highmark may apply for certain locations based upon comparative markets.

Highmark Health and its affiliates prohibit discrimination against qualified individuals based on their status as protected veterans or individuals with disabilities and prohibit discrimination against all individuals based on any category protected by applicable federal, state, or local law.

We endeavor to make this site accessible to any and all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact the email below.

For accommodation requests, please contact HR Services Online at View email address on click.appcast.io

California Consumer Privacy Act Employees, Contractors, and Applicants Notice

Req ID: J280531

Vacancy posted 1 day ago