Site Reliability Engineer

LE013 Zelis Payments, LLC

Position Overview

We are seeking a strategic and results‑oriented Site Reliability Engineer (Golden Signals Lead) to define and drive the observability roadmap across all platforms.

Job Title: Site Reliability Engineer
Location: Remote, In‑office, or Hybrid
Department: IT Operations
Reports To: Manager of Observability & Reliability
Job Type: Full‑Time Employee (FTE)

Summary

This role is responsible for establishing a consistent and scalable approach to monitoring and alerting, leveraging golden signals to enhance system reliability and operational efficiency. The successful candidate will collaborate closely with the ZEIT SRE team, engineering leads, and India‑based resources to build a unified observability strategy aligned with organizational goals.

Key Responsibilities

Observability Roadmap Development
  • Define a unified vision for observability across all platforms, with golden signals as the foundation for monitoring and alerting.
  • Develop and maintain a comprehensive roadmap to improve observability, reduce tool redundancy, and standardize practices across platforms.
  • Establish and track key performance indicators (KPIs) to measure progress and ensure accountability for roadmap milestones.

Collaboration and Alignment
  • Partner with the ZEIT SRE team and engineering leads to break down silos and promote consistent observability practices.
  • Drive cross‑platform collaboration to reduce operational inconsistencies and define a "north star" approach for observability.
  • Facilitate knowledge sharing to ensure alignment on current and future observability initiatives.

Monitoring and Alerting
  • Standardize the implementation of golden signals across applications to improve system reliability and incident detection.
  • Optimize alerting tools and reduce redundant or ineffective monitoring interfaces.
  • Lead efforts to enhance observability while minimizing operational overhead for platform teams.
  • Maintain and enhance observability dashboards, delivering actionable insights into application health and performance.

Operational Support and Improvement
  • Identify and address gaps in existing observability practices, prioritizing long‑term scalability and reliability.
  • Collaborate with India‑based resources to execute observability build‑outs efficiently and with high quality.
  • Reduce client, provider, and print facility‑raised issues through proactive monitoring and early detection.

Reporting and Continuous Improvement
  • Measure and report on observability success metrics, including actionable alert volume and reduced issue escalations.
  • Continuously evaluate and refine observability strategies based on stakeholder feedback and evolving organizational needs.

Qualifications
  • Educational Background: Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent experience).
  • Minimum 5 years of experience in Site Reliability Engineering, DevOps, or a related role with a strong focus on observability.
  • Hands‑on experience with .NET (C#), including advanced knowledge of ASP.NET Core, Web APIs, and performance optimization.
  • Demonstrated success in designing and implementing monitoring and alerting solutions across complex IT environments.
  • Deep understanding of SRE principles and golden signals for system monitoring.
  • Proficiency with observability tools such as Prometheus, Grafana, Splunk, New Relic, or Datadog.
  • Familiarity with cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
  • Advanced proficiency in scripting languages such as PowerShell.
  • Experience in front‑end development using React.js.
  • Advanced knowledge of .NET.
  • Strong leadership and collaboration abilities, with a proven ability to align diverse teams toward common goals.
  • Excellent analytical and problem‑solving skills, with a proactive approach to identifying and resolving issues.
  • Clear and effective communication skills, capable of conveying technical concepts to stakeholders at all levels.
  • Open to candidates without visa sponsorship.

Preferred Qualifications
  • Experience with building observability roadmaps and scaling solutions in enterprise environments.
  • Certifications in cloud or DevOps‑related disciplines (e.g., AWS Certified DevOps Engineer, Kubernetes Administrator).

Location and Workplace Flexibility

We have offices in Atlanta, GA; Boston, MA; Morristown, NJ; Plano, TX; St. Louis, MO; St. Petersburg, FL; and Hyderabad, India. We foster a hybrid and remote‑friendly culture; work locations are based on the needs of the position and determined by leadership.

Equal Employment Opportunity

Zelis is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. We welcome applicants from all backgrounds and encourage you to apply even if you don’t meet 100% of the qualifications for the role.

Vacancy posted 5 days ago