Senior Site Reliability Engineer

$129k - $160k

TAG - The Aspen Group

About the Role

As a Senior Site Reliability Engineer (SRE) at TAG - The Aspen Group, you will be responsible for ensuring the reliability, performance, and scalability of our core systems. This role involves proactively building and managing monitoring solutions, leading incident response, and continuously optimizing system performance to exceed business objectives. We are actively integrating AI and machine learning into our operational workflows, and you will be on the front lines, leveraging intelligent automation and machine learning to build a proactive, resilient infrastructure. This is an opportunity to go beyond traditional SRE by applying cutting-edge technology to solve complex reliability challenges.

Responsibilities

Intelligent Site Reliability Engineering

  • Design and build highly scalable and resilient systems to support our applications and services, incorporating predictive analytics to anticipate reliability risks.
  • Develop and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs) using machine learning anomaly detection to ensure systems meet reliability targets.
  • Drive improvements in system reliability, availability, and performance through proactive measures, automation, and intelligent failure prediction.

Advanced Observability

  • Implement and manage comprehensive monitoring and alerting solutions, integrating with intelligent observability platforms that reduce alert noise and correlate events.
  • Develop and maintain dashboards and reporting tools that provide data-driven insights for actionable troubleshooting recommendations and performance optimization.
  • Evaluate and integrate advanced monitoring tools and operational intelligence platforms to enhance observability and root cause identification.

Proactive Incident Management

  • Lead and participate in incident response efforts, using intelligent log analysis and automated event correlation to speed up troubleshooting and root cause identification.
  • Develop and maintain incident management processes incorporating automated decision support systems to improve response times and minimize service disruptions.
  • Conduct post-incident reviews, using automated pattern recognition and trend analysis to identify systemic issues and implement preventive measures.

Performance and Capacity Optimization

  • Analyze performance metrics and logs, supported by advanced observability tools, to detect bottlenecks and inefficiencies.
  • Collaborate with development teams to implement automated profiling and optimization recommendations for code and infrastructure improvements.
  • Perform capacity planning using machine learning forecasting models to ensure systems can handle current and future loads.

Automation and Process Improvement

  • Develop and implement automation solutions, including intelligent runbook automation, self-healing systems, and automated incident triage.
  • Identify and drive process improvements by applying machine learning to operational data for continuous optimization.
  • Maintain documentation that includes automation and machine learning guidelines for monitoring, incident management, and SRE best practices.

Collaboration and Communication

  • Work closely with engineering, operations, and product teams to align reliability and monitoring goals, including automation adoption strategies.
  • Communicate effectively with stakeholders, providing regular updates on system health, incidents, performance improvements, and data-driven insights.
  • Foster a culture of collaboration, knowledge sharing, and automation best practices within the team and across the organization.

Qualifications

  • Bachelor's degree in computer science or a related technical field.
  • At least 5 years of experience in Site Reliability Engineering or a similar role.

Required Skills

  • Strong proficiency in at least one programming language such as Python, Go, or C#.
  • Demonstrated experience applying machine learning and automation to operational workflows such as monitoring, alerting, and incident response.
  • Expertise with infrastructure as code tools such as Terraform.
  • Proven experience working with and monitoring container environments such as Cloud Run and Kubernetes.
  • Hands-on experience working within Azure, AWS, and GCP environments (GCP preferred).
  • Strong understanding of networking, distributed systems, and cloud infrastructure.
  • Familiarity with intelligent monitoring platforms and operational analytics tools such as Prometheus, Grafana, OpenSearch, Sentry, and Google Cloud Observability.
  • Excellent problem-solving skills and the ability to work independently and as part of a team.
  • Experience with incident management, root cause analysis, and automated operational workflows.

Annual pay range: $129,000 - $160,000

Benefits: a generous package that includes paid time off, health, dental, and vision coverage, and a 401(k) savings plan with match.

Vacancy posted 1 day ago