Senior Site Reliability Engineer

$129k - $160k

TAG - The Aspen Group

About the Company

As a Senior Site Reliability Engineer (SRE) at TAG – The Aspen Group, you will be responsible for ensuring the reliability, performance, and scalability of our core systems. This role involves proactively building and managing monitoring solutions, leading incident response, and continuously optimizing system performance to exceed business objectives. We are actively integrating AI and machine learning into our operational workflows, and you will be on the front lines, using intelligent automation and machine learning to build proactive, resilient infrastructure. This is an opportunity to go beyond traditional SRE by applying cutting-edge technology to solve complex reliability challenges.

Responsibilities

Intelligent Site Reliability Engineering

  • Design and build highly scalable and resilient systems to support our applications and services, incorporating predictive analytics to anticipate reliability risks.
  • Develop and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs) using machine learning anomaly detection to ensure systems meet reliability targets.
  • Drive improvements in system reliability, availability, and performance through proactive measures, automation, and intelligent failure prediction.

Advanced Observability

  • Implement and manage comprehensive monitoring and alerting solutions, integrating with intelligent observability platforms that reduce alert noise and correlate events.
  • Develop and maintain dashboards and reporting tools that provide data-driven insights for actionable troubleshooting recommendations and performance optimization.
  • Evaluate and integrate advanced monitoring tools and operational intelligence platforms to enhance observability and root cause identification.

Proactive Incident Management

  • Lead and participate in incident response efforts, using intelligent log analysis and automated event correlation to speed up troubleshooting and root cause identification.
  • Develop and maintain incident management processes incorporating automated decision support systems to improve response times and minimize service disruptions.
  • Conduct post-incident reviews, using automated pattern recognition and trend analysis to identify systemic issues and implement preventive measures.

Performance and Capacity Optimization

  • Analyze performance metrics and logs, supported by advanced observability tools, to detect bottlenecks and inefficiencies.
  • Collaborate with development teams to implement automated profiling and optimization recommendations for code and infrastructure improvements.
  • Perform capacity planning using machine learning forecasting models to ensure systems can handle current and future loads.

Automation and Process Improvement

  • Develop and implement automation solutions, including intelligent runbook automation, self-healing systems, and automated incident triage.
  • Identify and drive process improvements by applying machine learning to operational data for continuous optimization.
  • Maintain documentation that includes automation and machine learning guidelines for monitoring, incident management, and SRE best practices.

Collaboration and Communication

  • Work closely with engineering, operations, and product teams to align reliability and monitoring goals, including automation adoption strategies.
  • Communicate effectively with stakeholders, providing regular updates on system health, incidents, performance improvements, and data-driven insights.
  • Foster a culture of collaboration, knowledge sharing, and automation best practices within the team and across the organization.

Qualifications

  • Bachelor's degree in computer science or a related technical field.
  • At least 5 years of experience in Site Reliability Engineering or a similar role.

Required Skills

  • Strong proficiency in at least one programming language such as Python, Go, or C#.
  • Demonstrated experience applying machine learning and automation to operational workflows such as monitoring, alerting, and incident response.
  • Expertise with infrastructure as code tools such as Terraform.
  • Proven experience working with and monitoring container environments such as Cloud Run and Kubernetes.
  • Hands-on experience working within Azure, AWS, and GCP environments (GCP preferred).
  • Strong understanding of networking, distributed systems, and cloud infrastructure.
  • Familiarity with intelligent monitoring platforms and operational analytics tools such as Prometheus, Grafana, OpenSearch, Sentry, and Google Cloud Observability.
  • Excellent problem-solving skills and the ability to work independently and as part of a team.
  • Experience with incident management, root cause analysis, and automated operational workflows.

Annual pay range: $129,000-$160,000

Benefits: a generous package that includes paid time off; health, dental, and vision coverage; and a 401(k) savings plan with match.

Vacancy posted 5 days ago