Principal Site Reliability Engineer

$146.96k - $220.44k

ViziRecruiter,LLC.

Introduction

Ahold Delhaize USA, a division of global food retailer Ahold Delhaize, is part of the U.S. family of brands, which also includes five leading omnichannel grocery brands: Food Lion, Giant Food, The GIANT Company, Hannaford and Stop & Shop. Ahold Delhaize USA associates support the brands with a wide range of services, including Finance, Legal, Sustainability, Commercial, Digital and E-commerce, Technology and more.

Overview

The Site Reliability Engineer (SRE) IV is a senior technical leader responsible for designing, guiding, and scaling site reliability engineering practices across complex, distributed systems. This role plays a crucial part in driving operational excellence, ensuring system resiliency, and fostering a high-performing engineering culture. The SRE IV works closely with senior leadership, engineering, and product teams to set strategic goals around availability, performance, and incident response while leading large-scale reliability initiatives.

This position emphasizes deep technical expertise in platforms such as Spring Boot, Java, Tomcat, Redis, and Kafka, along with infrastructure tooling like AKS, Kubernetes, ArgoCD, Terraform, GitHub Actions, and observability platforms like Datadog. The ideal candidate will also bring strong experience working with Ubuntu/Linux environments, containerization with Docker, and automation of operational workflows across a modern DevOps toolchain.

Our flexible/hybrid work schedule includes 3 in-person days at our Chicago, IL office and 2 remote days. Applicants must be currently authorized to work in the United States on a full-time basis.

Responsibilities

  • Architect, evolve, and lead implementation of enterprise-level SRE frameworks, tools, and cloud-native reliability strategies.
  • Build, scale, and manage microservices platforms using Spring Boot, Java, Tomcat, and Redis with Kubernetes and AKS.
  • Lead technical design reviews, chaos testing, and infrastructure planning with an emphasis on scalability, high availability, and fault tolerance.
  • Define, implement, and refine SLOs/SLIs and operational health indicators for business-critical services.
  • Automate infrastructure provisioning and application deployment workflows using Terraform, GitHub Actions, and ArgoCD.
  • Drive observability and telemetry adoption using Datadog, including dashboards, alerts, custom metrics, and distributed tracing.
  • Act as incident commander during critical production issues; conduct blameless postmortems and guide root cause remediation.
  • Lead cross-team efforts to reduce mean time to detect (MTTD) and mean time to resolve (MTTR), and to promote self-healing systems.
  • Partner with security and compliance teams to ensure that systems are secure, auditable, and operationally compliant.
  • Enhance service resiliency through strategies including Kafka-based event-driven architecture, retries, rate limiting, and circuit breakers.
  • Mentor junior SREs and engineers, lead technical communities of practice, and promote a culture of continuous improvement.
  • Maintain and improve Ubuntu-based production systems and containerized workloads with Docker.
  • Evaluate and integrate emerging DevOps technologies to support scalability and reliability objectives.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field; equivalent practical experience may be considered.
  • 8+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering roles in large-scale production environments.
  • Expertise in building and maintaining Java-based microservices using Spring Boot, Tomcat, and Redis in containerized deployments.
  • Strong hands-on experience with Kubernetes, AKS, and ArgoCD for orchestration and GitOps deployment workflows.
  • Proficiency in Python, Java, Bash, or Go for automation, scripting, and infrastructure tooling.
  • Proven ability to implement observability platforms and practices using Datadog (metrics, logs, traces, dashboards, alerts).
  • Advanced experience working with CI/CD pipelines using GitHub and GitHub Actions.
  • Deep understanding of networking, Linux (especially Ubuntu), distributed systems, and container security.
  • Experience operating message-driven architectures using Kafka, with an emphasis on throughput, retries, and resilience.
  • Solid knowledge of Terraform and infrastructure as code best practices.
  • Excellent communication, collaboration, and stakeholder alignment skills across engineering and business teams.

Salary Range: $146,960 - $220,440

Actual compensation offered to a candidate may vary based on their unique qualifications and experience, internal equity, and market conditions. Final compensation decisions will be made in accordance with company policies and applicable laws.

Vacancy posted 4 days ago