Systems Engineer - Site Reliability Engineering
Full-time
Marriott
JOB SUMMARY:
The Systems Engineer - Site Reliability Engineering (SRE) is responsible for the reliability, scalability, and performance of mission-critical cloud and on-prem services that support millions of Marriott customers globally. This role involves overseeing incident management, driving automation efforts, and working closely with cross-functional teams to ensure alignment between SRE strategy and business objectives. The engineer partners closely with Product teams, Applications teams, Infrastructure, and the broader Applications and Infrastructure Delivery teams to develop key metrics and KPIs that improve application stability, availability, and performance. The ideal candidate will bring strong communication skills, collaborating with key stakeholders across the company to optimize cloud infrastructure and uphold the highest standards of operational excellence in a dynamic, fast-paced environment.
CANDIDATE PROFILE:
Required:
* Undergraduate degree in an engineering or computer science discipline and/or equivalent experience/certification
* 5+ years of hands-on experience in designing, building and operating production-grade systems, including:
* 2+ years of experience as a Site Reliability Engineer (SRE), building and managing highly available and mission-critical systems
* Deep understanding of SRE practices, such as Service Level Objectives, Error Budgets, Toil Management, Observability & Monitoring, Blameless Postmortems, Incident Response Process, Capacity Planning
* Expertise in AWS services, including designing highly available, multi-AZ and multi-region architectures, for example:
- Compute: EC2, Auto Scaling, Lambda
- Containers: EKS (Mandatory), ECS (good to have)
- Networking: VPC, subnets, route tables, NAT gateways, Transit Gateway
- Security: IAM roles/policies, KMS, Secrets Manager
- Storage and Databases: S3, EBS, EFS, RDS, DocumentDB.
- Proven automation and programming experience in one or more of the following
(EKS, AKS, ACK)
* Familiarity with service mesh technologies to enable secure and resilient service communication, including mTLS, traffic shaping, and policy enforcement.
* Familiarity with Infrastructure as Code (IaC) tools like Terraform and CloudFormation.
* Familiarity with configuration management and automation tools such as Ansible.
* Familiarity with vulnerability management, OS hardening, patching, and security compliance of infrastructure, applications and databases
* Understanding of basic networking fundamentals
Preferred:
* Experience driving cloud cost optimization initiatives (rightsizing, reserved instances, autoscaling strategies, cost observability)
* Networking expertise including Load Balancing, Firewalls, Security Groups, NACLs, TCP/IP, DNS, SSL/TLS, etc.
CORE WORK ACTIVITIES:
* Ensure the reliability, availability, and performance of mission-critical cloud services, implementing best practices for monitoring, alerting, and incident management.
* Oversee the management of high-severity incidents, driving quick resolution and post-incident analysis to identify root causes and prevent recurrence.
* Drive the automation of operational processes and ensure systems can scale effectively to support growing user demand, optimizing cloud and on-prem infrastructure and resource usage.
* Develop and execute the SRE strategy aligned with business goals, and communicate service health, reliability, and performance metrics to senior leadership and stakeholders
* Drive Applications Performance Management and Monitoring:
- Assess application architectures to identify key monitoring points
- Identify Key Performance Indicators, apply monitoring, and report out on
- Gather information to develop reporting metrics and KPIs
- Ensure that all applications adhere to appropriate monitoring standards based
- Champions leaders’ vision for product and service delivery.
- Executes the necessary decisions to keep moving forward toward achievement of
- Understands and meets the needs of key stakeholders.
- Communicates concepts in a clear and persuasive manner that is easy to
- Demonstrates an understanding of business priorities.
- Supports achievement of performance goals, budget goals, team goals, etc.
- Provides technical expertise within own and other teams.
- Provides recommendations to improve the effectiveness of processes and
- Keeps up-to-date technically and applies new knowledge to job.
- Performs other reasonable duties as required for this position.
Vacancy posted 1 day ago



