Sr Site Reliability Engineer
Commence
Job Description
Description:
At Commence, we’re the start of a new age of data-centric transformation, elevating health outcomes and powering better, more efficient processes for program and patient health. We combine quality data-driven solutions that fuel answers, technology that advances performance, and clinical expertise that builds trust to create a more efficient path to quality care.
With human-centered, healthcare-relevant, and value-based solutions, we create new possibilities with data. We provide proof beyond the concept and performance beyond the scope with a focus on efficiencies that transform the lives of those we serve. With a culture driven by purpose, straightforward communication and clinical domain expertise, Commence cuts straight to better care.
Requirements:
As a Senior Site Reliability Engineer at Commence, you will own the reliability, scalability, and operational health of our mission-critical healthcare data platform. You will bridge the gap between engineering and operations, embedding reliability as a first-class concern from architecture through deployment. This role is built for someone who thrives when systems are under pressure and who treats an outage as a problem to be engineered away permanently, not just survived.
- Design, implement, and own observability infrastructure including metrics, logging, tracing, and alerting across distributed systems.
- Define and enforce SLOs, SLIs, and error budgets in partnership with product and engineering teams.
- Lead incident response: triage, coordinate remediation, conduct blameless post-mortems, and drive systemic fixes.
- Build and maintain CI/CD pipelines that support rapid, safe delivery of changes to production.
- Collaborate with engineering teams on infrastructure changes, reading, modifying, and contributing to existing infrastructure-as-code (Terraform or CloudFormation).
- Design and operate highly available, fault-tolerant systems—including auto-scaling, failover, and disaster recovery strategies.
- Reduce operational toil through automation; eliminate manual processes before they become habits.
- Collaborate with software engineers to establish reliability-first design patterns and review architectures for operational risk.
- Manage Kubernetes or container orchestration environments at scale.
- Ensure systems meet compliance and security requirements, particularly those applicable to healthcare data (HIPAA, SOC 2).
- Provide technical mentorship and guidance to engineers across the organization on reliability practices.
- Participate in on-call rotation with a commitment to continuously reducing the need for it.
Qualifications
- 7+ years of experience in SRE, platform engineering, or DevOps roles.
- Exceptional problem-solving under pressure—demonstrated track record of diagnosing complex, high-stakes system failures and building durable solutions.
- Deep hands-on experience with AWS services including EC2, EKS/ECS, Lambda, RDS, S3, CloudWatch, and related tooling.
- Familiarity with infrastructure-as-code (Terraform or CloudFormation)—able to contribute to existing configurations.
- Experience designing and operating distributed systems with strict availability and latency requirements.
- Proficiency in at least one scripting or systems language (Python, Go, Bash, or similar) for automation and tooling.
- Experience with container orchestration (Kubernetes, ECS) in production environments.
- Expertise in observability tooling (OpenSearch, Prometheus/Grafana, or equivalent).
- Hands-on experience with CI/CD platforms (GitHub Actions, Jenkins, CircleCI, or similar).
- Proven ability to define and operationalize SLOs and error budgets.
- Experience with relational and NoSQL databases—performance tuning, replication, and backup strategies.
- Strong working knowledge of networking fundamentals: DNS, load balancing, VPCs, TLS.
- Excellent communication skills—able to translate technical risk into business impact for non-engineering stakeholders.
Additional Requirements
- AWS Certifications (Solutions Architect, DevOps Engineer, or SysOps Administrator).
- Experience in healthcare technology or other regulated industries (HIPAA, SOC 2, FedRAMP).
- Familiarity with chaos engineering practices and tooling.
- Experience with data pipeline reliability (ETL/ELT workflows, streaming systems).
- Exposure to AI/ML infrastructure and the reliability challenges unique to model serving.
- Familiarity with additional cloud platforms (Azure, Google Cloud).
- Contributions to open-source reliability or infrastructure tooling.
Work Environment/Physical Demands
The work environment and physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.
This is a remote position. While performing the duties of this job, the employee regularly works in a climate-controlled environment. Candidates must be able to sit, read, work on a computer, and watch a computer screen for extended periods of time. The employee is occasionally required to stand, walk, use hands and fingers, kneel, or crouch.
Commence is an equal employment opportunity employer. All personnel processes are merit-based and applied without discrimination on the basis of race, color, religion, sex, sexual orientation, gender identity, marital status, age, disability, national or ethnic origin, military and veteran status or any other characteristic protected by applicable law.
Commence.AI is committed to providing equal employment opportunities to all applicants, including individuals with disabilities. If you require a reasonable accommodation to participate in the application process due to a disability, please contact Human Resources (the phone number and email address are available on ziprecruiter.com). Please note that unless you are requesting an accommodation, all applications must be submitted through our online application system.

