Sr Site Reliability Engineer

Commence

Job Description

Description:

At Commence, we’re the start of a new age of data-centric transformation, elevating health outcomes and powering better, more efficient processes for program and patient health. We combine quality data-driven solutions that fuel answers, technology that advances performance, and clinical expertise that builds trust to create a more efficient path to quality care.

With human-centered, healthcare-relevant, and value-based solutions, we create new possibilities with data. We provide proof beyond the concept and performance beyond the scope with a focus on efficiencies that transform the lives of those we serve. With a culture driven by purpose, straightforward communication and clinical domain expertise, Commence cuts straight to better care.

Requirements:

As a Senior Site Reliability Engineer at Commence, you will own the reliability, scalability, and operational health of our mission-critical healthcare data platform. You will bridge the gap between engineering and operations—embedding reliability as a first-class concern from architecture through deployment. This role is built for someone who thrives when systems are under pressure and who treats an outage as a problem to be engineered away permanently, not just survived.

  • Design, implement, and own observability infrastructure including metrics, logging, tracing, and alerting across distributed systems.
  • Define and enforce SLOs, SLIs, and error budgets in partnership with product and engineering teams.
  • Lead incident response: triage, coordinate remediation, conduct blameless post-mortems, and drive systemic fixes.
  • Build and maintain CI/CD pipelines that support rapid, safe delivery of changes to production.
  • Collaborate with engineering teams on infrastructure changes; able to read, modify, and contribute to existing infrastructure-as-code (Terraform or CloudFormation).
  • Design and operate highly available, fault-tolerant systems—including auto-scaling, failover, and disaster recovery strategies.
  • Reduce operational toil through automation; eliminate manual processes before they become habits.
  • Collaborate with software engineers to establish reliability-first design patterns and review architectures for operational risk.
  • Manage Kubernetes or container orchestration environments at scale.
  • Ensure systems meet compliance and security requirements, particularly those applicable to healthcare data (HIPAA, SOC 2).
  • Provide technical mentorship and guidance to engineers across the organization on reliability practices.
  • Participate in on-call rotation with a commitment to continuously reducing the need for it.

Qualifications

  • 7+ years of experience in SRE, platform engineering, or DevOps roles.
  • Exceptional problem-solving under pressure—demonstrated track record of diagnosing complex, high-stakes system failures and building durable solutions.
  • Deep hands-on experience with AWS services including EC2, EKS/ECS, Lambda, RDS, S3, CloudWatch, and related tooling.
  • Familiarity with infrastructure-as-code (Terraform or CloudFormation)—able to contribute to existing configurations.
  • Experience designing and operating distributed systems with strict availability and latency requirements.
  • Proficiency in at least one scripting or systems language (Python, Go, Bash, or similar) for automation and tooling.
  • Experience with container orchestration (Kubernetes, ECS) in production environments.
  • Expertise in observability tooling (OpenSearch, Prometheus/Grafana, or equivalent).
  • Hands-on experience with CI/CD platforms (GitHub Actions, Jenkins, CircleCI, or similar).
  • Proven ability to define and operationalize SLOs and error budgets.
  • Experience with relational and NoSQL databases—performance tuning, replication, and backup strategies.
  • Strong working knowledge of networking fundamentals: DNS, load balancing, VPCs, TLS.
  • Excellent communication skills—able to translate technical risk into business impact for non-engineering stakeholders.

Additional Requirements

  • AWS Certifications (Solutions Architect, DevOps Engineer, or SysOps Administrator).
  • Experience in healthcare technology or other regulated industries (HIPAA, SOC 2, FedRAMP).
  • Familiarity with chaos engineering practices and tooling.
  • Experience with data pipeline reliability (ETL/ELT workflows, streaming systems).
  • Exposure to AI/ML infrastructure and the reliability challenges unique to model serving.
  • Familiarity with additional cloud platforms (Azure, Google Cloud).
  • Contributions to open-source reliability or infrastructure tooling.

Work Environment/Physical Demands

The work environment and physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

This is a remote position. While performing the duties of this job, the employee regularly works in a climate-controlled environment. Candidates must be able to sit, read, work on a computer, and watch a computer screen for extended periods of time. The employee is occasionally required to stand, walk, use hands and fingers, kneel, or crouch.

Commence is an equal employment opportunity employer. All personnel processes are merit-based and applied without discrimination on the basis of race, color, religion, sex, sexual orientation, gender identity, marital status, age, disability, national or ethnic origin, military and veteran status or any other characteristic protected by applicable law.

Commence.AI is committed to providing equal employment opportunities to all applicants, including individuals with disabilities. If you require a reasonable accommodation to participate in the application process due to a disability, please contact Human Resources (the phone number and email address are available on ziprecruiter.com). Please note that unless you are requesting an accommodation, all applications must be submitted through our online application system.

Vacancy posted 23 days ago