Senior Site Reliability Engineer

OfficeSpace Software

United States

About OfficeSpace:

OfficeSpace Software provides the leading AI operating system for the built world, helping teams plan, connect, and perform in the workplace. As a performance-based, PE-backed company, we hire based on merit and a willingness to do what it takes to succeed long-term. You're a great fit for the role if you're entrepreneurial, passionate, motivated by building at light speed, and an early adopter of agentic AI. Our world-class teams operate in the US, Canada, and Costa Rica in a culture of trust, respect, growth, and impact.

Role Summary:

You own the performance, reliability, and cost efficiency of OfficeSpace's production platform at scale. As a Senior Site Reliability Engineer, you shape how our systems run—fast, resilient, and predictable—while leading the shift from manual operations to AI-assisted reliability engineering. We provide the platform. You make it perform.

What You'll Do:

  • Drive measurable improvements in latency, throughput, and availability across a large-scale production environment.
  • Own system performance—from Linux internals to Kubernetes scheduling—and eliminate bottlenecks before customers feel them.
  • Define and enforce SLIs, SLOs, and error budgets that balance speed, reliability, and growth.
  • Partner with application engineers to profile code paths, improve execution efficiency, and harden services under real load.
  • Lead database performance optimization across queries, indexing, replication, and workload isolation.
  • Design and oversee AI-assisted load testing, stress testing, and capacity planning workflows.
  • Guide the migration from monolithic deployments to multi-tenant Kubernetes platforms.
  • Reduce infrastructure spend through architectural decisions, right-sizing, and intelligent scaling strategies.
  • Build and supervise automation for infrastructure provisioning, configuration management, and observability.
  • Set clear operational standards for reliability, performance, and incident response—and raise the bar for how we run production.

What You Bring:

  • 7+ years operating and evolving large-scale production systems.
  • Deep Linux systems expertise with hands-on performance tuning across CPU, memory, disk, and networking.
  • Strong Python skills for automation, tooling, and AI-assisted systems workflows.
  • Production experience with Ruby/Rails ecosystems, including Puma and Sidekiq.
  • Proven ability to diagnose and resolve complex database performance issues (MySQL/MariaDB or PostgreSQL).
  • Advanced Kubernetes experience—workload sizing, scheduling, and multi-tenant operations.
  • Infrastructure-as-code mastery using Terraform and Terragrunt.
  • Experience with configuration management tools such as Puppet or Ansible.
  • Strong observability instincts across metrics, logs, and traces using tools like Prometheus, Grafana, Datadog, or ELK.
  • AI fluency—comfortable supervising AI agents for analysis, testing, and reporting, and validating their outputs.
  • A builder mindset. You move fast, take ownership, and raise standards.

Preferred Background:

  • Scaling and refactoring monolithic applications under real production load
  • Extracting databases or stateful components from monoliths
  • Apache and Nginx tuning at scale
  • Redis performance optimization and operational management
  • CI/CD systems and GitOps workflows, including ArgoCD
  • Cloud cost optimization and FinOps-aligned operational practices

Why OfficeSpace?

  • High-Performance Culture: At OfficeSpace, we believe in the power of accountability, focus, and drive. Our A-Player team members work together to deliver measurable, meaningful results. We recognize and reward those who push boundaries and achieve excellence.
  • Ownership and Accountability: We trust our employees to take full ownership of their roles, providing the autonomy to innovate and the support to succeed. We seek individuals who are self-motivated and thrive in an environment where they can drive impactful outcomes.
  • Technology-Forward: As a company invested in cutting-edge technology, we integrate AI and other advanced solutions across our platform to enhance productivity, customer experience, and process efficiency. Our team members are excited by the potential of AI and proactively explore ways it can drive our success.
  • Growth Mindset: Continuous learning and improvement are integral to our culture. We encourage our team to embrace challenges, seek knowledge, and develop both personally and professionally.
  • Innovation and Agility: We foster a dynamic, fast-paced environment where fresh ideas and bold solutions are celebrated. We embrace change and thrive on turning challenges into opportunities, with a team that is agile, proactive, and resilient.
  • Collaborative, Results-Driven Environment: We value purposeful collaboration that leads to shared success and stronger results. While our team members are independent, they recognize the value of working together to drive our mission forward.
  • Competitive Benefits and Rewards: OfficeSpace offers comprehensive and competitive benefits packages globally, designed to support our team's health, well-being, and financial security. We invest in our people so they can excel.

OfficeSpace is committed to building and promoting a diverse workforce and celebrates the unique qualities that individuals of various backgrounds and experiences offer. We are committed to basing all employment-related decisions upon valid job-related factors without regard to race, color, sex (including pregnancy, sexual orientation, and gender identity), age, religion, national origin, genetic information, military status, veteran status, physical or mental disability, or any other status protected by law.

Vacancy posted 1 day ago
