Senior Site Reliability Engineer, Production Engineering

$166k - $220k

Dormont Manufacturing Co

Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century’s most innovative companies to the defense industry, Anduril is changing how military systems are designed, built, and sold. Anduril’s family of systems is powered by Lattice OS, an AI‑powered operating system that turns thousands of data streams into a real‑time, 3D command and control center. As the world enters an era of strategic competition, Anduril is committed to bringing cutting‑edge autonomy, AI, computer vision, sensor fusion, and networking technology to the military in months, not years.

ABOUT THE TEAM

The Production Engineering team is a newly formed organization within Anduril’s Software Platform, dedicated to ensuring the reliability, performance, and scalability of mission‑critical systems that directly support our warfighters in the field. We solve complex reliability challenges at massive scale, ensuring that critical components of Lattice—Anduril’s autonomous command and control platform—operate flawlessly in the most demanding operational environments. This is a foundational role and you will be among the first hires building this team from the ground up. You’ll have the unique opportunity to shape the technical direction, establish best practices, and define what production engineering excellence means at Anduril. Our team operates at the intersection of software engineering and systems reliability, building the infrastructure, tooling, and processes that keep our systems operational 24/7/365.

ABOUT THE ROLE

We are seeking an experienced Senior Site Reliability Engineer who is passionate about building resilient, highly available systems that scale to meet the demands of the core systems powering Lattice. You will work closely with platform engineering teams, product developers, and field operations to proactively identify reliability risks, implement defensive strategies, and continuously improve the operational excellence of our software platform. If you thrive on solving hard problems at scale and want your work to have direct impact on national security, this is the role for you.

WHAT YOU’LL DO

  • Design and implement comprehensive monitoring, observability, and alerting systems to ensure early detection of reliability issues across the Lattice platform
  • Drive incident response and conduct blameless postmortems to identify systemic improvements and prevent recurrence of production issues
  • Build and maintain infrastructure automation using tools like Terraform, Kubernetes operators, and custom tooling to manage large‑scale distributed systems
  • Establish and track Service Level Objectives (SLOs) and Error Budgets to balance feature velocity with system reliability
  • Partner with software engineering teams to improve system architecture for reliability, implementing patterns like circuit breakers, graceful degradation, and chaos engineering
  • Develop capacity planning models and performance testing frameworks to ensure systems can handle growth and peak operational demands
  • Create runbooks, documentation, and training materials to enable teams to operate production systems effectively
  • Lead cross‑functional efforts to improve deployment safety through progressive rollouts, automated testing, and rollback capabilities
  • Implement security best practices and compliance controls for production environments handling sensitive defense data
  • Build tooling and automation to reduce toil and improve operational efficiency for the engineering organization
  • Participate in on‑call rotations and serve as an escalation point for critical production incidents

REQUIRED QUALIFICATIONS

  • 7+ years of engineering experience with at least 3+ years focused on SRE, production operations, or infrastructure engineering
  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
  • Deep expertise with Kubernetes in production environments, including operational challenges at scale (100+ nodes)
  • Strong programming skills in one or more languages such as Go, Python, Rust, or Java with ability to build production‑grade tooling
  • Proven experience designing and implementing observability stacks (metrics, logging, tracing) using tools like Prometheus, Grafana, ELK/EFK, or equivalent
  • Hands‑on experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code practices
  • Demonstrated ability to debug complex distributed systems issues across multiple layers of the stack
  • Track record of improving system reliability through architectural changes, not just operational band‑aids
  • Strong incident management and communication skills, with experience leading responses to critical outages
  • Must be a U.S. Person due to required access to U.S. export controlled information or facilities
  • Eligible to obtain and maintain an active U.S. Secret security clearance

PREFERRED QUALIFICATIONS

  • Experience with defense, aerospace, or other mission‑critical systems where downtime has severe consequences
  • Expertise in performance optimization and capacity planning for high‑throughput, low‑latency systems
  • Knowledge of chaos engineering principles and experience implementing resilience testing frameworks
  • Experience with service‑mesh technologies (Istio, Linkerd) and advanced traffic management patterns
  • Background in database operations and optimization (PostgreSQL, Cassandra, or similar at scale)
  • Familiarity with CI/CD platforms and deployment automation (ArgoCD, FluxCD, Spinnaker, Jenkins)
  • Understanding of networking fundamentals including load balancing, DNS, TLS/SSL, and network security
  • Experience with configuration management and secrets management solutions (Vault, Sealed Secrets, SOPS)
  • Strong written and verbal communication skills with ability to explain technical concepts to non‑technical stakeholders
  • Active Secret or higher security clearance

US Salary Range

$166,000 - $220,000 USD

The salary range for this role is an estimate based on a wide range of compensation factors, inclusive of base salary only. Actual salary offer may vary based on (but not limited to) work experience, education and/or training, critical skills, and/or business considerations. Highly competitive equity grants are included in the majority of full‑time offers and are considered part of Anduril’s total compensation package. Additionally, Anduril offers top‑tier benefits for full‑time employees, including:

Healthcare Benefits
  • US Roles: Comprehensive medical, dental, and vision plans at little to no cost to you.
  • UK & AUS Roles: We cover full cost of medical insurance premiums for you and your dependents.
  • IE Roles: We offer an annual contribution toward your private health insurance for you and your dependents.

Additional Benefits
  • Income Protection: Anduril covers life and disability insurance for all employees.
  • Generous time off: Highly competitive PTO plans with a holiday hiatus in December. Caregiver & Wellness Leave is available to care for family members, bond with a new baby, or address your own medical needs.
  • Family Planning & Parenting Support: Coverage for fertility treatments (e.g., IVF, preservation), adoption, and gestational carriers, along with resources to support you and your partner from planning to parenting.
  • Mental Health Resources: Access free mental health resources 24/7, including therapy and life coaching. Additional work‑life services, such as legal and financial support, are also available.
  • Professional Development: Annual reimbursement for professional development.
  • Commuter Benefits: Company‑funded commuter benefits based on your region.
  • Relocation Assistance: Available depending on role eligibility.

Retirement Savings Plan
  • US Roles: Traditional 401(k), Roth, and after‑tax (mega backdoor Roth) options.
  • UK & IE Roles: Pension plan with employer match.
  • AUS Roles: Superannuation plan.

The recruiter assigned to this role can share more information about the specific compensation and benefit details associated with this role during the hiring process. To view Anduril’s candidate data privacy policy, please visit

Vacancy posted 5 days ago