Senior Site Reliability Engineer, Production Engineering

$166k - $220k

Dormont Manufacturing Co

Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century’s most innovative companies to the defense industry, Anduril is changing how military systems are designed, built, and sold. Anduril’s family of systems is powered by Lattice OS, an AI-powered operating system that turns thousands of data streams into a real-time, 3D command and control center. As the world enters an era of strategic competition, Anduril is committed to bringing cutting-edge autonomy, AI, computer vision, sensor fusion, and networking technology to the military in months, not years.

ABOUT THE TEAM

The Production Engineering team is a newly formed organization within Anduril’s Software Platform, dedicated to ensuring the reliability, performance, and scalability of mission‑critical systems that directly support our warfighters in the field. We solve complex reliability challenges at massive scale, ensuring that critical components of Lattice—Anduril’s autonomous command and control platform—operate flawlessly in the most demanding operational environments. This is a foundational role and you will be among the first hires building this team from the ground up. You’ll have the unique opportunity to shape the technical direction, establish best practices, and define what production engineering excellence means at Anduril. Our team operates at the intersection of software engineering and systems reliability, building the infrastructure, tooling, and processes that keep our systems operational 24/7/365.

ABOUT THE ROLE

We are seeking an experienced Senior Site Reliability Engineer who is passionate about building resilient, highly available systems that scale to meet the demands of the core systems powering Lattice. You will work closely with platform engineering teams, product developers, and field operations to proactively identify reliability risks, implement defensive strategies, and continuously improve the operational excellence of our software platform. If you thrive on solving hard problems at scale and want your work to have direct impact on national security, this is the role for you.

WHAT YOU’LL DO

- Design and implement comprehensive monitoring, observability, and alerting systems to ensure early detection of reliability issues across the Lattice platform
- Drive incident response and conduct blameless postmortems to identify systemic improvements and prevent recurrence of production issues
- Build and maintain infrastructure automation using tools like Terraform, Kubernetes operators, and custom tooling to manage large-scale distributed systems
- Establish and track Service Level Objectives (SLOs) and Error Budgets to balance feature velocity with system reliability
- Partner with software engineering teams to improve system architecture for reliability, implementing patterns like circuit breakers, graceful degradation, and chaos engineering
- Develop capacity planning models and performance testing frameworks to ensure systems can handle growth and peak operational demands
- Create runbooks, documentation, and training materials to enable teams to operate production systems effectively
- Lead cross-functional efforts to improve deployment safety through progressive rollouts, automated testing, and rollback capabilities
- Implement security best practices and compliance controls for production environments handling sensitive defense data
- Build tooling and automation to reduce toil and improve operational efficiency for the engineering organization
- Participate in on-call rotations and serve as an escalation point for critical production incidents

REQUIRED QUALIFICATIONS

- 7+ years of engineering experience, with at least 3 years focused on SRE, production operations, or infrastructure engineering
- Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
- Deep expertise with Kubernetes in production environments, including operational challenges at scale (100+ nodes)
- Strong programming skills in one or more languages such as Go, Python, Rust, or Java, with the ability to build production-grade tooling
- Proven experience designing and implementing observability stacks (metrics, logging, tracing) using tools like Prometheus, Grafana, ELK/EFK, or equivalent
- Hands-on experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code practices
- Demonstrated ability to debug complex distributed systems issues across multiple layers of the stack
- Track record of improving system reliability through architectural changes, not just operational band-aids
- Strong incident management and communication skills, with experience leading responses to critical outages
- Must be a U.S. Person due to required access to U.S. export controlled information or facilities
- Eligible to obtain and maintain an active U.S. Secret security clearance

PREFERRED QUALIFICATIONS

- Experience with defense, aerospace, or other mission-critical systems where downtime has severe consequences
- Expertise in performance optimization and capacity planning for high-throughput, low-latency systems
- Knowledge of chaos engineering principles and experience implementing resilience testing frameworks
- Experience with service-mesh technologies (Istio, Linkerd) and advanced traffic management patterns
- Background in database operations and optimization (PostgreSQL, Cassandra, or similar at scale)
- Familiarity with CI/CD platforms and deployment automation (ArgoCD, FluxCD, Spinnaker, Jenkins)
- Understanding of networking fundamentals including load balancing, DNS, TLS/SSL, and network security
- Experience with configuration management and secrets management solutions (Vault, Sealed Secrets, SOPS)
- Strong written and verbal communication skills with ability to explain technical concepts to non-technical stakeholders
- Active Secret or higher security clearance

US SALARY RANGE

$166,000 - $220,000 USD

The salary range for this role is an estimate based on a wide range of compensation factors, inclusive of base salary only. The actual salary offer may vary based on (but not limited to) work experience, education and/or training, critical skills, and/or business considerations. Highly competitive equity grants are included in the majority of full-time offers and are considered part of Anduril’s total compensation package. Additionally, Anduril offers top-tier benefits for full-time employees, including:

Healthcare Benefits
- US Roles: Comprehensive medical, dental, and vision plans at little to no cost to you.
- UK & AUS Roles: We cover the full cost of medical insurance premiums for you and your dependents.
- IE Roles: We offer an annual contribution toward your private health insurance for you and your dependents.

Additional Benefits
- Income Protection: Anduril covers life and disability insurance for all employees.
- Generous Time Off: Highly competitive PTO plans with a holiday hiatus in December. Caregiver & Wellness Leave is available to care for family members, bond with a new baby, or address your own medical needs.
- Family Planning & Parenting Support: Coverage for fertility treatments (e.g., IVF, preservation), adoption, and gestational carriers, along with resources to support you and your partner from planning to parenting.
- Mental Health Resources: Access free mental health resources 24/7, including therapy and life coaching. Additional work-life services, such as legal and financial support, are also available.
- Professional Development: Annual reimbursement for professional development.
- Commuter Benefits: Company-funded commuter benefits based on your region.
- Relocation Assistance: Available depending on role eligibility.

Retirement Savings Plan
- US Roles: Traditional 401(k), Roth, and after-tax (mega backdoor Roth) options.
- UK & IE Roles: Pension plan with employer match.
- AUS Roles: Superannuation plan.

The recruiter assigned to this role can share more information about the specific compensation and benefit details associated with this role during the hiring process. To view Anduril’s candidate data privacy policy, please visit


Vacancy posted 5 days ago
