Senior Site Reliability Engineer, Production Engineering

$166k - $220k
Full-time

Anduril Industries

Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century’s most innovative companies to the defense industry, Anduril is changing how military systems are designed, built, and sold. Anduril’s family of systems is powered by Lattice OS, an AI-powered operating system that turns thousands of data streams into a real-time, 3D command and control center. As the world enters an era of strategic competition, Anduril is committed to bringing cutting-edge autonomy, AI, computer vision, sensor fusion, and networking technology to the military in months, not years.

ABOUT THE TEAM

The Production Engineering team is a newly formed organization within Anduril's Software Platform, dedicated to ensuring the reliability, performance, and scalability of mission-critical systems that directly support our warfighters in the field. We solve complex reliability challenges at massive scale, ensuring that critical components of Lattice, Anduril's autonomous command and control platform, operate flawlessly in the most demanding operational environments. This is a foundational role, and you will be among the first hires building this team from the ground up. You'll have the unique opportunity to shape the technical direction, establish best practices, and define what production engineering excellence means at Anduril. Our team operates at the intersection of software engineering and systems reliability, building the infrastructure, tooling, and processes that keep our systems operational 24/7/365.

ABOUT THE ROLE

We are seeking an experienced Senior Site Reliability Engineer who is passionate about building resilient, highly available systems that scale to meet the demands of the core systems powering Lattice. You will work closely with platform engineering teams, product developers, and field operations to proactively identify reliability risks, implement defensive strategies, and continuously improve the operational excellence of our software platform. If you thrive on solving hard problems at scale and want your work to have direct impact on national security, this is the role for you.

WHAT YOU’LL DO

* Design and implement comprehensive monitoring, observability, and alerting systems to ensure early detection of reliability issues across the Lattice platform
* Drive incident response and conduct blameless postmortems to identify systemic improvements and prevent recurrence of production issues
* Build and maintain infrastructure automation using tools like Terraform, Kubernetes operators, and custom tooling to manage large-scale distributed systems
* Establish and track Service Level Objectives (SLOs) and Error Budgets to balance feature velocity with system reliability
* Partner with software engineering teams to improve system architecture for reliability, implementing patterns like circuit breakers, graceful degradation, and chaos engineering
* Develop capacity planning models and performance testing frameworks to ensure systems can handle growth and peak operational demands
* Create runbooks, documentation, and training materials to enable teams to operate production systems effectively
* Lead cross-functional efforts to improve deployment safety through progressive rollouts, automated testing, and rollback capabilities
* Implement security best practices and compliance controls for production environments handling sensitive defense data
* Build tooling and automation to reduce toil and improve operational efficiency for the engineering organization
* Participate in on-call rotations and serve as an escalation point for critical production incidents

REQUIRED QUALIFICATIONS

* 7+ years of engineering experience, with at least 3 years focused on SRE, production operations, or infrastructure engineering
* Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
* Deep expertise with Kubernetes in production environments, including operational challenges at scale (100+ nodes)
* Strong programming skills in one or more languages such as Go, Python, Rust, or Java, with the ability to build production-grade tooling
* Proven experience designing and implementing observability stacks (metrics, logging, tracing) using tools like Prometheus, Grafana, ELK/EFK, or equivalent
* Hands-on experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code practices
* Demonstrated ability to debug complex distributed systems issues across multiple layers of the stack
* Track record of improving system reliability through architectural changes, not just operational band-aids
* Strong incident management and communication skills, with experience leading responses to critical outages
* Must be a U.S. Person due to required access to U.S. export controlled information or facilities
* Eligible to obtain and maintain an active U.S. Secret security clearance

PREFERRED QUALIFICATIONS

* Experience with defense, aerospace, or other mission-critical systems where downtime has severe consequences
* Expertise in performance optimization and capacity planning for high-throughput, low-latency systems
* Knowledge of chaos engineering principles and experience implementing resilience testing frameworks
* Experience with service mesh technologies (Istio, Linkerd) and advanced traffic management patterns
* Background in database operations and optimization (PostgreSQL, Cassandra, or similar at scale)
* Familiarity with CI/CD platforms and deployment automation (ArgoCD, FluxCD, Spinnaker, Jenkins)
* Understanding of networking fundamentals including load balancing, DNS, TLS/SSL, and network security
* Experience with configuration management and secrets management solutions (Vault, Sealed Secrets, SOPS)
* Strong written and verbal communication skills, with the ability to explain technical concepts to non-technical stakeholders
* Active Secret or higher security clearance

US Salary Range

$166,000—$220,000 USD

The salary range for this role is an estimate based on a wide range of compensation factors, inclusive of base salary only. The actual salary offer may vary based on (but is not limited to) work experience, education and/or training, critical skills, and/or business considerations. Highly competitive equity grants are included in the majority of full-time offers and are considered part of Anduril's total compensation package. Additionally, Anduril offers top-tier benefits for full-time employees.

BENEFITS

At Anduril, we invest in our people. Our comprehensive, competitive benefits package (available at little to no cost to employees) ensures you’re supported in health, recovery, and whatever comes next. For more information, see Explore Our Benefits.

PROTECTING YOURSELF FROM RECRUITMENT SCAMS

Anduril is committed to maintaining the integrity of our Talent acquisition process and the security of our candidates. We've observed a rise in sophisticated phishing and fraudulent schemes where individuals impersonate Anduril representatives, luring job seekers with false interviews or job offers. These scammers often attempt to extract payment or sensitive personal information. To ensure your safety and help you navigate your job search with confidence, please keep the following critical points in mind:

* No Financial Requests: Anduril will never solicit payment or demand personal financial details (such as banking information, credit card numbers, or social security numbers) at any stage of our hiring process. Our legitimate recruitment is entirely free for candidates.
* Please always verify communications:
  * Direct from Anduril: If you receive an email from one of our recruiters, it will only come from an @anduril.com address.
  * Via Agency Partner: If contacted by a recruiting agency for an Anduril role, their email will clearly identify their agency. If you notice any suspicious activity, please verify the agency's authenticity by reaching out to [View email address on click.appcast.io].
* Exercise Caution with Unsolicited Outreach: If you receive any communication that appears suspicious, contains grammatical errors, or makes unusual requests, do not engage. Always confirm the sender's email domain is @anduril.com before providing any personal information or clicking on links.
* What to Do If You Suspect Fraud: Should you encounter any questionable or fraudulent outreach claiming to be from Anduril, please report it immediately to [View email address on click.appcast.io].

Your proactive caution is invaluable in protecting your personal information and upholding the security and trustworthiness of our recruitment efforts.

DATA PRIVACY

To view Anduril's candidate data privacy policy, please visit [link].

By submitting your application, you consent to Anduril Industries using a third-party service provider to conduct pre-employment risk, integrity, and due diligence screening and to assess potential risks as part of your application process. This third-party service provider provides risk-intelligence services that may include analysis of sanctions and watchlists, adverse media, public-record information, and other lawful open-source or commercial data sources. This third-party service provider does not act as a consumer reporting agency. Use of this provider helps to ensure compliance with applicable laws and protect technology, intellectual property, and organizational security.

Vacancy posted 1 day ago