Senior Site Reliability Engineer, Production Engineering
$166k - $220k
Anduril Industries
Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century’s most innovative companies to the defense industry, Anduril is changing how military systems are designed, built, and sold. Anduril’s family of systems is powered by Lattice OS, an AI-powered operating system that turns thousands of data streams into a real-time, 3D command and control center. As the world enters an era of strategic competition, Anduril is committed to bringing cutting-edge autonomy, AI, computer vision, sensor fusion, and networking technology to the military in months, not years.
ABOUT THE TEAM
The Production Engineering team is a newly formed organization within Anduril's Software Platform, dedicated to ensuring the reliability, performance, and scalability of mission-critical systems that directly support our warfighters in the field. We solve complex reliability challenges at massive scale, ensuring that critical components of Lattice, Anduril's autonomous command and control platform, operate flawlessly in the most demanding operational environments. This is a foundational role, and you will be among the first hires building this team from the ground up. You'll have the unique opportunity to shape the technical direction, establish best practices, and define what production engineering excellence means at Anduril. Our team operates at the intersection of software engineering and systems reliability, building the infrastructure, tooling, and processes that keep our systems operational 24/7/365.
ABOUT THE ROLE
We are seeking an experienced Senior Site Reliability Engineer who is passionate about building resilient, highly available systems that scale to meet the demands of the core systems powering Lattice. You will work closely with platform engineering teams, product developers, and field operations to proactively identify reliability risks, implement defensive strategies, and continuously improve the operational excellence of our software platform. If you thrive on solving hard problems at scale and want your work to have a direct impact on national security, this is the role for you.
WHAT YOU’LL DO
* Design and implement comprehensive monitoring, observability, and alerting systems to ensure early detection of reliability issues across the Lattice platform
* Drive incident response and conduct blameless postmortems to identify systemic improvements and prevent recurrence of production issues
* Build and maintain infrastructure automation using tools like Terraform, Kubernetes operators, and custom tooling to manage large-scale distributed systems
* Establish and track Service Level Objectives (SLOs) and Error Budgets to balance feature velocity with system reliability
* Partner with software engineering teams to improve system architecture for reliability, implementing patterns like circuit breakers, graceful degradation, and chaos engineering
* Develop capacity planning models and performance testing frameworks to ensure systems can handle growth and peak operational demands
* Create runbooks, documentation, and training materials to enable teams to operate production systems effectively
* Lead cross-functional efforts to improve deployment safety through progressive rollouts, automated testing, and rollback capabilities
* Implement security best practices and compliance controls for production environments handling sensitive defense data
* Build tooling and automation to reduce toil and improve operational efficiency for the engineering organization
* Participate in on-call rotations and serve as an escalation point for critical production incidents
REQUIRED QUALIFICATIONS
* 7+ years of engineering experience, with at least 3 years focused on SRE, production operations, or infrastructure engineering
* Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
* Deep expertise with Kubernetes in production environments, including operational challenges at scale (100+ nodes)
* Strong programming skills in one or more languages such as Go, Python, Rust, or Java, with the ability to build production-grade tooling
* Proven experience designing and implementing observability stacks (metrics, logging, tracing) using tools like Prometheus, Grafana, ELK/EFK, or equivalent
* Hands-on experience with cloud platforms (AWS, Azure, or GCP) and infrastructure as code practices
* Demonstrated ability to debug complex distributed systems issues across multiple layers of the stack
* Track record of improving system reliability through architectural changes, not just operational band-aids
* Strong incident management and communication skills, with experience leading responses to critical outages
* Must be a U.S. Person due to required access to U.S. export controlled information or facilities
* Eligible to obtain and maintain an active U.S. Secret security clearance
PREFERRED QUALIFICATIONS
* Experience with defense, aerospace, or other mission-critical systems where downtime has severe consequences
* Expertise in performance optimization and capacity planning for high-throughput, low-latency systems
* Knowledge of chaos engineering principles and experience implementing resilience testing frameworks
* Experience with service mesh technologies (Istio, Linkerd) and advanced traffic management patterns
* Background in database operations and optimization (PostgreSQL, Cassandra, or similar at scale)
* Familiarity with CI/CD platforms and deployment automation (ArgoCD, FluxCD, Spinnaker, Jenkins)
* Understanding of networking fundamentals including load balancing, DNS, TLS/SSL, and network security
* Experience with configuration management and secrets management solutions (Vault, Sealed Secrets, SOPS)
* Strong written and verbal communication skills with ability to explain technical concepts to non-technical stakeholders
* Active Secret or higher security clearance
US Salary Range: $166,000 - $220,000 USD
The salary range for this role is an estimate based on a wide range of compensation factors, inclusive of base salary only. The actual salary offer may vary based on (but not limited to) work experience, education and/or training, critical skills, and/or business considerations. Highly competitive equity grants are included in the majority of full-time offers and are considered part of Anduril's total compensation package. Additionally, Anduril offers top-tier benefits for full-time employees.
BENEFITS
At Anduril, we invest in our people. Our comprehensive, competitive benefits package (available at little to no cost to employees) ensures you’re supported in health, recovery, and whatever comes next. For more information, see Explore Our Benefits.
PROTECTING YOURSELF FROM RECRUITMENT SCAMS
Anduril is committed to maintaining the integrity of our talent acquisition process and the security of our candidates. We've observed a rise in sophisticated phishing and fraudulent schemes where individuals impersonate Anduril representatives, luring job seekers with false interviews or job offers. These scammers often attempt to extract payment or sensitive personal information. To ensure your safety and help you navigate your job search with confidence, please keep the following critical points in mind:
* No Financial Requests: Anduril will never solicit payment or demand personal financial details (such as banking information, credit card numbers, or social security numbers) at any stage of our hiring process. Our legitimate recruitment is entirely free for candidates.
* Please always verify communications:
  * Direct from Anduril: If you receive an email from one of our recruiters, it will only come from an @anduril.com address.
  * Via Agency Partner: If contacted by a recruiting agency for an Anduril role, their email will clearly identify their agency. If you suspect any suspicious activity, please verify the agency's authenticity by reaching out to Anduril (email address available via click.appcast.io).
* Exercise Caution with Unsolicited Outreach: If you receive any communication that appears suspicious, contains grammatical errors, or makes unusual requests, do not engage. Always confirm the sender's email domain is @anduril.com before providing any personal information or clicking on links.
* What to Do If You Suspect Fraud: Should you encounter any questionable or fraudulent outreach claiming to be from Anduril, please report it immediately to Anduril (email address available via click.appcast.io).
Your proactive caution is invaluable in protecting your personal information and upholding the security and trustworthiness of our recruitment efforts.
DATA PRIVACY
To view Anduril's candidate data privacy policy, please see the link provided in the original posting. By submitting your application, you consent to Anduril Industries using a third-party service provider to conduct pre-employment risk, integrity, and due diligence screening and to assess potential risks as part of your application process. This third-party service provider provides risk-intelligence services that may include analysis of sanctions and watchlists, adverse media, public-record information, and other lawful open-source or commercial data sources. This third-party service provider does not act as a consumer reporting agency. Use of this provider helps to ensure compliance with applicable laws and protect technology, intellectual property, and organizational security.


