Staff Network Reliability Engineer - Scale & Incident Response
$195k - $235k
Crusoe Energy Systems LLC
Crusoe Energy Systems LLC is looking for a Staff Network Operations Engineer to ensure production reliability across its global network infrastructure. This role is critical in maintaining uptime and facilitating AI workloads via incident response and operational excellence. The ideal candidate has 8+ years of experience in network engineering, specializing in operations and incident response. You'll work with advanced monitoring tools and help shape the future of AI infrastructure. Compensation ranges from $195,000 to $235,000, plus bonuses and stock options.
$200k - $240k
A leading AI startup in San Francisco is seeking a Staff Software Engineer to help define the future of incident response by creating an autonomous AI SRE. You will design complex data flows, drive product direction, and maintain high engineering standards across the stack...
$225k - $275k
...Francisco is looking for a Senior Staff Network Operations Engineer to ensure production reliability across its global network. In this role, you will lead incident response and define key operational... ...track records in reliability at scale. The position offers competitive...
- ...to perform under real-world scale, reliability, and security demands, and we're looking for an engineer who wants to own the... ...design and operate the global network and reliability layer behind... ...monitoring, alerting, and incident response: SLOs, runbooks, and on-call...
- ...A leading infrastructure company is seeking a Network Engineer, Reliability & Observability to enhance AI network reliability. This role involves... ...candidates have over 5 years in networking, strong incident response skills, and experience with data center networks. A...
- ...Crusoe in San Francisco is looking for a Senior Staff Network Operations Engineer to oversee the reliability of its global network. This role entails leading incident responses, defining operational standards, and guiding a team of engineers in maintaining a high-performing...
$243k - $284k
...P2P is hiring a Senior Incident Response Engineer in San Francisco to lead incident triage and response across AWS and GCP. In this role, you will protect the firm from threats like capital call wire fraud and organized criminal operations. Candidates should have over...
$250k - $350k
...persistent, and well-resourced anywhere. We are building Detection & Response Engineering from the ground up: engineering-led, agent-first, and built to scale across IT, OT, and physical surfaces. As the Staff Incident Responder, you are the most senior incident commander in the...
Contract work, Local area
$250k - $350k
...spanning hardware and software. Speed and scale are our key differentiators. Come be a... ...technology in human history, and being responsible for the physical and logical security of... ...small. Role Scope Run material incidents as incident commander, coordinating...
Contract work, Local area
$182k - $250k
...Senior Platform Reliability Engineer Grow Therapy is on a mission to serve as the trusted partner... ...Engineer to help define and scale reliability as a first-class capability... ...around observability, SLOs/SLAs, and incident response, while also helping translate those standards...
Full time, Work at office, Local area, Remote work, Home office, Flexible hours, Day shift, 3 days per week
$200k - $250k
...infrastructure to ensure the platform is reliable, fast, and resilient as we scale. Role Mission Own service reliability end-to-end: prevent incidents, reduce blast radius when failures... ...command quality: Lead Sev1/Sev2 response end-to-end (containment, communications...
Permanent employment
$200k - $250k
...This hands-on technical leadership role demands expertise in service reliability to ensure the platform's performance as it scales. Responsibilities include setting reliability standards, managing incident responses, and driving architectural resilience using Kubernetes...
- Founding Platform & Reliability Engineer About OpenArt OpenArt is an AI Storytelling and Visual... ...real systems, not slices. Ship at real scale, your work goes to millions of users,... ...in an on-call rotation and lead incident response improvements (alert quality, runbooks...
Remote work, Worldwide, Visa sponsorship
- Overview Senior Platform & Reliability Engineer OpenArt is an AI Storytelling and Visual Creation... ...systems, not slices. Ship at real scale, your work goes to millions of users,... ...Participate in an on-call rotation and improve incident response (alert quality, runbooks, escalation...
Remote work, Worldwide, Visa sponsorship
- A leading language learning platform is seeking an experienced SRE Engineer to ensure the reliability and resilience of their infrastructure. Responsibilities include leading incident response, improving observability, and collaborating with various teams to enhance platform...
- ...dynamic tech firm located in San Francisco is seeking a Site Reliability Engineer to enhance operational health across their production... ...You will manage production systems' reliability and lead incident response efforts to prevent issues, all while contributing to the...
$202.8k - $327.63k
...Director, SRE Platform Engineering is a senior engineering leader responsible for bringing production... ...Management (ITSM) and Site Reliability Engineering (SRE)... ...global workforce Evolve incident response into a highly... ...Developer Platforms (IDP) at scale Background in building...
Permanent employment, Contract work, Work at office, Local area, Remote work, 2 days per week
- ...A leading AI research company based in San Francisco is seeking experienced reliability engineers to scale their infrastructure and ensure system performance and reliability. This role involves collaborating with diverse teams to develop resilient systems and enhance...
- ...cloud environments. As we scale, reliability, observability, and security... .... We’re hiring our first engineer fully dedicated to the... ...stability monitoring and incident response security and least-privilege... ...Go, Rust, or C++ Strong networking + security intuition, including...
$150k - $170k
...looking for an Integration Reliability Engineer to own the... ...warehouses. This role is responsible for making systems observable... ...and repeatable as we scale across deployments,... ...Define and improve incident response, severity... ...across infrastructure, networking, and distributed...
Permanent employment
- ...SRE to join our engineering team at Plenful and... ...ownership of the reliability and performance... ...influence how we build, scale and operate our... ...solving during incidents and a practical... ...Lead incident response, coordinate root... ...across compute, networking and storage. Security...
Work at office, Remote work, Flexible hours, 2 days per week
$225k - $275k
...Team The Infrastructure Engineering function sits within IT and is responsible for reliably building, deploying,... ...operational leverage as OpenAI scales. About the Role We are... ..., Identity, and Network teams to ensure... ...monitoring, alerting, and incident response mechanisms to...
Full time, Work at office, Local area, Relocation package, Flexible hours
$150k - $250k
...As our Founding Security Reliability Engineer at Charta Health, you'll pioneer... ...opportunity to build and scale the foundational security... ...mitigation, and efficient incident response. You'll be crucial in engineering... ...(primarily AWS), including network security, identity and...
$150k - $250k
...hardware and software. Speed and scale are our key differentiators.... ...and validate data center network infrastructure (front-end,... ...ICT, Hardware, and Network Engineering to identify blockers early,... ...during and after deployments: incident response, troubleshooting, and break-...
Local area
- ...of AI infrastructure: large-scale AI datacenters and the... ...Gimlet Labs is seeking a Network Engineer to design, build, and scale... ...operations teams to improve network reliability, deployment velocity,... ...deployment validation, and incident response workflows. You may be a...
- ...A technology solutions provider is looking for a Network Engineer to enhance and maintain a large-scale network. This role involves managing both wired and wireless infrastructures, conducting assessments, and ensuring network security. Candidates should have a degree...
$130k - $160k
...YOU WILL DO: As part of the Network, Identity, and Security Team... ...others, and managing incidents. A typical work week might include... ...infrastructure that scales and operates efficiently.... ...automate your work. KEY RESPONSIBILITIES: The Network, Identity, and...
Work at office, Remote work
- ...mission is to create reliable, interpretable, and steerable... ...researchers, engineers, policy experts, and... ...infrastructure: the network, compute, and storage... ...clouds and regions. The scale is real, the spend is... ...error budgets, and incident response for network-impacting...
- ...Senior Database Reliability Engineer Scribe is where exceptional people come to do the best work of their... ...index builds, NOT VALID constraints), and incident response for the data tier Make the Django ORM a strength at scale: catch N+1 patterns in review, extend...
Full time, Work at office, Remote work, Home office, Flexible hours, 3 days per week
- ...What you'll do As a Senior / Staff Network Engineer, you will define the... ...Alibaba Cloud) at massive scale. Acting as a principal technical... ...or San Francisco. Responsibilities: Build the foundations of... ...network issues, running an incident through to completion and...
Flexible hours, Weekend work
- B Capital is looking for a Production Support Engineer in San Francisco. You'll play a key role in ensuring the reliability of the Agentforce Supply Chain platform and work with an agile team on scaling the product and automating infrastructure. The ideal candidate has...