Network Reliability & Observability Engineer
Fluidstack
A leading infrastructure company is seeking a Network Engineer, Reliability & Observability to enhance AI network reliability. This role involves developing QA processes, serverless workflows, and collaborating with cross-functional teams. Ideal candidates have over 5 years in networking, strong incident response skills, and experience with data center networks. A passion for hardware, software development expertise, and strong problem-solving abilities are essential. This position is based in San Francisco, California and supports a culture of excellence.
- ...home day is currently Tuesday. Engineering at Lambda is responsible for... ...’ll Do Deploy and operate observability platforms for logging,... ...adoptable and improve product reliability. Lead members of other engineering... ...monitoring or network monitoring Experience with Prometheus...NetworkWork at officeLocal areaWork from home
$147k - $202k
...Overview: We are seeking a highly technical Staff Observability Site Reliability Engineer with a specialty in Splunk to own and evolve our... ...Distributed Systems: Deep understanding of Linux internals, networking (TCP/IP, DNS, Load Balancing), and container...NetworkPermanent employmentWork at officeLocal areaWorldwideFlexible hours- We’re looking for a Systems Reliability Engineer to own the reliability of our system across cloud... ...is responsible for making systems observable, diagnosable, and repeatable as we scale... ...issues across infrastructure, networking, and distributed systems Partner with...NetworkPermanent employment
- ...re hiring an SRE to join our engineering team at Plenful and take ownership of the reliability and performance of the systems... ...you’ll do Reliability, Observability and Performance: Maintain... ...resource usage across compute, networking and storage. Security, Compliance...NetworkWork at officeRemote workFlexible hours2 days per week
$293k - $385k
...Team The Infrastructure Engineering function sits within IT and is responsible for reliably building, deploying, and operating... ...IT, Security, Identity, and Network teams to ensure infrastructure... ...Ensure automation is safe, observable, and resilient under failure conditions...NetworkWork at office$150k - $250k
Founding Security Reliability Engineer Location: San Francisco - In office. Employment: Full-time... ...(primarily AWS), including network security, identity and access management... ...secrets management solutions. Security Observability & Monitoring: Establish comprehensive...NetworkFull timeWork at office- ...About the Role We're seeking a Site Reliability Engineer to ensure Hyperbolic's GPU marketplace... ...capacity across our distributed GPU network, and implementing secure rollout and... ...automated rollback mechanisms Proficient in observability tools and practices including metrics...Network
- ...Senior Site Reliability Engineer... ...teams to ensure systems are resilient (observable, fault-tolerant, recoverable, scalable... ...experience. Advanced knowledge of Linux, Networking, and Containers. Proficiency in at least...NetworkImmediate startRemote workWorldwide
- ...Site Reliability Engineer - AI Infrastructure Location: Global Remote / San Francisco · Full-Time... ...been quietly building the systems, network, and orchestration layer that makes the... ...implement monitoring, alerting, and observability for critical systems. Collaborate...NetworkFull timeRemote work
- ...onboard services and teams to the reliability tenets. Establish and... ...development teams to build resilient, observable, fault‑tolerant, recoverable... ...in Site Reliability Engineering, managing infrastructure and... ...knowledge of Linux, networking, and containers. Proficiency...Network
- ...We're looking for a world-class Site Reliability Engineer to ensure the reliability, performance... ...our reliability posture end-to-end—observability, performance tuning, incident ops, infrastructure... ...and scale ops. Work across compute, networking, storage, and sandboxed execution...Network
$140k - $220k
About the Job You’ll own reliability and operational excellence for... ...that makes the entire engineering team more effective, establish... ...Deep AWS expertise (ECS, RDS, networking, security) Strong... ...monitoring, alerting, and observability systems from first principles...Network- We are seeking a Sr. Site Reliability Engineer to join our team and run critical infrastructure... ...validator nodes for multiple blockchain networks. You’ll also provide guidance and... ...experiences. They enjoy building testing and observability capabilities that will accelerate the...NetworkRemote job
$125k - $195k
...team of exceptional, hands-on engineers to make this happen.... ...seeking an Infrastructure & Site Reliability Engineer to design, build,... ...Design and setup low level networking components, e.g., service... ...compatible storage, VPNs Scale our observability platform: Build systems to...NetworkWork at officeVisa sponsorshipNight shift$138k - $179k
...proactively support using improved automation, observability and tooling. We are responsible for... ...of other teams from infrastructure and engineering, to QA and business teams, so strong... ...for achieving results. A global network of talented colleagues, who inspire, support...NetworkFlexible hours- ...daily users while enabling our engineering teams to ship fast. You'll... ...and tooling that improves reliability and partnering with engineering... ...to design systems that are observable, resilient, and easy to... ...infrastructure including compute, networking, databases, and managed...NetworkWork at officeWork from home
- ...significantly outperforms individual engineers. We combine language models... ...seeking an experienced Site Reliability Engineer to join our... ...comprehensive monitoring, alerting, and observability solutions using Datadog and... ...reporting Design secure network architectures including VPC...Network
$150k - $170k
Claryo, Inc. is seeking an Integration Reliability Engineer in San Francisco, CA, responsible for... ...candidate will build and maintain observability tools and improve incident response processes... ...experience in SRE, strong Linux and networking skills, and familiarity with...Network$127k - $249k
The Team Platform Engineering is the department within SRE that is... ...Kubernetes infrastructure, networking, load balancing (including... ...internal service mesh), and observability and alerting systems. The... ...components that ensure cluster reliability and security (e.g., CoreDNS...NetworkWork at officeLocal areaRemote workWorldwideFlexible hours$250k
...platform spanning infrastructure, networking, and orchestration.... ...Kubernetes environments. Develop observability, alerting, and auto-healing... ...code, CI/CD pipelines, and reliability standards across thousands... ...DevOps, or Infrastructure Engineering roles supporting large-...NetworkImmediate start- CloudDevs: Senior Site Reliability Engineer (SRE) CloudDevs works with fast-moving, venture... ...system reliability, efficiency, and observability. Outline and monitor SLIs, SLOs, and... ...debugging expertise throughout providers, networking, and knowledge layers. Hands-on...Network
- ...and Azure. Building reusable Terraform components (networking, IAM, secrets). Wiring up observability and tightening the loop between infra change and production... ...customers. We're looking for an infrastructure engineer who actually wants to live in Terraform and...NetworkLive inWork from home
$227.2k - $324.5k
...About the Role: Site Reliability Engineering (SRE) at Tubi is not a traditional operations... ...technical strategy and vision for Tubi's observability, and automation platforms. Partner... ...of AWS services (especially networking, IAM, EKS, ALBs/NLBs, Route 53, CloudWatch...NetworkFull timeContract workTemporary workLocal areaFlexible hours$238k - $290k
...Role Overview As a Staff Software Engineer on the Site Reliability team at Harvey, you will ensure the... ...infrastructure resources (compute, storage, networking) across 50+ global regions Lead... ..., etc.) Deep familiarity with observability tools (Datadog, Sentry, etc.) and...NetworkRelocation package- ...a Sr. Staff Infrastructure Engineer at Twelve Labs, you will combine... .... Own key tradeoffs across reliability, cost, and velocity, making... ...Strong fundamentals in OS, networking, storage, and compute.... ...Infrastructure as Code, CI/CD, and observability (e.g., Terraform, GitHub...NetworkH1bWork at officeWorldwideVisa sponsorshipFlexible hours
$151k - $297k
The Team Platform Engineering is the department within SRE that is... ...Kubernetes infrastructure, networking, load balancing (including... ...internal service mesh), and observability and alerting systems. The... ...components that ensure cluster reliability and security (e.g., CoreDNS...NetworkLocal areaImmediate startRemote workFlexible hoursShift work- Senior Site Reliability Engineer - AI Infrastructure Location: Global Remote / San Francisco •... ...have been quietly building the systems, network, and orchestration layer that makes... ...that degrade collective operations. Observability: Build deep visibility into GPU utilization...NetworkFull timeRemote work
$300 per month
...On-site Department Cloud Engineering Crusoe's mission is to accelerate... ...Role As a Principal Site Reliability Engineer, you will play a... ...Architect and improve observability systems (metrics, logs, tracing... ...with Infrastructure, Networking, Hardware, and Platform teams...NetworkFull timeTemporary work$181k - $263k
Senior Staff Site Reliability Engineer, San Francisco... ...across its premier global network of top-quality partners... ...Staff Site Reliability Engineer who will set the technical direction... ...across teams. Expertise in observability engineering: SLOs, SLI...NetworkWork from homeFlexible hoursNight shift$210k - $310k
...tremendous growth of the Stellar blockchain network, an open-source platform that operates... ...SDF is looking for a Director of Site Reliability Engineering to lead a small, high-leverage SRE... ...SDF engineering teams build, deploy, observe, and operate software with confidence....NetworkTemporary workWork at officeLocal areaWorldwideFlexible hours


