Staff Network Engineer, Operations

$195k - $235k

Crusoe

Crusoe is on a mission to accelerate the abundance of energy and intelligence. As the only vertically integrated AI infrastructure company built from the ground up, we own and operate each layer of the stack - from electrons to tokens - to power the world's most ambitious AI workloads. When you join Crusoe, you join a team that is building the future, faster.

We're in the midst of the greatest industrial revolution of our time. The demand for AI compute is boundless, and power is a bottleneck. We're solving that - with an energy-first approach that makes AI infrastructure better for the world and faster for the people innovating with AI.

We're looking for problem-solving, opportunity-finding teammates with a sense of urgency, who believe in the scale of our ambition and thrive on a path not fully paved - people who want to grow their careers alongside a team of experts across energy, manufacturing, data center construction, and cloud services.

If you want to do the most meaningful work of your career, help our customers and partners advance their AI strategies, and be part of a high-performing team that believes in each other, come build with us at Crusoe.

About This Role:

Crusoe Cloud is seeking a Staff Network Operations Engineer to help own production reliability across our global network infrastructure, including edge, backbone, data center fabric, and GPU cluster interconnects. This is a hands-on production ownership role focused on incident response, root cause analysis, and operational excellence initiatives that keep our hyperscale AI infrastructure running. Your work will directly affect the availability of AI workloads running across thousands of GPUs worldwide.

The ideal candidate is a seasoned network engineer with deep operational experience in large-scale environments who thrives in high-pressure situations and takes pride in keeping systems healthy. You'll contribute to defining SLIs and SLOs, improving observability tooling, building automation to reduce toil, and mentoring peers - all while serving as a key escalation point during high-severity network events.

What You'll Be Working On:
  • Production Reliability: Help own uptime across Crusoe's global edge, backbone, data center, and GPU cluster network, directly supporting AI workloads at scale.
  • Incident Response: Lead and contribute to end-to-end response for high-severity network events, including mitigation, stakeholder communication, and postmortem documentation.
  • Root Cause Analysis: Drive RCAs for production incidents, identify systemic issues, and author remediation plans tracked through to closure.
  • Observability Improvements: Contribute to and improve Crusoe's network monitoring stack using streaming telemetry, SNMP, NetFlow, and tools such as Kentik, Grafana, Prometheus, and ThousandEyes.
  • Operational Standards: Author and maintain runbooks, escalation playbooks, and SOPs used across the operations team.
  • Operational Automation: Write Python-based tooling to reduce toil, automate common remediation workflows, and accelerate mean time to resolution.
  • SLI/SLO Contribution: Partner with Architecture and SRE teams to define and track network reliability metrics and service level objectives backed by real-time dashboards.
  • Mentorship: Provide technical guidance to Senior engineers and contribute to a culture of operational excellence and continuous learning.
What You'll Bring to the Team:
  • 8+ years of production network engineering experience with a focus on operations, incident response, and reliability in large-scale or internet-scale environments.
  • Hands-on experience with observability and monitoring tools including streaming telemetry, SNMP, NetFlow/sFlow, Grafana, Prometheus, and ThousandEyes.
  • Experience operating RDMA/RoCE lossless fabrics for GPU or HPC workloads, including familiarity with PFC, ECN, and DCQCN tuning.
  • Expert hands-on knowledge of BGP, EVPN-VXLAN, IS-IS, OSPF, MPLS, QoS, and TCP/IP in production data center environments.
  • Proficiency with Arista (EOS) and Juniper (Junos) platforms in leaf-spine Clos architectures across multi-vendor environments.
  • Python proficiency for writing auto-remediation scripts, diagnostic tooling, and operational automation.
  • Comfort operating large device fleets across multi-region environments with on-call responsibility, including experience as an escalation point during critical events.
  • Bachelor's degree in Computer Science, Electrical Engineering, or a related field, or equivalent practical experience.
Bonus Points:
  • Experience with NVIDIA/Mellanox networking platforms in GPU cluster environments.
  • Familiarity with Kentik or Arbor for traffic analysis and DDoS visibility.
  • Experience defining or contributing to SLIs and SLOs in partnership with SRE or product teams.
  • Exposure to operating 10K+ device fleets across hyperscale or cloud environments.
  • Background contributing to post-incident learning programs or operational excellence initiatives org-wide.
Benefits:
  • Competitive compensation and equity packages
  • Restricted Stock Units
  • Paid time off, paid holidays & leave of absence programs
  • Comprehensive health, dental & vision insurance
  • Employer contributions to HSA account
  • Paid parental leave
  • Paid life insurance, short-term and long-term disability
  • Professional development & tuition reimbursement
  • Mental health & wellness support
  • Commuter benefits (parking & transit)
  • Cell phone stipend
  • 401(k) Retirement plan with company match up to 4% of salary
  • Volunteer time off
  • Global travel insurance & emergency assistance
  • Daily meals allowance
  • Additional perks & programs specific to location

Compensation Range

Compensation will be paid in the range of $195,000 - $235,000 + Bonus. Restricted Stock Units are included in all offers. Compensation will be determined by the applicant's knowledge, education, and abilities, as well as internal equity and alignment with market data.

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
Vacancy posted 2 days ago