Network Reliability Engineer - Production Infrastructure

Fluidstack

Fluidstack is seeking a Network Engineer in San Francisco, California to oversee the health and operation of our extensive network. This role involves building active debugging tools, developing monitoring frameworks, and implementing automation for seamless network repair. The ideal candidate has a strong understanding of network systems and AI tooling. We offer a competitive compensation package, including salary and equity, along with extensive health benefits and generous PTO.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Network Reliability Engineer - Production Infrastructure vacancy in San Francisco, CA
  •  ...deep. We call this role a Cloud Service Reliability Engineer. The Cloud Service Reliability...  ...: Establishes technology product specifications and collaborates with...  ...5 years of experience in automating infrastructure, service delivery, and engineering site... 
    Suggested

    Forhyre

    San Francisco, CA
    22 days ago
  • $157.7k - $277.8k

     ...Employment Type Full time Location Type Hybrid Department Engineering, product & design Compensation SF & NYC Base Compensation $157...  ...: our platform must be available, performant, and reliable, 24/7. As an Infrastructure engineer, you'll be at the heart of making this a... 
    Suggested
    Full time
    Work at office
    Local area
    Flexible hours

    Writer

    San Francisco, CA
    4 days ago
  • $200k - $250k

     ...PostGraphile API services, PostgreSQL, Redis, BullMQ queues, and Kubernetes-based production infrastructure. We’re hiring a senior owner of stability and infrastructure to ensure the platform is reliable, fast, and resilient as we scale. Role Mission Own service reliability end... 
    Suggested
    Permanent employment

    Vizcom

    San Francisco, CA
    4 days ago
  •  ...designs, builds, and operates critical infrastructure that enables research at OpenAI....  ...size of our workloads, while remaining reliable and easy to use. About the Role We'...  ...for an experienced Site Reliability Engineer to own production-critical infrastructure end to end.... 
    Suggested

    OpenAI

    San Francisco, CA
    4 days ago
  • $232k - $319k

     ...building the trusted, neutral infrastructure that enables organizations...  ...with great people and reliable, cost-effective, and efficient...  ...teams focused on Edge networking, K8s platform, CI/CD, Observability...  ...with architects and product engineering Build a world-class... 
    Suggested
    Permanent employment
    Local area
    Worldwide
    Flexible hours

    Okta, Inc.

    San Francisco, CA
    4 days ago
  • $170k - $240k

     ...ownership from early thinking to production. If you're energized by...  ...-time streaming, offline reliability, background sync, and AI-...  .../ Staff-level Mobile Infrastructure Engineer to help own and evolve the...  ...and component systems Networking, retry, and sync primitives... 
    Work at office
    Local area
    Immediate start

    Commure

    San Francisco, CA
    a month ago
  •  ...Senior Electrical Engineer - Infrastructure Power Systems (Owner's Engineer) Lawrence Berkeley...  ...over campus power system operations, reliability, protection and maintenance support,...  ...resolve disagreements while maintaining productive professional relationships.... 
    For contractors

    Berkeley Lab

    San Francisco, CA
    3 days ago
  •  ...innovative R&D company in San Francisco is seeking a Site Reliability Engineer to join its Platform Engineering team. This position focuses...  ...Site Reliability Engineering, strong knowledge of GCP and infrastructure as code using Terraform. It offers a competitive salary... 

    CodeRabbit

    San Francisco, CA
    2 days ago
  • $150k - $195k

     ...Machine Learning Solutions Engineer (ML + Infrastructure Focus) New York, New...  ...take ideas from research to production with less friction....  ...strategies (local NVMe vs networked / object storage)...  ...metrics (latency, cost, reliability) Cross-Functional... 
    Work at office
    Local area
    2 days per week

    Lightning AI

    San Francisco, CA
    1 day ago
  • $150k - $170k

    Terawatt-Infrastructure in San Francisco is seeking an Electrical Engineer to design and deliver high-performance EV charging solutions across North America. Your contributions will be key in shaping electrical strategies for charging stations. You will engage in strategic... 

    Terawatt-Infrastructure

    San Francisco, CA
    4 days ago
  • $350k

    Menlo Ventures is seeking a Research Engineer to enhance the reliability and infrastructure of AI systems focused on professional workflows. The ideal candidate will have substantial Python coding experience and a strong background in operating machine learning systems... 
    Work at office

    Menlo Ventures

    San Francisco, CA
    4 days ago
  • $246.5k - $290k

     ...company in San Francisco is seeking an Infrastructure Lead to oversee the production systems behind World Chain. You...  ...be responsible for ensuring the reliability and performance of core infrastructure while leading a talented engineering team. The ideal candidate has over... 

    Kubelt

    San Francisco, CA
    2 days ago
  • A leading AI infrastructure company based in San Francisco is seeking an experienced Supply Chain Operations Program Manager. You will oversee new hardware introductions and production ramps focusing on operational readiness and strategic alignment. The ideal candidate... 

    SupportFinity™

    San Francisco, CA
    4 days ago
  •  ...builders the visibility to understand how AI behaves in production and the tools to improve it. Teams at Notion, Stripe...  ...release. About the role We’re looking for a Cloud Infrastructure Engineer to help us build reliable, scalable infrastructure and give developers a world... 
    Flexible hours

    Braintrust Data, Inc.

    San Francisco, CA
    3 days ago
  • $195k - $235k

    Crusoe Energy Systems LLC is looking for a Staff Network Operations Engineer to ensure production reliability across its global network infrastructure. This role is critical in maintaining uptime and facilitating AI workloads via incident response and operational excellence... 

    Crusoe Energy Systems LLC

    San Francisco, CA
    19 hours ago
  • Epoch Biodesign is looking for a Senior Staff Network Operations Engineer to ensure production reliability across its global network in San Francisco. This role...  ...operational standards for Crusoe's extensive AI infrastructure, requiring strong technical leadership and... 

    Epoch Biodesign

    San Francisco, CA
    19 hours ago
  • Epoch Biodesign in San Francisco is looking for a Staff Network Operations Engineer to enhance the reliability of their global network infrastructure. This role demands a seasoned network engineer to handle production incidents, maintain high system availability, and... 

    Epoch Biodesign

    San Francisco, CA
    19 hours ago
  • $150k - $250k

     .... Whoever deploys frontier compute infrastructure fastest will decide whether AI expands...  ...the Role Fluidstack is seeking a Network Engineer, Reliability & Observability to serve as a...  ...operational experience. You've run production networks or compute, responded to incidents... 
    Local area

    Fluidstack

    San Francisco, CA
    4 days ago
  • $175k - $300k

    Fluidstack, located in San Francisco, is seeking a Production Engineer to ensure the health of their compute fleet. You will build metrics...  ...,000 to $300,000 based on qualifications. Join us in revolutionizing the compute infrastructure for AI.

    Fluidstack

    San Francisco, CA
    2 days ago
  • $150k - $250k

     ...Network Engineer Fluidstack is seeking a Network Engineer to join our Deployment & Integration...  ...and validating AI datacenter network infrastructure at scale. You'll be in the field...  ...to resolve blockers, and ensuring production-ready handovers to operations. This... 
    Local area

    Fluidstack

    San Francisco, CA
    2 days ago
  •  ...the next generation of AI infrastructure: large-scale AI datacenters...  ...Customers deploy through production-grade APIs without needing...  ...Gimlet Labs is seeking a Network Engineer to design, build, and scale...  ...teams to improve network reliability, deployment velocity, operational... 

    Gimlet Labs

    San Francisco, CA
    2 days ago
  •  ...Position Overview We are seeking a Senior Datacenter Network Infrastructure Engineer to help develop the strategic design, implementation and...  ...on config, hardware and software changes to improve our reliability and efficiency Datacenter Process Improvement and... 
    Temporary work
    Local area
    Flexible hours

    Internet Archive

    San Francisco, CA
    3 days ago
  • $140k - $200k

    Astromecha in San Francisco is seeking an experienced IT Systems & Network Administrator to manage and scale our internal infrastructure. This role requires 5+ years in a startup environment, hands-on experience with Cisco networking, and a strong background in cloud systems... 
    Flexible hours

    Astromecha

    San Francisco, CA
    2 days ago
  • $175k - $275k

     ...Founding Infrastructure Engineer - build the runtime that replaces the world's biggest system integrators...  ..., or self-hosted deployments in production? Do you want to be the first...  ...infrastructure team, leveraging your network and eventually growing into a leadership... 
    H1b
    Work at office
    Visa sponsorship

    Venture Up

    San Francisco, CA
    19 hours ago
  •  ...then waiting on design and engineering to do it. Today we deliver...  ...judgment. Flint takes the manual production off their plate so they get...  ...of this role. Cloud infrastructure end-to-end: architecture,...  ...scale, without trading off reliability. Observability and incident... 
    Work at office
    Shift work
    Night shift

    Flint Technologies

    San Francisco, CA
    2 days ago
  • $180k - $200k

     ...Infrastructure Engineer (GPU & Compute) Lightning AI is the company behind PyTorch Lightning....  ...designed to take ideas from research to production with less friction. Through our...  ...operational overhead Improve the reliability, repeatability, and scalability of image... 
    Remote work
    Work from home
    Flexible hours

    Lightning AI

    San Francisco, CA
    3 days ago
  • $230k - $342k

    OpenAI is seeking a Software Engineer for its Core Network Engineering team in San Francisco. This role...  ...on improving performance and reliability. Candidates should have experience...  ...342K plus equity. Join us to build infrastructure vital for advancing AI research.

    OpenAI

    San Francisco, CA
    2 days ago
  • $250k - $320k

    Gimlet Labs, Inc. is seeking a Network Engineer to design and build network infrastructure for AI workloads at scale. This role involves ensuring robust and reliable networking for production systems across distributed environments, focusing on performance and efficiency... 

    Gimlet Labs, Inc.

    San Francisco, CA
    3 days ago
  •  ...Judgment Labs builds infrastructure for Agent Behavior Monitoring...  ...loss in scaled production environments....  ..., and pinpoint where reliability breaks down in their...  ...Senior Infrastructure Engineer to architect and scale...  ...architectures including network isolation, encryption... 

    Judgment Labs

    San Francisco, CA
    1 day ago
  •  ...Infrastructure / Cluster Engineer Gimlet is building the next generation of AI infrastructure...  ...Customers deploy through production-grade APIs without...  ...production workloads reliably from day one. This...  ...cluster schedulers, high-speed networking, observability,... 

    Gimlet Labs

    San Francisco, CA
    1 day ago
