Senior Network Reliability Engineer - DGX Cloud
$136k - $224.25k
NVIDIA
NVIDIA is looking for a Senior Network Reliability Engineer to support and maintain our cloud and datacenter network infrastructures. This network serves the needs across the whole software stack for NVIDIA, from Graphics Drivers to Autonomous Vehicles and Artificial Intelligence.
In this role, the Senior Network Reliability Engineer will remediate critical alerts within defined SLAs, triage production-impacting network incidents, and interact with internal customers on network-related issues. They will also be responsible for engaging with external vendors to remediate hardware and software issues, and will participate in project-related work such as network device upgrades and capacity augmentations. An ideal candidate will possess a wide range of skills, including alert monitoring and resolution in large-scale networks and CSP environments, outstanding troubleshooting skills, an understanding of L3 underlay networks, and network protocol knowledge in large multi-vendor infrastructures.
What you will be doing:
Engage in 24/7 global shift rotations to provide remote support for network repairs and changes while collaborating across teams and updating customers on status and ticket information.
Drive operational improvements in change management and daily operations by following procedures.
Manage and operate large scale IP network technologies and infrastructures.
Utilize your skills in Peering and Datacenter interconnect technologies: PNI, Transit, Exchange, Passive DWDM, Wave circuits.
Monitor and support the network health of on-premises and cloud infrastructures.
Collaborate and develop workflow enhancements while documenting best practices.
What we need to see:
Deep knowledge and experience of TCP/IP, BGP, OSPF, MPLS, IS-IS, VxLAN, EVPN, QoS, GRE, IPsec, DNS, and MACsec.
5+ years of experience in network operations.
Skilled in network troubleshooting techniques and demonstrating creative problem-solving abilities.
Strong track record of alert response within defined SLAs and of incident management.
Experience with one or more of the following CSP environments: AWS, Azure, GCP, OCI.
Familiarity with Arista, Fortinet and Juniper.
Hands-on experience with contributing to tooling and automation for provisioning, monitoring, and managing complex network infrastructures.
Bachelor’s degree in Computer Science, related technical field, or equivalent experience.
Excellent verbal and written communication skills.
Ways To Stand Out From The Crowd:
Solid understanding of Mellanox/Cumulus OS and InfiniBand technology.
Skilled in Unix/Linux system administration, with the ability to write and understand Python/Shell scripts to improve efficiency in hyperscale environments.
Familiarity with leveraging tools such as Netbox/Nautobot, Prometheus, Grafana, and Panoptes to monitor and manage a global network.
Passionate about innovating and investing in groundbreaking technologies.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hard-working people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you. NVIDIA’s deep learning platforms have made a major impact on various fields and are broadly used across leading academic institutions, start-ups, and industry, including the world’s largest Internet companies. We need passionate, hard-working, and creative people to help us take on more of these outstanding opportunities in deep learning cloud solutions.
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 136,000 USD - 224,250 USD for Level 3, and 168,000 USD - 264,500 USD for Level 4.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until May 29, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.