Systems Software Engineer, Kubernetes Scale - DGX Cloud

SwiftCruit

The DGX Cloud organization at NVIDIA brings together cutting-edge hardware and software innovation to deliver industry-leading accelerated computing for the world's most adventurous AI workloads. We're a team of innovative engineers dedicated to solving some of the world's biggest challenges, constantly driving advancements, and impacting millions of lives worldwide!

We are looking for an outstanding Systems Software Engineer with deep experience in distributed systems, open-source technologies such as Kubernetes and containers, and a strong background in systems performance and scalability. The ideal candidate brings broad, end-to-end experience across the stack, from GPU operator and device plugins to distributed inference serving and cloud platforms, along with the technical depth to investigate and address exciting, real-world problems at scale. In this pivotal role, you will take on the challenge of scaling AI infrastructure while optimizing total cost of ownership, driving down cost per token to unlock the next generation of AI innovation and AI factories!

What you'll be doing:
  • Drive end-to-end performance and scale characterization for the NVIDIA DGX Cloud software stack, from Kubernetes control and data planes through NVIDIA components such as GPU Operator, Network Operator, DCGM, NIM, and distributed inference serving, following issues from orchestration down to the metal.
  • Collaborate with AI researchers, developers and customers to develop innovative, automated tests that simulate real user workloads using custom-built and leading open-source tools and frameworks.
  • Deep dive into performance and scale issues in complex distributed systems, including interactions between Kubernetes and the NVIDIA software stack, to identify and resolve root causes.
  • Design and develop monitoring, reporting and analysis tools for performance and scale testing across software, GPU and CPU resources.
  • Triage, debug and root cause issues related to operating Kubernetes clusters at ultra-large scale, ensuring reliability and efficiency.
  • Build and maintain a high-velocity framework that enables continuous, always-on performance and scale testing via a modern CI/CD pipeline.
  • Document research, methodologies and results clearly and concisely, and present findings at internal and external venues, including community conferences such as KubeCon and GTC.
  • Engage efficiently with upstream communities (including Kubernetes, CNCF and NVIDIA open-source projects) to validate performance and scalability of AI workloads early and help shape design and development decisions.

What we need to see:
  • 2+ years of experience in computer architecture, networking, storage systems, and accelerators, and a Bachelor's or Master's degree in Engineering (preferably Electrical Engineering, Computer Engineering, or Computer Science) or equivalent experience
  • Expertise in Kubernetes and familiarity with related CNCF projects
  • Background in working with large-scale parallel and distributed accelerator-based systems
  • Expertise optimizing performance and AI workloads on large-scale systems
  • Experience with performance modeling and benchmarking at scale
  • Proficiency in Golang/Python
  • Background with the NVIDIA software ecosystem in both training and inference domains
  • Expertise with at least one public CSP infrastructure (for example GCP, AWS, Azure, or OCI)

Ways to stand out from the crowd:
  • Strong operational experience with any one of the Kubernetes distributions
  • Prior experience scaling Kubernetes clusters to ultra-large node and object counts
  • Demonstrated history of working in the open-source community
  • Excellent communication and interpersonal abilities
  • PhD in relevant areas

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. For Poland: The base salary range is 176,250 PLN - 305,500 PLN for Level 2, and 221,250 PLN - 383,500 PLN for Level 3.

Vacancy posted 2 days ago