Senior AI Infrastructure Engineer, Large-Scale GPU Clusters

NVIDIA

NVIDIA Corporation in Santa Clara is seeking a Senior Software Engineer to lead the optimization of large-scale AI systems. The role involves profiling and tuning workloads using cutting-edge NVIDIA technology. The ideal candidate will have over 8 years of experience in software infrastructure for AI systems, with expert-level programming in Python and C/C++. Responsibilities include leading the debugging of multi-GPU environments and mentoring less experienced engineers.

Vacancy posted 6 days ago
Similar jobs that could be interesting for you
Based on the Senior AI Infrastructure Engineer, Large-Scale GPU Clusters in Santa Clara, CA vacancy
  •  ...NVIDIA Gruppe in Santa Clara is seeking a Senior Software Engineer to lead the optimization of distributed training across large-scale GPU platforms. Candidates should have substantial experience in AI applications and technical leadership. This role involves profiling... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • NVIDIA Gruppe is seeking highly motivated EngOps and Platform Engineers to develop automated tools for managing large GPU clusters. This position requires strong expertise in high-performance computing and deep learning. The ideal applicants have a BS or MS in a relevant... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • $356.5k

     ...NVIDIA Gruppe is seeking an experienced AI infrastructure software engineer to join its DGX Cloud AI Efficiency Team in Santa Clara, California. This role focuses on developing the infrastructure for optimizing AI workloads and ensuring high availability and efficiency... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...'s DGX Cloud AI Efficiency Team...  ...to the infrastructure that powers our...  ...resources and scale to foster innovation...  ...software engineer to join our...  ...that enable large‑scale AI training...  .... As a senior DGX Cloud AI...  ...large‑scale clusters. Experience in...  ...Visualization. The GPU, our... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    8 hours ago
  •  ...passionate, and dedicated Senior AI Infrastructure Engineer to join our DGX Cloud group...  ..., build and maintain large‑scale production systems with high...  ...Engineer at NVIDIA ensures our GPU cloud services deliver...  ...multi‑GPU and multi‑node clusters. Engage in and improve the... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  • Google Inc. in Sunnyvale, CA is looking for a Software Engineer to develop next-generation technologies crucial to Google’s operational...  ...needs. The ideal candidate will have experience with large-scale infrastructure and distributed systems, along with proficiency in... 
    Senior

    Google Inc.

    Sunnyvale, CA
    6 days ago
  •  ...NVIDIA Gruppe is seeking a Senior Network Engineer to develop and manage a robust cloud network infrastructure. You will lead the design and implementation of large-scale L3 networks across data centers and corporate IT. Ideal candidates will have over 8 years of networking... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...technology company is seeking a Software Engineer to develop next-generation...  ...and debugging complex issues across large-scale systems. Candidates should have a strong...  ...Join a dynamic team at the forefront of AI and infrastructure innovation. 
    Senior

    Google Inc.

    Sunnyvale, CA
    3 days ago
  • $272k - $425.5k

    Principal Software Engineer – Large-Scale LLM Memory and...  ...serving generative AI and reasoning...  ...Dynamo orchestrates GPU shards, routes requests...  ...heterogeneous clusters so that many...  ...memory pools. Mentor senior and junior...  ...storage, or ML systems infrastructure in C/C++ and... 
    Local area
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  •  ...and Visualization. The GPU, our invention,...  ...our team of innovative engineers who develop and maintain...  ...and maintaining large GPU clusters interconnected via NVLink...  ...switches, and related infrastructure. Automation expert...  ...Proficiency in designing large scale networking... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • $176k - $333.5k

    NVIDIA Corporation in Santa Clara is seeking experienced EngOps and Platform Engineers to develop and maintain extensive GPU clusters. The role requires extensive hands-on experience with automation tools and a robust understanding of computer networks. The ideal candidate... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    6 days ago
  •  ...NVIDIA Gruppe is looking for an experienced GPU Deployment Engineer to tackle end-to-end AI deployment challenges on the NVIDIA RTX AI platform. The role involves analyzing GPU-accelerated applications, improving user experiences, and collaborating with teams to influence... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...of distributed training frameworks for large models (LLMs, multimodal), resolving scalability bottlenecks at the scale of 10k–100k GPU clusters. Kernel & performance tuning. Work...  ...utilization. Training pipeline engineering. Build an end-to-end MLOps platform spanning... 

    Stealth Startup

    Menlo Park, CA
    4 days ago
  • $168k - $322k

     ...NVIDIA Gruppe is seeking a Senior AI Platform Engineer to improve engineering efficiency and data security through AI-powered products. The...  ...working with Cloud and AI/ML teams to build and scale infrastructure and shape the technological future of the organization.... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    8 hours ago
  • $180k - $240k

     ...We are seeking a Senior AI Infrastructure Engineer to design, build, and scale the high-performance AI...  ...Architect and optimize multi-GPU setups, ensuring efficient...  ...across H100/A100 clusters. Networking & Hardware...  ...Splatting (3DGS) and large-scale training. Intelligent... 
    Senior
    Odd job
    Work at office

    Gatik AI

    Mountain View, CA
    8 hours ago
  • $200k - $322k

     ...Our invention of the GPU in 1999 sparked the growth...  ...ignited modern AI — the next era of computing...  ...today. Design‑for‑X Engineering at NVIDIA works on groundbreaking...  ...as part of the AI Infrastructure requirements at an org...  ...capacity planning for large‑scale mission‑critical Gen... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  • Senior Systems Software Engineer - GPU Performance at Scale We are looking for a dedicated engineer for the Senior Systems...  ...will drive innovation in AI and GPU computing. What You’ll...  ...of performance practices in large‑scale GPU infrastructure, delivering powerful tools, methodologies... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  •  ...technical, creative, and Senior AI Platform Engineer to build, support,...  ...and lead AI-native infrastructure roadmaps and cross‑...  .... Architect and scale LLM/ML infrastructure...  ...across cloud‑native clusters and on‑premises hardware...  ...model serving, and GPU‑accelerated... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    8 hours ago
  • $184k - $356.5k

    NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting...  ...high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • $149.1k - $215.93k

     ...differentiate, innovate, and scale across AI, cloud, networking,...  .... About the Role Senior MLOps & AI Infrastructure Engineer to architect, build...  ...Kubernetes and GPU node pools. Develop...  ...‑tune, and deploy large‑scale models...  ...performance on GPU/TPU clusters. Build and maintain... 
    Senior
    Local area
    Shift work

    Altera Corporation

    San Jose, CA
    4 days ago
  •  ...NVIDIA Gruppe in Santa Clara is looking for a Senior HPC Architect to support the deployment of large-scale GPU compute clusters. You will provide engineering solutions for GPU computing products, ensuring technical relationships with teams and assisting in creative solutions... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  •  ...Clara is seeking a technical leader for the GPU AI/HPC Infrastructure team. You will design and implement cutting-edge GPU compute clusters, focusing on deep learning and high-...  ...have at least 5+ years of experience with large-scale infrastructure, strong programming... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • Google Inc. is seeking a Senior Software Engineer for its Infrastructure team in Mountain View, CA. In this role, you will leverage your expertise in C++, software design, and large-scale systems to develop cutting-edge technologies for Google Ads. Responsibilities include... 
    Senior

    Google Inc.

    Mountain View, CA
    5 days ago
  •  ...Databricks in Mountain View is seeking a Senior Software Engineer to join our Networking Infrastructure team. You will design secure, scalable networking solutions for large-scale compute across clouds. Ideal candidates will have 5+ years in programming languages like... 
    Senior

    Databricks

    Mountain View, CA
    8 hours ago
  •  ...NVIDIA Gruppe is seeking a Principal AI and ML Infra Software Engineer to join our Hardware Infrastructure team in Santa Clara, CA. In this role, you'll work closely...  ...by addressing infrastructure deficiencies for GPU Clusters, fostering innovations in AI/ML research. The... 

    Jobleads-US

    Santa Clara, CA
    4 days ago
  • $272k - $431.25k

     ...NVIDIA Corporation seeks a Principal AI and ML Infra Software Engineer in Santa Clara, California, to...  ...the efficiency of AI/ML research on GPU Clusters. The role involves collaboration with various teams, monitoring infrastructure performance, and implementing improvements... 

    Jobleads-US

    Santa Clara, CA
    4 days ago
  • NVIDIA Corporation is looking for an HPC Cluster Engineer in Santa Clara, California, to design and operate GPU Compute Clusters for EDA and high-performance...  ...will have extensive experience with large-scale compute infrastructure and exceptional skills in automation and... 
    Senior

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  • $184k - $356.5k

     ...NVIDIA Gruppe is seeking an experienced engineer to lead GPU cluster design and support for AI and HPC deployments in Santa Clara, California. The...  ...candidate will have over 8 years of experience with large-scale GPU infrastructure and a strong ability to communicate complex... 
    Senior

    Jobleads-US

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

    Senior Software Engineer, Fabric Networking - GPU...  ...hardware and software to support large scale computing platforms. Work with...  ...an existing vacancy. NVIDIA uses AI tools in its recruiting processes.... 
    Senior
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

    NVIDIA Corporation is seeking a Senior ML Platform Engineer to design and scale high-performance ML infrastructure. You'll utilize IaC techniques with Ansible and Terraform, collaborating closely with ML researchers and ensuring system reliability and performance. This... 
    Senior
    Remote job

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
