Senior Solutions Architect, AI Cluster Performance and Telemetry

$184k - $287.5k

NVIDIA Corporation

  • # Senior Solutions Architect, AI Cluster Performance and Telemetry

    Locations: US, CA, Santa Clara; US, TX, Austin
    Time type: Full time
    Posted on: Posted Yesterday
    Job requisition ID: JR2019329

    We are looking for a Senior Solutions Architect specializing in Data Center Systems & Performance to join our elite solutions architecture team. In this role, you will work at the intersection of groundbreaking hardware and complex software stacks. As a Solutions Architect, you will act as a pivotal technical expert uniting engineering, field teams, and customers with highly intensive requirements. You will be responsible for analyzing and optimizing the performance of world-class AI, deep learning, and HPC ecosystems. Come join us!

    **What you'll be doing:**

    * Work together with our partners and customers to identify, analyze, and resolve complex performance bottlenecks across interconnected GPU, CPU, and networking systems.
    * Complete and maintain robust performance benchmarking suites to stress-test high-performance clusters and establish performance baselines.
    * Apply industry-standard performance tools to monitor hardware performance counters and extract deep system telemetry.
    * Deeply investigate system and software configurations to find and fix subtle discrepancies that impact peak performance.
    * Partner closely with internal engineering units and outside collaborators and customers to collectively develop solutions and boost infrastructure performance.

    **What we need to see:**

    * BS or MS in Engineering, Electrical Engineering, Physics, or Computer Science (or equivalent experience).
    * 8+ years of work-related experience in the high-tech industry, particularly in system build, performance analysis, and technical customer-facing roles.
    * A strong understanding of how CPUs, GPUs, and high-speed networking fabrics interact within massive clusters.
    * Practical experience with performance counters, profiling tools, and telemetry collection systems (e.g., Perf, eBPF, Prometheus, Grafana).
    * Practical experience working with containers, cloud provisioning, and scheduling tools such as Docker, Docker Swarm, Kubernetes, SLURM, Ansible.
    * Proven track record of transforming raw logs and telemetry into structured time series data, dashboards, and heat maps.
    * The ability to translate complex, low-level technical performance anomalies into clear, actionable narratives for cross-functional teams.
    * Strong collaborative skills and a proven history of building successful relationships across diverse engineering and operations teams.

    **Ways to stand out from the crowd:**

    * Deep knowledge of multi-GPU communication libraries like NCCL, and how they optimize inter-GPU topologies.
    * Deep, hands-on experience working directly with NVIDIA hardware architectures, NVLink, NVSwitch, or NVIDIA Nsight tools.
    * Practical experience optimizing distributed AI training workloads, LLMs, or large-scale high-performance computing environments.
    * Experience developing or integrating Agentic AI frameworks to autonomously parse telemetry logs, diagnose configuration drifts, or automate cluster triage.

    Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.

    Applications for this job will be accepted at least until June 8, 2026.

    This posting is for an existing vacancy.

    NVIDIA uses AI tools in its recruiting processes.

    NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 3 days ago
Similar jobs that could be interesting for you
Based on the Senior Solutions Architect, AI Cluster Performance and Telemetry in Santa Clara, CA vacancy
  • $184k - $287.5k

     ...innovative accelerated computing platforms for AI and HPC. Because of our work, scientists,...  ...with internal engineering efforts in GPU cluster design and networking and convey...  ...situational limitations to make the most performant and supportable GPU clusters possible Work... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  •  ...accelerated computing platforms for AI and HPC. Because of our work,...  ...We are seeking a highly motivated Senior Solutions Architect to join the Cluster Design and Architecture team with...  ...situational limitations to make the most performant and supportable GPU clusters... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  • NVIDIA Corporation is seeking a Senior Solutions Architect for AI Cluster Performance in Santa Clara. The role involves resolving performance bottlenecks in GPU and CPU systems, collaborating with engineering teams, and maintaining performance benchmarking suites. Candidates... 
    Senior
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  • Senior Solution Network Architect, Enterprise Products Responsibilities Own the creation...  ...solutions for enterprise AI/ML systems Craft detailed...  ...architectures Create and validate cluster designs, optimizing them...  ...scalability, resilience, performance, and security in the... 
    Senior
    Performance
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

    AI/ML Solutions Architect - NVIDIA Lead software customer technical engagement for AI training, inference...  ...or background in HPC (High Performance Computing) environments for AI or ML...  ...applications. Familiarity with multi‑node GPU clusters and performance tuning for large‑... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $184k - $287.5k

     ...the unlimited potential of AI to define the next era of computing...  ...is searching for an AI/ML Solutions Architect focusing on Hyperscale...  ...or background in HPC (High Performance Computing) environments for...  ...Familiarity with multi-node GPU clusters and performance tuning for large... 
    Senior
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $184k - $356.5k

    Senior Solutions Architect, Spectrum-X Low Level...  ...and manufactures high-performance networking equipment that enable...  ...) we make powerful ML/AI platforms possible. We believe...  ...make AI workloads in large clusters even more performant. As a... 
    Senior
    Performance
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    14 hours ago
  •  ...building the world’s leading AI company, and we are looking...  ...for an expert AV and Robotics Solutions Architect who can help customers...  ...technologies to customers. Perform in‑depth analysis and optimization...  ...at scale on cloud computing clusters with GPUs. Development... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  •  ...looking for an experienced Network Solutions Architect Engineer to help bring our next-generation AI networking platforms into...  ...bring-up of server, network, and cluster infrastructure in customer...  ...Analyze and debug configuration and performance issues in RoCE and InfiniBand... 
    Senior
    Performance
    Remote work

    NVIDIA Gruppe

    Santa Clara, CA
    1 day ago
  •  ...Senior Solution Architect – AI / GPU Cloud Mountain View, California, United States About the job...  ...diagrams, capacity plans, and cost/performance analyses Translate complex technical...  ...& Enablement Guide onboarding, cluster setup, tuning, and scaling Partner... 
    Senior
    Performance

    Glint Tech Solutions

    Mountain View, CA
    3 days ago
  • $184k - $356.5k

     ...NVIDIA Gruppe in Santa Clara is seeking a Senior Solutions Architect focused on networking technologies. This role involves assisting with designs...  ...for next-generation networking solutions that enable advanced AI infrastructure. The ideal candidate will have over 8 years... 
    Senior

    NVIDIA Gruppe

    Santa Clara, CA
    3 days ago
  •  ...Senior Solution Architect – AI / GPU Cloud We are seeking a Senior Solution Architect to design GPU...  ...infrastructure meets the highest standards for performance, security, and scalability in AI...  ...deployment models. Architect GPU clusters, storage, networking, and... 
    Senior
    Performance

    GMI Cloud

    Mountain View, CA
    3 days ago
  • $184k - $287.5k

    A leading technology company seeks a Senior Solutions Architect to work on optimizing AI services on ARM CPUs. The role requires 8+ years of experience in...  ...customers through workload migration, implementing performance tuning, and creating technical presentations. The position... 
    Senior
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  •  ...Gruppe in Santa Clara is seeking a technical leader for the GPU AI/HPC Infrastructure team. You will design and implement cutting-edge GPU compute clusters, focusing on deep learning and high-performance computing. The ideal candidate will have at least 5+ years of experience... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • NVIDIA Gruppe is looking for an AI Solutions Architect in Santa Clara, California. This role focuses on enhancing NVIDIA's internal cloud infrastructure...  ...programming skills. Responsibilities include optimizing performance and collaborating with development teams to iterate... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • A leading AI technology company is seeking an experienced AV and Robotics Solutions Architect to help customers enhance Physical AI workloads using state-of-the-art technologies...  ...models, developing proof-of-concepts, and performing optimizations to enhance performance on... 
    Senior
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  •  ...day, our work helps care teams perform with greater precision and...  ...We are seeking a ServiceNow Solutions Architect to drive enterprise-wide architecture...  ...design patterns. Advise senior leaders and stakeholders on...  ...ServiceNow processes using AI, automation, and emerging... 
    Senior
    Performance
    Local area
    Worldwide
    Flexible hours

    Intuitive

    Sunnyvale, CA
    1 day ago
  • NVIDIA Corporation in Santa Clara is looking for a Senior Solution Architect to design and deploy AI applications for telecom operations using cutting-edge...  ...includes advising Telco partners and building high-performance systems for network data. The ideal candidate will have... 
    Senior
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  •  ...ambitious and forward-thinking solution architect to help in the enablement of...  ...on the world by applying AI inference aware technology to...  ...Help customers design high-performance and secure workload aware networks...  ...Strong knowledge of network telemetry, logs, SNMP, NetFlow/IPFI.... 
    Senior
    Performance
    Work experience placement

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

     ...member of the NVIDIA Networking Solution Architecture team, your role...  ...in web2.0, cloud, HPC AI, and enterprise data center domains...  ...project delivery to design, architect and test Ethernet networking...  ...validate and monitor network performance. Requirements Bachelor’s degree... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    4 days ago
  • $224k - $356.5k

     ...high‑energy, networking engineers to join the Solutions Architecture team in building the world’s largest and fastest AI/HPC systems using NVIDIA Networking. This...  ...Linux, PCIe devices as it relates to networking performance Experience in configuring, testing, and troubleshooting... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

    **What You Will Be Doing:** * Partnering with other solution architects, engineering, product and business teams. Understanding...  ...NVIDIA technology and MLOps solutions * Analyzing performance and power efficiency of AI inference workloads on Kubernetes * Some travel to conferences... 
    Senior
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  •  ...that brings new Artificial Intelligence (AI) technology to NVIDIA’s largest...  ...looking for an experienced networking Solutions Architect to support accelerated computing networking...  ...infrastructure as well as helping them understand performance characteristics for solutions. Work... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

     ...CPU platforms? NVIDIA is looking for a Solutions Architect experienced in Arm-based server CPUs to...  ...innovation and collaboration while advancing performance and scalability benchmarks. What you’ll...  ...(NCPs) as we develop and run hosted AI services. Together, we will architect... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $184k - $287.5k

    NVIDIA is seeking outstanding AI Solutions Architects to assist and support customers that are building solutions with our newest AI technology...  ..., including debugging, profiling, code optimization, performance analysis, and test design Familiarity with parallel programming... 
    Senior
    Performance

    NVIDIA Corporation

    Santa Clara, CA
    3 days ago
  • $184k - $287.5k

    We are building the AI systems that will fundamentally change...  ...help shape that work. As a Senior Solution Architect on our Telco AI team, you...  ...corpora including network telemetry, logs, SNMP, NetFlow/IPFIX,...  .... Advise on high‑performance ETL pipeline design for telecom... 
    Senior
    Performance
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    2 days ago
  • $224k - $356.5k

     ...tapping into the unlimited potential of AI to define the next era of computing....  ...on the world. NVIDIA is looking for a Senior Solutions Architect to work in IPP's (Infrastructure, Planning...  ...solutions within the cloud. Identify performance bottlenecks and optimize the speed and... 
    Senior
    Performance
    Work experience placement
    Worldwide

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • As a member of the GPU AI/HPC Infrastructure team, you will provide...  ...of ground-breaking GPU compute clusters that run demanding deep learning, high-performance computing, and computationally intensive...  ...developing scalable automation solutions. Build and maintain AI and ML... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
  • $168k - $258.75k

    Senior Datacenter Technical Program Manager, At-Scale AI Clusters...  ...engineers and architects to build and deploy large...  ...Experience with high-performance computing systems and...  ...process of finding a solution* Strong teamwork and... 
    Senior
    Performance
    For contractors
    Remote work

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $152k - $241.5k

     ...personal gaming, and high-performance computing. Our success...  ...reliable, informative telemetry and data systems that...  ...insights. You will architect, develop, and maintain...  ...pipelines for a compute cluster using open-source...  ...vacancy. NVIDIA uses AI tools in its recruiting... 
    Senior
    Performance

    NVIDIA Gruppe

    Santa Clara, CA
    2 days ago
