Senior Solutions Architect, AI Cluster Performance and Telemetry
$184k - $287.5k
NVIDIA Corporation
* Locations: US, CA, Santa Clara; US, TX, Austin
* Time type: Full time
* Posted on: Posted Yesterday
* Job requisition ID: JR2019329

We are looking for a Senior Solutions Architect specializing in Data Center Systems & Performance to join our elite solutions architecture team. In this role, you will work at the intersection of groundbreaking hardware and complex software stacks. As a Solutions Architect, you will act as a pivotal technical expert uniting engineering, field teams, and customers with highly intensive requirements. You will be responsible for analyzing and optimizing the performance of world-class AI, deep learning, and HPC ecosystems. Come join us!

**What you'll be doing:**

* Work together with our partners and customers to identify, analyze, and resolve complex performance bottlenecks across interconnected GPU, CPU, and networking systems.
* Complete and maintain robust performance benchmarking suites to stress-test high-performance clusters and establish performance baselines.
* Apply industry-standard performance tools to monitor hardware performance counters and extract deep system telemetry.
* Deeply investigate system and software configurations to find and fix subtle discrepancies that impact peak performance.
* Partner closely with internal engineering units and outside collaborators and customers to collectively develop solutions and boost infrastructure performance.

**What we need to see:**

* BS or MS in Engineering, Electrical Engineering, Physics, or Computer Science (or equivalent experience).
* 8+ years of work-related experience in the high-tech industry, particularly in system build, performance analysis, and technical customer-facing roles.
* A strong understanding of how CPUs, GPUs, and high-speed networking fabrics interact within massive clusters.
* Practical experience with performance counters, profiling tools, and telemetry collection systems (e.g., Perf, eBPF, Prometheus, Grafana).
* Practical experience working with containers, cloud provisioning, and scheduling tools such as Docker, Docker Swarm, Kubernetes, SLURM, Ansible.
* Proven track record of transforming raw logs and telemetry into structured time series data, dashboards, and heat maps.
* The ability to translate complex, low-level technical performance anomalies into clear, actionable narratives for cross-functional teams.
* Strong collaborative skills and a proven history of building successful relationships across diverse engineering and operations teams.

**Ways to stand out from the crowd:**

* Deep knowledge of multi-GPU communication libraries like NCCL, and how they optimize inter-GPU topologies.
* Deep, hands-on experience working directly with NVIDIA hardware architectures, NVLink, NVSwitch, or NVIDIA Nsight tools.
* Practical experience optimizing distributed AI training workloads, LLMs, or large-scale high-performance computing environments.
* Experience developing or integrating Agentic AI frameworks to autonomously parse telemetry logs, diagnose configuration drifts, or automate cluster triage.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 8, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
Vacancy posted 3 days ago

