Distributed AI Infra Engineer — Multi-GPU Benchmarking
NVIDIA Corporation
NVIDIA Corporation is seeking a Software Engineer in Santa Clara to optimize and benchmark distributed training workloads for AI. The role involves debugging multi-GPU environments and designing automation workflows for large-scale operations. Applicants should possess a Bachelor's or Master's in Computer Science, strong programming skills in Python and C/C++, and 3+ years of relevant experience. NVIDIA promotes a diverse and inclusive work environment, offering competitive salaries and benefits.
- ...company in Santa Clara is seeking a Senior High Performance AI Engineer to build groundbreaking multi-agent systems for the CUDA ecosystem. The ideal... ...development, proficiency in C/C++ and Python, and experience with GPU programming. This role offers competitive salaries and...
$184k - $356.5k
...Corporation is seeking a Senior Software Engineer in Santa Clara to enhance the... ...and reliability of large-scale AI infrastructures. The role involves... ...leadership in debugging and optimizing distributed training workloads across NVIDIA’s GPU platforms. Ideal candidates should...
- ...NVIDIA Gruppe is seeking a Principal AI and ML Infra Software Engineer to join our Hardware Infrastructure team in Santa Clara, CA. In this role, you... ...efficiency by addressing infrastructure deficiencies for GPU Clusters, fostering innovations in AI/ML research. The ideal...
$272k - $431.25k
...NVIDIA Corporation seeks a Principal AI and ML Infra Software Engineer in Santa Clara, California, to enhance the efficiency of AI/ML research on GPU Clusters. The role involves collaboration with various teams, monitoring infrastructure performance, and implementing...
$184k - $287.5k
...AI Benchmarking and Telemetry Engineer - NVIS... ...of computing. An era in which our GPU acts as the brains of computers,... ...solutions for large-scale distributed systems, with proficiency in tools...
Remote work
- ...A leading AI technology firm in California is seeking an experienced Senior Software Engineer to develop and optimize AI infrastructure software using state-of-the-art GPU systems. Candidates should have a Bachelor's degree in a technical field and a minimum of 5 years...
$272k - $431.25k
...We are seeking a Principal AI and ML Infra Software Engineer, GPU Clusters at NVIDIA to join our Hardware Infrastructure team. As an Engineer, you... ...* Capability in supervising and improving substantial distributed training operations using PyTorch (DDP, FSDP), NeMo, or...
- ...experiences-from AI and data centers,... ...a Senior Staff AI Infra Engineer who is passionate... ...applications and benchmarks, with a special focus... .../ML workloads and GPU-accelerated computing... ...infrastructure, distributed systems, or performance... ...and optimize multi-GPU training performance...
$184k - $287.5k
...NVIDIA USA in Santa Clara is seeking a Senior High Performance AI Engineer to design and optimize cutting-edge AI systems. The role involves... ...and Python programming skills, and hands-on experience with GPU programming. NVIDIA offers a competitive salary range of $184,0...
- ...Machine Learning Engineer | Python | Pytorch | Distributed Training | Optimisation | GPU | Hybrid, San Jose, CA Title: Machine Learning... ...Experience: ~3–5 years in ML/AI engineering roles owning... ...collaborating across Research, Platform/Infra, Data, and Product functions....
$182k - $242k
...Essential Cloud for AI. Built for... ...Kubernetes-native benchmarking services that measure... ...team. Break down engineering tasks into clear milestones... ...building distributed systems, high-performance... ...-critical GPU systems (CUDA, NCCL... ...benchmarking GPU clusters or multi-region...
Permanent employment, Temporary work, Casual work, Work at office, Remote work, Flexible hours
$148k - $235.75k
...unlimited potential of AI to define the next... ...An era in which our GPU acts as the brains... ...High Performance AI Engineer to build groundbreaking multi-agent systems for the... ...libraries, frameworks, distributed training, and... ...optimizations, evidenced by benchmark wins or published...
- ...computing experiences-from AI and data centers,... ...and benchmarks. You will be a member... ...from the lowest-level GPU kernels to large-scale distributed systems, shaping the... ...passion for software engineering, strong technical ownership... ...in distributed, multi-GPU systems....
$152k - $241.5k
We optimize and benchmark GenAI inference on NVIDIA'... ...the intersection of GPU performance engineering and public accountability... ...management, and distributed inference across TensorRT... ...benchmarks, multi-turn coding, agentic... ..., and other emerging AI use cases. Collaborate...
$152k - $241.5k
NVIDIA Gruppe is seeking a talented individual to optimize and benchmark GenAI inference using the latest acceleration technologies.... ...involves driving industry benchmark results and architecting distributed inference systems. Required qualifications include a relevant...
$184k
...NVIDIA Gruppe is seeking an experienced software professional to design and develop GPU-accelerated Python APIs for numerical computing. This role involves architecting implementations of numerical algorithms and optimizing APIs for performance across CPU and GPU architectures...
$184k - $356.5k
...NVIDIA is seeking an experienced software developer to design and develop GPU-accelerated Python APIs for numerical computing. The role requires strong skills in Python, C++, CUDA, and numerical methods, with an emphasis on developing and optimizing implementations for...
$275.8k - $340.5k
...Position Overview The Principal AI/ML Engineer will lead a growing organization, guiding the AV ML Infra team in achieving its mission while shaping long‑term vision... ...Azure) to design, implement, and test scalable distributed computing and data processing solutions in the...
Local area, Remote work, Relocation, Relocation package, Flexible hours
- ...NVIDIA Gruppe is seeking a Senior Software Engineer for GPU Cloud Infrastructure in Santa Clara, California. The role focuses on designing... ...of experience in scalable cloud services, with expertise in distributed systems and Go programming. NVIDIA offers competitive...
$275.8k - $340.5k
...About the team: The AV ML Infra team at GM builds ML infrastructure... ...meet the unique demands of AI and ML innovation, supporting... ...the productivity of ML engineers, and drive the adoption of cutting... ...implement, and test scalable distributed computing and data processing...
Local area, Remote work, Work from home, Relocation, Relocation package, Flexible hours
- ...AI Infra Engineer We are looking for an AI Infra engineer... ...HPC environments for distributed training of large language... ...environments Benchmark system performance, diagnose... ...training processes (Multi-Head Attention, Multi/... ...Experience managing GPU clusters and optimizing...
$130k - $170k
...NTT DATA is hiring a Platform Engineer in Santa Clara, California, to lead the design and operation of scalable infrastructure supporting AI/LLM-based solutions. The ideal candidate will have over 5 years of experience in Platform Engineering. Your role involves managing...
- ...neuron™, a unified AI-native platform for data... ...seeking a motivated AI/ML Engineer to design, build, and... ...about evaluation, benchmarking, and system reliability... ...You Will Work On ~ Multi-agent orchestration systems... ..., DevOps, or distributed systems Benefits...
$184k - $287.5k
...unlimited potential of AI to define the next... ...An era in which our GPU acts as the brains... ...the way up to large multi-node NVLink domain rack... ...a highly motivated engineer to lead performance benchmarking and optimization... ...communications (NCCL), distributed training and inference...
Remote work
- ...About the team The AV ML Infra team builds end‑to‑end ML platforms... ...developer‑facing products to support AI and ML innovation across teams... ...As a Staff AI/ML Full‑Stack Engineer, you will design and build end... ...implement, and test scalable distributed computing solutions. Project...
- ...Distributed Software Engineer Bengaluru, Karnataka, India; Sunnyvale CA or Toronto... ...builds the world's largest AI chip, 56 times larger than GPUs... ...OpenAI recently announced a multi-year partnership with... ..., over 10 times faster than GPU-based hyperscale cloud inference...
$184k - $287.5k
...A leading technology company seeks an AI Benchmarking and Telemetry Engineer in Santa Clara, California. In this role, you will develop benchmarking approaches for HPC and AI tasks, maintain telemetry frameworks, and collaborate with engineering teams to optimize performance...
Remote work
$148k - $235.75k
...NVIDIA is looking for a Senior AI Compute Engineer to join its Infrastructure... ...and ability to prioritize/multi-task easily with limited supervision... ...time. ~ Experience with benchmarking tools such as HPL, NCCL... ...experience. Experience with GPU (Graphics Processing Unit)...
Remote work
- ...Advanced Micro Devices is seeking a principal software developer to join the ROCm GPU-compute team in Santa Clara, California. The ideal candidate will have over 10 years of software development experience in C/C++, Python, and GPU technologies. This role involves developing...
$124k - $195.5k
...NVIDIA Corporation is seeking an AI Inference Performance Engineer - New College Grad 2026 in Santa Clara. This role involves optimizing AI inference benchmarks using NVIDIA’s accelerators and working with various teams on performance enhancements. Applicants should have...


