Senior AI Performance Engineer (CUDA / GPU / NVIDIA Stack)
$140.4k
Brillfy Technology Inc
Job Title: Senior AI Performance Engineer (CUDA / GPU / NVIDIA Stack)
Duration: 12+ months
Location: 100% Remote
This is a hands-on engineering role requiring deep expertise in CUDA, GPU architecture, and performance profiling.
Key Responsibilities
- Profile and optimize AI/ML workloads across multi-GPU and multi-node systems
- Identify bottlenecks across compute, memory, networking, and orchestration layers
- Optimize CUDA kernels (memory coalescing, shared memory usage, occupancy tuning)
- Improve inference performance using TensorRT, Triton, DeepStream, NeMo
- Analyze and improve latency, throughput, GPU utilization, and memory efficiency
- Work on distributed AI systems using Apache Ray, NCCL, Kubernetes GPU scheduling
- Build benchmarking frameworks and performance monitoring systems
- Collaborate with AI, DevOps, and Infrastructure teams for system-wide optimization
Required Skills
- Strong hands-on CUDA programming and GPU performance optimization
- Deep understanding of GPU architecture and memory hierarchy
- Experience with Nsight, CUDA profiling tools, performance benchmarking
- Hands-on experience with NVIDIA ecosystem (Triton, TensorRT, NeMo, DeepStream)
- Experience with distributed AI systems (multi-GPU, multi-node, NCCL, Ray)
- Experience working with AI models such as YOLO, GPT, LLaMA, Transformers
- Strong understanding of AI system performance metrics (latency, throughput, utilization)
Preferred
- Experience working at NVIDIA or similar GPU/AI infrastructure companies
- Experience with real-time video / Vision AI systems
- Experience with large-scale production AI deployments
Interview Process (Mandatory)
- Candidates will receive a technical handout 1 day before the interview
- 90-minute deep-dive demo discussion (NOT theoretical)
- Candidate must explain:
  - Bottleneck identification approach
  - GPU optimization strategies
  - System-level performance improvements

