Senior AI Performance Engineer (CUDA / GPU / NVIDIA Stack)

$140.4k
Full-time

Brillfy Technology Inc

Job Title: Senior AI Performance Engineer (CUDA / GPU / NVIDIA Stack)

Duration: 12+ months

Location: 100% Remote

This is a hands-on engineering role requiring deep expertise in CUDA, GPU architecture, and performance profiling.

Key Responsibilities

  • Profile and optimize AI/ML workloads across multi-GPU and multi-node systems
  • Identify bottlenecks across compute, memory, networking, and orchestration layers
  • Optimize CUDA kernels (memory coalescing, shared memory usage, occupancy tuning)
  • Improve inference performance using TensorRT, Triton, DeepStream, and NeMo
  • Analyze and improve latency, throughput, GPU utilization, and memory efficiency
  • Work on distributed AI systems using Ray, NCCL, and Kubernetes GPU scheduling
  • Build benchmarking frameworks and performance monitoring systems
  • Collaborate with AI, DevOps, and Infrastructure teams for system-wide optimization

Required Skills

  • Strong hands-on CUDA programming and GPU performance optimization
  • Deep understanding of GPU architecture and memory hierarchy
  • Experience with Nsight, CUDA profiling tools, and performance benchmarking
  • Hands-on experience with the NVIDIA ecosystem (Triton, TensorRT, NeMo, DeepStream)
  • Experience with distributed AI systems (multi-GPU, multi-node, NCCL, Ray)
  • Experience working with AI models such as YOLO, GPT, LLaMA, Transformers
  • Strong understanding of AI system performance metrics (latency, throughput, utilization)

Preferred

  • Experience working at NVIDIA or similar GPU/AI infrastructure companies
  • Experience with real-time video / Vision AI systems
  • Experience with large-scale production AI deployments

Interview Process (Mandatory)

  • Candidates will receive a technical handout 1 day before the interview
  • 90-minute deep-dive demo discussion (NOT theoretical)
  • Candidate must explain:
    • Bottleneck identification approach
    • GPU optimization strategies
    • System-level performance improvements

Vacancy posted 1 day ago