GPU Systems Engineer (CUDA)

Bright Vision Technologies

Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications.

As we continue to grow, we're looking for a skilled GPU Systems Engineer (CUDA) to join our dynamic team and contribute to our mission of transforming business processes through technology.

This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.
Job Title: GPU Systems Engineer (CUDA)
Location: 100% Remote (Continental United States)
Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)
Experience: 6+ years
Salary: $100k - $150k
Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)
Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap
Compensation: Competitive base salary commensurate with experience, plus benefits.

Employment Terms & Visa Policy

This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies.
This role is part of Bright Vision Technologies' in-house Statement of Work (SOW) engagement. The client, end customer, and employer for this position is Bright Vision Technologies - there is no third-party client, vendor, or implementation partner involved.
We do not engage in C2C, 1099, or third-party arrangements for this role.

Strictly no C2C, 1099, or third-party companies. All our roles are W2, and we do not accept third-party brokering.
Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables.
No new H1B sponsorship is available for this role.

However, candidates who are currently on a valid H1B visa and require a transfer are welcome to apply. We will support H1B transfers for qualified candidates.
For every role, a technical coding assessment is mandatory. Please apply only if you are confident in your technical abilities and hands-on experience.

Job Summary
We are seeking a GPU Systems Engineer with deep expertise in CUDA programming, GPU architecture, and high-performance computing to design and optimize compute-intensive workloads on modern accelerator hardware. This role focuses on extracting maximum performance from GPU platforms for AI training, inference, scientific computing, and high-throughput data processing workloads. The ideal candidate combines low-level systems mastery with strong software engineering practices, and has a track record of delivering measurable performance improvements on production GPU systems.
In this role you will work closely with cross-functional partners - product, design, engineering, operations, and business stakeholders - to translate ambiguous requirements into well-engineered solutions, and will be expected to raise the bar through code review, design review, and mentorship of more junior engineers. The successful candidate brings strong engineering discipline, a clear communication style, and a track record of shipping meaningful work that holds up well in production.

Key Responsibilities
  • Design and implement high-performance CUDA kernels for compute-intensive workloads across AI and HPC use cases.
  • Profile and optimize GPU code using tools such as Nsight Systems, Nsight Compute, and CUDA profilers.
  • Tune memory access patterns, occupancy, register usage, and shared memory utilization for peak performance.
  • Develop highly optimized libraries for linear algebra, attention, and other ML primitives.
  • Optimize multi-GPU and multi-node training using NCCL, RDMA, and high-performance networking.
  • Implement custom operators and fused kernels in PyTorch, JAX, or Triton.
  • Collaborate with ML engineers to identify performance bottlenecks in training and inference pipelines.
  • Develop benchmarks and regression tests to safeguard performance over time.
  • Evaluate new GPU architectures and feature sets, and advise on adoption strategy.
  • Contribute to compiler-level optimizations for tensor programs where appropriate, working at the boundary between ML frameworks and underlying accelerator codegen to unlock performance not reachable through framework-level tuning alone.
  • Optimize memory hierarchy usage across HBM, L2, shared memory, and registers.
  • Implement mixed-precision and quantized compute paths that maximize accelerator throughput while preserving numerical fidelity within bounds acceptable for the target workloads.
  • Document performance characteristics, design decisions, and tuning playbooks for internal teams.
  • Stay current with GPU architecture, CUDA evolution, and emerging accelerator technologies.

Required Qualifications
  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field.
  • Six or more years of experience in GPU programming and performance engineering.
  • Deep expertise in CUDA C/C++ and GPU programming models.
  • Strong understanding of modern GPU architectures, memory hierarchies, and execution models.
  • Hands-on experience profiling and optimizing GPU workloads in production.
  • Familiarity with NCCL, MPI, and high-performance interconnect technologies.
  • Experience integrating custom kernels into ML frameworks.
  • Strong C++ skills and familiarity with modern systems programming practices.
  • Solid grounding in linear algebra and numerical methods.
  • Strong communication and collaboration skills with research and engineering teams.

Preferred Qualifications
  • Experience with Triton, CUTLASS, or other GPU kernel authoring frameworks.
  • Familiarity with TensorRT, FasterTransformer, or vLLM internals.
  • Exposure to compiler infrastructure such as LLVM or MLIR.
  • Open-source contributions to GPU or ML performance libraries.
  • Experience with large-scale distributed training infrastructure.

How to Apply
Would you like to know more about this opportunity?
For immediate consideration, please send your resume to [email protected].
We recognize that our people are our strength, and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company.
We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs.
Bright Vision Technologies is an Equal Opportunity Employer, including Disability/Veterans.
Position offered by "No Fee Agency."
Equal Employment Opportunity (EEO) Statement

Bright Vision Technologies (BV Teck) is committed to equal employment opportunity (EEO) for all employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other protected status as defined by applicable federal, state, or local laws. This commitment extends to all aspects of employment, including recruitment, hiring, training, compensation, promotion, transfer, leaves of absence, termination, layoffs, and recall.

BV Teck expressly prohibits any form of workplace harassment or discrimination. Any improper interference with employees' ability to perform their job duties may result in disciplinary action up to and including termination of employment.
Vacancy posted 4 days ago