Lead Software Systems Engineer - GPU Performance

$170k - $300k

Nebius

About Nebius:

Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure.

Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI.

Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.

We are looking for a Lead Software Systems Engineer - GPU Performance to play a key role in building our hyperscaler platform, working across its core components while analyzing and optimizing the performance of large-scale GPU clusters at the intersection of hardware and software.

You will operate across the full stack, from hardware and system software to networking (InfiniBand/RoCE), virtualization (KVM/QEMU), and distributed communication layers (e.g., MPI, NCCL).

In this role you will:
  • Focus on understanding system behavior across multiple layers, identifying performance bottlenecks, and driving improvements that shape how our clusters are built, operated, tuned, and validated.
  • Investigate and troubleshoot performance issues in GPU clusters under real workloads (training and inference)
  • Evaluate and integrate new hardware, system configurations and tuning approaches through the software stack
  • Support complex performance-related escalations from internal teams and customers
  • Work closely with infrastructure, software engineering and hardware vendor teams (e.g. NVIDIA, Mellanox, Intel)
  • Contribute to hardware and cluster qualification (acceptance), ensuring systems meet performance expectations
We expect you to have:
  • 5+ years of professional experience in system-level software development, with a focus on performance optimization and low-level programming.
  • 3+ years of hands-on experience with Linux systems (administration, troubleshooting, and performance tuning).
  • In-depth understanding of server architecture, including PCIe devices, NICs, Linux OS/Kernel, and high-performance computing (HPC) systems.
  • Strong proficiency in one or more performance-oriented programming languages (C/C++, Go, Python).
We conduct coding interviews as part of the process.

Key employee benefits:
  • Health insurance: 100% company-paid medical, dental and vision coverage for employees and families.
  • 401(k) plan: Up to 4% company match with immediate vesting.
  • Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
  • Remote work reimbursement: Up to $85/month for mobile and internet.
  • Disability & life insurance: Company-paid short-term, long-term and life insurance coverage.
Compensation

We offer competitive salaries ranging from $170k-$300k OTE + equity based on your experience.

Pay Transparency

We offer competitive compensation and benefits packages. Actual compensation will be determined based on job-related factors, including experience, skills, qualifications, the level at which the candidate is hired, and geographic location, consistent with applicable law.

Compensation Range

$170,000-$300,000 USD

Benefits & Perks:
  • Competitive compensation
  • Career growth and learning opportunities
  • Flexibility and work-life balance
  • Collaborative and innovative culture
  • Opportunity to work on impactful AI projects
  • International environment and talented teams

What's it like to work at Nebius:

Fast moving - Bold thinking - Constant growth - Meaningful impact - Trust and real ownership - Opportunity to shape the future of AI


Equal Opportunity Statement:

Nebius is an equal opportunity employer. We are committed to fostering an inclusive and diverse workplace and to providing equal employment opportunities in all aspects of employment. We do not discriminate on the basis of race, color, religion, sex (including pregnancy), national origin, ancestry, age, disability, genetic information, marital status, veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by applicable law.

Applicants must be authorized to work in the country in which they apply and will be required to provide proof of employment eligibility as a condition of hire.


If you need accommodations during the application process, please let us know.
Vacancy posted 18 hours ago