Lead Software Systems Engineer - GPU Performance
$170k - $300k
Nebius
About Nebius: Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure. Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI.
Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.
We are looking for a Lead Software Systems Engineer - GPU Performance to play a key role in building our hyperscaler platform, working across its core components while analyzing and optimizing the performance of large-scale GPU clusters at the intersection of hardware and software. You will operate across the full stack, from hardware and system software to networking (InfiniBand/RoCE), virtualization (KVM/QEMU), and distributed communication layers (e.g., MPI, NCCL).
In this role you will:
- Focus on understanding system behavior across multiple layers, identifying performance bottlenecks, and driving improvements that shape how our clusters are built, operated, tuned, and validated.
- Investigate and troubleshoot performance issues in GPU clusters under real workloads (training and inference).
- Evaluate and integrate new hardware, system configurations, and tuning approaches through the software stack.
- Support complex performance-related escalations from internal teams and customers.
- Work closely with infrastructure, software engineering, and hardware vendor teams (e.g., NVIDIA, Mellanox, Intel).
- Contribute to hardware and cluster qualification (acceptance), ensuring systems meet performance expectations.
- 5+ years of professional experience in system-level software development (focused on performance optimization, low-level programming).
- 3+ years of hands-on experience with Linux systems (administration, troubleshooting, and performance tuning).
- In-depth understanding of server architecture, including PCIe devices, NICs, Linux OS/Kernel, and high-performance computing (HPC) systems.
- Strong proficiency in one or more performance-oriented programming languages (C/C++, Go, Python).
- Health insurance: 100% company-paid medical, dental and vision coverage for employees and families.
- 401(k) plan: Up to 4% company match with immediate vesting.
- Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
- Remote work reimbursement: Up to $85/month for mobile and internet.
- Disability & life insurance: Company-paid short-term, long-term and life insurance coverage.
- Competitive compensation
- Career growth and learning opportunities
- Flexibility and work-life balance
- Collaborative and innovative culture
- Opportunity to work on impactful AI projects
- International environment and talented teams
What's it like to work at Nebius: Fast moving - Bold thinking - Constant growth - Meaningful impact - Trust and real ownership - Opportunity to shape the future of AI
Equal Opportunity Statement: Nebius is an equal opportunity employer. We are committed to fostering an inclusive and diverse workplace and to providing equal employment opportunities in all aspects of employment. We do not discriminate on the basis of race, color, religion, sex (including pregnancy), national origin, ancestry, age, disability, genetic information, marital status, veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by applicable law. Applicants must be authorized to work in the country in which they apply and will be required to provide proof of employment eligibility as a condition of hire.
If you need accommodations during the application process, please let us know.
Vacancy posted 18 hours ago

