Infrastructure Software Engineer: Scale GPU Clusters (Remote)
$190k - $350k
6AM City
- Remote job
6AM City, LLC, located in California, is looking for a Software Engineer specializing in Infrastructure to enhance software tools for large-scale computing efforts. This role focuses on developing, debugging, and implementing robust systems that will empower engineers to work efficiently. Key responsibilities include building scalable tools, debugging distributed systems, and collaborating with team members to ensure reliability. With competitive benefits and a salary range of $190,000 to $350,000, this position offers a chance to significantly impact software infrastructure.
- ...A pioneering AI infrastructure company is seeking a GPU Cloud Platform Engineer to design and operate large-scale GPU clusters. This remote position aims to ensure high availability and performance of containerized AI workloads across cloud environments. The ideal candidate...
Remote work
$170k - $220k
...Boston, MA is seeking an experienced infrastructure manager to support large-scale AI workloads. This role involves designing and optimizing GPU and cloud infrastructure for efficient... ...ownership. Exceptional candidates may be considered for remote work...
Remote work
- ...Baseten Engineer Opportunity Baseten powers... ..., flexible infrastructure, and seamless developer... ...multi-modal workloads scale, the network is... ...engineers to lead our GPU Networking efforts... ...to architect the software fabric that... ...on bleeding-edge clusters (H100/H200, B200/B...
Remote work, Flexible hours
- ...Staff Software Engineer Our mission is to scale intelligence to serve humanity.... ...AI. The internal infrastructure team is responsible... ...manage Kubernetes-based GPU/TPU superclusters... ...with GPU/TPU clusters, distributed training... ...workspace improvement ~ Remote-flexible, offices...
Remote work, Full time, Work at office, Flexible hours
$165k - $242k
...innovators to build and scale AI with confidence... ...combines superior infrastructure performance with... ...the role Senior engineers are area owners... ...teams to evolve our GPU performance... ...Go and/or Python software development. Hands... ...work environment, remote work may be considered...
Remote work, Permanent employment, Temporary work, Casual work, Work at office, Flexible hours
$152k - $241.5k
...Senior Software Engineer, Fabric Networking - GPU. Locations: US, CA, Santa Clara; US, IL, Remote; US, CO, Remote; US, AZ, Remote; US... ...and software to support large scale computing platforms. Work with...
Remote work
- ...Location: Remote (Global) Type: Full-time... ...orchestration at a planetary scale. Our mission is to... ...We are seeking a GPU Cloud Platform Engineer to join our core infrastructure team and help... ...-scale, multi-cluster GPU infrastructure... ...Computer Science, Software Engineering, Electronic...
Remote work, Full time, Flexible hours
- Bright Vision Technologies is looking for an AI Infrastructure Engineer to design and operate infrastructure that supports large-scale AI workloads. The role is entirely remote and requires expertise in GPU clusters, distributed training, and performance optimization. Ideal...
Remote job, Full time
$180k - $200k
...Infrastructure Engineer (GPU & Compute) Lightning AI is the company behind PyTorch Lightning... ...AI combines developer-first software with cost-efficient, large-scale compute. Teams get the tools... ...infrastructure Run and maintain test clusters used for system validation,...
Remote work, Work from home, Flexible hours
$184k - $287.5k
Senior Software Engineer, DGX Cloud AI Infrastructure. Locations... ..., Austin; US, OR, Remote; US, WA, Remote;... ...across NVIDIA GPU platforms at the largest scales we run. In this... ...that keep large clusters productive. This... ...AI clusters, infrastructure, and end-to-end workloads...
Remote work
$116k - $189.75k
Software Engineer, DGX Cloud AI Infrastructure. Locations:... ...US, TX, Austin; US, OR, Remote; US, WA, Remote; US, WA... ...across NVIDIA GPU platforms at the largest scales we run. In this role you... ...and debug large-scale AI clusters, infrastructure, and end-to-end workloads...
Remote work
- A leading AI infrastructure company seeks a Head of AI Infrastructure to... ...technical roadmap for a global GPU cloud platform. This role... ...expertise in Kubernetes and GPU clusters. The ideal candidate will lead... ...team, working in a remote-first environment, and will enjoy...
Remote work, Immediate start
- ...approach to silicon and software development. We’re seeking engineers who are energized by... ...performance at scale. Cornelis Networks delivers... ...the efficiency of GPU, CPU and accelerator-based compute clusters at any scale. Our... ..., hybrid, and fully remote roles...
Remote work
$160k - $230k
...Senior Software Engineer - Together Cloud Infrastructure San Francisco About the... ...Kubernetes and Slurm clusters. This platform serves... ...I/O, and scale ~ Experience with... ...SmartNICs a plus ~ GPU programming, NCCL,... ...flexibility in terms of remote work. The US base salary...
Remote work, Full time
- ...Engineering Manager For The Clustering Platform Team The engineering team... ...-world entities at scale. You'll lead a high... ...focused on clustering infrastructure and data systems.... ...Strong hands-on software and data engineering... ...is hybrid in NYC or remote U.S. EST and Ontario...
Remote work, Flexible hours
- ...for an experienced Kafka Platform Engineer to architect, deploy, and operate large-scale Apache Kafka environments. The... ...Responsibilities include deploying Kafka clusters, implementing security measures,... .... This full-time role is 100% remote and aligned with long-term...
Remote job, Full time
$165k - $225k
...Senior Software Engineer, Network Platform Moonlite delivers... ...high-performance AI infrastructure for organizations... ...computational research, large-scale model training, and... .... Research Cluster Networking: Design and... ...performance connectivity for GPU clusters and...
Remote work, Immediate start, Flexible hours
$184k - $356.5k
...NVIDIA Corporation is seeking a Senior Software Engineer in Santa Clara to enhance the performance and reliability of large-scale AI infrastructures. The role involves leadership in debugging... ...training workloads across NVIDIA’s GPU platforms. Ideal candidates should have...
$272k - $431.25k
...computing. An era in which our GPU acts as the brains of... ...NVIDIA, as a Principal Rack Scale Systems Infrastructure Engineer, you will build and guide the development of software systems. These systems support... ...experience with rack- or cluster-scale systems spanning compute...
Remote work, Shift work
$165k - $225k
...Senior Software Engineer, Platform Infrastructure Moonlite delivers high-performance AI infrastructure for... ...intensive computational research, large-scale model training, and demanding data... ...– bare-metal servers, GPU clusters, high-performance storage, and networking...
Remote work, Immediate start, Flexible hours
- ...early-stage Kubernetes infrastructure company is hiring a Frontend Engineer to design and build the... ...that translate complex cluster state into clear, actionable... ...UIs using Convex that scale from single users to large... ...in San Francisco, CA. Remote work is not available...
Remote work
$150k - $170k
...Senior Software Platform Engineer Palo Alto, California... ...United States; Remote PsiQuantum's... ...advantages for scale: photons don't feel... ...standard fiber-optic infrastructure. In 2024,... ...infrastructure with GPU-accelerated... ...efficiently on GPU clusters. Role...
Remote work, Full time, Shift work
- ...NVIDIA Gruppe is seeking highly motivated EngOps and Platform Engineers to develop automated tools for managing large GPU clusters. This position requires strong expertise in high-performance computing and deep learning. The ideal applicants have a BS or MS in a relevant...
$139k - $204k
...innovators to build and scale AI with... ...combines superior infrastructure performance with... ...As part of the Cluster Orchestration team... ...efficiently across massive GPU clusters. By... ...As a Senior Software Engineer I (IC3), you will... ...work environment, remote work may be considered...
Remote work, Permanent employment, Temporary work, Casual work, Work at office, Flexible hours
$152k - $241.5k
...era in which our GPU acts as the brains... ...that analyzes large-scale datacenter... ...on GPU-accelerated clusters. You will turn telemetry... ...GPU, and systems engineers. When useful, you... ...scale workloads and infrastructure signals to find application... ...) inside existing software workflows. What...
Remote work
$152k - $241.5k
...and Visualization. The GPU, our invention,... ...highly motivated Senior Software Engineers to join our Fabric Networking... ...focus on NVLink Rack-Scale Systems Stability &... ..., and large-scale AI infrastructure, contributing... ...and large-scale AI/HPC clusters such as NVIDIA GB200...
Remote work
$184k - $287.5k
...Cloud is building and operating large-scale GPU infrastructure for AI research and production workloads. We are looking for Senior Software Engineers to help build the automation, tooling... ...operational systems that make GPU clusters reliable, scalable, and safe to run....
Remote work
- ...Technical Marketing Engineer (TME) within the Software Product Management... ...AMD’s Data Center GPU Business Unit, you... ...through deployment, scaling, and optimization of... ...operate AMD‑powered GPU clusters & networks to... ...focused on datacenter infrastructure or AI platforms....
- ...reputation for Swiss engineering standards has made us... ...performs consistently at scale, deliver high-quality... ...critical applications and infrastructure. Experience with... ...and Kubernetes (EKS) clusters, which sit at the foundation... .... Our benefits (Remote) You get to impact the...
Remote work, Worldwide
$160k - $322k
...NVIDIA Gruppe in Santa Clara is seeking a Senior Technical Marketing Engineer focused on GPUs and scale-up architecture. The role involves showcasing NVIDIA's GPU architecture and server-level platforms, aiming to maximize performance for AI applications. The ideal candidate...

