Member of Technical Staff - GPU Infrastructure

$150k - $300k

Prime Intellect

Building Open Superintelligence Infrastructure

Prime Intellect is building the open superintelligence stack - from frontier agentic models to the infra that enables anyone to create, train, and deploy them. We aggregate and orchestrate global compute into a single control plane and pair it with the full RL post-training stack: environments, secure sandboxes, verifiable evals, and our async RL trainer. We enable researchers, startups and enterprises to run end-to-end reinforcement learning at frontier scale, adapting models to real tools, workflows, and deployment contexts.

As our Solutions Architect for GPU Infrastructure, you'll be the technical expert who transforms customer requirements into production-ready systems capable of training the world’s most advanced AI models.

We recently raised $15mm in funding (total of $20mm raised) led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI) and many others.

Core Technical Responsibilities

Customer Architecture & Design
Partner with clients to understand workload requirements and design optimal GPU cluster architectures
Create technical proposals and capacity planning for clusters ranging from 100 to 10,000+ GPUs
Develop deployment strategies for LLM training, inference, and HPC workloads
Present architectural recommendations to technical and executive stakeholders

Infrastructure Deployment & Optimization
Deploy and configure orchestration systems including SLURM and Kubernetes for distributed workloads
Implement high-performance networking with InfiniBand, RoCE, and NVLink interconnects
Optimize GPU utilization, memory management, and inter-node communication
Configure parallel filesystems (Lustre, BeeGFS, GPFS) for optimal I/O performance
Tune system performance from kernel parameters to CUDA configurations

Production Operations & Support
Serve as primary technical escalation point for customer infrastructure issues
Diagnose and resolve complex problems across the full stack - hardware, drivers, networking, and software
Implement monitoring, alerting, and automated remediation systems
Provide 24/7 on-call support for critical customer deployments
Create runbooks and documentation for customer operations teams

Technical Requirements

Required Experience
3+ years hands-on experience with GPU clusters and HPC environments
Deep expertise with SLURM and Kubernetes in production GPU settings
Proven experience with InfiniBand configuration and troubleshooting
Strong understanding of NVIDIA GPU architecture, CUDA ecosystem, and driver stack
Experience with infrastructure automation tools (Ansible, Terraform)
Proficiency in Python, Bash, and systems programming
Track record of customer-facing technical leadership

Infrastructure Skills
NVIDIA driver installation and troubleshooting (CUDA, Fabric Manager, DCGM)
Container runtime configuration for GPUs (Docker, Containerd, Enroot)
Linux kernel tuning and performance optimization
Network topology design for AI workloads
Power and cooling requirements for high-density GPU deployments

Nice to Have
Experience with 1000+ GPU deployments
NVIDIA DGX, HGX, or SuperPOD certification
Distributed training frameworks (PyTorch FSDP, DeepSpeed, Megatron-LM)
ML framework optimization and profiling
Experience with AMD MI300 or Intel Gaudi accelerators
Contributions to open-source HPC/AI infrastructure projects

Growth Opportunity

You’ll work directly with customers pushing the boundaries of AI, from startups training foundation models to enterprises deploying massive inference infrastructure. You’ll collaborate with our world-class engineering team while having direct impact on systems powering the next generation of AI breakthroughs.

We value expertise and customer obsession - if you’re passionate about building reliable, high-performance GPU infrastructure and have a track record of successful large-scale deployments, we want to talk to you. Apply now and join us in our mission to democratize access to planetary scale computing.

Compensation

Cash compensation range of $150k - $300k plus equity incentives

Vacancy posted 3 days ago
