Member of Technical Staff - Training Platform

$150k - $300k

Prime Intellect

Building Open Superintelligence Infrastructure

Prime Intellect is building the open superintelligence stack - from frontier agentic models to the infrastructure that lets anyone create, train, and deploy them. We aggregate and orchestrate global compute into a single control plane and pair it with the full RL post-training stack: environments, secure sandboxes, verifiable evals, and our async RL trainer. We enable researchers, startups, and enterprises to run end-to-end reinforcement learning at frontier scale, adapting models to real tools, workflows, and deployment contexts.

We recently raised $15M in funding (taking total funding to $20M), led by Founders Fund with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka Labs, Tesla, OpenAI), Tri Dao (Chief Scientist, Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.

Role Impact

You'll help build our hosted training platform - the product that lets users launch LoRA and full fine-tuning runs on managed GPU clusters with a single API call or a few clicks. The role spans the developer-facing platform and the underlying Kubernetes-based training infrastructure that runs the jobs.
Core Technical Responsibilities

Hosted Training Infrastructure
  • Design and operate Kubernetes-based training and inference orchestration across multi-cluster, multi-cloud GPU fleets
  • Build and maintain Helm charts that compose trainers, inference servers, environment servers, and supporting services into reproducible "Training stacks"
  • Develop the Python control-plane agents that watch pods, report run state to the platform, and keep clusters in sync
  • Implement scheduling and autoscaling for heterogeneous hardware (H100/H200/B200) using KEDA, LeaderWorkerSet, taints/tolerations, and gang scheduling
  • Run a tight GitOps workflow - every change ships through PRs, Helm values, and CI
  • Build node-local model caches, checkpoint pipelines, and shared storage for fast cold starts
  • Operate the observability stack (Prometheus, Grafana, Loki, DCGM) and make GPU cluster debugging fast

Platform Development
  • Build the developer-facing surfaces for hosted training: job submission, live run monitoring, logs, metrics, model/adapter management, comparisons
  • Develop FastAPI backend services and REST APIs that bridge the platform to running clusters
  • Build real-time monitoring and debugging tools (streaming logs, step-level metrics, failure analysis)
  • Ship product UI in Next.js / React / TypeScript with shadcn, Tailwind, tRPC, and TanStack Query

Research Bridge
  • Interface with the RL trainer, inference servers, and environment servers running inside our clusters
  • Productize new training capabilities (new model architectures, RL algorithms, modes)

Technical Requirements

We're looking for engineers who are fluent across three areas - you don't need to be the world's best at any one, but you should have real depth in all three and a clear point of view on how they connect.
AI & GPU Landscape
  • Strong working knowledge of the modern AI stack - open model families, finetuning techniques (LoRA, QLoRA, full FT, RLHF/RLAIF), inference engines (vLLM, SGLang, TensorRT-LLM)
  • Familiarity with GPU hardware tradeoffs (H100 / H200 / B200, NVLink, interconnects, memory hierarchy) and what they mean for training and inference workloads
  • Understanding of distributed training fundamentals (data/tensor/pipeline/expert parallelism, NCCL, multi-node scheduling)
  • Awareness of what's happening at the frontier - new models, training methods, infra patterns - and the ability to translate that into product decisions

Kubernetes & Infrastructure
  • Strong Kubernetes operations experience - Helm, CRDs, operators, KEDA, gang scheduling, GPU operator
  • Comfortable debugging real production clusters (kubectl, pod lifecycle, node issues, networking)
  • Cloud platform experience (GCP preferred - GCS, GKE, Cloud Run, Cloud Tasks)
  • Infrastructure automation (Helm, Terraform, Ansible) and a GitOps mindset
  • Observability: Prometheus, Grafana, Loki, OpenTelemetry, DCGM
  • Linux fundamentals: networking, namespaces, performance tuning

Programming & Platform
  • Strong Python backend development (FastAPI, async, SQLAlchemy)
  • Comfortable building Python control-plane agents that talk to Kubernetes APIs
  • Modern frontend development (TypeScript, React/Next.js, Tailwind, shadcn) - enough to ship product surfaces end-to-end
  • REST and tRPC API design
  • Experience building developer tools, dashboards, and live-monitoring UIs

What We Offer
  • Cash compensation $150K–$300K with significant equity
  • Flexible work arrangement (remote or San Francisco office)
  • Full visa sponsorship and relocation support
  • Professional development budget for courses and conferences
  • Regular team off-sites and conference attendance
  • Opportunity to shape the future of decentralized AI development

Growth Opportunity

You'll join a team of experienced engineers and researchers working on cutting‑edge problems in AI infrastructure.
We believe in open development and encourage team members to contribute to the broader AI community through research and open‑source work. We value potential over perfection - if you're passionate about democratizing AI development and have experience in either platform or infrastructure development (ideally both), we want to talk to you.

Ready to help shape the future of AI? Apply now and join us in our mission to make powerful AI models accessible to everyone.

Vacancy posted 10 days ago