Member of Technical Staff - Training Platform
$150k - $300k
Kubelt
Building Open Superintelligence Infrastructure

Prime Intellect is building the open superintelligence stack - from frontier agentic models to the infrastructure that lets anyone create, train, and deploy them. We aggregate and orchestrate global compute into a single control plane and pair it with the full RL post-training stack: environments, secure sandboxes, verifiable evals, and our async RL trainer. We enable researchers, startups, and enterprises to run end-to-end reinforcement learning at frontier scale, adapting models to real tools, workflows, and deployment contexts.

We recently raised $15M in funding (taking total funding to $20M), led by Founders Fund with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka Labs, Tesla, OpenAI), Tri Dao (Chief Scientist, Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.

Role Impact

You'll help build our hosted training platform - the product that lets users launch LoRA and full fine-tuning runs on managed GPU clusters with a single API call or a few clicks. The role spans the developer-facing platform and the underlying Kubernetes-based training infrastructure that runs the jobs.
Core Technical Responsibilities

Hosted Training Infrastructure
- Design and operate Kubernetes-based training and inference orchestration across multi-cluster, multi-cloud GPU fleets
- Build and maintain Helm charts that compose trainers, inference servers, environment servers, and supporting services into reproducible "Training stacks"
- Develop the Python control-plane agents that watch pods, report run state to the platform, and keep clusters in sync
- Implement scheduling and autoscaling for heterogeneous hardware (H100/H200/B200) using KEDA, LeaderWorkerSet, taints/tolerations, and gang scheduling
- Run a tight GitOps workflow - every change ships through PRs, Helm values, and CI
- Build node-local model caches, checkpoint pipelines, and shared storage for fast cold starts
- Operate the observability stack (Prometheus, Grafana, Loki, DCGM) and make GPU cluster debugging fast

Platform Development
- Build the developer-facing surfaces for hosted training: job submission, live run monitoring, logs, metrics, model/adapter management, comparisons
- Develop FastAPI backend services and REST APIs that bridge the platform to running clusters
- Build real-time monitoring and debugging tools (streaming logs, step-level metrics, failure analysis)
- Ship product UI in Next.js / React / TypeScript with shadcn, Tailwind, tRPC, and TanStack Query

Research Bridge
- Interface with the RL trainer, inference servers, and environment servers running inside our clusters
- Productize new training capabilities (new model architectures, RL algorithms, modes)

Technical Requirements

We're looking for engineers who are fluent across three areas - you don't need to be the world's best at any one, but you should have real depth in all three and a clear point of view on how they connect.
AI & GPU Landscape
- Strong working knowledge of the modern AI stack - open model families, fine-tuning techniques (LoRA, QLoRA, full FT, RLHF/RLAIF), inference engines (vLLM, SGLang, TensorRT-LLM)
- Familiarity with GPU hardware tradeoffs (H100 / H200 / B200, NVLink, interconnects, memory hierarchy) and what they mean for training and inference workloads
- Understanding of distributed training fundamentals (data/tensor/pipeline/expert parallelism, NCCL, multi-node scheduling)
- Awareness of what's happening at the frontier - new models, training methods, infra patterns - and the ability to translate that into product decisions

Kubernetes & Infrastructure
- Strong Kubernetes operations experience - Helm, CRDs, operators, KEDA, gang scheduling, GPU operator
- Comfort debugging real production clusters (kubectl, pod lifecycle, node issues, networking)
- Cloud platform experience (GCP preferred - GCS, GKE, Cloud Run, Cloud Tasks)
- Infrastructure automation (Helm, Terraform, Ansible) and a GitOps mindset
- Observability: Prometheus, Grafana, Loki, OpenTelemetry, DCGM
- Linux fundamentals: networking, namespaces, performance tuning

Programming & Platform
- Strong Python backend development (FastAPI, async, SQLAlchemy)
- Comfort building Python control-plane agents that talk to Kubernetes APIs
- Modern frontend development (TypeScript, React/Next.js, Tailwind, shadcn) - enough to ship product surfaces end-to-end
- REST and tRPC API design
- Experience building developer tools, dashboards, and live-monitoring UIs

What We Offer
- Cash compensation of $150K–$300K with significant equity
- Flexible work arrangement (remote or San Francisco office)
- Full visa sponsorship and relocation support
- Professional development budget for courses and conferences
- Regular team off-sites and conference attendance
- Opportunity to shape the future of decentralized AI development

Growth Opportunity

You'll join a team of experienced engineers and researchers working on cutting-edge problems in AI infrastructure.
We believe in open development and encourage team members to contribute to the broader AI community through research and open-source work. We value potential over perfection - if you're passionate about democratizing AI development and have experience in either platform or infrastructure development (ideally both), we want to talk to you.

Ready to help shape the future of AI? Apply now and join us in our mission to make powerful AI models accessible to everyone.

