Member of Technical Staff, Infrastructure / DevOps

Plato.ai

Introduction

Plato is an applied research lab building the foundational infrastructure to train specialized AI agents. We turn real-world data streams into high-fidelity simulated environments that generate the training signal needed to make capable models.

Today, only a handful of players can train models for capable work. Compute and algorithms are rapidly commoditizing, but reinforcement learning data remains the bottleneck. Plato is changing that by automatically scaling training environments from proprietary real-world data. Our work supports frontier labs, hyperscalers, and enterprises building AI systems for complex, high-stakes work.

Why This Role Matters

Infrastructure is central to Plato's product and research loop. Generic cloud systems are not designed for long-running RL environments, persistent agent workspaces, replayable rollouts, storage-efficient forks, or recursive debugging loops. To train useful agents, we need infrastructure that makes environment construction, experimentation, evaluation, and iteration feel like one seamless system.

As a Member of Technical Staff, Infrastructure / DevOps, you will own the systems that make Plato's research and training loops reliable at scale.

Role Description

You will build and operate the infrastructure behind long-horizon agent experiments, including environment VMs, storage-efficient snapshots and forks, orchestration for parallel agent fleets, shared workspaces, verifier workers, telemetry pipelines, deployment systems, and the operational tooling that lets researchers run thousands of experiments without thinking about the machinery underneath.

This is not conventional cloud plumbing. You will be building infrastructure that directly shapes the quality, speed, and reliability of Plato's research.

You Will Work On

  • Build and operate purpose-built infrastructure for RL rollouts, long-running agent tasks, and environment synthesis jobs.
  • Scale environment VMs, snapshots, checkpointing, persistent sandboxes, and storage-efficient forks.
  • Design orchestration systems for fleets of agents that crawl, synthesize, evaluate, debug, and rerun experiments.
  • Build telemetry, logging, tracing, replay, and observability systems for thousands of concurrent agent sessions.
  • Improve reliability, cold starts, uptime, cost efficiency, isolation, and developer experience across the infrastructure stack.
  • Partner with research engineers to turn experimental workflows into repeatable, production-grade systems.

What We're Looking For

We're looking for someone who is excited to work close to the metal of AI infrastructure and enjoys turning ambiguous research workflows into reliable systems.

You may be a strong fit if you:

  • Have experience building or operating distributed systems, cloud infrastructure, orchestration platforms, or developer tooling.
  • Are comfortable debugging across infrastructure, application, and research workflows.
  • Care deeply about reliability, observability, isolation, and cost efficiency.
  • Enjoy working with researchers and engineers to turn messy, fast-moving workflows into durable systems.
  • Want to build infrastructure that is part of the core product, not just internal support tooling.

How We Work

Being an engineer at an early-stage AI startup is not easy. These are the values we care about.

Ownership

We value teammates who bring novel ideas to the table, experiment, and see results through end to end. You'll have access to massive compute budgets to test large scale experiments.

Move Fast, Build Durable

Demand is growing faster than our team. We move quickly, prioritize ruthlessly, and ship systems that keep working under load.

Reality Over Narratives

Training data is incredibly fragile and prone to reward-hacking. We prioritize digging deep through data, manually if we have to, to garner deep intuition on retaining high quality throughput.

Stay Close to the Frontier

New AI capabilities rapidly change how we think about problems and what doors open. We stay close to the frontier of model capability, and encourage teammates to constantly share new findings and update their world model of what's possible.

Vacancy posted 3 days ago