Member of Technical Staff, Infrastructure / DevOps
Plato.ai
Introduction

Plato is an applied research lab building the foundational infrastructure to train specialized AI agents. We turn real-world data streams into high-fidelity simulated environments that generate the training signal needed to make capable models. Today, only a handful of players can train models for capable work. Compute and algorithms are rapidly commoditizing, but reinforcement learning data remains the bottleneck. Plato is changing that by automatically scaling training environments from proprietary real-world data. Our work supports frontier labs, hyperscalers, and enterprises building AI systems for complex, high-stakes work.

Why This Role Matters

Infrastructure is central to Plato's product and research loop. Generic cloud systems are not designed for long-running RL environments, persistent agent workspaces, replayable rollouts, storage-efficient forks, or recursive debugging loops. To train useful agents, we need infrastructure that makes environment construction, experimentation, evaluation, and iteration feel like one seamless system. As a Member of Technical Staff, Infrastructure / DevOps, you will own the systems that make Plato's research and training loops reliable at scale.

Role Description

You will build and operate the infrastructure behind long-horizon agent experiments, including environment VMs, storage-efficient snapshots and forks, orchestration for parallel agent fleets, shared workspaces, verifier workers, telemetry pipelines, deployment systems, and the operational tooling that lets researchers run thousands of experiments without thinking about the machinery underneath. This is not conventional cloud plumbing. You will be building infrastructure that directly shapes the quality, speed, and reliability of Plato's research.

You Will Work On

- Build and operate purpose-built infrastructure for RL rollouts, long-running agent tasks, and environment synthesis jobs.
- Scale environment VMs, snapshots, checkpointing, persistent sandboxes, and storage-efficient forks.
- Design orchestration systems for fleets of agents that crawl, synthesize, evaluate, debug, and rerun experiments.
- Build telemetry, logging, tracing, replay, and observability systems for thousands of concurrent agent sessions.
- Improve reliability, cold starts, uptime, cost efficiency, isolation, and developer experience across the infrastructure stack.
- Partner with research engineers to turn experimental workflows into repeatable, production-grade systems.

What We're Looking For

We're looking for someone who is excited to work close to the metal of AI infrastructure and enjoys turning ambiguous research workflows into reliable systems.

You may be a strong fit if you:

- Have experience building or operating distributed systems, cloud infrastructure, orchestration platforms, or developer tooling.
- Are comfortable debugging across infrastructure, application, and research workflows.
- Care deeply about reliability, observability, isolation, and cost efficiency.
- Enjoy working with researchers and engineers to turn messy, fast-moving workflows into durable systems.
- Want to build infrastructure that is part of the core product, not just internal support tooling.

How We Work

Being an engineer at an early-stage AI startup is not easy. These are the values we care about.

Ownership
We value teammates who bring novel ideas to the table, experiment, and see results through end to end. You'll have access to massive compute budgets to test large-scale experiments.

Move Fast, Build Durable
Demand is growing faster than our team. We move quickly, prioritize ruthlessly, and ship systems that keep working under load.

Reality Over Narratives
Training data is incredibly fragile and prone to reward-hacking. We prioritize digging deep through data, manually if we have to, to garner deep intuition on retaining high-quality throughput.

Stay Close to the Frontier
New AI capabilities rapidly change how we think about problems and what doors open. We stay close to the frontier of model capability, and encourage teammates to constantly share new findings and update their world model of what's possible.
$150k - $265k


