Distributed Systems Engineer, Data & Inference Platform
Adaption
About Us
Most AI is frozen in place - it doesn't adapt to the world. We think that's backwards. Our mandate is to build efficient intelligence that evolves in real-time. Our vision is AI systems that are flexible, personalized, and accessible to everyone. We believe efficiency is what makes this possible - it's how we expand access and ensure innovation benefits the many, not the few. We believe in talent density: bringing together the best and most driven individuals to push the boundaries of continual adaptation. We're looking for builders and creative thinkers ready to shape the next era of intelligence.
The Role
You'll build and operate the systems that turn raw compute into useful intelligence - the inference services that serve LLMs at scale and the data pipelines that feed them. One week you're hunting a tail-latency regression in a production inference service handling millions of requests; the next you're redesigning a Ray Data pipeline so it stops melting down at petabyte scale. The work spans architecture, implementation, and the on-call pager that keeps you honest about both. Researchers and ML engineers will hand you workloads that barely run; you'll hand them back systems that run reliably, efficiently, and cheaply enough to matter.
Responsibilities
- Serve Models at Scale: Design and operate distributed inference systems for LLMs, optimizing throughput, latency, and cost across heterogeneous GPU fleets. Batching, scheduling, KV cache management, autoscaling - you own the levers that make inference economical.
- Move the Data: Build large-scale data pipelines (Ray Data, Spark, or equivalents) that ingest, transform, and curate the datasets behind training and evaluation. The bottleneck is rarely where people think it is, and you find it.
- Debug the Undebuggable: Chase down the failure modes that only emerge under real production traffic - stragglers, head-of-line blocking, silent data corruption, GPU memory fragmentation - and write the postmortems that prevent the next ten. Define SLOs, build the observability to measure them, and own the on-call rotation that defends them.
- Partner Across the Stack: Work directly with researchers and ML engineers to take experimental workloads from "runs on one node" to "runs in production." You're a systems partner, not a ticket queue.
- 5+ years building and operating distributed systems in production.
- Deep experience with at least one large-scale data or compute framework (Ray, Spark, Flink, Beam, Dask).
- Strong fluency in Python and at least one systems language (Go, Rust, C++).
- Working knowledge of the GPU/accelerator stack: CUDA fundamentals, NCCL, mixed precision, memory layout. You don't need to write kernels, but you should know why a workload is bound by what it's bound by.
- Experience operating Kubernetes-based infrastructure, including custom operators or schedulers.
- A track record of owning hard production incidents end-to-end - diagnosis, mitigation, and the durable fix.
- Bonus: hands-on experience with LLM inference engines (vLLM, SGLang, TensorRT-LLM, TGI), modern lakehouse formats (Iceberg, Delta, Hudi), or open-source contributions to relevant projects.
- Flexible work: In-person collaboration in the Bay Area, a distributed global-first team, and team offsites.
- Adaption Passport: Annual travel stipend to explore a country you've never visited. We're building intelligence that evolves alongside you, so we encourage you to keep expanding your horizons.
- Lunch Stipend: Weekly meal allowance for take-out or grocery delivery.
- Well-Being: Comprehensive medical benefits and generous paid time off.
Vacancy posted 21 days ago

