Software Engineer, ML Infra & Distributed Systems (Staff & Principal)
Tubi TV
Staff Software Engineer, ML Infra & Distributed Systems

About the Role:
As a Staff Software Engineer on the ML Infrastructure team, you will collaborate closely with the Machine Learning and Product teams to build world-class machine learning inference platforms. These platforms power essential services like personalized recommendations, search, and content understanding across Tubi. A core responsibility is developing and maintaining low-latency ML model serving systems that support Deep Learning, LLM, and Search models. This involves building self-service infrastructure and critical components such as the inference engine, feature store, vector store, and experimentation engine. You will improve the way we deploy and operate our services and may contribute to open-source projects. This role grants architectural freedom to explore new frameworks, lead critical cross-functional projects, and transform the capabilities of our ML and Product teams.

Responsibilities:
- Design and build scalable, high-throughput, low-latency distributed systems using Scala.
- Build reusable components and services that serve various ML applications like Personalization, Search, Ads and Exploration.
- Partner closely with ML engineers to understand their challenges and limitations and develop scalable solutions to address them.
- Proactively recommend solutions to keep our ML Inference stack state of the art.
- Take a data-driven approach to identifying and optimizing the latency, cost, and efficiency of our infra.
- Lead large-scale, cross-functional refactorings if necessary.
- Mentor other engineers on system design, incident management, interviewing, leveraging LLMs for work, etc.
- Collaborate with ML, Product, and cross-functional engineering teams to define the long-term vision and architecture for ML Infrastructure at Tubi.

Your Background:
- 8+ years of experience designing and building scalable, distributed systems in any modern backend language (e.g., Scala, Java, Python, Go, C++); experience with Scala or another JVM-based language is a plus.
- Strong experience with AWS or an equivalent cloud platform.
- Experience building online microservices at scale with low-latency serving.
- Experience with both SQL (e.g. Postgres) and NoSQL databases (e.g. Cassandra), message brokers (e.g. Kafka), and caches (e.g. Redis).
- Experience with containerization technologies, such as Docker or Kubernetes.
- Led the response and resolution efforts for multiple major, large-scale incidents.

Bonus:
- Familiarity with machine learning infrastructure like inference engines (e.g. TorchServe, Triton, vLLM), vector stores (e.g. LanceDB, FAISS), and feature stores (e.g. Feast).
- Understanding of ML model training pipelines and model internals.
- Experience with Recommender Systems, Search, Autocomplete and Ads ML is a plus.
- Previous experience with Akka, Erlang, Elixir or Go.
- Proficient in data-driven analysis of complex A/B testing results.

About Tubi:
Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers a large collection of Hollywood movies and TV shows, thousands of creator-led stories and hundreds of Tubi Originals. Headquartered in San Francisco and founded in 2014, Tubi is part of Tubi Media Group, a division of Fox Corporation.

#LI-Hybrid

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, gender identity, disability, protected veteran status, or any other characteristic protected by law. We will consider for employment qualified applicants with criminal histories consistent with applicable law. Disclosures and additional program details are available upon request.

This job description is intended to describe the general nature and level of work performed by employees assigned to this position and is not intended to be an exhaustive list of all duties and responsibilities.

