Software Engineer, ML Infra & Distributed Systems (Staff & Principal)

Tubi TV

Staff Software Engineer, ML Infra & Distributed Systems

About the Role:

As a Staff Software Engineer on the ML Infrastructure team, you will collaborate closely with the Machine Learning and Product teams to build world-class machine learning inference platforms. These platforms power essential services like personalized recommendations, search, and content understanding across Tubi.

A core responsibility is developing and maintaining low-latency ML model serving systems that support Deep Learning, LLM, and Search models. This involves building self-service infrastructure and critical components such as the inference engine, feature store, vector store, and experimentation engine. You will improve the way we deploy and operate our services and may contribute to open-source projects. This role grants architectural freedom to explore new frameworks, lead critical cross-functional projects, and transform the capabilities of our ML and Product teams.

Responsibilities:
  • Design and build scalable, high throughput, and low latency distributed systems using Scala
  • Build reusable components and services that serve various ML applications like Personalization, Search, Ads and Exploration
  • Partner closely with ML engineers to understand their challenges and limitations and develop scalable solutions to address them.
  • Proactively recommend solutions to keep our ML Inference stack state of the art.
  • Take a data driven approach to identifying & optimizing latency, cost, and efficiency of our infra.
  • Lead large scale cross functional refactorings if necessary
  • Mentor other engineers on system design, incident management, interviewing, leveraging LLMs for work, etc.
  • Collaborate with ML, Product, and cross functional engineering teams to define the long term vision and architecture for ML Infrastructure at Tubi.

Your Background:
  • 8+ years of experience designing and building scalable, distributed systems in any modern backend language (e.g., Scala, Java, Python, Go, C++); experience with Scala or JVM based language is a plus.
  • Strong experience with AWS or an equivalent cloud platform
  • Experience building online microservices at scale with low latency serving
  • Experience with both SQL (e.g. Postgres) and NoSQL databases (e.g. Cassandra), message brokers (e.g. Kafka), and caches (e.g. Redis)
  • Experience with containerization technologies, such as Docker or Kubernetes
  • Led the response and resolution efforts for multiple major, large-scale incidents

Bonus:
  • Familiarity with machine learning infrastructure like inference engines (e.g. torchserve, Triton, vLLM), vector stores (e.g. LanceDB, FAISS), feature stores (e.g. Feast)
  • Understanding of ML model training pipelines and model internals. Experience with Recommender Systems, Search, Autocomplete and Ads ML is a plus
  • Previous experience with Akka, Erlang, Elixir or Go
  • Proficient in data-driven analysis of complex A/B testing results

About Tubi:

Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers a large collection of Hollywood movies and TV shows, thousands of creator-led stories and hundreds of Tubi Originals. Headquartered in San Francisco and founded in 2014, Tubi is part of Tubi Media Group, a division of Fox Corporation.

#LI-Hybrid

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, gender identity, disability, protected veteran status, or any other characteristic protected by law. We will consider for employment qualified applicants with criminal histories consistent with applicable law. Disclosures and additional program details are available upon request.

This job description is intended to describe the general nature and level of work performed by employees assigned to this position and is not intended to be an exhaustive list of all duties and responsibilities.

Vacancy posted 4 hours ago