Product Engineer - Training Platform

Baseten

ABOUT BASETEN

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.

THE ROLE

We’re looking for a customer-obsessed software engineer to come ship with us. You’ll own features like multi-node training and products like serverless reinforcement learning (RL) from conception to MVP (and from MVP to GA!). You’ll work through the stack, architecting solutions from API and UI down to our infrastructure layer. You’ll fine-tune models yourself to develop an understanding of user workflows. You’ll work closely with research engineers leveraging state-of-the-art training techniques to build experiences that accelerate model development and solve real pain points. If you’re excited to dive deep into training, let’s talk!

THE PRODUCT

Take a look at what we’ve built so far:

  • Overview of the product so far
  • Training docs overview
  • Story of the Training product
  • Research we've done

EXAMPLE INITIATIVES

  • Checkpointing Pipeline: Our checkpointing pipeline starts with automated checkpointing, a feature that ensures that versions of models created during training are automatically backed up to the cloud. Users can then deploy checkpoints seamlessly into inference servers, with point-and-click integrations into inference frameworks like vLLM and Baseten’s Inference Stack. This enables customers to quickly evaluate the performance of their checkpoints with real traffic.
  • Multinode training: Multinode training lets customers run training jobs across multiple compute nodes, making it possible to train large models like GLM 4.7 and DeepSeek. We’ve built deeply at the Kubernetes layer to ensure that scheduling, startup, inter-node communication, and shutdown happen seamlessly under the hood and as the user expects.
  • Training DX: Customers come to train on Baseten because it helps them get to value fast. To do this, we ensure that the features we ship aren’t just fast, but are easy to iterate with. We enhanced Baseten’s metrics from pod-level GPU summaries to per-GPU and per-node breakdowns. We’ve built a CLI experience that caters to terminal users, and UI experiences that enable users to seamlessly manage their training jobs.

RESPONSIBILITIES

  • Iterate like crazy
  • Design ergonomic APIs and abstractions to model complex resources and lifecycles
  • Work throughout the stack (API layer, backend and database implementation, infra layer; frontend is a plus) to implement features.
  • Fine-tune and deploy models to develop intuition around training workflows.
  • Partner closely with model developers and world-class research engineers to understand the requirements and pain points of post-training workflows.
  • Drive long-term improvements to the reliability of systems and the velocity of development
  • Fix bugs & resolve customer issues with urgency

REQUIREMENTS

  • 5+ years of experience building software applications
  • Deep knowledge of the web stack, databases, and distributed systems
  • Experience developing developer tooling or infrastructure products for external or internal users.
  • Good taste in product, particularly developer-oriented tools
  • Interest in ML/AI infrastructure and willingness to learn
  • Driven by high agency and ownership
  • Strong communication skills with the ability to bridge technical depth and business needs

NICE TO HAVE

  • Experience launching features and products through different release cycles (MVP, Beta, GA, etc.)
  • Experience with model development methods and paradigms, like Supervised Fine-Tuning, Reinforcement Learning, Synthetic Data Generation, LoRA, Full Finetunes, etc.
  • Familiarity or experience with the open source training stack and frameworks (NCCL, PyTorch, Megatron, NemoRL, VeRL, Axolotl, HF Trainer) and distributed training techniques (FSDP, DeepSpeed).
  • Experience developing AI products, tooling, or agents
  • Frontend fluency

BENEFITS

  • Competitive compensation, including meaningful equity.
  • 100% coverage of medical, dental, and vision insurance for employees and dependents
  • Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
  • Paid parental leave
  • Company-facilitated 401(k)
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

Vacancy posted 2 days ago