Software Engineer - Training Infrastructure

The Consensus

ABOUT BASETEN

Baseten powers mission‑critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting‑edge models into production. We’re growing quickly and recently raised our $1.5B Series F, led by Altimeter Capital, Conviction Partners, and Spark Capital. Join us and help build the platform engineers turn to to ship AI products.

THE ROLE

As a Software Engineer on the Training Infrastructure team, you’ll architect and lead development of our training platform, supporting top‑tier research engineers and model developers. You’ll make key technical decisions for the infrastructure enabling developers to deploy, scale, and monitor their workloads with high performance and reliability. You’ll own scheduling, storage, networking, reliability, and observability of technical systems in the training stack.

EXAMPLE INITIATIVES

Take a look at what we’ve built so far:
  • Overview of the product so far
  • Training docs overview
  • Story of the Training product
  • Research we've done

RESPONSIBILITIES

  • Design and architect scalable infrastructure systems for our ML training platform (e.g. scheduling, storage, and networking)
  • Partner closely with developers and research engineers to translate complex training requirements into technical solutions
  • Design and architect a global training scheduler
  • Design and architect reinforcement learning systems and continuous learning pipelines
  • Drive long‑term improvements to system reliability and development velocity
  • Partner closely with SRE and Capacity teams to unlock state-of-the-art training infrastructure
  • Make critical architectural decisions balancing performance with system reliability
  • Lead technical discussions and mentor junior engineers on infrastructure best practices
  • Contribute to long‑term technical strategy and infrastructure roadmap

REQUIREMENTS

  • Bachelor’s degree or higher in Computer Science or related field
  • Proficiency in Go, with Python experience a plus
  • Deep expertise with Kubernetes in production environments
  • Extensive experience with major cloud providers (AWS, GCP); experience with neo‑cloud providers (Crusoe, DigitalOcean, Nebius) a plus
  • Advanced understanding of distributed systems concepts and performance tuning
  • Proven experience designing observability systems
  • Experience with ML/AI workloads and MLOps platforms highly valued

NICE TO HAVE

  • Experience with distributed storage systems
  • Experience with workload orchestration platforms like Temporal or Airflow
  • Familiarity or experience with the open source training stack and frameworks (NCCL, PyTorch, Megatron, NemoRL, VeRL, Axolotl, HF Trainer) and distributed training techniques (FSDP, DeepSpeed)
  • Experience developing AI products, tooling, or agents

BENEFITS

  • Competitive compensation, including meaningful equity
  • 100% coverage of medical, dental, and vision insurance for employees and dependents
  • Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day)
  • Paid parental leave
  • Fertility and family‑building stipend through Carrot
  • Company‑facilitated 401(k)
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).

Vacancy posted 1 day ago