Research Scientist: Post-Training

Seer

Staff / Principal ML Training Systems Engineer

We are building next-generation intelligent systems capable of operating in complex, real-world environments. Our team develops the full stack — from high-performance hardware and distributed systems infrastructure to large-scale multimodal foundation models powering autonomous decision-making.

Backed by significant funding and operating at the intersection of AI, systems engineering, and large-scale compute infrastructure, we are investing heavily in research, infrastructure, and scalable training systems to push the frontier of embodied intelligence.

We are seeking a Staff / Principal ML Training Systems Engineer to lead training systems performance across large-scale multimodal AI workloads. This is a core systems engineering role focused on scalability, efficiency, and correctness at massive GPU scale. Your work will directly impact infrastructure utilization, training throughput, and research iteration speed.

What You’ll Do

Own Training Performance End-to-End

  • Diagnose and optimize performance for large-scale multimodal training workloads involving vision, video, language, sensor data, and sequential decision-making
  • Build systematic performance attribution tooling, including:
    • Step-time decomposition
    • Compute vs communication analysis
    • Input pipeline profiling
    • Scaling curve analysis across cluster sizes
    • Bottleneck identification and prioritization
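
To give candidates a feel for the kind of attribution work listed above, here is a minimal sketch of a step-time decomposition. It is illustrative only: the function name, the phase names, and the timings are hypothetical, and it assumes each phase has already been measured on the critical path of a single training step.

```python
def decompose_step(step_time_s, phases):
    """Return each phase's share of one training step, largest first.

    step_time_s: measured wall-clock time of the step, in seconds.
    phases: dict mapping a (hypothetical) phase name to the seconds it
            spent on the critical path, e.g. data wait, compute,
            exposed (non-overlapped) communication.
    """
    if step_time_s <= 0:
        raise ValueError("step_time_s must be positive")
    accounted = sum(phases.values())
    breakdown = {name: t / step_time_s for name, t in phases.items()}
    # Time the listed phases do not explain (framework overhead,
    # stragglers, measurement gaps) is reported rather than hidden.
    breakdown["unattributed"] = max(0.0, step_time_s - accounted) / step_time_s
    # Sort so the biggest contributor, the first candidate bottleneck, leads.
    return dict(sorted(breakdown.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    shares = decompose_step(
        1.0, {"data_wait": 0.1, "compute": 0.6, "exposed_comm": 0.2}
    )
    for name, share in shares.items():
        print(f"{name:>14}: {share:.0%}")
```

Sorting by share puts the largest contributor first, which is one simple way to prioritize bottlenecks; a production tool would draw its timings from profiler traces rather than hand-entered numbers.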

Drive Efficiency Improvements Across the Stack

Improve distributed training efficiency through:

  • Communication/computation overlap
  • Gradient bucketization
  • Topology-aware workload placement
  • Parallelism optimization strategies

Improve compute efficiency through:

  • Kernel optimization
  • Operator fusion
  • Attention optimization
  • Runtime and framework overhead reduction

Improve memory efficiency through:

  • Activation checkpointing
  • Sequence packing and bucketing
  • Memory fragmentation reduction

Design and Evolve Training Systems

  • Define and optimize data, tensor, pipeline, sharded, and hybrid parallelism strategies
  • Improve execution efficiency through:
    • Communication scheduling and overlap
    • Graph capture and execution optimization
    • Runtime-level improvements
  • Extend and improve internal training frameworks where necessary

Make Performance Observable and Measurable

  • Establish source-of-truth performance metrics including:
    • Step-time breakdowns
    • Model FLOPs utilization (MFU)
    • Throughput and scaling efficiency
  • Build tooling to:
    • Detect bottlenecks quickly
    • Compare scaling behavior across model families and cluster configurations
    • Track performance regressions over time
  • Develop automated benchmarking and regression detection systems
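
As a hedged illustration of two of the metrics named above, the sketch below computes MFU and scaling efficiency. It rests on assumptions that are not part of this posting: the common approximation of about 6 FLOPs per parameter per token for a dense transformer's forward and backward pass, and hypothetical function names, parameter names, and numbers. Multimodal or sparse models would need their own FLOP accounting.

```python
def model_flops_utilization(n_params, tokens_per_step, step_time_s,
                            n_gpus, peak_flops_per_gpu):
    """MFU: achieved model FLOP/s divided by the hardware's peak FLOP/s.

    Assumes ~6 * n_params FLOPs per trained token (dense transformer,
    forward + backward); this is an approximation, not an exact count.
    """
    achieved_flops_per_s = 6.0 * n_params * tokens_per_step / step_time_s
    return achieved_flops_per_s / (n_gpus * peak_flops_per_gpu)


def scaling_efficiency(base_throughput, base_gpus, throughput, gpus):
    """Measured throughput relative to perfect linear scaling from a baseline."""
    ideal_throughput = base_throughput * (gpus / base_gpus)
    return throughput / ideal_throughput


if __name__ == "__main__":
    # Hypothetical numbers chosen only to make the arithmetic easy to follow.
    mfu = model_flops_utilization(
        n_params=1e9, tokens_per_step=1e6, step_time_s=1.0,
        n_gpus=8, peak_flops_per_gpu=1e15,
    )
    eff = scaling_efficiency(base_throughput=100.0, base_gpus=8,
                             throughput=180.0, gpus=16)
    print(f"MFU: {mfu:.2f}, scaling efficiency: {eff:.2f}")
```

Tracking both values per commit and per cluster size is one simple basis for the regression detection mentioned above.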

Partner Closely With Research Teams

  • Collaborate directly with research scientists and ML engineers in a highly integrated environment
  • Translate novel model architectures and research ideas into scalable, production-ready implementations
  • Advise on training tradeoffs involving:
    • Long-horizon sequence modeling
    • Multimodal and variable-length data
    • Evaluation cadence and rollout efficiency

Improve Cluster-Level Efficiency

  • Work with infrastructure and reliability teams to optimize utilization across large distributed workloads
  • Analyze the impact of networking, collectives, and cluster topology on training efficiency
  • Improve topology-aware scheduling and large-scale scaling behavior

What We’re Looking For

  • Proven track record optimizing large-scale distributed ML training systems
  • Deep hands-on experience with modern ML frameworks (PyTorch required; JAX is a plus)
  • Strong understanding of:
    • Data, tensor, and pipeline parallelism
    • FSDP / ZeRO-style sharded training
    • Communication overlap strategies
    • Large-scale GPU cluster scaling behavior
  • Strong systems intuition across compute, communication, and memory bottlenecks
  • Exceptional debugging and performance analysis skills
  • High ownership mindset and comfort operating in fast-moving, highly technical environments

Preferred Experience

  • GPU kernel or compiler-level optimization experience (CUDA, Triton, graph capture, operator fusion)
  • Experience with multimodal or video training involving variable-length sequences and packing strategies
  • Experience building or extending distributed training frameworks and runtimes
  • Familiarity with cluster networking, topology-aware scheduling, and large-scale infrastructure effects

Why This Role Matters

  • Direct impact on research velocity — every efficiency improvement accelerates model development across the organization
  • Opportunity to shape the scalability and performance of next-generation multimodal training systems
  • High-leverage engineering work with compounding impact across all training workloads
  • Small, highly technical team with significant ownership and autonomy

About the Company

We are a research-driven AI company focused on building scalable intelligent systems capable of robust operation in dynamic environments. By combining advances in machine learning, distributed systems, and infrastructure engineering, we aim to push the frontier of large-scale AI systems.

We are committed to building an inclusive and diverse workplace and encourage applicants from all backgrounds to apply.

Vacancy posted 10 hours ago