Research Engineer, Pretraining Scaling

$315k

Anthropic

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the Role:

Anthropic's ML Performance and Scaling team trains our production pretrained models, work that directly shapes the company's future and our mission to build safe, beneficial AI systems. As a Research Engineer on this team, you'll ensure our frontier models train reliably, efficiently, and at scale. This is demanding, high-impact work that requires both deep technical expertise and a genuine passion for the craft of large-scale ML systems.

This role lives at the boundary between research and engineering. You'll work across our entire production training stack: performance optimization, hardware debugging, experimental design, and launch coordination. During launches, the team works in tight lockstep, responding to production issues that can't wait for tomorrow.

Responsibilities: 

  • Own critical aspects of our production pretraining pipeline, including model operations, performance optimization, observability, and reliability
  • Debug and resolve complex issues across the full stack—from hardware errors and networking to training dynamics and evaluation infrastructure
  • Design and run experiments to improve training efficiency, reduce step time, increase uptime, and enhance model performance
  • Respond to on-call incidents during model launches, diagnosing problems quickly and coordinating solutions across teams
  • Build and maintain production logging, monitoring dashboards, and evaluation infrastructure
  • Add new capabilities to the training codebase, such as long context support or novel architectures
  • Collaborate closely with teammates across SF and London, as well as with Tokens, Architectures, and Systems teams
  • Contribute to the team's institutional knowledge by documenting systems, debugging approaches, and lessons learned

You May Be a Good Fit If You:

  • Have hands-on experience training large language models, or deep expertise with JAX, TPU, PyTorch, or large-scale distributed systems
  • Genuinely enjoy both research and engineering work—you'd describe your ideal split as roughly 50/50 rather than heavily weighted toward one or the other
  • Are excited about being on-call for production systems, working long days during launches, and solving hard problems under pressure
  • Thrive when working on whatever is most impactful, even if that changes day-to-day based on what the production model needs
  • Excel at debugging complex, ambiguous problems across multiple layers of the stack
  • Communicate clearly and collaborate effectively, especially when coordinating across time zones or during high-stress incidents
  • Are passionate about the work itself and want to refine your craft as a research engineer
  • Care about the societal impacts of AI and responsible scaling

Strong Candidates May Also Have: 

  • Previous experience training LLMs or working extensively with JAX/TPU, PyTorch, or other ML frameworks at scale
  • Contributed to open-source LLM frameworks (e.g., open_lm, llm-foundry, mesh-transformer-jax)
  • Published research on model training, scaling laws, or ML systems
  • Experience with production ML systems, observability tools, or evaluation infrastructure
  • Background as a systems engineer, quant, or in other roles requiring both technical depth and operational excellence

What Makes This Role Unique: 

This is not a typical research engineering role. The work is highly operational—you'll be deeply involved in keeping our production models training smoothly, which means being responsive to incidents, flexible about priorities, and comfortable with uncertainty. During launches, the team often works extended hours and may need to respond to issues on evenings and weekends.

However, this operational intensity comes with extraordinary learning opportunities. You'll gain hands-on experience with some of the largest, most sophisticated training runs in the industry. You'll work alongside world-class researchers and engineers, and the institutional knowledge you build will compound in ways that can't be easily transferred. For people who thrive on this type of work, it's uniquely rewarding.

We're building a close-knit team of people who genuinely care about doing excellent work together. If you're someone who wants to be part of training the models that will define the future of AI—and you're excited about the full reality of what that entails—we'd love to hear from you.

Location: This role requires working in-office 5 days per week in San Francisco. 

Deadline to apply: None. Applications will be reviewed on a rolling basis.

The expected base compensation for this position is below. Our total compensation package for full-time employees includes equity, benefits, and may include incentive compensation.

Annual Salary:

$315,000—$560,000 USD

Logistics

Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience.

Location-based hybrid policy:
Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.

Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

How we're different

We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.

Come work with us!

Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.

Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.

Vacancy posted more than 2 months ago