Research Engineer, Pretraining Scaling

Anthropic

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the Role:

Anthropic's ML Performance and Scaling team trains our production pretrained models, work that directly shapes the company's future and our mission to build safe, beneficial AI systems. As a Research Engineer on this team, you'll ensure our frontier models train reliably, efficiently, and at scale. This is demanding, high-impact work that requires both deep technical expertise and a genuine passion for the craft of large-scale ML systems.

This role lives at the boundary between research and engineering. You'll work across our entire production training stack: performance optimization, hardware debugging, experimental design, and launch coordination. During launches, the team works in tight lockstep, responding to production issues that can't wait for tomorrow.

Responsibilities:

Own critical aspects of our production pretraining pipeline, including model operations, performance optimization, observability, and reliability
Debug and resolve complex issues across the full stack—from hardware errors and networking to training dynamics and evaluation infrastructure
Design and run experiments to improve training efficiency, reduce step time, increase uptime, and enhance model performance
Respond to on-call incidents during model launches, diagnosing problems quickly and coordinating solutions across teams
Build and maintain production logging, monitoring dashboards, and evaluation infrastructure
Add new capabilities to the training codebase, such as long context support or novel architectures
Collaborate closely with teammates across SF and London, as well as with Tokens, Architectures, and Systems teams
Contribute to the team's institutional knowledge by documenting systems, debugging approaches, and lessons learned

You May Be a Good Fit If You:

Have hands-on experience training large language models, or deep expertise with JAX, TPU, PyTorch, or large-scale distributed systems
Genuinely enjoy both research and engineering work—you'd describe your ideal split as roughly 50/50 rather than heavily weighted toward one or the other
Are excited about being on-call for production systems, working long days during launches, and solving hard problems under pressure
Thrive when working on whatever is most impactful, even if that changes day-to-day based on what the production model needs
Excel at debugging complex, ambiguous problems across multiple layers of the stack
Communicate clearly and collaborate effectively, especially when coordinating across time zones or during high-stress incidents
Are passionate about the work itself and want to refine your craft as a research engineer
Care about the societal impacts of AI and responsible scaling

Strong Candidates May Also Have:

Previous experience training LLMs or working extensively with JAX/TPU, PyTorch, or other ML frameworks at scale
Contributions to open-source LLM frameworks (e.g., open_lm, llm-foundry, mesh-transformer-jax)
Published research on model training, scaling laws, or ML systems
Experience with production ML systems, observability tools, or evaluation infrastructure
Background as a systems engineer, quant, or in other roles requiring both technical depth and operational excellence

What Makes This Role Unique:

This is not a typical research engineering role. The work is highly operational—you'll be deeply involved in keeping our production models training smoothly, which means being responsive to incidents, flexible about priorities, and comfortable with uncertainty. During launches, the team often works extended hours and may need to respond to issues on evenings and weekends.

However, this operational intensity comes with extraordinary learning opportunities. You'll gain hands-on experience with some of the largest, most sophisticated training runs in the industry. You'll work alongside world-class researchers and engineers, and the institutional knowledge you build will compound in ways that can't be easily transferred. For people who thrive on this type of work, it's uniquely rewarding.

We're building a close-knit team of people who genuinely care about doing excellent work together. If you're someone who wants to be part of training the models that will define the future of AI—and you're excited about the full reality of what that entails—we'd love to hear from you.

Location: This role requires working in-office 5 days per week in San Francisco.

Deadline to apply: None. Applications will be reviewed on a rolling basis.

The expected base compensation for this position is below. Our total compensation package for full-time employees includes equity, benefits, and may include incentive compensation.

Annual Salary:

$315,000–$560,000 USD

Logistics

Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience.

Location-based hybrid policy:
Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.

Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

How we're different

We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.

Come work with us!

Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.

Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.

Vacancy posted more than 2 months ago
