Full-Stack Software Engineer, Reinforcement Learning
$300k - $405k | Anthropic
San Francisco, CA | New York City, NY
About Anthropic
Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the Role
As a Full-Stack Software Engineer in RL, you'll build the platforms, tools, and interfaces that power environment creation, data collection, and training observability. The quality of Claude's next generation depends on the quality of the data we train it on — and the systems you build are what make that data possible.
You'll own product surfaces end-to-end — from backend services and APIs to the web UIs that researchers, external vendors, and thousands of data labelers use every day. You don't need a background in ML research. What matters is that you can take an ambiguous, high-stakes problem and ship a polished, reliable product against it, fast.
This team moves very quickly. Claude writes a lot of the code we commit, which means the bottleneck isn't typing — it's judgment, taste, and the ability to react to what researchers need next. You'll iterate on data collection strategies to distill the knowledge of thousands of human experts around the world into our models, and you'll do it in a loop that closes in hours and days, not quarters or months.
Anthropic's Reinforcement Learning organization leads the research and development that trains Claude to be capable, reliable, and safe. We've contributed to every Claude model, with significant impact on the autonomy and coding capabilities of our most advanced models. Our work spans teaching models to use computers effectively, advancing code generation through RL, pioneering fundamental RL research for large language models, and building the scalable training methodologies behind our frontier production models.
The RL org is organized around four goals: solving the science of long-horizon tasks and continual learning, scaling RL data and environments to be comprehensive and diverse, automating software engineering end-to-end, and training the frontier production model. Our engineering teams build the environments, evaluation systems, data pipelines, and tooling that make all of this possible — from realistic agentic training environments and scalable code data generation to human data collection platforms and production training operations.
What You'll Do
- Build and extend web platforms for RL environment creation, management, and quality review — including environment configuration, versioning, and validation workflows
- Develop vendor-facing interfaces and tooling that let external partners create, submit, and iterate on training environments with minimal friction
- Design and implement platforms for human data collection at scale, including labeling workflows, quality assurance systems, and feedback mechanisms that surface reward signal integrity issues early
- Build evaluation dashboards and observability UIs that give researchers real-time insight into environment quality, training run health, and reward hacking
- Create backend services and APIs that connect environment authoring tools, data collection systems, and RL training infrastructure
- Build and expand scalable code data generation pipelines, producing diverse programming tasks with robust reward signals across languages and difficulty levels
- Develop onboarding automation and documentation tooling so new vendors and internal users ramp up in hours, not weeks
- Partner closely with RL researchers, data operations, and vendor management to translate ambiguous requirements into well-scoped, well-designed products
You May Be a Good Fit If You
- Have strong software engineering fundamentals and real full-stack range — you're comfortable owning a surface from database schema to frontend
- Are proficient in Python and a modern web stack (React, TypeScript, or similar)
- Have a track record of shipping systems that solved a hard problem, not just shipped on time — e.g. you built the thing that made your team 10x faster, or the internal tool nobody thought was possible
- Operate with high agency: you identify what needs to be done and drive it forward without waiting for a ticket
- Have found yourself wondering "why isn't this moving faster?" in previous roles — and then have done something about it
- Care about UX and can build interfaces that are intuitive for both technical researchers and non-technical labelers
- Communicate clearly with researchers, operations teams, and engineers, and can turn vague asks into well-scoped work
- Thrive in a fast-moving environment where priorities shift, Claude is your pair programmer, and the next problem is often one nobody has solved before
- Care about Anthropic's mission to build safe, beneficial AI and want your work to contribute directly to it
Strong Candidates May Also Have
- Built data collection, labeling, or annotation platforms — ideally ones that had to scale across many vendors or many task types
- Background building multi-tenant platforms with role-based access, audit trails, and vendor management workflows
- Experience with cloud infrastructure (GCP or AWS), Docker, and CI/CD pipelines
- Familiarity with LLM training, fine-tuning, or evaluation workflows
- Experience with async Python (Trio, asyncio) or high-throughput API design
- Background in dashboards, monitoring, or observability tooling
- Experience working directly with external vendors or partners on technical integrations
- A background that isn't a straight line — e.g. math or physics into SWE, competitive programming, research into engineering, or a side project that outgrew its scope
Representative Projects
- Building a unified platform for human data collection that integrates labeling workflows, vendor management, and QA for complex agentic tasks
- Developing vendor onboarding automation that handles Docker registry access, API token management, and environment validation
- Creating evaluation and observability dashboards that catch reward hacks, measure environment difficulty, and give real-time feedback during production training
- Building environment quality review workflows that let researchers browse, grade, and provide feedback on training environments
- Developing automated environment quality pipelines that validate correctness and difficulty calibration before environments hit production training
- Building internal tools for browsing and analyzing training run results, environment statistics, and data collection progress
The annual compensation range for this role is listed below. For sales roles, the range provided is the role's On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.
Annual Salary:
$300,000 - $405,000 USD
Logistics
Minimum education: Bachelor's degree or an equivalent combination of education, training, and/or experience
Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience
Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position
Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.
We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.
Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.

