Applied Research - Evals & Data

$150k - $300k

Prime Intellect

Be Your Own Lab

Prime Intellect builds the infrastructure that frontier AI labs build internally, and makes it available to everyone. Our platform, Lab, unifies environments, evaluations, sandboxes, and high-performance training into a single full-stack system for post-training at frontier scale, from RL and SFT to tool use, agent workflows, and deployment. We validate everything by using it ourselves, training open state-of-the-art models on the same stack we put in your hands. We're looking for people who want to build at the intersection of frontier research and real infrastructure.

We recently raised $15M in funding ($20M raised in total), led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.

Role Impact

This is a customer-facing role at the intersection of cutting-edge RL/post-training methods, applied data, and agent systems. You'll directly shape how advanced models are aligned, evaluated, deployed, and used in the real world by:

Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads—workflow automation, reasoning-intensive tasks, and decision-making at scale. Working with applied data from real deployments to continuously refine policies, improve reasoning, and enhance reliability and safety.
Building Robust Infrastructure: Developing the distributed systems, evaluation pipelines, and coordination frameworks that enable these agents to operate reliably, efficiently, and at massive scale. Building data capture, processing, and versioning workflows for feedback, model traces, and reward signals.
Bridging Customers & Research: Translating customer needs and insights from applied data into clear technical requirements that guide product and research priorities. Collaborating closely with RL and eval teams to ensure real-world signals inform model alignment and reward shaping.
Prototyping in the Field: Rapidly designing and deploying agents, evals, and harnesses alongside customers to validate solutions. Using applied evaluation data to iterate on model performance and discover new capabilities.

Customer-Facing Engineering

Work side-by-side with customers to deeply understand workflows, data sources, and bottlenecks.
Prototype agents, data pipelines, and eval harnesses tailored to real use cases, then hand off hardened systems to core teams.
Translate customer insights and evaluation results into roadmap and research direction.

Post-training & Reinforcement Learning

Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks.
Build evaluation harnesses and verifiers to measure reasoning, robustness, and agentic behavior in real-world workflows.
Integrate applied data collection and analytics into the post-training process to surface regressions, emergent skills, and alignment opportunities.
Prototype multi-agent and memory-augmented systems to expand capabilities for customer-facing solutions.

Agent Development & Infrastructure

Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making.
Extend and integrate with agent frameworks to support evolving feature requests and performance requirements.
Architect and maintain distributed training and inference pipelines, ensuring scalability and cost efficiency.
Develop observability and monitoring (Prometheus, Grafana, tracing) to ensure reliability and performance in production deployments.

Requirements

Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment.
Experience with applied data workflows and evaluation frameworks for large models or agents (e.g., SWE-Bench, HELM, EvalFlow, internal eval pipelines).
Deep expertise in distributed training/inference frameworks (e.g., vLLM, SGLang, Ray, Accelerate).
Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform).
Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL.
Passion for advancing the state-of-the-art in reasoning, measurement, and building practical, agentic AI systems.

What We Offer

Cash Compensation Range of $150k - $300k + equity incentives
Flexible Work (remote or San Francisco)
Visa Sponsorship & relocation support
Professional Development budget
Team Off-sites & conference attendance

Growth Opportunity

You'll join a mission-driven team working at the frontier of open superintelligence infrastructure. In this role, you'll have the opportunity to:

Shape the evolution of agent-driven, data-informed solutions—from research breakthroughs to production systems used by real customers.
Collaborate with leading researchers, engineers, and partners pushing the boundaries of RL, evaluation, and post-training.
Grow with a fast-moving organization where your contributions directly influence both the technical direction and the broader AI ecosystem.

If you're excited to move fast, build boldly, and help define how agentic AI is developed and deployed, we'd love to hear from you.

Ready to build the open superintelligence infrastructure of tomorrow? Apply now to help us make powerful, open AGI accessible to everyone.

Vacancy posted 14 days ago
