Applied Research - Evals & Data
$150k - $300k
Prime Intellect
Be Your Own Lab
Prime Intellect builds the infrastructure that frontier AI labs build internally, and makes it available to everyone. Our platform, Lab, unifies environments, evaluations, sandboxes, and high-performance training into a single full-stack system for post-training at frontier scale, from RL and SFT to tool use, agent workflows, and deployment. We validate everything by using it ourselves, training open state-of-the-art models on the same stack we put in your hands. We're looking for people who want to build at the intersection of frontier research and real infrastructure.
We recently raised $15M in funding ($20M raised in total), led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.
Role Impact
This is a customer-facing role at the intersection of cutting-edge RL/post-training methods, applied data, and agent systems. You'll have a direct impact on shaping how advanced models are aligned, evaluated, deployed, and used in the real world by:
- Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads—workflow automation, reasoning-intensive tasks, and decision-making at scale. Working with applied data from real deployments to continuously refine policies, improve reasoning, and enhance reliability and safety.
- Building Robust Infrastructure: Developing the distributed systems, evaluation pipelines, and coordination frameworks that enable these agents to operate reliably, efficiently, and at massive scale. Building data capture, processing, and versioning workflows for feedback, model traces, and reward signals.
- Bridging Customers & Research: Translating customer needs and insights from applied data into clear technical requirements that guide product and research priorities. Collaborating closely with RL and eval teams to ensure real-world signals inform model alignment and reward shaping.
- Prototyping in the Field: Rapidly designing and deploying agents, evals, and harnesses alongside customers to validate solutions. Using applied evaluation data to iterate on model performance and discover new capabilities.
Customer-Facing Engineering
- Work side-by-side with customers to deeply understand workflows, data sources, and bottlenecks.
- Prototype agents, data pipelines, and eval harnesses tailored to real use cases, then hand off hardened systems to core teams.
- Translate customer insights and evaluation results into roadmap and research direction.
Post-training & Reinforcement Learning
- Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks.
- Build evaluation harnesses and verifiers to measure reasoning, robustness, and agentic behavior in real-world workflows.
- Integrate applied data collection and analytics into the post-training process to surface regressions, emergent skills, and alignment opportunities.
- Prototype multi-agent and memory-augmented systems to expand capabilities for customer-facing solutions.
Agent Development & Infrastructure
- Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making.
- Extend and integrate with agent frameworks to support evolving feature requests and performance requirements.
- Architect and maintain distributed training and inference pipelines, ensuring scalability and cost efficiency.
- Develop observability and monitoring (Prometheus, Grafana, tracing) to ensure reliability and performance in production deployments.
Requirements
- Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment.
- Experience with applied data workflows and evaluation frameworks for large models or agents (e.g., SWE-Bench, HELM, EvalFlow, internal eval pipelines).
- Deep expertise in distributed training/inference frameworks (e.g., vLLM, SGLang, Ray, Accelerate).
- Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform).
- Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL.
- Passion for advancing the state-of-the-art in reasoning, measurement, and building practical, agentic AI systems.
What We Offer
- Cash Compensation Range of $150k - $300k, plus equity incentives
- Flexible Work (remote or San Francisco)
- Visa Sponsorship & relocation support
- Professional Development budget
- Team Off-sites & conference attendance
Growth Opportunity
You'll join a mission-driven team working at the frontier of open superintelligence infrastructure. In this role, you'll have the opportunity to:
- Shape the evolution of agent-driven, data-informed solutions—from research breakthroughs to production systems used by real customers.
- Collaborate with leading researchers, engineers, and partners pushing the boundaries of RL, evaluation, and post-training.
- Grow with a fast-moving organization where your contributions directly influence both the technical direction and the broader AI ecosystem.
If you're excited to move fast, build boldly, and help define how agentic AI is developed and deployed, we'd love to hear from you.
Ready to build the open superintelligence infrastructure of tomorrow? Apply now to help us make powerful, open AGI accessible to everyone.


