Applied Research - Evals & Data
$150k - $300k
Prime Intellect
Be Your Own Lab
Prime Intellect builds the infrastructure that frontier AI labs build internally, and makes it available to everyone. Our platform, Lab, unifies environments, evaluations, sandboxes, and high-performance training into a single full-stack system for post-training at frontier scale, from RL and SFT to tool use, agent workflows, and deployment. We validate everything by using it ourselves, training open state-of-the-art models on the same stack we put in your hands. We're looking for people who want to build at the intersection of frontier research and real infrastructure.
We recently raised $15mm in funding (total of $20mm raised) led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.
Role Impact
This is a customer-facing role at the intersection of cutting-edge RL/post-training methods, applied data, and agent systems. You'll have a direct impact on shaping how advanced models are aligned, evaluated, deployed, and used in the real world by:
- Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads—workflow automation, reasoning-intensive tasks, and decision-making at scale. Working with applied data from real deployments to continuously refine policies, improve reasoning, and enhance reliability and safety.
- Building Robust Infrastructure: Developing the distributed systems, evaluation pipelines, and coordination frameworks that enable these agents to operate reliably, efficiently, and at massive scale. Building data capture, processing, and versioning workflows for feedback, model traces, and reward signals.
- Bridging Customers & Research: Translating customer needs and insights from applied data into clear technical requirements that guide product and research priorities. Collaborating closely with RL and eval teams to ensure real-world signals inform model alignment and reward shaping.
- Prototyping in the Field: Rapidly designing and deploying agents, evals, and harnesses alongside customers to validate solutions. Using applied evaluation data to iterate on model performance and discover new capabilities.
Customer-Facing Engineering
- Work side-by-side with customers to deeply understand workflows, data sources, and bottlenecks.
- Prototype agents, data pipelines, and eval harnesses tailored to real use cases, then hand off hardened systems to core teams.
- Translate customer insights and evaluation results into roadmap and research direction.
Post-training & Reinforcement Learning
- Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks.
- Build evaluation harnesses and verifiers to measure reasoning, robustness, and agentic behavior in real-world workflows.
- Integrate applied data collection and analytics into the post-training process to surface regressions, emergent skills, and alignment opportunities.
- Prototype multi-agent and memory-augmented systems to expand capabilities for customer-facing solutions.
Agent Development & Infrastructure
- Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making.
- Extend and integrate with agent frameworks to support evolving feature requests and performance requirements.
- Architect and maintain distributed training and inference pipelines, ensuring scalability and cost efficiency.
- Develop observability and monitoring (Prometheus, Grafana, tracing) to ensure reliability and performance in production deployments.
Requirements
- Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment.
- Experience with applied data workflows and evaluation frameworks for large models or agents (e.g., SWE-Bench, HELM, EvalFlow, internal eval pipelines).
- Deep expertise in distributed training/inference frameworks (e.g., vLLM, SGLang, Ray, Accelerate).
- Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform).
- Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL.
- Passion for advancing the state-of-the-art in reasoning, measurement, and building practical, agentic AI systems.
What We Offer
- Cash Compensation Range of $150k - $300k + equity incentives
- Flexible Work (remote or San Francisco)
- Visa Sponsorship & relocation support
- Professional Development budget
- Team Off-sites & conference attendance
Growth Opportunity
You'll join a mission-driven team working at the frontier of open superintelligence infrastructure. In this role, you'll have the opportunity to:
- Shape the evolution of agent-driven, data-informed solutions—from research breakthroughs to production systems used by real customers.
- Collaborate with leading researchers, engineers, and partners pushing the boundaries of RL, evaluation, and post-training.
- Grow with a fast-moving organization where your contributions directly influence both the technical direction and the broader AI ecosystem.
If you're excited to move fast, build boldly, and help define how agentic AI is developed and deployed, we'd love to hear from you.
Ready to build the open superintelligence infrastructure of tomorrow? Apply now to help us make powerful, open AGI accessible to everyone.



