Applied Research - Evals & Data

$150k - $300k

Prime Intellect

Be Your Own Lab

Prime Intellect builds the infrastructure that frontier AI labs build internally, and makes it available to everyone. Our platform, Lab, unifies environments, evaluations, sandboxes, and high-performance training into a single full-stack system for post-training at frontier scale, from RL and SFT to tool use, agent workflows, and deployment. We validate everything by using it ourselves, training open state-of-the-art models on the same stack we put in your hands. We're looking for people who want to build at the intersection of frontier research and real infrastructure.

We recently raised $15mm in funding (total of $20mm raised) led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.

Role Impact

This is a customer-facing role at the intersection of cutting-edge RL/post-training methods, applied data, and agent systems. You'll have a direct impact on shaping how advanced models are aligned, evaluated, deployed, and used in the real world by:

  • Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads—workflow automation, reasoning-intensive tasks, and decision-making at scale. Working with applied data from real deployments to continuously refine policies, improve reasoning, and enhance reliability and safety.
  • Building Robust Infrastructure: Developing the distributed systems, evaluation pipelines, and coordination frameworks that enable these agents to operate reliably, efficiently, and at massive scale. Building data capture, processing, and versioning workflows for feedback, model traces, and reward signals.
  • Bridging Customers & Research: Translating customer needs and insights from applied data into clear technical requirements that guide product and research priorities. Collaborating closely with RL and eval teams to ensure real-world signals inform model alignment and reward shaping.
  • Prototyping in the Field: Rapidly designing and deploying agents, evals, and harnesses alongside customers to validate solutions. Using applied evaluation data to iterate on model performance and discover new capabilities.
Customer-Facing Engineering
  • Work side-by-side with customers to deeply understand workflows, data sources, and bottlenecks.
  • Prototype agents, data pipelines, and eval harnesses tailored to real use cases, then hand off hardened systems to core teams.
  • Translate customer insights and evaluation results into roadmap and research direction.
Post-training & Reinforcement Learning
  • Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks.
  • Build evaluation harnesses and verifiers to measure reasoning, robustness, and agentic behavior in real-world workflows.
  • Integrate applied data collection and analytics into the post-training process to surface regressions, emergent skills, and alignment opportunities.
  • Prototype multi-agent and memory-augmented systems to expand capabilities for customer-facing solutions.
Agent Development & Infrastructure
  • Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making.
  • Extend and integrate with agent frameworks to support evolving feature requests and performance requirements.
  • Architect and maintain distributed training and inference pipelines, ensuring scalability and cost efficiency.
  • Develop observability and monitoring (Prometheus, Grafana, tracing) to ensure reliability and performance in production deployments.
Requirements
  • Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment.
  • Experience with applied data workflows and evaluation frameworks for large models or agents (e.g., SWE-Bench, HELM, EvalFlow, internal eval pipelines).
  • Deep expertise in distributed training/inference frameworks (e.g., vLLM, SGLang, Ray, Accelerate).
  • Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform).
  • Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL.
  • Passion for advancing the state-of-the-art in reasoning, measurement, and building practical, agentic AI systems.
What We Offer
  • Cash compensation range of $150k-$300k plus equity incentives
  • Flexible Work (remote or San Francisco)
  • Visa Sponsorship & relocation support
  • Professional Development budget
  • Team Off-sites & conference attendance
Growth Opportunity

You'll join a mission-driven team working at the frontier of open superintelligence infrastructure. In this role, you'll have the opportunity to:

  • Shape the evolution of agent-driven, data-informed solutions—from research breakthroughs to production systems used by real customers.
  • Collaborate with leading researchers, engineers, and partners pushing the boundaries of RL, evaluation, and post-training.
  • Grow with a fast-moving organization where your contributions directly influence both the technical direction and the broader AI ecosystem.

If you're excited to move fast, build boldly, and help define how agentic AI is developed and deployed, we'd love to hear from you.

Ready to build the open superintelligence infrastructure of tomorrow? Apply now to help us make powerful, open AGI accessible to everyone.

Vacancy posted 14 days ago