Research Engineer, AI Capabilities & Evaluations

The Consulting Solutions

The Consulting Solutions is seeking a Research Engineer / Scientist to join the North Stars team. In this role, you will work on enhancing AI-enabled experiences, focusing on improving model capability and performance. You will pursue a comprehensive research agenda while collaborating closely with other teams and building evaluations to track improvements. This position offers a hybrid work model of three days in-office per week and includes relocation assistance for new employees.

Vacancy posted 2 days ago
Similar jobs that could be interesting for you
Based on the Research Engineer, AI Capabilities & Evaluations in San Francisco, CA vacancy
  •  ...looking for a skilled professional to build evaluation harnesses that ensure models and agents...  ..., and develop tooling to assist research and product teams. The position emphasizes...  ...performance metrics to improve AI capabilities. You'll need to have a firm grasp on non... 
    Suggested
    Relocation package

    AGI, Inc.

    San Francisco, CA
    5 days ago
  • $315k

    We are looking for Research Engineers to build “gold standard” evaluations for catastrophic risks, in order to understand what AI Safety Level (ASL) to assign to models. Research leads...  ...RSP). The policy defines a series of capability thresholds - AI Safety Levels (ASLs)... 
    Suggested
    Currently hiring
    Work at office
    Immediate start
    Home office
    Visa sponsorship
    Relocation package

    Anthropic

    San Francisco, CA
    5 days ago
  • Refresh AI is seeking a Research Engineer in San Francisco to push the boundaries of benchmarking technology. You will build benchmarks that labs use for evaluating coding abilities and computer-use capability. Your role will require expertise in reinforcement learning... 
    Suggested
    Full time

    Refresh AI

    San Francisco, CA
    5 days ago
  • AI Chopping Block, Inc. is searching for a dedicated professional to help build the evaluation harness necessary for our advanced AGI models. You will audit existing processes,...  ...into actionable strategies and elevate our research standards, leading to impactful AI... 
    Suggested

    AI Chopping Block, Inc.

    San Francisco, CA
    4 days ago
  •  ...Francisco, is seeking a dedicated professional for a full-time role to evaluate agent models and develop practical assessment rubrics. This...  ...to aid decision-making. This role is pivotal to ensure product quality and enhance the research strategy.
    Suggested
    Full time
    Relocation package

    AGI Inc

    San Francisco, CA
    1 day ago
  •  ...training and scaling security AI agents to discover zero-days...  ...'re seeking an experienced Research Engineer to join our effort in...  ...We are building a technology capable of finding the next Log4J at...  ...intuition, experience in model evaluation, and benchmarks. Reinforcement... 
    Full time
    Work at office

    DepthFirst

    San Francisco, CA
    4 days ago
  • Drata is seeking a Senior Applied Research Engineer to enhance the quality of AI systems through rigorous evaluation and experimentation. This role emphasizes applied research, focusing on information retrieval and reasoning strategies. The ideal candidate will bring 5... 

    jobr.pro

    San Francisco, CA
    5 days ago
  • $380k

     ...The Future of Computing Research team is an applied research...  ...methods, models, and evaluation frameworks that support...  ...frontier of multimodal AI, helping turn emerging model capabilities into product experiences...  ...closely across research, engineering, design, product, and safety... 
    Work at office
    Immediate start
    Relocation package

    OpenAI

    San Francisco, CA
    4 days ago
  •  ...interpretable, and steerable AI systems. We want AI to be...  ...growing group of committed researchers, engineers, policy experts, and business...  ...of training environments for capable and safe agentic AI. This role...  ...of the art, and building evaluations that measure genuine capability... 
    Work at office
    Remote work
    Visa sponsorship
    Shift work

    Menlo Ventures

    San Francisco, CA
    1 day ago
  •  ...Analysis is a security research lab focused on adversarial simulations, evaluations, and runtime...  ...work across research, engineering, and product. About the...  ...models for adversarial capabilities using reinforcement learning...  ...build deep context in AI security. You are results... 

    General Analysis

    San Francisco, CA
    1 day ago
  • $350k

     ...interpretable, and steerable AI systems. We want AI to be...  ...growing group of committed researchers, engineers, policy experts, and business...  ...on the autonomy and coding capabilities of Claude Sonnet 4.6 and...  ...implement RL environments and evaluations. Conduct experiments and... 
    Work at office
    Visa sponsorship
    Flexible hours

    Menlo Ventures

    San Francisco, CA
    1 day ago
  • $315k

    As a Research Engineer or Research Scientist in Applied Finetuning, you will...  ...to the public via Claude.AI and our API. In this role, you...  ...on data mixes, design evaluations, and improve our production...  ...that tests Claude’s reasoning capabilities Collaborate with a research... 
    Work at office
    Home office
    Visa sponsorship
    Relocation package

    Anthropic

    San Francisco, CA
    3 days ago
  • Research Engineer, Benchmarking. Engineering, San Francisco, Full-time. Build the...  ...coding and computer-use capability. Translate expert workflows into rigorous, verifiable evaluations, run them against frontier models...  ...fine-tuning at a high level.

    Refresh AI

    San Francisco, CA
    5 days ago
  •  ...currently on-site) Industry: AI infrastructure /...  ...Learning (RL) training data & evaluations Compensation: Competitive (range...  ...Opportunity Our partner is hiring a Research Engineer to help scale the quality...  ...with modern AI tooling and LLM capabilities Equal Opportunity &... 
    Remote work

    talentpluto

    San Francisco, CA
    1 day ago
  •  ...Archive Human Archive is a research lab backed by Y...  ...function gains in model capability. The deployment of capable...  ...As a Research Engineer, you’ll work on multimodal...  ...research for embodied AI and robotics. This role...  ...design experiments, evaluate new sensing stacks, and... 
    Shift work

    Human Archive

    San Francisco, CA
    4 days ago
  • $300k - $400k

     ...leading conversational AI platform empowering...  .... About the Team The Research team develops the model...  ...prompting, orchestration, and evaluation in order to make our...  ...As a Senior Research Engineer, you’ll be responsible...  ...agent’s reliability, capability, and efficiency... 
    Work at office

    Decagon

    San Francisco, CA
    2 days ago
  •  ...building state-of-the-art AI systems that can write code...  ...reasoning, and deploy these capabilities in real-world products such...  ...coding. We operate across research, engineering, product, and infrastructure...  ...model training, alignment, and evaluation. Hunt down and address... 
    Work at office
    Relocation package

    OpenAI

    San Francisco, CA
    1 day ago
  • At Capably, we’re building technology that helps businesses operate...  ...seamless automation. As a Research Engineer at Capably, you’ll help...  ...developing the models, systems, and evaluation approaches that make agentic...  ...what today’s enterprise AI tools can reliably deliver.... 

    Capably

    San Francisco, CA
    5 days ago
  •  ...the Team The Privacy Engineering Team at OpenAI is committed...  ...engineering and research partners with the necessary...  ...and efficiency of our AI systems. You will help...  ...internal libraries, evaluation suites, and...  ...pushing the frontiers of capability. About OpenAI OpenAI... 
    Relocation package

    OpenAI

    San Francisco, CA
    4 days ago
  • $190k - $320k

    Research Engineer - Computer Vision & Machine Learning Want to build vision...  .... Vision is a core capability. Your work will directly influence...  ...architectures, training pipelines, evaluation frameworks, and inference...  ...vision systems that connect AI to the physical world in... 

    Trades Workforce Solutions

    San Francisco, CA
    5 days ago
  • $295k

    Research Engineer / Research Scientist - Personal AGI, Proactivity Post-training...  ...technical foundations for AI that can anticipate what...  ...personalization and agentic capabilities. Our team works on reinforcement...  ...learning, dataset creation, evaluations, and other post-training... 
    Work at office
    Relocation package
    Shift work

    OpenAI

    San Francisco, CA
    1 day ago
  • The Role As an Applied Research Engineer, you will serve as the crucial link...  ...in enabling agentic capabilities across the Hebbia product suite...  ...learning systems, and LLM evaluation; experience building with foundation...  ...products. Frequent user of AI products, especially during... 

    Gravity Engineering Services Pvt Ltd.

    San Francisco, CA
    2 days ago
  • $350k

     ...interpretable, and steerable AI systems. We want AI to be...  ...growing group of committed researchers, engineers, policy experts, and business...  ...training environments and evaluations that make Claude effective...  ...processes for Knowledge Work capabilities, including the process used... 
    Visa sponsorship
    Shift work

    United States Digital Space LLC

    San Francisco, CA
    20 hours ago
  • $350k

     ...reliable, interpretable, and steerable AI systems. We want AI to be safe and...  ...quickly growing group of committed researchers, engineers, policy experts, and business...  ...values do our systems have?), and evaluating novel AI capabilities as they arise. We develop privacy-preserving... 
    Full time
    Contract work
    For contractors
    For subcontractor
    Work at office
    Visa sponsorship
    Flexible hours

    Menlo Ventures

    San Francisco, CA
    2 days ago
  •  ...Turing is the world’s leading research accelerator for frontier AI labs and a trusted...  ...create RL environments to evaluate and improve our customers...  ...vary depending on the model capability being evaluated /...  ...Environments for Software Engineering / coding agents UI-Environments... 
    For contractors
    Flexible hours

    Cerebras

    San Francisco, CA
    1 day ago
  • $160k - $300k

    Hebbia is the AI platform for investors and bankers...  ...and retrieval capabilities - unlocking meaningful...  ...and deep, multi-source research. We’ve built our own agentic...  ...LLM inference engine - a distributed, asynchronous...  ...systems, and LLM evaluation; experience building with... 
    Contract work
    For contractors
    For subcontractor
    Work at office

    Hebbia

    San Francisco, CA
    4 days ago
  • About the Role You’ll work as a Research Engineer / Scientist on the North...  ...bring the next generation of AI‑enabled experiences to all of humanity by closing the capability overhang between power users...  ...these insights into robust evaluations, training data, reward signals... 
    Work at office
    Relocation package

    The Consulting Solutions

    San Francisco, CA
    2 days ago
  •  ...Are We are an applied AI lab building end-to-end...  ...the first AI software engineer, and Windsurf, an AI-...  ...former founders, and researchers from the frontier of AI...  .... Every training run, evaluation loop, and experimental...  ...more about demonstrated capability than credentials. A... 

    Cognition

    San Francisco, CA
    3 days ago
  • $280k

     ...interpretable, and steerable AI systems. We want AI to be...  ...growing group of committed researchers, engineers, policy experts, and business...  ...the context of human-level capabilities. You could describe...  ...Build tooling to efficiently evaluate the effectiveness of novel... 
    Contract work
    For contractors
    For subcontractor
    Work at office
    Relocation
    Visa sponsorship
    Work visa
    Flexible hours

    Menlo Ventures

    San Francisco, CA
    4 days ago
  • $315k

     ...interpretable, and steerable AI systems. We want AI to be...  ...growing group of committed researchers, engineers, policy experts, and business...  ...processes to enhance their capabilities, alignment, and safety. As...  ...for model fine-tuning and evaluation Develop tools to measure and... 
    Work at office
    Visa sponsorship
    Flexible hours

    Menlo Ventures

    San Francisco, CA
    2 days ago
