Research Engineer, AI Capabilities & Evaluations
The Consulting Solutions
The Consulting Solutions is seeking a Research Engineer / Scientist to join the North Stars team. In this role, you will work on enhancing AI-enabled experiences, focusing on improving model capability and performance. You will pursue a comprehensive research agenda while collaborating closely with other teams and building evaluations to track improvements. This position offers a hybrid work model of three days in-office per week and includes relocation assistance for new employees.
- ...looking for a skilled professional to build evaluation harnesses that ensure models and agents... ..., and develop tooling to assist research and product teams. The position emphasizes... ...performance metrics to improve AI capabilities. You'll need to have a firm grasp on non... (Suggested, Relocation package)
$315k
We are looking for Research Engineers to build “gold standard” evaluations for catastrophic risks, in order to understand what AI Safety Level (ASL) to assign to models. Research leads... ...RSP). The policy defines a series of capability thresholds - AI Safety Levels (ASLs)... (Suggested, Currently hiring, Work at office, Immediate start, Home office, Visa sponsorship, Relocation package)
- Refresh AI is seeking a Research Engineer in San Francisco to push the boundaries of benchmarking technology. You will build benchmarks that labs use for evaluating coding abilities and computer-use capability. Your role will require expertise in reinforcement learning... (Suggested, Full time)
- AI Chopping Block, Inc. is searching for a dedicated professional to help build the evaluation harness necessary for our advanced AGI models. You will audit existing processes,... ...into actionable strategies and elevate our research standards, leading to impactful AI... (Suggested)
- ...Francisco, is seeking a dedicated professional for a full-time role to evaluate agent models and develop practical assessment rubrics. This... ...to aid decision-making. This role is pivotal to ensure product quality and enhance the research strategy. AGI Inc (Suggested, Full time, Relocation package)
- ...training and scaling security AI agents to discover zero-days... ...'re seeking an experienced Research Engineer to join our effort in... ...We are building a technology capable of finding the next Log4J at... ...intuition, experience in model evaluation, and benchmarks. Reinforcement... (Full time, Work at office)
- Drata is seeking a Senior Applied Research Engineer to enhance the quality of AI systems through rigorous evaluation and experimentation. This role emphasizes applied research, focusing on information retrieval and reasoning strategies. The ideal candidate will bring 5...
$380k
...The Future of Computing Research team is an applied research... ...methods, models, and evaluation frameworks that support... ...frontier of multimodal AI, helping turn emerging model capabilities into product experiences... ...closely across research, engineering, design, product, and safety... (Work at office, Immediate start, Relocation package)
- ...interpretable, and steerable AI systems. We want AI to be... ...growing group of committed researchers, engineers, policy experts, and business... ...of training environments for capable and safe agentic AI. This role... ...of the art, and building evaluations that measure genuine capability... (Work at office, Remote work, Visa sponsorship, Shift work)
- ...Analysis is a security research lab focused on adversarial simulations, evaluations, and runtime... ...work across research, engineering, and product. About the... ...models for adversarial capabilities using reinforcement learning... ...build deep context in AI security. You are results...
$350k
...interpretable, and steerable AI systems. We want AI to be... ...growing group of committed researchers, engineers, policy experts, and business... ...on the autonomy and coding capabilities of Claude Sonnet 4.6 and... ...implement RL environments and evaluations. Conduct experiments and... (Work at office, Visa sponsorship, Flexible hours)
$315k
As a Research Engineer or Research Scientist in Applied Finetuning, you will... ...to the public via Claude.AI and our API. In this role, you... ...on data mixes, design evaluations, and improve our production... ...that tests Claude’s reasoning capabilities Collaborate with a research... (Work at office, Home office, Visa sponsorship, Relocation package)
- Research Engineer, Benchmarking (Engineering, San Francisco, Full-time). Build the... ...coding and computer-use capability. Translate expert workflows into rigorous, verifiable evaluations, run them against frontier models... ...fine-tuning at a high level. Refresh AI
- ...currently on-site) Industry: AI infrastructure /... ...Learning (RL) training data & evaluations Compensation: Competitive (range... ...Opportunity Our partner is hiring a Research Engineer to help scale the quality... ...with modern AI tooling and LLM capabilities Equal Opportunity &... (Remote work)
- ...Archive Human Archive is a research lab backed by Y... ...function gains in model capability. The deployment of capable... ...As a Research Engineer, you’ll work on multimodal... ...research for embodied AI and robotics. This role... ...design experiments, evaluate new sensing stacks, and... (Shift work)
$300k - $400k
...leading conversational AI platform empowering... .... About the Team The Research team develops the model... ...prompting, orchestration, and evaluation in order to make our... ...As a Senior Research Engineer, you’ll be responsible... ...agent’s reliability, capability, and efficiency... (Work at office)
- ...building state-of-the-art AI systems that can write code... ...reasoning, and deploy these capabilities in real-world products such... ...coding. We operate across research, engineering, product, and infrastructure... ...model training, alignment, and evaluation. Hunt down and address... (Work at office, Relocation package)
- At Capably, we’re building technology that helps businesses operate... ...seamless automation. As a Research Engineer at Capably, you’ll help... ...developing the models, systems, and evaluation approaches that make agentic... ...what today’s enterprise AI tools can reliably deliver....
- ...the Team The Privacy Engineering Team at OpenAI is committed... ...engineering and research partners with the necessary... ...and efficiency of our AI systems. You will help... ...internal libraries, evaluation suites, and... ...pushing the frontiers of capability. About OpenAI OpenAI... (Relocation package)
$190k - $320k
Research Engineer - Computer Vision & Machine Learning Want to build vision... .... Vision is a core capability. Your work will directly influence... ...architectures, training pipelines, evaluation frameworks, and inference... ...vision systems that connect AI to the physical world in...
$295k
Research Engineer / Research Scientist - Personal AGI, Proactivity Post-training... ...technical foundations for AI that can anticipate what... ...personalization and agentic capabilities. Our team works on reinforcement... ...learning, dataset creation, evaluations, and other post-training... (Work at office, Relocation package, Shift work)
- The Role As an Applied Research Engineer, you will serve as the crucial link... ...in enabling agentic capabilities across the Hebbia product suite... ...learning systems, and LLM evaluation; experience building with foundation... ...products. Frequent user of AI products, especially during...
$350k
...interpretable, and steerable AI systems. We want AI to be... ...growing group of committed researchers, engineers, policy experts, and business... ...training environments and evaluations that make Claude effective... ...processes for Knowledge Work capabilities, including the process used... (Visa sponsorship, Shift work)
$350k
...reliable, interpretable, and steerable AI systems. We want AI to be safe and... ...quickly growing group of committed researchers, engineers, policy experts, and business... ...values do our systems have?), and evaluating novel AI capabilities as they arise. We develop privacy-preserving... (Full time, Contract work, For contractors, For subcontractor, Work at office, Visa sponsorship, Flexible hours)
- ...Turing is the world’s leading research accelerator for frontier AI labs and a trusted... ...create RL environments to evaluate and improve our customers... ...vary depending on the model capability being evaluated /... ...Environments for Software Engineering / coding agents UI-Environments... (For contractors, Flexible hours)
$160k - $300k
Hebbia is the AI platform for investors and bankers... ...and retrieval capabilities - unlocking meaningful... ...and deep, multi-source research. We’ve built our own agentic... ...LLM inference engine - a distributed, asynchronous... ...systems, and LLM evaluation; experience building with... (Contract work, For contractors, For subcontractor, Work at office)
- About the Role You’ll work as a Research Engineer / Scientist on the North... ...bring the next generation of AI‑enabled experiences to all of humanity by closing the capability overhang between power users... ...these insights into robust evaluations, training data, reward signals... (Work at office, Relocation package)
- ...Are We are an applied AI lab building end-to-end... ...the first AI software engineer, and Windsurf, an AI-... ...former founders, and researchers from the frontier of AI... .... Every training run, evaluation loop, and experimental... ...more about demonstrated capability than credentials. A...
$280k
...interpretable, and steerable AI systems. We want AI to be... ...growing group of committed researchers, engineers, policy experts, and business... ...the context of human-level capabilities. You could describe... ...Build tooling to efficiently evaluate the effectiveness of novel... (Contract work, For contractors, For subcontractor, Work at office, Relocation, Visa sponsorship, Work visa, Flexible hours)
$315k
...interpretable, and steerable AI systems. We want AI to be... ...growing group of committed researchers, engineers, policy experts, and business... ...processes to enhance their capabilities, alignment, and safety. As... ...for model fine-tuning and evaluation Develop tools to measure and... (Work at office, Visa sponsorship, Flexible hours)

