Staff Research Engineer - LLM Post-Training & Evaluation
Gravity Engineering Services Pvt Ltd.
Gravity Engineering Services Pvt Ltd. is seeking a Member of Technical Staff in San Francisco, California. In this role, you will design and build the infrastructure necessary for models to learn from production workflows continually. You will manage end-to-end experiments related to data, training, and system evaluation, working closely with the company's founders. The ideal candidate will have a strong background in large language models and be equipped with experience in experimental design, preferably with a Master's or PhD in a related field. This role is critical in shaping both the technology and the organizational culture.
$264.8k - $331k
...out state of the art post-training algorithms to reach the... ...The Enterprise ML Research Lab works on the front... ...enterprise clients. As a Staff Agent Post-Training MLRE... ...: ~5+ years of LLM training in a production... ...a fair and thorough evaluation of all applicants....TrainingFull time- ...generation systems, run evaluations, inspect model... ...world trajectories or researcher hypotheses, materialize... ...AI research, systems engineering, and model evaluation.... ...Have experience with RL, LLM agents, computer‑use agents, evals, post‑training, synthetic data, simulation...Training
$200k
...San Francisco is seeking a Software Engineer for their RL Research & Environments team. The role... ...on designing and improving data and evaluation systems to enhance model capabilities... ...position is an opportunity to influence post-training strategies as part of a fast-paced...Training- ...intelligence to serve humanity. We’re training and deploying frontier... .... Cohere is a team of researchers, engineers, designers, and more, who are... ...responsible for pushing the limits of LLM inference efficiency across... ...preferred locations. As a Staff Research Engineer, you will...TrainingFull timeWork at officeRemote workFlexible hours
- ...large-scale distributed training and data processing ,... ...experience in ML/RL research and application , (Desirable... ...and using metrics for evaluating complex AI systems , (... ...and software engineers who are passionate about... ...models and Generative AI (LLM/VLM) solutions. These...Training
$320k
Anthropic in New York City is seeking a Research Engineer to develop evaluations for Claude’s capabilities. The ideal candidate should have strong Python... ...for running evaluations, and debugging results during training runs. The role offers a hybrid work model and...TrainingRemote job- ...preference and judgment. That lets us evaluate models on what people actually care about... ...About the Role We’re looking for an ML Research Engineer to help us build better ways to... ...analyses What We’re Looking For Experience training, fine-tuning, or evaluating models, including...Training
$150k - $300k
Prime Intellect is looking for a skilled ML Systems Engineer to build and optimize LLM serving infrastructure and inference systems. This hybrid role... ...to the scalability of their reinforcement learning training. Successful candidates will have over 3 years of experience...TrainingRelocation package$180k
...focused on AI is seeking experienced software engineers to develop robust data pipelines and automation... ...frameworks. This role involves creating and maintaining evaluation tasks and improving operational procedures for RL training. The ideal candidate has extensive experience...Training- ...Capital in San Francisco is seeking talented individuals for AI research roles focused on open superintelligence. Candidates will... ...in Computer Science or a related field, possess solid software engineering skills, and have experience with large-scale systems. The position...Training
$220.8k - $298.8k
Staff Applied Research Engineer, Hybrid - San Francisco. Our Mission & Values... ...is seeking an Applied AI Engineer to drive the quality and... ...research, experimentation, and evaluation. In this role, you will... ...systems: cross-encoders, LLM-based rerankers, learning-to...Work at officeImmediate startWorldwideMonday to FridayFlexible hours$220.8k - $298.8k
...automation. Drata is seeking an Applied AI Engineer to drive the quality and... ...of our AI systems through rigorous research, experimentation, and evaluation. In this role, you will optimize retrieval... ...reranking systems: cross‑encoders, LLM‑based rerankers, learning‑to‑rank,...Flexible hours$315k
We are looking for Research Engineers to build “gold standard” evaluations for catastrophic risks, in order... ...for the way we train, deploy, and secure our... ...capabilities. Using our post training infrastructure... ...Currently, we expect all staff to be in one of our offices...TrainingCurrently hiringWork at officeImmediate startHome officeVisa sponsorshipRelocation package- ...infrastructure / Reinforcement Learning (RL) training data & evaluations Compensation: Competitive (range... ...Our partner is hiring a Research Engineer to help scale the quality assurance... ...Familiarity with modern AI tooling and LLM capabilities Equal Opportunity &...TrainingRemote work
$200k - $400k
About the Company Pilots don’t train with real passengers. Surgeons... ...based on real humans. Our research pioneered the field of AI-based... ...Role As a Member of Technical Staff (MTS) in Research, you will work across the stack to train, evaluate, deploy, and monitor our models...TrainingFlexible hours- ...unique role at the intersection of AI research and systems engineering. You will design experiments, build task generation systems, and evaluate model failures. This is a hands-on role... ...background in reinforcement learning, LLM agents, and model behavior analysis, and...
- ...pretraining to production serving, evaluation, and monitoring. As part... ...across Plaid. As a Staff Machine Learning Engineer, you will lead the... ...pipelines that translate research into production impact. You... ...architecture design, distributed training, serving infrastructure, monitoring...TrainingWork experience placementLocal areaImmediate start
- ...everything Gamma creates. As our Research Engineer, you'll design evaluation frameworks that measure AI output... ...experience with prompt engineering, LLM experimentation, and systematic evaluation... ...improvements. Experience with post‑training techniques for LLMs including...TrainingWork at officeWork from home
- ...layer for AI agents. As a Senior Applied Research Engineer, you'll explore novel approaches to... ...engineers who can run rigorous experiments, train and evaluate models, and ship the result as... ...work in retrieval, memory systems, or LLM evaluation. Tech stack Python, Rust/C++...TrainingWork experience placement
$200k - $250k
Research Engineer Location San Francisco (On-site) Compensation $200,000 - $250,000 + variable... ...engineering: dataset curation, model training and evaluation, retrieval and tool use, safety and... ...Track record shipping applied ML or LLM features to real users, not just prototypes...Training$320k
...growing group of committed researchers, engineers, policy experts, and... ..., you’ll build and evaluate model organisms of... ...AI. Create evals and training environments to... ...building and working with LLM‑based agents or autonomous... ...: We expect all staff to be in one of our offices...TrainingRelocationVisa sponsorship$180k - $270k
Research Engineer (Focused on Search/IR) You'll own and advance the search... ...reliably convert URLs into LLM‑ready markdown or structured... .../IR improvements with model training and broader product strategy... ...production implications. We'll evaluate on technical depth,...TrainingFull timeTemporary workRemote work$160k - $240k
Research Engineer — Evals Location: San Francisco, CA (Hybrid) OR Remote (... ...Overview You'll build the evaluation systems that tell us whether... ...URL into clean, structured, LLM-ready data reliably — is hard... ...reporting layer — they're a training signal. You'll work closely...TrainingFull timeTemporary workWork at officeRemote work- ...coding. We operate across research, engineering, product, and... ...research insights into model training, alignment, and evaluation. Hunt down and address inefficiencies... ...—from agent behavior to LLM inference to container... ...you believe this job posting is non-compliant, please...TrainingWork at officeRelocation package
- ...vast talent network trains frontier AI models... ...ll work alongside researchers, operators, and AI... ...As a Research Engineer at Mercor, you’ll... ...benchmarking pipelines, evaluation systems, and... ...improvements for post-training, RLVR, and... ...Build and operate LLM evaluation systems...TrainingWork at office
$320k - $405k
...growing group of committed researchers, engineers, policy experts, and... ...About the Team As AI training and deployments scale... ...Are familiar with LLM application... ...context engineering, evaluation, orchestration) Enjoy... ...Currently, we expect all staff to be in one of our offices...TrainingWork at officeVisa sponsorshipFlexible hours$315k
...growing group of committed researchers, engineers, policy experts, and... ...and Scaling team trains our production... ...training dynamics and evaluation infrastructure Design... ...experience training LLMs or working extensively... ...Currently, we expect all staff to be in one of our...TrainingFull timeWork at officeVisa sponsorshipFlexible hoursWeekend workAfternoon shift$180k - $270k
Research Engineer (Focused on RL) You'll bring reinforcement learning to Firecrawl... ...core product — building the training infrastructure, reward... ...RL approaches and modern LLM agent systems. If you care as... ...the systems that train and evaluate Firecrawl's models. You'll own...TrainingFull timeTemporary workRemote work$265k - $295k
...with frontier AI lab researchers to create evaluations, publish benchmarks,... ...Work together with engineers, scientists, operators... ...data-intensive post-training techniques. We believe... ...About the Role As a Staff Forward Deployed Engineer... ...strategy for LLM-powered or AI-native...TrainingFull timeWork at officeRemote workFlexible hours- B Capital is seeking a data engineer to ensure high data quality for training AI models. You will own the upstream data quality for LLM post-training and design automated QA methods in a collaborative environment. Ideal candidates will have strong engineering skills, a...Training