
Member of Technical Staff - Evaluations

Reflection

About the Role
  • Conduct critical comparative analysis to advance our understanding of model capabilities.
  • Build and refine evaluation systems and processes that create tight feedback loops between data, evals, and model behavior.
  • Develop generalizable evaluation frameworks that capture what matters for reasoning, alignment, and usefulness.
  • Collaborate closely with pre‑training, post‑training, and applied teams to translate insights into model improvements.
  • Push the boundaries of what’s measurable, from synthetic evals to human feedback and real‑world interaction data.

About You
  • Strong statistical analysis and experimental design skills to rigorously measure model improvements.
  • Familiarity with LLM evaluation methodologies: static benchmarks, human preference evals, and/or agentic tasks.
  • High agency and thrive in a fast‑paced startup environment; bias for impact over process.
  • Excited to work in a new frontier lab, defining how we measure and accelerate progress toward more capable models.
  • Collaborative, detail‑oriented, and motivated by building the feedback loops that make models truly improve.

What We Offer
  • Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.
  • Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.
  • Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.
  • Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.
  • Opportunities to connect with teammates: Lunch and dinner are provided daily. We have regular off‑sites and team celebrations.

Vacancy posted 3 days ago
Similar jobs that could be interesting for you
Based on the Member of Technical Staff - Evaluations in San Francisco, CA vacancy
  •  ...beyond. About the Role Conduct critical comparative analysis to advance our understanding of model capabilities Build and refine evaluation systems and processes that create tight feedback loops between data, evals, and model behavior Develop generalizable evaluation... 
    Suggested
    Full time
    Relocation package

    B Capital

    San Francisco, CA
    5 days ago
  • $150k - $300k

     ...be working on advancing our ability to evaluate and serve models trained with our RL Lab...  ...systems into our RL training stack. Core Technical Responsibilities LLM Serving Multi‑tenant...  ...in open development and encourage team members to contribute to the broader AI community... 
    Suggested
    Work at office
    Remote work
    Visa sponsorship
    Relocation package
    Flexible hours
    Shift work

    Prime-Intellect

    San Francisco, CA
    2 days ago
  •  ...improving models. This includes trajectory visualization, evaluation workflows, monitoring dashboards, and the core product interfaces...  ...core agent products. We’re building our team of founding Members of Technical Staff to design the frontier of continually learning systems.... 
    Suggested

    Trajectory

    San Francisco, CA
    2 days ago
  •  ...you won't just observe the cutting edge of AI, your work will define what cutting edge means. We're hiring Members of Technical Staff to design the evaluations that set the standard for how AI is measured, produce analysis that shapes how companies and the broader industry... 
    Suggested

    Artificial Analysis Inc

    San Francisco, CA
    2 days ago
  • $227.5k - $401k

     ...motivated individuals who tackle unique technical challenges at scale and solve them as...  ...financial technology sector. As a Member of Technical Staff, you will operate with a high degree...  ...Multi‑step Reasoning (DABStep), which evaluates AI agents on real‑world data analysis... 
    Suggested
    Work at office
    Immediate start
    Relocation
    Flexible hours

    Adyen

    San Francisco, CA
    1 day ago
  •  .... This role is focused on building and deploying the technical systems that make biosecurity real. About the Role As a Member of Technical Staff, Biosecurity at Radical Numerics, you will lead the design, evaluation, and deployment of biosecurity systems for biological... 
    Full time

    Radical Numerics

    San Francisco, CA
    1 day ago
  • Job Description We’re looking for a Member of Technical Staff to build and deploy production-grade AI systems. In this role, you’ll work across...  ...production environments Model Development: Fine-tune, evaluate, and work with machine learning models in real-world applications... 

    ERAGON

    San Francisco, CA
    2 days ago
  • $150k - $220k

    Founding Member of Technical Staff (MTS) | Bay Area, CA | Full-time | $150k-$220k + equity | About Us: VizopsAI is the secure runtime for custom enterprise...  ...power continuous optimization loops for AI agents, from evaluation pipelines and data/trace infrastructure to APIs that... 

    VizopsAI

    San Francisco, CA
    1 day ago
  • $300k

    Member of Technical Staff - Mechanistic Interpretability About Vmax Vmax is an applied research lab developing AI capable of open-ended learning...  ...rigorous ML experiments, including ablations, baselines, evaluation design, and failure analysis. Expertise with Python and at... 
    Work at office
    Local area

    Vmax

    San Francisco, CA
    2 days ago
  • $300k

    Member of Technical Staff - RL Algorithms About Vmax Vmax is an applied research lab developing AI capable of open-ended learning. We are building...  ...and agentic settings. Establish empirical baselines and evaluation protocols for measuring sample efficiency, robustness,... 
    Work at office
    Local area
    Shift work

    Vmax

    San Francisco, CA
    4 days ago
  •  ...great products. Join us on our mission and shape the future! Member of Technical Staff, Search Why this role? We are looking for talented...  ...datasets and optimize data pipelines for model training and evaluation. Work closely with the model serving team to ensure that... 
    Full time
    Work at office
    Remote work
    Flexible hours

    Cohere

    San Francisco, CA
    2 days ago
  •  ...to scale to gigawatt-class AI datacenters. Mission Gimlet Labs is seeking a Member of Staff focused on AI Research (Intern). As an AI Researcher (Intern), you will be evaluating and implementing techniques to drive performance and quality optimizations across... 
    Internship

    Gimlet Labs

    San Francisco, CA
    3 days ago
  •  ...of interactive AI. The Role We're looking for a Member of Technical Staff - Embodied Agents to help build general-purpose agents capable...  ...3D worlds Synthetic data pipelines Agent evaluation frameworks Scalable training systems What We're Looking... 

    Moonlake AI

    San Francisco, CA
    2 days ago
  • $220k - $405k

     ...and resources that strengthen the broader AI ecosystem. As a member of SII, you'll conduct original and impactful research on improving...  ...security and privacy in AI-native products. Build security evaluation frameworks, benchmarks, and datasets to measure the effectiveness... 
    Full time
    Local area

    Pantera Capital

    San Francisco, CA
    4 days ago
  •  ...Focus on the things that matter, and join the team. As a Member of Technical Staff with a focus on Multimodal AI, you will: Design and develop...  ...large multimodal models, and have experience building evaluations to measure their performance. Are comfortable diving into... 
    Full time
    Work at office
    Remote work
    Flexible hours

    Cohere

    San Francisco, CA
    2 days ago
  • $200k - $275k

    Founding Member of Technical Staff (Research / Post-Training) Applied AI / RL | San Francisco (onsite) | $200k-$275k + 0.25-0.50% equity DeepRec...  ...and post‑training. You’ll take ownership of training and evaluating frontier models, shaping external benchmarks, and... 
    Full time
    Visa sponsorship
    Relocation package

    DeepRec.ai

    San Francisco, CA
    3 days ago
  • Member of Technical Staff, Statistical Genetics Location: SF Bay Area Type: Full-time About Radical Numerics Radical Numerics is an AI research...  ...data architect, part methods scientist, and part model evaluator. You will collaborate closely with AI engineers and computational... 
    Full time

    Radical Numerics

    San Francisco, CA
    1 day ago
  •  ...preference and judgment. That lets us evaluate models on what people actually care...  ...actually want. We’re a small, deeply technical team with people from Harvard, Berkeley...  ...Angel, BoxGroup and others. The Role Member of Technical Staff, Platform Engineer You’ll design,... 

    Arcada Labs Incorporated

    San Francisco, CA
    3 days ago
  • $200k

    Member of Technical Staff, RL Research & Environments Posted Feb 28, 2026 | Full-time | Advanced (5-10 yrs) Magic’s mission is to build safe...  ...Environments team, you will design and operate the data, evaluation, and environment systems that improve model capabilities after... 
    Full time
    Relocation
    Visa sponsorship

    Magic

    San Francisco, CA
    5 days ago
  •  ...frameworks can be learnt and are therefore framework-agnostic when evaluating past experience. You have deep experience building and...  ...the world. You can use it to work in-person with other team members in the same city, but it is not mandatory - just get work done... 
    For contractors
    Internship
    Work at office

    Project Europe

    San Francisco, CA
    1 day ago
  • Member of Technical Staff, Pre-Training Science Location: SF Bay Area or Tokyo, Japan Type: Full...  .... Work on architecture, algorithms, and optimization. Evaluate ideas in model design, optimization, long-context learning... 
    Full time

    Radical Numerics

    San Francisco, CA
    3 days ago
  • $185k - $240k

     ...Activant, 1984 Ventures and Page One. The Role We’re hiring a Member of Technical Staff — Internal AI Harness to build the systems that power how...  ...including HubSpot, Slack, Fathom, and Linear. Implement evaluation frameworks, logging, and feedback loops to continuously... 
    Full time
    Flexible hours
    Shift work

    Stuut

    San Francisco, CA
    1 day ago
  • Member of Technical Staff, Post-Training Location: SF Bay Area or Tokyo, Japan Type: Full-time About Radical Numerics Radical Numerics is an...  ...Training at Radical Numerics, you will develop the training and evaluation loops that shape biological world models after pretraining... 
    Full time

    Radical Numerics

    San Francisco, CA
    1 day ago
  • Member of Technical Staff, Applied AI The opportunity We are looking for a Member of Technical Staff with deep expertise in generative modelling...  ...in biology and understand the unique data challenges, evaluation paradigms and scientific workflows of biological modelling... 
    Flexible hours

    Latent Labs

    San Francisco, CA
    4 days ago
  •  ...Activant, 1984 Ventures and Page One. The Role We’re hiring a Member of Technical Staff - AI/ML to design, build, and deploy AI-powered systems...  ...into effective AI solutions. Measure Impact: Create evaluation frameworks to track AI system performance and quantify business... 
    Full time
    Flexible hours

    Stuut

    San Francisco, CA
    3 days ago
  • $160k - $250k

    Member of Technical Staff - Computational Biology About Edison Scientific focuses on building and commercializing AI agents for science, and shares...  ...Technical Staff - Computational Biology, you'll build and evaluate AI agent systems to automate biological discovery. You'll... 
    Remote work

    Edison Scientific

    San Francisco, CA
    2 days ago
  • $100k - $150k

    Founding Member of Technical Staff (Security) Location: San Francisco • Singapore • Hyderabad • London Engineering • Hybrid • Full-time We're...  ...work we've published in our blog. Create benchmarks to evaluate agent performance on real-world scenarios. Work closely with... 
    Full time
    For contractors
    Work at office

    Crane Venture Partners

    San Francisco, CA
    3 days ago
  • Member of Technical Staff, ML Systems Mirendil Mirendil is a tech-first company focused on solving core bottlenecks that unlock step-change acceleration...  ...latency, throughput, cost) Developing data pipelines and evaluation tooling Deploying and maintaining reliable production... 

    Mirendil

    San Francisco, CA
    5 days ago
  • Member of Technical Staff — Data Quality Operations Patronus AI is a frontier lab developing simulation research and infrastructure to accelerate...  ...some of the earliest and most influential research in AI evaluation like FinanceBench, Lynx, SimpleSafetyTests,... 

    Patronus AI, Inc.

    San Francisco, CA
    1 day ago
  • $150k

    Amazon’s Frontier AI & Robotics (FAR) team is seeking a Member of Technical Staff to drive foundational research and build intelligent robotic...  ...leveraging our extensive infrastructure to prototype and evaluate at scale Collaborate with our world‑class research team to... 
    Local area

    Amazon Science

    San Francisco, CA
    5 days ago
