Member of Technical Staff - Evaluations

B Capital

Location: SF, NYC, London
Employment Type: Full time
Location Type: On-site
Department: Engineering

Our Mission

Reflection’s mission is to build open superintelligence and make it accessible to all. We’re developing open weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders come from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic and beyond.

About the Role
  • Conduct critical comparative analysis to advance our understanding of model capabilities.
  • Build and refine evaluation systems and processes that create tight feedback loops between data, evals, and model behavior.
  • Develop generalizable evaluation frameworks that capture what matters for reasoning, alignment, and usefulness.
  • Collaborate closely with pre-training, post-training, and applied teams to translate insights into model improvements.
  • Push the boundaries of what’s measurable, from synthetic evals to human feedback and real-world interaction data.

About You
  • Strong statistical analysis and experimental design skills to rigorously measure model improvements.
  • Familiarity with LLM evaluation methodologies: static benchmarks, human preference evals, and/or agentic tasks.
  • High agency and thrive in a fast-paced startup environment; bias for impact over process.
  • Excited to work in a new frontier lab, defining how we measure and accelerate progress toward more capable models.
  • Collaborative, detail-oriented, and motivated by building the feedback loops that make models truly improve.

What We Offer

We believe that to build superintelligence that is truly open, you need to start at the foundation. Joining Reflection means building from the ground up as part of a small talent-dense team. You will help define our future as a company, and help define the frontier of open foundational models. We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.
  • Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.
  • Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.
  • Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.
  • Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.
  • Opportunities to connect with teammates: Lunch and dinner are provided daily. We have regular off-sites and team celebrations.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the Member of Technical Staff - Evaluations in San Francisco, CA vacancy
  • $150k - $300k

     ...be working on advancing our ability to evaluate and serve models trained with our RL Lab...  ...systems into our RL training stack. Core Technical Responsibilities LLM Serving Multi‑tenant...  ...in open development and encourage team members to contribute to the broader AI community... 
    Suggested
    Work at office
    Remote work
    Visa sponsorship
    Relocation package
    Flexible hours
    Shift work

    Prime Intellect

    San Francisco, CA
    3 days ago
  • $300k

     ...Member of Technical Staff - Mechanistic Interpretability About V max V max is an applied research lab developing AI capable of open-ended learning...  ...rigorous ML experiments, including ablations, baselines, evaluation design, and failure analysis. Expertise with Python and at... 
    Suggested
    Work at office
    Local area

    VMAX LLC

    San Francisco, CA
    3 days ago
  • $185k - $255k

     ...Member of Technical Staff - Reinforcement Learning Optimized deploys AI agents into the most critical supply chains in the world: the operations...  ...and post-training: the reward models, training loops, and evaluations that turn raw model capability into reliable long-horizon... 
    Suggested

    Optimized, Inc.

    San Francisco, CA
    1 day ago
  • $227.5k - $401k

     ...motivated individuals who tackle unique technical challenges at scale and solve them as...  ...financial technology sector. As a Member of Technical Staff, you will operate with a high degree...  ...Multi‑step Reasoning (DABStep), which evaluates AI agents on real‑world data analysis... 
    Suggested
    Work at office
    Immediate start
    Relocation
    Flexible hours

    Adyen

    San Francisco, CA
    3 days ago
  •  ...great products. Join us on our mission and shape the future! Member of Technical Staff, Search Why this role? We are looking for talented...  ...datasets and optimize data pipelines for model training and evaluation. Work closely with the model serving team to ensure that inference... 
    Suggested
    Full time
    Work at office
    Remote work
    Flexible hours

    Cohere

    San Francisco, CA
    3 days ago
  •  ...As a Member of Technical Staff (MTS), you'll build production-grade systems that power continuous optimization loops for AI agents—from evaluation pipelines and data/trace infrastructure to APIs that deploy improved policies. This role is a blend of MLE + backend engineering... 

    VizopsAI

    San Francisco, CA
    3 days ago
  •  ...you won't just observe the cutting edge of AI, your work will define what cutting edge means. We're hiring Members of Technical Staff to design the evaluations that set the standard for how AI is measured, produce analysis that shapes how companies and the broader industry... 

    Artificial Analysis, Inc.

    San Francisco, CA
    1 day ago
  • $300k

     ...Member of Technical Staff - RL Algorithms About V max V max is an applied research lab developing AI capable of open-ended learning. We are building...  ...and agentic settings. Establish empirical baselines and evaluation protocols for measuring sample efficiency, robustness,... 
    Work at office
    Local area
    Shift work

    VMAX LLC

    San Francisco, CA
    3 days ago
  • $200k

     ...builds the internal platform that teams across Magic use to evaluate the performance of internal and external models. The team...  ...of many of the company's most important decisions. As a Member of Technical Staff on Evals, you will build both the platform and the evaluations... 
    Visa sponsorship
    Relocation package

    Magic AI Corp.

    San Francisco, CA
    4 days ago
  •  ...Member Of Technical Staff We're looking for a member of technical staff to build and deploy production-grade AI systems. In this role, you...  ...powered systems into production environments Fine-tune, evaluate, and work with machine learning models in real-world applications... 

    ERAGON

    San Francisco, CA
    3 days ago
  •  ...Member of Technical Staff, Product TL;DR: Listen teaches AI what people actually think and want. We're Sequoia-backed, raised $100M, and...  ...what McKinsey does for $1M per engagement. The bottleneck is evaluating those qualitative outputs. Once you have the eval, you can... 
    Flexible hours
    Shift work

    Listen Labs

    San Francisco, CA
    12 hours ago
  •  ...Member Of Technical Staff @ Lotus AI Lotus AI is a groundbreaking primary care app that integrates your medical records, AI, and real doctors...  ...curation pipelines that produce high-quality training and evaluation datasets from clinical interactions. Voice and... 

    Lotus Health

    San Francisco, CA
    16 days ago
  • $180k

     ...Member Of Technical Staff - RL Infrastructure Palo Alto, CA About XAI XAI's mission is to create AI systems that can accurately understand...  ...engineers to create robust data pipelines, comprehensive evaluations for benchmarking LLMs, and automation frameworks to... 
    Temporary work

    xAI

    San Francisco, CA
    1 day ago
  •  ...improving models. This includes trajectory visualization, evaluation workflows, monitoring dashboards, and the core product interfaces...  ...core agent products. We’re building our team of founding Members of Technical Staff to design the frontier of continually learning systems.... 

    Trajectory

    San Francisco, CA
    3 days ago
  •  ...benchmarks. This spans everything needed to evaluate LLMs at scale: Python libraries, a web...  ...code and architecture reviews for other members of the team Help establish engineering...  ...infrastructure meets their needs Requirements Technical 2+ YOE: 2+ years of full-time experience... 
    Full time
    Work experience placement
    Relocation
    Relocation package
    Shift work

    PetsApp

    San Francisco, CA
    3 days ago
  •  ...multiple levels for this role) Hands‑on experience with LLM evaluations and/or post‑training methods: How to design useful evals...  ...features end‑to‑end What the job involves We are seeking a Member of Technical Staff, Evals & Post‑Training Product to help define how... 

    Fireworks AI

    San Francisco, CA
    4 days ago
  •  ...useful agents, we need infrastructure that makes environment construction, experimentation, evaluation, and iteration feel like one seamless system. As a Member of Technical Staff, Infrastructure / DevOps, you will own the systems that make Plato's research and training... 

    Plato.ai

    San Francisco, CA
    3 days ago
  •  ...Member of Technical Staff, ML Systems Mirendil Mirendil is a tech-first company focused on solving core bottlenecks that unlock step-change acceleration...  ...(latency, throughput, cost) Developing data pipelines and evaluation tooling Deploying and maintaining reliable production... 

    Mirendil

    San Francisco, CA
    4 days ago
  •  ...boundaries of what's possible in robotic intelligence. As a Member of Technical Staff, you'll be at the forefront of developing breakthrough...  ...leveraging our extensive infrastructure to prototype and evaluate at scale Collaborate with our world‑class research team to... 
    Local area

    Amazon Science

    San Francisco, CA
    12 hours ago
  • $100k - $150k

     ...Founding Member of Technical Staff (Security) Location: San Francisco • Singapore • Hyderabad • London Engineering • Hybrid • Full-time We're...  ...the work we've published in our blog. Create benchmarks to evaluate agent performance on real-world scenarios. Work closely with... 
    Full time
    For contractors
    Work at office

    Crane Venture Partners

    San Francisco, CA
    4 days ago
  •  ...Job Description As a Member of Technical Staff (Research) at Trajectory, you will design and build the post‑training stack that lets our customers...  ...own end‑to‑end experiments across data, training, and evaluation: shaping telemetry into learnable signals, training and serving... 

    Trajectory

    San Francisco, CA
    3 days ago
  •  ...Member of Technical Staff, Document Understanding Join us and help shape the future of AI by architecting next-generation knowledge systems....  ...and interests, you might focus more on data curation and evaluation, model fine-tuning and experimentation, or ML infrastructure... 
    Work at office
    Remote work

    LlamaIndex, Inc.

    San Francisco, CA
    3 days ago
  •  ...future of AI. About the role Gimlet Labs is seeking a Member of Technical Staff (Intern) to help develop Gimlet's platform for deploying...  ..., deploying and scaling AI systems for production Evaluating and implementing cutting-edge AI research Researching ways... 
    Internship

    Gimlet Labs

    San Francisco, CA
    12 hours ago
  •  ...knowledge and resources that strengthen the broader AI ecosystem. As a member of SII, you'll conduct original and impactful research on...  ...for security and privacy in AI-native products. Build security evaluation frameworks, benchmarks, and datasets to measure the... 

    Perplexity AI Inc.

    San Francisco, CA
    4 days ago
  •  ...Member of Technical Staff - Applied Research Patronus AI is a frontier lab developing simulation research and infrastructure to accelerate progress...  ...some of the earliest and most influential research in AI evaluation like FinanceBench , Lynx, SimpleSafetyTests ,... 

    Patronus AI, Inc.

    San Francisco, CA
    3 days ago
  •  ...Member Of Technical Staff - Image / Video Generation Freiburg (Germany) About Black Forest Labs We're the team behind Latent Diffusion...  ...architecture Deep understanding of how to effectively evaluate image and video generative models—knowing which metrics correlate... 
    Remote work
    Worldwide
    2 days per week

    Black Forest Labs

    San Francisco, CA
    5 days ago
  •  ...Senior Member of Technical Staff Harper is an AI-native commercial insurance company in San Francisco. We're not bolting AI onto insurance...  ...abstractions that let the team ship new agents in days. Build evaluation that works. Systems that measure whether agents are... 
    Work at office
    Relocation

    Harper Group

    San Francisco, CA
    2 days ago
  • $200k - $350k

     ...Member Of Technical Staff, Inference & Serving Inception creates the world's fastest, most efficient AI models. Our Mercury model is the world...  ...(Kubernetes, Ray, SLURM) for distributed inference, evaluation, and large-batch serving. Implement and manage load balancing... 
    Immediate start
    Flexible hours

    Inception LLC

    San Francisco, CA
    4 days ago
  • $200k - $350k

     ...long-term success for both clients and candidates. Member of Technical Staff - Pre-Training Infrastructure Location: San Francisco,...  ...Implement systems for checkpointing, experiment tracking, evaluation, reproducibility, and model comparison. Build scalable... 
    Work at office
    Visa sponsorship

    Recruiting from Scratch

    San Francisco, CA
    12 hours ago
  •  ...Member Of Technical Staff, Platform Engineer You'll design, build, and own distributed systems and core platform infrastructure end-to-end...  ...user-facing product surfaces and real-time interactions to evaluation pipelines, model orchestration, and the systems underneath... 

    Arcada Labs Incorporated

    San Francisco, CA
    3 days ago
