Member of Technical Staff, Evaluation Execution

$285.55k
Full-time

METR

About METR

We are a nonprofit research organization that develops scientific methods to assess AI capabilities, risks and mitigations, with a specific focus on threats related to autonomy, AI R&D automation, and alignment. We believe it is robustly good for civilization to have a clearer understanding of what dangers AI systems pose, and we are extremely excited to find ambitious, excellent people to join our team and tackle one of the most important challenges of our time.

What We're Looking For

The Evaluation Execution team at METR focuses on productionizing, improving, and executing our various evaluations. We streamline our processes and build common infrastructure to scale our ability to continually run our most up-to-date evaluations on the latest models. This team primarily looks for research execution and software engineering skills.

Research Execution

You are an experienced executor/contributor; you are familiar with patterns of successful and unsuccessful execution in frontier ML research. You are undaunted by "I've never done this before" or even "no-one has done this before". You are creative, ambitious and entrepreneurial. You work fast and are highly responsive and available. You can juggle many balls when it is useful.

Software Engineering

You balance rapid prototyping with the creation of maintainable, scalable systems and make sound technical decisions. You lead large projects from ideation to delivery, balancing innovative ML solutions with reliable, high-quality code. You set high standards for system architecture, code quality, and maintainability, influencing broad software practices across the organization.

$285,548 - $503,116 a year

For very experienced and exceptional researchers, we are open to exploring paying much higher than this stated range. The listed range applies to the base salary for this role. METR also has a host of benefits:

  • The office: Catered lunch and dinner daily; in-office gym and shower
  • Relocation support: Stipend for moving to the Bay Area
  • Time-off and leave: Unlimited PTO and 21-week parental leave for new parents
  • Commuter benefit: Monthly transit/parking stipend and an annual Uber budget
  • Professional development benefit: for training, courses, conferences, and AI safety education
  • Mental health benefit: for therapy, medication, and other mental health expenses
  • Wellness benefit: for gym memberships and other wellness expenses
  • Work equipment benefit: for home office and workstation equipment expenses

Our Culture

METR is a mission-driven organization. We believe our work can meaningfully shape humanity's future for the better, and we want to be the best people in the world doing this work. We have a tight-knit, collaborative research culture rooted in truth-seeking and integrity. We're fiercely committed to producing high-quality, trustworthy science. We're honest and transparent about our results, especially when they may go against the grain. We've earned trust as reliable partners who handle confidential information with care. We maintain a low-ego, drama-free environment focused on what matters.

Hybrid Requirements: Our technical team members are in our office in Berkeley 3-5 days/week. Please let us know in your application if this is a constraint. If you lack US work authorization and would like to work in-person (strongly preferred), we can likely sponsor a cap-exempt H-1B visa for this role.

We encourage you to apply even if your background may not seem like the perfect fit! We would rather review a larger pool of applications than risk missing out on a promising candidate for the position.

We are committed to diversity and equal opportunity in all aspects of our hiring process. We do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We welcome and encourage all qualified candidates to apply for our open positions.

Vacancy posted 7 hours ago
