Member of Technical Staff, Evals

$200k

Magic Inc

Magic's Mission

Magic's mission is to build safe AGI that accelerates humanity's progress on the world's most important problems. We believe the most promising path to safe AGI lies in automating research and code generation to improve models and solve alignment more reliably than humans can alone. Our approach combines frontier-scale pre-training, domain-specific RL, ultra-long context, and inference-time compute to achieve this goal.

About the Role

Evals builds the internal platform that teams across Magic use to evaluate the performance of first-party and third-party models. The team supports pre-training, post-training, data, inference, and product, and sits on the critical path of many of the company's most important decisions.

As a Member of Technical Staff on Evals, you will build both the platform and the evaluations themselves. You'll develop infrastructure for large-scale evaluations, data ablations, and dataset quality analysis, while designing and validating the methodologies used to measure model performance.

Sweating the details matters on this team. Many benchmarks, papers, and open-source evaluation frameworks contain subtle bugs or flawed assumptions that lead to misleading conclusions. We care deeply about correctness, reproducibility, and measurement quality.

Evals are essential to the success of the company. By building trustworthy evaluation systems, you will help Magic make better research decisions, build better datasets, and ship better products.

What You'll Work On
  • Build and maintain the internal evals platform used across Magic

  • Design, implement, and validate eval tasks for pre-training, post-training, reinforcement learning, inference, and product systems

  • Develop infrastructure for running large-scale evaluations

  • Build systems to measure dataset quality and identify opportunities to improve training data

  • Improve evaluation correctness, reproducibility, and reliability

  • Audit and improve upon public benchmarks, evaluation methodologies, and open-source implementations

  • Partner with research, data, inference, and product teams to define metrics that accurately reflect model quality

  • Build tooling and frameworks that enable teams across Magic to make decisions based on trustworthy measurements

What We're Looking For
  • Strong software engineering fundamentals

  • Experience building production systems, internal platforms, or developer infrastructure

  • Exceptional attention to detail and a high bar for correctness

  • Experience working with machine learning systems, evaluation frameworks, data infrastructure, or research tooling

  • Ability to reason critically about benchmarks, metrics, and experimental methodology

  • Strong intuition for measurement quality and experimental design

  • Experience designing, implementing, or operating systems that run at scale

  • Strong debugging and investigative skills

  • Comfort navigating ambiguity and determining whether a measurement actually captures the behavior it claims to measure

  • Skepticism toward results that cannot be reproduced, validated, or explained

  • Track record of owning technical projects end-to-end

  • Excitement about helping researchers and engineers make better decisions through trustworthy measurements

Compensation, Benefits, and Perks (US)
  • Annual salary range of $200K to $550K, depending on experience

  • Equity is a significant part of total compensation, in addition to salary

  • 401(k) plan with 6% salary matching

  • Generous health, dental, and vision insurance for you and your dependents

  • Unlimited paid time off

  • Visa sponsorship and relocation support for candidates moving to San Francisco

  • A small, fast-moving, highly collaborative team working on frontier AI systems

Magic strives to be the place where high-potential individuals can do their best work. We value quick learning and grit just as much as skill and experience.

Our Culture
  • Integrity. Words and actions should be aligned

  • Hands-on. At Magic, everyone is building

  • Teamwork. We move as one team, not N individuals

  • Focus. Safely deploy AGI. Everything else is noise

  • Quality. Magic should feel like magic

Vacancy posted 5 hours ago