Evaluation Lead

$159k - $260k

Waabi

Job Description

Waabi, founded by AI visionary Raquel Urtasun, is the leader in Physical AI. With a world-class team, we're unlocking the next era of autonomous transportation with technology that's powering commercial autonomous trucks and robotaxis. Waabi is backed by and partners with world leaders in AI, automotive, logistics, and deep tech.

With offices in Toronto, San Francisco, Dallas, and Pittsburgh, Waabi is growing quickly and looking for diverse, innovative and collaborative candidates who want to impact the world in a positive way. To learn more visit:

We are looking for a hands-on leader to build a new centralized Evaluation team. This team will be responsible for providing comprehensive and holistic analysis of all aspects of the autonomy system's performance. In this role, you will collaborate closely with the systems & safety team, which is responsible for defining the requirements & evaluation criteria, as well as with the autonomy teams to understand their evaluation needs. You will get to work with Waabi World, our highly realistic closed-loop simulation engine built with the latest in generative AI technologies, to deliver the evaluation capabilities needed to support the safe development of the next generation of autonomous vehicles!

You will...

- Lead and build a cross-functional team of software engineers, data analysts, and data scientists supporting automated workflows that provide high signal on autonomy performance.

- Design scalable production frameworks for sampling evaluation sets, developing and improving metrics, and systematically measuring the performance of both autonomy and the eval ecosystem itself.

- Design pipelines, tools, and dashboards to characterize autonomy performance for technical teams and executive leadership, collaborating closely with platform teams on implementation and with autonomy, systems and safety, and product teams on requirements.

- Work closely with simulation and software teams to build solutions that leverage our data, metrics and simulation platforms effectively. 

- Lead technical projects, contributing as an IC while also managing the team.

- Participate in technical and architecture discussions and share ideas, collaborating with researchers and engineers.

- Conduct regular one-on-one meetings to offer guidance and constructive feedback to direct reports.

 

Qualifications:

- 6+ years of autonomous vehicle industry experience, including 2+ years managing high-performing teams

- Experience evaluating AI or machine learning models, ideally in self-driving or related fields

- MS/PhD or Bachelor's degree in Computer Science, Data Science, Robotics, or a similar technical field of study

- Strong statistical background 

- Experience working with internal cross-functional partners/stakeholders  

- Experience with system design/architecture and algorithms

- Open-minded and collaborative team player with willingness to help others

- Passionate about self-driving technologies, solving hard problems, and creating innovative solutions.

 

Bonus/nice to have:

- Previous experience leading Autonomy Evaluation teams 

- Experience with large-scale databases and analytics

The US yearly salary range for this role is: $159,000 - $260,000 USD in addition to competitive perks & benefits. Waabi (US) Inc.’s yearly salary ranges are determined based on several factors in accordance with the Company’s compensation practices. The base salary range reflects the minimum and maximum target for new hire salaries for the position across all US locations. Note: The Company provides additional compensation for employees in this role, including equity incentive awards and an annual performance bonus.

 

Perks/Benefits:

- Competitive compensation and equity awards.

- Health and Wellness benefits encompassing Medical, Dental and Vision coverage (for full-time employees only).

- Unlimited Vacation.

- Flexible hours and Work from Home support.

- Daily drinks, snacks and catered meals (when in office).

- Regularly scheduled team-building activities and social events on-site, off-site, and virtually.

- As we grow, this list continues to evolve! 

Waabi is a technology start-up building technologies to transform the way the world moves. Join our talented team to be a part of the future and to make an impact!

Waabi is an equal opportunity employer. We celebrate diversity and are committed to creating a supportive, inclusive, and accessible workplace for all our employees. We seek applicants of all backgrounds and identities, across race, color, ethnicity, national origin or ancestry, age, citizenship, religion, sex, sexual orientation, gender identity or expression, military or veteran status, marital status, pregnancy or parental status, caregiver status, disability, or any other characteristic protected by law. We make workplace accommodations for qualified individuals with disabilities as required by applicable law. If you need a reasonable accommodation to participate in the job application or interview process, please let our recruiting team know.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses and identifying potential inconsistencies or verification signals in application materials based on available information. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Vacancy posted 25 days ago