Founding Machine Learning Engineer - Evaluation

Established Search

Senior ML Engineer, Medical Imaging Evaluation & AI Reliability

About the Role:

My client is building evaluation and evidence infrastructure for safety-critical AI systems, starting with diagnostic medical imaging.

AI systems are increasingly used in settings where their outputs affect clinical decisions and patient outcomes. In medical imaging, benchmark accuracy alone is not enough. Hospitals, regulators, and clinical stakeholders need evidence that models will behave reliably across real-world deployment environments, populations, scanners, and workflows.

This role sits at the intersection of:

  • medical imaging AI,
  • model robustness and evaluation,
  • regulatory evidence generation,
  • and real-world deployment behavior.

The work is highly investigative and requires strong technical judgment, scientific reasoning, and the ability to operate effectively in ambiguous environments.

The Role

This is not a traditional “train models on benchmark datasets” ML role.

You will work directly with medical imaging companies and healthcare stakeholders to investigate how AI systems behave in practice and what evidence is required for deployment, regulatory, and clinical decisions.

You will:

  • Design and execute evaluations for medical imaging AI systems
  • Investigate model failure modes, robustness, and generalization gaps
  • Analyze behavior across populations, scanners, imaging protocols, and clinical settings
  • Determine what evidence is sufficient for stakeholders making deployment or regulatory decisions
  • Translate technical findings into actionable recommendations for customers and clinical stakeholders
  • Build reusable evaluation pipelines, evidence schemas, and model assessment frameworks
  • Work with messy, incomplete, and noisy real-world clinical data
  • Help shape how evaluation investigations are conducted across the organization

The important work is not simply running experiments. It is identifying what questions actually matter, what evidence is missing, and how to generate defensible conclusions under real-world constraints.

Required Qualifications:

  • Strong experience in machine learning for medical imaging (radiology, pathology, cardiology imaging, or related domains)
  • Experience evaluating or validating real-world ML systems, not just training models
  • Deep understanding of:
      • model robustness,
      • distribution shift,
      • uncertainty,
      • failure analysis,
      • and real-world deployment behavior
  • Strong Python skills across the full investigation workflow:
      • data analysis,
      • experimentation,
      • evaluation,
      • and reporting
  • Experience working with noisy or imperfect clinical datasets
  • Ability to communicate technical findings clearly to both technical and non-technical stakeholders
  • High tolerance for ambiguity and open-ended investigative work

Strongly Preferred:

  • Experience with FDA-regulated AI/ML systems, including Software as a Medical Device (SaMD), or with medical device submissions (510(k), De Novo, etc.)
  • Experience with medical imaging deployment evaluation or clinical validation
  • Experience with interpretability, post-deployment monitoring, uncertainty estimation, or model auditing
  • Experience designing reproducible evaluation frameworks or benchmarking systems
  • Background in healthcare AI or other safety-critical ML domains
  • Customer-facing or cross-functional technical leadership experience
  • PhD or equivalent research depth in ML, medical imaging, computer vision, or related areas

Ideal Candidate Profile

Candidates who tend to succeed in this role often come from backgrounds such as:

  • Medical imaging ML research
  • FDA or healthcare AI evaluation
  • Clinical AI validation
  • AI robustness and reliability research
  • Applied ML investigation in safety-critical environments
  • Healthcare-focused computer vision research

What Success Looks Like:

The strongest people in this role become experts in how medical AI systems behave in the real world.

They develop the judgment to answer questions such as:

  • Where are the model’s true weaknesses?
  • Which deployment conditions introduce risk?
  • What concerns are real versus theoretical?
  • What evidence is sufficient for a hospital or regulator to trust the system?
  • What additional validation is required before deployment proceeds?

Vacancy posted 12 hours ago