Eval360 - Error Analysis Engineer

$150k

Institute of Foundation Models

About the Institute of Foundation Models

The Institute of Foundation Models is a dedicated research lab focused on building, understanding, using, and risk-managing foundation models. Our mission is to advance AI research, support the next generation of AI builders, and develop impactful systems that improve how frontier models are trained, evaluated, deployed, and governed.

As part of our team, you will work closely with researchers, machine learning engineers, data scientists, software engineers, and product teams on some of the most important challenges in AI development. You will contribute to systems that help measure model quality, identify failure modes, and improve the reliability, safety, and readiness of model releases.

The Role

We are looking for an Eval360 - Error Analysis Engineer to help build, improve, and operate Eval360, an evaluation service that serves as a quality gate for AI models. This person will focus specifically on error analysis: understanding where models fail, why they fail, how those failures should be categorized, and how evaluation systems can better detect, measure, and prevent these issues before models are released.

You will collaborate with researchers, machine learning engineers, product managers, data scientists, and platform teams to develop AI evaluation applications and internal tools based on next-generation AI research. You will be part of a cross-functional team responsible for the full software development lifecycle, from requirements gathering and system design to implementation, deployment, monitoring, debugging, documentation, and continuous improvement.

The ideal candidate is comfortable working across the stack, including front-end interfaces for reviewing errors, back-end evaluation pipelines, data analysis workflows, model evaluation infrastructure, databases, dashboards, and APIs.
This person should have strong software engineering skills, excellent analytical judgment, and the ability to turn ambiguous model failures into structured insights that improve evaluation quality.

Key Responsibilities

  • Collaborate with researchers, machine learning engineers, data scientists, product managers, and internal stakeholders to implement innovative software solutions for Eval360 and related model evaluation workflows.
  • Build and improve Eval360 as an evaluation service that acts as a quality gate for model development, model comparison, and model release decisions.
  • Perform deep error analysis on model outputs, including identifying failure patterns, categorizing issues, tracing root causes, and proposing improvements to evaluation methodology.
  • Develop tools, workflows, and dashboards that make it easier for researchers and engineers to inspect model failures, compare model behavior, and understand quality regressions.
  • Design and implement client-side and server-side architecture for evaluation review systems, error analysis interfaces, reporting tools, and internal evaluation applications.
  • Develop responsive, usable interfaces that support error triage, annotation review, evaluation debugging, and model quality investigation.
  • Build and maintain back-end services, APIs, data pipelines, and integrations that support evaluation execution, results storage, analysis, and reporting.
  • Test software to ensure responsiveness, correctness, reliability, and efficiency across evaluation workflows.
  • Troubleshoot, debug, and upgrade evaluation systems, including identifying issues in data processing, evaluation metrics, model output handling, job orchestration, and user-facing analysis tools.
  • Create and maintain security, access control, and data protection settings for evaluation data, model outputs, annotations, and internal tooling.
  • Write clear technical documentation for Eval360 systems, error taxonomies, evaluation workflows, debugging procedures, and user-facing tools.
  • Work with researchers, data scientists, analysts, and machine learning engineers to improve evaluation quality, model diagnostics, and failure-mode visibility.
  • Keep track of new development tools, evaluation frameworks, model analysis methods, data quality techniques, and architectures relevant to AI evaluation systems.
  • Contribute to the design of error taxonomies, evaluation rubrics, quality thresholds, regression detection methods, and model readiness criteria.
  • Help ensure Eval360 produces reliable, interpretable, and actionable signals for model quality gates.
  • Contribute to research publications, technical reports, internal knowledge sharing, and external presentations where appropriate.
  • Contribute to intellectual property and thought leadership in AI evaluation, error analysis, model quality measurement, and evaluation infrastructure.
  • Perform all other duties as reasonably directed by the line manager that are aligned with these functional objectives.

Academic Qualifications

  • Bachelor's degree in Computer Science, Machine Learning, Data Science, Software Engineering, Statistics, or a related technical field required.
  • Master's or Ph.D. in Computer Science, Machine Learning, Artificial Intelligence, Data Science, or a related field preferred.

Professional Experience

  • Proven experience as a Software Engineer, Full Stack Developer, Machine Learning Evaluation Engineer, Data Scientist, AI Engineer, or similar role.
  • Experience building software systems for AI, machine learning, data analysis, evaluation, annotation, experimentation, or model monitoring.
  • Experience working with AI algorithms and the ability to develop systems that accommodate AI-related requirements.
  • Experience performing error analysis, model evaluation, data quality analysis, or failure-mode investigation for machine learning or language model systems.
  • Experience developing internal applications, dashboards, review tools, or web-based workflows for technical users.
  • Familiarity with common software stacks, including front-end frameworks, back-end services, databases, APIs, and cloud or internal infrastructure.
  • Familiarity with GitHub, Git, CI/CD workflows, and collaborative software development practices.
  • Knowledge of front-end languages and libraries such as HTML, CSS, JavaScript, TypeScript, React, Angular, or similar technologies.
  • Knowledge of back-end languages and frameworks such as Python, Java, C#, Node.js, FastAPI, Flask, Django, or similar technologies.
  • Familiarity with databases such as MySQL, PostgreSQL, MongoDB, or other structured and unstructured data stores.
  • Familiarity with evaluation frameworks, experiment tracking systems, data pipelines, or machine learning infrastructure is strongly preferred.
  • Ability to analyze complex model outputs and translate qualitative failures into structured, measurable categories.
  • Strong problem-solving and troubleshooting skills, especially for ambiguous technical issues involving models, data, metrics, and software systems.
  • Effective communication and collaboration skills, with the ability to work across research, engineering, data, and product teams.
  • Strong attention to detail and a high bar for evaluation quality, reliability, and interpretability.

Preferred Qualifications

  • Experience with large language models, foundation models, multimodal models, or model evaluation systems.
  • Experience designing or using error taxonomies, evaluation rubrics, benchmark datasets, human evaluation workflows, or automated grading systems.
  • Experience with Python-based data analysis tools such as pandas, NumPy, Jupyter, or similar.
  • Experience with visualization or dashboarding tools for model quality analysis.
  • Experience with distributed systems, job queues, workflow orchestration, or large-scale data processing.
  • Experience working in a research environment or with fast-moving AI product and model teams.

Benefits

  • Comprehensive medical, dental, and vision benefits
  • Bonus
  • 401K plan
  • Generous paid time off, sick leave, and holidays
  • Paid parental leave
  • Employee assistance program
  • Life insurance and disability insurance

Salary Range: $150,000 - $450,000 a year. The posted salary range represents the company’s good faith estimate of the compensation for this position upon hire. The actual compensation offered may vary within this range depending on individual qualifications, including but not limited to relevant skills, experience, education, certifications, geographic location, and specific business needs.

Visa Sponsorship: This position is eligible for visa sponsorship.

Vacancy posted 3 days ago
Location: Sunnyvale, CA
