Machine Learning Engineer - LLM Evaluation & Automation

Grid Dynamics Holdings

We are seeking a highly skilled Machine Learning Engineer who specializes in leveraging Large Language Models (LLMs) for automated evaluation and quality assessment. In this role, you will design and build systems that automatically measure and improve the accuracy, relevance, and consistency of model outputs. You will lead initiatives to create evaluation pipelines, develop metrics, and deliver actionable insights for continuous improvement. This position requires strong technical expertise, analytical problem-solving abilities, and the capacity to manage projects across multiple cross-functional teams.

Essential functions


Responsibilities:
  • Design and implement automated systems and pipelines for evaluating LLM outputs.
  • Develop metrics and KPIs to measure output quality, accuracy, and consistency using LLM-based evaluations.
  • Collaborate with Engineering teams to create automated logic checks and validation tools.
  • Partner with Data Scientists to analyze evaluation results and optimize prompt and task structures.
  • Provide feedback loops to ensure evaluation guidelines align with LLM-based assessments.
  • Investigate how LLM-derived evaluations can enhance product reliability and user experience.
  • Recommend refinements to prompt engineering, evaluation strategies, and automation tools.
  • Stay informed on emerging trends in LLM evaluation, automated quality assessment, and AI toolchains.
  • Continuously improve and expand automated evaluation processes based on industry best practices.
Qualifications
  • 5+ years of experience in ML engineering, NLP, or AI/ML automation.
  • Advanced degree (MS/PhD) in Statistics, Data Science, Computational Social Science, Quantitative Psychology, or a related field.
  • Hands-on experience in prompt engineering and designing LLM-based evaluation systems is preferred.
  • Strong understanding of machine learning principles with a focus on NLP and advanced LLM capabilities (e.g., Chain-of-Thought, agentic workflows).
  • Expertise in building automated evaluation or QA pipelines.
  • Excellent analytical and problem-solving skills with experience in root cause and error pattern analysis.
  • Proven project management and cross-functional collaboration experience.
  • Excellent communication skills to convey complex insights to technical and non-technical audiences.
  • Detail-oriented mindset with a focus on evaluation metrics, prompt design, and automation.
  • Ability to quickly adapt to new business rules and evaluation guidelines across diverse product domains.
  • Strong programming skills in Python and SQL.
  • Experience with big data technologies like PySpark for data aggregation and sampling is a strong plus.
We offer
  • Opportunity to work on cutting-edge projects
  • Work with a highly motivated and dedicated team
  • Competitive salary
  • Flexible schedule
  • Benefits package - medical insurance, vision, dental, etc.
  • Corporate social events
  • Professional development opportunities
  • Well-equipped office

About us


Grid Dynamics (NASDAQ: GDYN) is a leading provider of technology consulting, platform and product engineering, AI, and advanced analytics services. Fusing technical vision with business acumen, we solve the most pressing technical challenges and enable positive business outcomes for enterprise companies undergoing business transformation. A key differentiator for Grid Dynamics is our 8 years of experience and leadership in enterprise AI, supported by profound expertise and ongoing investment in data, analytics, cloud & DevOps, application modernization, and customer experience. Founded in 2006, Grid Dynamics is headquartered in Silicon Valley with offices across the Americas, Europe, and India.
Vacancy posted 1 day ago