ML Lead, AI Data Labeling
NewtonX
About NewtonX
NewtonX is a B2B insights company trusted by the world's most innovative companies to make high-stakes decisions with confidence. We combine a verified network of business professionals with AI-powered research tools to deliver research intelligence that is faster, more precise, and more defensible than traditional methods. Our clients include Google, Microsoft, TikTok, DoorDash, Stripe, and Coinbase. Our research has been cited by Fortune, Forbes, TechCrunch, Adweek, and the Wall Street Journal. NewtonX has raised $47M from investors including Two Sigma Ventures, Third Prime, XFund, and Citi Ventures.
About the Role
AI buyers have changed. From mid-market SaaS companies fine-tuning open-source models to Fortune 500 enterprises building internal AI platforms to frontier AI labs running large-scale evaluations, the question is no longer “is AI useful” but “how do we evaluate whether our AI works?” Every one of these buyers needs structured, expert-grounded evaluation data and domain-specific benchmarks. Almost none of them can build it themselves.
That is the opportunity as ML Lead. Rolling up directly to the VP of Commercial, you are the technical counterpart to ML and product teams across our client base, spanning growth-stage AI companies, enterprise AI platforms, and frontier research labs. You sit in their working sessions, hold your ground on technical specifics (eval design, statistical significance, contamination concerns, inter-annotator reliability), translate what they actually need into concrete operational specs, and partner with our recruiting and ops lead to build the expert pipelines that produce defensible data.
You also build. Beyond bespoke client work, you own the design and development of NewtonX domain benchmarks across high-value verticals (finance, legal, healthcare, and others as we expand). These become both syndicated products and methodological proof points that move us up the client sophistication curve.
And you sell, lightly but meaningfully. You are on client calls. You hear gaps. You spot opportunities other vendors miss. You bring those back, shape them into pitches, and partner with Commercial to expand accounts.
In this role you'll focus on:
Client Technical Partnership
Serve as the primary technical point of contact for ML, applied science, and product teams at our AI-focused clients across the maturity spectrum, from emerging AI companies to enterprise platforms to frontier labs.
Hold your own in technical conversations: eval design, dataset construction, contamination risk, statistical power, inter-annotator agreement, RLHF data quality, agentic evaluation, red-teaming methodology.
Translate ambiguous technical requirements into concrete operational specs: target expert profiles, screener trees, task design, annotation rubrics, quality control protocols, statistical sampling plans.
Calibrate depth to the audience. A Series B AI startup and a frontier lab need different conversations. You can run both.
Domain Benchmark Development
Design and build domain benchmarks for NewtonX-owned domains in high-value verticals. Initial targets: finance (markets, accounting, regulatory), legal (contracts, case reasoning, jurisdictional), healthcare (clinical reasoning, diagnostic, regulatory). Additional verticals as the business expands.
Architect benchmark structure: task taxonomy, difficulty distribution, expert involvement model, evaluation rubrics, scoring protocols, baseline scoring against frontier models.
Recruit and calibrate the domain experts who write, validate, and grade benchmark tasks. Work with our recruiting and ops lead to operationalize at scale.
Publish methodology papers, technical reports, and leaderboards that make NewtonX benchmarks the reference standard in their verticals.
Operationalization with NewtonX Recruiting and Ops
Work directly with our full-time recruiting and operations lead to convert client and benchmark requirements into operational specs: expert profiles, screeners, task interfaces, annotation workflows, QC sampling rates, and fielding timelines.
Calibrate the recruiting team on what “good” looks like for each engagement. Run alignment sessions when standards shift.
Own the technical feedback loop: when an expert clears screening, but their output is unusable, diagnose whether it is a screener problem, a task-design problem, or a calibration problem, and fix it upstream.
Define quality control metrics: inter-annotator agreement targets, gold-standard task injection rates, and statistical power thresholds. Hold the team accountable to them.
Commercial Partnership and Account Expansion
Sit in client calls alongside Commercial leads. Surface technical gaps and unsolved problems that the client has not yet asked us to address.
Translate gaps into concrete proposal narratives: scope, methodology, deliverables, defensibility. Hand off to Commercial for pricing and close.
Contribute to NewtonX positioning with AI buyers: case studies, technical blog posts, conference presence at applied AI and industry events.
Help shape what additional ML and research roles we hire as the AI account book and benchmark program grow.
Who you are:
Required
5 to 8 years of applied ML experience with substantive evaluation, benchmark, or human data work. Examples of strong backgrounds: applied scientist or ML engineer who owned an eval or human data workstream; ML lead at an AI-forward enterprise who built domain-specific evaluation systems; research engineer at an AI consultancy or evaluation firm; quantitative researcher who pivoted into LLM evaluation.
Working fluency with modern LLM evaluation: benchmark design, contamination handling, statistical significance, eval harness construction, agentic and tool-use evaluation, RLHF and preference data quality, red-team probe design. You do not need to have built every one of these, but you should be conversant across them.
Strong programming foundation. You can read and reason about an eval harness, write Python comfortably, work with model APIs, and prototype scoring pipelines. You do not need to be a production engineer, but you should not be hands-off either.
Statistical fluency. You know when an effect is real and when it is noise. You can defend a sample size choice or a significance threshold.
Demonstrated client-facing presence. You have presented technical work to skeptical audiences, defended design choices in real time, and adjusted scope without losing rigor. Range matters here: you can talk to a Series B CTO and a Fortune 100 AI lead in the same week.
Light commercial instinct. You hear a client describe a problem, and your first reaction is, “We could solve that. Here is how.” You are comfortable shaping that into a pitch. You do not need to close, but you need to spot.
Strong written communication. You can write a methodology section, a benchmark technical report, or a client proposal that holds up to expert review.
Strongly Preferred
Direct experience designing or contributing to an LLM benchmark or evaluation system (academic, open-source, or proprietary).
Domain depth in one or more of: finance, legal, healthcare, scientific reasoning, or software engineering. Bonus for two or more.
Exposure to expert-driven data work: RLHF pipelines, preference data collection, expert annotation programs, red-team operations, and evaluation contractor management.
Graduate degree in computer science, machine learning, statistics, or a related quantitative field. A strong applied track record can serve as a substitute.
Publications or open-source contributions in evaluation, benchmarking, or applied ML methodology.
How we will evaluate
Technical screen: deep dive on a benchmark, eval system, or data pipeline you have built or contributed to. We will probe design choices, statistical reasoning, and what you would do differently.
Take-home exercise: We give you a real (anonymized) client problem and ask you to design an evaluation or benchmark, including task design, expert profile, sampling plan, scoring methodology, and quality-control protocol.
Live working session: walk through your take-home as if you were defending it to a client ML lead. We will push back. We want to see how you handle it.
Domain benchmark thought exercise: pick a vertical (finance, legal, or healthcare) and sketch what a defensible domain benchmark in that area would look like.
Cross-functional interviews with Commercial, our recruiting and ops lead, and senior leadership to assess collaboration, communication, and commercial instinct.
If the profile above describes you and your passions, we'd love to hear from you!
What we offer
Massive Impact: Opportunity to have an astounding impact, build a brand new business unit from the ground up, and have direct C-level influence at an extremely fast-growing late-stage startup.
Fast-track career growth: This foundational role will enable you to progress quickly within NewtonX towards commercial and operational leadership.
Comprehensive Benefits: Excellent medical, dental, and vision insurance.
Retirement: 401k match with immediate vesting.
Perks: Health savings/flexible savings account, and pre-tax commuter benefits.
Work-Life Balance: Paid time off: vacation, holidays, sick, and parental leave.
Great Culture: A diverse, collaborative, and positive culture where we invest in and celebrate each other's success (happy hours, team projects, and retreats).
Visa sponsorship is not available for this role.
NewtonX is proud to be an equal opportunity workplace. We do not discriminate based upon race, religion, color, national origin, sex, sexual orientation, gender identity/expression, age, status as a protected veteran, status as an individual with a disability, or any other applicable legally protected characteristics.

