ML Lead, AI Data Labeling

NewtonX

About NewtonX

NewtonX is a B2B insights company trusted by the world's most innovative companies to make high-stakes decisions with confidence. We combine a verified network of business professionals with AI-powered research tools to deliver research intelligence faster, more precise, and more defensible than traditional methods. Our clients include Google, Microsoft, TikTok, DoorDash, Stripe, and Coinbase. Our research has been cited by Fortune, Forbes, TechCrunch, Adweek, and the Wall Street Journal. NewtonX has raised $47M from investors including Two Sigma Ventures, Third Prime, XFund, and Citi Ventures.

About the Role

AI buyers have changed. From mid-market SaaS companies fine-tuning open-source models to Fortune 500 enterprises building internal AI platforms to frontier AI labs running large-scale evaluations, the question is no longer “is AI useful” but “how do we evaluate whether our AI works?” Every one of these buyers needs structured, expert-grounded evaluation data and domain-specific benchmarks. Almost none of them can build it themselves. That is the opportunity as ML Lead.

Rolling up directly to the VP of Commercial, you are the technical counterpart to ML and product teams across our client base, spanning growth-stage AI companies, enterprise AI platforms, and frontier research labs. You sit in their working sessions, hold your ground on technical specifics (eval design, statistical significance, contamination concerns, inter-annotator reliability), translate what they actually need into concrete operational specs, and partner with our recruiting and ops lead to build the expert pipelines that produce defensible data.

You also build. Beyond bespoke client work, you own the design and development of NewtonX domain benchmarks across high-value verticals (finance, legal, healthcare, and others as we expand). These become both syndicated products and methodological proof points that move us up the client sophistication curve.

And you sell, lightly but meaningfully. You are on client calls. You hear gaps. You spot opportunities other vendors miss. You bring those back, shape them into pitches, and partner with Commercial to expand accounts.

In this role you'll focus on:

Client Technical Partnership
  • Serve as the primary technical point of contact for ML, applied science, and product teams at our AI-focused clients across the maturity spectrum, from emerging AI companies to enterprise platforms to frontier labs.
  • Hold your own in technical conversations: eval design, dataset construction, contamination risk, statistical power, inter-annotator agreement, RLHF data quality, agentic evaluation, red-teaming methodology.
  • Translate ambiguous technical requirements into concrete operational specs: target expert profiles, screener trees, task design, annotation rubrics, quality control protocols, statistical sampling plans.
  • Calibrate depth to the audience. A Series B AI startup and a frontier lab need different conversations. You can run both.

Domain Benchmark Development
  • Design and build domain benchmarks for NewtonX-owned domains in high-value verticals. Initial targets: finance (markets, accounting, regulatory), legal (contracts, case reasoning, jurisdictional), healthcare (clinical reasoning, diagnostic, regulatory). Additional verticals as the business expands.
  • Architect benchmark structure: task taxonomy, difficulty distribution, expert involvement model, evaluation rubrics, scoring protocols, baseline scoring against frontier models.
  • Recruit and calibrate the domain experts who write, validate, and grade benchmark tasks. Work with our recruiting and ops lead to operationalize at scale.
  • Publish methodology papers, technical reports, and leaderboards that make NewtonX benchmarks the reference standard in their verticals.

Operationalization with NewtonX Recruiting and Ops
  • Work directly with our full-time recruiting and operations lead to convert client and benchmark requirements into operational specs: expert profiles, screeners, task interfaces, annotation workflows, QC sampling rates, and fielding timelines.
  • Calibrate the recruiting team on what “good” looks like for each engagement. Run alignment sessions when standards shift.
  • Own the technical feedback loop: when an expert clears screening, but their output is unusable, diagnose whether it is a screener problem, a task-design problem, or a calibration problem, and fix it upstream.
  • Define quality control metrics: inter-annotator agreement targets, gold-standard task injection rates, and statistical power thresholds. Hold the team accountable to them.

Commercial Partnership and Account Expansion
  • Sit in client calls alongside Commercial leads. Surface technical gaps and unsolved problems that the client has not yet asked us to address.
  • Translate gaps into concrete proposal narratives: scope, methodology, deliverables, defensibility. Hand off to Commercial for pricing and close.
  • Contribute to NewtonX positioning with AI buyers: case studies, technical blog posts, conference presence at applied AI and industry events.
  • Help shape what additional ML and research roles we hire as the AI account book and benchmark program grow.

Who you are:

Required
  • 5 to 8 years of applied ML experience with substantive evaluation, benchmark, or human data work. Examples of strong backgrounds: applied scientist or ML engineer who owned an eval or human data workstream; ML lead at an AI-forward enterprise who built domain-specific evaluation systems; research engineer at an AI consultancy or evaluation firm; quantitative researcher who pivoted into LLM evaluation.
  • Working fluency with modern LLM evaluation: benchmark design, contamination handling, statistical significance, eval harness construction, agentic and tool-use evaluation, RLHF and preference data quality, red-team probe design. You do not need to have built every one of these, but you should be conversant across them.
  • Strong programming foundation. You can read and reason about an eval harness, write Python comfortably, work with model APIs, and prototype scoring pipelines. You do not need to be a production engineer, but you should not be hands-off either.
  • Statistical fluency. You know when an effect is real and when it is noise. You can defend a sample size choice or a significance threshold.
  • Demonstrated client-facing presence. You have presented technical work to skeptical audiences, defended design choices in real time, and adjusted scope without losing rigor. Range matters here: you can talk to a Series B CTO and a Fortune 100 AI lead in the same week.
  • Light commercial instinct. You hear a client describe a problem, and your first reaction is, “We could solve that. Here is how.” You are comfortable shaping that into a pitch. You do not need to close, but you need to spot.
  • Strong written communication. You can write a methodology section, a benchmark technical report, or a client proposal that holds up to expert review.

Strongly Preferred
  • Direct experience designing or contributing to an LLM benchmark or evaluation system (academic, open-source, or proprietary).
  • Domain depth in one or more of: finance, legal, healthcare, scientific reasoning, and software engineering. Bonus for two or more.
  • Exposure to expert-driven data work: RLHF pipelines, preference data collection, expert annotation programs, red-team operations, and evaluation contractor management.
  • Graduate degree in computer science, machine learning, statistics, or a related quantitative field. A strong applied track record can serve as a substitute.
  • Publications or open-source contributions in evaluation, benchmarking, or applied ML methodology.

How we will evaluate
  • Technical screen: deep dive on a benchmark, eval system, or data pipeline you have built or contributed to. We will probe design choices, statistical reasoning, and what you would do differently.
  • Take-home exercise: We give you a real (anonymized) client problem and ask you to design an evaluation or benchmark, including task design, expert profile, sampling plan, scoring methodology, and quality-control protocol.
  • Live working session: walk through your take-home as if you were defending it to a client ML lead. We will push back. We want to see how you handle it.
  • Domain benchmark thought exercise: pick a vertical (finance, legal, or healthcare) and sketch what a defensible domain benchmark in that area would look like.
  • Cross-functional interviews with Commercial, our recruiting and ops lead, and senior leadership to assess collaboration, communication, and commercial instinct.

If the profile above describes you and your passions, we'd love to hear from you!

What we offer
  • Massive Impact: Opportunity to have an astounding impact, build a brand new business unit from the ground up, and have direct C-level influence at an extremely fast-growing late-stage startup.
  • Fast-track career growth: This foundational role will enable you to progress quickly within NewtonX towards commercial and operational leadership.
  • Comprehensive Benefits: Excellent medical, dental, and vision insurance.
  • Retirement: 401k match with immediate vesting.
  • Perks: Health savings/flexible savings account, and pre-tax commuter benefits.
  • Work-Life Balance: Paid time off: vacation, holidays, sick, and parental leave.
  • Great Culture: A diverse, collaborative, and positive culture where we invest in and celebrate each other's success (happy hours, team projects, and retreats).

Visa sponsorship is not available for this role.

NewtonX is proud to be an equal opportunity workplace.
We do not discriminate based upon race, religion, color, national origin, sex, sexual orientation, gender identity/expression, age, status as a protected veteran, status as an individual with a disability, or any other applicable legally protected characteristics.

Vacancy posted 1 day ago