AI Benchmark Engineer | Native Language Specialist - German - Remote

Lilt

About The Opportunity

We are building a rigorous, verifiable evaluation suite of Terminal-Bench tasks designed to test the limits of large language models on multilingual software challenges. Our goal is to measure multilingual robustness across prompt language effects, non-English data processing, and complex locale/encoding edge cases in terminal workflows.

We are seeking experienced native-speaking software engineers to design, build, and validate these benchmarks. You will create high-signal, high-quality tasks that genuinely test a model's ability to handle multilingual environments without relying on English translation crutches.

Note: this is a remote, freelance opportunity.

What You'll Deliver
  • Task Engineering: Design tasks that evaluate coding agents.

  • Asset Creation: Build realistic task environments using datasets and files in your native language. Crucially, these assets must remain in the target language to genuinely measure multilingual handling.

  • Prompting & Translation: Find the failure points where AI models break down in your native language.

  • Implementation & Verification: Support the development of robust solutions (reference implementations) and write highly reliable, deterministic verifier scripts (using rubric-based judging only when strictly necessary).

  • Calibration & Execution: Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations against various model tiers (Haiku, Sonnet, Opus).

  • Quality Assurance: Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.

Qualifications
  • Experience: 1+ years of industry experience in software or prompt engineering.

  • Background: Proven track record at leading technology companies and/or graduation from top-tier engineering universities.

  • Language: Native or near-native fluency in German, with a deep understanding of its grammar, register, and phrasing rules. High English proficiency.

  • Technical Stack: Strong proficiency in Python, standard shell scripting, and data processing.

  • Workflow: Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.

  • Domain Expertise: Deep technical understanding of multilingual text processing pitfalls, including:

    • Encoding/decoding robustness and Unicode normalization.

    • Locale-dependent conventions (collation, casing, non-Gregorian dates).

    • Text I/O, toolchain interoperability, and safe string operations.

    • (For specific languages) Bidirectional/RTL handling, font fallbacks, and rendering/typography in UI or artifacts.

Why Collaborate with Lilt?
  • Your schedule, your rules. As an independent contractor, work when you want, as much or as little as you want. No fixed hours, no check-ins, no micromanaging.

  • Get paid quickly and fairly. We respect your time and your expertise. Competitive rates, prompt payments, no chasing invoices.

  • Work on projects that actually matter. Contribute to cutting-edge AI and language technology that is shaping how humans and machines communicate.

  • Be part of something bigger. Join a global community of linguists, subject matter experts, and language professionals who are advancing human knowledge together.

  • Grow without limits. As a Lilt contractor you get access to diverse, innovative projects that expand your portfolio and sharpen your skills across industries and domains.

  • Have fun doing what you love. Bring your language skills to life on projects that are as interesting as they are impactful.

How to Join Our Expert Community

1 - Submit your application, including an updated copy of your CV in English.

2 - Next, complete a GenAI assessment to evaluate your skills.

3 - Finalize onboarding and profile setup in our system, and become eligible for Applied AI projects.

AI is changing how the world communicates — and LILT is leading that transformation. LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join our global community who thrive on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to Enterprises, Governments, and AI Developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community—all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy at

At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know.

LILT is an equal opportunity employer. We extend equal opportunity to all individuals without regard to an individual's race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, physical or mental disability, medical condition, genetic characteristics, veteran or marital status, pregnancy, or any other classification protected by applicable local, state or federal laws. We are committed to the principles of fair employment and the elimination of all discriminatory practices.

Vacancy posted 20 hours ago