AI Benchmark Engineer | Native Language Specialist - German - Remote
Lilt
About The Opportunity
We are building a rigorous, verifiable evaluation suite of Terminal-Bench tasks designed to test the limits of large language models on multilingual software challenges. Our goal is to measure multilingual robustness across prompt language effects, non-English data processing, and complex locale/encoding edge cases in terminal workflows.
We are seeking experienced native-speaking software engineers to design, build, and validate these benchmarks. You will create high-signal, high-quality tasks that genuinely test a model's ability to handle multilingual environments without relying on English translation crutches.
Note: this is a remote, freelance opportunity.
What You'll Deliver
Task Engineering: Design and build tasks that evaluate coding agents.
Asset Creation: Build realistic task environments using datasets and files in your native language. Crucially, these assets must remain in the target language to genuinely measure multilingual handling.
Prompting & Translation: Write prompts in your native language and find the failure points where AI models break down.
Implementation & Verification: Support the development of robust solutions (reference implementations) and write highly reliable, deterministic verifier scripts (using rubric-based judging only when strictly necessary).
Calibration & Execution: Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations against various model tiers (Haiku, Sonnet, Opus).
Quality Assurance: Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.
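For illustration only, a deterministic verifier of the kind described above might look like the following minimal sketch. The function name `verify_output` and the sample data are hypothetical and are not part of Terminal-Bench or any Lilt tooling; the sketch assumes a task whose output is a UTF-8 text file that must match expected lines exactly.

```python
import unicodedata


def verify_output(raw: bytes, expected_lines: list[str]) -> bool:
    """Hypothetical verifier: True only if raw is valid UTF-8 and matches expected_lines."""
    try:
        # Strict decoding: output written as Latin-1 or cp1252 fails here.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Compare in NFC so composed and decomposed umlauts are treated alike.
    actual = [unicodedata.normalize("NFC", line) for line in text.splitlines()]
    expected = [unicodedata.normalize("NFC", line) for line in expected_lines]
    return actual == expected


good = "Äpfel\nÖl\n".encode("utf-8")
bad = "Äpfel\nÖl\n".encode("latin-1")
print(verify_output(good, ["Äpfel", "Öl"]))  # True
print(verify_output(bad, ["Äpfel", "Öl"]))   # False
```

The check is deterministic because it depends only on the bytes produced and the expected lines, not on the system locale or on a model-based judge.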
Qualifications
Experience: 1+ years of industry experience in software or prompt engineering.
Background: Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
Language: Native or near-native fluency in German, with a deep understanding of its grammar, register, and phrasing rules, plus high English proficiency.
Technical Stack: Strong proficiency in Python, standard shell scripting, and data processing.
Workflow: Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.
Domain Expertise: Deep technical understanding of multilingual text processing pitfalls, including:
Encoding/decoding robustness and Unicode normalization.
Locale-dependent conventions (collation, casing, non-Gregorian dates).
Text I/O, toolchain interoperability, and safe string operations.
(For specific languages) Bidirectional/RTL handling, font fallbacks, and rendering/typography in UI or artifacts.
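As a rough illustration of the normalization and casing pitfalls listed above, here is a minimal Python sketch using German examples. The helper name `normalize_key` is hypothetical and chosen only for this example; it uses the standard-library `unicodedata` module.

```python
import unicodedata


def normalize_key(text: str) -> str:
    """Return a comparison key: NFC-normalized, then case-folded."""
    return unicodedata.normalize("NFC", text).casefold()


composed = "M\u00fcller"     # "ü" as a single code point (U+00FC)
decomposed = "Mu\u0308ller"  # "u" followed by a combining diaeresis (U+0308)

# The two strings render identically but differ code point by code point.
print(composed == decomposed)                                # False
print(normalize_key(composed) == normalize_key(decomposed))  # True

# lower() keeps "ß" as-is, while casefold() maps it to "ss".
print("STRASSE".lower() == "straße".lower())                 # False
print("STRASSE".casefold() == "straße".casefold())           # True
```

A task built around pitfalls like these would keep the German data as-is and check whether a model handles both forms correctly.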
Why Collaborate with Lilt?
Your schedule, your rules. As an independent contractor, work when you want, as much or as little as you want. No fixed hours, no check-ins, no micromanaging.
Get paid quickly and fairly. We respect your time and your expertise. Competitive rates, prompt payments, no chasing invoices.
Work on projects that actually matter. Contribute to cutting-edge AI and language technology that is shaping how humans and machines communicate.
Be part of something bigger. Join a global community of linguists, subject matter experts, and language professionals who are advancing human knowledge together.
Grow without limits. As a Lilt contractor you get access to diverse, innovative projects that expand your portfolio and sharpen your skills across industries and domains.
Have fun doing what you love. Bring your language skills to life on projects that are as interesting as they are impactful.
How to Join Our Expert Community
1 - Submit your application, including an updated copy of your CV in English.
2 - Next, complete a GenAI assessment to evaluate your skills.
3 - Finalize onboarding and profile set-up in our system, and become eligible for Applied AI projects.
AI is changing how the world communicates, and LILT is leading that transformation. LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join our global community of people who thrive on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to Enterprises, Governments, and AI Developers around the world.
Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community—all through a streamlined application process tailored to your expertise.
Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy at
At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know.
LILT is an equal opportunity employer. We extend equal opportunity to all individuals without regard to an individual's race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, physical or mental disability, medical condition, genetic characteristics, veteran or marital status, pregnancy, or any other classification protected by applicable local, state or federal laws. We are committed to the principles of fair employment and the elimination of all discriminatory practices.

