AI Benchmark Engineer | Native Language Specialist - Serbian - Remote
Lilt
- Remote job
We are building a rigorous, verifiable evaluation suite of Terminal-Bench tasks designed to test the limits of large language models on multilingual software challenges. Our goal is to measure multilingual robustness across prompt language effects, non-English data processing, and complex locale/encoding edge cases in terminal workflows.

We are seeking experienced native-speaking software engineers to design, build, and validate these benchmarks. You will create high-signal, high-quality tasks that genuinely test a model's ability to handle multilingual environments without relying on English translation crutches.

Note: this is a remote, freelance opportunity.

Key Responsibilities
- Task Engineering: Evaluate coding agents.
- Asset Creation: Build realistic task environments using datasets and files in your native language. Crucially, these assets must remain in the target language to genuinely measure multilingual handling.
- Prompting & Translation: Find failure points where AI breaks down in your native language.
- Implementation & Verification: Support the development of robust solutions (reference implementations) and write highly reliable, deterministic verifier scripts, using rubric-based judging only when strictly necessary.
- Calibration & Execution: Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations against various model tiers (Haiku, Sonnet, Opus).
- Quality Assurance: Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.

Required Qualifications
- Experience: 5+ years of industry experience in software engineering.
- Background: Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
- Language: Native or near-native fluency in the target language (Serbian), with a deep understanding of its grammar, register, and phrasing rules. High English proficiency.
- Technical Stack: Strong proficiency in Python, standard shell scripting, and data processing.
- Workflow: Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.
- Domain Expertise: Deep technical understanding of multilingual text processing pitfalls, including:
  - Encoding/decoding robustness and Unicode normalization.
  - Locale-dependent conventions (collation, casing, non-Gregorian dates).
  - Text I/O, toolchain interoperability, and safe string operations.
  - Bidirectional/RTL handling, font fallbacks, and rendering/typography in UI or artifacts.

If interested, please submit your application, including the latest copy of your CV in English.

AI is changing how the world communicates, and LILT is leading that transformation. LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join our global community of people who thrive on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to Enterprises, Governments, and AI Developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community, all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy.

At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know.

LILT is an equal opportunity employer. We extend equal opportunity to all individuals without regard to an individual’s race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, physical or mental disability, medical condition, genetic characteristics, veteran or marital status, pregnancy, or any other classification protected by applicable local, state or federal laws. We are committed to the principles of fair employment and the elimination of all discriminatory practices.


