AI Benchmark Engineer | Native Language Specialist - Serbian - Remote
Lilt
We are building a rigorous, verifiable evaluation suite of Terminal-Bench tasks designed to test the limits of large language models on multilingual software challenges. Our goal is to measure multilingual robustness across prompt language effects, non-English data processing, and complex locale/encoding edge cases in terminal workflows.

We are seeking experienced native-speaking software engineers to design, build, and validate these benchmarks. You will create high-signal, high-quality tasks that genuinely test a model's ability to handle multilingual environments without relying on English translation crutches.

Note: this is a remote, freelance opportunity.

Key Responsibilities
- Task Engineering: Evaluate coding agents.
- Asset Creation: Build realistic task environments using datasets and files in your native language. Crucially, these assets must remain in the target language to genuinely measure multilingual handling.
- Prompting & Translation: Find the failure points where AI models break down in your native language.
- Implementation & Verification: Support the development of robust solutions (reference implementations) and write highly reliable, deterministic verifier scripts (using rubric-based judging only when strictly necessary).
- Calibration & Execution: Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations against various model tiers (Haiku, Sonnet, Opus).
- Quality Assurance: Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.

Required Qualifications
- Experience: 5+ years of industry experience in software engineering.
- Background: Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
- Language: Native or near-native fluency in the target language (Serbian), with a deep understanding of its grammar, register, and phrasing rules. High English proficiency.
- Technical Stack: Strong proficiency in Python, standard shell scripting, and data processing.
- Workflow: Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.
- Domain Expertise: Deep technical understanding of multilingual text processing pitfalls, including:
  - Encoding/decoding robustness and Unicode normalization.
  - Locale-dependent conventions (collation, casing, non-Gregorian dates).
  - Text I/O, toolchain interoperability, and safe string operations.
  - Bidirectional/RTL handling, font fallbacks, and rendering/typography in UI or artifacts.

If interested, please submit your application, including the latest copy of your CV in English.

AI is changing how the world communicates, and LILT is leading that transformation. LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join our global community of people who thrive on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to Enterprises, Governments, and AI Developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community, all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy.

At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis.
These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know.

LILT is an equal opportunity employer. We extend equal opportunity to all individuals without regard to an individual’s race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, physical or mental disability, medical condition, genetic characteristics, veteran or marital status, pregnancy, or any other classification protected by applicable local, state or federal laws. We are committed to the principles of fair employment and the elimination of all discriminatory practices.

