AI Benchmark Engineer | Native Language Specialist - Serbian - Remote

Lilt

New York, NY
  • Remote job

We are building a rigorous, verifiable evaluation suite of Terminal-Bench tasks designed to test the limits of large language models on multilingual software challenges. Our goal is to measure multilingual robustness across prompt-language effects, non-English data processing, and complex locale/encoding edge cases in terminal workflows.

We are seeking experienced native-speaking software engineers to design, build, and validate these benchmarks. You will create high-signal, high-quality tasks that genuinely test a model's ability to handle multilingual environments without relying on English translation crutches.

Note: this is a remote, freelance opportunity.

Key Responsibilities
  • Task Engineering: Evaluate coding agents.
  • Asset Creation: Build realistic task environments using datasets and files in your native language. Crucially, these assets must remain in the target language to genuinely measure multilingual handling.
  • Prompting & Translation: Find failure points where AI does not work in your native language.
  • Implementation & Verification: Support the development of robust solutions (reference implementations) and write highly reliable, deterministic verifier scripts, using rubric-based judging only when strictly necessary.
  • Calibration & Execution: Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations against various model tiers (Haiku, Sonnet, Opus).
  • Quality Assurance: Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.

Required Qualifications
  • Experience: 5+ years of industry experience in software engineering.
  • Background: Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
  • Language: Native or near-native fluency in the target language, with a deep understanding of its grammar, register, and phrasing rules. High English proficiency.
  • Technical Stack: Strong proficiency in Python, standard shell scripting, and data processing.
  • Workflow: Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.
  • Domain Expertise: Deep technical understanding of multilingual text processing pitfalls, including:
    • Encoding/decoding robustness and Unicode normalization.
    • Locale-dependent conventions (collation, casing, non-Gregorian dates).
    • Text I/O, toolchain interoperability, and safe string operations.
    • Bidirectional/RTL handling, font fallbacks, and rendering/typography in UI or artifacts.

If interested, please submit your application, including a current copy of your CV in English.

AI is changing how the world communicates, and LILT is leading that transformation. LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join our global community of people who thrive on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to Enterprises, Governments, and AI Developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community, all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy.

At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know at the email address provided on click.appcast.io.

LILT is an equal opportunity employer. We extend equal opportunity to all individuals without regard to an individual's race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, physical or mental disability, medical condition, genetic characteristics, veteran or marital status, pregnancy, or any other classification protected by applicable local, state, or federal laws. We are committed to the principles of fair employment and the elimination of all discriminatory practices.

Vacancy posted 5 days ago
