Senior Software Engineer for LLM Evaluation [Remote]
$50 per hour
SaidGig
- Remote job
This role focuses on creating innovative datasets for training and advancing large language models, working closely with researchers to enhance AI-driven coding solutions. As a Software Engineering evaluator, you will curate code examples, provide precise solutions, and make corrections, primarily using Python, while also engaging with JavaScript (including ReactJS), C/C++, Java, Rust, and Go.
Key Responsibilities:
- Curate code examples, build solutions, and correct code primarily in Python, with additional work in JavaScript (including ReactJS), C/C++, Java, Rust, and Go.
- Evaluate and refine AI-generated code to ensure efficiency, scalability, and reliability.
- Collaborate with cross-functional teams to measure and improve AI-driven coding solutions against industry performance benchmarks.
- Develop agents and automated verification tools in Python to verify code quality and identify error patterns.
- Form hypotheses about each stage of the software engineering lifecycle (prototyping, architecture design, API design, production implementation, launch, experiments, monitoring, operational maintenance) and evaluate model capabilities at each stage.
- Design verification mechanisms to automatically verify solutions to software engineering tasks.
Qualifications:
- Minimum of 3 years of software engineering experience.
- Strong expertise in Python, with deep knowledge of frameworks, tooling, and best practices for building production-grade software.
- Experience in building full-stack applications and deploying scalable software using modern languages and tools.
- Deep understanding of software architecture, design, development, debugging, and code quality/review assessment.
- Excellent oral and written communication skills for clear, structured evaluation rationales.
Work Terms:
- Flexible engagement with a minimum commitment of 10 hours per week, up to 40 hours per week.
- Contractor position (no medical benefits or paid leave).
- Initial duration of 1 month with potential extensions based on performance and fit.
- Candidates must be based in the United States.
Compensation:
Details regarding compensation will be discussed during the interview process.
Eligibility:
- The application process takes 15 to 30 minutes.
- Completion of an AI video interview is required.


