Senior Software Engineer for LLM Evaluation & Repository Validation
$40 per hour
SaidGig
Role Overview: In this role you will help build LLM evaluation and training datasets aimed at realistic software engineering challenges. You will create verifiable software engineering tasks from public repository histories, using a synthetic approach with a human in the loop, and expand dataset coverage across programming languages and difficulty levels.
Key Responsibilities:
- Analyze and triage GitHub issues across trending open-source libraries.
- Set up and configure code repositories, including Dockerization and environment setup.
- Evaluate unit test coverage and quality.
- Modify and run codebases locally to assess LLM performance in bug-fixing scenarios.
- Collaborate with researchers to design and identify repositories and issues that present challenges for LLMs.
- Lead a team of junior engineers on collaborative projects.
Qualifications:
- At least 3 years of overall experience.
- Strong experience with C#.
- Proficiency in Git, Docker, and basic software pipeline setup.
- Ability to understand and navigate complex codebases.
- Comfortable running, modifying, and testing real-world projects locally.
- Experience contributing to or evaluating open-source projects is a plus.
Nice to Have:
- Previous participation in LLM research or evaluation projects.
- Experience building or testing developer tools or automation agents.
Work Terms:
- Commitments Required: At least 4 hours per day and a minimum of 20 hours per week, with 4 hours of overlap with PST. Time commitment options are 20, 30, or 40 hrs/week.
- Employment Type: Contractor assignment (no medical/paid leave).
- Location: Open to candidates from India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, and Mexico.
Compensation: Competitive compensation commensurate with experience.
Eligibility:
- Must have the legal right to work in the specified locations.
Evaluation Process:
- Two rounds of interviews (60 minutes technical + 30 minutes technical & cultural discussion).
Why Join Us? This role places you at the forefront of evaluating how LLMs interact with real code, significantly influencing the future of AI-assisted software development. Join a fast-growing AI company and blend practical software engineering with AI research.


