Senior Software Engineer for LLM Evaluation & Repository Validation

$40 per hour

SaidGig

Role Overview: In this role you will help build LLM evaluation and training datasets aimed at solving realistic software engineering challenges. You will create verifiable software engineering tasks from public repository histories, using a synthetic approach with human-in-the-loop methodologies, and expand dataset coverage across programming languages and difficulty levels.

Key Responsibilities:

  • Analyze and triage GitHub issues across trending open-source libraries.
  • Set up and configure code repositories, including Dockerization and environment setup.
  • Evaluate unit test coverage and quality.
  • Modify and run codebases locally to assess LLM performance in bug-fixing scenarios.
  • Collaborate with researchers to design and identify repositories and issues that present challenges for LLMs.
  • Lead a team of junior engineers on collaborative projects.

Qualifications:

  • Minimum of 3 years of overall experience.
  • Strong experience with C#.
  • Proficiency in Git, Docker, and basic software pipeline setup.
  • Ability to understand and navigate complex codebases.
  • Comfortable running, modifying, and testing real-world projects locally.
  • Experience contributing to or evaluating open-source projects is a plus.

Nice to Have:

  • Previous participation in LLM research or evaluation projects.
  • Experience building or testing developer tools or automation agents.

Work Terms:

  • Commitment Required: At least 4 hours per day and 20 hours per week, with 4 hours of overlap with PST. Time commitment options are 20, 30, or 40 hours per week.
  • Employment Type: Contractor assignment (no medical/paid leave).
  • Location: Open to candidates from India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, and Mexico.

Compensation: Competitive, commensurate with experience.

Eligibility:

  • Must have the legal right to work in the specified locations.

Evaluation Process:

  • Two rounds of interviews (60 minutes technical + 30 minutes technical & cultural discussion).

Why Join Us? This role places you at the forefront of evaluating how LLMs interact with real code, significantly influencing the future of AI-assisted software development. Join a fast-growing AI company and blend practical software engineering with AI research.

Vacancy posted 14 days ago
