Remote Senior Software Engineer - LLM Evaluation (US-based)
Turing
About Us:
Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, and top AI researchers who specialize in software engineering, logical reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises transform AI from proof of concept into proprietary intelligence with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.
Ideal Background:
This role is ideal for engineers who have worked at the frontier of AI — at companies like OpenAI, NVIDIA, Databricks, Palantir, Snowflake, or similar organizations pushing the boundaries of intelligent systems. We especially welcome graduates from leading programs such as Harvard, Columbia, Princeton, Yale, University of Pennsylvania, and comparable institutions — though exceptional experience and skill always take precedence over pedigree.
Project Overview:
As a Software Engineering evaluator, you will create cutting-edge datasets for training, benchmarking, and advancing large language models, collaborating closely with researchers. This includes curating code examples, providing precise solutions, and making corrections in Python, C/C++, Rust, Go, Java, and JavaScript (including ReactJS) — with particular emphasis on systems-level code, performance-critical applications, and infrastructure. You will evaluate and refine AI-generated code for efficiency, scalability, and reliability, and work with cross-functional teams to enhance enterprise-level AI-driven coding solutions.
What Does a Typical Day Look Like?
- Work on AI model training initiatives by curating code examples, building solutions, and correcting code in Python, C/C++, Rust, Go, Java, and JavaScript (including ReactJS).
- Evaluate and refine AI-generated code with an emphasis on systems-level correctness, performance, and reliability.
- Collaborate with cross-functional teams to enhance AI-driven coding solutions against industry performance benchmarks.
- Build agents that can verify the quality of systems-level and infrastructure code and identify error patterns.
- Form hypotheses about each step of the software engineering cycle (prototyping, architecture design, API design, production implementation, launch, experiments, monitoring, operational maintenance) and evaluate model capabilities on each one.
- Design verification mechanisms that can automatically verify a solution to a software engineering task.
Required Skills:
- 3 or more years of software engineering experience.
- Strong expertise in systems programming, infrastructure, or backend development using languages like Python, C/C++, Rust, and Go.
- Experience building and deploying scalable, production-grade software using modern languages and tools.
- Deep understanding of software architecture, design, development, debugging, and code quality/review assessment.
- Excellent oral and written communication skills for clear, structured evaluation rationales.
Engagement Details:
- Commitment: flexible engagement, minimum 10 hrs/week, up to 40 hrs/week
- Type: Contractor (no medical/paid leave)
- Duration: 1 month (potential extensions based on performance and fit)
- Location: Candidates must be based in the United States
Evaluation Process:
- The application process takes 15–30 minutes.
- Completion of an AI video interview is required.
Note: As part of assessments you will go through an AI video interview.
After applying, you will receive an email with a login link. Please use that link to access the portal and complete your profile.
Know amazing talent? Refer them at turing.com/referrals, and earn money from your network.

