Senior Software Engineer AI Evaluation & Benchmarks (Python) [Remote]

$80 - $100 per hour
Full-time

G2i

Miami, FL
  • Remote job

Before Applying

This role is open to contractors in accepted locations only. Please confirm that your country is on the list of accepted countries and locations before applying; we're unable to process applications from unlisted locations.

For US applicants: This is a 1099 independent contractor role. It is not compatible with F-1 OPT, STEM OPT, or any visa status that requires W-2 employment, guaranteed hours, or employer sponsorship. We are unable to provide offer letters or employment verification for this role.

What You'll Be Doing

Design and build the coding benchmarks and evaluation pipelines used to test frontier AI models on real software engineering work:

  • Design coding benchmarks that evaluate frontier models on real-world programming tasks — reasoning, debugging, and production-quality code

  • Build and maintain scalable data pipelines for evaluation workflows

  • Analyze model-generated code for correctness, reliability, and edge-case failures

  • Construct structured evaluation scenarios across large repos and multi-language environments

  • Provide detailed technical feedback on model performance and failure patterns

  • Contribute to evaluation frameworks that set the bar for how coding ability is measured

End result: benchmarks that meaningfully separate what frontier models can and can't do — and shape how the next generation is trained and improved.

AI coding evaluation in one line: Design task → build harness → run model → analyze failures → feed findings back into the benchmark → evaluations that actually distinguish strong models from weak ones.

What You'll Need

  • 4+ years of professional software engineering experience (non-negotiable)

  • Expert Python — clean, performant, well-tested code

  • Hands-on experience working in large, complex codebases

  • Proven experience designing and implementing LLM coding benchmarks and evaluation data pipelines

  • Strong command of Git and modern development workflows

  • Track record at a high-growth tech company or top-tier software organization

  • Strong written English communication

Identity verification: Applicants will be required to verify their identity and confirm they have valid documentation to work as an independent contractor in their country of residence.

Nice to have

  • Senior or Lead-level profile with a history of technical ownership

  • Bachelor's or Master's in CS, ML, or related field (or equivalent professional experience)

  • Proficiency in additional languages: JavaScript, Go, C++, or others

  • Experience with CI/CD and with writing robust unit tests (pytest, Mocha, JUnit)

  • Background in security engineering or significant open-source contributions

  • Familiarity with AI/ML evaluation methodologies or model benchmarking

Logistics

  • Location: Fully remote — work from anywhere on the accepted locations list

  • Compensation: $80–$100/hr based on location and seniority

  • Contract length: 3 months, with potential for extension

  • Hours: Full-time availability preferred — hours vary by project and are not guaranteed week to week

  • Engagement: 1099 independent contractor

  • Payment: Weekly via PayPal or Stripe

⚠️ Important: Hours are project-dependent and can vary week to week. We recommend keeping other work options open alongside this engagement rather than relying on it as your sole source of income.

Vacancy posted 15 hours ago