AI Evaluation Engineer | Shell/Bash, Docker & Linux

Biz Tech Analytics

Biz Tech Analytics is looking for a skilled professional to review and validate AI benchmark tasks in real-world repositories. The role involves running containerized test suites, verifying patches and solutions, and debugging flaky tests to ensure quality standards. The ideal candidate will have 3–10 years of experience in production software, strong skills in Docker and Linux, and the ability to navigate large codebases while delivering clear feedback. This position offers the opportunity to work on cutting-edge AI solutions.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the AI Evaluation Engineer | Shell/Bash, Docker & Linux in New York, NY vacancy
  • Biz Tech Consultants is seeking an AI Evaluation Engineer to work part-time. The role involves reviewing...  ...tasks in Python repositories, running Docker-based test suites, and debugging flaky...  ...skills in Python, Docker, pytest, and Linux. Educational qualifications include B.Tech... 
    Linux
    Remote job
    Part time
    Freelance
    Work from home

    Biz Tech Consultants

    New York, NY
    2 days ago
  • AI Evaluation Engineer Shell / Bash Scripting Biz-Tech Analytics offers specialized, human‑in‑the‑loop annotation...  .../HCL repos: run Terratest in Docker (mocked providers), verify patches/oracles...  ...& correctness. Strong HCL + Linux skills required. Employment Type: Part... 
    Linux
    Part time
    Freelance

    Biz Tech Analytics

    New York, NY
    5 days ago
  •  ...Analytics is looking for a skilled candidate to review and validate AI benchmark tasks in real-world repositories. Responsibilities...  ...has 3-10 years of production software experience, hands-on Docker and Linux skills, and a strong ability to navigate large codebases with... 
    Linux

    Biz Tech Analytics

    New York, NY
    4 days ago
  •  ...Tech Analytics is looking for an experienced software engineer based in the United States to validate AI benchmark tasks across various repositories. The candidate should have a robust background with Docker and Linux, able to run containerized test suites and ensure... 
    Linux

    Biz Tech Analytics

    New York, NY
    4 days ago
  • Review and validate AI benchmark tasks in real‑world repositories. Run containerized test suites, verify patches and solutions, debug...  ..., reproducibility, and correctness. Require strong CLI, Docker, Linux skills. Required Candidate profile 3-10 years in production software... 
    Linux

    Biz Tech Analytics

    New York, NY
    2 days ago
  • Review & validate AI benchmark tasks in C++/Rust repositories. Run containerized builds...  .... Strong knowledge of build systems and Linux is required. Educational Requirements...  ...specialization, B.Tech / B.E. in Computer Science and Engineering (CSE) PG: MCA in any specialization, M.... 
    Linux
    Freelance

    Biz Tech Consultants

    New York, NY
    4 days ago
  •  ...Homebased Responsibilities Review AI benchmark tasks in Dart/...  ...repositories. Run Dart tests in Docker containers. Verify patches or...  ...with Dart async and Linux environments. Proficiency in...  ...B.E. in Computer Science and Engineering (CSE); PG - M.Tech in Computers... 
    Linux
    Part time
    Freelance

    Biz Tech Analytics

    New York, NY
    2 days ago
  •  ...freelance developer to review and validate AI benchmark tasks in Clojure repositories....  ...include running Kaocha/clojure.test suites in Docker, verifying patches, and debugging issues....  ...should have strong skills in REPL and Linux. Education credentials include a B.Tech/B.... 
    Linux
    Remote job
    Part time
    Freelance

    Biz Tech Analytics

    New York, NY
    2 days ago
  • Review & validate AI benchmark task in Clojure repo. Run Kaocha/clojure.test suites in Docker, verify patches & oracle solution, debug...  ...functional correctness. Strong REPL & Linux skill needed. Employment...  ...B.E. in Computer Science and Engineering (CSE) PG: MCA in Any... 
    Linux
    Part time
    Freelance

    Biz Tech Analytics

    New York, NY
    2 days ago
  • $80 per hour

     ...ethically shape the future of AI. What We Do The Mindrift...  ...Calling all security researchers, engineers, and penetration testers with...  ...tools for running and evaluating agent behavior. You’ll implement...  ...interfaces Understanding of Docker, Linux CLI, and communication Ability... 
    Linux
    Part time
    Freelance
    Remote work
    Flexible hours

    Mind Rift

    New York, NY
    4 days ago
  • $175k - $275k

     ...AI Engineer Vatic is looking for an AI engineer with proven experience with large foundational...  ...that contribute towards rigorous evaluation and in-depth model analysis. We employ...  ...with Python development in a Linux environment Strong understanding of machine... 
    Linux
    Work at office
    Night shift

    Vatic Labs

    New York, NY
    3 days ago
  • $60 per hour

    A leading AI development company seeks proficient programmers to contribute to cutting-edge AI systems while enjoying fully remote...  ...include solving coding problems, writing high-quality code, and evaluating AI-generated code. Ideal candidates should have a bachelor’s degree... 
    Remote job
    Hourly pay
    Flexible hours

    DataAnnotation

    New York, NY
    2 days ago
  • $40 per hour

    A leading AI security solutions provider is seeking experienced cybersecurity professionals to evaluate AI-generated security content and solve real-world technical problems. In this remote role, candidates will require over 2 years of cybersecurity experience, fluency... 
    Hourly pay
    Remote work

    DataAnnotation

    Brooklyn, NY
    5 days ago
  • $150k - $250k

    Slingshot Aerospace is seeking a Senior AI Engineer to join our AI and Data Science team. This role involves developing evaluation frameworks for intelligent systems in mission-critical space operations. Responsibilities include maintaining our validation SDK, designing... 
    Remote job

    Slingshot Aerospace

    New York, NY
    2 days ago
  • A leading technology company is seeking AI Developers to design and implement AI/ML features in a remote role. Responsibilities include...  ...building AI services, developing data pipelines, and creating evaluations for LLMs. Ideal candidates have mid-senior experience in AI... 
    Remote job
    Hourly pay

    Rex.zone

    New York, NY
    2 days ago
  • Feedinkoo is looking for experienced software engineers to contribute their expertise to AI evaluation and improvement projects. This role involves applying software engineering skills to assess AI systems with no prior AI experience required, as training will be provided... 
    Remote job
    Freelance

    Feedinkoo

    New York, NY
    4 days ago
  • $40 per hour

    A technology consulting company is seeking experienced cybersecurity professionals for a remote position. In this role, you'll evaluate AI-generated cybersecurity content, solve technical problems, and provide valuable feedback to enhance AI models. Ideal candidates should... 
    Hourly pay
    Remote work

    DataAnnotation

    New York, NY
    2 days ago
  • $40 per hour

    A technology company is seeking a Web Engineer to train AI models remotely, offering either full-time or part-time options. The role involves evaluating AI chatbots' coding challenges and assessing the quality of AI outputs. Applicants should be fluent in English and detail... 
    Remote job
    Hourly pay
    Full time
    Part time

    DataAnnotation

    New York, NY
    2 days ago
  • $140k - $230k

    About Arize AI is rapidly transforming the world. As generative AI reshapes...  ...Arize AI is the leading AI & Agent Engineering observability and evaluation platform, empowering AI engineers to...  ...500, etc) Proficiency in: Python Linux/Unix Bonus Points, But Not Required... 
    Linux
    Work experience placement
    Remote work
    Work from home

    Cerebras

    New York, NY
    1 day ago
  •  ...Member of Technical Staff - Evals located in New York, NY, where you will ensure our AI-powered features are high quality and reliable. Your responsibilities include designing evaluation frameworks, building automated tests, and developing tools for seamless evaluations.... 

    Entendre

    New York, NY
    4 days ago
  • $40 per hour

    A cybersecurity firm in the United States seeks experienced professionals to evaluate AI-generated security content and solve technical cybersecurity problems. In this remote role, you'll work on your own schedule, contributing to the next generation of AI security systems... 
    Hourly pay
    Remote work

    DataAnnotation

    New York, NY
    2 days ago
  • $40 per hour

    A cybersecurity AI company is looking for experienced professionals to evaluate AI-generated security content and solve technical problems to improve AI systems. This role requires a minimum of 2 years in cybersecurity and allows for remote work with flexible scheduling... 
    Hourly pay
    Remote work
    Flexible hours

    DataAnnotation

    New York, NY
    2 days ago
  • $40 per hour

    A leading AI training company is seeking a DevOps Engineer to join their remote team. In this role, you will provide coding challenges to AI chatbots and evaluate their outputs for correctness and performance. Candidates should be proficient in Python or JavaScript and... 
    Remote job
    Hourly pay

    DataAnnotation

    New York, NY
    4 days ago
  • $40 per hour

    A cybersecurity company is seeking experienced professionals to evaluate AI-generated security content and solve technical issues. The role requires 2+ years in the field, coding experience, and strong analytical skills. Candidates should have a bachelor's degree and cybersecurity... 
    Hourly pay
    Full time
    Part time
    Remote work

    DataAnnotation

    New York, NY
    2 days ago
  • A cybersecurity solutions company is looking for experienced cybersecurity professionals to help train AI models. You will work remotely to evaluate AI-generated security content, solve technical problems, and provide feedback to improve AI systems. Ideal candidates have... 
    Remote job
    Flexible hours

    DataAnnotation

    New York, NY
    2 days ago
  • Akraya, Inc. is looking for a skilled individual to develop AI-powered tools that support the creation and evaluation of agent skills and prompts. This remote role involves building user-friendly interfaces while integrating existing systems. The successful candidate will... 
    Remote job

    Akraya, Inc.

    New York, NY
    2 days ago
  • $40 per hour

    A leading AI-focused cybersecurity firm is looking for experienced cybersecurity professionals to evaluate AI-generated content and solve technical security problems. In this flexible role, you can work remotely and choose your projects. Ideal candidates will have 2+ years... 
    Remote job
    Hourly pay
    Flexible hours

    DataAnnotation

    New York, NY
    2 days ago
  • $40 per hour

    A cybersecurity firm is looking for experienced professionals to join its team. This remote role involves evaluating AI-generated security content and solving technical cybersecurity problems. Candidates should have over 2 years of hands-on experience in cybersecurity and... 
    Remote job
    Hourly pay
    Flexible hours

    DataAnnotation

    Brooklyn, NY
    5 days ago
  • A leading cybersecurity firm is seeking experienced professionals to evaluate AI-generated cybersecurity content and solve technical security problems. You will play a significant role in training AI models, providing critical feedback, and improving system accuracy. This... 
    Remote job
    Flexible hours

    DataAnnotation

    New York, NY
    5 days ago
  • $400 per month

     ...looking for contributors to support a Frontier Code Agents project with a leading AI research lab. The role involves using AI coding agents to evaluate and improve machine learning and AI engineering tasks. Ideal candidates should have 2+ years of machine learning engineering... 
    Remote job

    Obsidian

    New York, NY
    4 days ago
