AI Evaluation Engineer | Shell/Bash, Docker & Linux

Biz Tech Analytics

Biz Tech Analytics is looking for a skilled professional to review and validate AI benchmark tasks in real-world repositories. The role involves running containerized test suites, verifying patches and solutions, and debugging flaky tests to ensure quality standards. The ideal candidate will have 3–10 years of experience in production software, strong skills in Docker and Linux, and the ability to navigate large codebases while delivering clear feedback. This position offers the opportunity to work on cutting-edge AI solutions.

Vacancy posted 4 days ago

Similar jobs that could be interesting for you
Based on the AI Evaluation Engineer | Shell/Bash, Docker & Linux in New York, NY vacancy

AI Evaluation Engineer: Python & Docker — Remote
Biz Tech Consultants is seeking an AI Evaluation Engineer to work part-time. The role involves reviewing... ...tasks in Python repositories, running Docker-based test suites, and debugging flaky... ...skills in Python, Docker, pytest, and Linux. Educational qualifications include B.Tech...
Linux
Remote job
Part time
Freelance
Work from home
Biz Tech Consultants
New York, NY
2 days ago
AI Evaluation Engineer HCL / Terraform / Infrastructure as Code
AI Evaluation Engineer Shell / Bash Scripting Biz-Tech Analytics offers specialized, human‑in‑the‑loop annotation... .../HCL repos: run Terratest in Docker (mocked providers), verify patches/oracles... ...& correctness. Strong HCL + Linux skills required. Employment Type: Part...
Linux
Part time
Freelance
Biz Tech Analytics
New York, NY
5 days ago
AI Evaluation Engineer - Ruby, Docker, Linux
...Analytics is looking for a skilled candidate to review and validate AI benchmark tasks in real-world repositories. Responsibilities... ...has 3-10 years of production software experience, hands-on Docker and Linux skills, and a strong ability to navigate large codebases with...
Linux
Biz Tech Analytics
New York, NY
4 days ago
AI Evaluation Engineer (Java) | Docker & Linux
...Tech Analytics is looking for an experienced software engineer based in the United States to validate AI benchmark tasks across various repositories. The candidate should have a robust background with Docker and Linux, able to run containerized test suites and ensure...
Linux
Biz Tech Analytics
New York, NY
4 days ago
AI Evaluation Engineer - Ruby
Review and validate AI benchmark tasks in real‑world repositories. Run containerized test suites, verify patches and solutions, debug... ..., reproducibility, and correctness. Require strong CLI, Docker, Linux skills. Required Candidate profile 3-10 years in production software...
Linux
Biz Tech Analytics
New York, NY
2 days ago
AI Evaluation Engineer Rust (Freelancer)
Review & validate AI benchmark tasks in C++/Rust repositories. Run containerized builds... .... Strong knowledge of build systems and Linux is required. Educational Requirements... ...specialization, B.Tech / B.E. in Computer Science and Engineering (CSE) PG: MCA in any specialization, M....
Linux
Freelance
Biz Tech Consultants
New York, NY
4 days ago
AI Evaluation Engineer Dart / Flutter
...Homebased Responsibilities Review AI benchmark tasks in Dart/... ...repositories. Run Dart tests in Docker containers. Verify patches or... ...with Dart async and Linux environments. Proficiency in... ...B.E. in Computer Science and Engineering (CSE); PG - M.Tech in Computers...
Linux
Part time
Freelance
Biz Tech Analytics
New York, NY
2 days ago
Remote AI Evaluation Engineer - Clojure & JVM (Part-Time)
...freelance developer to review and validate AI benchmark tasks in Clojure repositories.... ...include running Kaocha/clojure.test suites in Docker, verifying patches, and debugging issues.... ...should have strong skills in REPL and Linux. Education credentials include a B.Tech/B....
Linux
Remote job
Part time
Freelance
Biz Tech Analytics
New York, NY
2 days ago
AI Evaluation Engineer Clojure / JVM Functional
Review & validate AI benchmark task in Clojure repo. Run Kaocha/clojure.test suites in Docker, verify patches & oracle solution, debug... ...functional correctness. Strong REPL & Linux skill needed. Employment... ...B.E. in Computer Science and Engineering (CSE) PG: MCA in Any...
Linux
Part time
Freelance
Biz Tech Analytics
New York, NY
2 days ago
MCP & Tools Python Developer - Agent Evaluation Infrastructure
$80 per hour
...ethically shape the future of AI. What We Do The Mindrift... ...Calling all security researchers, engineers, and penetration testers with... ...tools for running and evaluating agent behavior. You’ll implement... ...interfaces Understanding of Docker, Linux CLI, and communication Ability...
Linux
Part time
Freelance
Remote work
Flexible hours
Mind Rift
New York, NY
4 days ago
AI Engineer
$175k - $275k
...AI Engineer Vatic is looking for an AI engineer with proven experience with large foundational... ...that contribute towards rigorous evaluation and in-depth model analysis. We employ... ...with Python development in a Linux environment Strong understanding of machine...
Linux
Work at office
Night shift
Vatic Labs
New York, NY
3 days ago
Remote AI Training Engineer - Code, Evaluate & Shape AI
$60 per hour
A leading AI development company seeks proficient programmers to contribute to cutting-edge AI systems while enjoying fully remote... ...include solving coding problems, writing high-quality code, and evaluating AI-generated code. Ideal candidates should have a bachelor’s degree...
Remote job
Hourly pay
Flexible hours
DataAnnotation
New York, NY
2 days ago
Remote AI Security Engineer - SOC & Model Evaluator
$40 per hour
A leading AI security solutions provider is seeking experienced cybersecurity professionals to evaluate AI-generated security content and solve real-world technical problems. In this remote role, candidates will require over 2 years of cybersecurity experience, fluency...
Hourly pay
Remote work
DataAnnotation
Brooklyn, NY
5 days ago
Senior AI Engineer: Agentic V&V & Evaluation (Remote)
$150k - $250k
Slingshot Aerospace is seeking a Senior AI Engineer to join our AI and Data Science team. This role involves developing evaluation frameworks for intelligent systems in mission-critical space operations. Responsibilities include maintaining our validation SDK, designing...
Remote job
Slingshot Aerospace
New York, NY
2 days ago
Remote AI Engineer — Production ML & Evaluation
A leading technology company is seeking AI Developers to design and implement AI/ML features in a remote role. Responsibilities include... ...building AI services, developing data pipelines, and creating evaluations for LLMs. Ideal candidates have mid-senior experience in AI...
Remote job
Hourly pay
Rex.zone
New York, NY
2 days ago
Remote AI Evaluation Software Engineer - Freelance
Feedinkoo is looking for experienced software engineers to contribute their expertise to AI evaluation and improvement projects. This role involves applying software engineering skills to assess AI systems with no prior AI experience required, as training will be provided...
Remote job
Freelance
Feedinkoo
New York, NY
4 days ago
AI Security Engineer: Train & Evaluate Cyber AI Models
$40 per hour
A technology consulting company is seeking experienced cybersecurity professionals for a remote position. In this role, you'll evaluate AI-generated cybersecurity content, solve technical problems, and provide valuable feedback to enhance AI models. Ideal candidates should...
Hourly pay
Remote work
DataAnnotation
New York, NY
2 days ago
Remote AI Web Engineer — Train & Evaluate Chatbots
$40 per hour
A technology company is seeking a Web Engineer to train AI models remotely, offering either full-time or part-time options. The role involves evaluating AI chatbots' coding challenges and assessing the quality of AI outputs. Applicants should be fluent in English and detail...
Remote job
Hourly pay
Full time
Part time
DataAnnotation
New York, NY
2 days ago
AI Sales Engineer, EMEA
$140k - $230k
About Arize AI is rapidly transforming the world. As generative AI reshapes... ...Arize AI is the leading AI & Agent Engineering observability and evaluation platform, empowering AI engineers to... ...500, etc) Proficiency in: Python Linux/Unix Bonus Points, But Not Required...
Linux
Work experience placement
Remote work
Work from home
Cerebras
New York, NY
1 day ago
Staff AI Evaluation Engineer
...Member of Technical Staff - Evals located in New York, NY, where you will ensure our AI-powered features are high quality and reliable. Your responsibilities include designing evaluation frameworks, building automated tests, and developing tools for seamless evaluations....
Entendre
New York, NY
4 days ago
AI Security Engineer: Train & Evaluate Cyber AI Models
$40 per hour
A cybersecurity firm in the United States seeks experienced professionals to evaluate AI-generated security content and solve technical cybersecurity problems. In this remote role, you'll work on your own schedule, contributing to the next generation of AI security systems...
Hourly pay
Remote work
DataAnnotation
New York, NY
2 days ago
AI Security Engineer: Model Evaluation & Feedback
$40 per hour
A cybersecurity AI company is looking for experienced professionals to evaluate AI-generated security content and solve technical problems to improve AI systems. This role requires a minimum of 2 years in cybersecurity and allows for remote work with flexible scheduling...
Hourly pay
Remote work
Flexible hours
DataAnnotation
New York, NY
2 days ago
Remote AI DevOps Engineer: Model Evaluation & Training
$40 per hour
A leading AI training company is seeking a DevOps Engineer to join their remote team. In this role, you will provide coding challenges to AI chatbots and evaluate their outputs for correctness and performance. Candidates should be proficient in Python or JavaScript and...
Remote job
Hourly pay
DataAnnotation
New York, NY
4 days ago
AI Security Engineer: Model Evaluation & Feedback
$40 per hour
A cybersecurity company is seeking experienced professionals to evaluate AI-generated security content and solve technical issues. The role requires 2+ years in the field, coding experience, and strong analytical skills. Candidates should have a bachelor's degree and cybersecurity...
Hourly pay
Full time
Part time
Remote work
DataAnnotation
New York, NY
2 days ago
Remote AI Security Engineer - SOC & Model Evaluator
A cybersecurity solutions company is looking for experienced cybersecurity professionals to help train AI models. You will work remotely to evaluate AI-generated security content, solve technical problems, and provide feedback to improve AI systems. Ideal candidates have...
Remote job
Flexible hours
DataAnnotation
New York, NY
2 days ago
AI UX Tools Engineer — Remote, Prompt & Skill Evaluation
Akraya, Inc. is looking for a skilled individual to develop AI-powered tools that support the creation and evaluation of agent skills and prompts. This remote role involves building user-friendly interfaces while integrating existing systems. The successful candidate will...
Remote job
Akraya, Inc.
New York, NY
2 days ago
Remote AI Security Engineer - SOC & Model Evaluator
$40 per hour
A leading AI-focused cybersecurity firm is looking for experienced cybersecurity professionals to evaluate AI-generated content and solve technical security problems. In this flexible role, you can work remotely and choose your projects. Ideal candidates will have 2+ years...
Remote job
Hourly pay
Flexible hours
DataAnnotation
New York, NY
2 days ago
Remote AI Security Engineer - SOC & Model Evaluator
$40 per hour
A cybersecurity firm is looking for experienced professionals to join its team. This remote role involves evaluating AI-generated security content and solving technical cybersecurity problems. Candidates should have over 2 years of hands-on experience in cybersecurity and...
Remote job
Hourly pay
Flexible hours
DataAnnotation
Brooklyn, NY
5 days ago
Remote AI Security Engineer - SOC & Model Evaluator
A leading cybersecurity firm is seeking experienced professionals to evaluate AI-generated cybersecurity content and solve technical security problems. You will play a significant role in training AI models, providing critical feedback, and improving system accuracy. This...
Remote job
Flexible hours
DataAnnotation
New York, NY
5 days ago
Remote AI/ML Engineer: Frontier Code Agents Evaluator
$400 per month
...looking for contributors to support a Frontier Code Agents project with a leading AI research lab. The role involves using AI coding agents to evaluate and improve machine learning and AI engineering tasks. Ideal candidates should have 2+ years of machine learning engineering...
Remote job
Obsidian
New York, NY
4 days ago
