AI Evaluation Engineer | Shell/Bash, Docker & Linux
Biz Tech Analytics
Biz Tech Analytics is looking for a skilled professional to review and validate AI benchmark tasks in real-world repositories. The role involves running containerized test suites, verifying patches and solutions, and debugging flaky tests to ensure quality standards. The ideal candidate will have 3–10 years of experience in production software, strong skills in Docker and Linux, and the ability to navigate large codebases while delivering clear feedback. This position offers the opportunity to work on cutting-edge AI solutions.
- Biz Tech Consultants is seeking an AI Evaluation Engineer to work part-time. The role involves reviewing... ...tasks in Python repositories, running Docker-based test suites, and debugging flaky... ...skills in Python, Docker, pytest, and Linux. Educational qualifications include B.Tech... | Linux | Remote job | Part time | Freelance | Work from home
- AI Evaluation Engineer Shell / Bash Scripting Biz-Tech Analytics offers specialized, human‑in‑the‑loop annotation... .../HCL repos: run Terratest in Docker (mocked providers), verify patches/oracles... ...& correctness. Strong HCL + Linux skills required. Employment Type: Part... | Linux | Part time | Freelance
- ...Analytics is looking for a skilled candidate to review and validate AI benchmark tasks in real-world repositories. Responsibilities... ...has 3-10 years of production software experience, hands-on Docker and Linux skills, and a strong ability to navigate large codebases with... | Linux
- ...Tech Analytics is looking for an experienced software engineer based in the United States to validate AI benchmark tasks across various repositories. The candidate should have a robust background with Docker and Linux, able to run containerized test suites and ensure... | Linux
- Review and validate AI benchmark tasks in real‑world repositories. Run containerized test suites, verify patches and solutions, debug... ..., reproducibility, and correctness. Require strong CLI, Docker, Linux skills. Required Candidate profile 3-10 years in production software... | Linux
- Review & validate AI benchmark tasks in C++/Rust repositories. Run containerized builds... .... Strong knowledge of build systems and Linux is required. Educational Requirements... ...specialization, B.Tech / B.E. in Computer Science and Engineering (CSE) PG: MCA in any specialization, M.... | Linux | Freelance
- ...Homebased Responsibilities Review AI benchmark tasks in Dart/... ...repositories. Run Dart tests in Docker containers. Verify patches or... ...with Dart async and Linux environments. Proficiency in... ...B.E. in Computer Science and Engineering (CSE); PG - M.Tech in Computers... | Linux | Part time | Freelance
- ...freelance developer to review and validate AI benchmark tasks in Clojure repositories.... ...include running Kaocha/clojure.test suites in Docker, verifying patches, and debugging issues.... ...should have strong skills in REPL and Linux. Education credentials include a B.Tech/B.... | Linux | Remote job | Part time | Freelance
- Review & validate AI benchmark task in Clojure repo. Run Kaocha/clojure.test suites in Docker, verify patches & oracle solution, debug... ...functional correctness. Strong REPL & Linux skill needed. Employment... ...B.E. in Computer Science and Engineering (CSE) PG: MCA in Any... | Linux | Part time | Freelance
$80 per hour
...ethically shape the future of AI. What We Do The Mindrift... ...Calling all security researchers, engineers, and penetration testers with... ...tools for running and evaluating agent behavior. You’ll implement... ...interfaces Understanding of Docker, Linux CLI, and communication Ability... | Linux | Part time | Freelance | Remote work | Flexible hours
$175k - $275k
...AI Engineer Vatic is looking for an AI engineer with proven experience with large foundational... ...that contribute towards rigorous evaluation and in-depth model analysis. We employ... ...with Python development in a Linux environment Strong understanding of machine... | Linux | Work at office | Night shift
$60 per hour
A leading AI development company seeks proficient programmers to contribute to cutting-edge AI systems while enjoying fully remote... ...include solving coding problems, writing high-quality code, and evaluating AI-generated code. Ideal candidates should have a bachelor’s degree... | Remote job | Hourly pay | Flexible hours
$40 per hour
A leading AI security solutions provider is seeking experienced cybersecurity professionals to evaluate AI-generated security content and solve real-world technical problems. In this remote role, candidates will require over 2 years of cybersecurity experience, fluency... | Hourly pay | Remote work
$150k - $250k
Slingshot Aerospace is seeking a Senior AI Engineer to join our AI and Data Science team. This role involves developing evaluation frameworks for intelligent systems in mission-critical space operations. Responsibilities include maintaining our validation SDK, designing... | Remote job
- A leading technology company is seeking AI Developers to design and implement AI/ML features in a remote role. Responsibilities include... ...building AI services, developing data pipelines, and creating evaluations for LLMs. Ideal candidates have mid-senior experience in AI... | Remote job | Hourly pay
- Feedinkoo is looking for experienced software engineers to contribute their expertise to AI evaluation and improvement projects. This role involves applying software engineering skills to assess AI systems with no prior AI experience required, as training will be provided... | Remote job | Freelance
$40 per hour
A technology consulting company is seeking experienced cybersecurity professionals for a remote position. In this role, you'll evaluate AI-generated cybersecurity content, solve technical problems, and provide valuable feedback to enhance AI models. Ideal candidates should... | Hourly pay | Remote work
$40 per hour
A technology company is seeking a Web Engineer to train AI models remotely, offering either full-time or part-time options. The role involves evaluating AI chatbots' coding challenges and assessing the quality of AI outputs. Applicants should be fluent in English and detail... | Remote job | Hourly pay | Full time | Part time
$140k - $230k
About Arize AI is rapidly transforming the world. As generative AI reshapes... ...Arize AI is the leading AI & Agent Engineering observability and evaluation platform, empowering AI engineers to... ...500, etc) Proficiency in: Python Linux/Unix Bonus Points, But Not Required... | Linux | Work experience placement | Remote work | Work from home
- ...Member of Technical Staff - Evals located in New York, NY, where you will ensure our AI-powered features are high quality and reliable. Your responsibilities include designing evaluation frameworks, building automated tests, and developing tools for seamless evaluations....
$40 per hour
A cybersecurity firm in the United States seeks experienced professionals to evaluate AI-generated security content and solve technical cybersecurity problems. In this remote role, you'll work on your own schedule, contributing to the next generation of AI security systems... | Hourly pay | Remote work
$40 per hour
A cybersecurity AI company is looking for experienced professionals to evaluate AI-generated security content and solve technical problems to improve AI systems. This role requires a minimum of 2 years in cybersecurity and allows for remote work with flexible scheduling... | Hourly pay | Remote work | Flexible hours
$40 per hour
A leading AI training company is seeking a DevOps Engineer to join their remote team. In this role, you will provide coding challenges to AI chatbots and evaluate their outputs for correctness and performance. Candidates should be proficient in Python or JavaScript and... | Remote job | Hourly pay
$40 per hour
A cybersecurity company is seeking experienced professionals to evaluate AI-generated security content and solve technical issues. The role requires 2+ years in the field, coding experience, and strong analytical skills. Candidates should have a bachelor's degree and cybersecurity... | Hourly pay | Full time | Part time | Remote work
- A cybersecurity solutions company is looking for experienced cybersecurity professionals to help train AI models. You will work remotely to evaluate AI-generated security content, solve technical problems, and provide feedback to improve AI systems. Ideal candidates have... | Remote job | Flexible hours
- Akraya, Inc. is looking for a skilled individual to develop AI-powered tools that support the creation and evaluation of agent skills and prompts. This remote role involves building user-friendly interfaces while integrating existing systems. The successful candidate will... | Remote job
$40 per hour
A leading AI-focused cybersecurity firm is looking for experienced cybersecurity professionals to evaluate AI-generated content and solve technical security problems. In this flexible role, you can work remotely and choose your projects. Ideal candidates will have 2+ years... | Remote job | Hourly pay | Flexible hours
$40 per hour
A cybersecurity firm is looking for experienced professionals to join its team. This remote role involves evaluating AI-generated security content and solving technical cybersecurity problems. Candidates should have over 2 years of hands-on experience in cybersecurity and... | Remote job | Hourly pay | Flexible hours
- A leading cybersecurity firm is seeking experienced professionals to evaluate AI-generated cybersecurity content and solve technical security problems. You will play a significant role in training AI models, providing critical feedback, and improving system accuracy. This... | Remote job | Flexible hours
$400 per month
...looking for contributors to support a Frontier Code Agents project with a leading AI research lab. The role involves using AI coding agents to evaluate and improve machine learning and AI engineering tasks. Ideal candidates should have 2+ years of machine learning engineering... | Remote job

