AI Red Teamer, LLM Generalist
Cacheflow
About the Role

As an AI Red Teamer, you will stress-test large language models by intentionally trying to break them. Rather than checking whether an answer is correct, you will design creative, adversarial prompts that expose vulnerabilities: unsafe content, bias, broken guardrails, hallucinations, prompt injection weaknesses, and unexpected behaviors. Your work directly supports AI safety and model robustness for leading research labs.

This is a generalist red teaming role. You will probe models across the full spectrum of risk categories, including content safety, CBRN (chemical, biological, radiological, nuclear), cybersecurity, persuasion and influence operations, child safety, self-harm, over-companionship, and regulatory compliance. Red teaming may span text, image, voice, and agentic model capabilities depending on project needs.

This role requires creativity, curiosity, and an ability to think like an adversary while operating with strong ethical judgment.

Responsibilities

- Craft creative prompts and multi-turn scenarios to stress-test AI guardrails across diverse risk categories
- Discover ways around safety filters, restrictions, and defenses using jailbreak, evasion, and prompt injection techniques
- Explore edge cases to provoke disallowed, harmful, or incorrect outputs
- Evaluate and score model responses against structured harm taxonomies and severity rubrics
- Document experiments clearly, including what you tried, why you tried it, and what it revealed
- Review and refine adversarial prompts generated by other team members
- Contribute to harm taxonomy development, calibration exercises, and inter-rater reliability work
- Collaborate with engineers, data scientists, and researchers to share findings and strengthen defenses
- Work with potentially disturbing content on a regular basis (see Content Warning below)
- Stay current on jailbreaks, attack methods, and evolving model behaviors

Desired Capabilities

- Strong hands-on experience using multiple LLMs (ChatGPT, Claude, Gemini, open-source models, etc.)
- Intuition for crafting adversarial prompts; familiarity with jailbreak or evasion techniques is a strong plus
- Creative, adversarial problem-solving skills
- Clear and thoughtful written communication
- Strong ethical judgment and the ability to separate adversarial thinking from personal values
- Self-directed, collaborative, and comfortable in feedback-heavy environments
- Curiosity, persistence, and comfort with frequent failure in experimentation

Extra Credit

- Familiarity with Python or other scripting languages
- Experience working with LLM APIs or evaluation tooling
- Comfort with structured data annotation and rubric-based scoring
- Prior work in trust and safety, content moderation, QA, or security research
- Subject matter expertise in any high-risk domain (cybersecurity, chemistry, biology, medicine, law, finance, etc.)

You Will Thrive Here If

- You treat every model response as a hypothesis to challenge
- You can switch between creative free-association and rigorous documentation in the same session
- You go deep into unusual interests (fandoms, niche internet cultures, gaming exploits, Wikipedia rabbit holes, etc.)
- You come from a creative background: writing, visual art, improv, puzzle design, or similar
- You are energized by finding the thing nobody else thought to try
- You are genuinely passionate about AI and follow the space closely

Content Warning

This role involves regular and deliberate exposure to harmful content. You will encounter and intentionally generate content involving violence, self-harm, hate speech, sexually explicit material, child safety scenarios, and other categories of harmful output as part of structured adversarial testing. Candidates must be able to engage with this material professionally and sustainably. Support resources are available.


