AI Red Teamer, LLM Generalist
Handshake
About Handshake

Handshake was founded on a simple belief that everyone deserves a path to a great career, regardless of where they went to school or who they know. Today, we power 25 million job seekers, 1 million+ employers, and 1,600 educational institutions. In 2025, we started Handshake AI and built the fastest-growing AI data business in history. We work directly with frontier AI lab researchers to create evaluations, publish benchmarks, and push the boundary of data. We’ve grown from $0 to ~$1B run rate and pay ~$60M to over 30K individuals every month.

Why join Handshake now:
Shape how every career evolves in the AI economy, at global scale, with impact your friends, family and peers can see and feel
Partner hand-in-hand with world-class AI labs, Fortune 500 partners and the world’s top educational institutions
Work together with engineers, scientists, operators, and more from Palantir, Meta, Scale AI, and former YC founders
Build a massive, fast-growing business with billions in revenue

About Handshake AI

Human data is the core infrastructure to AI advancement. Frontier AI labs currently improve model capabilities with various data-intensive post-training techniques. We believe that data spend for AI training will increase by 3-5x in the next few years and continue for much longer as models take on new domains. Handshake AI supports all of the frontier AI labs, working on their most complex data at the largest scale.

About the Role

As an AI Red Teamer, you will stress-test large language models by intentionally trying to break them. Rather than checking whether an answer is correct, you will design creative, adversarial prompts that expose vulnerabilities: unsafe content, bias, broken guardrails, hallucinations, prompt injection weaknesses, and unexpected behaviors. Your work directly supports AI safety and model robustness for leading research labs.

This is a generalist red teaming role. You will probe models across the full spectrum of risk categories, including content safety, CBRN (chemical, biological, radiological, nuclear), cybersecurity, persuasion and influence operations, child safety, self-harm, over-companionship, and regulatory compliance. Red teaming may span text, image, voice, and agentic model capabilities depending on project needs. This role requires creativity, curiosity, and an ability to think like an adversary while operating with strong ethical judgment.

Craft creative prompts and multi-turn scenarios to stress-test AI guardrails across diverse risk categories
Discover ways around safety filters, restrictions, and defenses using jailbreak, evasion, and prompt injection techniques
Explore edge cases to provoke disallowed, harmful, or incorrect outputs
Evaluate and score model responses against structured harm taxonomies and severity rubrics
Document experiments clearly, including what you tried, why you tried it, and what it revealed
Review and refine adversarial prompts generated by other team members
Contribute to harm taxonomy development, calibration exercises, and inter-rater reliability work
Collaborate with engineers, data scientists, and researchers to share findings and strengthen defenses
Work with potentially disturbing content on a regular basis (see Content Warning below)
Stay current on jailbreaks, attack methods, and evolving model behaviors

Desired Capabilities

Strong hands-on experience using multiple LLMs (ChatGPT, Claude, Gemini, open-source models, etc.)
Intuition for crafting adversarial prompts; familiarity with jailbreak or evasion techniques is a strong plus
Creative, adversarial problem-solving skills
Clear and thoughtful written communication
Strong ethical judgment and the ability to separate adversarial thinking from personal values
Self-directed, collaborative, and comfortable in feedback-heavy environments
Curiosity, persistence, and comfort with frequent failure in experimentation

Extra Credit

Familiarity with Python or other scripting languages
Experience working with LLM APIs or evaluation tooling
Comfort with structured data annotation and rubric-based scoring
Prior work in trust and safety, content moderation, QA, or security research
Subject matter expertise in any high-risk domain (cybersecurity, chemistry, biology, medicine, law, finance, etc.)

You Will Thrive Here If

You treat every model response as a hypothesis to challenge
You can switch between creative free-association and rigorous documentation in the same session
You go deep into unusual interests (fandoms, niche internet cultures, gaming exploits, Wikipedia rabbit holes, etc.)
You come from a creative background: writing, visual art, improv, puzzle design, or similar
You are energized by finding the thing nobody else thought to try
You are genuinely passionate about AI and follow the space closely

Content Warning

This role involves regular and deliberate exposure to harmful content. You will encounter and intentionally generate content involving violence, self-harm, hate speech, sexually explicit material, child safety scenarios, and other categories of harmful output as part of structured adversarial testing. Candidates must be able to engage with this material professionally and sustainably. Support resources are available.

