Director, Model Post-Training and Agentic Research (Remote)
$195k - $290k
CrowdStrike
As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role:

The security domain presents one of the richest and most consequential training signal environments in applied AI. It’s adversarial by nature, grounded in real operational outcomes, and evolving faster than any static benchmark can capture. We're building the post-training and reinforcement learning capability to turn the latest models and harnesses into security-specialized systems that reason, plan, and act across complex cyber workflows. The person leading this work will be in the research, not just directing it.

In this role, you'll own the full post-training stack for security-domain AI (e.g., supervised fine-tuning, reward modeling, RLHF and RLAIF pipelines, and agent-RL environments) and the agentic research that sits on top of it. That means designing, building, and evaluating the harnesses that security agents actually run on (e.g., the scaffolding, tool-use interfaces, planning loops, memory and context management, and multi-step execution frameworks), which determine whether a trained model can operate reliably on complex security tasks.

Post-training and agent architecture are not separable problems in this work. The reward signal you design has to reflect what the harness can measure, and the harness has to be built to surface what training needs to optimize. You'll set the technical direction on both, and you'll be in the work on both.

You'll lead a team of research scientists and engineers, but the team will look to your own work as the standard. The successful candidate shapes research priorities, keeps the team moving at high velocity across multiple training cycles per year, and elevates the quality of work by staying close enough to it to know what good actually looks like.

What You'll Do:

- Own and personally drive the full post-training pipeline for security-domain AI: SFT, RLHF/RLAIF, agent-RL, and reward modeling. Set research priorities and architectural direction, and lead experimental work on the hardest problems yourself rather than delegating them away.
- Design reward modeling methodology grounded in verified security outcomes rather than proxy signals, drawing on both human expert feedback and automated adversarial evaluation.
- Define data curation standards across sourcing, filtering, quality scoring, and domain weighting that drive measurable capability improvement.
- Build and maintain agent-RL training environments that simulate realistic cyber workflows (multi-step offensive and defensive tasks, tool use, and long-horizon planning), contributing directly to environment design and reward shaping.
- Lead the design and build of the agent harnesses that run on top of those trained models: scaffolding architecture, tool-calling interfaces, planning and reasoning loops, and memory and context management. Treat harness design with the same rigor as the training pipeline; these systems determine whether strong post-training translates into reliable, trustworthy behavior in the field.
- Develop and own evaluation methodology for the full agentic stack: not model capability in isolation, but harness behavior, tool-use reliability, planning coherence, and end-to-end task completion across realistic security workflows. Define the benchmarks, red-line tests, and measurement practices that give the team and the organization genuine confidence that an agent works.
- Partner closely with other teams to ensure post-training and agentic work integrates cleanly with the broader model development loop.
- Contribute original research through publications, external presentations, and open-source artifacts where appropriate, building CrowdStrike's credibility as a research-first organization in this space.
- Recruit, develop, and retain a high-density team of research scientists and ML engineers. Set a technical bar through your own contributions, not just your standards.

What You'll Need:

- MS or PhD in computer science, machine learning, or a related quantitative discipline.
- 8+ years of experience in ML research or engineering, with meaningful depth in large language model post-training.
- Hands-on expertise across the modern post-training stack, including SFT data pipelines, RLHF/RLAIF, PPO or similar RL algorithms applied to language models, and reward model design and training. This means you've done the work, not managed people who have.
- Demonstrated experience designing or building agentic system harnesses for LLM-based agents, including tool-use frameworks, planning scaffolds, multi-step execution environments, and context or memory management. You've built these systems, not just used them.
- Strong evaluation instincts: experience designing evaluation protocols that are resistant to overfitting, capable of measuring genuine capability improvement, and interpretable to both technical and non-technical stakeholders.
- Track record of running high-velocity research programs with disciplined tracking and fast iteration.
- Proven ability to lead and grow research teams while remaining a credible, active technical contributor.

Ways to Stand Out:

- Demonstrated experience building or operating RL training environments for language model agents, including environment design, rollout infrastructure, and reward shaping.
- Experience applying post-training or RL techniques in security, adversarial ML, or other high-stakes operational domains where ground truth is expensive and noisy.
- Deep hands-on experience with agent harness architecture applied to long-horizon, multi-step task environments where reliability and failure modes matter as much as peak capability.
- Background designing synthetic data pipelines or simulation environments for agent training in complex, tool-using workflows.
- Familiarity with the offensive or defensive security practitioner's workflow (penetration testing, detection engineering, incident response, or threat intelligence) sufficient to reason about what good model behavior looks like in practice.
- Published research in post-training, RLHF, RL for language agents, or related areas at top-tier venues (NeurIPS, ICML, ICLR, ACL, or equivalent).
- Experience working on and adapting open-weight base models (Llama-class, Qwen-class, or similar) for domain-specialized continued training and fine-tuning.

Benefits of Working at CrowdStrike:

- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world class amenities
- Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions (including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs) on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us for further assistance. Find out more about your rights as an applicant.

CrowdStrike participates in the E-Verify program.

CrowdStrike, Inc. is committed to fair and equitable compensation practices. Placement within the pay range is dependent on a variety of factors including, but not limited to, relevant work experience, skills, certifications, job level, supervisory status, and location. The base salary range for this position for all U.S. candidates is $195,000 - $290,000 per year, with eligibility for bonuses, equity grants and a comprehensive benefits package that includes health insurance, 401k and paid time off.

Expected Close Date of Job Posting: 08-11-2026

CrowdStrike was founded in 2011 to fix a fundamental problem: the sophisticated attacks that were forcing the world’s leading businesses into the headlines could not be solved with existing malware-based defenses. Founder George Kurtz realized that a brand new approach was needed, one that combines the most advanced endpoint protection with expert intelligence to pinpoint the adversaries perpetrating the attacks, not just the malware. There’s much more to the story of how Falcon has redefined endpoint protection, but there’s only one thing to remember about CrowdStrike: We stop breaches.



