Director, Model Post-Training and Agentic Research (Remote)

$195k - $290k
Full-time

CrowdStrike

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role:

The security domain presents one of the richest and most consequential training signal environments in applied AI. It’s adversarial by nature, grounded in real operational outcomes, and evolving faster than any static benchmark can capture. We're building the post-training and reinforcement learning capability to build the latest models and harnesses into security-specialized systems that reason, plan, and act across complex cyber workflows. The person leading this work will be in the research, not just directing it.

In this role, you'll own the full post-training stack for security-domain AI (e.g., supervised fine-tuning, reward modeling, RLHF and RLAIF pipelines, and agent-RL environments) and the agentic research that sits on top of it. That means designing, building, and evaluating the harnesses that security agents actually run on (e.g., the scaffolding, tool-use interfaces, planning loops, memory and context management, and multi-step execution frameworks) that determine whether a trained model can operate reliably on complex security tasks.

Post-training and agent architecture are not separable problems in this work. The reward signal you design has to reflect what the harness can measure, and the harness has to be built to surface what training needs to optimize. You'll set the technical direction on both, and you'll be in the work on both.

You'll lead a team of research scientists and engineers, but the team will look to your own work as the standard. The successful candidate shapes research priorities, keeps the team moving at high velocity across multiple training cycles per year, and elevates the quality of work by staying close enough to it to know what good actually looks like.

What You'll Do:

  • Own and personally drive the full post-training pipeline for security-domain AI: SFT, RLHF/RLAIF, agent-RL, and reward modeling. Set research priorities and architectural direction, and lead experimental work on the hardest problems yourself rather than delegating them away.
  • Design reward modeling methodology grounded in verified security outcomes rather than proxy signals, drawing on both human expert feedback and automated adversarial evaluation.
  • Define data curation standards across sourcing, filtering, quality scoring, and domain weighting that drive measurable capability improvement.
  • Build and maintain agent-RL training environments that simulate realistic cyber workflows (multi-step offensive and defensive tasks, tool use, and long-horizon planning), contributing directly to environment design and reward shaping.
  • Lead the design and build of the agent harnesses that run on top of those trained models: scaffolding architecture, tool-calling interfaces, planning and reasoning loops, and memory and context management. Treat harness design with the same rigor as the training pipeline; these systems determine whether strong post-training translates into reliable, trustworthy behavior in the field.
  • Develop and own evaluation methodology for the full agentic stack: not model capability in isolation, but harness behavior, tool-use reliability, planning coherence, and end-to-end task completion across realistic security workflows. Define the benchmarks, red-line tests, and measurement practices that give the team and the organization genuine confidence that an agent works.
  • Partner closely with other teams to ensure post-training and agentic work integrates cleanly with the broader model development loop.
  • Contribute original research through publications, external presentations, and open-source artifacts where appropriate, building CrowdStrike's credibility as a research-first organization in this space.
  • Recruit, develop, and retain a high-density team of research scientists and ML engineers. Set a technical bar through your own contributions, not just your standards.

What You'll Need:

  • MS or PhD in computer science, machine learning, or a related quantitative discipline.
  • 8+ years of experience in ML research or engineering, with meaningful depth in large language model post-training.
  • Hands-on expertise across the modern post-training stack, including SFT data pipelines, RLHF/RLAIF, PPO or similar RL algorithms applied to language models, and reward model design and training. This means you've done the work, not managed people who have.
  • Demonstrated experience designing or building agentic system harnesses for LLM-based agents, including tool-use frameworks, planning scaffolds, multi-step execution environments, and context or memory management. You've built these systems, not just used them.
  • Strong evaluation instincts: experience designing evaluation protocols that are resistant to overfitting, capable of measuring genuine capability improvement, and interpretable to both technical and non-technical stakeholders.
  • Track record of running high-velocity research programs with disciplined tracking and fast iteration.
  • Proven ability to lead and grow research teams while remaining a credible, active technical contributor.

Ways to Stand Out:

  • Demonstrated experience building or operating RL training environments for language model agents, including environment design, rollout infrastructure, and reward shaping.
  • Experience applying post-training or RL techniques in security, adversarial ML, or other high-stakes operational domains where ground truth is expensive and noisy.
  • Deep hands-on experience with agent harness architecture applied to long-horizon, multi-step task environments where reliability and failure modes matter as much as peak capability.
  • Background designing synthetic data pipelines or simulation environments for agent training in complex, tool-using workflows.
  • Familiarity with the offensive or defensive security practitioner's workflow (penetration testing, detection engineering, incident response, or threat intelligence) sufficient to reason about what good model behavior looks like in practice.
  • Published research in post-training, RLHF, RL for language agents, or related areas at top-tier venues (NeurIPS, ICML, ICLR, ACL, or equivalent).
  • Experience working on and adapting open-weight base models (Llama-class, Qwen-class, or similar) for domain-specialized continued training and fine-tuning.

Benefits of Working at CrowdStrike:

  • Market leader in compensation and equity awards
  • Comprehensive physical and mental wellness programs
  • Competitive vacation and holidays for recharge
  • Paid parental and adoption leaves
  • Professional development opportunities for all employees regardless of level or role
  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
  • Vibrant office culture with world class amenities
  • Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at View email address on click.appcast.io for further assistance.

Find out more about your rights as an applicant.

CrowdStrike participates in the E-Verify program.
Notice of E-Verify Participation
Right to Work

CrowdStrike, Inc. is committed to fair and equitable compensation practices. Placement within the pay range is dependent on a variety of factors including, but not limited to, relevant work experience, skills, certifications, job level, supervisory status, and location. The base salary range for this position for all U.S. candidates is $195,000 - $290,000 per year, with eligibility for bonuses, equity grants and a comprehensive benefits package that includes health insurance, 401k and paid time off. For detailed information about the U.S. benefits package, please click here.

Expected Close Date of Job Posting is: 08-11-2026

CrowdStrike was founded in 2011 to fix a fundamental problem: The sophisticated attacks that were forcing the world’s leading businesses into the headlines could not be solved with existing malware-based defenses. Founder George Kurtz realized that a brand new approach was needed, one that combines the most advanced endpoint protection with expert intelligence to pinpoint the adversaries perpetrating the attacks, not just the malware. There’s much more to the story of how Falcon has redefined endpoint protection but there’s only one thing to remember about CrowdStrike: We stop breaches.

Vacancy posted 14 hours ago