Range Scenario · gauntlet · 60 min
AI SOAR Playbook Design: LLM-Driven Phishing Response
This cybersecurity training scenario simulates a working incident. Design an LLM-driven cybersecurity playbook for a phishing campaign in 60 minutes. Specify prompts, decision points, escalation criteria, and hallucination guardrails. Defend the design to your skeptical security director.
Scenario briefing
You are the cybersecurity automation lead. The CISO wants a SOAR playbook that uses an LLM to triage, contain, and document phishing incidents at scale. Phishing is your team's highest-volume incident class: 40 to 60 reports per day.
You have 60 minutes to design. Your output is a playbook with named LLM steps, a prompt structure, decision points, escalation criteria, and guardrails against hallucination. The director will challenge every automation decision, so your design must justify where it places humans in the loop.
This scenario tests whether you can build LLM automation that is safe in production. The first trap is over-automation: if the LLM can reset passwords and quarantine endpoints unsupervised, hallucinations turn into outages. The second trap is under-automation: a playbook that gates every step on a human is no better than the manual runbook.
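One way to keep the required parts of a playbook step together is a small record per step. The sketch below is illustrative only: the `PlaybookStep` fields, the `triage_report` step, and its labels are assumptions made for this example, not part of the scenario or of any SOAR product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybookStep:
    """Hypothetical record for one named LLM step in the playbook."""
    name: str
    prompt_template: str          # filled with incident fields before the LLM call
    allowed_outputs: tuple        # closed set of labels the step may emit
    escalate_when: str            # escalation criterion, in plain language
    requires_human: bool = False  # human-in-the-loop placement for this step


# Illustrative triage step; the labels and wording are assumptions.
TRIAGE = PlaybookStep(
    name="triage_report",
    prompt_template=(
        "Classify the reported email as one of: {allowed}.\n"
        "Answer with a single label and nothing else.\n"
        "Subject: {subject}\n"
        "Sender: {sender}"
    ),
    allowed_outputs=("phishing", "spam", "benign", "unsure"),
    escalate_when="verdict is 'unsure' or is not an allowed label",
)


def render_prompt(step: PlaybookStep, **fields: str) -> str:
    """Fill the template with incident fields and the allowed label list."""
    return step.prompt_template.format(
        allowed=", ".join(step.allowed_outputs), **fields
    )


def needs_escalation(step: PlaybookStep, verdict: str) -> bool:
    """Escalate on human-gated steps, unknown labels, or an explicit 'unsure'."""
    return (
        step.requires_human
        or verdict not in step.allowed_outputs
        or verdict == "unsure"
    )
```

Writing the escalation criterion next to the prompt makes each automation decision easier to defend: the director can see, per step, what the model is allowed to say and what happens when it says anything else.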
What you will practice
- Design an LLM-driven SOAR playbook with explicit human-in-the-loop placement
- Specify prompts, decision points, and escalation criteria
- Build hallucination guardrails into automated workflows
- Justify automation depth using reversibility and blast-radius reasoning
How this scenario is scored
The scenario has 8 ordered steps. Most steps are exact-match (a MITRE ATT&CK technique ID, a tool name, or a yes/no decision) or multiple choice. Free-text steps queue for manual review and do not affect the auto-final-score in the MVP.
Each step has a maximum score of 100 points. Hints cost points, and each hint's deduction is shown before you reveal it. Your final score is the sum across steps. Range Elo updates on completion based on scenario difficulty (Advanced) and your final score percentage.
Frequently asked questions
Where should a human stay in the loop in an LLM-driven SOAR playbook?
Use the reversibility and blast-radius test. Reversible, low-blast-radius actions (tagging an email, drafting a notification, enriching an indicator) can run unsupervised. Irreversible or high-blast-radius actions (quarantining endpoints, forcing password resets, blocking domains at the edge) need human approval. Mature playbooks fail closed by default: when a check fails or the model is unsure, they take no automated action and escalate to a human.
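The test can be sketched as a small gating function. Everything here is an assumption made for illustration: the 1 to 5 blast-radius scale, the ratings in `CATALOG`, and the `MAX_AUTO_BLAST_RADIUS` cutoff would all need to be set by your own team.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    AUTO = "auto"             # may run unsupervised
    HUMAN_APPROVAL = "human"  # must wait for an analyst


@dataclass(frozen=True)
class ActionProfile:
    reversible: bool   # can the action be undone cheaply?
    blast_radius: int  # assumed scale: 1 (one message) to 5 (whole organization)


MAX_AUTO_BLAST_RADIUS = 2  # assumed cutoff for unsupervised actions

# Hypothetical ratings for the actions named above; tune them for your environment.
CATALOG = {
    "tag_email": ActionProfile(reversible=True, blast_radius=1),
    "draft_notification": ActionProfile(reversible=True, blast_radius=1),
    "enrich_indicator": ActionProfile(reversible=True, blast_radius=1),
    "quarantine_endpoint": ActionProfile(reversible=True, blast_radius=3),
    "force_password_reset": ActionProfile(reversible=False, blast_radius=2),
    "block_domain_at_edge": ActionProfile(reversible=True, blast_radius=5),
}


def gate_for(action: str) -> Gate:
    """Return AUTO only for known, reversible, low-blast-radius actions."""
    profile = CATALOG.get(action)
    if profile is None:
        # Fail closed: an action missing from the catalog always needs a human.
        return Gate.HUMAN_APPROVAL
    if profile.reversible and profile.blast_radius <= MAX_AUTO_BLAST_RADIUS:
        return Gate.AUTO
    return Gate.HUMAN_APPROVAL
```

Note that both conditions must hold for an action to run unsupervised, so a reversible action with a wide blast radius still waits for approval.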
What guardrails stop LLM hallucination from causing outages?
Validate every LLM output against a schema and refuse free-form action specs: the model must pick a structured action type from a known list. Set confidence thresholds and treat abstention as a valid output. Run pre-action sanity checks (the user actually exists, the host is in the asset inventory). Add a kill switch that pauses the playbook when action volume is abnormal.
How do you measure whether the playbook is working?
Track four metrics: mean time to triage (target: under 5 minutes), false-positive rate on automated containment (target: under 1 percent), human override rate (a high override rate means the LLM is wrong or the decision points are badly placed), and incident throughput (alerts handled per analyst per day). Set baselines before launch and review weekly.
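These four metrics could be computed from closed-incident records roughly as follows. The `Incident` fields are assumptions about what your case management system exports; the sample numbers in the usage note are invented for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Incident:
    """Hypothetical record for one closed phishing incident."""
    triage_seconds: float     # report received to triage verdict
    auto_contained: bool      # playbook ran a containment action
    was_false_positive: bool  # later judged benign
    human_overrode: bool      # analyst changed the LLM's decision


def playbook_metrics(incidents: list, analysts: int, days: int) -> dict:
    """Compute the four review metrics for one reporting period."""
    if not incidents or analysts <= 0 or days <= 0:
        raise ValueError("need at least one incident, one analyst, and one day")
    contained = [i for i in incidents if i.auto_contained]
    fp_rate = (
        sum(i.was_false_positive for i in contained) / len(contained)
        if contained
        else 0.0
    )
    return {
        "mean_time_to_triage_min": mean(i.triage_seconds for i in incidents) / 60,
        "containment_fp_rate": fp_rate,
        "override_rate": sum(i.human_overrode for i in incidents) / len(incidents),
        "throughput_per_analyst_day": len(incidents) / (analysts * days),
    }
```

For example, four incidents with triage times of 120, 240, 180, and 60 seconds give a mean time to triage of 2.5 minutes, which is inside the 5-minute target. Compare each weekly value against the pre-launch baseline rather than reading it in isolation.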
Course content is for educational purposes only and does not constitute professional advice. All claims are supported by cited peer-reviewed academic research. DecipherU does not teach or reproduce any proprietary sales methodology. Verify all referenced sources independently.