Range Scenario · gauntlet · 60 min

AI SOAR Playbook Design: LLM-Driven Phishing Response

This cybersecurity training scenario simulates a working incident. Design an LLM-driven cybersecurity playbook for a phishing campaign in 60 minutes. Specify prompts, decision points, escalation criteria, and hallucination guardrails. Defend the design to your skeptical security director.

Advanced·AI for Cybersecurity·8 steps·Last verified April 2026

MITRE ATT&CK techniques

T1566.002 T1078 T1059.001

Start cybersecurity scenario Browse all scenarios

Scenario briefing

You are the cybersecurity automation lead. The CISO wants a SOAR playbook that uses an LLM to triage, contain, and document phishing incidents at scale. Phishing is your team's highest-volume incident class: 40 to 60 reports per day.

60 minutes to design. Output: a playbook with named LLM steps, prompt structure, decision points, escalation criteria, and guardrails against hallucination. The director will challenge every automation decision. Your design must justify human-in-the-loop placement.

This scenario tests whether you can build LLM automation that is safe in production. The trap is over-automation: if the LLM can reset passwords and quarantine endpoints unsupervised, hallucinations turn into outages. The other trap is under-automation: every step gated on a human is no better than the manual runbook.

What you will practice

Design an LLM-driven SOAR playbook with explicit human-in-the-loop placement
Specify prompts, decision points, and escalation criteria
Build hallucination guardrails into automated workflows
Justify automation depth using reversibility and blast-radius reasoning

How this scenario is scored

The scenario has 8 ordered steps. Most steps are exact-match (a MITRE ATT&CK technique ID, a tool name, or a yes/no decision) or multiple choice. Free-text steps queue for manual review and do not affect the auto-final-score in the MVP.

Each step has a max score of 100 points. Hints deduct points up front, listed before you reveal them. Your final score is the sum across steps. Range Elo updates on completion based on scenario difficulty (Advanced) and your final score percentage.

Frequently asked questions

Where should a human stay in the loop in an LLM-driven SOAR playbook?

Use the reversibility and blast-radius test. Reversible, low-blast-radius actions (tagging an email, drafting a notification, enriching an indicator) can run unsupervised. Irreversible or high-blast-radius actions (quarantining endpoints, forcing password resets, blocking domains at the edge) need human approval. Mature playbooks fail closed by default.

What guardrails stop LLM hallucination from causing outages?

Schema validation on every LLM output. Refuse free-form action specs. Require structured action types from a known list. Confidence thresholds with abstention as a valid output. Pre-action sanity checks (the user actually exists, the host is in the asset inventory). And a kill switch that pauses the playbook if abnormal volume hits.

How do you measure if the playbook is working?

Mean time to triage (target: under 5 minutes), false-positive rate on automated containment (target: under 1 percent), human override rate (a high override rate means the LLM is wrong or the decision points are placed badly), and incident throughput (alerts handled per analyst per day). Set baselines before launch and review weekly.

Course content is for educational purposes only and does not constitute professional advice. All claims are supported by cited peer-reviewed academic research. DecipherU does not teach or reproduce any proprietary sales methodology. Verify all referenced sources independently.

Last verified: 2026-04-26?Report an inaccuracy

Get cybersecurity career insights delivered weekly

Join cybersecurity professionals receiving weekly intelligence on threats, job market trends, salary data, and career growth strategies.

By subscribing you agree to our privacy policy. Unsubscribe anytime.

AI SOAR Playbook Design: LLM-Driven Phishing Response

Advanced·AI for Cybersecurity·8 steps·Last verified April 2026

Scenario briefing

How this scenario is scored

Frequently asked questions