You are a cybersecurity privacy engineer at Example HealthTech Co. The triage chatbot helps patients answer routine questions. The system prompt embeds the patient's medical record summary. The chatbot has been observed repeating other patients' summaries when asked carefully.
Two leakage paths: cross-conversation memory contamination and system-prompt leakage. The product team wants to ship a fix this sprint and a longer-term architecture next quarter.
This scenario tests OWASP LLM06:2025 Sensitive Information Disclosure, the architectural patterns that prevent it, and the test gate that catches regressions. Sources: OWASP LLM Top 10 (2025), HIPAA Privacy Rule, Zou et al. 2023 'Universal and Transferable Adversarial Attacks on Aligned Language Models'.
One ordered pass through every step. No clock. Each answer scores against the canonical solution.
Hints reduce the points you can earn for that step. Free-text steps queue for manual review.