Range Scenario · crucible · 30 min
Model Denial of Service: Long-Context Resource Exhaustion
This training scenario simulates a live incident. An attacker floods your public LLM endpoint with prompts close to its 200,000-token limit. Costs spike and latency degrades. Trace the abuse, then design the rate-limit and cost guardrails.
Scenario briefing
You are an SRE running the public cybersecurity LLM endpoint at Example Inference Inc. Your endpoint accepts up to 200,000-token prompts. The pricing is per-token.
An anonymous user has been submitting 195,000-token prompts at 8 requests per minute for the past 6 hours. Cost is $4,200 in the last 6 hours alone, latency for legitimate users went from 1.2s to 6.8s, and the GPU pool is saturated.
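The briefing figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above. The implied price per million tokens is derived, not a published rate, and it assumes the whole $4,200 is attributable to the attacker's prompt tokens.

```python
# Back-of-the-envelope check of the briefing numbers.
PROMPT_TOKENS = 195_000     # tokens per abusive request (from the briefing)
REQUESTS_PER_MINUTE = 8     # attacker request rate (from the briefing)
DURATION_HOURS = 6          # length of the abuse window (from the briefing)
COST_USD = 4_200            # spend over the window (from the briefing)

total_requests = REQUESTS_PER_MINUTE * 60 * DURATION_HOURS
total_tokens = total_requests * PROMPT_TOKENS
tokens_per_minute = REQUESTS_PER_MINUTE * PROMPT_TOKENS

# Assumption: all spend comes from the attacker's input tokens.
implied_usd_per_million_tokens = COST_USD / (total_tokens / 1_000_000)
cost_per_hour = COST_USD / DURATION_HOURS

print(f"requests:            {total_requests:,}")                      # 2,880
print(f"tokens:              {total_tokens:,}")                        # 561,600,000
print(f"tokens per minute:   {tokens_per_minute:,}")                   # 1,560,000
print(f"implied $/1M tokens: {implied_usd_per_million_tokens:.2f}")    # 7.48
print(f"burn rate:           ${cost_per_hour:,.0f}/hour")              # $700/hour
```

A single identity pushing about 1.56 million tokens per minute is the signal to look for when tracing the abuse, since a request count of 8 per minute looks harmless on its own.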
This scenario tests OWASP LLM10:2025 Unbounded Consumption (the 2025 successor to Model Denial of Service, which was LLM04 in the 2023 list), the design of input-size limits, rate limits per identity, and cost guardrails. Sources: OWASP LLM Top 10 (2025), AWS Well-Architected Reliability Pillar.
What you will practice
- Map abuse to OWASP LLM10:2025 Unbounded Consumption
- Design per-identity rate limits and token-budget caps
- Distinguish economic DoS from availability DoS in the LLM context
- Set circuit-breaker triggers based on cost, not just latency
How this scenario is scored
The scenario has 6 ordered steps. Most steps are exact-match (a MITRE ATT&CK technique ID, a tool name, or a yes/no decision) or multiple choice. Free-text steps queue for manual review and do not affect the auto-final-score in the MVP.
Each step has a maximum score of 100 points. Each hint costs points, and the deduction is shown before you reveal it. Your final score is the sum across steps. Range Elo updates on completion based on scenario difficulty (Intermediate) and your final score percentage.
Frequently asked questions
What is the difference between availability DoS and economic DoS for LLMs?
Availability DoS makes the service unusable for everyone. Economic DoS is when the cost of serving the attacker exceeds revenue, even if availability holds. LLMs are especially exposed to economic DoS because both billing and compute scale with token count, so an attacker can quietly drain budget without breaking the service.
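One way to make the distinction concrete is to classify a monitoring window from its metrics. This is a minimal sketch: the `WindowMetrics` fields, the `classify_dos` function, and both thresholds are hypothetical choices for illustration, and treating the anonymous attacker's revenue as $0 is an assumption.

```python
from dataclasses import dataclass

@dataclass
class WindowMetrics:
    latency_s: float           # observed latency in the window
    baseline_latency_s: float  # normal latency for the endpoint
    error_rate: float          # fraction of failed requests, 0.0 to 1.0
    serving_cost_usd: float    # cost of serving the identity in the window
    revenue_usd: float         # revenue collected from that identity

def classify_dos(m, latency_factor=3.0, max_error_rate=0.05):
    """Return the set of DoS types the metrics point to (thresholds are illustrative)."""
    kinds = set()
    degraded = m.latency_s > latency_factor * m.baseline_latency_s
    if degraded or m.error_rate > max_error_rate:
        kinds.add("availability")
    if m.serving_cost_usd > m.revenue_usd:
        kinds.add("economic")
    return kinds

# The briefing incident: latency 1.2s -> 6.8s and $4,200 of unpaid spend.
incident = WindowMetrics(6.8, 1.2, 0.0, 4200.0, 0.0)
# A quieter variant: latency barely moves, but spend still outruns revenue.
quiet_drain = WindowMetrics(1.3, 1.2, 0.0, 500.0, 0.0)

print(sorted(classify_dos(incident)))     # ['availability', 'economic']
print(sorted(classify_dos(quiet_drain)))  # ['economic']
```

The second case is the one latency-only alerting misses, which is why the guardrails in this scenario key on cost as well.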
Why per-identity rate limits over per-IP?
Per-IP rate limits are bypassed by IPv6 rotation and residential proxies, and they penalize legitimate users who share one address behind CGNAT. Per-identity limits (authenticated user, API key, or session token) give a stable handle. For anonymous endpoints, combine identity with IP and a short-lived browser fingerprint, and accept that abuse rates will be higher than on authenticated endpoints.
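A per-identity limit can be denominated in LLM tokens rather than requests, so that one 195,000-token prompt counts for more than one short one. The sketch below is a simple token-bucket under stated assumptions: `TokenBudgetLimiter`, `identity_key`, and the budget numbers are hypothetical, state is kept in process memory, and the clock is injectable only to make the example deterministic.

```python
import time

class TokenBudgetLimiter:
    """Per-identity bucket that refills in LLM tokens per minute."""

    def __init__(self, tokens_per_minute, burst_tokens, clock=time.monotonic):
        self.rate = tokens_per_minute / 60.0  # refill rate in tokens per second
        self.capacity = burst_tokens          # largest balance an identity can hold
        self.clock = clock
        self.buckets = {}  # identity key -> (available tokens, last refill time)

    def allow(self, identity, prompt_tokens):
        now = self.clock()
        available, last = self.buckets.get(identity, (self.capacity, now))
        # Refill for the elapsed time, capped at the burst capacity.
        available = min(self.capacity, available + (now - last) * self.rate)
        if prompt_tokens > available:
            self.buckets[identity] = (available, now)
            return False
        self.buckets[identity] = (available - prompt_tokens, now)
        return True

def identity_key(api_key=None, ip=None, fingerprint=None):
    """Prefer a stable credential; fall back to IP plus fingerprint when anonymous."""
    if api_key:
        return f"key:{api_key}"
    return f"anon:{ip}:{fingerprint}"

# Deterministic demo with a fake clock (values are illustrative).
now = [0.0]
limiter = TokenBudgetLimiter(60_000, 200_000, clock=lambda: now[0])
who = identity_key(ip="203.0.113.7", fingerprint="fp-abc")

first = limiter.allow(who, 195_000)   # True: fits in the initial burst
second = limiter.allow(who, 195_000)  # False: only 5,000 tokens remain
now[0] += 60.0                        # one minute later: 5,000 + 60,000 available
third = limiter.allow(who, 195_000)   # False: 65,000 is still too little
small = limiter.allow(who, 50_000)    # True: a normal-sized prompt passes
print(first, second, third, small)    # True False False True
```

Because a request larger than `burst_tokens` can never pass, the burst capacity also acts as an input-size cap for that identity tier.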
What is a circuit breaker for LLM cost?
A circuit breaker monitors cumulative spend per identity, per endpoint, per minute. When spend exceeds the budget, the breaker opens: the endpoint refuses new requests from that identity and returns HTTP 429 (Too Many Requests) with a clear error message. The breaker resets on a timer or after a manual review.
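The behaviour described above can be sketched as a small in-memory class. `CostCircuitBreaker`, its parameters, and the dollar amounts are hypothetical; a production version would need shared storage and per-endpoint keys, which are left out here.

```python
import time

class CostCircuitBreaker:
    """Opens for an identity when its spend in a sliding window exceeds a budget."""

    def __init__(self, budget_usd, window_s=60.0, cooldown_s=300.0, clock=time.monotonic):
        self.budget_usd = budget_usd
        self.window_s = window_s
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.spend = {}       # identity -> list of (timestamp, usd)
        self.open_until = {}  # identity -> time at which the breaker closes again

    def record(self, identity, usd):
        """Record the cost of a served request; open the breaker if over budget."""
        now = self.clock()
        recent = [(t, c) for t, c in self.spend.get(identity, []) if now - t < self.window_s]
        recent.append((now, usd))
        self.spend[identity] = recent
        if sum(c for _, c in recent) > self.budget_usd:
            self.open_until[identity] = now + self.cooldown_s

    def check(self, identity):
        """Return 200 if the request may proceed, 429 while the breaker is open."""
        until = self.open_until.get(identity)
        if until is not None and self.clock() < until:
            return 429
        self.open_until.pop(identity, None)  # timer expired: close the breaker
        return 200

    def reset(self, identity):
        """Manual reset after review: clear the open state and the spend history."""
        self.open_until.pop(identity, None)
        self.spend.pop(identity, None)

# Deterministic demo with a fake clock: $5 budget per minute, $1.50 per request.
now = [0.0]
breaker = CostCircuitBreaker(5.0, clock=lambda: now[0])
for _ in range(3):
    breaker.record("anon-1", 1.50)
before = breaker.check("anon-1")        # 200: $4.50 is within budget
breaker.record("anon-1", 1.50)
after = breaker.check("anon-1")         # 429: $6.00 exceeds the $5 budget
now[0] += 300.0
after_timer = breaker.check("anon-1")   # 200: the cooldown has elapsed
print(before, after, after_timer)       # 200 429 200
```

Keying the trigger on dollars rather than latency means the breaker still fires in the quiet case, where the service stays responsive while the budget drains.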
Course content is for educational purposes only and does not constitute professional advice. All claims are supported by cited peer-reviewed academic research. DecipherU does not teach or reproduce any proprietary sales methodology. Verify all referenced sources independently.