Range Scenario · crucible · 40 min
Adversarial Examples: Image Classifier Under Attack
This cybersecurity training scenario simulates a working incident. An image classifier used for content safety misclassifies adversarially perturbed inputs. Trace the attack class, deploy the right defenses, and set the eval gate.
Scenario briefing
You are an AI cybersecurity engineer at Example Defense Tech. Your image classifier reviews photos uploaded to the platform for safety violations. A red-team report shows that adversarially perturbed images (epsilon-bounded L-infinity perturbations) bypass the classifier with a 78 percent attack success rate.
Design the defense stack and the evaluation gate that catches regressions. The model team has access to adversarial training, randomized smoothing, and input preprocessing.
This scenario tests MITRE ATLAS AML.T0015 Evade ML Model, the Goodfellow et al. 2014 'Explaining and Harnessing Adversarial Examples' baseline, and the practical defense stack. Sources: Goodfellow et al. 2014, Madry et al. 2018 'Towards Deep Learning Models Resistant to Adversarial Attacks', MITRE ATLAS, Carlini & Wagner 2017.
What you will practice
- Distinguish FGSM, PGD, and C&W attacks
- Apply adversarial training, randomized smoothing, and input preprocessing
- Set evaluation thresholds that include adversarial robustness
- Frame the defense as raising attacker cost, not eliminating risk
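An evaluation gate that includes adversarial robustness can be sketched as a simple release check. This is a minimal illustration, not the scenario's answer key: the names `GateThresholds` and `eval_gate` are hypothetical, and every threshold value is a placeholder that a real team would set from its own baseline measurements at a fixed epsilon and attack configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    # All values are hypothetical placeholders, not recommendations.
    min_clean_acc: float = 0.90    # accuracy on unperturbed inputs
    min_robust_acc: float = 0.40   # accuracy under the chosen attack at fixed epsilon
    max_robust_drop: float = 0.02  # allowed regression vs. the previous release


def eval_gate(clean_acc, robust_acc, prev_robust_acc, t=GateThresholds()):
    """Return (passed, failures) for a candidate model."""
    failures = []
    if clean_acc < t.min_clean_acc:
        failures.append(f"clean accuracy {clean_acc:.3f} < {t.min_clean_acc:.3f}")
    if robust_acc < t.min_robust_acc:
        failures.append(f"robust accuracy {robust_acc:.3f} < {t.min_robust_acc:.3f}")
    if prev_robust_acc - robust_acc > t.max_robust_drop:
        failures.append(
            f"robust accuracy regressed by {prev_robust_acc - robust_acc:.3f}"
        )
    return len(failures) == 0, failures


if __name__ == "__main__":
    passed, reasons = eval_gate(clean_acc=0.95, robust_acc=0.22, prev_robust_acc=0.50)
    print(passed, reasons)
```

Checking the regression against the previous release, in addition to an absolute floor, is one way to catch a model update that quietly trades robustness for clean accuracy.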
How this scenario is scored
The scenario has 6 ordered steps. Most steps are exact-match (a MITRE ATT&CK technique ID, a tool name, or a yes/no decision) or multiple choice. Free-text steps queue for manual review and do not affect the auto-final-score in the MVP.
Each step has a max score of 100 points. Hints deduct points up front, listed before you reveal them. Your final score is the sum across steps. Range Elo updates on completion based on scenario difficulty (Advanced) and your final score percentage.
Frequently asked questions
What are FGSM and PGD?
FGSM (Fast Gradient Sign Method, Goodfellow et al. 2014) is a single-step attack that shifts each pixel by epsilon in the direction of the sign of the loss gradient with respect to the input. PGD (Projected Gradient Descent, Madry et al. 2018) repeats that signed-gradient step with a smaller step size and projects the result back into the epsilon-ball around the original input after each step, producing stronger attacks. PGD is the standard baseline for both adversarial training and adversarial evaluation.
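The difference between the two attacks can be sketched on a toy model. The example below assumes a two-feature logistic-regression "classifier" with hand-picked weights, chosen only because its input gradient has a closed form; the function names and all numeric values are illustrative. It also starts PGD from the clean input, whereas Madry et al. additionally use a random start inside the epsilon-ball.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Hard label from a toy logistic-regression model."""
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)


def loss_grad_wrt_input(x, y, w, b):
    # For binary cross-entropy on a logistic model, dL/dx = (p - y) * w.
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, eps):
    """Single step of size eps along the sign of the input gradient."""
    g = loss_grad_wrt_input(x, y, w, b)
    return [min(1.0, max(0.0, xi + eps * sign(gi))) for xi, gi in zip(x, g)]


def pgd(x, y, w, b, eps, alpha, steps):
    """Repeated small signed-gradient steps, projected into the L-inf eps-ball."""
    x_adv = list(x)
    for _ in range(steps):
        g = loss_grad_wrt_input(x_adv, y, w, b)
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project back into [x - eps, x + eps], then into the valid pixel range.
        x_adv = [min(x0 + eps, max(x0 - eps, xa)) for x0, xa in zip(x, x_adv)]
        x_adv = [min(1.0, max(0.0, xa)) for xa in x_adv]
    return x_adv


if __name__ == "__main__":
    x, y = [0.6, 0.4], 1          # illustrative input and true label
    w, b = [2.0, -2.0], 0.0       # illustrative model weights
    print(predict(x, w, b))
    print(predict(fgsm(x, y, w, b, eps=0.15), w, b))
    print(predict(pgd(x, y, w, b, eps=0.15, alpha=0.05, steps=10), w, b))
```

On a linear model like this one, the gradient sign never changes, so PGD ends at the same point as FGSM. The extra iterations matter on nonlinear networks, where the gradient direction shifts as the input moves.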
What does adversarial training actually do?
Adversarial training generates adversarial examples against the current model during training and updates the model to classify them correctly. Madry et al. 2018 made it the de facto defense baseline. It raises attacker cost but does not eliminate the attack class. Robust accuracy typically lags clean accuracy.
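The training loop can be sketched on a toy logistic-regression model. This is a simplified illustration under stated assumptions: it uses single-step FGSM as the inner attack for brevity, whereas Madry et al. use multi-step PGD, and the function names, learning rate, and epoch count are all hypothetical choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(x, y, w, b, eps):
    """One signed-gradient step; dL/dx = (p - y) * w for this model."""
    p = predict_proba(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [
        min(1.0, max(0.0, xi + eps * ((g > 0) - (g < 0))))
        for xi, g in zip(x, grad)
    ]


def adversarial_train(data, eps, lr=0.5, epochs=100):
    """Full-batch gradient descent on examples perturbed against the current model."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            # Inner step: craft the attack against the current weights.
            x_adv = fgsm(x, y, w, b, eps)
            # Outer step: accumulate the loss gradient on the perturbed input.
            err = predict_proba(x_adv, w, b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x_adv)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def robust_accuracy(data, w, b, eps):
    """Fraction of examples still classified correctly after an FGSM perturbation."""
    hits = 0
    for x, y in data:
        x_adv = fgsm(x, y, w, b, eps)
        hits += int((predict_proba(x_adv, w, b) >= 0.5) == bool(y))
    return hits / len(data)
```

Because the attack is regenerated against the current weights at every epoch, the model is always trained on perturbations aimed at its present decision boundary rather than at a stale one.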
What is randomized smoothing?
Randomized smoothing adds Gaussian noise to inputs at inference time and returns the class predicted most often across many noisy copies. Cohen et al. 2019 formalized the approach and showed that it gives certified robustness within an L2 radius that depends on the noise level and on how dominant the top class is. The guarantee is provable, but every prediction requires many forward passes, which adds inference cost.
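The voting step and the radius calculation can be sketched as follows. This is a simplified illustration: `smoothed_predict` and `certified_l2_radius` are hypothetical names, the base classifier is any callable supplied by the caller, and the sketch omits the statistical machinery from Cohen et al. 2019 (the abstention test and the confidence bound on the top-class probability). The caller is assumed to pass an already computed lower confidence bound as `p_a_lower`.

```python
import random
from collections import Counter
from statistics import NormalDist


def smoothed_predict(base_classifier, x, sigma, n_samples, rng):
    """Majority vote of the base classifier over Gaussian-noised copies of x."""
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    top_class, top_count = votes.most_common(1)[0]
    return top_class, top_count / n_samples


def certified_l2_radius(p_a_lower, sigma):
    """L2 radius sigma * Phi^{-1}(p_a_lower), given a lower bound on the top-class probability.

    Returns 0.0 when the bound does not exceed 1/2, in which case nothing is certified.
    """
    if p_a_lower >= 1.0:
        raise ValueError("p_a_lower must be a confidence bound strictly below 1")
    if p_a_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_a_lower)


if __name__ == "__main__":
    # Toy base classifier (an assumption): class 1 if the first feature is larger.
    toy = lambda v: int(v[0] > v[1])
    label, vote_share = smoothed_predict(toy, [0.9, 0.1], 0.1, 200, random.Random(0))
    print(label, vote_share)
    print(certified_l2_radius(0.975, sigma=0.5))
```

The raw vote share is not itself a valid `p_a_lower`. A certification procedure derives the bound from a separate, larger sample so that the guarantee holds with a stated confidence.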
Course content is for educational purposes only and does not constitute professional advice. All claims are supported by cited peer-reviewed academic research. DecipherU does not teach or reproduce any proprietary sales methodology. Verify all referenced sources independently.