You are a malware reverse engineer. A 612 KB Windows PE32 sample landed in your queue. The decompiler produced 38 functions. An LLM summarizer reads each function's disassembly and pseudocode and writes a one-line behavioral description.
Your job: verify the LLM summaries against the disassembly, identify the C2 retrieval routine, and rate the LLM's accuracy on this sample. The LLM does well on standard library calls and badly on hand-rolled cryptography and indirect API resolution.
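One way to turn the verification pass into an accuracy rating is to record a verdict per function and tally by behavior category. The sketch below is illustrative only: the `Verdict` record, the category labels, and the `accuracy_by_category` helper are hypothetical names, not part of the exercise's scoring.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Verdict:
    """One reviewer judgment on one LLM summary (hypothetical record)."""
    function: str   # decompiler label, e.g. a sub_XXXXXXXX name
    category: str   # assumed labels: "stdlib", "crypto", "indirect_api"
    correct: bool   # True if the summary matches the disassembly


def accuracy_by_category(verdicts):
    """Return {category: fraction of summaries judged correct}."""
    totals = defaultdict(lambda: [0, 0])  # category -> [correct, reviewed]
    for v in verdicts:
        totals[v.category][1] += 1
        if v.correct:
            totals[v.category][0] += 1
    return {cat: hits / n for cat, (hits, n) in totals.items()}


def overall_accuracy(verdicts):
    """Fraction of all reviewed summaries judged correct (0.0 if none)."""
    if not verdicts:
        return 0.0
    return sum(v.correct for v in verdicts) / len(verdicts)
```

Splitting the tally by category keeps a strong result on library calls from masking weak results on cryptography or API resolution.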
Sources: MITRE ATT&CK T1027 Obfuscated Files or Information, T1140 Deobfuscate/Decode Files or Information, T1105 Ingress Tool Transfer.
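Indirect API resolution often takes the form of export-name hashing, where the binary stores a 32-bit constant instead of an import name. The sketch below assumes a rotate-right-by-13 additive hash, a commonly seen scheme; nothing in this brief confirms that the sample uses it, and real variants may fold in the module name, change case, or hash the terminating null. The candidate export list is also an assumption.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    value &= 0xFFFFFFFF
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Assumed scheme: for each byte, rotate the hash right 13, then add the byte."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def build_lookup(export_names):
    """Map hash -> export name so constants in the disassembly can be reversed."""
    return {ror13_hash(name): name for name in export_names}


# Hypothetical candidate list; in practice, enumerate the exports of the
# DLLs the sample walks.
CANDIDATE_EXPORTS = [
    "LoadLibraryA",
    "GetProcAddress",
    "VirtualAlloc",
    "InternetOpenUrlA",
    "InternetReadFile",
]
```

A constant pulled from a resolver call site can then be checked with `build_lookup(CANDIDATE_EXPORTS).get(constant)`; a miss suggests a different hash or a missing candidate name, not necessarily a wrong reading of the code.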
One ordered pass through every step. No clock. Each answer is scored against the canonical solution.
Hints reduce the points you can earn for that step. Free-text steps queue for manual review.