arXiv 2603.11214
Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios
By Linus Folkerts, Will Payne, et al.
Published 2026-03-11
We evaluate the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges (a 32-step corporate network attack and a 7-step industrial control system attack) that require chaining heterogeneous capabilities across extended action sequences. By comparing seven models released over an eighteen-month period (August 2024 to February 2026) at varying inference-time compute budgets, we obse…