arXiv 2602.20021

Agents of Chaos

By Natalie Shapira, Chris Wendler, et al.

Published 2026-02-23

We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Over a two-week period, twenty AI researchers interacted with the agents under benign and adversarial conditions. Focusing on failures emerging from the integration of language models with autonomy,…
