arXiv 2602.20021

Agents of Chaos

By Natalie Shapira, Chris Wendler, et al.

Published 2026-02-23

We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Over a two-week period, twenty AI researchers interacted with the agents under benign and adversarial conditions. Focusing on failures emerging from the integration of language models with autonomy,…
