arXiv 2603.07427

AutoControl Arena: Synthesizing Executable Test Environments for Frontier AI Risk Evaluation

By Changyi Li, Pengfei Lu, et al.

Published 2026-03-08

Wiki summary

As Large Language Models (LLMs) evolve into autonomous agents, existing safety evaluations face a fundamental trade-off: manual benchmarks are costly, while LLM-based simulators are scalable but suffer from logic hallucination. We present AutoControl Arena, an automated framework for frontier AI risk evaluation built on the principle of logic-narrative decoupling. By grounding deterministic state in executable code…
