arXiv 2507.22133
Prompt Optimization and Evaluation for LLM Automated Red Teaming
By Michael Freenor, Lauren Alvarez, et al.
Published 2025-07-29
Applications that use Large Language Models (LLMs) are becoming widespread, making the identification of system vulnerabilities increasingly important. Automated Red Teaming accelerates this effort by using an LLM to generate and execute attacks against target systems. Attack generators are evaluated using the Attack Success Rate (ASR), the sample mean of the per-attack success judgments. In this…
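As a rough illustration of the ASR definition above, here is a minimal sketch that computes the sample mean of per-attack success judgments. The function name `attack_success_rate` and the boolean encoding of a judgment are assumptions made for this example; the abstract does not specify how judgments are represented or produced.

```python
from typing import Sequence


def attack_success_rate(judgments: Sequence[bool]) -> float:
    """Return the sample mean of per-attack success judgments.

    Each entry is assumed (hypothetically) to be True when a judge
    deemed the attack successful and False otherwise.
    """
    if len(judgments) == 0:
        # The sample mean is undefined for an empty sample.
        raise ValueError("ASR requires at least one judged attack")
    # True counts as 1 and False as 0, so the sum is the number of successes.
    return sum(judgments) / len(judgments)


# Hypothetical judgments for four attacks: three judged successful.
example_judgments = [True, False, True, True]
example_asr = attack_success_rate(example_judgments)
```

With the four hypothetical judgments shown, three of the four attacks succeed, so `example_asr` is 0.75.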