arXiv 2507.22133

Prompt Optimization and Evaluation for LLM Automated Red Teaming

By Michael Freenor, Lauren Alvarez, et al.

Published 2025-07-29

Applications that use Large Language Models (LLMs) are becoming widespread, making the identification of system vulnerabilities increasingly important. Automated Red Teaming accelerates this effort by using an LLM to generate and execute attacks against target systems. Attack generators are evaluated using the Attack Success Rate (ASR), the sample mean of the per-attack success judgments. In this…
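
As a rough illustration of the ASR definition above, the sketch below computes the sample mean over a list of per-attack success judgments. The function name `attack_success_rate` and the example judgments are hypothetical, chosen only for illustration. They are not taken from the paper, and how each judgment is produced (for instance by a judge model) is outside the scope of this sketch.

```python
from typing import Sequence


def attack_success_rate(judgments: Sequence[bool]) -> float:
    """Return the sample mean of per-attack success judgments.

    Each entry is True if the attack was judged successful, else False.
    (Hypothetical helper, not the authors' implementation.)
    """
    if not judgments:
        raise ValueError("ASR is undefined for an empty set of attacks")
    successes = sum(1 for judged_success in judgments if judged_success)
    return successes / len(judgments)


# Made-up judgments for five attacks: two judged successful.
example_judgments = [True, False, False, True, False]
print(f"ASR = {attack_success_rate(example_judgments):.2f}")  # prints "ASR = 0.40"
```

Raising an error on an empty input, rather than returning 0, keeps "no attacks were run" distinct from "no attack succeeded".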