arXiv 2507.19457
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
By Lakshya A Agrawal, Shangyin Tan, et al.
Published 2025-07-25
Large language models (LLMs) are increasingly adapted to downstream tasks via reinforcement learning (RL) methods like Group Relative Policy Optimization (GRPO), which often require thousands of rollouts to learn new tasks. We argue that the interpretable nature of language can often provide a much richer learning medium for LLMs, compared with policy gradients derived from sparse, scalar rewards. To test this, we i…