arXiv 2507.19457
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
By Lakshya A Agrawal, Shangyin Tan, et al.
Published 2025-07-25
Large language models (LLMs) are increasingly adapted to downstream tasks via reinforcement learning (RL) methods like Group Relative Policy Optimization (GRPO), which often require thousands of rollouts to learn new tasks. We argue that the interpretable nature of language can often provide a much richer learning medium for LLMs, compared with policy gradients derived from sparse, scalar rewards. To test this, we i…