arXiv 2509.24372
Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning
By Xin Qiu, Yulu Gan, et al.
Published 2025-09-29
Fine-tuning pre-trained large language models (LLMs) for downstream tasks is a critical step in the AI deployment pipeline. Reinforcement learning (RL) is arguably the most prominent fine-tuning method, contributing to the birth of many state-of-the-art LLMs. In contrast, evolution strategies (ES), which once showed comparable performance to RL on models with a few million parameters, was neglected due to the pessi…