arXiv 2509.24372
Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning
By Xin Qiu, Yulu Gan, et al.
Published 2025-09-29
Fine-tuning pre-trained large language models (LLMs) for downstream tasks is a critical step in the AI deployment pipeline. Reinforcement learning (RL) is arguably the most prominent fine-tuning method, contributing to the birth of many state-of-the-art LLMs. In contrast, evolution strategies (ES), which once showed comparable performance to RL on models with a few million parameters, were neglected due to the pessi…