arXiv 2512.16848

Meta-RL Induces Exploration in Language Agents

By Yulun Jiang, Liangze Jiang, et al.

Published 2025-12-18

Reinforcement learning (RL) has made it possible to train large language model (LLM) agents that interact with an environment and solve multi-turn, long-horizon tasks. However, RL-trained agents often struggle on tasks that require active exploration and fail to adapt efficiently from trial-and-error experience. In this paper, we present LaMer, a general Meta-RL framework that enables LLM agents to actively expl…
