arXiv 2512.16848

Meta-RL Induces Exploration in Language Agents

By Yulun Jiang, Liangze Jiang, et al.

Published 2025-12-18

Reinforcement learning (RL) has enabled the training of large language model (LLM) agents that interact with an environment and solve multi-turn, long-horizon tasks. However, RL-trained agents often struggle on tasks that require active exploration, and they fail to adapt efficiently from trial-and-error experience. In this paper, we present LaMer, a general Meta-RL framework that enables LLM agents to actively expl…
