arXiv 2311.18232
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
By Marwa Abdulhai, Isadora White, et al.
Published 2023-11-30
Large language models (LLMs) provide excellent text-generation capabilities, but standard prompting and generation methods generally do not lead to intentional or goal-directed agents and might necessitate considerable prompt tuning. This becomes particularly apparent in multi-turn conversations: even the best current LLMs rarely ask clarifying questions, engage in explicit information gathering, or take actions now…