arXiv 2311.18232

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

By Marwa Abdulhai, Isadora White, et al.

Published 2023-11-30

Large language models (LLMs) provide excellent text-generation capabilities, but standard prompting and generation methods generally do not lead to intentional or goal-directed agents and might necessitate considerable prompt tuning. This becomes particularly apparent in multi-turn conversations: even the best current LLMs rarely ask clarifying questions, engage in explicit information gathering, or take actions now…
