arXiv 2505.05445
clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations
By Chalamalasetti Kranti, Sherzod Hakimov, et al.
Published 2025-05-08
The emergence of instruction-tuned large language models (LLMs) has advanced the field of dialogue systems, enabling both realistic user simulations and robust multi-turn conversational agents. However, existing research often evaluates these components in isolation, focusing either on a single user simulator or on a specific system design, which limits the generalisability of insights across architectures and configurations…