arXiv 2505.05445
clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations
By Chalamalasetti Kranti, Sherzod Hakimov, et al.
Published 2025-05-08
The emergence of instruction-tuned large language models (LLMs) has advanced the field of dialogue systems, enabling both realistic user simulations and robust multi-turn conversational agents. However, existing research often evaluates these components in isolation, focusing either on a single user simulator or on a specific system design, which limits the generalisability of insights across architectures and configurations…