arXiv 2510.23948
ChessQA: Evaluating Large Language Models for Chess Understanding
By Qianfeng Wen, Zhenwei Tang, et al.
Published 2025-10-28
Wiki summary
Chess provides an ideal testbed for evaluating the reasoning, modeling, and abstraction capabilities of large language models (LLMs), as it has well-defined structure and objective ground truth while admitting a wide spectrum of skill levels. However, existing evaluations of LLM ability in chess are ad hoc and narrow in scope, making it difficult to accurately measure LLM chess understanding and how it varies with s…