arXiv 2510.23948
ChessQA: Evaluating Large Language Models for Chess Understanding
By Qianfeng Wen, Zhenwei Tang, et al.
Published 2025-10-28
Chess provides an ideal testbed for evaluating the reasoning, modeling, and abstraction capabilities of large language models (LLMs), as it has well-defined structure and objective ground truth while admitting a wide spectrum of skill levels. However, existing evaluations of LLM ability in chess are ad hoc and narrow in scope, making it difficult to accurately measure LLM chess understanding and how it varies with s…