arXiv 2510.23948
ChessQA: Evaluating Large Language Models for Chess Understanding
By Qianfeng Wen, Zhenwei Tang, et al.
Published 2025-10-28
Chess provides an ideal testbed for evaluating the reasoning, modeling, and abstraction capabilities of large language models (LLMs), as it has well-defined structure and objective ground truth while admitting a wide spectrum of skill levels. However, existing evaluations of LLM ability in chess are ad hoc and narrow in scope, making it difficult to accurately measure LLM chess understanding and how it varies with s…