arXiv 2508.09932

Mathematical Computation and Reasoning Errors by Large Language Models

By Liang Zhang and Edith Aurora Graf

Published 2025-08-13

Large Language Models (LLMs) are increasingly utilized in AI-driven educational instruction and assessment, particularly within mathematics education. The capability of LLMs to generate accurate answers and detailed solutions for math problem-solving tasks is foundational for ensuring reliable and precise feedback and assessment in math education practices. Our study focuses on evaluating the accuracy of four LLMs (…
