arXiv 2503.16416

Survey on Evaluation of LLM-based Agents

By Asaf Yehudai, Lilach Eden, et al.

Published 2025-03-20

The emergence of LLM-based agents represents a paradigm shift in AI, enabling autonomous systems to plan, reason, use tools, and maintain memory while interacting with dynamic environments. This paper provides the first comprehensive survey of evaluation methodologies for these increasingly capable agents. We systematically analyze evaluation benchmarks and frameworks across four critical dimensions: (1) fundamental…