arXiv 2511.04502
RAGalyst: Automated Human-Aligned Agentic Evaluation for Domain-Specific RAG
By Joshua Gao, Quoc Huy Pham, et al.
Published 2025-11-06
Retrieval-Augmented Generation (RAG) is a critical technique for grounding Large Language Models (LLMs) in factual evidence, yet evaluating RAG systems in specialized, safety-critical domains remains a significant challenge. Existing evaluation frameworks often rely on heuristic-based metrics that fail to capture domain-specific nuances, while other works use LLM-as-a-Judge approaches that lack validated alignment…