arXiv 2511.04502

RAGalyst: Automated Human-Aligned Agentic Evaluation for Domain-Specific RAG

By Joshua Gao, Quoc Huy Pham, et al.

Published 2025-11-06

Retrieval-Augmented Generation (RAG) is a critical technique for grounding Large Language Models (LLMs) in factual evidence, yet evaluating RAG systems in specialized, safety-critical domains remains a significant challenge. Existing evaluation frameworks often rely on heuristic-based metrics that fail to capture domain-specific nuances, while other works use LLM-as-a-Judge approaches that lack validated alignment…
