arXiv 2505.05541

Safety by Measurement: A Systematic Literature Review of AI Safety Evaluation Methods

By Markov Grey and Charbel-Raphaël Segerie

Published 2025-05-08

As frontier AI systems advance toward transformative capabilities, we need a parallel transformation in how we measure and evaluate these systems to ensure safety and inform governance. While benchmarks have been the primary method for estimating model capabilities, they often fail to establish true upper bounds or predict deployment behavior. This literature review consolidates the rapidly evolving field of AI safe…