arXiv 2511.02817
Oolong: Evaluating Long Context Reasoning and Aggregation Capabilities
By Amanda Bertsch, Adithya Pratapa, et al.
Published 2025-11-04
As model context lengths continue to grow, concerns about whether models effectively use the full context length have persisted. While several carefully designed long-context evaluations have recently been released, these evaluations tend to rely on retrieval from one or more sections of the context, which allows nearly all of the context tokens to be disregarded as noise. This represents only one type of task that…