arXiv 2511.07885
Intelligence per Watt: Measuring Intelligence Efficiency of Local AI
By Jon Saad-Falcon, Avanika Narayan, et al.
Published 2025-11-11
Large language model (LLM) queries are predominantly processed by frontier models in centralized cloud infrastructure. Rapidly growing demand strains this paradigm, and cloud providers struggle to scale infrastructure at pace. Two advances enable us to rethink this paradigm: small LMs (<=20B active parameters) now achieve performance competitive with frontier models on many tasks, and local accelerators (e.g., Apple M…