arXiv 2511.04670

Cambrian-S: Towards Spatial Supersensing in Video

By Shusheng Yang, Jihan Yang, et al.

Published 2025-11-06

We argue that progress in true multimodal intelligence calls for a shift from reactive, task-driven systems and brute-force long context towards a broader paradigm of supersensing. We frame spatial supersensing as four stages beyond linguistic-only understanding: semantic perception (naming what is seen), streaming event cognition (maintaining memory across continuous experiences), implicit 3D spatial cognition (inf…
