arXiv 2511.10647
Depth Anything 3: Recovering the Visual Space from Any Views
By Haotong Lin, Sili Chen, et al.
Published 2025-11-13
We present Depth Anything 3 (DA3), a model that predicts spatially consistent geometry from an arbitrary number of visual inputs, with or without known camera poses. In pursuit of minimal modeling, DA3 yields two key insights: a single plain transformer (e.g., vanilla DINO encoder) is sufficient as a backbone without architectural specialization, and a singular depth-ray prediction target obviates the need for compl…