arXiv 2408.00714

SAM 2: Segment Anything in Images and Videos

By Nikhila Ravi, Valentin Gabeur, et al.

Published 2024-08-01

We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos. We build a data engine, which improves model and data via user interaction, to collect the largest video segmentation dataset to date. Our model is a simple transformer architecture with streaming memory for real-time video processing. SAM 2 trained on our data provides strong performa…
