arXiv 2406.04325
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
By Lin Chen, Xilin Wei, et al.
Published 2024-06-06
We present the ShareGPT4Video series, which aims to facilitate the video understanding of large video-language models (LVLMs) and the video generation of text-to-video models (T2VMs) via dense and precise captions. The series comprises: 1) ShareGPT4Video, 40K GPT4V-annotated dense captions of videos with various lengths and sources, developed through carefully designed data filtering and annotation strategies. 2) ShareCap…