arXiv 2212.03191
InternVideo: General Video Foundation Models via Generative and Discriminative Learning
By Yi Wang, Kunchang Li, et al.
Published 2022-12-06
Wiki summary
Foundation models have recently shown excellent performance on a variety of downstream tasks in computer vision. However, most existing vision foundation models focus only on image-level pretraining and adaptation, which limits them on dynamic and complex video-level understanding tasks. To fill the gap, we present general video foundation models, InternVideo, by taking advantage of both generative and discrimi…