arXiv:2103.00020
Learning Transferable Visual Models From Natural Language Supervision
By Alec Radford, Jong Wook Kim, et al.
Published 2021-02-26
Wiki summary
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple…