arXiv 2211.05232
MuMIC -- Multimodal Embedding for Multi-label Image Classification with Tempered Sigmoid
By Fengjun Wang, Sarai Mizrachi, et al.
Published 2022-11-02
Multi-label image classification is a foundational topic in various domains. Multimodal learning approaches have recently achieved outstanding results in image representation and single-label image classification. For instance, Contrastive Language-Image Pretraining (CLIP) demonstrates impressive image-text representation learning abilities and is robust to natural distribution shifts. This success inspires us to le…