arXiv 2211.05232

MuMIC -- Multimodal Embedding for Multi-label Image Classification with Tempered Sigmoid

By Fengjun Wang, Sarai Mizrachi, et al.

Published 2022-11-02

Multi-label image classification is a foundational topic in various domains. Multimodal learning approaches have recently achieved outstanding results in image representation and single-label image classification. For instance, Contrastive Language-Image Pretraining (CLIP) demonstrates impressive image-text representation learning abilities and is robust to natural distribution shifts. This success inspires us to le…
