arXiv 2211.05232

MuMIC -- Multimodal Embedding for Multi-label Image Classification with Tempered Sigmoid

By Fengjun Wang, Sarai Mizrachi, et al.

Published 2022-11-02

Multi-label image classification is a foundational topic in various domains. Multimodal learning approaches have recently achieved outstanding results in image representation and single-label image classification. For instance, Contrastive Language-Image Pretraining (CLIP) demonstrates impressive image-text representation learning abilities and is robust to natural distribution shifts. This success inspires us to le…
