arXiv 2303.15343
Sigmoid Loss for Language Image Pre-Training
By Xiaohua Zhai, Basil Mustafa, et al.
Published 2023-03-27
We propose a simple pairwise Sigmoid loss for Language-Image Pre-training (SigLIP). Unlike standard contrastive learning with softmax normalization, the sigmoid loss operates solely on image-text pairs and does not require a global view of the pairwise similarities for normalization. The sigmoid loss simultaneously allows further scaling up the batch size, while also performing better at smaller batch sizes. Combine…
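The pairwise formulation described above can be illustrated with a minimal sketch. It treats every image-text combination in a batch as an independent binary classification (matching pairs labeled +1, all others -1), so no softmax normalization over the whole batch is needed. The function names, the pure-Python list representation, and the default `temperature` and `bias` values are assumptions chosen for illustration; this is not the authors' implementation.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(1 / (1 + exp(-x))).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def l2_normalize(v):
    # Scale a vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sigmoid_pairwise_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    # temperature and bias are learnable in practice; the defaults here are
    # illustrative assumptions, not values taken from the text above.
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    total = 0.0
    for i, img in enumerate(imgs):
        for j, txt in enumerate(txts):
            logit = temperature * sum(a * b for a, b in zip(img, txt)) + bias
            # +1 for the matching pair (i == j), -1 for every other pairing.
            label = 1.0 if i == j else -1.0
            # Each pair contributes independently: no batch-wide normalization.
            total += log_sigmoid(label * logit)
    # Average over the number of images in the batch.
    return -total / len(imgs)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print(sigmoid_pairwise_loss(images, aligned_texts))
    print(sigmoid_pairwise_loss(images, swapped_texts))
```

Because each term depends only on a single image-text pair, the double loop can be split into chunks and computed piecewise, which is the property that makes the loss convenient at large batch sizes.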