Proceeding Paper

The Details Matter: Preventing Class Collapse in Supervised Contrastive Learning †

Department of Computer Science, Stanford University, Stanford, CA 94305, USA
* Author to whom correspondence should be addressed.
† Presented at the AAAI Workshop on Artificial Intelligence with Biased or Scarce Data (AIBSD), Online, 28 February 2022.
These authors contributed equally to this work.
Academic Editors: Kuan-Chuan Peng and Ziyan Wu
Comput. Sci. Math. Forum 2022, 3(1), 4; https://doi.org/10.3390/cmsf2022003004
Published: 15 April 2022
Abstract: Supervised contrastive learning optimizes a loss that pushes together embeddings of points from the same class while pulling apart embeddings of points from different classes. Class collapse—when every point from the same class has the same embedding—minimizes this loss but loses critical information that is not encoded in the class labels. For instance, the “cat” label does not capture unlabeled categories such as breeds, poses, or backgrounds (which we call “strata”). As a result, class collapse produces embeddings that are less useful for downstream applications such as transfer learning and achieves suboptimal generalization error when there are strata. We explore a simple modification to supervised contrastive loss that aims to prevent class collapse by uniformly pulling apart individual points from the same class. We seek to understand the effects of this loss by examining how it embeds strata of different sizes, finding that it clusters larger strata more tightly than smaller strata. As a result, our loss function produces embeddings that better distinguish strata in embedding space, which produces lift on three downstream applications: 4.4 points on coarse-to-fine transfer learning, 2.5 points on worst-group robustness, and 1.0 points on minimal coreset construction. Our loss also produces more accurate models, with up to 4.0 points of lift across 9 tasks.
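The modification sketched in the abstract — keeping the supervised contrastive objective while uniformly pushing apart individual points from the same class — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the simple mean-similarity spread penalty, and the weight `alpha` are illustrative assumptions.

```python
import numpy as np

def sup_con_loss(z, labels, temperature=0.1):
    # Supervised contrastive loss (Khosla et al., 2020): for each anchor,
    # every other same-class point is a positive; all other points form
    # the softmax denominator.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # project to unit sphere
    sim = z @ z.T / temperature
    n = len(labels)
    total, count = 0.0, 0
    for i in range(n):
        others = np.arange(n) != i
        log_denom = np.log(np.exp(sim[i, others]).sum())
        for p in np.where((labels == labels[i]) & others)[0]:
            total += log_denom - sim[i, p]  # -log p(positive | anchor)
            count += 1
    return total / count

def spread_penalty(z, labels, temperature=0.1):
    # Average similarity among same-class pairs; minimizing it uniformly
    # pushes apart individual points within each class (a hypothetical
    # stand-in for the paper's modification).
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    pairs = [sim[i, j] for i in range(n) for j in range(n)
             if i != j and labels[i] == labels[j]]
    return float(np.mean(pairs))

def modified_loss(z, labels, alpha=0.1, temperature=0.1):
    # SupCon plus a weighted spread term that discourages class collapse.
    return sup_con_loss(z, labels, temperature) + alpha * spread_penalty(z, labels, temperature)
```

Fully collapsed embeddings (all same-class points identical) minimize the SupCon term but maximize the spread penalty, so the combined objective no longer favors collapse, leaving room in the embedding to separate strata within a class.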
Keywords: contrastive learning; supervised contrastive learning; transfer learning; robustness; noisy labels; coresets
MDPI and ACS Style

Fu, D.Y.; Chen, M.F.; Zhang, M.; Fatahalian, K.; Ré, C. The Details Matter: Preventing Class Collapse in Supervised Contrastive Learning. Comput. Sci. Math. Forum 2022, 3, 4. https://doi.org/10.3390/cmsf2022003004

