Multi-Augmentation-Based Contrastive Learning for Semi-Supervised Learning
Abstract
1. Introduction
- (1) A novel moderate augmentation technique is introduced and incorporated into two distinct losses.
- (2) Extensive experiments demonstrate that MAC-SSL achieves state-of-the-art (SOTA) results across all standard benchmark datasets (Section 3.3).
- (3) Ablation experiments further illustrate the strong performance of MAC-SSL (Section 3.4).
2. Method
2.1. Data Augmentation
Algorithm 1 AugMix
1: Input: Original image $x_{orig}$, operations $\mathcal{O} = \{\text{rotate}, \ldots, \text{posterize}\}$.
2: function AugmentAndMix($x_{orig}$, $k = 3$, $\alpha = 1$)
3:  Fill $x_{aug}$ with zeros of the same shape as $x_{orig}$
4:  Sample mixing weights $(w_1, \ldots, w_k) \sim \text{Dirichlet}(\alpha, \ldots, \alpha)$ ▷ Dirichlet denotes the Dirichlet distribution
5:  for $i = 1$ to $k$ do
6:   Sample operations $op_1, op_2, op_3 \sim \mathcal{O}$ ▷ Sample augmentation operations
7:   Compose operations with varying depth: $op_{12} = op_2 \circ op_1$ and $op_{123} = op_3 \circ op_2 \circ op_1$ ▷ $\circ$ denotes composition of operations
8:   Sample $chain$ uniformly from $\{op_1, op_{12}, op_{123}\}$
9:   $x_{aug} \mathrel{+}= w_i \cdot chain(x_{orig})$ ▷ Addition is elementwise; $w_i$ attaches a weight to each augmentation chain
10:  end for ▷ Completion of the augmentation process
11:  Sample weight $m \sim \text{Beta}(\alpha, \alpha)$
12:  Interpolate with the rule $x_{augmix} = m \cdot x_{orig} + (1 - m) \cdot x_{aug}$ ▷ Completion of the mix process
13:  return $x_{augmix}$
14: end function
15: $x_{augmix} = \text{AugmentAndMix}(x_{orig})$ ▷ $x_{augmix}$ is stochastically generated by the function AugmentAndMix
16: return $x_{augmix}$
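For concreteness, a minimal Python sketch of Algorithm 1 follows. The operation pool `OPS` is a placeholder (the actual AugMix pool of rotate, posterize, shear, etc., with severity levels is richer), while `k = 3` and `alpha = 1` follow the AugMix defaults; this is a sketch under those assumptions, not the paper's implementation.

```python
import numpy as np
from PIL import Image, ImageOps

# Placeholder operation pool; the real AugMix pool with severities is richer.
OPS = [ImageOps.autocontrast, ImageOps.equalize, ImageOps.mirror, ImageOps.flip]

def augment_and_mix(image: Image.Image, k: int = 3, depth: int = 3,
                    alpha: float = 1.0) -> Image.Image:
    """Mix k randomly composed augmentation chains of the input, then
    interpolate the mixture with the original image (cf. Algorithm 1)."""
    rng = np.random.default_rng()
    w = rng.dirichlet([alpha] * k)           # chain mixing weights w_1..w_k
    m = rng.beta(alpha, alpha)               # original-vs-augmented weight
    orig = np.asarray(image, dtype=np.float32)
    x_aug = np.zeros_like(orig)              # filled with zeros, same shape
    for i in range(k):
        chain = image
        for _ in range(rng.integers(1, depth + 1)):  # varying depth 1..3
            op = OPS[rng.integers(len(OPS))]         # sample an operation
            chain = op(chain)
        x_aug += w[i] * np.asarray(chain, dtype=np.float32)
    x_augmix = m * orig + (1.0 - m) * x_aug  # interpolation rule
    return Image.fromarray(np.uint8(np.clip(x_augmix, 0, 255)))
```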
2.2. Pseudo-Label Generation
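A minimal sketch of the temperature sharpening applied when generating pseudo-labels (cf. Equation (3)); the definition $\text{Sharpen}(p, T)_i = p_i^{1/T} \big/ \sum_j p_j^{1/T}$ is the standard one from MixMatch [33] and is assumed here.

```python
import torch

def sharpen(p: torch.Tensor, T: float) -> torch.Tensor:
    """Temperature sharpening of class distributions (one row per example).
    T < 1 sharpens toward a one-hot distribution; T = 1 is the identity,
    which corresponds to the 'without temperature sharpening' ablation."""
    p_t = p.pow(1.0 / T)
    return p_t / p_t.sum(dim=1, keepdim=True)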
2.3. MixUP
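A minimal sketch of the MixUP step referenced in Algorithm 2. The asymmetric $\lambda' = \max(\lambda, 1 - \lambda)$ convention follows the MixMatch variant of MixUp [38] and is an assumption here, as is the default `alpha`.

```python
import torch

def mixup(x1: torch.Tensor, p1: torch.Tensor,
          x2: torch.Tensor, p2: torch.Tensor, alpha: float = 0.75):
    """Convexly combine two (example, label-distribution) pairs."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    lam = max(lam, 1.0 - lam)   # keep the result closer to (x1, p1)
    x = lam * x1 + (1.0 - lam) * x2
    p = lam * p1 + (1.0 - lam) * p2
    return x, p
```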
Algorithm 2 MAC-SSL
1: Input: Batch of labeled examples $\mathcal{X} = \{(x_b, y_b)\}_{b=1}^{B}$, batch of unlabeled examples $\mathcal{U} = \{u_b\}_{b=1}^{\mu B}$, ratio of sample size $\mu$, sharpening temperature $T$, Beta distribution parameter $\alpha$ for MixUp, unlabeled loss weight $\lambda_u$, contrastive loss weight $\lambda_c$.
2: for $b = 1$ to $B$ do
3:  $\hat{x}_b \leftarrow \text{Weak-Augment}(x_b)$
4: end for
5: for $b = 1$ to $\mu B$ do
6:  $\hat{u}_b^{w} \leftarrow \text{Weak-Augment}(u_b)$, $\hat{u}_b^{m} \leftarrow \text{Moderate-Augment}(u_b)$, $\hat{u}_b^{s} \leftarrow \text{Strong-Augment}(u_b)$
7:  $q_b^{w} \leftarrow p_{\text{model}}(y \mid \hat{u}_b^{w})$, $q_b^{m} \leftarrow p_{\text{model}}(y \mid \hat{u}_b^{m})$
8:  $\bar{q}_b \leftarrow \frac{1}{2}(q_b^{w} + q_b^{m})$ ▷ Compute the average prediction of $\hat{u}_b^{w}$ and $\hat{u}_b^{m}$
9:  $q_b \leftarrow \text{Sharpen}(\bar{q}_b, T)$ ▷ Apply temperature sharpening to the average prediction (see Equation (3))
10: end for
11: $\hat{\mathcal{X}} \leftarrow \{(\hat{x}_b, y_b)\}_{b=1}^{B}$, $\hat{\mathcal{U}} \leftarrow \{(\hat{u}_b^{w}, q_b), (\hat{u}_b^{m}, q_b)\}_{b=1}^{\mu B}$
12: $\mathcal{W} \leftarrow \text{Shuffle}(\text{Concat}(\hat{\mathcal{X}}, \hat{\mathcal{U}}))$ ▷ Combine and shuffle labeled and unlabeled data
13: $\mathcal{X}' \leftarrow \text{MixUp}(\hat{\mathcal{X}}_i, \mathcal{W}_i)$ for $i \in (1, \ldots, |\hat{\mathcal{X}}|)$; $\mathcal{U}' \leftarrow \text{MixUp}(\hat{\mathcal{U}}_i, \mathcal{W}_{i+|\hat{\mathcal{X}}|})$ for $i \in (1, \ldots, |\hat{\mathcal{U}}|)$ ▷ see Section 2.3
14: Compute the supervised loss $\mathcal{L}_x$ on $\mathcal{X}'$ ▷ Equation (12)
15: Compute the mixed-sample loss $\mathcal{L}_{mix}$ ▷ Equation (13)
16: Compute the unsupervised loss $\mathcal{L}_u$ on $\mathcal{U}'$ using the pseudo-labels $q_b$ ▷ Equation (14)
17: Compute the contrastive loss $\mathcal{L}_c$ over the augmented unlabeled views, including $\hat{u}_b^{s}$ ▷ Equation (15)
18: $\mathcal{L} \leftarrow \mathcal{L}_x + \mathcal{L}_{mix} + \lambda_u \mathcal{L}_u + \lambda_c \mathcal{L}_c$ ▷ Equation (16)
19: return $\mathcal{L}$
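To make the data flow of Algorithm 2 concrete, below is a skeleton of one training step in PyTorch. The loss forms are deliberately simplified stand-ins (cross-entropy, an MSE consistency term, and a negative-cosine contrastive term) rather than the paper's Equations (12)-(16); the MixUp of Section 2.3 (and hence $\mathcal{L}_{mix}$) is omitted for brevity, and `embed` is an assumed projection head producing embeddings for the contrastive term.

```python
import torch
import torch.nn.functional as F

def mac_ssl_step(model, embed, x, y, u, weak, moderate, strong,
                 T: float = 0.5, lam_u: float = 1.0, lam_c: float = 1.0):
    """One MAC-SSL-style step (cf. Algorithm 2) with placeholder losses."""
    uw, um, us = weak(u), moderate(u), strong(u)
    with torch.no_grad():
        # Average the weak and moderate predictions, then sharpen (Eq. (3)).
        q = (model(uw).softmax(1) + model(um).softmax(1)) / 2
        q = q.pow(1 / T) / q.pow(1 / T).sum(1, keepdim=True)
    # MixUp across shuffled labeled/unlabeled batches omitted for brevity.
    loss_x = F.cross_entropy(model(weak(x)), y)        # supervised loss
    loss_u = F.mse_loss(model(um).softmax(1), q)       # pseudo-label consistency
    z_m = F.normalize(embed(um), dim=1)                # moderate-view embedding
    z_s = F.normalize(embed(us), dim=1)                # strong-view embedding
    loss_c = -(z_m * z_s).sum(dim=1).mean()            # stand-in contrastive term
    return loss_x + lam_u * loss_u + lam_c * loss_c
```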
2.4. The Proposed MAC-SSL
- (1) Integration of information from different types of data: The model combines labeled and unlabeled data and applies different types of loss functions to each. $\mathcal{L}_x$ and $\mathcal{L}_{mix}$ utilize the true labels of the labeled data, $\mathcal{L}_u$ utilizes the pseudo-labels of the unlabeled data, and $\mathcal{L}_c$ performs contrastive learning on the unlabeled data (a generic sketch of such a contrastive term is given after this list). By comprehensively utilizing these different types of data and loss functions, the model can learn the features of the data more thoroughly, thereby improving its performance.
- (2) Enhancing the model's generalization ability: Integrating different types of loss functions improves generalization. $\mathcal{L}_x$ and $\mathcal{L}_{mix}$ help the model learn accurate classification decisions from labeled data, while $\mathcal{L}_u$ and $\mathcal{L}_c$ help it exploit the information in unlabeled data.
- (3) Strengthening the model's understanding of the data: Different types of loss functions allow the model to view the data from different perspectives. $\mathcal{L}_x$ and $\mathcal{L}_{mix}$ help the model learn the true labels, while $\mathcal{L}_u$ and $\mathcal{L}_c$ enable it to learn the distribution and features of the data from unlabeled samples, thereby improving the model's robustness.
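The exact contrastive loss is defined by Equation (15) and is not reproduced here; for intuition, a generic InfoNCE-style loss over two augmented views of the same unlabeled batch can be sketched as follows, where `tau` is an assumed temperature hyperparameter.

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Generic InfoNCE: row i of z1 and z2 are embeddings of two views of
    the same image (positives); all other rows in the batch are negatives."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                           # scaled cosine similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)
```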
3. Results
3.1. Implementation Details
- Datasets and metrics: MAC-SSL was experimentally validated on the CIFAR-10 [42], CIFAR-100 [42], and SVHN [43] datasets, as shown in Figure 5.
- CIFAR-10: The CIFAR-10 dataset is a widely used benchmark in the field of computer vision. It consists of 60,000 color images, each sized at 32 × 32 pixels, belonging to 10 different classes. These classes include common objects such as airplanes, automobiles, cats, birds, dogs, and more. The dataset is divided into 50,000 training images and 10,000 test images, with an equal distribution of images across the classes.
- CIFAR-100: Similar to CIFAR-10, CIFAR-100 is another dataset used for image classification tasks. It contains 60,000 images, each also sized at 32 × 32 pixels, but it is divided into 100 fine-grained classes. Each class represents a specific object or concept, such as insects, food containers, trees, fish, and so on. The dataset is split into 50,000 training images and 10,000 test images, with a balanced distribution of images across the classes. Classifying CIFAR-100 is more difficult than CIFAR-10 due to its larger number of categories.
- SVHN (Street View House Numbers): The SVHN dataset is focused on digit recognition. It consists of real-world images of house numbers taken from Google Street View. In the cropped-digit format used here, each 32 × 32 image is centered on a single digit (zero to nine), although distracting digits may appear at the edges. SVHN is divided into a training set with 73,257 images and a test set with 26,032 images.
- Experiment setting: All experiments were implemented in PyTorch and conducted on an Ubuntu server with four NVIDIA RTX 3090 GPUs and 128 GB of memory, following standard SSL evaluation protocols. The "Wide ResNet-28" architecture from [44] was used throughout. Training on the CIFAR-10 [42] and SVHN [43] datasets ran for 300 epochs until convergence, with a batch size of 64 and the Wide ResNet-28-2 model (1.47 M parameters). Due to computational limitations, a batch size of 32 was used for the CIFAR-100 [42] dataset in MAC-SSL, while the other methods kept a batch size of 64; the Wide ResNet-28-8 model (23.46 M parameters) was used for CIFAR-100. Hyperparameters were selected via random search. The sample-ratio hyperparameter $\mu$ was set to 5, the unlabeled loss weight $\lambda_u$ was set by this search, and the contrastive loss weight $\lambda_c$ was set to 1. The learning rate was 0.01 for CIFAR-10, CIFAR-100, and SVHN, and the confidence threshold was set to 0.95. Training used the SGD optimizer with cosine learning-rate decay. An exponential moving average (EMA) of the model weights with a decay rate of 0.999 was used for evaluation; a minimal sketch of such an evaluator is given below. Of note, for SVHN, strong augmentation applied RandAugment [40] on top of moderate augmentation; this variant is referred to as MAC-SSL-II and differs from the CIFAR-10/100 setup. Each epoch comprised 1024 training steps, with a checkpoint saved at every epoch; the average accuracy of the last 20 checkpoints is reported, which simplifies the analysis.
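Since the reported numbers come from an EMA of the model weights, a minimal sketch of such an evaluator is shown below. The parameter-wise average with decay 0.999 matches the setting above, while the direct copying of buffers (e.g., BatchNorm statistics) is a simplifying assumption.

```python
import copy
import torch

class EmaModel:
    """Maintains an exponential moving average of a model's parameters
    for evaluation (decay 0.999, as in the setup described above)."""
    def __init__(self, model: torch.nn.Module, decay: float = 0.999):
        self.decay = decay
        self.ema = copy.deepcopy(model).eval()
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model: torch.nn.Module):
        # Parameter-wise EMA: e <- decay * e + (1 - decay) * p
        for e, p in zip(self.ema.parameters(), model.parameters()):
            e.mul_(self.decay).add_(p, alpha=1.0 - self.decay)
        # Buffers (e.g., BatchNorm running stats) are copied directly.
        for eb, b in zip(self.ema.buffers(), model.buffers()):
            eb.copy_(b)
```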
3.2. Comparison Algorithms
To demonstrate the superiority of MAC-SSL, performance comparison experiments were conducted with the following seven state-of-the-art semi-supervised algorithms:
- (1) CoMatch [34] combines pseudo-labeling, contrastive learning, and graph-based modeling to improve performance with limited labeled data. It jointly learns class probabilities and low-dimensional embeddings, enhancing the quality of the pseudo-labels by imposing a smoothness constraint on the class probabilities.
- (2) MixMatch [33] optimizes both a supervised and an unsupervised loss, using cross-entropy for the supervised loss and the mean squared error (MSE) between predictions and generated pseudo-labels for the unsupervised loss. MixMatch constructs pseudo-labels through data augmentation and improves their quality with the Sharpen function. MixUP [38] interpolation is also employed to create virtual samples.
- (3) Mean-Teacher [32] employs a student–teacher approach to SSL. The teacher model's weights are an average of the student model's weights, maintained via an exponential moving average (EMA) whose decay rate controls the update speed. Mean-Teacher uses the MSE between the two models' predictions as the consistency loss.
- (4) ICT [45] extends MixUP by interpolating unlabeled data, generating diverse mixed samples, and enforces consistency across different interpolation ratios through regularization: the predictions for mixed data are constrained to align with the mixed predictions for the original data (a minimal sketch of this consistency loss is given after this list). It effectively utilizes unlabeled data, particularly when labeled data are limited, resulting in improved generalization.
- (5) VAT [46] replaces data augmentation with adversarial transformations. It perturbs the input data in the adversarial direction and enforces consistent predictions, leading to lower classification error.
- (6) Temporal-ensembling [30] improves model consistency and robustness by maintaining an exponential moving average of the historical predictions during training. The model is trained by minimizing a consistency loss on the predictions for unlabeled data together with the supervised loss on labeled data.
- (7) Pimodel [30] is based on data augmentation and consistency regularization. It generates perturbed samples through data augmentation and trains the model by enforcing consistency between the predictions for differently perturbed versions of the same input.
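As an illustration of the interpolation-consistency idea behind ICT (item (4) above), here is a minimal sketch; `model` and `ema_model` are assumed classifier callables returning logits, not components of any specific baseline implementation.

```python
import torch
import torch.nn.functional as F

def ict_consistency(model, ema_model, u1, u2, alpha: float = 1.0):
    """ICT-style loss: the prediction for a mixed input should match the
    same mixture of the (teacher) predictions for the original inputs."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed = lam * u1 + (1.0 - lam) * u2
    with torch.no_grad():   # teacher targets are not backpropagated through
        target = (lam * ema_model(u1).softmax(1)
                  + (1.0 - lam) * ema_model(u2).softmax(1))
    return F.mse_loss(model(mixed).softmax(1), target)
```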
3.3. Performance Comparison
- CIFAR-10: For CIFAR-10, performance comparison experiments were conducted against six baselines: MixMatch [33], Mean-Teacher [32], ICT [45], VAT [46], Temporal-ensembling [30], and Pimodel [30]. Accuracy was evaluated with the number of labeled samples varying from 250 to 4000, as is standard practice. The results are shown in Figure 6. MAC-SSL was significantly superior to all other methods, especially when labeled samples were scarce: with 250 and 500 labels, MAC-SSL outperformed the second-best method, MixMatch, by 6.82% and 7.02%, respectively. These results highlight MAC-SSL's ability to exploit information from unlabeled data, delivering strong performance even with few labeled samples.
- CIFAR-100: To further demonstrate the effectiveness of MAC-SSL, we conducted comparative experiments on CIFAR-100 with the same baselines as for CIFAR-10, varying the number of labeled samples from 400 to 2500. The results are presented in Figure 7. MAC-SSL achieved the best performance throughout. With 400 labeled samples, i.e., only 4 labeled samples per class, MAC-SSL outperformed the second-best method, MixMatch, by 16.58%. Notably, the fewer the labeled samples, the larger MAC-SSL's improvement, further validating the proposed method's ability to effectively utilize information from unlabeled samples.
- SVHN: We conducted comparative experiments on the SVHN dataset, adding CoMatch [34] to the baselines used for CIFAR-10 and CIFAR-100. Accuracy was evaluated with the number of labeled samples varying from 250 to 4000. The results are presented in Figure 8. As before, MAC-SSL achieved the best results and showed greater improvements as the number of labeled samples decreased, once again confirming its ability to fully exploit unlabeled samples.
3.4. Ablation Experiment
- (1) Investigating the effect of different strong augmentation strategies: directly applying RandAugment (MAC-SSL) versus applying RandAugment on top of moderate augmentation (MAC-SSL-II).
- (2) Using different values of the sample-ratio hyperparameter $\mu$, ranging from 1 to 9.
- (3) Removing temperature sharpening (i.e., setting $T = 1$).
- (4) Performing MixUP between labeled examples only, between unlabeled examples only, and without mixing across labeled and unlabeled examples.
- (5) Using the mean class distribution over two augmentations (i.e., weak and moderate) versus the class distribution of a single augmentation (i.e., weak only).
- (6) Employing weak, moderate, and strong augmentation versus employing only weak and strong augmentation (with the class distribution computed from weak augmentation only).
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Accuracy (%) on CIFAR-10 with varying numbers of labeled samples.

Method | 250 Labels | 500 Labels | 1000 Labels | 4000 Labels
---|---|---|---|---
Temporal-ensembling [30] | 19.86 ± 1.90 | 29.48 ± 11.44 | 49.16 ± 5.34 | 61.97 ± 4.61
VAT [46] | 34.22 ± 2.46 | 51.30 ± 2.11 | 61.56 ± 0.61 | 79.08 ± 0.62
Pimodel [30] | 46.90 ± 2.15 | 53.93 ± 0.16 | 68.01 ± 0.08 | 85.92 ± 0.17
Mean-Teacher [32] | 45.30 ± 0.10 | 54.98 ± 0.12 | 73.14 ± 0.11 | 84.34 ± 0.08
ICT [45] | 59.38 ± 2.78 | 69.07 ± 2.15 | 83.43 ± 0.40 | 92.20 ± 0.30
MixMatch [33] | 86.88 ± 0.34 | 86.91 ± 0.17 | 90.03 ± 0.16 | 93.08 ± 0.08
MAC-SSL (ours) | 93.70 ± 0.10 | 93.93 ± 0.10 | 94.32 ± 0.18 | 94.96 ± 0.09
Accuracy (%) on CIFAR-100 with varying numbers of labeled samples.

Method | 400 Labels | 800 Labels | 1000 Labels | 2500 Labels
---|---|---|---|---
Temporal-ensembling [30] | 8.42 ± 0.14 | 11.85 ± 0.14 | 13.33 ± 0.14 | 24.15 ± 0.14
VAT [46] | 10.23 ± 0.13 | 14.67 ± 0.14 | 15.27 ± 0.15 | 34.46 ± 0.19
Pimodel [30] | 11.35 ± 0.04 | 19.59 ± 0.15 | 21.66 ± 0.07 | 34.19 ± 0.05
Mean-Teacher [32] | 8.93 ± 0.09 | 15.38 ± 0.08 | 17.31 ± 0.13 | 37.90 ± 0.18
ICT [45] | 18.81 ± 0.22 | 26.57 ± 0.16 | 30.79 ± 0.73 | 56.53 ± 0.42
MixMatch [33] | 21.57 ± 0.20 | 39.42 ± 0.18 | 44.74 ± 0.14 | 58.21 ± 0.19
MAC-SSL (ours) | 38.15 ± 0.22 | 51.32 ± 0.25 | 52.20 ± 0.13 | 61.64 ± 0.16
Accuracy (%) on SVHN with varying numbers of labeled samples.

Method | 250 Labels | 500 Labels | 1000 Labels | 4000 Labels
---|---|---|---|---
Temporal-ensembling [30] | 31.54 ± 15.92 | 48.14 ± 19.06 | 67.86 ± 6.01 | 81.16 ± 4.89
VAT [46] | 55.66 ± 11.11 | 72.90 ± 3.92 | 80.74 ± 2.17 | 92.07 ± 0.25
Pimodel [30] | 86.62 ± 0.31 | 90.29 ± 0.05 | 91.90 ± 0.08 | 94.34 ± 0.11
ICT [45] | 83.17 ± 1.27 | 87.75 ± 0.75 | 92.03 ± 0.22 | 95.66 ± 0.16
Mean-Teacher [32] | 84.64 ± 0.08 | 90.48 ± 0.07 | 91.47 ± 0.05 | 94.36 ± 0.04
MixMatch [33] | 89.39 ± 0.34 | 88.94 ± 0.30 | 88.73 ± 0.38 | 92.51 ± 0.08
CoMatch [34] | 92.35 ± 0.24 | 95.06 ± 0.66 | 95.60 ± 0.54 | 95.96 ± 0.42
MAC-SSL (ours) | 95.87 ± 0.06 | 96.08 ± 0.07 | 96.46 ± 0.04 | 96.85 ± 0.04
References
- Huang, X.; Song, Z.; Ji, C.; Zhang, Y.; Yang, L. Research on a Classification Method for Strip Steel Surface Defects Based on Knowledge Distillation and a Self-Adaptive Residual Shrinkage Network. Algorithms 2023, 16, 516. [Google Scholar] [CrossRef]
- He, R.; Han, Z.; Lu, X.; Yin, Y. Safe-student for safe deep semi-supervised learning with unseen-class unlabeled data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022; pp. 14585–14594. [Google Scholar]
- Zhang, L.; Xiong, N.; Pan, X.; Yue, X.; Wu, P.; Guo, C. Improved Object Detection Method Utilizing YOLOv7-Tiny for Unmanned Aerial Vehicle Photographic Imagery. Algorithms 2023, 16, 520. [Google Scholar] [CrossRef]
- Wu, H.; Ma, X.; Liu, S. Designing multi-task convolutional variational autoencoder for radio tomographic imaging. IEEE Trans. Circuits Syst. II Express Briefs 2021, 69, 219–223. [Google Scholar] [CrossRef]
- Pacella, M.; Mangini, M.; Papadia, G. Utilizing Mixture Regression Models for Clustering Time-Series Energy Consumption of a Plastic Injection Molding Process. Algorithms 2023, 16, 524. [Google Scholar] [CrossRef]
- Wang, S.; Liu, X.; Liu, L.; Tu, W.; Zhu, X.; Liu, J.; Zhou, S.; Zhu, E. Highly-efficient incomplete large-scale multi-view clustering with consensus bipartite graph. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022; pp. 9776–9785. [Google Scholar]
- Zhu, F.; Zhao, J.; Cai, Z. A Contrastive Learning Method for the Visual Representation of 3D Point Clouds. Algorithms 2022, 15, 89. [Google Scholar] [CrossRef]
- Ali, O.; Ali, H.; Shah, S.A.A.; Shahzas, A. Implementation of a modified U-Net for medical image segmentation on edge devices. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 4593–4597. [Google Scholar] [CrossRef]
- Hu, X.; Zeng, Y.; Xu, X.; Zhou, S.; Liu, L. Robust semi-supervised classification based on data augmented online ELMs with deep features. Knowl. Based Syst. 2021, 229, 107307. [Google Scholar] [CrossRef]
- Lindstrom, M.R.; Ding, X.; Liu, F.; Somayajula, A.; Needell, D. Continuous Semi-Supervised Nonnegative Matrix Factorization. Algorithms 2023, 16, 187. [Google Scholar] [CrossRef]
- Yang, J.; Cao, J.; Xue, A. Robust maximum mixture correntropy criterion-based semi-supervised ELM with variable center. IEEE Trans. Circuits Syst. II Express Briefs 2020, 67, 3572–3576. [Google Scholar] [CrossRef]
- Saito, K.; Kim, D.; Sclaroff, S.; Darrell, T.; Saenko, K. Semi-supervised domain adaptation via minimax entropy. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8050–8058. [Google Scholar]
- Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
- Hu, Z.; Yang, Z.; Hu, X.; Nevatia, R. Simple: Similar pseudo label exploitation for semi-supervised classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Online, 19–25 June 2021; pp. 15099–15108. [Google Scholar]
- Cai, J.; Hao, J.; Yang, H.; Zhao, X.; Yang, Y. A review on semi-supervised clustering. Inf. Sci. 2023, 632, 164–200. [Google Scholar] [CrossRef]
- Chen, X.; Yuan, Y.; Zeng, G.; Wang, J. Semi-supervised semantic segmentation with cross pseudo supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Online, 19–25 June 2021; pp. 2613–2622. [Google Scholar]
- Xu, M.; Zhang, Z.; Hu, H.; Wang, J.; Wang, L.; Wei, F.; Bai, X.; Liu, Z. End-to-end semi-supervised object detection with soft teacher. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 3060–3069. [Google Scholar]
- Kostopoulos, G.; Karlos, S.; Kotsiantis, S.; Ragos, O. Semi-supervised regression: A recent review. J. Intell. Fuzzy Syst. 2018, 35, 1483–1500. [Google Scholar] [CrossRef]
- Van Engelen, J.E.; Hoos, H.H. A survey on semi-supervised learning. Mach. Learn. 2020, 109, 373–440. [Google Scholar] [CrossRef]
- Lee, D.H. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In Proceedings of the Workshop on Challenges in Representation Learning, ICML, Atlanta, GA, USA, 16–21 June 2013; p. 896. [Google Scholar]
- Rizve, M.N.; Duarte, K.; Rawat, Y.S.; Shah, M. In defense of pseudo-labeling: An uncertainty-aware pseudo-label selection framework for semi-supervised learning. arXiv 2021, arXiv:2101.06329. [Google Scholar]
- Xie, Q.; Luong, M.T.; Hovy, E.; Le, Q.V. Self-training with noisy student improves imagenet classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 10687–10698. [Google Scholar]
- Rosenberg, C.; Hebert, M.; Schneiderman, H. Semi-supervised self-training of object detection models. In Proceedings of the 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION’05), Breckenridge, CO, USA, 5–7 January 2005; Volume 1, pp. 29–36. [Google Scholar]
- Zhai, X.; Oliver, A.; Kolesnikov, A.; Beyer, L. S4l: Self-supervised semi-supervised learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1476–1485. [Google Scholar]
- Li, X.; Sun, Q.; Liu, Y.; Zhou, Q.; Zheng, S.; Chua, T.S.; Schiele, B. Learning to self-train for semi-supervised few-shot classification. In Proceedings of the Advances in Neural Information Processing Systems: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8 December 2019; p. 32. [Google Scholar]
- Tanha, J.; Van Someren, M.; Afsarmanesh, H. Semi-supervised self-training for decision tree classifiers. Int. J. Mach. Learn. Cybern. 2017, 8, 355–370. [Google Scholar] [CrossRef]
- Moskalenko, V.; Kharchenko, V.; Moskalenko, A.; Petrov, S. Model and Training Method of the Resilient Image Classifier Considering Faults, Concept Drift, and Adversarial Attacks. Algorithms 2022, 15, 384. [Google Scholar] [CrossRef]
- Zhang, H.; Zhang, Z.; Odena, A.; Lee, H. Consistency regularization for generative adversarial networks. arXiv 2019, arXiv:1910.12027. [Google Scholar]
- Bachman, P.; Alsharif, O.; Precup, D. Learning with pseudo-ensembles. In Proceedings of the Advances in Neural Information Processing Systems: Annual Conference on Neural Information Processing Systems 2014, Montreal, QC, Canada, 8–13 December 2014; p. 27. [Google Scholar]
- Laine, S.; Aila, T. Temporal ensembling for semi-supervised learning. arXiv 2016, arXiv:1610.02242. [Google Scholar]
- Zahedi, E.; Saraee, M.; Masoumi, F.S.; Yazdinejad, M. Regularized Contrastive Masked Autoencoder Model for Machinery Anomaly Detection Using Diffusion-Based Data Augmentation. Algorithms 2023, 16, 431. [Google Scholar] [CrossRef]
- Tarvainen, A.; Valpola, H. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; p. 30. [Google Scholar]
- Berthelot, D.; Carlini, N.; Goodfellow, I.; Papernot, N.; Oliver, A.; Raffel, C.A. Mixmatch: A holistic approach to semi-supervised learning. In Proceedings of the Advances in Neural Information Processing Systems: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; p. 32. [Google Scholar]
- Li, J.; Xiong, C.; Hoi, S.C. Comatch: Semi-supervised learning with contrastive graph regularization. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 9475–9484. [Google Scholar]
- Antoniou, A.; Storkey, A.; Edwards, H. Data augmentation generative adversarial networks. arXiv 2017, arXiv:1711.04340. [Google Scholar]
- Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 1–48. [Google Scholar] [CrossRef]
- Zhong, Z.; Zheng, L.; Kang, G.; Li, S.; Yang, Y. Random erasing data augmentation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), New York, NY, USA, 7–12 February 2020; Volume 34, pp. 13001–13008. [Google Scholar]
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. mixup: Beyond empirical risk minimization. arXiv 2017, arXiv:1710.09412. [Google Scholar]
- Hendrycks, D.; Mu, N.; Cubuk, E.D.; Zoph, B.; Gilmer, J.; Lakshminarayanan, B. Augmix: A simple data processing method to improve robustness and uncertainty. arXiv 2019, arXiv:1912.02781. [Google Scholar]
- Cubuk, E.D.; Zoph, B.; Shlens, J.; Le, Q.V. Randaugment: Practical automated data augmentation with a reduced search space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 702–703. [Google Scholar]
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
- Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images; Technical Report; University of Toronto: Toronto, ON, Canada, 2009; pp. 1–58. [Google Scholar]
- Netzer, Y.; Wang, T.; Coates, A.; Bissacco, A.; Wu, B.; Ng, A.Y. Reading digits in natural images with unsupervised feature learning. In Proceedings of the NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011, Granada, Spain, 12–17 December 2011. [Google Scholar]
- Zagoruyko, S.; Komodakis, N. Wide residual networks. arXiv 2016, arXiv:1605.07146. [Google Scholar]
- Verma, V.; Kawaguchi, K.; Lamb, A.; Kannala, J.; Solin, A.; Bengio, Y.; Lopez-Paz, D. Interpolation consistency training for semi-supervised learning. Neural Netw. 2022, 145, 90–106. [Google Scholar] [CrossRef]
- Miyato, T.; Maeda, S.I.; Koyama, M.; Ishii, S. Virtual adversarial training: A regularization method for supervised and semi-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1979–1993. [Google Scholar] [CrossRef]
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar]
Ablation results on CIFAR-10 (accuracy, %; µ = 5).

Method | 40 Labels | 500 Labels | 1000 Labels | 4000 Labels
---|---|---|---|---
MAC-SSL (base for CIFAR-10) | 87.36 ± 0.09 | 93.93 ± 0.10 | 94.32 ± 0.18 | 94.96 ± 0.09
MAC-SSL-II (base for SVHN) | 90.22 ± 0.14 | 93.29 ± 0.07 | 93.97 ± 0.08 | 94.91 ± 0.07
MAC-SSL/MAC-SSL-II without temperature sharpening (T = 1) | – | – | 93.31 ± 0.08 | –
MAC-SSL/MAC-SSL-II without MixUP | – | – | 91.88 ± 0.11 | –
MAC-SSL/MAC-SSL-II with MixUP on labeled only | – | – | 90.07 ± 0.14 | –
MAC-SSL/MAC-SSL-II with MixUP on unlabeled only | – | – | 93.97 ± 0.35 | –
MAC-SSL/MAC-SSL-II with MixUP on separate labeled and unlabeled | – | – | 92.36 ± 0.42 | –
MAC-SSL/MAC-SSL-II without distribution averaging | – | – | 91.60 ± 0.18 | –
MAC-SSL/MAC-SSL-II without moderate augmentation | – | – | 94.18 ± 0.24 | –
Ablation results on SVHN (accuracy, %; µ = 5).

Method | 250 Labels | 1000 Labels
---|---|---
MAC-SSL (base for CIFAR-10) | 95.35 ± 0.10 | 95.65 ± 0.08
MAC-SSL-II (base for SVHN) | 95.87 ± 0.06 | 96.46 ± 0.04
MAC-SSL/MAC-SSL-II without temperature sharpening (T = 1) | – | 95.95 ± 0.05
MAC-SSL/MAC-SSL-II without MixUP | – | 94.83 ± 0.12
MAC-SSL/MAC-SSL-II with MixUP on labeled only | – | 94.02 ± 0.13
MAC-SSL/MAC-SSL-II with MixUP on unlabeled only | – | 96.17 ± 0.06
MAC-SSL/MAC-SSL-II with MixUP on separate labeled and unlabeled | – | 96.06 ± 0.04
MAC-SSL/MAC-SSL-II without distribution averaging | – | 96.44 ± 0.04
MAC-SSL/MAC-SSL-II without moderate augmentation | – | 95.50 ± 0.06
Accuracy (%) on CIFAR-10 with 1000 labels for varying sample ratios µ.

Method | µ = 1 | µ = 3 | µ = 5 | µ = 7 | µ = 9
---|---|---|---|---|---
MAC-SSL | 89.47 ± 0.14 | 93.18 ± 0.08 | 94.32 ± 0.18 | 94.55 ± 0.08 | 96.46 ± 0.04
MAC-SSL-II | 88.49 ± 0.18 | 92.96 ± 0.17 | 93.96 ± 0.08 | 94.50 ± 0.09 | 94.85 ± 0.11