Proceeding Paper

Self-Supervised Learning for Complex Pattern Interpretation in Vitiligo Skin Imaging †

by Priyanka Pawar 1,*, Anagha Kulkarni 1, Bhavana Pansare 1, Prajakta Pawar 2, Prachi Bahekar 3 and Madhavi Kapre 3

1 Dr. D. Y. Patil School of Science and Technology, Dr. D. Y. Patil Vidyapeeth, Pimpri, Pune 411033, India
2 Department of Mechanical Engineering, Dr. D. Y. Patil Institute of Technology, Pimpri, Pune 411018, India
3 Department of Computer Engineering, Pimpri Chinchwad College of Engineering and Research, Pune 412101, India
* Author to whom correspondence should be addressed.
Presented at the First International Conference on Computational Intelligence and Soft Computing (CISCom 2025), Melaka, Malaysia, 26–27 November 2025.
Comput. Sci. Math. Forum 2025, 12(1), 9; https://doi.org/10.3390/cmsf2025012009
Published: 18 December 2025

Abstract

Vitiligo is a skin condition in which the gradual breakdown of melanocytes produces depigmented patches. High variability, complex lesion morphology, and subtle differences between affected and unaffected skin make accurate diagnosis difficult, and conventional supervised image analysis techniques have trouble generalizing in these situations. By allowing models to acquire meaningful representations from unlabeled data, self-supervised learning (SSL) presents a viable alternative. The SSL-based framework for vitiligo skin image analysis proposed in this study uses contrastive learning with augmentation-based pretext tasks to capture complex visual patterns such as patch distribution, texture loss, and border irregularity. The SSL-enhanced model achieved a validation accuracy of 0.83 after fine-tuning on a small labeled subset, suggesting that SSL could support accurate and label-efficient vitiligo assessment in clinical and research settings. Direct comparisons with existing supervised models were not performed and are left for future research.

1. Introduction

Vitiligo is a chronic dermatological disorder typified by the gradual loss of melanocytes, which leaves patches of skin depigmented [1]. Early and accurate diagnosis is essential for effective therapeutic intervention, tracking disease progression, and developing customized treatments. In dermatology, clinical imaging methods such as dermoscopy, Wood’s lamp examination, ultraviolet imaging, high-frequency ultrasound, and standard digital photography have grown in popularity [2,3]. However, the great variability of skin imaging, the dearth of extensive, high-quality annotated datasets, and the time-consuming, costly annotation that requires specialized domain knowledge make these modalities difficult to use at scale.
A self-supervised learning (SSL) [4,5] paradigm has been suggested as a solution to these problems. SSL is a promising method for challenging tasks like vitiligo pattern interpretation because it allows models to learn informative representations from unlabeled data by completing pretext tasks. This work aims to develop and apply a self-supervised learning framework specifically for dermatological imaging; use learned representations for downstream tasks like lesion classification, pattern segmentation, and disease severity estimation; and evaluate the SSL model’s performance, generalizability, and interpretability across a range of patient datasets and imaging scenarios.
A unique SSL framework created especially for vitiligo imaging, sophisticated pattern recognition capabilities via feature learning, and thorough testing using actual clinical datasets are some of the main contributions of this paper. This work advances automated vitiligo diagnosis and monitoring by addressing the drawbacks of conventional supervised learning in skin imaging and putting forth an SSL-based solution.

2. Background

Visual evaluation using a variety of imaging modalities, each providing unique insights into skin’s structure and pigmentation, is crucial for diagnosing and monitoring vitiligo. By providing high-resolution cross-sectional imaging of the skin’s layers, High-Frequency Ultrasound (HFUS) [6] makes it possible to see structural changes like inflammation, changes in the dermal–epidermal junction, and variations in skin thickness [7]. Through magnification and polarized light, dermoscopy improves surface visualization by exposing pigmentary and microvascular patterns that are frequently invisible to the unaided eye. To help detect subtle lesions, especially in fair-skinned people whose vitiligo may be harder to spot in normal lighting, Wood’s Lamp uses ultraviolet light to intensify the contrast between normal and depigmented skin.
Dermatologists manually interpret vitiligo images using traditional analysis methods, occasionally aided by hand-crafted features. Subjective judgment, inter-observer variability, and an inability to capture intricate, high-dimensional visual patterns are the limitations of these methods. Labeled data is necessary for conventional machine learning techniques, but it is expensive and hard to come by in dermatology, especially for uncommon or unclear patterns. In fields like medical imaging, where labeled datasets are scarce, self-supervised learning (SSL) [8,9] has become a viable substitute for supervised learning. By creating pretext tasks (auxiliary learning objectives that do not require human annotations but allow the model to learn meaningful representations), SSL makes use of vast amounts of unlabeled data [10].
Table 1 shows a comparison between different SSL methods. SimCLR, MoCo, BYOL, and SwAV are important SSL frameworks for dermatological imaging. SSL has recently been applied to skin imaging, including fine-tuning for melanoma classification and pretraining models on large unlabeled dermoscopic datasets using contrastive learning. The potential is in using SSL to address data scarcity, pattern complexity, and domain variability in vitiligo imaging by extracting rich, disease-relevant features without the need for extensive annotations.

3. Methodology

3.1. Dataset Description

The dataset used in this study was assembled from a public Kaggle repository. Both annotated and unannotated samples were included. Annotated images may carry manually applied labels for lesion presence or absence, lesion segmentation masks for depigmented areas, and severity scores derived from the Vitiligo Area Scoring Index (VASI) [17,18]. The study does not evaluate predictions against the VASI score due to the absence of clinical annotations in the dataset. The self-supervised learning stage relied heavily on unannotated images, which allowed for representation learning without the high expense of annotation. A preprocessing pipeline, which included resizing, color normalization, data augmentation, artifact removal, and segmentation mask alignment, was used to guarantee consistency and optimize learning efficiency. This diverse and preprocessed dataset offered a rich foundation for the self-supervised learning (SSL) pretraining stage, ensuring that the encoder could capture both fine-grained textural patterns and broader lesion distribution features while remaining resilient to real-world variations.
A vitiligo dataset that was separated into two groups, healthy skin and vitiligo skin, was used in the study. A total of 1528 high-quality skin photos representing various body parts and skin tones were included in the healthy skin folder. The 2100 photos of verified vitiligo cases in the vitiligo skin folder were saved in a mix of .jpeg and .png formats [19]. Of these images, 80% were used for training and 20% for validation, resulting in 294 healthy and 423 vitiligo samples in the validation set. A representative benchmark for vitiligo classification was provided by the dataset, which included variations in lighting, resolution, and anatomical site. The images were preprocessed before training, which included format handling, color normalization, data augmentation, and resizing to a consistent resolution. For supervised fine-tuning and self-supervised feature learning, this carefully selected dataset offered a well-balanced combination of vitiligo and healthy samples.
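As a sanity check on the split described above, the following sketch computes the per-class counts that a strict 80/20 split would yield. Note that a plain fractional split gives 306 healthy and 420 vitiligo validation samples, slightly different from the reported 294 and 423, suggesting the authors used a somewhat different split procedure; the function below is illustrative, not the authors' code.

```python
# Illustrative 80/20 train/validation split arithmetic for the two
# class folders described above (healthy: 1528 images, vitiligo: 2100).
def split_counts(n_total, train_frac=0.8):
    """Return (train, validation) sample counts for a fractional split."""
    n_train = int(n_total * train_frac)
    return n_train, n_total - n_train

healthy_train, healthy_val = split_counts(1528)    # (1222, 306)
vitiligo_train, vitiligo_val = split_counts(2100)  # (1680, 420)
```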

3.2. SSL Framework

Without requiring a lot of manual annotations, the proposed SSL framework seeks to learn reliable and transferable feature representations from unlabeled skin images in the vitiligo dataset. ResNet-18, the encoder backbone, strikes a balance between computational efficiency and representational power, enabling the model to capture both broad lesion structures and fine-grained texture patterns. The SimCLR framework serves as the inspiration for the contrastive learning paradigm, which pushes representations of views from different images apart while bringing representations of various augmented views of the same image closer together in the feature space. Because contrastive SSL ensures invariance to changes in lighting, orientation, and scale factors that naturally vary in clinical and real-world photography, it is ideally suited for dermatological imaging.
Since it specifies the invariances the model will learn, a well-designed augmentation pipeline is essential to SSL’s success. The strategy includes generic augmentations (random cropping and resizing, horizontal flipping, random rotation, color jitter, and Gaussian blur) as well as domain-specific augmentations (mild desaturation, artificial glare or specular highlights, and JPEG compression noise). These augmentation techniques are conceptual components of the proposed methodology, intended to guide future implementation and validation. The idea behind this design is to improve generalization by exposing the model to realistic perturbations found in both clinical and non-clinical settings. The advantage of SSL at this point is that, despite having little labeled data, the model can still perform well in classification tasks, which is crucial for medical AI applications. The current study evaluates the SSL framework only for binary classification; its applicability to other downstream tasks such as lesion segmentation and severity estimation will be explored in future work to assess feature transferability. Although this study focuses on vitiligo-based classification, the learned features could be integrated into larger AI-driven diagnostic systems.

3.2.1. Input Images

Figure 1 shows the self-supervised learning framework for vitiligo image classification. The Unlabeled Images dataset is a collection of vitiligo skin photos taken under different lighting conditions, on different body parts, and across skin tones, without any manually applied disease labels. It is used only in the self-supervised pretraining stage to acquire robust, generic visual representations free from the bias of human annotation. The benefits of this dataset include diversity, scalability, and suitability for the self-supervised learning (SSL) network. After the feature extractor is pretrained on unlabeled data, it is employed in the downstream fine-tuning stage, where the classifier maps learned features to clinically meaningful categories with the help of labels. Class imbalance and a lack of data remain obstacles.

3.2.2. Data Augmentation

A critical step in the self-supervised learning pipeline is the data augmentation stage, which creates various versions of the same input image. Each unlabeled vitiligo skin image in this framework is subjected to two separate augmentation processes, resulting in various views that are visually distinct but semantically equivalent. Transformation-invariant feature representations are learned based on these augmented views. To introduce superficial variations while maintaining the lesion’s global identity, Augmentation applies distinct random transformations to the same image. This procedure replicates the inherent unpredictability found in clinical settings, such as variations in lighting, camera settings, or orientation.

3.2.3. Feature Extraction

As the central computational unit of the self-supervised learning framework, the feature extraction stage converts raw image data into high-dimensional feature representations that capture key patterns in vitiligo skin lesions. The ResNet-18 [20] backbone, which incorporates residual connections to lessen the vanishing gradient issue, was selected for its balance between computational efficiency and representational capacity. During the pretraining task, the backbone processes two augmented views of each unlabeled image in parallel, ensuring consistency in representation learning.
Using a hierarchical feature learning approach, deeper layers capture higher-level patterns like lesion shapes, border irregularities, and depigmentation spread, while early layers capture low-level patterns like edges, texture, and pigmentation changes. Each input image receives a feature vector from the output, which is then sent to the projection head for additional processing during the SSL phase. The benefits of using ResNet-18 in vitiligo Imaging include transferability, parameter efficiency, and robust feature learning. Smaller medical datasets can benefit from the model’s ability to capture subtle pigmentation patterns and textural variations across lesion types without overfitting.

3.2.4. Projection Head and SSL Loss

The ResNet-18 backbone optimizes feature embedding for self-supervised learning by extracting features from each view of an augmented image and processing them with a projection head. This step fills the gap between the loss function that directs representation learning and the raw learned features. Usually, a Multi-Layer Perceptron (MLP) with one or two fully connected layers is used as the projection head. By separating the space used for downstream tasks from the space used for loss computation, SSL aims to enhance the quality of learned representations.
Learning without labels is made possible by applying the self-supervised loss to the projection head’s outputs. The backbone learns augmentation-invariant representations of lesion structure and pigmentation changes based on the SSL technique. While maintaining semantic similarity, contrastive learning like SimCLR promotes invariance to augmentations. Based on areas of the image that are visible, predictive learning methods such as masked image modeling make predictions about portions of the image that are missing or altered.
The SSL loss and the projection head jointly train the backbone to generate augmentation-invariant and semantically rich embeddings. After pretraining is finished, the backbone, which now contains the learned representations, is kept for later fine-tuning with labeled data, while the projection head is discarded. Discarding the projection head ensures that feature representations are clean, compact, and generalizable, and the backbone can be used directly for downstream applications. Annotation-free pretraining, improved generalization, and resistance to overfitting are beneficial in medical image analysis, lessening reliance on tiny labeled datasets.
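The projection head and contrastive objective described above can be sketched as follows. The MLP sizes and temperature are illustrative assumptions; the loss is the standard SimCLR NT-Xent formulation, where each view's positive is its counterpart from the same source image.

```python
# Sketch of a two-layer MLP projection head and a SimCLR-style NT-Xent
# loss (assumed dimensions and temperature, for illustration).
import torch
import torch.nn as nn
import torch.nn.functional as F

projection_head = nn.Sequential(       # maps 512-d features to a 128-d
    nn.Linear(512, 256),               # space used only for the loss
    nn.ReLU(),
    nn.Linear(256, 128),
)

def nt_xent(z1, z2, temperature=0.5):
    """Contrastive loss: pull matched views together, push others apart."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, d), unit norm
    sim = z @ z.t() / temperature                        # cosine similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float('-inf'))
    # For row i < N the positive is i + N, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```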

3.3. Fine-Tuning and Downstream Tasks

The ResNet-18 backbone is a flexible feature extractor that can be tailored to different downstream vitiligo analysis tasks; depending on the clinical objective, it can be adjusted using various strategies and model heads. Task 1 is lesion segmentation or localization, which entails determining the spatial extent of vitiligo lesions within a skin image. The backbone is optimized on a small, labeled dataset of lesion masks, which automates tracking of lesion shrinkage or spread over time and assists dermatologists in quantifying the area of depigmentation for disease monitoring.
In Task 2, vitiligo lesions are categorized into clinically recognized morphological patterns. Cross-entropy loss is used to train the backbone, and class weighting may be used to address imbalance. This aids in precisely identifying subtypes, establishing prognoses and treatment plans, and standardizing pattern recognition across clinical facilities. Task 3 entails estimating the severity of the disease or forecasting how vitiligo is likely to develop over time. Even when labeled data is scarce for each task, the backbone learns generalizable lesion features that transfer well to segmentation, classification, and prediction tasks. Because only a small labeled dataset per task is required to achieve high performance, this lessens the annotation burden.
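The class-weighted cross-entropy mentioned for the classification task can be sketched as below, using inverse-frequency weights derived from the stated class counts. The weighting scheme is one common choice, assumed here for illustration.

```python
# Sketch of class-weighted cross-entropy for the imbalanced
# healthy/vitiligo classification task (inverse-frequency weights).
import torch
import torch.nn as nn

counts = torch.tensor([1528.0, 2100.0])   # healthy, vitiligo sample counts
weights = counts.sum() / (2 * counts)     # minority class weighted higher
criterion = nn.CrossEntropyLoss(weight=weights)

classifier = nn.Linear(512, 2)            # head on top of 512-d features
logits = classifier(torch.randn(4, 512))  # stand-in backbone features
loss = criterion(logits, torch.tensor([0, 1, 1, 0]))
```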

3.4. Evaluation Metrics

Using metrics like accuracy, precision, recall, F1-score, and ROC-AUC, performance is evaluated during the evaluation stage. The learned feature vectors are visualized with t-SNE to show how separable they are in two dimensions. Well-clustered points from the same class show that the SSL-pretrained backbone has learned discriminative representations, and the clinical relevance of these clusters demonstrates the interpretability of the learned features.
Data efficiency, robustness, and lower annotation costs are some benefits of SSL-based fine-tuning in vitiligo imaging. The learned backbone weights are either fine-tuned (full training) or frozen (linear evaluation). A straightforward fully connected layer, which needs only labeled data, maps the learned feature embeddings to vitiligo categories. For every labeled input image, the trained classifier predicts disease classes or severity stages, and t-SNE visualization is used to qualitatively evaluate the quality of the representation.

4. Experimental Results

The t-SNE visualization examines the learned feature space of the self-supervised ResNet-18 backbone across photos from various vitiligo categories. It projects high-dimensional features into a human-readable 2D or 3D plot for visual analysis. The backbone extracts a feature vector for each image; t-SNE then measures sample similarity in the high-dimensional space and applies a non-linear mapping that preserves intricate cluster structures. Each image is represented as a point in 2D/3D space, with colors or markers denoting known labels. t-SNE is helpful for SSL evaluation because it can be used to check cluster separation, identify overlapping clusters, and visualize training progress.
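The projection step can be sketched with scikit-learn as below. The random features stand in for actual backbone outputs, and the perplexity value is an assumed setting, not one reported in the paper.

```python
# Sketch of projecting backbone features to 2D with t-SNE (scikit-learn).
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.normal(size=(100, 512))   # stand-in for 512-d backbone outputs
embedded = TSNE(n_components=2, perplexity=30.0,
                init='pca', random_state=0).fit_transform(features)
# 'embedded' has one 2D point per image, ready for a colored scatter plot.
```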
Before fine-tuning, features in the vitiligo context may cluster loosely; however, following fine-tuning, clusters become more compact and the lines separating patterns become more distinct. The clusters are distinguished by color, with red denoting vitiligo-affected skin and blue denoting healthy skin, as shown in Figure 2. The t-SNE visualization in Figure 2 shows some overlap between healthy and vitiligo clusters, which may explain the reduced precision for the healthy samples. Future work may investigate this misclassification using interpretability methods. The observed asymmetry in the confusion matrix suggests that false negatives for vitiligo may be more critical and approaches such as threshold adjustment or class-weighted loss could help provide a better balance in clinical performance.
The purpose of this study was to assess how well a self-supervised deep learning model detected vitiligo lesions. The model was trained on a dataset of different skin types and image sizes using an SSL-pretrained backbone. Lesion segmentation and lesion type classification were the two downstream tasks used to test the model. The findings demonstrated that, particularly in low-label situations, SSL-pretrained models consistently outperformed supervised baselines. An extensive evaluation of the model’s predictive capabilities for the two classes, healthy and vitiligo, was given in the classification report. Precision, recall, F1-score, and support were among the evaluation metrics. Particularly in terms of precision, the model performed marginally better for vitiligo detection than for healthy classification. The slightly lower recall for vitiligo suggests that the model needs improvement to detect all positive cases, whereas the high recall for healthy indicates that the model tends to correctly identify non-diseased cases.
With an overall accuracy of 0.83, as shown in Figure 3, the model correctly predicts the class in 83% of the 717 validation samples. In the context of medical imaging this is a good performance; however, when class distributions are unbalanced, accuracy by itself can be misleading. In this instance, because the classes are not perfectly balanced (423 vitiligo cases vs. 294 healthy cases), accuracy must be interpreted in conjunction with class-specific metrics. The macro average precision (0.82), which treats the vitiligo and healthy classes equally, indicates that on average 82% of the model’s predictions for each class are correct. The macro average recall (0.83), which likewise weights both classes equally, shows that the model detects 83% of true samples for each class on average.
The classification report, as shown in Figure 4, assesses how well the model predicts outcomes for the vitiligo and healthy classes, using precision, recall, F1-score, and support. For the healthy class, 76% of samples predicted as healthy are truly healthy (precision 0.76), leaving 24% as false positives, and 84% of real healthy cases are successfully identified (recall 0.84), with 16% missed as false negatives. The F1-score (0.80) provides a balanced assessment of performance on this class, denoting strong classification ability. For the vitiligo class, 88% of vitiligo predictions are correct (precision 0.88) and 82% of all real vitiligo cases are identified (recall 0.82). Because the model’s precision for vitiligo is marginally higher than its recall, it takes a cautious approach to disease prediction.
In terms of precision, the model performs marginally better for vitiligo detection than for healthy classification. While vitiligo’s slightly lower recall suggests that the model needs improvement to detect all positive cases, the high recall for healthy cases shows that the model correctly identifies non-diseased cases. To increase vitiligo recall without significantly compromising precision, more fine-tuning might be needed. There are 423 vitiligo cases and 294 healthy cases in the dataset, which is slightly unbalanced. For the minority or more challenging-to-classify class, augmentation, weighted loss functions, or balanced training techniques may increase recall.
In the weighted average precision (0.83), the precision of each class is weighted by its number of samples (support). Due to its larger sample size, the higher precision of the vitiligo class (0.88) has a greater impact on the weighted average than that of the healthy class (0.76). The weighted recall is likewise support-weighted, so the slightly lower vitiligo recall pulls it below the healthy recall (0.84). Table 2 shows the classification performance metrics for the healthy and vitiligo classes. The close accuracy, weighted average, and macro average values show that the model performs evenly across classes. Since missing disease cases matters more in a clinical setting, improvements should concentrate on increasing vitiligo recall and reducing false negatives. As this work proposes a conceptual framework for applying self-supervised learning to vitiligo imaging, no direct experimental comparison with existing SSL studies was conducted.
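The reported metrics are mutually consistent, which can be verified arithmetically. The confusion-matrix counts below are inferred from the stated supports, precisions, and recalls (they are not taken directly from the paper) and reproduce the reported values.

```python
# Consistency check of the reported metrics, using confusion-matrix
# counts inferred from support, precision, and recall (assumed values).
tp_h, fn_h = 247, 47     # healthy: support 294, recall ~0.84
tp_v, fn_v = 347, 76     # vitiligo: support 423, recall ~0.82

prec_h = tp_h / (tp_h + fn_v)   # vitiligo misses are predicted healthy
prec_v = tp_v / (tp_v + fn_h)   # healthy misses are predicted vitiligo
accuracy = (tp_h + tp_v) / (294 + 423)
weighted_prec = (294 * prec_h + 423 * prec_v) / 717
```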

5. Training Time and Considerations

Training time is influenced by batch size, dataset size, model complexity, and hardware. In this study, the ResNet-18 model was trained for 10 epochs on the dataset with a batch size of 32. Training was performed on a CPU.

6. Conclusions

A self-supervised learning (SSL) framework for interpreting intricate patterns in vitiligo skin imaging is presented in this study. The model achieves a competitive classification performance with balanced precision, recall, and F1-scores across the healthy and vitiligo classes by learning robust and clinically relevant feature representations from unlabeled data. The findings demonstrate how SSL can help with the lack of annotated dermatological datasets, allowing for precise disease detection and reducing the need for expensive manual labeling. To help vitiligo patients with early diagnosis, severity assessment, and individualized treatment planning, the method can be expanded to other dermatological conditions, combined with explainable AI techniques for clinical interpretability, and implemented in real-world screening pipelines.

7. Future Scope

Future experimental validation will focus on assessing the impact of the suggested augmentation techniques, compared with generic ones, through an ablation-based evaluation. Improving sensitivity, broadening the dataset to encompass a range of skin tones, lighting conditions, and disease stages, and incorporating multi-modal data are all important for the future of vitiligo detection; this will help the model adjust to the variability found in real clinical settings. Predictions can be strengthened by combining clinical photos, dermoscopic images, and patient history, and a multi-modal deep learning architecture may identify subtle diagnostic cues that single-image models miss. Applying Explainable AI (XAI) in clinical settings can give clinicians more faith in the model’s predictions. Performance in low-label settings can be enhanced by sophisticated self-supervised learning frameworks, and continuous learning techniques can update the model as new data becomes available without retraining from scratch. Performance should also be evaluated in actual clinical settings through real-world deployment studies. Future work additionally includes a pilot study in which dermatologists assess patients with the VASI score, to examine correlations with the learned features and enhance clinical relevance.

Author Contributions

Conceptualization, P.P. (Priyanka Pawar); methodology, P.P. (Priyanka Pawar); software, A.K.; validation, B.P. and P.P. (Priyanka Pawar); formal analysis, P.B.; investigation, B.P. and M.K.; resources, P.P. (Prajakta Pawar); data curation, P.P. (Prajakta Pawar); writing—original draft preparation, P.P. (Priyanka Pawar); writing—review and editing, P.P. (Priyanka Pawar); visualization, P.B.; supervision, M.K.; project administration, A.K.; funding acquisition, P.P. (Priyanka Pawar) and A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Dr. D. Y. Patil School of Science and Technology, Dr. D. Y. Patil Vidyapeeth, Pune, India, grant number [DPU/1212(4)/2024], and the APC was funded by the same grant [DPU/1212(4)/2024].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available at Kaggle. Vitiligo Dataset. Available online: https://www.kaggle.com/datasets/shinynose/vitiligo (accessed on 31 August 2025) [19].

Acknowledgments

The authors gratefully acknowledge the financial support provided by the Dr. D.Y. Patil School of Science and Technology, Dr. D.Y. Patil Vidyapeeth, Pune, India, through the Seed Money Project. We also extend our thanks to Dr. D.Y. Patil Vidyapeeth for providing the necessary facilities to conduct this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Al-Smadi, K.; Imran, M.; Leite-Silva, V.R.; Mohammed, Y. Vitiligo: A Review of Aetiology, Pathogenesis, Treatment, and Psychosocial Impact. Cosmetics 2023, 10, 84. [Google Scholar] [CrossRef]
  2. Mojeski, J.A.; Almashali, M.; Jowdy, P.; Fitzgerald, M.E.; Brady, K.L.; Zeitouni, N.C.; Colegio, O.R.; Paragh, G. Ultraviolet Imaging in Dermatology. Photodiagn. Photodyn. Ther. 2020, 30, 101743. [Google Scholar] [CrossRef] [PubMed]
  3. Halani, S.; Foster, F.S.; Breslavets, M.; Shear, N.H. Ultrasound and Infrared-Based Imaging Modalities for Diagnosis and Management of Cutaneous Diseases. Front. Med. 2018, 5, 115. [Google Scholar] [CrossRef] [PubMed]
  4. Wang, W.C.; Ahn, E.; Feng, D.; Kim, J. A Review of Predictive and Contrastive Self-Supervised Learning for Medical Images. Mach. Intell. Res. 2023, 20, 483–513. [Google Scholar] [CrossRef]
  5. Anton, J.; Castelli, L.; Chan, M.F.; Outters, M.; Tang, W.H.; Cheung, V.; Shukla, P.; Walambe, R.; Kotecha, K. How Well Do Self-Supervised Models Transfer to Medical Imaging? J. Imaging 2022, 8, 320. [Google Scholar] [CrossRef] [PubMed]
  6. Wortsman, X.; Araya, I.; Maass, M.; Valdes, P.; Zemelman, V. Ultrasound Patterns of Vitiligo at High Frequency and Ultra-High Frequency. J. Ultrasound Med. 2024, 43, 1605–1610. [Google Scholar] [CrossRef] [PubMed]
  7. Levy, J.; Barrett, D.L.; Harris, N.; Jeong, J.J.; Yang, X.; Chen, S.C. High-Frequency Ultrasound in Clinical Dermatology: A Review. Ultrasound J. 2021, 13, 24. [Google Scholar] [CrossRef] [PubMed]
  8. Wang, H.; Ahn, E.; Bi, L.; Kim, J. Self-Supervised Multi-Modality Learning for Multi-Label Skin Lesion Classification. Comput. Methods Programs Biomed. 2025, 265, 108729. [Google Scholar] [CrossRef] [PubMed]
  9. Zeng, X.; Abdullah, N.; Sumari, P. Self-supervised learning framework application for medical image analysis: A review and summary. BioMed. Eng. OnLine 2024, 23, 107. [Google Scholar] [CrossRef] [PubMed]
  10. Kwasigroch, A.; Grochowski, M.; Mikołajczyk, A. Self-Supervised Learning to Increase the Performance of Skin Lesion Classification. Electronics 2020, 9, 1930. [Google Scholar] [CrossRef]
  11. Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A Simple Framework for Contrastive Learning of Visual Representations. In Proceedings of the 37th International Conference on Machine Learning (ICML 2020), Virtual Event, 13–18 July 2020; pp. 1597–1607. Available online: https://arxiv.org/abs/2002.05709 (accessed on 31 August 2025).
  12. Gröger, F.; Gottfrois, P.; Amruthalingam, L.; Gonzalez-Jimenez, A.; Lionetti, S.; Navarini, A.A.; Pouly, M. Towards Reducing the Need for Annotations in Digital Dermatology with Self-Supervised Learning. In Proceedings of the 1st Workshop on Scarce Data in Artificial Intelligence for Healthcare (SDAIH 2022), ECAI 2022; SciTePress: Lisbon, Portugal, 2022; pp. 31–42.
  13. Grill, J.-B.; Strub, F.; Altché, F.; Tallec, C.; Richemond, P.H.; Buchatskaya, E.; Doersch, C.; Pires, B.A.; Guo, Z.D.; Azar, M.G.; et al. Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning. Adv. Neural Inf. Process. Syst. 2020, 33, 21271–21284. Available online: https://arxiv.org/abs/2006.07733 (accessed on 31 August 2025).
  14. Caron, M.; Misra, I.; Mairal, J.; Goyal, P.; Bojanowski, P.; Joulin, A. Unsupervised Learning of Visual Features by Contrasting Cluster Assignments. Adv. Neural Inf. Process. Syst. 2020, 33, 9912–9924.
  15. Zhou, L.; Liu, H.; Bae, J.; He, J.; Samaras, D.; Prasanna, P. Self Pre-Training with Masked Autoencoders for Medical Image Classification and Segmentation. In Proceedings of the 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI), Cartagena, Colombia, 18–21 April 2023; pp. 1–6.
  16. Chen, K.; Guo, Y.; Yang, C.; Xu, Y.; Zhang, R.; Li, C.; Wu, R. Enhanced Breast Lesion Classification via Knowledge Guided Cross-Modal and Semantic Data Augmentation. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2021; de Bruijne, M., Cattin, P.C., Cotin, S., Padoy, N., Speidel, S., Zheng, Y., Essert, C., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2021; Volume 12905, pp. 51–61.
  17. Kawakami, T.; Hashimoto, T. Disease Severity Indexes and Treatment Evaluation Criteria in Vitiligo. Dermatol. Res. Pract. 2011, 2011, 750342.
  18. Kallipolitis, A.; Moutselos, K.; Zafeiriou, A.; Andreadis, S.; Matonaki, A.; Stavropoulos, T.G.; Maglogiannis, I. Skin Image Analysis for Detection and Quantitative Assessment of Dermatitis, Vitiligo and Alopecia Areata Lesions: A Systematic Literature Review. BMC Med. Inform. Decis. Mak. 2025, 25, 10.
  19. Kaggle. Vitiligo Dataset. Available online: https://www.kaggle.com/datasets/shinynose/vitiligo (accessed on 31 August 2025).
  20. Pandey, G.K.; Srivastava, S. ResNet-18 Comparative Analysis of Various Activation Functions for Image Classification. In Proceedings of the 2023 International Conference on Inventive Computation Technologies (ICICT), Lalitpur, Nepal, 26–28 April 2023; pp. 595–601.
Figure 1. Self-supervised learning framework for vitiligo skin image classification.
Figure 2. t-SNE visualization of learned feature embeddings before and after fine-tuning, showing improved class separation in vitiligo pattern categories.
Figure 3. The confusion matrix illustrates the classification performance of the fine-tuned model.
Figure 4. Performance evaluation of the fine-tuned model using precision, recall, and F1-score for each vitiligo class.
Table 1. Comparisons between different SSL methods.

| Study | SSL Method | Pretext Task | Downstream Task | Dataset | Limitations |
|---|---|---|---|---|---|
| [11] | SimCLR | Contrastive learning (augmentation pairs) | Skin lesion classification | HAM10000 (10 k) | Limited to classification; no structural insight into lesions |
| [12] | ColorMe, DINO, iBOT | Chromatic self-distillation, masked distillation learning | Lesion classification | Derm7pt, ISIC (107 k) | Limited hyperparameter tuning; potential overfitting to certain datasets |
| [13] | BYOL | View prediction | Lesion classification | Custom dataset (1.2 k) | Small dataset; lacks pattern-level granularity |
| [14] | SwAV | Clustering + augmentation | Lesion segmentation | In-house (2 k) | No visual explanation tools used; modality-specific |
| [15] | MAE | Masked image modeling | Diagnosis | ISIC 2018 | MAE training was computationally expensive |
| [16] | MoCo v2 | Instance discrimination | Organ segmentation | BraTS, LUNA | Not validated for dermatological images |
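The contrastive objective behind SimCLR [11], on which the augmentation-pair framing in this study also builds, is the NT-Xent loss: each image yields two augmented views, and each view is pulled toward its partner and pushed away from every other view in the batch. The following is a minimal, dependency-free Python sketch; the embedding layout (pairs at indices 2i and 2i+1) and the temperature value are illustrative assumptions, not the authors' implementation.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent(embeddings, temperature=0.5):
    """NT-Xent loss over a batch of 2N embeddings, where indices
    (2i, 2i+1) hold the two augmented views of image i."""
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        j = i + 1 if i % 2 == 0 else i - 1  # index of the positive pair
        # Denominator: similarities to every other embedding in the batch.
        sims = [math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
                for k in range(n) if k != i]
        pos = math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
        total += -math.log(pos / sum(sims))
    return total / n
```

When the two views of each image map to nearby embeddings, the loss is small; if views of different images collide, it grows, which is what drives the encoder to separate lesion patterns without labels.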
Table 2. Detailed classification performance metrics for the healthy and vitiligo classes.

| Metric | Support | Precision | Recall | F1-Score | Interpretation |
|---|---|---|---|---|---|
| Healthy | 294 | 0.76 | 0.84 | 0.80 | High recall suggests strong detection of healthy cases, but lower precision means some vitiligo cases are misclassified as healthy. |
| Vitiligo | 423 | 0.88 | 0.82 | 0.85 | High precision indicates reliable vitiligo identification, but slightly lower recall means some actual cases are missed. |
| Accuracy | 717 | | | 0.83 | 83% of all predictions are correct; good overall performance, but it must be considered alongside the per-class metrics due to mild class imbalance. |
| Macro Avg | 717 | 0.82 | 0.83 | 0.83 | Equal weight to both classes; shows balanced performance with good sensitivity and precision across categories. |
| Weighted Avg | 717 | 0.83 | 0.83 | 0.83 | Support-weighted measure; like the macro average, it reflects consistent performance and minimal bias toward the majority class. |
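To make the averaging in Table 2 concrete, the sketch below recomputes every reported metric from per-class counts. The confusion-matrix counts are hypothetical, reconstructed approximately from the supports and rates in Table 2 (they are not the authors' raw predictions), but they reproduce the published figures to two decimals.

```python
# Hypothetical counts, reconstructed approximately from Table 2.
tp = {"healthy": 247, "vitiligo": 347}   # correctly predicted per class
fn = {"healthy": 47,  "vitiligo": 76}    # missed cases per class
# In a two-class problem, one class's false negatives are the
# other class's false positives.
fp = {"healthy": fn["vitiligo"], "vitiligo": fn["healthy"]}

support = {c: tp[c] + fn[c] for c in tp}              # 294 and 423
precision = {c: tp[c] / (tp[c] + fp[c]) for c in tp}
recall = {c: tp[c] / support[c] for c in tp}
f1 = {c: 2 * precision[c] * recall[c] / (precision[c] + recall[c])
      for c in tp}

total = sum(support.values())                         # 717
accuracy = sum(tp.values()) / total
macro_f1 = sum(f1.values()) / len(f1)                 # unweighted mean
weighted_f1 = sum(support[c] * f1[c] for c in f1) / total
```

The macro average treats both classes equally, while the weighted average scales each class by its support; with 294 healthy versus 423 vitiligo samples the two nearly coincide here, which is why Table 2 reports 0.83 for both.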