Comparative Analysis of AI Models for Atypical Pigmented Facial Lesion Diagnosis
Abstract
1. Introduction
2. Materials and Methods
2.1. Image Collection
2.2. Image Testing
2.3. Model Development
- The Logistic Regression Scoring Model: This model was developed using stepwise logistic regression, with the 14 dermoscopic patterns described in Section 2.2, the patient’s age and sex, and the lesion diameter as candidate predictors. The binary outcome was malignant lesions (LM + LMM) vs. benign lesions (SK + SL + SLK + PAK + AN). The stepwise procedure was a forward–backward procedure based on the area under the receiver operating characteristic curve (AUROC) [25]. A variable could enter or stay in the model only if it contributed at least 0.003 to the AUROC and was statistically significant. The model was trained and validated with 5-fold cross-validation on 80% of the dataset. The best-performing model was then selected and tested on the remaining 20% of the data. Finally, the coefficients were converted into integer scores to create a user-friendly scoring system for clinical use.
- The CNN Model: A ResNet-34 architecture [26] was employed. More complex models (ResNet-101, EfficientNet B0, and EfficientNet B1) were also tested in earlier experiments and gave similar or worse results, so the simplest model (ResNet-34) was chosen and is presented here. The pre-trained model was fine-tuned on the 1197 collected images (see Section 2.1) and 743 images of facial aPFLs extracted from the International Skin Imaging Collaboration (ISIC) 2018 dataset [27]. LM and LMM diagnoses were aggregated because of their similar superficial patterns. The model was trained and validated with 5-fold cross-validation and finally tested on 111 unseen images. Data augmentation consisted of geometric and colour transformations: crop (probability = 0.1), horizontal flip (probability = 0.5), vertical flip (probability = 0.5), and colour transformations (brightness, contrast, and saturation transformations with probability = 0.1). The final training parameters were selected by 5-fold cross-validation to optimize performance (Table 3). An early stopping rule was also defined to limit overfitting: training stopped if the validation loss did not decrease by at least 0.03 within 10 epochs, and the final model was the one with the lowest loss, recorded at the beginning of the early stopping epoch count. Cross-entropy weighted for class frequency was chosen as the loss function, AdamW as the optimizer, and “reduce learning rate on plateau” as the learning rate scheduler (factor = 0.1, patience = 3, and threshold = 0.0001) [28].
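The forward–backward stepwise procedure used for the logistic regression model can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `cv_auroc` is a hypothetical callable that fits a model on a given set of predictors and returns its cross-validated AUROC, the statistical significance check is omitted, and `max_rounds` is a safety cap added for this sketch. Only the 0.003 threshold comes from the text.

```python
from typing import Callable, FrozenSet, Iterable

MIN_GAIN = 0.003  # minimum AUROC contribution stated in the text


def stepwise_select(
    candidates: Iterable[str],
    cv_auroc: Callable[[FrozenSet[str]], float],  # hypothetical scorer
    min_gain: float = MIN_GAIN,
    max_rounds: int = 100,  # safety cap, an assumption of this sketch
) -> FrozenSet[str]:
    """Forward-backward selection driven by the AUROC contribution."""
    pool = frozenset(candidates)
    selected: FrozenSet[str] = frozenset()
    for _ in range(max_rounds):
        changed = False

        # Forward step: add the candidate with the largest AUROC gain,
        # provided the gain reaches the threshold.
        best = cv_auroc(selected)
        gains = {v: cv_auroc(selected | {v}) - best for v in pool - selected}
        if gains:
            var, gain = max(gains.items(), key=lambda kv: kv[1])
            if gain >= min_gain:
                selected = selected | {var}
                changed = True

        # Backward step: drop the weakest variable if removing it costs
        # less than the threshold.
        best = cv_auroc(selected)
        losses = {v: best - cv_auroc(selected - {v}) for v in selected}
        if losses:
            var, loss = min(losses.items(), key=lambda kv: kv[1])
            if loss < min_gain:
                selected = selected - {var}
                changed = True

        if not changed:
            break
    return selected
```

In practice `cv_auroc` would wrap a logistic regression fitted with 5-fold cross-validation on the training split; here it is left abstract so that the selection logic can be read on its own.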
3. Results
3.1. Statistical Analysis
3.2. Logistic Regression Model Performance
- Very low (range of 0–2): Malignant lesions are rarely observed within this score range.
- Intermediate (range of 3–9): It is not possible to confidently determine whether a lesion is more likely benign or malignant in this range.
- Very high (range of 10–16): Observed lesions are highly likely to be malignant in this range.
3.3. CNN Model Performance
4. Discussion
Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
AI | Artificial Intelligence |
AN | Atypical Nevi |
AUROC | Area Under the Receiver Operating Characteristic |
aPFLs | Atypical Pigmented Facial Lesions |
CNN | Convolutional Neural Network |
DL | Deep Learning |
IARC | International Agency for Research on Cancer |
iDScore | Integrated Dermoscopy Score |
ISIC | International Skin Imaging Collaboration |
LM | Lentigo Maligna |
LMM | Lentigo Maligna Melanoma |
LR | Logistic Regression |
ML | Machine Learning |
MLLM | Multimodal Large Language Model |
PAK | Pigmented Actinic Keratosis |
ROC | Receiver Operating Characteristic |
SK | Seborrheic Keratosis |
SL | Solar Lentigo |
SLK | Seborrheic Lichenoid Keratosis |
UV | Ultraviolet |
ViT | Vision Transformer |
References
- Arnold, M.; Singh, D.; Laversanne, M.; Vignat, J.; Vaccarella, S.; Meheus, F.; Cust, A.E.; de Vries, E.; Whiteman, D.C.; Bray, F. Global Burden of Cutaneous Melanoma in 2020 and Projections to 2040. JAMA Dermatol. 2022, 158, 495–503.
- American Cancer Society. Key Statistics for Melanoma Skin Cancer. Available online: https://www.cancer.org/cancer/types/melanoma-skin-cancer/about/key-statistics.html (accessed on 7 August 2024).
- Micantonio, T.; Neri, L.; Longo, C.; Grassi, S.; Di Stefani, A.; Antonini, A.; Coco, V.; Fargnoli, M.; Argenziano, G.; Peris, K. A new dermoscopic algorithm for the differential diagnosis of facial lentigo maligna and pigmented actinic keratosis. Eur. J. Dermatol. 2018, 28, 162–168.
- Weyers, W. The ‘epidemic’ of melanoma between under- and overdiagnosis. J. Cutan. Pathol. 2012, 39, 9–16.
- Costa-Silva, M.; Calistru, A.; Barros, A.; Lopes, S.; Esteves, M.; Azevedo, F. Dermatoscopy of flat pigmented facial lesions—Evolution of lentigo maligna diagnostic criteria. Dermatol. Pract. Concept. 2018, 8, 198–203.
- Kittler, H.; Pehamberger, H.; Wolff, K.; Binder, M. Diagnostic accuracy of dermoscopy. Lancet Oncol. 2002, 3, 159–165.
- Williams, N.M.; Rojas, K.D.; Reynolds, J.M.; Kwon, D.; Shum-Tien, J.; Jaimes, N. Assessment of Diagnostic Accuracy of Dermoscopic Structures and Patterns Used in Melanoma Detection: A Systematic Review and Meta-analysis. JAMA Dermatol. 2021, 157, 1078–1088.
- Tognetti, L.; Bonechi, S.; Andreini, P.; Bianchini, M.; Scarselli, F.; Cevenini, G.; Moscarella, E.; Farnetani, F.; Longo, C.; Lallas, A.; et al. A new deep learning approach integrated with clinical data for the dermoscopic differentiation of early melanomas from atypical nevi. J. Dermatol. Sci. 2021, 101, 115–122.
- Bjørch, M.F.; Gram, E.G.; Brodersen, J.B. Overdiagnosis in malignant melanoma: A scoping review. BMJ Evid.-Based Med. 2024, 29, 17–28.
- Li, Z.; Koban, K.C.; Schenck, T.L.; Giunta, R.E.; Li, Q.; Sun, Y. Artificial Intelligence in Dermatology Image Analysis: Current Developments and Future Trends. J. Clin. Med. 2022, 11, 6826.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
- Ghahramani, Z. Probabilistic machine learning and artificial intelligence. Nature 2015, 521, 452–459.
- Chan, Y.H. Biostatistics 304. Cluster analysis. Singap. Med. J. 2005, 46, 153–159.
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Red Hook, NY, USA, 4–9 December 2017; pp. 6000–6010.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 4 May 2021.
- Gulzar, Y.; Khan, S.A. Skin Lesion Segmentation Based on Vision Transformers and Convolutional Neural Networks—A Comparative Study. Appl. Sci. 2022, 12, 5990.
- Mehmood, A.; Gulzar, Y.; Ilyas, Q.M.; Jabbari, A.; Ahmad, M.; Iqbal, S. SBXception: A Shallower and Broader Xception Architecture for Efficient Classification of Skin Lesions. Cancers 2023, 15, 3604.
- Gupta, A.K.; Talukder, M.; Wang, T.; Daneshjou, R.; Piguet, V. The Arrival of Artificial Intelligence Large Language Models and Vision-Language Models: A Potential to Possible Change in the Paradigm of Healthcare Delivery in Dermatology. J. Investig. Dermatol. 2024, 144, 1186–1188.
- Yap, J.; Yolland, W.; Tschandl, P. Multimodal skin lesion classification using deep learning. Exp. Dermatol. 2018, 27, 1261–1267.
- Teledermatology Task Force of the European Academy of Dermatology and Venerology. iDScore—Teledermatological Platform for Integrated Diagnosis. Available online: https://en.idscore.net/ (accessed on 7 August 2024).
- Tognetti, L.; Cartocci, A.; Bertello, M.; Giordani, M.; Cinotti, E.; Cevenini, G.; Rubegni, P. An updated algorithm integrated with patient data for the differentiation of atypical nevi from early melanomas: The idScore 2021. Dermatol. Pract. Concept. 2022, 12, e2022134.
- Tognetti, L.; Cartocci, A.; Żychowska, M.; Savarese, I.; Cinotti, E.; Pizzichetta, M.A.; Moscarella, E.; Longo, C.; Farnetani, F.; Guida, S.; et al. A risk-scoring model for the differential diagnosis of lentigo maligna and other atypical pigmented facial lesions of the face: The facial iDScore. J. Eur. Acad. Dermatol. Venereol. 2023, 37, 2301–2310.
- South, L.; Saffo, D.; Vitek, O.; Dunne, C.; Borkin, M.A. Effective Use of Likert Scales in Visualization Evaluations: A Systematic Review. Comput. Graph. Forum 2022, 41, 43–55.
- Xu, J.W.; Suzuki, K. Max-AUC feature selection in computer-aided detection of polyps in CT colonography. IEEE J. Biomed. Health Inform. 2013, 18, 585–593.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Tschandl, P.; Rosendahl, C.; Kittler, H. The HAM10000 Dataset: A Large Collection of Multi-Source Dermatoscopic Images of Common Pigmented Skin Lesions. Sci. Data 2018, 5, 180161.
- PyTorch. ReduceLROnPlateau. Available online: https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ReduceLROnPlateau.html (accessed on 7 August 2024).
Diagnosis | Distribution of Images |
---|---|
Lentigo Maligna (LM) and Lentigo Maligna Melanoma (LMM) | 503 (41.7%) |
Pigmented Actinic Keratosis (PAK) | 200 (19.3%) |
Solar Lentigo (SL) | 200 (22.2%) |
Atypical Nevi (AN) | 194 (10.4%) |
Seborrheic Keratosis (SK) | 50 (4.0%) |
Seborrheic Lichenoid Keratosis (SLK) | 50 (2.3%) |
Dermoscopic Pattern | Definition |
---|---|
Hyperpigmented follicular ostia | Fine, irregular semicircles or double circles |
Obliterated follicular ostia | Closed follicular openings |
Rhomboidal structures | Polygonal lines forming rhomboids |
Grey rhomboidal lines | Grey dots/lines arranged in a rhomboidal pattern |
Slate-grey dots and globules | Grey dots/globules around follicles |
Grey structureless areas | Homogeneous grey areas |
Grey pseudo-network | Grey lines forming a pseudo-network |
Light brown/dark brown pseudo-network | Brown lines forming a pseudo-network |
Fine pigmented brown network | Thin brown lines forming a network |
Atypical network | Irregularly arranged network lines |
Circle within a circle | Dark circle within a hyperpigmented hair follicle |
Irregularly pigmented globules | Dispersed brown/black globules |
Dark dots | Black dots within the lesion |
Pseudopods | Peripheral projections of pigment |
Parameter | Value |
---|---|
Initial learning rate | |
Maximum epochs | 50 |
Batch size | 32 |
Early stopping | 10 epochs |
Loss function | Cross-entropy weighted for class frequency |
Optimizer | AdamW |
Learning rate scheduler | Reduce learning rate on plateau |
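The early stopping rule listed in the table above (described in Section 2.3) can be illustrated with a short sketch. It is one plausible reading of the rule rather than the authors' code: the reference loss is updated only when the validation loss improves by at least 0.03, the patience counter restarts at that epoch, and the model kept is the one saved at the epoch where the counter last restarted. The function name and its return convention are assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

MIN_DELTA = 0.03  # minimum decrease in validation loss (from the text)
PATIENCE = 10     # early stopping window in epochs (from the table)
MAX_EPOCHS = 50   # maximum number of epochs (from the table)


def early_stopping_epochs(
    val_losses: Iterable[float],
    min_delta: float = MIN_DELTA,
    patience: int = PATIENCE,
    max_epochs: int = MAX_EPOCHS,
) -> Tuple[Optional[int], int]:
    """Return (epoch of the retained model, last epoch run), both 1-indexed."""
    best_loss = float("inf")
    best_epoch: Optional[int] = None
    waited = 0
    last_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if epoch > max_epochs:
            break
        last_epoch = epoch
        if best_loss - loss >= min_delta:
            # Sufficient improvement: keep this model and restart the count.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no sufficient improvement within the window
    return best_epoch, last_epoch
```

With a validation loss that plateaus after the third epoch, training would stop ten epochs later and the model from epoch 3 would be kept.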
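Category | Atypical Nevus (n) | Atypical Nevus (%) | Lentigo Maligna (n) | Lentigo Maligna (%) | Lentigo Maligna Melanoma (n) | Lentigo Maligna Melanoma (%) | Pigmented Actinic Keratosis (n) | Pigmented Actinic Keratosis (%) | Seborrheic Keratosis (n) | Seborrheic Keratosis (%) | Seborrheic Lichenoid Keratosis (n) | Seborrheic Lichenoid Keratosis (%) | Solar Lentigo (n) | Solar Lentigo (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Case rating | ||||||||||||||
Very easy | 11 | 8.5% | 19 | 14.7% | 24 | 18.6% | 23 | 17.8% | 9 | 7.0% | 0 | 0.0% | 43 | 33.3% |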
Easy | 52 | 10.1% | 115 | 22.3% | 37 | 7.2% | 110 | 21.4% | 26 | 5.1% | 21 | 4.1% | 154 | 29.9% |
Moderate | 106 | 10.2% | 306 | 29.6% | 102 | 9.9% | 202 | 19.5% | 44 | 4.2% | 28 | 2.7% | 247 | 23.9% |
Difficult | 71 | 12.5% | 177 | 31.2% | 91 | 16.0% | 113 | 19.9% | 19 | 3.4% | 10 | 1.8% | 87 | 15.3% |
Very difficult | 27 | 13.6% | 52 | 26.3% | 30 | 15.2% | 35 | 17.7% | 7 | 3.5% | 4 | 2.0% | 43 | 21.7% |
Confidence in diagnosis | ||||||||||||||
Very confident | 24 | 8.1% | 60 | 20.2% | 35 | 11.8% | 47 | 15.8% | 15 | 5.1% | 11 | 3.7% | 105 | 35.4% |
Mildly confident | 110 | 11.6% | 249 | 26.2% | 82 | 8.6% | 201 | 21.1% | 54 | 5.7% | 30 | 3.2% | 226 | 23.7% |
Uncertain | 72 | 10.3% | 220 | 31.5% | 97 | 13.9% | 139 | 19.9% | 20 | 2.9% | 15 | 2.2% | 135 | 19.3% |
Mildly under-confident | 34 | 13.3% | 68 | 26.7% | 42 | 16.5% | 45 | 17.7% | 8 | 3.1% | 2 | 0.8% | 56 | 22.0% |
Not confident | 27 | 11.1% | 72 | 29.6% | 28 | 11.5% | 51 | 21.0% | 8 | 3.3% | 5 | 2.1% | 52 | 21.4% |
Management | ||||||||||||||
Skin biopsy | 74 | 8.8% | 314 | 37.2% | 187 | 22.2% | 129 | 15.3% | 28 | 3.3% | 19 | 2.3% | 93 | 11.0% |
Reflectance confocal microscopy | 72 | 12.1% | 165 | 27.7% | 64 | 10.8% | 122 | 20.5% | 24 | 4.0% | 20 | 3.4% | 128 | 21.5% |
Close dermoscopic follow-up | 121 | 12.0% | 190 | 18.9% | 33 | 3.3% | 232 | 23.1% | 53 | 5.3% | 24 | 2.4% | 353 | 35.1% |
Diagnosis Categories | Accuracy (%) |
---|---|
Seven diagnoses | 42.9 |
Six diagnoses (Grouped LM with LMM) | 48.7 |
Four diagnoses (Grouped LM with LMM and SL with SLK and SK) | 55.8 |
Two diagnoses (malignant vs. benign) | 71.2 |
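Accuracy rises as diagnoses are grouped because a confusion between two diagnoses in the same group no longer counts as an error. The sketch below shows this mechanism with the groupings named in the table; the six example cases and their predictions are hypothetical and are not study data.

```python
from typing import Dict, Optional, Sequence

DIAGNOSES = ["AN", "LM", "LMM", "PAK", "SK", "SLK", "SL"]

# Groupings named in the table, expressed as label -> merged label.
SIX = {"LMM": "LM"}
FOUR = {"LMM": "LM", "SLK": "SL", "SK": "SL"}
TWO = {d: ("malignant" if d in {"LM", "LMM"} else "benign") for d in DIAGNOSES}


def accuracy(
    truth: Sequence[str],
    predicted: Sequence[str],
    grouping: Optional[Dict[str, str]] = None,
) -> float:
    """Share of cases whose predicted label matches the truth after grouping."""
    grouping = grouping or {}
    hits = sum(
        grouping.get(t, t) == grouping.get(p, p)
        for t, p in zip(truth, predicted)
    )
    return hits / len(truth)


# Hypothetical cases, for illustration only.
truth = ["LM", "LMM", "SL", "SK", "AN", "PAK"]
predicted = ["LMM", "LM", "SLK", "SL", "PAK", "PAK"]

for name, grouping in [("seven", None), ("six", SIX), ("four", FOUR), ("two", TWO)]:
    print(name, round(accuracy(truth, predicted, grouping), 3))
```

In this toy example only one of six answers is exactly right, yet every case is correct at the malignant vs. benign level.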
Variable | Coefficient |
---|---|
Maximum diameter ≥ 8 mm | +3 |
Age ≥ 70 years | +2 |
Male sex | +1 |
Presence of rhomboidal structures | +2 |
Presence of obliterated follicular openings | +2 |
Presence of a target-like pattern | +2 |
Presence of hyperpigmented follicular openings | +1 |
Absence of diffuse opaque yellow/brown pigmentation | +1 |
Absence of light brown fingerprint-like structures/areas | +1 |
Absence of red structures and lines | +1 |
Total score | 0–16 |
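As a worked illustration, the integer scoring system in the table above can be applied as follows. The point values and the three score ranges (0–2, 3–9, 10–16; see Section 3.2) are taken from the text; the class and function names are hypothetical, and the diameter criterion is passed as a boolean so that the sketch does not restate the cut-off.

```python
from dataclasses import dataclass


@dataclass
class LesionFindings:
    diameter_at_or_above_cutoff: bool  # maximum diameter meets the table's cut-off
    age_70_or_over: bool
    male_sex: bool
    rhomboidal_structures: bool
    obliterated_follicular_openings: bool
    target_like_pattern: bool
    hyperpigmented_follicular_openings: bool
    diffuse_opaque_yellow_brown_pigmentation: bool
    light_brown_fingerprint_like_structures: bool
    red_structures_and_lines: bool


def total_score(f: LesionFindings) -> int:
    """Sum the integer points from the scoring table (range 0-16)."""
    points = 0
    points += 3 if f.diameter_at_or_above_cutoff else 0
    points += 2 if f.age_70_or_over else 0
    points += 1 if f.male_sex else 0
    points += 2 if f.rhomboidal_structures else 0
    points += 2 if f.obliterated_follicular_openings else 0
    points += 2 if f.target_like_pattern else 0
    points += 1 if f.hyperpigmented_follicular_openings else 0
    # The last three items score a point when the feature is ABSENT.
    points += 0 if f.diffuse_opaque_yellow_brown_pigmentation else 1
    points += 0 if f.light_brown_fingerprint_like_structures else 1
    points += 0 if f.red_structures_and_lines else 1
    return points


def risk_range(score: int) -> str:
    """Map a total score to the ranges reported in Section 3.2."""
    if not 0 <= score <= 16:
        raise ValueError("score must be between 0 and 16")
    if score <= 2:
        return "very low"
    if score <= 9:
        return "intermediate"
    return "very high"
```

For example, a man aged 70 or over whose lesion shows rhomboidal structures and none of the three "absence" features would score 2 + 1 + 2 + 3 = 8, which falls in the intermediate range.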
Range | LM/LMM | Benign |
---|---|---|
Very low (0–2) | 0% | 19.2% |
Intermediate (3–9) | 73.3% | 73.4% |
Very high (10–16) | 26.7% | 7.3% |
Class | Sensitivity (%) of the CNN Model | Sensitivity (%) of the Dermatologists |
---|---|---|
AN | 27.3 | 48.0 |
PAK | 42.9 | 42.0 |
SK | 50.0 | 41.7 |
SL | 50.0 | 50.0 |
SLK | 100.0 | 67.9 |
LM + LMM | 78.7 | 55.5 |
Model | Sensitivity (%) | Specificity (%) | Precision (%) |
---|---|---|---|
LR | 100.0 | 33.9 | 39.1 |
CNN | 78.7 | 79.7 | 75.5 |
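For reference, the three metrics in the table follow from the counts of a binary confusion matrix, with malignant (LM + LMM) as the positive class. The sketch below uses hypothetical counts, not the study's confusion matrices, which are not reported in this section.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of malignant lesions correctly flagged as malignant."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of benign lesions correctly classified as benign."""
    return tn / (tn + fp)


def precision(tp: int, fp: int) -> float:
    """Share of lesions flagged as malignant that are truly malignant."""
    return tp / (tp + fp)


# Hypothetical counts, for illustration only.
tp, fn, tn, fp = 40, 10, 45, 15

print(f"sensitivity = {100 * sensitivity(tp, fn):.1f}%")
print(f"specificity = {100 * specificity(tn, fp):.1f}%")
print(f"precision   = {100 * precision(tp, fp):.1f}%")
```

A model that flags nearly every lesion as malignant reaches a sensitivity close to 100% at the cost of many false positives, which lowers both specificity and precision; this is the pattern seen in the LR row.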
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cartocci, A.; Luschi, A.; Tognetti, L.; Cinotti, E.; Farnetani, F.; Lallas, A.; Paoli, J.; Longo, C.; Moscarella, E.; Tiodorovic, D.; et al. Comparative Analysis of AI Models for Atypical Pigmented Facial Lesion Diagnosis. Bioengineering 2024, 11, 1036. https://doi.org/10.3390/bioengineering11101036
Cartocci, A., Luschi, A., Tognetti, L., Cinotti, E., Farnetani, F., Lallas, A., Paoli, J., Longo, C., Moscarella, E., Tiodorovic, D., Stanganelli, I., Suppa, M., Dika, E., Zalaudek, I., Pizzichetta, M. A., Perrot, J. L., Cevenini, G., Iadanza, E., Rubegni, G., ... Rubegni, P. (2024). Comparative Analysis of AI Models for Atypical Pigmented Facial Lesion Diagnosis. Bioengineering, 11(10), 1036. https://doi.org/10.3390/bioengineering11101036