MCR-SL: A Multimodal, Context-Rich Skin Lesion Dataset for Skin Cancer Diagnosis
Abstract
1. Summary
2. Data Description
2.1. Dataset Structure
2.1.1. Lesion Entity
2.1.2. Subject Entity
2.1.3. Image Entities
2.1.4. Diagnostic Entities: Dermatology, Histopathology, and Unified Diagnosis
- (1) Dermatology Diagnosis: A diagnosis provided by a panel of dermatologists assigned to each lesion.
- (2) Histopathology Diagnosis: A diagnosis derived from histopathology reports, available for a subset of 29 excised lesions (out of 240). These reports also contain tumor thickness information when applicable.
- (3) Unified Diagnosis: The definitive label for this dataset, derived by synthesizing the dermatology and histopathology diagnoses. The methodology for generating this label is detailed in the Methods section.
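As a rough illustration of how the three diagnostic entities relate, the sketch below assumes the simplest precedence rule consistent with the attribute descriptions of this dataset: the histopathology diagnosis is used when one exists, and the dermatology diagnosis is used otherwise. The function name `unify_diagnosis` and the use of `None` for a missing report are illustrative assumptions; the actual consolidation procedure is the one described in the Methods section and may include further steps.

```python
from typing import Optional


def unify_diagnosis(dermatology: str, histopathology: Optional[str]) -> str:
    """Return a unified label, preferring histopathology when available.

    Assumption: a missing histopathology report is represented as None.
    """
    if histopathology is not None:
        return histopathology
    return dermatology


# Hypothetical lesions: one excised (with a report) and one not excised.
print(unify_diagnosis("NEV", "MEL"))  # histopathology takes precedence
print(unify_diagnosis("SK", None))    # falls back to the dermatology label
```

Under this rule, only the 29 excised lesions could receive a label that differs from the dermatologists' diagnosis.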
3. Methods
3.1. Ethics Declaration
3.2. Participants and Selection Criteria
3.3. Data Acquisition Workflow
- Informed consent: On arrival, the subject is informed about the study and given the informed consent form to read and, if willing to participate, sign (Figure 3a). Estimated time: 5 min.
- Clinical data collection: Once the informed consent form is signed, the subject fills out a questionnaire in situ, so that the data collector can answer any questions that arise (Figure 3b). Estimated time: 10 min.
- Clinical and dermoscopic image acquisition: The data collector captures the images with a smartphone-based digital camera, both with and without the dermoscope attached to the device (Figure 3c). Estimated time: 30 s per lesion.
- Diameter measurement of the skin lesion: The data collector measures the lesion with a caliper gauge (Figure 3d). Estimated time: 20 s per lesion.
- Data storage: All acquired data are verified and stored in a secure, encrypted storage system (Figure 3e). Estimated time: 5 min per lesion.
- Real-World Baseline: Images were first acquired using the default automatic settings for all parameters. This captures the natural, heterogeneous noise expected from the average user of the WARIFA application.
- User Manipulation Scenario: Subsequent images of the same lesion were taken by deliberately adjusting settings such as brightness/exposure and focus. Crucially, this adjustment was performed using the typical user interface (e.g., tap-to-focus or brightness sliders) without setting specific technical values for ISO or exposure time.
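The estimated times listed for the workflow steps can be combined into a rough per-subject session length. The sketch below assumes the steps run strictly one after another with no overlap or waiting time, which is a simplification; the function name `session_minutes` is illustrative.

```python
# Estimated durations taken from the workflow steps, in seconds.
PER_SUBJECT_S = 5 * 60 + 10 * 60   # informed consent + questionnaire
PER_LESION_S = 30 + 20 + 5 * 60    # imaging + diameter measurement + storage


def session_minutes(n_lesions: int) -> float:
    """Estimated session length in minutes for one subject."""
    if n_lesions < 0:
        raise ValueError("n_lesions must be non-negative")
    return (PER_SUBJECT_S + n_lesions * PER_LESION_S) / 60


# 240 lesions over 60 subjects gives 4 lesions per subject on average.
print(round(session_minutes(4), 1))
```

For the average of four lesions per subject, this gives a session of roughly 38 minutes, most of which is spent on the per-lesion verification and storage step.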
3.4. Diagnosis Consolidation and Ground Truth Determination
3.5. Data Curation and Validation
3.5.1. Image Standardization and Curation
3.5.2. Metadata Validation and Consolidation
3.5.3. Experts’ Feedback
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial Intelligence |
| CNN | Convolutional Neural Network |
| ViT | Vision Transformer |
| MCR-SL | Multimodal, Context-Rich Skin Lesion |
| UNN | University Hospital of North Norway |
| WARIFA | Watching the Risk Factors |
| NEV | Nevus |
| SK | Seborrheic Keratosis |
| BCC | Basal Cell Carcinoma |
| AK | Actinic Keratosis |
| ATY | Atypical nevus |
| MEL | Melanoma |
| SCC | Squamous Cell Carcinoma |
| ANG | Angioma |
| DF | Dermatofibroma |
| UNK | Unknown |
| NM | Non-malignant |
| M | Malignant |








| Dataset Name | # Images | Classes Included | Image Modality | Gold Standard | Fields with IDs | Subject’s Data | Lesion Data | Diagnosis Variables |
|---|---|---|---|---|---|---|---|---|
| PH2 | 200 | NEV, MEL, ATY | Dermoscopic | Mixed (Histology, Expert Consensus) | Image | - | Dermoscopic criteria | - |
| HAM10000 | 10,015 | NEV, MEL, BCC, SK, AK, ANG, DF | Dermoscopic | Mixed (Histology, Follow-up, Confocal, Expert Consensus) | Lesion, Image | Age, sex | Body location | Verification Type (dx_type) |
| BCN20000 | 19,424 | NEV, MEL, BCC, SCC, SK, AK, ANG, DF, other | Dermoscopic | Mixed (Histology, Expert Consensus) | Subject, Lesion, Image | Age, sex | Body location | - |
| PAD-UFES-20 | 2298 | NEV, MEL, BCC, SCC, SK, AK | Clinical | Mixed (100% Biopsy for cancers; Expert Consensus for others) | Subject, Lesion, Image | Age, sex, skin cancer risk factors, others | Body location, lesion diameter, others | - |
| MCR-SL | 779; 1352 | NEV, SK, BCC, AK, ATY, MEL, ANG, DF, UNK | Clinical, Dermoscopic | Mixed (Histology, Expert Consensus) | Subject, Lesion, Image | Age, sex, skin cancer risk factors, others | Body location, lesion diameter, others | Certainty, image quality, time |

| Lesion Type | Malignancy | Diagnosed by Histopathology: Subjects | Diagnosed by Histopathology: Lesions | Diagnosed by Dermatologists: Subjects | Diagnosed by Dermatologists: Lesions |
|---|---|---|---|---|---|
| BCC | Malignant | 18 (30.0%) | 20 (8.3%) | 18 (30.0%) | 26 (10.8%) |
| MEL | Malignant | 3 (5.0%) | 3 (1.3%) | 7 (11.7%) | 8 (3.3%) |
| SCC | Malignant | 0 (0.0%) | 0 (0.0%) | 5 (8.3%) | 5 (2.1%) |
| NEV | Non-Malignant | 3 (5.0%) | 3 (1.3%) | 37 (61.7%) | 85 (35.4%) |
| SK | Non-Malignant | 1 (1.7%) | 1 (0.4%) | 34 (56.6%) | 84 (35.0%) |
| AK | Non-Malignant | 0 (0.0%) | 0 (0.0%) | 10 (16.7%) | 12 (5.0%) |
| ATY | Non-Malignant | 2 (3.3%) | 2 (0.8%) | 6 (10.0%) | 7 (2.9%) |
| ANG | Non-Malignant | 0 (0.0%) | 0 (0.0%) | 2 (3.3%) | 4 (1.7%) |
| DF | Non-Malignant | 0 (0.0%) | 0 (0.0%) | 2 (3.3%) | 2 (0.8%) |
| UNK | - | 0 (0.0%) | 0 (0.0%) | 6 (10.0%) | 7 (2.9%) |
| Total | | 27 (45.0%) | 29 (12.1%) | 60 (100.0%) | 240 (100.0%) |

| Attribute [Non-Missing Values/Total] | Values | # | % | # NM | % NM | # M | % M | p-Value |
|---|---|---|---|---|---|---|---|---|
| Diameter (mm) [238/240] | 1.235–12.083 | 199 | 83% | 23 | 12% | 5 | 3% | 0.0030 |
| | 12.083–22.867 | 34 | 14% | 17 | 50% | 0 | 0% | 0.0030 |
| | 22.867–33.65 | 2 | 1% | 0 | 0% | 0 | 0% | 0.0030 |
| | 33.65–44.433 | 1 | 0% | 0 | 0% | 0 | 0% | 0.0030 |
| | 55.217–66.0 | 1 | 0% | 1 | 100% | 0 | 0% | 0.0030 |
| Location group [232/240] | Back | 99 | 41% | 11 | 11% | 3 | 3% | 0.0005 |
| | Arms | 45 | 19% | 2 | 4% | 1 | 2% | 0.0005 |
| | Face | 41 | 17% | 16 | 39% | 1 | 2% | 0.0005 |
| | Torso | 31 | 13% | 10 | 32% | 0 | 0% | 0.0005 |
| | Legs | 12 | 5% | 2 | 17% | 1 | 8% | 0.0005 |
| | Head | 4 | 2% | 1 | 25% | 0 | 0% | 0.0005 |
| | unknown | 8 | 3% | 0 | 0% | 0 | 0% | 0.0005 |
| Lesion status when captured [240/240] | Lesion | 235 | 98% | 38 | 16% | 6 | 3% | 0.0047 |
| | Biopsied lesion | 5 | 2% | 4 | 80% | 0 | 0% | 0.0047 |
| Referral diagnosis [240/240] | Voluntary sample | 197 | 82% | 16 | 8% | 6 | 3% | 0.0000 |
| | BCC | 25 | 10% | 22 | 88% | 0 | 0% | 0.0000 |
| | SK | 7 | 3% | 0 | 0% | 0 | 0% | 0.0000 |
| | MEL | 5 | 2% | 3 | 60% | 0 | 0% | 0.0000 |
| | NEV | 5 | 2% | 0 | 0% | 0 | 0% | 0.0000 |
| | Morbus Bowen carcinoma | 1 | 0% | 1 | 100% | 0 | 0% | 0.0000 |
| Malignancy | Non-malignant | 192 | 80% | | | | | |
| | Malignant | 42 | 18% | | | | | |
| | unknown | 6 | 2% | | | | | |
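The percentage columns of the lesion-level summary above can be reproduced from the counts. The sketch below assumes that the first percentage in each row is the share of all 240 lesions, and that the percentage following a subgroup count is a row-wise share (for example, 17 of the 34 lesions in the 12.083–22.867 bin gives 50%). This reading is inferred from the numbers and is not stated explicitly; the helper name `pct` is illustrative.

```python
def pct(count: int, total: int) -> int:
    """Percentage rounded to the nearest integer; 0 when total is 0."""
    if total == 0:
        return 0
    return round(100 * count / total)


TOTAL_LESIONS = 240

# Counts from the "Location group" rows: (lesions in row, first subgroup count).
rows = {"Back": (99, 11), "Face": (41, 16), "Torso": (31, 10)}

for name, (n_row, n_sub) in rows.items():
    # Share of all lesions, then share of the subgroup within the row.
    print(name, pct(n_row, TOTAL_LESIONS), pct(n_sub, n_row))
```

Recomputing the shares in this way is a quick consistency check when the table is re-derived from the released metadata.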

| Attribute [Non-Missing Values/Total] | Values | # | % | # NM | % NM | # M | % M | p-Value |
|---|---|---|---|---|---|---|---|---|
| Age [59/60] | 14.9–40.7 | 8 | 13% | 0 | 0% | 8 | 100% | 0.582 |
| | 40.7–66.3 | 23 | 38% | 13 | 57% | 10 | 43% | |
| | 66.3–92.0 | 29 | 48% | 19 | 66% | 10 | 34% | |
| Sex [60/60] | Female | 33 | 55% | 12 | 36% | 21 | 64% | 0.008 |
| | Male | 27 | 45% | 20 | 74% | 7 | 26% | |
| Height (cm) [59/60] | 145.9–162.3 | 14 | 23% | 4 | 29% | 10 | 71% | 0.053 |
| | 162.3–178.7 | 27 | 45% | 17 | 63% | 10 | 37% | |
| | 178.7–195.0 | 19 | 32% | 11 | 58% | 8 | 42% | |
| Weight (kg) [59/60] | 38.9–66.0 | 19 | 32% | 6 | 32% | 13 | 68% | 0.496 |
| | 66.0–93.0 | 32 | 53% | 20 | 62% | 13 | 41% | |
| | 93.0–120.0 | 9 | 15% | 6 | 67% | 4 | 44% | |
| Natural hair color (≤18 years old) [60/60] | Brown | 25 | 42% | 12 | 48% | 13 | 52% | 0.382 |
| | Fair blonde | 19 | 32% | 10 | 53% | 9 | 47% | |
| | Dark brown, black | 12 | 20% | 9 | 75% | 3 | 25% | |
| | Red or auburn | 3 | 5% | 1 | 33% | 2 | 67% | |
| | Blonde | 1 | 2% | 0 | 0% | 1 | 100% | |
| Skin reaction to sun exposure [60/60] | Red | 29 | 48% | 16 | 55% | 13 | 45% | 0.844 |
| | Brown without 1st becoming red | 22 | 37% | 12 | 55% | 10 | 45% | |
| | Red with pain | 9 | 15% | 4 | 44% | 5 | 56% | |
| Number of moles (≤18 years old) [53/60] | Few | 21 | 35% | 14 | 67% | 7 | 33% | 0.065 |
| | Some | 18 | 30% | 5 | 28% | 13 | 72% | |
| | Many | 14 | 23% | 8 | 57% | 6 | 43% | |
| | Unknown | 7 | 12% | 5 | 71% | 2 | 29% | |
| Moles > 5 mm [55/60] | Yes | 30 | 50% | 14 | 47% | 16 | 53% | 0.361 |
| | No | 25 | 42% | 16 | 64% | 9 | 36% | |
| | Unknown | 5 | 8% | 2 | 40% | 3 | 60% | |
| Moles > 20 cm [60/60] | No | 60 | 100% | 32 | 53% | 28 | 47% | 1.000 |
| Number of moles (now) [53/60] | Some | 24 | 40% | 9 | 38% | 15 | 62% | 0.133 |
| | Few | 22 | 37% | 15 | 68% | 7 | 32% | |
| | Many | 7 | 12% | 3 | 43% | 4 | 57% | |
| | Unknown | 7 | 12% | 5 | 71% | 2 | 29% | |
| Number of severe sunburns [52/60] | 0 | 28 | 47% | 14 | 50% | 14 | 50% | 0.617 |
| | 1–2 | 13 | 22% | 7 | 54% | 6 | 46% | |
| | 3–5 | 8 | 13% | 3 | 38% | 5 | 62% | |
| | >5 | 3 | 5% | 2 | 67% | 1 | 33% | |
| | Unknown | 8 | 13% | 6 | 75% | 2 | 25% | |
| Sunbed use [58/60] | No | 54 | 90% | 29 | 54% | 25 | 46% | 0.218 |
| | Yes | 4 | 7% | 1 | 25% | 3 | 75% | |
| | Unknown | 2 | 3% | 2 | 100% | 0 | 0% | |
| History of cancer [60/60] | No | 39 | 65% | 17 | 44% | 22 | 56% | 0.073 |
| | Yes | 21 | 35% | 15 | 71% | 6 | 29% | |
| History of skin cancer [56/60] | No | 41 | 68% | 19 | 46% | 22 | 54% | 0.102 |
| | Yes | 15 | 25% | 9 | 60% | 6 | 40% | |
| | Unknown | 4 | 7% | 4 | 100% | 0 | 0% | |
| History of skin cancer (close relatives) [60/60] | No | 50 | 83% | 25 | 50% | 25 | 50% | 0.418 |
| | Yes | 10 | 17% | 7 | 70% | 3 | 30% | |
| Organ transplant [59/60] | No | 57 | 95% | 30 | 53% | 27 | 47% | 0.234 |
| | Yes | 2 | 3% | 2 | 100% | 0 | 0% | |
| | Unknown | 1 | 2% | 0 | 0% | 1 | 100% | |
| Immunosuppression [59/60] | No | 54 | 90% | 30 | 56% | 24 | 44% | 0.448 |
| | Yes | 5 | 8% | 2 | 40% | 3 | 60% | |
| | Unknown | 1 | 2% | 0 | 0% | 1 | 100% | |
| Patients derived from [60/60] | Plastic surgery | 35 | 58% | 20 | 57% | 15 | 43% | 0.040 |
| | Dermatology | 17 | 28% | 11 | 65% | 6 | 35% | |
| | Volunteer | 8 | 13% | 1 | 12% | 7 | 88% | |
| Subjects with known malignant lesions | Yes | 32 | 53% | | | | | |
| | No | 28 | 47% | | | | | |

| Attribute | Data Type | Description |
|---|---|---|
| lesion_id | string | A unique identifier for the lesion. |
| referral_diagnosis | text | The initial diagnosis provided during the subject’s referral. |
| lesion_status_when_captured | categorical | The status of the lesion at the time of imaging. |
| location | categorical | The anatomical location of the lesion on the subject’s body. |
| location_group | categorical | A broader classification of the lesion’s location. |
| diameter | numerical | The measured diameter of the lesion in millimeters. |
| malignancy | categorical | The malignancy status of the lesion (i.e., malignant, non-malignant). |
| lesion_diagnosis | text | The unified diagnosis assigned to the lesion. |
| diagnosis_image_id | string | The unique identifier of the specific image used by the dermatologists to make their diagnoses. |

| Attribute | Data Type | Description |
|---|---|---|
| subject_id | string | A unique identifier for the subject. |
| derived_from | categorical | The hospital department that referred the subject. |
| year_of_birth | integer | The subject’s year of birth. |
| age | integer | The subject’s age. |
| sex | categorical | The subject’s sex. |
| height | numerical | Subject height in centimeters. |
| weight | numerical | Subject weight in kilograms. |
| natural_hair_color | categorical | The subject’s natural hair color at 18 years old. |
| skin_reaction_to_sun | categorical | How the subject’s skin reacts to sun exposure without sun protection. |
| number_of_moles | integer | The total number of moles on the subject at 18 years old. |
| moles_bigger_5mm | integer | Current number of moles larger than 5 mm. |
| moles_bigger_20cm | integer | Current number of moles larger than 20 cm. |
| moles_body | integer | Current number of moles on the body. |
| sunburn_number | integer | The number of severe sunburns the subject has experienced. |
| sunburn_age | text | The age at which the subject experienced severe sunburns. |
| sunburn_number_group | categorical | A categorized group for the number of sunburns. |
| sunbed | boolean | Whether the subject has used a sunbed. |
| h_cancer | boolean | History of hereditary cancer. |
| h_skin_cancer | boolean | History of hereditary skin cancer. |
| h_skin_cancer_relatives | boolean | History of skin cancer in close relatives. |
| organ_transplant | boolean | Whether the subject has had an organ transplant. |
| immunosuppresion | boolean | Whether the subject is on immunosuppressive medication. |

| Attribute | Data Type | Description |
|---|---|---|
| image_id | string | A unique identifier for each image. |
| lesion_id | string | A unique identifier for the lesion depicted in the image. |
| modality | categorical | The modality of the image (clinical or dermoscopic). |
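
Because each image record carries the `lesion_id` of the lesion it depicts, the images of a lesion can be gathered with a simple grouping step. The sketch below uses made-up identifiers (`L001`, `IMG_0001`, and so on) and in-memory records; the real identifier format and file layout of the dataset are not specified here, and the function name `group_by_lesion` is illustrative.

```python
from collections import defaultdict

# Hypothetical image records following the image attribute table.
images = [
    {"image_id": "IMG_0001", "lesion_id": "L001", "modality": "clinical"},
    {"image_id": "IMG_0002", "lesion_id": "L001", "modality": "dermoscopic"},
    {"image_id": "IMG_0003", "lesion_id": "L001", "modality": "dermoscopic"},
    {"image_id": "IMG_0004", "lesion_id": "L002", "modality": "clinical"},
]


def group_by_lesion(records):
    """Map lesion_id -> modality -> list of image_id."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        grouped[rec["lesion_id"]][rec["modality"]].append(rec["image_id"])
    # Convert to plain dicts so the result prints and compares cleanly.
    return {lid: dict(mods) for lid, mods in grouped.items()}


by_lesion = group_by_lesion(images)
print(by_lesion["L001"]["dermoscopic"])
```

Grouping by lesion rather than by image also makes it straightforward to keep all images of one lesion in the same split when training a model.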

| Attribute | Data Type | Description |
|---|---|---|
| diagnosis_id | string | A unique identifier for each diagnosis. |
| lesion_id | string | The identifier of the lesion the diagnosis refers to. |
| image_id | string | The identifier of the image that was diagnosed. |
| expert_id | string | The identifier of the dermatologist who provided the diagnosis. |
| diagnosis | string | The primary diagnosis provided by the expert (e.g., NEV, MEL). |
| 2nd_option | string | An optional second choice or differential diagnosis. |
| certainty | categorical | A numerical rating of the expert’s confidence in their diagnosis. Potential values are 0%, 25%, 50%, 75%, and 100%. |
| image_rating | integer | The expert’s rating of the image quality, ranging from 1 to 10. |
| time | datetime | The time taken by the expert to provide the diagnosis. |
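
Since `certainty` takes one of five percentage values, a small helper can convert it to a number before any aggregation. The sketch below assumes the values are stored as strings such as `"75%"`, and the records and identifiers are made up. The per-lesion mean is only an illustrative summary and is not the consolidation method used for the dataset.

```python
from collections import defaultdict
from statistics import mean

ALLOWED = {"0%", "25%", "50%", "75%", "100%"}


def certainty_to_fraction(label: str) -> float:
    """Convert a certainty label such as '75%' to 0.75."""
    label = label.strip()
    if label not in ALLOWED:
        raise ValueError(f"unexpected certainty value: {label!r}")
    return int(label[:-1]) / 100


# Hypothetical dermatology diagnosis records (identifiers are made up).
diagnoses = [
    {"lesion_id": "L001", "expert_id": "E1", "diagnosis": "NEV", "certainty": "75%"},
    {"lesion_id": "L001", "expert_id": "E2", "diagnosis": "NEV", "certainty": "100%"},
    {"lesion_id": "L002", "expert_id": "E1", "diagnosis": "BCC", "certainty": "50%"},
]

# Collect the numeric certainties of all experts for each lesion.
per_lesion = defaultdict(list)
for d in diagnoses:
    per_lesion[d["lesion_id"]].append(certainty_to_fraction(d["certainty"]))

mean_certainty = {lid: mean(vals) for lid, vals in per_lesion.items()}
print(mean_certainty)
```

Rejecting values outside the five allowed labels catches encoding differences early, for example a bare `75` or a fraction such as `0.75`.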

| Attribute | Data Type | Description |
|---|---|---|
| diagnosis_id | string | A unique identifier for each histopathology diagnosis. |
| lesion_id | string | The identifier of the lesion the diagnosis refers to. |
| procedure | string | The type of procedure described in the report (e.g., biopsy, excision). |
| tumor_thickness | float | The Breslow thickness of the tumor, if applicable. |
| diagnosis | string | The final diagnosis from the histopathology report (e.g., NEV, MEL). |

| Attribute | Data Type | Description |
|---|---|---|
| diagnosis_id | string | A unique identifier for the unified diagnosis. |
| lesion_id | string | The identifier of the lesion the diagnosis refers to. |
| dermatology_diagnosis | string | The final diagnosis selected by the dermatology experts. |
| histopathology_diagnosis | string | The diagnosis from the histopathology report, used as the ground truth when available. |
| diagnosis_id_histopath | string | The unique identifier of the histopathological diagnosis of the lesion. |
| unified_diagnosis | string | The final ground truth diagnosis for the lesion. |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Castro-Fernandez, M.; Schopf, T.R.; Castaño-Gonzalez, I.; Roque-Quintana, B.; Kirchesch, H.; Ortega, S.; Fabelo, H.; Godtliebsen, F.; Granja, C.; Callico, G.M. MCR-SL: A Multimodal, Context-Rich Skin Lesion Dataset for Skin Cancer Diagnosis. Data 2025, 10, 166. https://doi.org/10.3390/data10100166

