# Improving Automatic Melanoma Diagnosis Using Deep Learning-Based Segmentation of Irregular Networks


## Simple Summary

## Abstract

## 1. Introduction

#### Irregular Network

## 2. Methods

#### 2.1. Datasets

#### 2.1.1. Segmentation Dataset

#### 2.1.2. Classification Dataset

#### 2.2. Models

#### 2.3. Evaluation Metrics

#### 2.4. Training

#### 2.4.1. Segmentation

#### 2.4.2. Classification

## 3. Results

#### 3.1. DL Segmentation

#### 3.2. Classification

## 4. Hardware and Software

## 5. Discussion

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Appendix A

**Table A1.** Metrics ^{1} for all encoder and architecture combinations trained for segmentation in this study. The highest value for each metric is in bold.

Architecture | Encoder | Threshold | Precision | Recall | F1-Score | Specificity | IoU
---|---|---|---|---|---|---|---
MA-Net | EfficientNet-b2 | 0.25 | 0.257 | 0.629 | 0.365 | 0.968 | 0.223
MA-Net | EfficientNet-b2 | 0.5 | 0.296 | 0.568 | 0.390 | 0.976 | 0.242
MA-Net | EfficientNet-b2 | 0.75 | 0.336 | 0.501 | 0.402 | 0.982 | 0.251
MA-Net | EfficientNet-b3 | 0.25 | 0.302 | 0.494 | 0.375 | 0.980 | 0.230
MA-Net | EfficientNet-b3 | 0.5 | 0.347 | 0.428 | 0.383 | 0.986 | 0.237
MA-Net | EfficientNet-b3 | 0.75 | 0.391 | 0.364 | 0.377 | 0.990 | 0.232
MA-Net | EfficientNet-b4 | 0.25 | 0.295 | 0.593 | 0.394 | 0.975 | 0.245
MA-Net | EfficientNet-b4 | 0.5 | 0.336 | 0.534 | 0.412 | 0.981 | 0.260
MA-Net | EfficientNet-b4 | 0.75 | 0.377 | 0.470 | 0.418 | 0.986 | 0.264
MA-Net | EfficientNet-b5 | 0.25 | 0.341 | 0.518 | 0.411 | 0.982 | 0.259
MA-Net | EfficientNet-b5 | 0.5 | 0.383 | 0.461 | 0.419 | 0.987 | 0.265
MA-Net | EfficientNet-b5 | 0.75 | 0.422 | 0.402 | 0.412 | 0.990 | 0.259
PA-Net | EfficientNet-b2 | 0.25 | 0.239 | **0.690** | 0.355 | 0.961 | 0.216
PA-Net | EfficientNet-b2 | 0.5 | 0.289 | 0.603 | 0.391 | 0.974 | 0.243
PA-Net | EfficientNet-b2 | 0.75 | 0.337 | 0.505 | 0.405 | 0.982 | 0.254
PA-Net | EfficientNet-b3 | 0.25 | 0.242 | 0.587 | 0.342 | 0.967 | 0.207
PA-Net | EfficientNet-b3 | 0.5 | 0.290 | 0.501 | 0.367 | 0.978 | 0.225
PA-Net | EfficientNet-b3 | 0.75 | 0.336 | 0.414 | 0.371 | 0.985 | 0.227
PA-Net | EfficientNet-b4 | 0.25 | 0.263 | 0.631 | 0.372 | 0.969 | 0.228
PA-Net | EfficientNet-b4 | 0.5 | 0.312 | 0.549 | 0.398 | 0.979 | 0.248
PA-Net | EfficientNet-b4 | 0.75 | 0.359 | 0.461 | 0.404 | 0.985 | 0.253
PA-Net | EfficientNet-b5 | 0.25 | 0.291 | 0.589 | 0.390 | 0.975 | 0.242
PA-Net | EfficientNet-b5 | 0.5 | 0.345 | 0.501 | 0.408 | 0.983 | 0.257
PA-Net | EfficientNet-b5 | 0.75 | 0.396 | 0.409 | 0.402 | 0.989 | 0.252
U-Net | EfficientNet-b2 | 0.25 | 0.280 | 0.572 | 0.376 | 0.974 | 0.232
U-Net | EfficientNet-b2 | 0.5 | 0.325 | 0.498 | 0.394 | 0.982 | 0.245
U-Net | EfficientNet-b2 | 0.75 | 0.368 | 0.422 | 0.394 | 0.987 | 0.245
U-Net | EfficientNet-b3 | 0.25 | 0.341 | 0.537 | 0.417 | 0.982 | 0.263
U-Net | EfficientNet-b3 | 0.5 | 0.385 | 0.476 | 0.426 | 0.987 | 0.271
U-Net | EfficientNet-b3 | 0.75 | 0.427 | 0.415 | 0.421 | 0.990 | 0.267
U-Net | EfficientNet-b4 | 0.25 | 0.319 | 0.586 | 0.413 | 0.978 | 0.261
U-Net | EfficientNet-b4 | 0.5 | 0.363 | 0.523 | 0.428 | 0.984 | 0.273
U-Net | EfficientNet-b4 | 0.75 | 0.404 | 0.458 | 0.429 | 0.988 | 0.273
U-Net | EfficientNet-b5 | 0.25 | 0.348 | 0.485 | 0.405 | 0.984 | 0.254
U-Net | EfficientNet-b5 | 0.5 | 0.394 | 0.420 | 0.407 | 0.989 | 0.255
U-Net | EfficientNet-b5 | 0.75 | **0.438** | 0.356 | 0.393 | **0.992** | 0.244
U-Net++ | EfficientNet-b2 | 0.25 | 0.282 | 0.657 | 0.395 | 0.970 | 0.246
U-Net++ | EfficientNet-b2 | 0.5 | 0.325 | 0.589 | 0.419 | 0.978 | 0.265
U-Net++ | EfficientNet-b2 | 0.75 | 0.365 | 0.517 | 0.428 | 0.984 | 0.272
U-Net++ | EfficientNet-b3 | 0.25 | 0.321 | 0.566 | 0.409 | 0.979 | 0.257
U-Net++ | EfficientNet-b3 | 0.5 | 0.364 | 0.507 | 0.423 | 0.984 | 0.269
U-Net++ | EfficientNet-b3 | 0.75 | 0.404 | 0.447 | 0.424 | 0.988 | 0.269
U-Net++ | EfficientNet-b4 | 0.25 | 0.315 | 0.622 | 0.418 | 0.976 | 0.265
U-Net++ | EfficientNet-b4 | 0.5 | 0.358 | 0.562 | 0.438 | 0.982 | 0.280
U-Net++ | EfficientNet-b4 | 0.75 | 0.400 | 0.499 | **0.444** | 0.987 | **0.285**
U-Net++ | EfficientNet-b5 | 0.25 | 0.347 | 0.535 | 0.421 | 0.982 | 0.267
U-Net++ | EfficientNet-b5 | 0.5 | 0.393 | 0.473 | 0.429 | 0.987 | 0.273
U-Net++ | EfficientNet-b5 | 0.75 | 0.435 | 0.410 | 0.422 | 0.991 | 0.268

^{1} Rounded to three significant digits.
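For reference, the pixel-based metrics in the table above can be computed directly from a thresholded probability map and a ground-truth mask. The sketch below is our own minimal re-implementation of the standard definitions, not the study's code:

```python
import numpy as np

def pixel_metrics(prob_map, gt_mask, threshold=0.5):
    """Pixel-based precision, recall, F1-score, specificity, and IoU
    for a segmentation probability map, matching the columns of
    Table A1 (minimal sketch; the paper's exact code is not shown)."""
    pred = prob_map >= threshold
    gt = gt_mask.astype(bool)
    tp = np.sum(pred & gt)      # true-positive pixels
    fp = np.sum(pred & ~gt)     # false-positive pixels
    fn = np.sum(~pred & gt)     # false-negative pixels
    tn = np.sum(~pred & ~gt)    # true-negative pixels
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
        "iou": tp / (tp + fp + fn),
    }
```

Evaluating the same probability map at thresholds 0.25, 0.5, and 0.75, as in Table A1, trades recall for precision as the threshold rises.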

**Figure A1.** Comparison of ROC curves for conventional classifiers, DL classifiers, and their respective ensembles obtained by averaging the output probabilities. (**a**) ROC curves for conventional classifiers applied directly to hand-crafted features, without cascade generalization; (**b**) ROC curves for the deep learning (DL) models only and their ensemble. The area under the curve (AUC) and the AUC for a false-positive rate higher than 0.4 are both presented for comparison. All values are rounded to three significant digits.
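The "AUC for a false-positive rate higher than 0.4" restricts the ROC integral to the high-FPR region. One way to compute it, sketched below with scikit-learn's `roc_curve` (an assumed re-implementation; the paper's exact computation may differ):

```python
import numpy as np
from sklearn.metrics import roc_curve

def auc_above_fpr(y_true, y_score, fpr_min=0.4):
    """Area under the ROC curve restricted to FPR > fpr_min.

    The full AUC integrates TPR over FPR in [0, 1]; here the integral
    runs over [fpr_min, 1] only, so a perfect classifier scores
    1 - fpr_min rather than 1.
    """
    fpr, tpr, _ = roc_curve(y_true, y_score)
    # TPR interpolated at the cut point, then the curve beyond it.
    tpr_cut = np.interp(fpr_min, fpr, tpr)
    keep = fpr >= fpr_min
    fpr_seg = np.concatenate(([fpr_min], fpr[keep]))
    tpr_seg = np.concatenate(([tpr_cut], tpr[keep]))
    # Trapezoidal rule over the restricted segment.
    return float(np.sum((tpr_seg[1:] + tpr_seg[:-1]) / 2 * np.diff(fpr_seg)))
```

With `fpr_min=0.4` the maximum attainable value is 0.6, so these partial-AUC values are not directly comparable to the full AUC.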

**Figure A2.** Comparison of confusion matrices for conventional classifiers using only hand-crafted features for classification (without cascade generalization) on the classification hold-out test set. (**a**) Neural network; (**b**) RBF SVM; (**c**) Linear SVM; (**d**) Random Forest; (**e**) Decision Tree; (**f**) Ensemble of conventional models.

**Figure A3.** Comparison of confusion matrices for DL classifiers without cascade generalization on the classification hold-out test set. (**a**) ResNet50; (**b**) EfficientNet-B0; (**c**) EfficientNet-B1; (**d**) Ensemble of DL classifier models.

**Figure A4.** Comparison of confusion matrices obtained using cascade generalization, utilizing the probability outputs of the EfficientNet-B0 model. (**a**) Confusion matrix for classification using only the EfficientNet-B0 model (DL-only, level-0); (**b**) Confusion matrix for classification utilizing the EfficientNet-B0 probability outputs and hand-crafted features, obtained through an ensemble of conventional classifiers (averaging); (**c**) Confusion matrix for classification using the EfficientNet-B0 (level-0) probability outputs and hand-crafted features for the best conventional classifier, a neural network. All confusion matrices are generated after applying a threshold of 0.5 to the model's probability outputs to obtain the final predictions.

**Figure A5.** (**a**) ROC curves for conventional classifiers (level-1) used with the DL output probabilities from the EfficientNet-B0 (level-0) model and the hand-crafted features (cascade generalization); (**b**) ROC curves for only the EfficientNet-B0 (level-0) model, the ensemble of conventional classifiers, and the best conventional level-1 classifier using the EfficientNet-B0 probability outputs. The AUC and the AUC for an FPR higher than 0.4 are both presented for comparison. All values are rounded to three significant digits.

**Figure A6.** Comparison of confusion matrices obtained using cascade generalization, utilizing the probability outputs of the EfficientNet-B1 model. (**a**) Confusion matrix for classification using only the EfficientNet-B1 model (DL-only, level-0); (**b**) Confusion matrix for classification utilizing the EfficientNet-B1 probability outputs and hand-crafted features, obtained through an ensemble of conventional classifiers (averaging); (**c**) Confusion matrix for classification using the EfficientNet-B1 (level-0) probability outputs and hand-crafted features for the best conventional classifier, random forest. All confusion matrices are generated after applying a threshold of 0.5 to the model's probability outputs to obtain the final predictions.

**Figure A7.** (**a**) ROC curves for conventional classifiers (level-1) used with the DL output probabilities from the EfficientNet-B1 (level-0) model and the hand-crafted features (cascade generalization); (**b**) ROC curves for only the EfficientNet-B1 (level-0) model, the ensemble of conventional classifiers, and the best conventional level-1 classifier using the EfficientNet-B1 probability outputs. The AUC and the AUC for an FPR higher than 0.4 are both presented for comparison. All values are rounded to three significant digits.

**Figure A8.** Comparison of confusion matrices obtained through cascade generalization, utilizing the probability outputs of the ResNet50 model. (**a**) Confusion matrix for classification using only the ResNet50 model (DL-only, level-0); (**b**) Confusion matrix for classification utilizing the ResNet50 probability outputs and hand-crafted features, obtained through an ensemble of conventional classifiers (averaging); (**c**) Confusion matrix for classification using the ResNet50 (level-0) probability outputs and hand-crafted features for the best conventional classifier, random forest. All confusion matrices are generated after applying a threshold of 0.5 to the model's probability outputs to obtain the final predictions.

**Figure A9.** (**a**) ROC curves for conventional classifiers (level-1) used with the DL output probabilities from the ResNet50 (level-0) model and the hand-crafted features (cascade generalization); (**b**) ROC curves for only the ResNet50 (level-0) model, the ensemble of conventional classifiers, and the best conventional level-1 classifier using the ResNet50 probability outputs. The AUC and the AUC for an FPR higher than 0.4 are both presented for comparison. All values are rounded to three significant digits.

**Table A2.**Table shows the 20 most important features based on the mean accuracy decrease after feature permutation. The scores are the averaged mean importance scores of all conventional classifiers without cascade generalization. Note that objects are non-overlapping distinct contours (blobs) in the irregular network binary mask generated by the segmentation model.

Feature | Importance Score
---|---
Standard deviation of object's color in L-plane inside lesion | 0.036
Mean of object's color in L-plane inside lesion | 0.026
Standard deviation of object's color in B-plane inside lesion | 0.026
Total number of objects inside lesion | 0.021
Mean of skin color in A-plane (excluding both lesion and irregular networks) | 0.019
Mean of object's color in B-plane inside lesion | 0.018
Standard deviation of skin color in L-plane (excluding both lesion and irregular networks) | 0.016
Maximum width for all objects | 0.014
Standard deviation of skin color in B-plane (excluding both lesion and irregular networks) | 0.012
Standard deviation of eccentricity for all objects | 0.011
Standard deviation of width for all objects | 0.010
Density of objects inside lesion (objects' area inside lesion/lesion area) | 0.010
Total number of objects remaining after applying erosion with circular structuring element of radius 7 | 0.007
Total number of objects remaining after applying erosion with circular structuring element of radius 9 | 0.006
Standard deviation of object's color in L-plane outside lesion | 0.006
Total number of objects remaining after applying erosion with circular structuring element of radius 5 | 0.006
Total number of objects remaining after applying erosion with circular structuring element of radius 8 | 0.005
Mean of skin color in B-plane (excluding both lesion and irregular networks) | 0.005
Mean of object's color in A-plane inside lesion | 0.004
Total number of objects remaining after applying erosion with circular structuring element of radius 3 | 0.004
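The importance scores above are mean accuracy decreases under feature permutation. With scikit-learn, this kind of score can be obtained via `permutation_importance`; the sketch below uses synthetic stand-in data, not the study's feature set:

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: the label depends only on feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)

clf = LogisticRegression().fit(X, y)

# Shuffle each feature column in turn and record the mean drop in
# accuracy -- the same "mean accuracy decrease" used for the tables.
result = permutation_importance(clf, X, y, scoring="accuracy",
                                n_repeats=10, random_state=0)
scores = result.importances_mean  # one score per feature
```

Averaging these per-classifier scores across all conventional models yields a single ranking, as reported in Tables A2 and A3.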

**Table A3.** The 20 most important features based on the mean accuracy decrease after feature permutation. The scores are the averaged mean importance scores for all conventional classifiers with cascade generalization. Note that objects are non-overlapping distinct contours (blobs) in the irregular network binary mask generated by the segmentation model.

Feature | Importance Score
---|---
Standard deviation of object's color in L-plane inside lesion | 0.037
Deep Learning probability output | 0.030
Mean of skin color in A-plane (excluding both lesion and irregular networks) | 0.030
Total number of objects inside lesion | 0.021
Total number of objects remaining after applying erosion with circular structuring element of radius 3 | 0.015
Mean of object's color in L-plane inside lesion | 0.014
Standard deviation of skin color in B-plane (excluding both lesion and irregular networks) | 0.014
Standard deviation of object's color in B-plane inside lesion | 0.014
Standard deviation of skin color in L-plane (excluding both lesion and irregular networks) | 0.013
Mean of object's color in B-plane inside lesion | 0.012
Total number of objects remaining after applying erosion with circular structuring element of radius 2 | 0.011
Standard deviation of object's color in A-plane inside lesion | 0.010
Total number of objects remaining after applying erosion with circular structuring element of radius 1 | 0.010
Density of objects inside lesion (objects' area inside lesion/lesion area) | 0.009
Standard deviation of eccentricity of objects | 0.009
Maximum width for all objects | 0.009
Mean of object's color in A-plane inside lesion | 0.007
Total mask area after applying erosion with circular structuring element of radius 1 | 0.006
Total number of objects remaining after applying erosion with circular structuring element of radius 4 | 0.006
Mean of object's color in L-plane outside lesion | 0.006
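Several of the features above count the objects (blobs) that survive morphological erosion with a circular structuring element of increasing radius, which in effect profiles how thick the detected network segments are. A minimal version with SciPy could look like this (our assumed implementation, not the study's code):

```python
import numpy as np
from scipy import ndimage

def disk(radius):
    """Circular (disk-shaped) structuring element of the given radius."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x * x + y * y) <= radius * radius

def objects_after_erosion(mask, radius):
    """Number of connected components in a binary mask that remain
    after erosion with a circular structuring element -- a sketch of
    the 'objects remaining after erosion with radius r' features."""
    eroded = ndimage.binary_erosion(mask.astype(bool), structure=disk(radius))
    _, num_objects = ndimage.label(eroded)
    return num_objects
```

Thin blobs vanish at small radii while thick ones persist, so the counts at radii 1 through 9 act as a width histogram of the segmented network.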

## References

- Siegel, R.L.; Miller, K.D.; Fuchs, H.E.; Jemal, A. Cancer Statistics, 2022. CA Cancer J. Clin. **2022**, 72, 7–33.
- Siegel, R.L.; Miller, K.D.; Wagle, N.S.; Jemal, A. Cancer Statistics, 2023. CA Cancer J. Clin. **2023**, 73, 17–48.
- Rahib, L.; Wehner, M.R.; Matrisian, L.M.; Nead, K.T. Estimated Projection of US Cancer Incidence and Death to 2040. JAMA Netw. Open **2021**, 4, e214708.
- Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-Level Classification of Skin Cancer with Deep Neural Networks. Nature **2017**, 542, 115–118.
- Haenssle, H.A.; Fink, C.; Rosenberger, A.; Uhlmann, L. Reply to the Letter to the Editor "Man against Machine: Diagnostic Performance of a Deep Learning Convolutional Neural Network for Dermoscopic Melanoma Recognition in Comparison to 58 Dermatologists" by H. A. Haenssle et al. Ann. Oncol. **2019**, 30, 854–857.
- Tschandl, P.; Rosendahl, C.; Kittler, H. The HAM10000 Dataset, a Large Collection of Multi-Source Dermatoscopic Images of Common Pigmented Skin Lesions. Sci. Data **2018**, 5, 180161.
- Tschandl, P.; Codella, N.; Akay, B.N.; Argenziano, G.; Braun, R.P.; Cabo, H.; Gutman, D.; Halpern, A.; Helba, B.; Hofmann-Wellenhof, R.; et al. Comparison of the Accuracy of Human Readers versus Machine-Learning Algorithms for Pigmented Skin Lesion Classification: An Open, Web-Based, International, Diagnostic Study. Lancet Oncol. **2019**, 20, 938–947.
- Codella, N.C.F.; Gutman, D.; Celebi, M.E.; Helba, B.; Marchetti, M.A.; Dusza, S.W.; Kalloo, A.; Liopyris, K.; Mishra, N.; Kittler, H.; et al. Skin Lesion Analysis toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC). In Proceedings of the International Symposium on Biomedical Imaging, Washington, DC, USA, 4–7 April 2018; Volume 2018, pp. 168–172.
- Combalia, M.; Codella, N.C.F.; Rotemberg, V.; Helba, B.; Vilaplana, V.; Reiter, O.; Carrera, C.; Barreiro, A.; Halpern, A.C.; Puig, S.; et al. BCN20000: Dermoscopic Lesions in the Wild. arXiv **2019**, arXiv:1908.02288.
- Mendonça, T.; Ferreira, P.M.; Marques, J.S.; Marcal, A.R.S.; Rozeira, J. PH2: A Dermoscopic Image Database for Research and Benchmarking. In Proceedings of the 35th International Conference of the IEEE Engineering in Medicine and Biology Society, Osaka, Japan, 3–7 July 2013; pp. 5437–5440.
- Argenziano, G.; Soyer, H.P.; De Giorgi, V.; Piccolo, D.; Carli, P.; Delfino, M.; Ferrari, A.; Hofmann-Wellenhof, R.; Massi, D.; Mazzocchetti, G.; et al. Dermoscopy: A Tutorial; Edra, Medical Publishing & New Media: Milan, Italy, 2002; Volume 16.
- Dildar, M.; Akram, S.; Irfan, M.; Khan, H.U.; Ramzan, M.; Mahmood, A.R.; Alsaiari, S.A.; Saeed, A.H.M.; Alraddadi, M.O.; Mahnashi, M.H. Skin Cancer Detection: A Review Using Deep Learning Techniques. Int. J. Environ. Res. Public Health **2021**, 18, 5479.
- Grignaffini, F.; Barbuto, F.; Piazzo, L.; Troiano, M.; Simeoni, P.; Mangini, F.; Pellacani, G.; Cantisani, C.; Frezza, F. Machine Learning Approaches for Skin Cancer Classification from Dermoscopic Images: A Systematic Review. Algorithms **2022**, 15, 438.
- Wu, Y.; Chen, B.; Zeng, A.; Pan, D.; Wang, R.; Zhao, S. Skin Cancer Classification With Deep Learning: A Systematic Review. Front. Oncol. **2022**, 12, 893972.
- Kousis, I.; Perikos, I.; Hatzilygeroudis, I.; Virvou, M. Deep Learning Methods for Accurate Skin Cancer Recognition and Mobile Application. Electronics **2022**, 11, 1294.
- Codella, N.; Rotemberg, V.; Tschandl, P.; Celebi, M.E.; Dusza, S.; Gutman, D.; Helba, B.; Kalloo, A.; Liopyris, K.; Marchetti, M.; et al. Skin Lesion Analysis toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC). arXiv **2019**, arXiv:1902.03368.
- Nambisan, A.K.; Lama, N.; Phan, T.; Swinfard, S.; Lama, B.; Smith, C.; Rajeh, A.; Patel, G.; Hagerty, J.; Stoecker, W.V.; et al. Deep Learning-Based Dot and Globule Segmentation with Pixel and Blob-Based Metrics for Evaluation. Intell. Syst. Appl. **2022**, 16, 200126.
- Stoecker, W.V.; Wronkiewiecz, M.; Chowdhury, R.; Stanley, R.J.; Xu, J.; Bangert, A.; Shrestha, B.; Calcara, D.A.; Rabinovitz, H.S.; Oliviero, M.; et al. Detection of Granularity in Dermoscopy Images of Malignant Melanoma Using Color and Texture Features. Comput. Med. Imaging Graph. **2011**, 35, 144–147.
- Stoecker, W.V.; Gupta, K.; Stanley, R.J.; Moss, R.H.; Shrestha, B. Detection of Asymmetric Blotches (Asymmetric Structureless Areas) in Dermoscopy Images of Malignant Melanoma Using Relative Color. Skin Res. Technol. **2005**, 11, 179–184.
- Argenziano, G.; Soyer, H.P.; Chimenti, S.; Talamini, R.; Corona, R.; Sera, F.; Binder, M.; Cerroni, L.; De Rosa, G.; Ferrara, G.; et al. Dermoscopy of Pigmented Skin Lesions: Results of a Consensus Meeting via the Internet. J. Am. Acad. Dermatol. **2003**, 48, 679–693.
- Kittler, H.; Marghoob, A.A.; Argenziano, G.; Carrera, C.; Curiel-Lewandrowski, C.; Hofmann-Wellenhof, R.; Malvehy, J.; Menzies, S.; Puig, S.; Rabinovitz, H.; et al. Standardization of Terminology in Dermoscopy/Dermatoscopy: Results of the Third Consensus Conference of the International Society of Dermoscopy. J. Am. Acad. Dermatol. **2016**, 74, 1093–1106.
- Marghoob, N.G.; Liopyris, K.; Jaimes, N. Dermoscopy: A Review of the Structures That Facilitate Melanoma Detection. J. Am. Osteopath. Assoc. **2019**, 119, 380–390.
- Tognetti, L.; Cartocci, A.; Bertello, M.; Giordani, M.; Cinotti, E.; Cevenini, G.; Rubegni, P. An Updated Algorithm Integrated With Patient Data for the Differentiation of Atypical Nevi From Early Melanomas: The IdScore 2021. Dermatol. Pract. Concept. **2022**, 12, e2022134.
- Jaimes, N.; Marghoob, A.A.; Rabinovitz, H.; Braun, R.P.; Cameron, A.; Rosendahl, C.; Canning, G.; Keir, J. Clinical and Dermoscopic Characteristics of Melanomas on Nonfacial Chronically Sun-Damaged Skin. J. Am. Acad. Dermatol. **2015**, 72, 1027–1035.
- Ciudad-Blanco, C.; Avilés-Izquierdo, J.A.; Lázaro-Ochaita, P.; Suárez-Fernández, R. Dermoscopic Findings for the Early Detection of Melanoma: An Analysis of 200 Cases. Actas Dermosifiliogr. **2014**, 105, 683–693.
- Shrestha, B.; Bishop, J.; Kam, K.; Chen, X.; Moss, R.H.; Stoecker, W.V.; Umbaugh, S.; Stanley, R.J.; Celebi, M.E.; Marghoob, A.A.; et al. Detection of Atypical Texture Features in Early Malignant Melanoma. Skin Res. Technol. **2010**, 16, 60–65.
- Lama, N.; Kasmi, R.; Hagerty, J.R.; Stanley, R.J.; Young, R.; Miinch, J.; Nepal, J.; Nambisan, A.; Stoecker, W.V. ChimeraNet: U-Net for Hair Detection in Dermoscopic Skin Lesion Images. J. Digit. Imaging **2022**.
- Cheng, B.; Erdos, D.; Stanley, R.J.; Stoecker, W.V.; Calcara, D.A.; Gómez, D.D. Automatic Detection of Basal Cell Carcinoma Using Telangiectasia Analysis in Dermoscopy Skin Lesion Images. Skin Res. Technol. **2011**, 17, 278–287.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241.
- Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support; Springer: Cham, Switzerland, 2018; pp. 3–11.
- Fan, T.; Wang, G.; Li, Y.; Wang, H. MA-Net: A Multi-Scale Attention Network for Liver and Tumor Segmentation. IEEE Access **2020**, 8, 179656–179665.
- Li, H.; Xiong, P.; An, J.; Wang, L. Pyramid Attention Network for Semantic Segmentation. In Proceedings of the British Machine Vision Conference (BMVC 2018), Newcastle, UK, 3–6 September 2018; BMVA Press: London, UK, 2019.
- Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the 36th International Conference on Machine Learning (ICML 2019), Long Beach, CA, USA, 9–15 June 2019; Volume 97, pp. 10691–10700.
- Yakubovskiy, P. Segmentation Models Pytorch. GitHub repository, **2020**.
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Proceedings of Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates, Inc.: New York, NY, USA, 2019; Volume 32.
- Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. ImageNet: A Large-Scale Hierarchical Image Database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
- Gama, J.; Brazdil, P. Cascade Generalization. Mach. Learn. **2000**, 41, 315–343.
- Wolpert, D.H. Stacked Generalization. Neural Netw. **1992**, 5, 241–259.
- Ting, K.M.; Witten, I.H. Issues in Stacked Generalization. J. Artif. Intell. Res. **1999**, 10, 271–289.
- Dai, J.; He, K.; Sun, J. Instance-Aware Semantic Segmentation via Multi-Task Network Cascades. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Breiman, L. Random Forests. Mach. Learn. **2001**, 45, 5–32.
- Sudre, C.H.; Li, W.; Vercauteren, T.; Ourselin, S.; Jorge Cardoso, M. Generalised Dice Overlap as a Deep Learning Loss Function for Highly Unbalanced Segmentations. In Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2017; Volume 10553, pp. 240–248.
- Zhao, R.; Qian, B.; Zhang, X.; Li, Y.; Wei, R.; Liu, Y.; Pan, Y. Rethinking Dice Loss for Medical Image Segmentation. In Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Virtual, 17–20 November 2020; pp. 851–860.
- Kingma, D.P.; Ba, J.L. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015.
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. **2011**, 12, 2825–2830.
- Calisto, F.M.; Nunes, N.; Nascimento, J.C. Modeling Adoption of Intelligent Agents in Medical Imaging. Int. J. Hum. Comput. Stud. **2022**, 168, 102922.
- Calisto, F.M.; Ferreira, A.; Nascimento, J.C.; Gonçalves, D. Towards Touch-Based Medical Image Diagnosis Annotation. In Proceedings of the 2017 ACM International Conference on Interactive Surfaces and Spaces, Brighton, UK, 17–20 October 2017; Association for Computing Machinery: New York, NY, USA, 2017; pp. 390–395.
- Calisto, F.M.; Santiago, C.; Nunes, N.; Nascimento, J.C. BreastScreening-AI: Evaluating Medical Intelligent Agents for Human-AI Interactions. Artif. Intell. Med. **2022**, 127, 102285.
- Hagerty, J.; Stanley, R.J.; Stoecker, W.V. Medical Image Processing in the Age of Deep Learning: Is There Still Room for Conventional Medical Image Processing Techniques? In Proceedings of VISIGRAPP 2017, the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Porto, Portugal, 27 February–1 March 2017; SciTePress: Porto, Portugal, 2017; Volume 4, pp. 306–311.
- Cassidy, B.; Kendrick, C.; Brodzicki, A.; Jaworek-Korjakowska, J.; Yap, M.H. Analysis of the ISIC Image Datasets: Usage, Benchmarks and Recommendations. Med. Image Anal. **2022**, 75, 102305.

**Figure 1.** Examples of irregular network structures: (**a**) irregular network; (**b**) irregular streaks; (**c**) angulated lines.

**Figure 2.** Overview of the steps in our implementation. The segmentation dataset is used to train and test the DL segmentation model. The trained segmentation model is then applied to the classification dataset to generate masks and extract hand-crafted features for level-1 of the cascade generalization pipeline. The classification dataset is also used to train a DL classification model (level-0), whose output probabilities are combined with the hand-crafted features to train the conventional classifiers that produce the final diagnosis.

**Figure 3.**IoU = 0.0; DL detects extra structures: globules and pseudopods in the periphery. Blue regions signify false-positive areas.

**Figure 4.**IoU = 0.29, close to the average IoU for all melanomas. DL detects peripheral irregular networks correctly. DL also detects normal network in the image center. Blue regions signify false-positive areas, green regions are false-negative, and teal-colored regions are true-positive.

**Figure 5.**IoU = 0.78; DL can detect the areas of darkened lines and constricted holes in this melanoma. The primary DL error is at the lower left, where DL detects normal network. Blue regions signify false-positive areas, green regions are false-negative, and teal-colored regions are true-positive.

**Figure 6.** The success of deep learning (DL) with and without domain knowledge (conventional learning). As the number of training cases increases, DL becomes more accurate than conventional learning at the learning-equilibrium (LE) number of training cases, and finally becomes as accurate as fusion learning at the fusion-equilibrium (FE) number of training cases. At some future number of training cases, the deep learning gap equals the fusion gap. From [49], with permission.

**Table 1.** Pixel-based metrics ^{1} for the best encoder for each architecture, selected by IoU score after applying a threshold of 0.5 to the model outputs.

Architecture | Encoder | Precision | Recall | F1-Score | Specificity | IoU
---|---|---|---|---|---|---
U-Net | EfficientNet-b4 | 0.363 | 0.523 | 0.428 | 0.984 | 0.273
U-Net++ | EfficientNet-b4 | 0.358 | **0.562** | **0.438** | 0.982 | **0.280**
MA-Net | EfficientNet-b5 | **0.383** | 0.461 | 0.419 | **0.987** | 0.265
PA-Net | EfficientNet-b5 | 0.345 | 0.501 | 0.408 | 0.983 | 0.257

^{1} Rounded to three significant digits. The highest value for each metric is in bold.

**Table 2.** Classification metrics for the deep learning (DL) models applied to lesion images and for the conventional models using only hand-crafted features, together with the ensembles of DL and conventional models obtained by averaging the output probabilities. The metrics were calculated after applying a threshold of 0.5 to the output probabilities.

Model | Type | Precision | Recall | F1-Score | Accuracy
---|---|---|---|---|---
EfficientNet-B0 | DL | 0.842 | 0.787 | 0.813 | 0.817
EfficientNet-B1 | DL | 0.847 | 0.787 | 0.816 | 0.820
ResNet50 | DL | 0.899 | 0.686 | 0.779 | 0.802
Ensemble | DL | 0.886 | 0.781 | 0.830 | 0.838
Decision Tree | Conventional | 0.744 | 0.757 | 0.751 | 0.745
Linear SVM | Conventional | 0.823 | 0.716 | 0.766 | 0.778
Neural Net | Conventional | 0.806 | 0.763 | 0.784 | 0.787
Random Forest | Conventional | 0.821 | 0.734 | 0.775 | 0.784
RBF SVM | Conventional | 0.748 | 0.669 | 0.706 | 0.718
Ensemble | Conventional | 0.833 | 0.740 | 0.784 | 0.793
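The ensembles in Table 2 average the output probabilities of the individual models before thresholding at 0.5 (soft voting). As a sketch (function and variable names are ours):

```python
import numpy as np

def average_ensemble(prob_list, threshold=0.5):
    """Soft-voting ensemble: average the melanoma probabilities of
    several models, then threshold to obtain the final predictions,
    as described for the DL and conventional ensembles in Table 2."""
    avg_probs = np.mean(np.stack(prob_list), axis=0)
    preds = (avg_probs >= threshold).astype(int)
    return avg_probs, preds
```

Averaging probabilities rather than hard votes lets a confident model outweigh an uncertain one on borderline lesions.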

**Table 3.** Classification metrics for cascade generalization, with DL models as level-0 models and conventional models as level-1 models. The metrics were calculated after applying a threshold of 0.5 to the probabilities.

Level-0 | Level-1 | Precision | Recall | F1-Score | Accuracy
---|---|---|---|---|---
EfficientNet-B1 | Ensemble | 0.832 | 0.852 | 0.842 | 0.838
EfficientNet-B1 | Neural Net | 0.811 | 0.811 | 0.811 | 0.808
EfficientNet-B1 | Random Forest | 0.847 | 0.852 | 0.850 | 0.847
EfficientNet-B1 | Decision Tree | 0.819 | 0.828 | 0.824 | 0.820
EfficientNet-B1 | RBF SVM | 0.796 | 0.692 | 0.741 | 0.754
EfficientNet-B1 | Linear SVM | 0.826 | 0.757 | 0.790 | 0.796
EfficientNet-B0 | Ensemble | 0.860 | 0.870 | 0.865 | 0.862
EfficientNet-B0 | Neural Net | 0.835 | 0.781 | 0.807 | 0.811
EfficientNet-B0 | Random Forest | 0.835 | 0.840 | 0.838 | 0.835
EfficientNet-B0 | Decision Tree | 0.830 | 0.811 | 0.820 | 0.820
EfficientNet-B0 | RBF SVM | 0.778 | 0.663 | 0.716 | 0.733
EfficientNet-B0 | Linear SVM | 0.841 | 0.751 | 0.794 | 0.802
ResNet50 | Ensemble | 0.849 | 0.834 | 0.842 | 0.841
ResNet50 | Neural Net | 0.830 | 0.781 | 0.805 | 0.808
ResNet50 | Random Forest | 0.825 | 0.893 | 0.858 | 0.850
ResNet50 | Decision Tree | 0.822 | 0.822 | 0.822 | 0.820
ResNet50 | RBF SVM | 0.772 | 0.663 | 0.713 | 0.730
ResNet50 | Linear SVM | 0.860 | 0.728 | 0.788 | 0.802
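In the cascade generalization pipeline behind Table 3, the level-0 DL model's output probability is appended to the hand-crafted feature vector before the level-1 conventional classifier is trained. A scikit-learn sketch (array shapes and names are illustrative, and the data here is synthetic, not the study's):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def cascade_features(handcrafted, dl_probs):
    """Level-1 input: hand-crafted features augmented with the
    level-0 model's output probability (one extra column)."""
    return np.column_stack([handcrafted, dl_probs])

# Synthetic stand-ins for the study's data.
rng = np.random.default_rng(0)
X_handcrafted = rng.normal(size=(100, 5))  # hand-crafted features
p_level0 = rng.uniform(size=100)           # level-0 DL probabilities
y = (p_level0 > 0.5).astype(int)           # toy labels for illustration

X_level1 = cascade_features(X_handcrafted, p_level0)
level1 = RandomForestClassifier(n_estimators=50, random_state=0)
level1.fit(X_level1, y)
```

Unlike stacked generalization, which feeds the level-1 model only the level-0 outputs, cascade generalization keeps the original features alongside them.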

**Table 4.** The table presents the classification metrics for the highest-accuracy model within each classification pipeline. CG stands for cascade generalization.

Model | Classification Pipeline | Precision | Recall | F1-Score | Accuracy |
---|---|---|---|---|---|
Conventional Ensemble | No CG | 0.833 | 0.740 | 0.784 | 0.793 |
DL Ensemble | No CG | 0.886 | 0.781 | 0.830 | 0.838 |
EfficientNet-B0 + Conventional Ensemble | CG | 0.860 | 0.870 | 0.865 | 0.862 |

**Table 5.** The table presents the classification metrics for the highest-recall model within each classification pipeline. The highest value for each metric is highlighted in bold. CG stands for cascade generalization.

Model | Classification Pipeline | Precision | Recall | F1-Score | Accuracy |
---|---|---|---|---|---|
Neural Net | No CG | 0.806 | 0.763 | 0.784 | 0.787 |
EfficientNet-B1 | No CG | **0.847** | 0.787 | 0.816 | 0.820 |
ResNet50 + Random Forest | CG | 0.825 | **0.893** | **0.858** | **0.850** |

**Table 6.** The table shows the ten most important features based on the mean accuracy decrease after feature permutation. The scores are the averaged mean importance scores of all conventional classifiers without cascade generalization. Note that objects are non-overlapping distinct contours (blobs) in the irregular network binary mask generated by the segmentation model.

Feature | Importance Score |
---|---|
Standard deviation of object’s color in L-plane inside lesion | 0.036 |
Mean of object’s color in L-plane inside lesion | 0.026 |
Standard deviation of object’s color in B-plane inside lesion | 0.026 |
Total number of objects inside lesion | 0.021 |
Mean of skin color in A-plane (excluding both lesion and irregular networks) | 0.019 |
Mean of object’s color in B-plane inside lesion | 0.018 |
Standard deviation of skin color in L-plane (excluding both lesion and irregular networks) | 0.016 |
Maximum width for all objects * | 0.014 |
Standard deviation of skin color in B-plane (excluding both lesion and irregular networks) | 0.012 |
Standard deviation of eccentricity for all objects | 0.011 |
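The permutation-importance procedure behind Tables 6 and 7 scores a feature by the mean drop in accuracy after shuffling that feature's column. A minimal sketch with scikit-learn's `permutation_importance`, on synthetic stand-in data where only the first feature is informative:

```python
# Permutation importance sketch: mean accuracy decrease per feature.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
# Only feature 0 drives the label (plus a little noise).
y = (X[:, 0] + 0.1 * rng.normal(size=300) > 0).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, scoring="accuracy",
                                n_repeats=10, random_state=0)

# Rank features by mean accuracy decrease; the informative feature
# should come out on top.
ranked = np.argsort(result.importances_mean)[::-1]
```

Averaging such importance scores across all conventional classifiers, as the captions describe, reduces the dependence of the ranking on any single model.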

**Table 7.** The table shows the ten most important features based on the mean accuracy decrease after feature permutation. The scores are the averaged mean importance scores of all conventional classifiers with cascade generalization. Note that objects are non-overlapping distinct contours (blobs) in the irregular network binary mask generated by the segmentation model.

Feature | Importance Score |
---|---|
Standard deviation of object’s color in L-plane inside lesion | 0.037 |
Deep Learning (level-0) probability output * | 0.030 |
Mean of skin color in A-plane (excluding both lesion and irregular networks) | 0.030 |
Total number of objects inside lesion | 0.021 |
Total number of objects remaining after applying erosion with circular structuring element of radius 3 * | 0.015 |
Mean of object’s color in L-plane inside lesion | 0.014 |
Standard deviation of skin color in B-plane (excluding both lesion and irregular networks) | 0.014 |
Standard deviation of object’s color in B-plane inside lesion | 0.014 |
Standard deviation of skin color in L-plane (excluding both lesion and irregular networks) | 0.013 |
Mean of object’s color in B-plane inside lesion | 0.012 |
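Two of the object features in Tables 6 and 7 can be illustrated directly: the object count is the number of connected components (blobs) in the irregular-network binary mask, and the erosion-based count repeats this after eroding with a circular structuring element of radius 3. The toy mask below is a stand-in for a real segmentation output, and `scipy.ndimage` is used here for illustration; the paper's implementation may differ:

```python
# Counting "objects" in a binary mask, before and after erosion.
import numpy as np
from scipy import ndimage

mask = np.zeros((64, 64), dtype=bool)
mask[5:15, 5:15] = True      # a 10x10 blob that survives erosion
mask[30:34, 30:34] = True    # a 4x4 blob that erosion removes

# Circular structuring element of radius 3 (7x7 neighborhood).
yy, xx = np.mgrid[-3:4, -3:4]
selem = (xx**2 + yy**2) <= 3**2

# "Total number of objects": connected components in the mask.
_, n_objects = ndimage.label(mask)

# "Objects remaining after erosion with circular SE of radius 3":
# thin, network-like structures vanish, so this count separates
# thick blobs from fine reticular strands.
eroded = ndimage.binary_erosion(mask, structure=selem)
_, n_after_erosion = ndimage.label(eroded)
```

The color features in the tables are computed analogously, by sampling the L, A, and B planes of the CIELAB image under these object masks (inside the lesion) or under the skin mask that excludes both the lesion and the irregular networks.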

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Nambisan, A.K.; Maurya, A.; Lama, N.; Phan, T.; Patel, G.; Miller, K.; Lama, B.; Hagerty, J.; Stanley, R.; Stoecker, W.V.
Improving Automatic Melanoma Diagnosis Using Deep Learning-Based Segmentation of Irregular Networks. *Cancers* **2023**, *15*, 1259.
https://doi.org/10.3390/cancers15041259
