Diagnostics of Melanocytic Skin Tumours by a Combination of Ultrasonic, Dermatoscopic and Spectrophotometric Image Parameters

Dermatoscopy, high-frequency ultrasonography (HFUS) and spectrophotometry are promising quantitative imaging techniques for the investigation and diagnostics of cutaneous melanocytic tumors. In this paper, we propose a hybrid technique and automatic prognostic models that combine the quantitative image parameters of ultrasonic B-scan images, dermatoscopic images and spectrophotometric images (melanin, blood and collagen) to increase accuracy in the diagnostics of cutaneous melanoma. The extracted sets of quantitative parameters and features of the dermatoscopic, ultrasonic and spectrophotometric images were used to develop four different classification models: logistic regression (LR), linear discriminant analysis (LDA), support vector machine (SVM) and Naive Bayes. The results were compared with those obtained by combining only two of the three techniques. The proposed technique achieved reliable differentiation between melanocytic naevus and melanoma. With the proposed method, an accuracy of more than 90% was estimated for LR, LDA and SVM.


Introduction
In Europe, cutaneous melanoma (CM) is the fifth most common type of cancer, with an age-standardized incidence rate (ASR) of 15.0 per 100,000 person-years. Northern Europe displays the largest ASR mortality in the region (3.8), with an incidence of 23.4 [1]. The CM incidence rate shows high worldwide variability: it ranges from 0.30 in South-Central Asia to 33.6 in Australia and New Zealand [2]. The US Preventive Services Task Force does not currently recommend CM screening in the general population [3], and the introduction of such programs in Germany yielded inconclusive results [4]. A systematic review highlighted the key risk factors for CM screening initiation in high-risk individuals: a large number of melanocytic or dysplastic naevi, a family history of melanoma and light (Fitzpatrick I and II) skin types [5].
CM can be classified according to clinical and pathological features, based on the updated American Joint Committee on Cancer (AJCC) staging system [6]. Additional genetic classification is frequently used in research settings and has an expanding role in treatment selection: BRAF, Ras, NF-1, wild-type and other genetic subtypes have been identified [7]. Excision of the primary tumor remains essential in diagnosing CM and includes histopathological measurement of tumor thickness according to Breslow to the nearest 0.1 mm [6,8]. Although new CM management guidelines highlight [...] the sets of informative quantitative parameters from images of these three imaging technologies to train the classifiers. To shorten the computation time, combined sets of the most sensitive quantitative parameters extracted from the diagnostic images are used for the classification instead of the whole images.

Diagnostics 2020, 10, x FOR PEER REVIEW 3 of 15

Description of the Proposed Technique
The flow chart describing the proposed diagnostic technique of CM is presented in Figure 1. There are three sets of acquired spectrophotometric images (melanin, blood and collagen), one set of dermatoscopic images and one set of HFUS B-scan images for the groups of CM and melanocytic naevus (MN). The images were processed in Matlab R2020a (The MathWorks Inc., Natick, MA, USA) using purpose-built algorithms. In the case of the ultrasonic B-scan, the region of interest (ROI) is extracted from the B-scan data by tracking the front and back surface reflections from the lesion boundaries. Sets of quantitative parameters were extracted from the different diagnostic images (dermatoscopic, spectrophotometric and ultrasonic) and used as input to the classifiers. Four different machine-learning algorithms for binary classification (CM or MN), namely logistic regression (LR), linear discriminant analysis (LDA), support vector machine (SVM) and Naive Bayes, were applied to evaluate the diagnostic accuracy. The results of the automatic classification (accuracy, sensitivity, specificity, precision, Matthews correlation coefficient (MCC) and area under the ROC curve) are compared with the results of the histological examination. The proposed models could be used for reliable differentiation between MN and CM.

Experimental Analysis and Clinical Measurements
In total, diagnostic images (dermatoscopic, spectrophotometric and ultrasonic B-scans) of skin lesions were acquired for 100 different patients. The age of the examined patients ranged from 17 to 87 years, with an average of 52.85 ± 17.33 years. Nine cases were excluded from the study because of artifacts within the images. Therefore, further analysis was performed for 91 cases consisting of 50 naevi and 41 melanomas. All diagnoses were histopathologically confirmed by two experienced dermatopathologists. A third pathologist was consulted if there was a discrepancy between the two pathologists. The MN thickness ranged from 0.2 mm to 2.7 mm, with an average value of 0.78 ± 0.54 mm; correspondingly, the CM thickness ranged from 0.22 mm to 3.15 mm, with an average value of 1.0 ± 0.7 mm.
Ultrasonic B-scan images were acquired with a DUB-USB ultrasound imaging system ("Taberna pro medicum", Germany) equipped with a focused transducer with a central frequency of 22 MHz. The sampling frequency was 100 MHz. The transducer was scanned over a length of up to 12.8 mm with a scanning step of 33 µm. The depth of the imaging region was 8 mm, as determined by the velocity of ultrasound (1580 m/s) and the length of the signal acquisition window in the time domain. Afterwards, the B-scan data were transferred to the computer for further processing.
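As a plausibility check of the acquisition geometry stated above, the following Python sketch derives the approximate number of A-scans per B-scan and the number of samples per A-scan. It assumes pulse-echo operation, so that the acquisition window equals the round-trip time 2d/c; this is implied rather than stated in the text, and all function and constant names are illustrative.

```python
# Acquisition values as stated in the text.
SCAN_LENGTH_MM = 12.8        # lateral scanning length
SCAN_STEP_UM = 33.0          # lateral scanning step
DEPTH_MM = 8.0               # depth of the imaging region
SOUND_SPEED_M_S = 1580.0     # assumed velocity of ultrasound
SAMPLING_FREQ_HZ = 100e6     # sampling frequency


def a_scans_per_b_scan(length_mm, step_um):
    """Approximate number of lateral steps (A-scans) in one B-scan."""
    return int(round(length_mm * 1000.0 / step_um))


def acquisition_window_s(depth_mm, speed_m_s):
    """Round-trip travel time for the given depth (pulse-echo assumption)."""
    return 2.0 * depth_mm * 1e-3 / speed_m_s


def samples_per_a_scan(depth_mm, speed_m_s, fs_hz):
    """Number of time samples needed to cover the imaging depth."""
    return int(round(acquisition_window_s(depth_mm, speed_m_s) * fs_hz))


def axial_sample_spacing_um(speed_m_s, fs_hz):
    """Depth covered by one sample interval, in micrometres."""
    return speed_m_s / (2.0 * fs_hz) * 1e6
```

Under these assumptions, one B-scan consists of roughly 388 A-scans, each covering a window of about 10.1 µs (about 1013 samples at 100 MHz), with an axial sample spacing of 7.9 µm.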
Optical dermatoscopic and spectrophotometric images were acquired using a SimSys© spectrophotometer (MedX Health Corp., Canada) operating in combined dermatoscopy-spectrophotometry mode and were likewise transferred to the computer for further analysis. The diameter of the imaging region was 11 mm. The diagnosis of the skin lesions was confirmed by routine histopathology after surgical excision.
All images were acquired at the Department of Skin and Venereal Diseases of the Lithuanian University of Health Sciences. The study was approved by the regional ethics committee (No. P3-BE-2-25/2009, Date: 14 November 2017). Written informed consent was obtained from all patients before examination and surgical excision of the lesion.

Data Processing
All acquired ultrasonic, dermatoscopic and spectrophotometric images were processed in Matlab R2020a. First, each ultrasonic A-scan signal of a B-scan image was interpolated to achieve four-times oversampling and to reduce distortions of the digitized signal. The front skin surface detection limits and the maximum depth to analyze were selected manually for each B-scan. The thickness of the epidermis was set to 0.1 mm throughout the data processing. It should be noted that typical skin tumors do not possess pronounced boundaries between the tumor and the surrounding healthy tissue, which is why some false boundary points were detected. A second-order low-pass Butterworth filter was used for the front and back surface contour follower. An optimal least-squares polynomial approximation was then performed to detect the final boundaries: the polynomial order, in the range from 1st to 7th, was selected as the one giving the minimal sum of differences between the polynomial and the detected data points. Following [40], this sum of differences is denoted D_approx, where a_n and y[m] are the coefficients of a polynomial of degree n and x[m] is the polynomial term.
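The order selection described above can be sketched in Python as follows. This is not the authors' Matlab implementation: it assumes that the "sum of differences" is the sum of absolute residuals, that the boundary positions have been normalized (for example to the interval [-1, 1]) for numerical stability, and all function names are hypothetical.

```python
def polyfit_ls(x, y, degree):
    """Least-squares coefficients a_0..a_degree via the normal equations."""
    n = degree + 1
    ata = [[sum(xi ** (r + c) for xi in x) for c in range(n)] for r in range(n)]
    aty = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(ata[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (aty[r] - tail) / ata[r][r]
    return coeffs


def polyval(coeffs, xi):
    """Evaluate a_0 + a_1*xi + ... using Horner's scheme."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * xi + a
    return result


def sum_of_differences(x, y, coeffs):
    """Assumed criterion: sum of absolute residuals over all boundary points."""
    return sum(abs(yi - polyval(coeffs, xi)) for xi, yi in zip(x, y))


def select_order(x, y, orders=range(1, 8)):
    """Return (order, criterion, coefficients) with the smallest criterion."""
    best = None
    for order in orders:
        coeffs = polyfit_ls(x, y, order)
        d = sum_of_differences(x, y, coeffs)
        if best is None or d < best[1]:
            best = (order, d, coeffs)
    return best
```

Because in-sample residuals rarely increase with polynomial order, a criterion of this kind tends to favour the upper end of the range; the source does not state whether any additional constraint was applied.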
In the next step, the average RMS value of the B-scan spectrum was calculated. To smooth the average RMS spectra, zero-phase digital filtering was applied. Thereafter, the front-surface reflections and reverberations (second-order reflections) were removed. To locate the back-surface reflections (bottom of CM), the same process was repeated; however, thresholding was applied to the data by calculating the moving average filter rectification (absolute values > 0.7). The thickness of CM/MN was taken as the maximum distance between the front and back surfaces. After extracting the region of interest (ROI) from each B-scan, the resulting regions were saved as images for further processing. The detected boundaries and the extracted images of interest are presented in Figure 2 for both MN and CM.
Afterwards, a quantitative analysis of the extracted images was carried out to characterize the spatial features. Overall, thirteen quantitative parameters were computed to provide input information for classification. After converting the image into grayscale, energy (Em) and entropy (Ep) were calculated from the histogram [41]. Energy is a numerical descriptor of uniformity and lies between 0 and 1. Entropy is a statistical measure of uncertainty and randomness and lies between 0 and log2(M), where M is the total number of levels in the histogram. The other quantitative parameters (mean, standard deviation, root mean square (RMS), variance, smoothness, skewness and kurtosis) of the intensity value distribution were computed to characterize the global distribution of intensity [42]. To characterize the texture of the image, the grayscale co-occurrence matrix (GSCM) was created from the image [43]. Afterwards, four quantitative parameters (contrast, energy, correlation and homogeneity) were computed from the GSCM.
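The histogram-based and co-occurrence-based parameters can be illustrated with the following Python sketch. It is not the authors' Matlab code: the pixel offset (one pixel to the right), the number of gray levels and the function names are assumptions, and the texture definitions are the commonly used ones (as in Matlab's graycoprops). Images are represented as nested lists of integer gray levels.

```python
import math


def histogram_probabilities(image, levels):
    """Normalized gray-level histogram of a 2-D image (list of rows)."""
    counts = [0] * levels
    total = 0
    for row in image:
        for value in row:
            counts[value] += 1
            total += 1
    return [c / total for c in counts]


def histogram_energy(p):
    """Uniformity descriptor; 1.0 for a constant image."""
    return sum(pi * pi for pi in p)


def histogram_entropy(p):
    """Shannon entropy in bits; at most log2(number of levels)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for the pixel offset (dy, dx)."""
    matrix = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    pairs = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
                pairs += 1
    return [[v / pairs for v in row] for row in matrix]


def texture_features(p):
    """Contrast, energy, homogeneity and correlation of a normalized matrix."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        # Correlation is undefined for a constant image.
        "correlation": cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else float("nan"),
    }
```

For a two-level checkerboard pattern, for instance, the histogram entropy reaches its upper bound log2(2) = 1 and the horizontal correlation is -1, since every pixel differs from its right-hand neighbour.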
In contrast to the ultrasonic B-scan images, no initial preprocessing of dermatoscopic and spectrophotometric images was performed. The acquired images of CM and MN by dermatoscopy and spectrophotometry are shown in Figures 3 and 4, respectively.
The same number (thirteen) of quantitative parameters as computed for the ultrasonic B-scan images was computed from these images as well. However, not all of these parametric values were statistically significant enough to be used further for classification. Only those parameters that were statistically significant (p < 0.05 by t-test) in each case (optical dermatoscopic images, spectrophotometric images and ultrasonic B-scan images) were selected as input data for the binary classification algorithm. The selected sets of parameters are presented in Table 1.

Table 1. Numbers of selected quantitative parameters to be used for binary classification of images acquired using different imaging technologies (1-entropy, 2-energy, 3-contrast, 4-correlation, 5-energy from GSCM, 6-homogeneity, 7-mean, 8-standard deviation, 9-RMS, 10-variance, 11-smoothness, 12-kurtosis, 13-skewness) (x denotes the statistically significant (p < 0.05) parameter).

Type of Imaging Technology and Images
Numbers of Selected Quantitative Parameters (p < 0.05) to be Used for Classification
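The significance-based parameter selection can be sketched in Python as follows. The source only states that a t-test with p < 0.05 was used; the Welch (unequal-variance) variant, the normal approximation of the two-sided p-value (reasonable for group sizes such as 41 and 50, optimistic for very small samples) and all function names are assumptions made for this illustration.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value using a normal approximation of the t distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def select_parameters(params_cm, params_mn, alpha=0.05):
    """Keep the parameters whose CM and MN groups differ with p < alpha.

    params_cm / params_mn map a parameter name to its values per lesion.
    """
    selected = []
    for name in params_cm:
        t, _ = welch_t(params_cm[name], params_mn[name])
        if approx_two_sided_p(t) < alpha:
            selected.append(name)
    return selected
```

Applied per imaging technology, a function of this kind would return the subset of the thirteen parameters marked with x in Table 1.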

Classification Algorithm
After computing the parameters from the mentioned imaging techniques, the next step was to train the classifiers and evaluate their performance. Many classification methods (e.g., K-Nearest Neighbour, Decision Tree, Support Vector Machine, Artificial Neural Network (ANN), Neuro-Fuzzy, Fuzzy C-Mean (FCM), Naive Bayes, clustering and linear regression) have been used in the computer-aided diagnosis (CAD) of tissue affected by cancer [44,45].
In this work, four classification models (LR, LDA, Naive Bayes and SVM) have been used for binary classification (CM or MN) and their performances were compared.
One of the most common machine learning techniques for data classification is SVM [46]. Its basic concept relies on decision planes that separate objects belonging to different classes. If the data can be separated linearly, the simplest SVM is linear. If the data cannot be separated linearly, a kernel SVM with a radial basis function (RBF) can be utilized [47] to classify the data into CM or MN. In our case, an SVM with the RBF kernel was used. In the logistic regression model, the predictor variables can be scale-dependent and quantitative, whereas the dependent variable is categorical (membership or non-membership) [48]. The mechanism on which logistic regression works is called the logit. In comparison with multiple regression, logistic regression requires fewer assumptions and is hence more flexible. Another classification model used in this work is linear discriminant analysis (LDA), or Fisher discriminant analysis, which is a common technique for dimensionality reduction and classification [49]. The method aims to maximize the ratio of the between-group variance to the within-group variance [50]. Naive Bayes classifiers are simple probabilistic classifiers based on the application of Bayes' theorem with strong independence assumptions between the features [51,52]. The Naive Bayes classifier is highly scalable, since the number of parameters it requires for learning is linear in the number of features. During the training stage, 10-fold cross-validation was used to build all classifier models and to compute the optimized parameters.
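The 10-fold cross-validation mentioned above can be illustrated with a short Python sketch that generates the training and test indices for each fold. The source does not say whether the folds were stratified by diagnosis or how the cases were shuffled; stratification, the fixed seed and the function name are assumptions of this example.

```python
import random


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) for k class-balanced folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the cases of each class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[(position + offset) % k].append(index)
        offset += len(indices)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, test
```

With the 41 CM and 50 MN cases of this study, each test fold would contain four or five melanomas and five naevi, and every case would be used for testing exactly once.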
The standard parameters were computed to measure the performance of these four models for binary classification. Five threshold-dependent parameters (sensitivity (S_e), specificity (S_p), accuracy (A_c), precision (P_r) and Matthews correlation coefficient (MCC)) and one threshold-independent parameter (area under the ROC curve (AUROC)) were employed to measure the performance.
The threshold-dependent parameters can be expressed by the following mathematical equations [53]:

$$S_e = \frac{TP}{TP + FN}, \qquad S_p = \frac{TN}{TN + FP}, \qquad A_c = \frac{TP + TN}{TP + TN + FP + FN},$$

$$P_r = \frac{TP}{TP + FP}, \qquad MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},$$

where S_e, S_p, A_c, P_r and MCC denote the sensitivity, specificity, accuracy, precision and Matthews correlation coefficient, respectively, and FP, FN, TP and TN denote the numbers of false positives, false negatives, true positives and true negatives, respectively. The standard ROC curve was generated by plotting the sensitivity against the false positive rate at different thresholds. Afterwards, the area under the ROC curve was estimated to obtain the AUROC parameter.
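The performance measures named above can be computed with the following Python sketch, which uses their standard definitions. The AUROC is estimated here through the equivalent rank formulation (the probability that a randomly chosen CM case receives a higher score than a randomly chosen MN case), which may differ from the numerical integration used by the authors; the function names are illustrative.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Threshold-dependent measures from the confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def auroc(scores, labels):
    """Area under the ROC curve; labels are 1 for CM and 0 for MN."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))
```

As a worked example, the counts TP = 35, FN = 6, TN = 47 and FP = 3 are not given in the source but can be inferred from the 41 melanomas, the 50 naevi and the percentages reported for the SVM in case 1; they reproduce a sensitivity of 85.37%, a specificity of 94.00%, an accuracy of 90.11%, a precision of 92.11% and an MCC of 0.801.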

Results and Discussion
The performance achieved by combining all three imaging techniques (dermatoscopy, spectrophotometry and ultrasound) is evaluated on the basis of the sets of quantitative parameters extracted from the images provided by each imaging technique, as discussed in Section 4. The results are discussed sequentially in this Section to show the improvement obtained with the combination of all three imaging techniques over the combination of any two of them.

Case 1: Combining Quantitative Parameters from Dermatoscopic and Spectrophotometric Images
First, the classification models were developed by combining only the quantitative parameters computed from the dermatoscopic and spectrophotometric (melanin, blood and collagen) images. The results of the classifiers are presented in Table 2. The highest accuracy (90.11%), sensitivity (85.37%), specificity (94.00%), precision (92.11%), MCC (0.801) and AUROC (0.972) were achieved with SVM in comparison with all other models. Moreover, the accuracy was more than 74% for all classifiers.

Case 2: Combination of Dermatoscopy and HFUS Imaging Techniques
In the next step, the classification models were developed by combining the quantitative parameters of dermatoscopic and ultrasonic B-scan images. The classification results for this combination are presented in Table 3. The performance of SVM (Table 3) was better in this case than with the previous combination (case 1, Table 2) for all statistical parameters except the sensitivity and the AUROC. The SVM model again showed the highest accuracy (91.21%), sensitivity (80.49%), specificity (100%), precision (100%), MCC (0.833) and AUROC (0.961) among all models, and the accuracy was more than 76% for all classifiers. The performance of the Naive Bayes model (accuracy and MCC) also improved compared with the previous combination (case 1, Table 2). However, the performance of LR and LDA (Table 3) was reduced in comparison with the combination of spectrophotometric and optical dermatoscopic imaging techniques (case 1, Table 2).

Case 3: Combination of Spectrophotometry and HFUS Imaging Techniques
In this case, the quantitative parameters (Table 1) of both spectrophotometric and ultrasonic B-scan images are utilized to develop the classification models. The results of the classifiers are presented in Table 4. More than 85% accuracy and sensitivity were achieved for LR, LDA and SVM. Considering all the statistical parameters obtained from the classifiers, SVM shows the highest performance in this case, with an accuracy of 95.60% and an MCC of 0.912. As shown in Table 4, the performance of SVM was better than in case 1 (Table 2) and case 2 (Table 3). The AUROC for all classifiers is also higher in this case than in case 1 and case 2, except for LDA, for which the AUROC (0.905) is slightly lower than in case 1 (0.906). The performance of Naive Bayes remains the lowest.

Case 4: Combination of Dermatoscopy, Spectrophotometry and HFUS Imaging Techniques
Finally, the quantitative parameters (Table 1) of all three types of images (i.e., optical dermatoscopic, spectrophotometric and ultrasonic B-scan) are utilized in the classification, and the results are presented in Table 5. In comparison with the three cases mentioned above, a higher accuracy of more than 90% was achieved by three classification models (LR, LDA and SVM). Moreover, the MCC and AUROC for all classifiers were highest in this case compared with the previous three cases. The higher values of the other parameters (sensitivity, specificity and precision) also signify the improvement in performance obtained by combining three imaging techniques instead of any two of them. SVM clearly outperformed all other classifiers; on the other hand, Naive Bayes showed the worst performance. SVM is proven to be optimal for linearly separable cases, and its strategy of determining the maximum-margin hyperplane is one of the best for reducing the prediction error [54]. In general, SVM is better for a two-class classification problem with a smaller number of features [55,56]. However, Naive Bayes can handle more features easily. Although Naive Bayesian classification is an effective model for diagnosing melanoma, the decision tree algorithm is not well suited to this domain [57]. It is important to consider that SVM is not so popular with large data sets, as it requires a significant amount of training time; however, in our research, this was not the case [56]. In comparison with the latest research work on SVM classification for melanoma (85.19% [58], 92.1% [59], 96% [60], 90% [61], 97.32% [62]), we achieved 98.9% accuracy with the proposed technique.

Conclusions
In this study, a novel diagnostic system combining three different non-invasive medical imaging techniques (optical dermatoscopy, spectrophotometry and high-frequency ultrasound) is proposed for the reliable differentiation of CM and MN. Because only a limited number of diagnostic images was available, and in order to expedite the processing, sets of the most sensitive quantitative parameters were extracted from the images and used as input to the classifiers instead of the images themselves. The binary classification results obtained by combining the three imaging techniques showed the highest accuracy, more than 90% for the LR, LDA and SVM classifiers, which could not be achieved by combining only two imaging techniques. The obtained results reveal that SVM is the most suitable classification model for the detection of CM, with an accuracy of 98.9%, MCC of 0.978, sensitivity of 97.5%, AUROC of 0.999 and specificity and precision of 100%. The second-best classification model was LR, with an accuracy of 92.3%. The proposed clinical decision support system can supplement the non-invasive diagnostic methods already existing in clinical practice. Furthermore, big data analysis and deep learning neural networks (e.g., convolutional neural networks) could be used to implement a more accurate diagnostic system based on this approach in the future, once the required larger number of diagnostic images has been acquired with the aforementioned imaging techniques.