Skin Cancer Diagnosis Based on Neutrosophic Features with a Deep Neural Network

Recent years have seen an increase in the total number of skin cancer cases, and this number is projected to grow exponentially. This paper proposes a computer-aided diagnosis system for the classification of malignant lesions, in which the acquired image is first pre-processed using novel methods. Digital artifacts such as hair follicles and blood vessels are removed, and the image is then enhanced using a novel method of histogram equalization. The pre-processed image then undergoes the segmentation phase, where the suspected lesion is segmented using the neutrosophic technique. The segmentation method employs a thresholding-based method along with a pentagonal neutrosophic structure to form a segmentation mask of the suspected skin lesion. The paper proposes a deep neural network based on Inception and residual blocks, with a softmax block after each residual block, which makes the network wider and helps it learn the key features more quickly. The proposed classifier was trained, tested, and validated on the PH2, ISIC 2017, ISIC 2018, and ISIC 2019 datasets. The proposed segmentation model yields an accuracy of 99.50%, 99.33%, 98.56%, and 98.04% for these datasets, respectively. These datasets are augmented to form a total of 103,524 images for training, which enables the classifier to produce enhanced classification results. Our experimental results confirm that the proposed classifier yields an accuracy of 99.50%, 99.33%, 98.56%, and 98.04% for PH2, ISIC 2017, ISIC 2018, and ISIC 2019, respectively, which is better than most existing classifiers.


Introduction
Technical developments and scientific innovations in the field of medical science have improved the surgical condition of patients, thereby decreasing the mortality rate and increasing patient satisfaction. Although new discoveries have aided the health care system with advanced diagnosis methods for several diseases, accurate diagnosis and timely treatment of cancer patients remain exacting and challenging for researchers all around the world. Among the several types of cancer, skin cancer has been the most frequently diagnosed, as stated by the National Institute of Skin Cancer (NISC) [1]. Supporting the statement of the NISC, the World Health Organization (WHO) also reported that skin cancer accounts for one-third of overall cancer cases, which seem to be increasing exponentially with time [2]. Skin, being the outermost and largest sense organ of the human body, is prone to several allergies and fatal infections due to maximum exposure to harmful ultraviolet (UV) radiation from the sun. The skin consists of three successive and overlapping protective layers of epithelial tissue, namely the dermis, epidermis, and hypodermis; these tissues guard the human body against UV radiation. A color pigment termed melanin occupies the space at the junction of the dermis and epidermis layers. Melanin is responsible for the notable coloration of the iris, skin, and hair. This pigment is produced by cells called melanocytes, based on different factors such as geographical location, climate, and exposure to the sun or tanning devices. Melanin undergoing the classification phase, which increases accuracy of classification. Ref. [10] employed a generalized class of fractional partial differential equations for the enhancement of skin lesions.
A pixel's fractional mean-based image enhancement algorithm was employed by [11] for better image splicing detection, which enhances the quality of image, thereby enabling the classifiers to detect more features. Ref. [12] proposed a method for diagnosis of skin cancer using convolutional neural network for smartphone, which marks a significant growth in the field of digital diagnosis of melanoma. Ref. [13] used ESRGAN for pre-processing of skin lesion; thereafter, they experimented with several CNN models such as Resnet 50, Inception net, and Inception Resnet.
In this article, we propose a novel computer-aided diagnosis (CAD) system for digitally diagnosing the suspected lesion. A novel and effective pre-processing phase is introduced, which not only removes artifacts (such as hair follicles, blood vessels, dermoscopic rulers, and frames) and reflections from the acquired images, but also enhances the image by automatically adjusting its contrast and brightness. Pre-processing is the fundamental and most important phase of a CAD system, as the presence of artifacts such as hair follicles and blood vessels affects the visual inspection of the lesion and hampers the accuracy of diagnosis; digitally removing such artifacts therefore not only spares the pain and labor of physically removing them from the suspected lesion but also improves the diagnosis outcome. Inspired by uncertainty theory, we incorporate the vagueness principle into image segmentation to discard the ambiguous portion of a segmented image and to capture the best-fitting region more efficiently. Some crucial questions arise: if the image is large, how can we cut out or capture the actual feasible region in a logical way using the pixel values? In the case of segmentation, how can we capture the hesitation portion using uncertainty logic? Most importantly, how can we link the matrix representation of a segmented image with the pixel values? And in the iteration process, which mathematical theory should be used to recover the original affected region? This article deploys a determinant-based image segmentation method, namely an absolute value computational algorithm that uses the pixel values of each of the three channels to detect the most affected region of the captured image. Initially, we follow an algorithm based on 3 × 3 determinant constructions of the segmented image using the pixel values.
We scale the whole image down into a finite number of determinants and calculate the absolute values for each of the three channels. Furthermore, we set the threshold value using the geometric mean, which roughly captures the affected region in the original image. After that, we incorporate neutrosophic theory to judge the exact affected portion after the first phase of segmentation. A neutrosophic number can capture all three components of an uncertain number, namely the truth, falsity, and hesitation portions, efficiently and logically. Here, we utilize the pentagonal linguistic neutrosophic number in the second phase of segmentation to set the exact threshold value, so that the segmentation yields a more accurate approximation. Then, the segmented region is used for classification of the lesion with the proposed classifier in the Keras [14] framework. The model is trained on the PH2 [15], ISIC 2017 [16], ISIC 2018 [17], and ISIC 2019 [18] datasets of dermoscopic images. A huge set of dermoscopic data along with dense layers of classifiers yields an effective score for sensitivity and specificity. The proposed segmentation and classification method is evaluated on four publicly available datasets: PH2, ISIC 2017, ISIC 2018, and ISIC 2019. The first two have three categories of skin lesion images including melanoma, while the last two have seven and eight categories, respectively. Moreover, classification accuracy increases due to data tuning techniques such as data augmentation and balancing. The experimental results also show the importance of the segmentation phase prior to classification, as it plays a vital role in improving the accuracy score. The diagnostic performance is critically affected by the methods of data augmentation, rebalancing, and pre-segmentation.
This article focuses on properly rebalancing the training data such that the performance of diagnosis can be enhanced. The pre-segmentation phase is also considered a prerequisite step, as the important features are targeted during classification of the image and the irrelevant features of the surrounding tissue/skin (field of view) are segmented out of the lesion images. We propose a neural network that employs inception and residual blocks with a softmax layer after each residual block, which makes the network wider; this paper compares the classification performance of various well-known deep learning classifiers and states that the proposed network achieves the highest accuracy score. The results achieved in this article provide a guideline for the application of CAD systems.
The fundamental motive of this research article is to develop a trustworthy CAD method based on the principles of neutrosophy and deep learning for precise segmentation and classification of skin lesions. Section 2 presents the materials and methods: the preliminaries of neutrosophic numbers, pre-processing using novel methods for the removal of digital artifacts and image enhancement, segmentation using neutrosophy, and a novel architecture for classification of the lesion. The experimental results are presented in Section 3. The paper ends with a discussion and conclusion in Sections 4 and 5, respectively.

Materials and Methods
This section explains the proposed method in detail, which uses different stages of computer-aided diagnosis, namely pre-processing, segmentation, lesion localization, and classification. Novel methods for pre-processing of the lesion are introduced in this paper, which not only enhance the image quality but also digitally remove the artifacts. A unique segmentation phase is proposed, using neutrosophy and determinants to calculate the threshold value for segmentation of the lesion. Thereafter, a modified classifier is employed to classify the skin lesion into a malignant or non-malignant class. A complete flowchart of the proposed method is shown in Figure 1. The training datasets from the PH2, ISIC 2017, ISIC 2018, and ISIC 2019 repositories are used to train the proposed deep neural network. Training data are augmented before being passed to the neural network. Training data of differing resolutions are difficult to train a neural network on; thus, the images are resized to 512 × 512 pixels. Thereafter, digital artifacts such as hair follicles, dermoscopic ruler and frame marks, and marks of blood vessels are digitally added to the training images so that the accuracy of diagnosis in the real world (for hold-out datasets) increases. Digital noise is also added to each of the images so that the model performs well in a real-world scenario. The images are then rotated by 90, 180, and 270 degrees; therefore, the training dataset is augmented to four times its original size. Then, the augmented images are passed into the proposed neural network with specific hyperparameters (more about the proposed model and hyperparameters is given in Section 2.4.1, Implementation and Training). The test data from the public repositories of PH2, ISIC 2017, ISIC 2018, and ISIC 2019 are used to assess the performance of the model.
The test data are pre-processed by the proposed pre-processing method, which includes the removal of digital noise and artifacts. In the next stage, the image is enhanced by histogram equalization. The pre-processed image then undergoes the segmentation phase, where the lesion is segmented using weighted threshold calculation followed by neutrosophic-based threshold calculation. Finally, the dermoscopic lesion image is classified as either melanoma or non-melanoma using the proposed model.

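Neutrosophic Set [19]: A set $A_{Nue}$ is termed a neutrosophic set if $A_{Nue} = \{\langle x; [\pi_{A_{Nue}}(x), \sigma_{A_{Nue}}(x), \eta_{A_{Nue}}(x)] \rangle : x \in X\}$, where $\pi_{A_{Nue}}(x) : X \to [0^-, 1^+]$ is termed the truth function, $\sigma_{A_{Nue}}(x) : X \to [0^-, 1^+]$ is called the hesitant function, and $\eta_{A_{Nue}}(x) : X \to [0^-, 1^+]$ is called the falsity function. These functions satisfy the following relation: $0^- \le \pi_{A_{Nue}}(x) + \sigma_{A_{Nue}}(x) + \eta_{A_{Nue}}(x) \le 3^+$.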

Pre-Processing
In order to remove noise from the acquired input image f[x, y], we apply a filter $n_{\sigma_s}$ to the image such that the noise at pixel (i, j) is flattened. Applying the filter over the image yields a modified image m[i, j]. In order to keep the energy of the filter equal to 1, a weighting function $W_b$ is created to sum the product of the spatial filter and the brightness filter. However, due to the variation in the intensity of the image, a global filter alone cannot be employed to reduce digital image noise. Moreover, a large value of $\sigma_s$ smooths the image to a great extent, which might lose a few important features of the image, while a low value of $\sigma_s$ might not be effective for noise removal. Thus, a dynamic filter is required for enhanced noise filtering. The proposed filter works on the intensity values of the image, and the filter is modified for each pixel. If the absolute difference between the intensity at the center [m, n] and at [i, j] exceeds a threshold, the brightness weight is set to 0 to avoid the loss of features. With this modified brightness filter, we obtain the modified equation for the removal of unwanted noise. As mentioned in Equation (9), digital noise is removed and the image is smoothed without losing the key features, which might be lost if only a Gaussian filter were employed.
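The following is a minimal sketch of the kind of intensity-adaptive filter described above, not the authors' exact formulation. It assumes Gaussian forms for both the spatial and brightness weights; the function name `adaptive_bilateral` and the parameters `sigma_b`, `cutoff`, and `radius` are hypothetical and chosen only for illustration.

```python
import math


def adaptive_bilateral(image, sigma_s=1.0, sigma_b=25.0, cutoff=50.0, radius=1):
    """Smooth a grayscale image (list of rows) while preserving edges.

    Each neighbour weight is the product of a spatial Gaussian and a
    brightness Gaussian. The brightness weight is forced to 0 when the
    intensity difference to the centre pixel exceeds `cutoff` (assumed value).
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for m in range(height):
        for n in range(width):
            centre = image[m][n]
            weighted_sum, weight_total = 0.0, 0.0
            for i in range(max(0, m - radius), min(height, m + radius + 1)):
                for j in range(max(0, n - radius), min(width, n + radius + 1)):
                    diff = abs(image[i][j] - centre)
                    if diff > cutoff:
                        continue  # brightness weight set to 0 for this neighbour
                    spatial = math.exp(-((i - m) ** 2 + (j - n) ** 2) / (2 * sigma_s ** 2))
                    brightness = math.exp(-(diff ** 2) / (2 * sigma_b ** 2))
                    weighted_sum += spatial * brightness * image[i][j]
                    weight_total += spatial * brightness
            # Dividing by the total weight keeps the filter energy equal to 1.
            output[m][n] = weighted_sum / weight_total
    return output
```

Because the centre pixel always contributes with weight 1, the normalization never divides by zero. A sharp boundary (for example, lesion against skin) whose intensity step exceeds `cutoff` is left untouched, while small fluctuations within a region are averaged out.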
Artifacts such as hair follicles (both thin and thick), blood vessels, reflections of dermoscopic gel, and dermoscopic frames are digitally removed by the proposed method, where the noise-free three-channel image $m[i, j]_3$ is converted into $m[i, j]_1$, a monochromatic image. Then, $m[i, j]_1$ is binarized, and the selected pixels are checked to determine whether they are continuous or discrete. Continuous and regular blocks of the mask are not considered to be hair follicles, and they are set as the background of the mask. Thereafter, the binarized mask is alternately dilated and convolved to form prominent lines of hair, which can be masked out from the original image f[x, y]. This method of hair removal also removes marks from the dermoscopic ruler and frames.
Finally, the pre-processed image undergoes the process of image enhancement by histogram equalization to adjust the brightness and contrast of each pixel. The enhanced image is further used for lesion localization and segmentation. Additionally, the efficiency of the proposed method is illustrated in the result and analysis section, which clearly depicts the efficiency of the proposed pre-processing method. Figure 2 represents the pre-processing of the dermoscopic lesion at different proposed stages.
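As an illustration of the enhancement step, the sketch below implements classic global histogram equalization for an 8-bit grayscale image. This is the textbook algorithm, used here as an assumption; the paper's own equalization variant is not specified in this section, and the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Classic global histogram equalization for an integer grayscale image."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        return [row[:] for row in image]  # constant image: nothing to stretch
    scale = (levels - 1) / (total - cdf_min)
    # Look-up table mapping each input intensity to its equalized value.
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]
```

The darkest intensity present is mapped to 0 and the brightest to `levels - 1`, which stretches contrast across the full range.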

Segmentation
Initially, the pre-processed image (I) of dimension (x, y) is unified into a single channel, $I^{new}$. Thereafter, $I_{min}$ and $I_{max}$ are calculated from $I^{new}_{i,j}$ and combined with the weights $w_1$ and $w_2$ to calculate the global threshold (γ) of the image, where $w_1$ and $w_2$ are 0.4 and 0.6, respectively. To form a segmentation mask, if $I^{new}_{i,j} < γ$, the pixel is part of the mask, $M_{(i,j)} = 255$; otherwise, it is discarded, $M_{(i,j)} = 0$. Furthermore, the mask (M) is convolved with a filter ($F_1$) of size 3 × 3, which is iterated over M to calculate the minimum of the overlaid pixels, after which $F_1(1,1)$ is changed to that minimal value.
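The first segmentation phase can be sketched as follows. Two points are assumptions made for illustration: the channels are unified by simple averaging, and the global threshold is taken as the weighted sum γ = w1·I_min + w2·I_max. All function names are hypothetical.

```python
def unify_channels(rgb_image):
    """Collapse an RGB image to one channel by averaging (assumed rule)."""
    return [[sum(pixel) / 3.0 for pixel in row] for row in rgb_image]


def weighted_threshold(gray, w1=0.4, w2=0.6):
    """Global threshold as a weighted sum of min and max intensity (assumed form)."""
    flat = [p for row in gray for p in row]
    return w1 * min(flat) + w2 * max(flat)


def threshold_mask(gray, gamma):
    """Pixels darker than gamma become mask (255); the rest are discarded (0)."""
    return [[255 if p < gamma else 0 for p in row] for row in gray]


def min_filter_3x3(mask):
    """Set each interior pixel to the minimum of its 3 x 3 neighbourhood.

    Border pixels are left unchanged, which is a simplification.
    """
    height, width = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = min(
                mask[i][j] for i in range(y - 1, y + 2) for j in range(x - 1, x + 2)
            )
    return out
```

With intensities ranging from 10 to 200, this gives γ = 0.4 × 10 + 0.6 × 200 = 124, so every pixel darker than 124 enters the mask. The minimum filter then removes isolated mask pixels that have any background neighbour.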
At this point the segmentation mask is formed, but the boundaries of the segmented lesion remain fuzzy. The next phase of segmentation is ambiguous, as all the pixel values lie within a narrow band, and it is unclear which pixels should be captured and which should be discarded. Thus, for the next phase of segmentation, we utilize the linguistic pentagonal neutrosophic number (PNN) to select the threshold value. The structure of the PNN is shown in Figure 3, where the false, hesitance, and truth values are represented on the y-axis as 0, δ, and 1, respectively. $X_n$ and $X_{n+1}$ represent the possible range of the threshold for segmentation of the skin lesion. The neutrosophic threshold is an accurately determined probabilistic point within this range. Any neutrosophic number can capture the degrees of truth, falsity, and indeterminacy of a membership function in a compact way. Case-1: If $|X_i| < ϑ$, where $|X_i|$ is the pixel value of the ith pixel of the segmented image, then the pixel is discarded.
Case-2: If |X i | ≥ ϑ, where |X i | is the pixel value of the ith pixel of the segmented image, then the pixel is accepted for the next round.
The concept of the neutrosophic number is proposed here to handle the ambiguous portion and to fix the threshold value ϑ. A linguistic pentagonal neutrosophic number is capable of defining all three components of an uncertain number, (i) true, (ii) false, and (iii) hesitation, so a pentagonal neutrosophic number (PNN) is proposed for the threshold value computation. In the case of hesitation, the PNN generates a threshold value that helps the segmentation method achieve a high segmentation score. Therefore, the same concept is applied to set the threshold value T. Thus: where π, σ, µ ∈ [0, 1], $(m_1, m_2, m_3, m_4, m_5)$ represents the pentagonal neutrosophic components, and π indicates the truth, σ the indeterminacy, and µ the falsity part of the membership function. The asymmetrical PNN is also considered, as in a real-time situation the threshold value may not always be a symmetrical PNN. Here, we utilize the linguistic PNN because it can capture all the verbal information (Very Low, Low, Median, High, Very High) in a compact way, and no other structure can capture this idea. Figure 4 shows the segmentation performance of the proposed method, where a dermoscopic image and its ground truth segmentation mask are compared with the proposed segmentation masks to portray the accuracy of the proposed model. It can be seen from Figure 4 that the first segmentation phase draws a rough outline around the lesion, which is refined in the second phase using the pentagonal neutrosophic number.

Classification
Classification is the fundamental phase of computer-aided diagnosis, where the acquired skin lesion image is classified as malignant or non-malignant by our proposed deep learning model. It is trained on the publicly available and standard ISIC datasets. The following subsections give a detailed description of the datasets employed for training and the hyperparameters used to obtain the best-fit classification results.

Implementation and Training
With the advancement in medical vision and computer-aided diagnostic systems, the classification of a malignant lesion has significantly improved. However, training a classifier to detect a particular class is a strenuous task. The ISIC 2017, ISIC 2018, and ISIC 2019 datasets are used for training the classifier, with a total of 103,524 images, out of which 15,464 are melanoma images, while 88,060 belong to non-malignant images. A huge set of training data increased the efficiency of the classification. To unify the dataset to overcome the problem of classification for various images of multiple dimensions, the training images are resized to 512 × 512 pixels. Additionally, this resizing of the dataset also decreases the computational processing of skin lesion classification, thereby increasing the training speed. The training data along with its ground truth result are passed through the proposed model of a deep neural network.
This article proposes a classifier based on inception and residual blocks with a softmax block after each residual block, which makes the network wider and less deep, thereby decreasing the training time. A pictorial representation of the proposed neural network is shown in Figure 5. The network takes an input image of shape 512 × 512 px and passes it through the Stem block, which is inspired by Inception-ResNet-V2. The Stem block consists of a series of parallel convolution layers, which are concatenated to preserve the key features of the image. The Inception blocks (A, B, and C) contain a series of 1 × 1 and 3 × 3 convolutional layers along with an average pooling layer, such that both minor and major features receive attention. The dimension of the image is kept equal to its input size, and at the end of each inception block, the filters are concatenated to produce more kernels of the same dimension. The reduction blocks (both A and B) are used to reduce the dimension of the image without losing any of the key features. The softmax block comprises a 1 × 1 convolutional layer and two fully connected (FC) layers, followed by a softmax layer for the classification of the lesion. This block is attached after every reduction unit, thereby helping to optimize the weights and biases of the network at each stage. This technique of updating weights and biases prior to the final classification helps the network learn faster and enables it to produce a high accuracy score. A max pooling layer is used after the Inception-C block to reduce the size of the network while keeping the key values of the image; thereafter, a dropout of 50% is applied to reduce the filter size, as most of the filters might be repetitive given the large number of small filters used.
Images from the ISIC public directory are grouped into two classes, melanoma and non-melanoma, and passed as the only input to the neural network, which classifies each image into one of the two classes based on its convolutional features. The proposed architecture achieves high accuracy for the classification of melanoma lesions. The proposed classifier obtains the best-fit classification results under the following set of hyperparameters: batch size = 32, subdivision = 16, learning rate = 0.045, decay rate = 0.5, epsilon = 1.0, and momentum = 0.7. The classification result is generated for 5000 epochs, with results saved at every 50th epoch. This criterion is evaluated to fetch enhanced classification results, as the learning rate is increased by saving epochs after the 50th phase. The RMSProp optimizer is used for updating weights and biases during backpropagation, and categorical_crossentropy is used as the loss function, as it seems to perform better with 'RMSProp'. The classifier is validated after every 50th epoch to check the efficiency of lesion detection. The configuration file is generated after training on the large ISIC dataset and is used to classify the skin lesion images of the test dataset. The optimal hyperparameters are selected by employing Bayesian optimization methods. The key benefit of Bayesian approaches is that they can inform the choice of the next hyperparameter values to be evaluated by using the results from historical runs, gradually accumulating prior knowledge in the form of pairs of hyperparameter values and objective function scores. This reduces the time and computation costs required to attain acceptable objective function scores.
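To make the role of the listed hyperparameters concrete, the sketch below applies one RMSProp update with momentum to a single scalar weight. It is a hedged illustration, not the authors' training code: mapping "decay rate" to RMSProp's moving-average coefficient and adding epsilon outside the square root are both assumptions (frameworks differ on the latter), and the names `HYPERPARAMS`, `rmsprop_step`, and `checkpoint_epochs` are hypothetical.

```python
import math

# Values as listed in the text; the key names are illustrative.
HYPERPARAMS = {
    "batch_size": 32,
    "subdivision": 16,
    "learning_rate": 0.045,
    "decay_rate": 0.5,   # assumed to be the squared-gradient averaging coefficient
    "epsilon": 1.0,
    "momentum": 0.7,
    "epochs": 5000,
    "checkpoint_every": 50,
}


def rmsprop_step(weight, grad, state, hp=HYPERPARAMS):
    """Apply one RMSProp-with-momentum update to a scalar weight.

    `state` holds the running mean of squared gradients ("v") and the
    momentum buffer ("m"); both are updated in place.
    """
    rho = hp["decay_rate"]
    state["v"] = rho * state["v"] + (1 - rho) * grad ** 2
    scaled = hp["learning_rate"] * grad / (math.sqrt(state["v"]) + hp["epsilon"])
    state["m"] = hp["momentum"] * state["m"] + scaled
    return weight - state["m"]


def checkpoint_epochs(hp=HYPERPARAMS):
    """Epochs at which results are saved and the model is validated."""
    step = hp["checkpoint_every"]
    return list(range(step, hp["epochs"] + 1, step))
```

Under these settings, 5000 epochs with a checkpoint at every 50th epoch yield 100 saved results. The comparatively large epsilon of 1.0 damps the per-parameter scaling, so early updates stay close to plain momentum gradient descent.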

Augmentation of the Dataset
Efficient processing of dermoscopic images and accurate classification of skin lesions have emerged as a vital field of research. The accumulation of relevant datasets and accurate training of the classifier under specific parameters has always been a perplexing job. The proposed method is trained on the publicly accessible ISIC 2017, ISIC 2018, and ISIC 2019 datasets. These datasets include reliable and easily available ground truth images drawn by a panel of expert dermatologists, which enable researchers to compare and evaluate the proposed methodology. A huge set of dermoscopic data (after data augmentation) not only helps the classifier train under rigorous conditions, but also increases the validation size, thereby producing an enhanced classification report. Data augmentation produces more training images from the same sample set by adding digital noise and artifacts and rotating the image by specific angles, therefore helping the model learn more features and classify the skin lesion accurately. Refs. [21,22] employed data augmentation to increase the sample size of skin lesion images, thereby increasing the performance of their models. Table 1 summarizes the total number of dermoscopic images in each dataset, split into training, testing, and validation sets. The Pedro Hispano Hospital (PH2) dataset is a well-known dermoscopic dataset obtained from the Pedro Hispano Hospital; it contains 200 dermoscopic 8-bit RGB images that are 761 × 570 to 769 × 577 pixels in dimension. It consists of 80 common nevi, 80 atypical nevi, and 40 melanoma images; this dataset is used as a hold-out dataset (used only for testing the performance of segmentation and classification) in our article. The ISIC 2017 dataset was developed for the International Symposium on Biomedical Imaging (ISBI).
It contains three classes: Benign Nevus (BN), Seborrhoeic Keratosis (SK), and Malignant Melanoma (MM). The resolution of both training and testing images ranges from 1022 × 767 to 6748 × 4499 pixels; these are 8-bit RGB images. The ISIC 2018 dataset is sourced from HAM10000, and it contains 10,015 training images along with ground truth values and metadata, which contain the classification results. We have split the 10,015 training images into 8695 images for training and 1320 dermoscopic images for testing the classifier. The ISIC 2018 dataset contains seven classes of different skin diseases. The ISIC 2019 dataset consists of 25,311 8-bit RGB images in the training dataset, of which 60% is used for training and 20% each for testing and validation of the proposed method. The dataset contains metadata and ground truth images for training; thus, only the training dataset is used in this research for training and evaluating the classifier, so that we obtain results against which to compare our values. The dataset comprises eight different classes: Squamous Cell Carcinoma, Vascular Lesion, Dermatofibroma, Benign Keratosis (lichen planus-like keratosis, seborrheic keratosis, and solar lentigo), Actinic Keratosis, Basal Cell Carcinoma, Melanocytic Nevus, and Melanoma. Each image of the dataset is resized to 512 × 512 pixels to reduce the dimensionality, such that image enhancement and classification can be performed at a lower computational cost. Then, extra artifacts such as thick and thin hair, ruler marks, unwanted reflections, and noise are added to these replicated images, so that the classifier is trained to produce good results under any condition. Augmentation of the dataset at various angles is performed to produce a reliable classification outcome, even for images acquired at a different angle.
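The size of each training dataset is four times the original count, as each dermoscopic image is also rotated by 90°, 180°, and 270°. The total count of dermoscopic images for training the classifier was 25,881, which is increased to 103,524 after replication of the training dataset. Therefore, a total of 115,899 dermoscopic images are used for training, testing, and validation in this research article. A pictorial representation of data augmentation is given in Figure 6.

Table 1. Number of dermoscopic images per dataset (Mel = melanoma, Non-mel = non-melanoma, Aug = after augmentation).

| Dataset | Train Mel | Train Non-mel | Train Total | Train Aug | Test Mel | Test Non-mel | Test Total | Val Mel | Val Non-mel | Val Total | Total Mel | Total Non-mel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PH2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 40 | 160 | 200 | 40 | 160 |
| ISBI17 | 374 | 1626 | 2000 | 8000 | 197 | 403 | 600 | 30 | 120 | 150 | 1723 | 7057 |
| ISIC18 | 779 | 7916 | 8695 | 34,780 | 334 | 986 | 1320 | 0 | 0 | 0 | 3450 | 32,650 |
| ISIC19 | 2713 | 12,473 | 15,186 | 60,744 | 905 | 4158 | 5053 | 904 | 4158 | 5062 | 12,661 | 58,208 |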
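The rotation-based replication described above can be sketched in a few lines. This is a minimal illustration on an image stored as a list of rows; the function names are hypothetical, and the addition of synthetic noise and artifacts is not shown.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment_with_rotations(image):
    """Return the original image plus its 90, 180 and 270 degree rotations."""
    variants = [image]
    for _ in range(3):
        variants.append(rotate90(variants[-1]))
    return variants


# Each training image yields four variants, so 25,881 images become 103,524.
ORIGINAL_TRAINING_IMAGES = 25881
AUGMENTED_TRAINING_IMAGES = ORIGINAL_TRAINING_IMAGES * 4
```

Since every image contributes itself and three rotated copies, the augmented training count is exactly four times the original, which matches the figure of 103,524 stated in the text.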

Performance Evaluation Metrics
This section specifies different evaluation metrics that are used to validate and evaluate each stage of the CAD system: pre-processing, localization, segmentation, and classification. These metrics are accepted and employed globally by well-known researchers in similar domains.

Evaluation Metric for Pre-Processing of Lesion
The efficiency of the pre-processing stage is evaluated by employing the universal image quality index (UIQI), peak signal-to-noise ratio (PSNR), root mean squared error (RMSE), and mean squared error (MSE). These metrics are used to calculate the enhancement capacity of the pre-processing stage for the dermoscopic image, which might increase the efficiency of lesion detection in later stages. These parameters also measure the efficiency of the pre-processing method by measuring its capacity to remove artifacts digitally. The MSE is the mean squared difference between an acquired image with noise, $I(x, y)$, and a pre-processed image, $I_s(x, y)$. The square root of the MSE is the RMSE. For an image of M × N pixels, the MSE and RMSE are given as follows: $MSE = \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} [I(x, y) - I_s(x, y)]^2$ and $RMSE = \sqrt{MSE}$. The quality of an image is widely measured by the peak signal-to-noise ratio (PSNR). A high PSNR score marks good image enhancement. It is usually expressed on the logarithmic decibel (dB) scale. The mathematical representation of the PSNR is given in Equation (15): $PSNR = 10 \log_{10}(MAX_I^2 / MSE)$, where $MAX_I$ is the maximum possible pixel intensity (255 for 8-bit images). UIQI estimates the linear correlation between the acquired dermoscopic image and the pre-processed image, based on luminance, contrast, and structure features. $\bar{I}$, $\bar{I}_P$, and σ represent the mean of the input image, the mean of the pre-processed image, and the standard deviation, respectively, with $\sigma_{I_P, I}$ denoting the covariance of the two images. The mathematical representation of UIQI is: $UIQI = \frac{4\,\sigma_{I_P, I}\,\bar{I}_P\,\bar{I}}{(\sigma_{I_P}^2 + \sigma_I^2)(\bar{I}_P^2 + \bar{I}^2)}$
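The four pre-processing metrics can be computed as in the sketch below, which uses their standard textbook definitions. It assumes 8-bit grayscale images stored as lists of rows and computes UIQI globally over the whole image rather than over sliding windows; the function names are hypothetical.

```python
import math


def _flatten(image):
    return [float(p) for row in image for p in row]


def mse(acquired, processed):
    """Mean squared difference between two images of equal size."""
    a, b = _flatten(acquired), _flatten(processed)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rmse(acquired, processed):
    return math.sqrt(mse(acquired, processed))


def psnr(acquired, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(acquired, processed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


def uiqi(acquired, processed):
    """Universal image quality index over the whole image (range -1 to 1).

    Undefined (division by zero) when both images are constant.
    """
    a, b = _flatten(acquired), _flatten(processed)
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((y - mean_b) ** 2 for y in b) / (n - 1)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)
    return 4 * cov * mean_a * mean_b / ((var_a + var_b) * (mean_a ** 2 + mean_b ** 2))
```

For two identical non-constant images, the MSE is 0, the PSNR is infinite, and the UIQI equals 1, which is the upper bound of its range.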

Evaluation Metric for Localization of Lesion
The performance of an algorithm is visualized by a 2 × 2 table, which is known as a confusion matrix or error matrix. It encapsulates a detailed report of prediction results on a classification problem. Each row represents an occurrence of the predicted class and each column indicates occurrence of actual class. It is employed for localization and classification of malignant lesion. If a malignant lesion is predicted to be malignant, it is defined as a true positive (TP) value, whereas if it is predicted to be non-malignant, it is defined as a false negative (FN). If a non-malignant lesion is predicted to be melanoma, it is classified as a false positive (FP), as it raises a false malignant flag, whereas if it is predicted as non-malignant, the classification is defined as a true negative (TN).
Melanoma lesion localization is evaluated by overlapping the predicted region with the ground truth region to find the intersection area; this measure is termed intersection-over-union (IoU) and is represented by Equation (17): $IoU = \frac{|P \cap G|}{|P \cup G|}$, where P is the predicted region and G is the ground truth region. Mean average precision (mAP) is used for evaluating the localization phase by computing the mean precision of detection of the melanoma area. Equation (18) shows the detailed mathematical representation:
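A minimal sketch of the IoU computation on two binary masks follows. It assumes that any non-zero pixel counts as lesion and that the masks have the same shape; the function name is hypothetical.

```python
def intersection_over_union(predicted, truth):
    """IoU of two binary masks given as lists of rows (non-zero = lesion)."""
    intersection = union = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            p_on, t_on = p > 0, t > 0
            intersection += p_on and t_on
            union += p_on or t_on
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0
```

A value of 1 means the predicted region coincides with the ground truth, while 0 means the two regions do not overlap at all.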

Evaluation Metric for Classification of Lesion
For evaluating the performance of skin lesion classification, evaluation metrics such as Sensitivity (Sn), Specificity (Sp), Accuracy (Ac), Dice index coefficient (Dc), and Jaccard score (Js) are used. These standardized evaluation criteria are used to validate segmentation and classification performance in the ISBI challenge, which is considered a standard platform for publishing practical implementations of melanoma lesion diagnosis. Sensitivity (Sn) represents the proportion of lesion pixels that are accurately detected/segmented, whereas Specificity (Sp) indicates the proportion of non-melanoma pixels that are accurately detected/segmented. The overall performance of diagnosis is quantified by the Accuracy (Ac) metric. The Dice index coefficient (Dc) measures the performance of detection/segmentation by comparing it against ground truth results. Similarly, the Jaccard score (Js) is the intersection over union of the segmented lesion and the ground truth. The area under curve (AUC) metric assesses performance by calculating the area under the ROC curve. A high AUC value indicates a better classification architecture for predicting true values as true and false entities as false. Precision is the ability of the classification method to generate only valuable data, that is, it determines the rate of accurate detection. On the other hand, recall is the ability of the classifier to detect true values as true, which is also called the true positive rate. The harmonic mean of recall and precision is termed the F1 score. F1 is used to measure the accuracy of a machine learning architecture in a single score by evaluating both recall and precision. The Matthews correlation coefficient (MCC) quantifies the correlation between the segmented and annotated areas of the lesion; the MCC ranges from −1 to 1. A larger MCC value represents more efficient detection/segmentation of skin lesions.
The mathematical representation of all these metrics is shown as follows:
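As a minimal sketch, the metrics described above can be computed from the four confusion-matrix counts, assuming the standard textbook definitions (the paper's own equations may differ in notation). The function name `classification_metrics` and the argument names `tp`, `tn`, `fp`, `fn` are illustrative and not taken from the paper.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics (hypothetical helper, not the authors' code).

    tp, tn, fp, fn: counts of true positives, true negatives,
    false positives, and false negatives.
    """
    sensitivity = _ratio(tp, tp + fn)            # Sn, also called recall
    specificity = _ratio(tn, tn + fp)            # Sp
    accuracy = _ratio(tp + tn, tp + tn + fp + fn)
    precision = _ratio(tp, tp + fp)
    dice = _ratio(2 * tp, 2 * tp + fp + fn)      # Dc
    jaccard = _ratio(tp, tp + fp + fn)           # Js, intersection over union
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    mcc = _ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "dice": dice,
        "jaccard": jaccard,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    for name, value in classification_metrics(tp=90, tn=80, fp=20, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, the Dice coefficient and the F1 score coincide for a binary problem, and the Jaccard score can be recovered from Dice as Dc / (2 − Dc).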

Results
This section comprises a detailed performance analysis of the proposed method. A machine with a Core-i7 processor and 32 GB of RAM is used to implement the proposed method and conduct all the experiments. The OpenCV framework is used with the Python programming language for processing and classification of the acquired images. Different parameters are used to analyze performance at the various stages of classification, namely evaluation of skin refinement efficiency, performance analysis of lesion localization, segmentation analysis, and evaluation of the classification model. The efficiency of the proposed method is evaluated over four publicly available datasets: PH2, ISIC 2017, ISIC 2018, and ISIC 2019.
Pre-processing of skin lesions is one of the most important stages in lesion classification. Efficient refinement of the skin lesion enhances the classification results, as the pre-processing phase not only modifies the contrast and brightness of the dermoscopic image, but also digitally removes the artifacts, which might otherwise mislead the classification result. Table 2 illustrates the performance of digital artifact removal using the proposed method for the PH2, ISIC 2017, ISIC 2018, and ISIC 2019 datasets. Evaluation metrics such as PSNR, MSE, RMSE, and UIQI are used to evaluate the efficiency of pre-processing. Low values of MSE and RMSE signify robust image enhancement. The loss of energy in the pre-processed image is depicted by the value of RMSE, which evaluates the difference in intensity between the pre-processed and acquired images. A low value of RMSE indicates less distortion of the processed image. If the empirical PSNR score is higher than 20 dB, the image is considered well enhanced. The quality of information retained in the pre-processed image is portrayed by PSNR; higher values of PSNR indicate that more valuable details are retained after removal of artifacts in the pre-processed image. The value of UIQI, which is used to verify the image quality after enhancement, ranges from −1 to 1. This metric is used to express the evident change produced by removing digital artifacts, such as hair, and thereafter refining the image from natural and clinical artifacts, such as clinical color swatches, clinical ruler marks, black frames, etc. UIQI measures image quality with respect to human vision by employing parameters such as structural information, contrast, and luminance. Thus, it is used to measure the capacity of the pre-processing method to remove the artifacts of images and to enhance the image by modifying its contrast and sharpness, without lowering the image quality.
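The pre-processing metrics above can be sketched as follows, assuming 8-bit grayscale images stored as nested lists and the standard definitions of MSE, RMSE, and PSNR. The `uiqi` function is a simplified global form of the universal image quality index (the usual formulation averages the index over sliding windows), and all function names here are illustrative rather than the authors' implementation.

```python
import math


def _flatten(image):
    """Flatten a 2-D image (list of rows) into a list of floats."""
    return [float(p) for row in image for p in row]


def mse(reference, processed):
    """Mean squared error between two images of identical size."""
    a, b = _flatten(reference), _flatten(processed)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rmse(reference, processed):
    """Root mean squared error."""
    return math.sqrt(mse(reference, processed))


def psnr(reference, processed, max_value=255.0):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    error = mse(reference, processed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


def uiqi(reference, processed):
    """Global universal image quality index, in [-1, 1].

    Simplification: computed once over the whole image instead of
    being averaged over sliding windows.
    """
    a, b = _flatten(reference), _flatten(processed)
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((y - mean_b) ** 2 for y in b) / (n - 1)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)
    denominator = (var_a + var_b) * (mean_a ** 2 + mean_b ** 2)
    if denominator == 0:
        raise ValueError("UIQI is undefined for flat or all-zero images")
    return 4.0 * cov * mean_a * mean_b / denominator


if __name__ == "__main__":
    acquired = [[50, 100], [150, 200]]      # toy 2x2 image, illustration only
    enhanced = [[52, 98], [150, 200]]
    print(mse(acquired, enhanced), rmse(acquired, enhanced))
    print(psnr(acquired, enhanced), uiqi(acquired, enhanced))
```

In this sketch, a PSNR above 20 dB together with a UIQI close to 1 would correspond to the well-enhanced case described in the text.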
Table 2 shows a highly acceptable range of PSNR (in dB) and UIQI, which represents the high-quality enhancement of the images in the pre-processing phase. Figure 7 pictorially represents the pre-processing phase, where artifacts such as thick and thin hair, black frames, ruler marks, and unwanted reflections are removed digitally by the proposed pre-processing method. Table 3 shows the classification of skin lesions with and without a pre-processing and segmentation phase; the accuracy reported in the tables for each dataset clearly supports the importance of a pre-processing and segmentation phase in computer-aided diagnostic systems. The above-mentioned tables show an accuracy improvement of about 8% to 9% when the acquired image is pre-processed. The F1 score, AUC, and MCC also reflect the superior performance obtained when data are augmented properly before training and pre-processed and segmented before testing the model. The training and testing times for each of the datasets are tabulated in Table 3. The training times in Table 3 are nearly the same for all the datasets, as the size of the training data is the same (the ordinary data augmentation technique is applied to the data in Table 3). The above-mentioned experimental results and pictorial representation illustrate the need for and importance of pre-processing images before classification.
Localization of the suspected lesion is performed by the proposed classifier, which forms a bounding box around the lesion and marks it as the ROI. The efficiency of localization is measured by metrics such as mAP and IOU. The acceptable range of IOU for accurate area localization lies from 0.5 to 1. This metric is used to compare the predicted area with the ground truth, which is generated by expert dermatologists, to calculate the efficiency of localization. Table 4 represents the performance analysis of lesion localization by our proposed classifier on various dermoscopic datasets.
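The IOU criterion for localization can be illustrated with a short sketch, assuming axis-aligned bounding boxes given as `(x_min, y_min, x_max, y_max)` tuples. The function names and the box format are assumptions for illustration; only the 0.5 acceptance threshold comes from the text.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def is_localized(predicted, ground_truth, threshold=0.5):
    """Accept a localization when IOU reaches the threshold (0.5 in the text)."""
    return iou(predicted, ground_truth) >= threshold


if __name__ == "__main__":
    ground_truth = (0, 0, 10, 10)   # made-up boxes, illustration only
    predicted = (0, 0, 10, 5)
    print(iou(predicted, ground_truth), is_localized(predicted, ground_truth))
```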
The classifier achieves an accuracy of 100% for PH2 and ISIC 2017 for skin lesion localization, which reflects the effectiveness of the modified layers of the classifier. Furthermore, the ISIC 2018 and ISIC 2019 datasets yield an accuracy of 99.40% and 99.56%, respectively. These tabulated results show the accuracy of the classifier and provide a clear indication that the proposed classifier is capable of achieving improved classification results for dermoscopic images. The pre-processing phase is followed by the segmentation phase, whose efficiency is calculated and compared over parameters such as Sensitivity, Specificity, Dice score, Jaccard index, and Accuracy. An analysis of performance based on these parameters was proposed by ISIC, which is the official organization for publishing freely accessible dermoscopic images for research and analysis on skin cancer. It is also mentioned in [23] that segmentation is a vital stage in CAD, which helps to improve the classification score. Table 5 encapsulates the analysis of segmentation performance over different datasets by the proposed method using neutrosophic and determinant methods. The proposed system achieves an accuracy of 99.00% for the PH2 dataset, 98.83% for the ISIC 2017 dataset, 98.56% for the ISIC 2018 dataset, and 97.86% for the ISIC 2019 dataset. The system achieves a sensitivity score (rate of accurate segmentation of true positives) of 97.56% for ISIC 2019, which contains visually challenging melanoma images. Similarly, the efficiency of the proposed method is reflected in its specificity (rate of accurate segmentation of true negatives) of 97.97% for ISIC 2019, which has the largest number of different skin lesion classes and is therefore the most challenging dataset for accurate segmentation, even for an expert dermatologist.
The efficient mathematical logic of the uncertainty principle in neutrosophy helps the proposed system to achieve accurate segmentation scores that surpass those of state-of-the-art methods. Table 6 illustrates a comparison between the proposed method (PM) for segmentation and well-established state-of-the-art techniques. The segmentation performance of the proposed work is compared with the most inspiring methods for segmentation of dermoscopic images from the PH2 dataset. Reference [24] used a two-stage segmentation model employing L-R fuzzy logic and graph theory to yield efficient segmentation results, while reference [25] employed a framework which works on the principle of a semantic segmentation model for automatic segmentation. Moreover, reference [26] used alternating segmentation and classification by bootstrapping the DCNN model. The grab cut algorithm, which is semi-automatic in nature, was used by reference [27], whereas a deep convolutional neural network was employed for segmentation by reference [1]. We have also drawn inspiration from the efficient segmentation models proposed in references [28,29], where FCN networks and a multistage fully convolutional network (FCN) with parallel integration (mFCN-PI) are used for segmentation of dermoscopic lesions. However, our proposed method yields 99%, 97.50%, 99.36%, 95.12%, and 97.50% for accuracy, sensitivity, specificity, Jaccard score, and Dice index, respectively, which are the highest scores when compared with state-of-the-art methods. A comparison of the proposed method against recently published and well-known segmentation methods for the ISIC 2017 dataset is tabulated in Table 7. The proposed method is contrasted against state-of-the-art segmentation methods, such as the one used in reference [30], which utilizes an extension and amendment of the FCN architecture, i.e., a fully convolutional residual network (FCRN).
A robust deep learning SLS model based on an encoder-decoder network was used in reference [31], where dilated residual layers form the encoder network and the decoder was constructed from a pyramidal pooling network, followed by three convolution layers. Ref. [32] proposed a simultaneous segmentation and classification model using FrCN to yield a high specificity score for the ISIC 2017 dataset. When the proposed segmentation method is applied to the ISIC 2017 dataset, it yields an accuracy of 98.83%, while other well-known published works obtained 97.33%, 95.30%, 81.57%, 95.06%, 93.39%, and 93.60%, respectively [24,25,27,29–31]. The segmentation performance of our method is contrasted against recently published segmentation methods, where Ref. [33] used Difficulty-Guided Curriculum Learning (DGCL), Ref. [34] employed a Deep Saliency Segmentation method based on a custom CNN of 10 convolutions, and Ref. [35] used AlexNet along with transfer learning for segmentation. Additionally, one of the most successful segmentation models was proposed in [36], where the performance of U-Net is enhanced by BCDU-Net with ConvLSTM. Ref. [37] designed an encoder-decoder architecture for segmentation of skin lesions using DeepLab and PSPNet; additionally, extraction of key features is performed by ResNet101. Despite the complex convolutions and architectures used by these state-of-the-art methods, they do not outperform the segmentation results achieved by our method. In addition to yielding an accuracy of 98.56%, our system not only achieves a higher sensitivity score (98.50%), but also shows its efficiency by yielding the highest specificity score of 98.58%. Table 8 presents a detailed comparison of the proposed segmentation method against state-of-the-art methods for the ISIC 2018 dataset.
Segmentation performance for the ISIC 2019 dataset is evaluated and compared against [23,38], which achieved accuracy scores of 96.74% and 93.98%, respectively, while our method yields 97.86%. The method proposed in [38] uses a triangular neutrosophic number and a straight-line-based method for segmentation. Dynamic thresholding using the pentagonal neutrosophic method has been shown to work efficiently for all the datasets. Thus, an enhanced and modified pre-processing method combined with an advanced segmentation method yields better performance for the classification of lesions, which is the fundamental aim of this research work. A tabular representation of the performance evaluation metrics is given in Table 9.
Classification performance is evaluated for the proposed method (PM) and compared with other well-known classifiers, such as the YOLO, K-nearest neighbor (KNN), support vector machine (SVM), decision tree (DT), multilayer perceptron (MLP), Bayesian network (BN), random forest (RF), logistic, and naïve Bayes (NB) algorithms. The performance is evaluated for Accuracy, Sensitivity, Specificity, Precision, and F1-Score to determine the efficiency of classification by the different classifiers. The performance analysis is tabulated in Table 10 for PH2, ISIC 2017, ISIC 2018, and ISIC 2019. The proposed classifier performs extremely well for all four datasets, and the statistical figures show it to be more accurate than the other state-of-the-art classifiers considered. The proposed classifier scores 99.50%, 99.33%, 98.56%, and 98.04% for the classification of dermoscopic lesions from PH2, ISIC 2017, ISIC 2018, and ISIC 2019, respectively. The only exception is sensitivity on the ISIC 2019 dataset, where the YOLO classifier achieves the best score among the compared classifiers (97.11%), whereas the proposed classifier scores 96.67%. With such high accuracy scores, the modified classifier proves itself suitable for use by dermatologists in real-life scenarios for the diagnosis of skin lesions. The tabulated data clearly support the choice of inception and residual blocks with a softmax layer in the architecture of the proposed classifier for the classification of malignant melanoma.

Discussion
The superiority of the proposed classifier is evident from the comparisons with well-known classifiers for the accurate identification of skin lesions. Our classifier yields a high score for all the evaluation parameters when compared with recently developed and extensively used classifiers. Due to its ability to distinguish minor differences in pixel coloration, it proves to be the most efficient and trustworthy for the classification of dermoscopic images. This article also focuses on the importance of an adequate pre-processing method for removing artifacts from the acquired image and thereafter enhancing the image, which helps the classifier to achieve such high accuracy values. Moreover, segmentation of the skin lesion using neutrosophy and a third-order determinant also helped to achieve a high classification score. The complete workflow of the proposed method is represented in Figure 8, where Step 'A' shows the original dermoscopic image of the respective datasets; Step 'B' represents the pre-processed image, where digital artifacts are removed and images are enhanced by histogram equalization; Step 'C' portrays the segmentation phase, where the lesion is segmented out from the dermoscopic image; and Step 'D' is the final phase, where the lesion is classified as either melanoma or non-melanoma and a bounding box is drawn around the lesion. Dermoscopic images from each dataset (the ISIC 2017, ISIC 2018, and PH2 datasets) are chosen to represent the workflow. Figure 8 shows the working accuracy of the proposed method, where each phase of the CAD system (pre-processing, segmentation, and classification) is accurately performed for a visually challenging dermoscopic image of a melanoma lesion.
The performance analysis of the classification of skin lesions using five-fold cross-validation is represented in Table 11, where performance metrics such as Accuracy (Acc), Sensitivity (Sen), Specificity (Spec), Precision (Prec), Recall (Rec), F1, MCC, AUC, and Standard deviation (SD) are calculated for different methods. A total of 108,936 dermoscopic images are used to train and validate each method, out of which 103,524 images are from the training set (after data augmentation) and 5412 are from the validation set. The number of cross-validation folds is set to 5 to reduce the risk of overfitting. In total, 21,787 dermoscopic images are randomly assigned to each of the five sample datasets. The performance of each model is based on training on the first four sample datasets and validating it against the last sample dataset (of 21,787 images). The average performance of all these models is also encapsulated in Table 11 to show their overall performance. Table 11 shows that, after training and five-fold cross-validation, the proposed method outperforms the state-of-the-art models, such as KNN, YOLO, SVM, etc. A confusion matrix for each of the classifiers is pictorially represented in Figure 9, which shows the values of the true positives, true negatives, false positives, and false negatives.
Statistical tests were conducted on a few of the dermoscopic skin lesions from the PH2, ISIC 2017, ISIC 2018, and ISIC 2019 datasets to check whether a lesion can be classified into a specific class based on statistical parameters such as mean, skewness, and kurtosis. The mean value of an image represents the average pixel intensity, which can be an important parameter for classifying a skin lesion as melanoma, because melanoma lesions lie within specific colors ranging from bluish grey to dark brown. A mathematical representation of the mean value is shown in Equation (29), where M and N are the dimensions of the image (I).
The asymmetry of the pixel distribution of a malignant lesion is marked by the skewness of the image, which is mathematically represented in Equation (30), where µ is the mean of the distribution and σ is the standard deviation. The tailedness of a pixel distribution is marked by the kurtosis of an image (represented in Equation (31)). It is an important statistical parameter which shows how often outliers occur.
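The three statistical parameters can be sketched as follows, assuming the standard moment-based definitions: the mean over all M × N pixels, skewness as the third standardized moment, and kurtosis as the fourth standardized moment (not the excess form). Whether Equations (29)–(31) use exactly these normalizations is an assumption, and `image_statistics` is an illustrative name.

```python
import math


def image_statistics(image):
    """Return (mean, skewness, kurtosis) of a 2-D grayscale image.

    image: list of M rows, each with N pixel intensities.
    Assumes population (1/MN) moments and non-excess kurtosis.
    """
    pixels = [float(p) for row in image for p in row]
    count = len(pixels)                      # M * N
    mean = sum(pixels) / count
    variance = sum((p - mean) ** 2 for p in pixels) / count
    sigma = math.sqrt(variance)
    if sigma == 0:
        # Skewness and kurtosis are undefined for a flat image;
        # this sketch reports them as 0.0 by convention.
        return mean, 0.0, 0.0
    skewness = sum((p - mean) ** 3 for p in pixels) / (count * sigma ** 3)
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (count * sigma ** 4)
    return mean, skewness, kurtosis


if __name__ == "__main__":
    toy_image = [[0, 0], [0, 4]]             # made-up 2x2 image
    print(image_statistics(toy_image))
```

A positive skewness, as in the toy image, indicates a pixel distribution with a longer tail toward high intensities.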
A statistical comparison of eight different dermoscopic images from the PH2, ISIC 2017, ISIC 2018, and ISIC 2019 datasets, based on statistical parameters such as mean, skewness, and kurtosis, is tabulated in Table 12. The tabulated values show a higher mean for images which belong to the melanoma class, which is because a melanoma lesion visually appears darker than a non-melanoma lesion. Similarly, a trend towards higher skewness can be noticed for melanoma lesion images, since a melanoma, being asymmetric in shape, shows a more random distribution of pixels (i.e., a higher value of skewness) than its counterpart. It is also noticed that the kurtosis value is high (a sharply peaked distribution) for non-melanoma lesions and lower (a flatter distribution) for melanoma images. There are, however, a few exceptions, such as ISIC_0034412, a non-melanoma image whose mean value is as high as 201.184 because of the visually dark appearance of this non-melanoma lesion. Due to such visual similarities between the two lesion types, it is statistically difficult to classify such a lesion; however, with the help of deep neural networks, these lesions can be classified accurately, as the features are mapped for each lesion.
The mean accuracy obtained using the proposed method was higher than that of the state-of-the-art techniques. However, to assess how significant the difference in accuracy is, we also performed a statistical analysis. The statistical test for all the classifiers based on the test dataset is encapsulated in Table 13. The test dataset contains 4450 dermoscopic images, which are taken from the ISIC 2017, ISIC 2018, and ISIC 2019 repositories. The p-value is calculated using the Friedman test with Bergmann and Hommel's correction. The null hypothesis (H0) states that there is no difference in the mean accuracy of the classifiers when they are evaluated on the test dataset. If the p-value is less than 0.05, H0 is rejected in favor of the alternative hypothesis (Ha), which states that the proposed algorithm outperforms all of the state-of-the-art classifiers, as there exists a significant difference in the mean accuracy. The t-test is performed because the classifiers run over the same dataset, thereby producing a t-value. The performance of each classifier is compared in a pair-wise manner against the proposed method. Hence, the first row of the table contains not-a-number (NaN). The mean rank for each classifier is calculated by ranking the classification performance of the several classifiers (such as KNN, MGSVM, etc.) for each image in the test dataset. The proposed method outperforms the other classifiers on the dermoscopic test images; thereby, its mean rank (sum of ranks for all the images in the test dataset/total number of images in the test dataset) is 1.12. The mean and standard deviation of the confidence value (the probability of a true or false value during classification) for each dermoscopic image in the test dataset are encapsulated under the mean and SD columns in Table 13.
A low SD value indicates a nearly uniform confidence score across all the images in the test dataset, which shows that the hyperparameters are tuned properly, as there is only a minor variance in the classification score. The results in Table 13 clearly depict the statistical superiority of the proposed method over well-known classifiers. The p-value for all the classifiers is very low, which signifies a significant difference in the mean accuracy; moreover, the mean confidence score of the proposed method is 0.86, which is higher than that of any of the compared methods, thereby portraying the efficiency of the proposed method.
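The mean-rank computation described above can be sketched as follows, assuming a table with one row per test image and one column per classifier, where a higher score is better and tied scores share the average rank. The Friedman chi-square statistic shown is the basic form without tie correction, and it does not include the Bergmann and Hommel post hoc correction; the function names and the toy scores are illustrative only.

```python
def average_ranks(scores):
    """Rank one row of scores: best (highest) gets rank 1, ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1   # mean of 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def mean_ranks(score_table):
    """Mean rank per classifier: sum of ranks over images / number of images."""
    n_images = len(score_table)
    n_classifiers = len(score_table[0])
    rank_sums = [0.0] * n_classifiers
    for row in score_table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return [total / n_images for total in rank_sums]


def friedman_statistic(score_table):
    """Friedman chi-square statistic (basic form, no tie correction)."""
    n = len(score_table)
    k = len(score_table[0])
    rank_sums = [m * n for m in mean_ranks(score_table)]
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


if __name__ == "__main__":
    # Made-up confidence scores: 4 images (rows) x 3 classifiers (columns).
    table = [
        [0.90, 0.80, 0.70],
        [0.95, 0.60, 0.70],
        [0.85, 0.80, 0.50],
        [0.90, 0.70, 0.60],
    ]
    print(mean_ranks(table), friedman_statistic(table))
```

A classifier that is ranked first on nearly every image obtains a mean rank close to 1, which is how a value such as 1.12 would arise.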

Conclusions
In this article, an effective pre-processing model was proposed to digitally remove the artifacts present in the acquired image; the image is then enhanced by histogram equalization. After refinement, the image undergoes the segmentation phase, where a mathematical algorithm using thresholding and pentagonal neutrosophy is demonstrated to achieve enhanced segmentation results. Thereafter, the segmented image is classified using the proposed classifier. The model was trained with a large dataset, which was obtained after data augmentation and balancing. The outcomes of the proposed methods were evaluated using publicly accessible datasets: PH2, ISIC 2017, ISIC 2018, and ISIC 2019. A wide range of test parameters showed that the proposed methods, for each stage of CAD, outperformed the state-of-the-art methods. The proposed method had a significantly higher score in sensitivity and specificity for the diagnosis of melanoma lesions. The classification results in Table 10 were obtained with end-to-end deep learning models, i.e., each dataset was augmented (by the proposed augmentation technique), and the training images underwent pre-processing and segmentation before the classification of the skin lesion. Various state-of-the-art classifiers, such as YOLO, KNN, Bayesian networks, etc., were employed on the same augmented and processed data. Our proposed deep neural network outperformed all the state-of-the-art classifiers for the classification of skin lesion images. The importance of the proposed data-augmentation technique, pre-processing, and segmentation is highlighted in Table 3, where the proposed deep neural network is used to classify the skin lesion with the standard data augmentation technique only; it underperformed when compared to the model which employed the proposed techniques for data augmentation, pre-processing, and segmentation.
The sensitivity and specificity scores of the model which employed the proposed techniques seem to be 8-10% higher than the model with standard augmentation.
More pronounced results can be achieved in the near future, when an extensive and varied range of datasets with several classes becomes available, along with modified CAD equipment that increases image quality during image acquisition. A broad range of classifications for various skin diseases (such as vitiligo, alopecia areata, psoriasis, atopic dermatitis, and lamellar ichthyosis) and several other types of skin cancers (such as basal cell carcinoma, squamous cell carcinoma, actinic keratoses, etc.) should be performed in future research. In future research, we also aim to deploy the proposed algorithms in a smartphone application, such that the diagnosis of skin cancer is made easily available without the need for any invasive techniques. Additionally, we aim to propose a smartphone-based dermoscopic tool which will increase the accuracy of diagnosis performed using a smartphone. The digital diagnosis of skin lesions is an extensive area of research with great potential in the near future.