Deep Learning and Machine Learning Techniques of Diagnosis Dermoscopy Images for Early Detection of Skin Diseases

Abstract: With the increasing incidence of severe skin diseases, such as skin cancer, endoscopic medical imaging has become essential for revealing the internal and hidden tissues under the skin. Endoscopy devices provide diagnostic information that helps doctors make an accurate diagnosis. Nonetheless, most skin diseases have similar features, which makes it challenging for dermatologists to diagnose patients accurately. Therefore, machine learning and deep learning techniques can play a critical role in diagnosing dermatoscopy images and in the accurate early detection of skin diseases. In this study, systems for the early detection of skin lesions were developed. The performance of the machine learning and deep learning systems was evaluated on two datasets: the International Skin Imaging Collaboration (ISIC 2018) dataset and the Pedro Hispano (PH2) dataset. First, the proposed system was based on hybrid features extracted by three algorithms: local binary pattern (LBP), gray level co-occurrence matrix (GLCM), and discrete wavelet transform (DWT). These features were then integrated into a feature vector and classified using artificial neural network (ANN) and feedforward neural network (FFNN) classifiers. The FFNN and ANN classifiers achieved superior results compared to the other methods. Accuracy rates of 95.24% for diagnosing the ISIC 2018 dataset and 97.91% for diagnosing the PH2 dataset were achieved using the FFNN algorithm. Second, convolutional neural networks (CNNs), namely the ResNet-50 and AlexNet models, were applied to diagnose skin diseases using the transfer learning method. The ResNet-50 model fared better than AlexNet: accuracy rates of 90% for diagnosing the ISIC 2018 dataset and 95.8% for the PH2 dataset were reached using the ResNet-50 model.


Introduction
The skin is the largest human organ, and it is the outer covering of the body. It is the first line of defense in the human body [1]. It has the role of (i) protecting the internal organs from external environmental influences, (ii) regulating body temperature, (iii) providing immunity against many diseases, and (iv) providing beauty to the body [2]. The skin protects the body from the harmful ultraviolet rays of the sun, although it also produces the essential vitamin D when the body is exposed to sunlight. Skin color (body pigmentation) and moisture (from oily to dry skin) vary from person to person and between the hot and cold regions of the world [3]. Cellular DNA is damaged if the body is exposed to sunlight or ultraviolet rays for a long time, decreasing skin pigmentation and increasing the incidence of malignant skin diseases. Melanoma is a skin cancer that can be fatal if it is not diagnosed early. In the early stages of the disease, it often goes undetected because the cancer cells resemble other skin cells. Abnormal cancerous cells divide rapidly, penetrating the lower skin layers and becoming incurable malignant melanomas [4].

• Extract the important features from each image with the LBP, GLCM, and DWT algorithms, combine the extracted features into one vector to obtain representative features for each image, and diagnose the images using the ANN and FFNN classifiers.
• The capabilities of deep learning networks lie in transferring acquired skills to solve new, relevant problems.
• In this research, the early diagnosis of skin lesions and distinguishing benign images from malignant ones are considered.
• Machine learning algorithms (ANN and FFNN) achieved better results than CNN models (ResNet-50 and AlexNet).
• Machine and deep learning techniques will help medical doctors in the early detection of skin lesions, enhancing the confidence of doctors and reducing the number of biopsies and surgeries.
classification of skin lesions. This method consists of two stages. In the first stage of the encoder-decoder fully convolutional network, complex and heterogeneous features are learned. In the second stage, the coarse texture is extracted at the encoder stage, and the lesion boundaries are extracted during the decoding stage [28]. A High-Frequency approach with a Multilayered Feed-Forward Neural Network (HFaFFNN) was presented to integrate all images and enhance them with a log-opening-based activation function. The pre-trained CNNs Darknet-53 and NasNet-mobile were implemented, and their parameters were tuned for high performance. Finally, a parallel max entropy correlation (PMEC) algorithm was used to fuse the extracted features [29]. Muhammad et al. (2021) presented a two-stage framework for segmentation and classification. For lesion segmentation, a hybrid technique exploited the complementary strengths of two CNNs to produce a region of interest (RoI). To classify lesions, a 30-layer CNN was trained on the HAM10000 dataset. The most important features were selected using the Regula Falsi method; the system achieved a good performance in diagnosing skin lesions [30]. Muhammad et al. (2021) presented a two-way CNN information fusion framework for diagnosing melanoma. In the first stream, image contrast was improved based on fusion, and improved features were then extracted by the skewness-controlled moth-flame optimization method. The second stream used the MobileNetV2 model to extract the features. Finally, the features extracted by the two methods were fused by the new parallel multimax coefficient correlation algorithm. The system achieved superior performance in diagnosing skin cancer [31]. Muhammad et al. (2021) presented a robust system for diagnosing skin lesions through several stages. Firstly, images were enhanced using local color-controlled histogram intensity values (LCcHIV). Secondly, lesions were segmented by a novel Deep Saliency method, and a threshold function was applied to obtain binary images. Thirdly, the most important features were extracted through the improved moth flame optimization (IMFO) algorithm. Finally, the system achieved high performance in diagnosing skin lesions by fusing the features and classifying them using the Kernel Extreme Learning Machine [32].
Previous studies faced many challenges, such as hair, air bubbles, artifacts, and light reflections. Researchers also face the similarity of characteristics between types of diseases, which constitutes a major challenge in diagnosing and distinguishing between diseases. The proposed systems in this study were therefore designed to address these challenges. The images were enhanced by removing artifacts, air bubbles, skin lines, and reflections with two filters applied together, namely the Laplacian and average filters. Furthermore, the DullRazor technique removes hair from images with high accuracy. To handle the similarity of features between some diseases, three algorithms were applied to extract features from each image, and the features extracted by the three methods were combined into one vector per image, so that each disease is described by its representative features. The parameters and weights of two models, ResNet-50 and AlexNet, were also adjusted to extract the deep and representative characteristics of each disease.

Materials and Methodology
The proposed systems were evaluated on two datasets: ISIC 2018 and PH2. Each dataset was collected under different conditions and has different characteristics, and each poses several inherent challenges, the most important of which are (i) isolating the lesion from healthy skin (segmentation), (ii) localizing the features and patterns, and (iii) extracting and classifying the features of each lesion. Addressing these challenges allows the systems to detect skin lesions and to distinguish skin cancer from other kinds of lesions. Figure 1 describes the workflow of the proposed method for diagnosing skin diseases. Images were enhanced and noise was removed using Laplacian and average filters, and hair was removed with the DullRazor technique. Lesion segmentation was performed using the adopted region growth algorithm. Feature extraction was conducted in two ways: traditional and deep learning. In the traditional approach, features were extracted by combining the outputs of three algorithms: LBP, GLCM, and DWT. Deep feature maps were extracted by the CNN models ResNet-50 and AlexNet. The features extracted using traditional methods were classified using ANN and FFNN, while the deep feature maps were classified by the two pre-trained CNNs, ResNet-50 and AlexNet.
Electronics 2021, 10, 3158 5 of 29

Dataset
In this study, the two standard datasets, namely, ISIC 2018 and PH2, were used for diagnosing dermatological diseases, which are explained as follows.

International Skin Imaging Collaboration (ISIC 2018) Dataset
The proposed systems were evaluated using the ISIC 2018 dataset. The criteria for endoscopic devices were assessed to obtain high-resolution images, covering acquisition aspects such as illumination, size, calibration markers, poses, and magnification, as well as the terminology used for diagnoses, lesion site, and morphology. The ISIC 2018 dataset, also known as HAM10000, contains seven imbalanced disease classes. The classes evaluated in this study and the number of images considered for each were as follows: actinic keratoses (AKIEC; n = 200 images), basal cell carcinoma (BCC; n = 200 images), benign keratosis lesions (BKL; n = 200 images), dermatofibroma (DF; n = 100 images), melanoma (MEL; n = 200 images), melanocytic nevi (NV; n = 200 images), and vascular lesions (VASC; n = 100 images). Figure 2 shows samples from the ISIC 2018 dataset for the seven diseases. The ISIC 2018 data used in this study are available from [33].

PH2 Dataset
The proposed systems were evaluated on the PH2 dataset obtained from the Dermatology Service of Hospital Pedro Hispano (Matosinhos, Portugal). All images were obtained under the same conditions and instrumentation resolution. The dataset consists of 200 images divided into three classes: melanocytic nevi (NV; n = 80 images), atypical nevi (n = 80 images), and melanoma (MEL; n = 40 images). Melanocytic nevi are benign tumors and melanoma is a malignant tumor, while atypical nevi have characteristics of benign tumors but may develop into malignant tumors. Figure 3 shows samples from the PH2 dataset. The PH2 data used in this study are available from [34].

Pre-Processing
Pre-processing is the first stage of image processing. This section describes the filters applied to enhance the images and the method used to remove hair from them.

Laplacian and Average Filter Methods
Image enhancement is the first step in image processing. During this process, noisy features in the image, such as hair, air bubbles, skin lines, and reflections due to lighting, are corrected to obtain a clearer image. In this study, Laplacian and average filters were used to remove noise and artifacts, enhance edges, and treat low contrast between lesions and healthy skin. First, an average filter was applied, which smooths the image by reducing the differences between adjacent pixels. The filter was applied to windows of 5 × 5 pixels at a time, and the process continued until the entire image was covered. The value of each pixel in the image was thus replaced with the average of its adjacent values. Equation (1) describes how the average filter works.
$$z(m) = \frac{1}{M} \sum_{i=0}^{M-1} y(m - i) \qquad (1)$$
where z(m) is the filtered output, y(m − i) denotes the current and previous inputs, and M is the number of pixels in the average filter. The Laplacian filter, which is an edge detection filter, was then used. This filter detects the areas of rapid change in the image (e.g., edges of skin lesions). Equation (2) gives the general form of the Laplacian filter.
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \qquad (2)$$
where ∇²f is the second-order derivative of the image f, and x, y represent the coordinates in a 2D matrix. Finally, the image produced by the Laplacian filter is subtracted from the image produced by the average filter to obtain a further improved image.
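The two-filter enhancement described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the helper names (`filter2d`, `enhance`), the 4-neighbor Laplacian kernel, the edge-replicating border handling, and the choice to apply the Laplacian to the already smoothed image are all assumptions, since the text does not specify them.

```python
import numpy as np

def filter2d(image, kernel):
    """Correlate a 2D image with a small kernel (borders replicate edge pixels)."""
    k_rows, k_cols = kernel.shape
    pad_r, pad_c = k_rows // 2, k_cols // 2
    padded = np.pad(image.astype(float), ((pad_r, pad_r), (pad_c, pad_c)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(k_rows):
        for dx in range(k_cols):
            window = padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
            out += kernel[dy, dx] * window
    return out

# 5 x 5 average filter: every pixel becomes the mean of its 25-pixel window.
AVERAGE_5X5 = np.full((5, 5), 1.0 / 25.0)
# Assumed discrete Laplacian (4-neighbor form); other kernels are possible.
LAPLACIAN_4N = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def enhance(image):
    """Average-filter the image, then subtract the Laplacian response from it."""
    smoothed = filter2d(image, AVERAGE_5X5)
    edges = filter2d(smoothed, LAPLACIAN_4N)  # assumption: Laplacian of the smoothed image
    return np.clip(smoothed - edges, 0.0, 255.0)

# Toy example: a flat region has no edges, so it should pass through unchanged.
flat = np.full((8, 8), 100.0)
enhanced_flat = enhance(flat)
```

Subtracting the Laplacian response sharpens transitions such as lesion borders, while the preceding average filter keeps isolated noise from being amplified.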

Hair Removal Technique
Hair is one of the challenges in diagnosing skin lesions. DullRazor is a pre-processing technique that removes hair from the lesion area. Hair in the lesion area confuses both the segmentation methods and the feature extraction algorithms: the extracted features then describe the hair as well as the lesion and are therefore inaccurate. For this reason, the DullRazor technique is applied before segmentation and feature extraction [35]. The following three steps were necessary for hair removal:

• The location of dark hair is determined by morphological closing of the images in the two datasets that contain hair;
• The long, thin structure of the hair is verified, and the verified hair pixels are replaced using bilinear interpolation;
• The new pixels are then smoothed by a median filter.
Figure 4 shows a sample of the dataset images containing hair and the same images after processing with the DullRazor tool, with the hair removed.
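The first of the three steps can be illustrated with a simplified sketch. It is not the DullRazor implementation: the function names, the 3 × 3 square structuring element, and the threshold are hypothetical, and the bilinear interpolation and median smoothing steps are replaced here by simply copying the closed image into the detected hair pixels.

```python
import numpy as np

def window_reduce(image, size, reducer):
    """Apply `reducer` (np.max or np.min) over a size x size sliding window."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    shifted = [padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
               for dy in range(size) for dx in range(size)]
    return reducer(np.stack(shifted), axis=0)

def remove_dark_hair(gray, size=3, threshold=30.0):
    """Detect thin dark structures by morphological closing and fill them in."""
    gray = gray.astype(float)
    dilated = window_reduce(gray, size, np.max)    # dilation removes thin dark lines
    closed = window_reduce(dilated, size, np.min)  # erosion restores larger shapes
    hair_mask = (closed - gray) > threshold        # pixels that closing brightened a lot
    cleaned = np.where(hair_mask, closed, gray)    # simplification: no interpolation step
    return cleaned, hair_mask

# Toy example: bright skin (200) crossed by a one-pixel-wide dark hair (20).
skin = np.full((9, 9), 200.0)
skin[:, 4] = 20.0
cleaned_skin, detected_hair = remove_dark_hair(skin)
```

A 3 × 3 element only removes structures up to two pixels wide; thicker hair would need a larger element, at the cost of also altering small dark lesion details.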

Adopted Region Growth Algorithm (Segmentation)
Dermatoscopy images consist of an affected portion (skin lesion) and a healthy portion. Extracting features from the entire image, including healthy skin, therefore leads to incorrect classification results [36]. Consequently, it is necessary to isolate the lesion region from normal skin. In this study, we used the adopted region growth algorithm, which groups similar pixels into regions. For an image divided into m regions, the following conditions are needed for a successful segmentation. First, the segmentation must be complete. Second, similar pixels must be separated into different groups, and the union of all groups must represent the whole image. Third, the pixels within a region must be connected. Fourth, no pixel may belong to two different regions. The algorithm works on a bottom-up principle: it starts from individual pixels and grows them into regions, each of which contains similar pixels. Each region begins with a single-pixel seed and then grows by absorbing similar neighboring pixels, until the border of the region represents the boundary of the lesion and separates it from healthy skin. Figures 5 and 6 show samples from the ISIC 2018 and PH2 datasets, respectively. These figures show the images after enhancement, hair removal, segmentation (isolating the lesion region from normal skin), and a morphological step that further enhances the images by filling the gaps left after segmentation.
Figure 6. Enhancement and segmentation process for some images from the PH2 dataset.
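The bottom-up growth from a single-pixel seed can be sketched as a breadth-first search. The similarity rule used here (absolute intensity difference from the seed within a fixed `tolerance`), the 4-connectivity, and the seed position are assumptions made for illustration; the text does not state which criterion the adopted algorithm uses.

```python
from collections import deque

import numpy as np

def region_grow(image, seed, tolerance=10.0):
    """Grow one region from `seed`, absorbing 4-connected pixels whose
    intensity is within `tolerance` of the seed intensity."""
    height, width = image.shape
    mask = np.zeros((height, width), dtype=bool)
    seed_value = float(image[seed])
    mask[seed] = True
    queue = deque([seed])
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            r, c = row + d_row, col + d_col
            inside = 0 <= r < height and 0 <= c < width
            if inside and not mask[r, c]:
                if abs(float(image[r, c]) - seed_value) <= tolerance:
                    mask[r, c] = True      # pixel joins the region
                    queue.append((r, c))   # and may recruit its own neighbors
    return mask

# Toy example: a dark 2 x 3 "lesion" (50) on bright "skin" (200).
toy = np.full((6, 6), 200.0)
toy[2:4, 2:5] = 50.0
lesion_mask = region_grow(toy, seed=(2, 2))
```

Because every accepted pixel is reached through a neighbor that is already in the region, the result is connected by construction, and the boolean mask guarantees that a pixel is assigned to the region at most once.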

Feature Extraction
In this study, we combined three feature extraction methods, the LBP, GLCM, and DWT algorithms, to extract the most critical features of skin lesions from the images. The LBP algorithm is one of the simplest and most effective feature extraction algorithms, and it describes two-dimensional textures [37]. The algorithm first selects a central (target) pixel. Then, a 3 × 3 frame of neighboring pixels is selected around each central pixel; the size of this frame is set by the radius parameter R, which determines the number of adjacent pixels. Equation (3) describes how the center pixel is compared with its adjacent pixels and how the resulting value replaces the center pixel.
$$LBP_{R,P} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p \qquad (3)$$
where g_c is the center pixel, g_p is the neighboring pixel, R is the radius around the central pixel, and P is the number of neighbors. The binary threshold function s(x) is defined in Equation (4) as follows:
$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \qquad (4)$$
After this first step, the method continues for all pixels of the image. A total of 203 features were extracted for each image by the LBP algorithm.
The internal structure of the lesion area was represented in gray levels using the GLCM algorithm, which extracts texture features from a region of interest. The lesion area has both smooth and rough textures: smooth areas have pixel values close to each other, while rough areas have widely differing pixel values. Texture metrics were calculated from spatial and statistical information. The spatial information specifies the position of one pixel relative to another in terms of a distance d and a direction θ. There were four values of θ: 0°, 45°, 90°, and 135°. Additionally, d = 1 when θ is horizontal or vertical (θ = 0° or θ = 90°), and d = √2 when θ is diagonal (θ = 45° or θ = 135°). A total of 13 features were extracted for each image, including contrast, energy, mean, entropy, correlation, kurtosis, standard deviation, smoothness, homogeneity, RMS, skewness, and variance.
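As a rough illustration of the two texture descriptors, the sketch below computes 8-neighbor LBP codes (threshold rule: a neighbor contributes a 1 bit when it is greater than or equal to the center) and a normalized co-occurrence matrix with three of the listed metrics. The function names, the bit ordering of the neighbors, and the particular formulas for energy and homogeneity are assumptions; how the 203 LBP features and 13 GLCM features of the study were actually derived is not stated in the text.

```python
import numpy as np

# Offsets of the P = 8 neighbors at R = 1, clockwise from the top-left (assumed order).
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_codes(image):
    """LBP code of every interior pixel: sum over p of s(g_p - g_c) * 2**p."""
    img = image.astype(int)
    rows, cols = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=int)
    for p, (dy, dx) in enumerate(NEIGHBOR_OFFSETS):
        neighbor = img[1 + dy:rows - 1 + dy, 1 + dx:cols - 1 + dx]
        codes += (neighbor >= center).astype(int) << p
    return codes

def glcm(image, d_row, d_col, levels):
    """Normalized gray level co-occurrence matrix for one pixel offset.
    Offsets: 0 deg -> (0, 1), 45 deg -> (-1, 1), 90 deg -> (-1, 0), 135 deg -> (-1, -1)."""
    matrix = np.zeros((levels, levels), dtype=float)
    height, width = image.shape
    for r in range(height):
        for c in range(width):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < height and 0 <= c2 < width:
                matrix[image[r, c], image[r2, c2]] += 1
    return matrix / matrix.sum()

def glcm_metrics(p):
    """Three texture metrics of a normalized co-occurrence matrix p."""
    i, j = np.indices(p.shape)
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),  # assumed: sum of squared entries
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
    }

# Toy 4-level image; real inputs would be the segmented lesion region.
tile = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1],
                 [0, 2, 2, 2],
                 [2, 2, 3, 3]])
lbp_histogram = np.bincount(lbp_codes(tile).ravel(), minlength=256)
horizontal = glcm(tile, 0, 1, levels=4)
texture = glcm_metrics(horizontal)
```

A histogram of the LBP codes is one common way to turn the per-pixel codes into a fixed-length feature vector; the metrics of the four directional matrices are usually averaged or concatenated.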
Four features were extracted from each image using the DWT method, in which the input signal is decomposed into two signals with different frequencies using quadrature mirror filters. These two signals correspond to the low- and high-pass filters. Approximation coefficients are produced by the low-pass (LL) filter, while detail coefficients (horizontal, vertical, and diagonal) are produced by the high-pass filters (LH, HL, and HH). A total of 220 hybrid features were extracted using the three algorithms (203 from LBP, 13 from GLCM, and 4 from DWT). These features were combined into one vector for each image, and the resulting vector was fed to the classification stage to train the classifier. Figure 7 describes the hybrid process for feature extraction.
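A single-level 2D wavelet decomposition and the fusion into one 220-element vector might look like the following sketch. The Haar wavelet, the use of the mean absolute coefficient of each sub-band as its feature, and the placeholder LBP and GLCM vectors are all assumptions made for illustration; the text does not say which wavelet or which per-band statistic was used.

```python
import numpy as np

def haar_dwt2(image):
    """One level of a 2D Haar DWT; image height and width must be even.
    Returns the LL, LH, HL, and HH sub-bands (naming conventions vary)."""
    img = image.astype(float)
    root2 = np.sqrt(2.0)
    # Filter along the columns of each row: low-pass = sum, high-pass = difference.
    low = (img[:, 0::2] + img[:, 1::2]) / root2
    high = (img[:, 0::2] - img[:, 1::2]) / root2
    # Filter the two results along the rows.
    ll = (low[0::2, :] + low[1::2, :]) / root2
    lh = (low[0::2, :] - low[1::2, :]) / root2
    hl = (high[0::2, :] + high[1::2, :]) / root2
    hh = (high[0::2, :] - high[1::2, :]) / root2
    return ll, lh, hl, hh

def dwt_features(image):
    """Four features: mean absolute coefficient of each sub-band (assumed statistic)."""
    return np.array([np.mean(np.abs(band)) for band in haar_dwt2(image)])

def fuse_features(lbp_vector, glcm_vector, dwt_vector):
    """Concatenate the three feature groups into one hybrid vector."""
    return np.concatenate([lbp_vector, glcm_vector, dwt_vector])

# Toy example with placeholder LBP (203) and GLCM (13) vectors.
flat_patch = np.full((4, 4), 3.0)
ll_band, lh_band, hl_band, hh_band = haar_dwt2(flat_patch)
hybrid = fuse_features(np.zeros(203), np.zeros(13), dwt_features(flat_patch))
```

In practice the three groups have very different numeric ranges, so normalizing the fused vector before training is a common precaution.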

Classification
In this section, the ISIC 2018 and PH2 datasets were evaluated using two traditional classification algorithms (ANN and FFNN) and two convolutional neural network (CNN) models (ResNet-50 and AlexNet) for diagnosing skin diseases.

ANN and FFNN
ANN is a type of soft computing neural network. It is a group of layers consisting of interconnected neurons. It has a superior ability to interpret and analyze complex data and to produce clear and explanatory patterns. The ANN also minimizes the error between the actual and predicted probabilities [38]. Information is propagated between neurons and stored in the connections between them, called weights. The objective of the ANN is to update the weights w to obtain the minimum square error between the actual output x and the predicted output y, as given by the mean square error (MSE) in Equation (5):
$$MSE = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2 \qquad (5)$$
where x_i is the actual output, y_i is the predicted output, and n is the number of data points. The ANN algorithm was evaluated on the ISIC 2018 and PH2 datasets for diagnosing skin diseases. A model was trained on 220 features through 10 hidden layers between the input and output layers. Figure 8a shows the architecture of the ANN algorithm for the ISIC 2018 dataset, in which seven classes were produced. Figure 8b shows the architecture of the ANN algorithm for the PH2 dataset, in which three classes were produced.
The FFNN algorithm is similar to the ANN algorithm in solving complex computational problems. The hidden layer neurons are interconnected by weights w, and information between neurons is fed in the forward direction. The output of each neuron is obtained from the output of the previous neuron multiplied by the weight associated with it [39]. The weights were updated in the forward direction from the hidden layer to the output layer. In each iteration, the weights were updated until the minimum squared error was obtained between the expected and actual output. The ANN and FFNN algorithms were selected according to several criteria. These algorithms are among the best machine learning algorithms and are distinguished from the rest by the following: (1) they contain many layers, namely an input layer that receives the features (220 features in this study), many hidden layers (10 hidden layers in this study), and an output layer (seven neurons for the ISIC 2018 dataset or three neurons for the PH2 dataset); (2) their neurons are interconnected; (3) weights connect each neuron with the other neurons; and (4) the mean square error compares the actual and predicted outputs, and the weights are adjusted repeatedly until the lowest error between the predicted and actual outputs is obtained [40].
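The forward flow of information and the error measure can be made concrete with a small sketch. The layer sizes follow the text (220 inputs, 10 hidden layers, seven outputs for ISIC 2018), but the hidden width of 32 units, the tanh activation, and the random initialization are hypothetical choices, and no training loop is shown.

```python
import numpy as np

def init_layers(sizes, rng):
    """Create one (weights, bias) pair per connection between consecutive layers."""
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(features, layers):
    """Feed the features forward: each layer multiplies the previous output
    by its weights, adds a bias, and applies tanh (except the output layer)."""
    activation = features
    for index, (weights, bias) in enumerate(layers):
        z = activation @ weights + bias
        activation = z if index == len(layers) - 1 else np.tanh(z)
    return activation

def mean_square_error(actual, predicted):
    """MSE between the actual outputs x and the predicted outputs y."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

rng = np.random.default_rng(0)
# 220 input features, 10 hidden layers (assumed width 32), 7 output classes.
layer_sizes = [220] + [32] * 10 + [7]
network = init_layers(layer_sizes, rng)

feature_vector = rng.random(220)   # stand-in for one hybrid feature vector
scores = forward(feature_vector, network)
target = np.zeros(7)
target[4] = 1.0                    # hypothetical one-hot label
error = mean_square_error(target, scores)
```

Training would repeatedly adjust the weights to reduce `error` over all training images; for the PH2 dataset the last entry of `layer_sizes` would be 3 instead of 7.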

Convolutional Neural Networks (CNNs)
CNNs are deep learning methods used in many areas, including signal processing and image processing, to recognize patterns, classify objects, and detect regions of interest [41]. In this study, the two datasets, ISIC 2018 and PH2, were evaluated on ResNet-50 and AlexNet for diagnosing skin diseases. Several CNN structures for diagnosing skin lesions have been established; they differ in their layers, training steps, activation functions, and learning rates. The most important layers in a CNN are the convolutional layers, the max- and average-pooling layers, the fully connected layer, and the activation functions [42].
When an image is input into the CNN structure, it is represented as image height × image width. After the image passes through the convolutional layers, the feature map also has a depth and is represented as image height × image width × image depth. Filter size, stride, and zero padding are the most critical parameters affecting the performance of the convolutional layers. A convolutional layer slides a filter of the given size over the image, learns the filter weights during the training phase, processes the input, and passes the result to the next layer.
Zero padding is the process of surrounding the input with zeros to maintain the size of the output. When the zero padding is one, a row and a column of zeros are added around each edge. The output of each layer is the input to the next layer, and its size is calculated according to Equation (6) as follows:
$$O = \frac{W - K + 2P}{S} + 1 \qquad (6)$$
where O is the output size, W represents the size of the input, K represents the filter size in a convolutional layer, P represents the padding, and S represents the stride. Rectified linear unit (ReLU) layers were also used after the convolutional layers. The purpose of ReLU is to pass positive values unchanged and to suppress negative values by converting them to zero [43]. Equation (7) shows how a ReLU layer works.
$$ReLU(x) = \max(0, x) \qquad (7)$$
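The output-size relation and the ReLU rule can be checked with a few lines of code. The function names are hypothetical, and rounding down when the stride does not divide evenly is an assumption that matches common deep learning frameworks rather than something stated in the text. The two examples use the well-known first-layer settings of ResNet-50 (224-pixel input, 7 × 7 filter, padding 3, stride 2) and AlexNet (227-pixel input, 11 × 11 filter, no padding, stride 4).

```python
def conv_output_size(input_size, filter_size, padding, stride):
    """Output size of a convolutional layer: (W - K + 2P) / S + 1, rounded down."""
    return (input_size - filter_size + 2 * padding) // stride + 1

def relu(value):
    """Pass positive values unchanged and map negative values to zero."""
    return max(0.0, value)

# First convolutional layer of ResNet-50.
resnet_first = conv_output_size(224, 7, padding=3, stride=2)
# First convolutional layer of AlexNet.
alexnet_first = conv_output_size(227, 11, padding=0, stride=4)
# A 3 x 3 filter with padding 1 and stride 1 preserves the input size.
same_size = conv_output_size(56, 3, padding=1, stride=1)

activations = [relu(v) for v in (-2.0, 0.0, 3.5)]
```

With these settings `resnet_first` is 112, `alexnet_first` is 55, and `same_size` stays at 56, which shows why a padding of one is paired with 3 × 3 filters when the feature-map size should be maintained.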
The pooling layer reduced the dimensions of the feature maps by grouping many neurons and representing them by a single neuron, using either the maximum or the average method (a max-pooling or average-pooling layer, respectively). The max method selected the maximum value of each group of neurons, and the average method took their average value. CNNs have millions of parameters, which makes them prone to overfitting. Overfitting was therefore mitigated by a dropout layer, which switched off 50% of the neurons in each iteration while the remaining neurons stayed active. However, this technique roughly doubled the training time. In the fully connected layers, the last layers of a CNN, each neuron was connected to all neurons of the previous layer. Feature maps were flattened into one-dimensional vectors, and the fully connected layer assigned each image to its class. Because of these dense connections, the network takes a long time during the training and testing phases; several fully connected layers can be used in the same network.
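The pooling and dropout operations described above can be sketched as follows. The non-overlapping 2 × 2 window and the rescaling of the surviving activations ("inverted" dropout, a common variant) are assumptions made for illustration; the function names are hypothetical.

```python
import random


def pool2d(fmap, size=2, mode="max"):
    """Reduce a 2-D feature map with non-overlapping size x size windows."""
    pooled = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[i + a][j + b] for a in range(size) for b in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled


def dropout(activations, rate=0.5, rng=random):
    """Zero each activation with probability `rate`; rescale the ones kept."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
max_pooled = pool2d(fmap, mode="max")
avg_pooled = pool2d(fmap, mode="average")
dropped = dropout([1.0] * 100, rate=0.5, rng=random.Random(0))
```

Dropout is applied only during training; at test time all neurons are active.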
Softmax is the activation function used in the last stage of the CNN model. It is nonlinear and is used for multiclass classification. Equation (8) describes the softmax function, which produced seven classes for the ISIC 2018 dataset and three classes for the PH2 dataset:
$$ y(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}} \qquad (8) $$
where y is the output of softmax, x_i is the i-th input to the softmax layer, and n is the total number of outputs. Two CNN models, ResNet-50 and AlexNet, were implemented in this work using transfer learning.
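A minimal sketch of the softmax function for a seven-class output is given below. Subtracting the maximum before exponentiating is a standard numerical-stability step and does not change the result; the input scores are made-up placeholders.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to one."""
    m = max(scores)  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1, 0.0, -1.0, 0.5, 0.3]  # one score per ISIC 2018 class
probs = softmax(scores)
predicted_class = probs.index(max(probs))       # index of the most likely class
```

The predicted class is the one with the largest probability, so the ordering of the raw scores is preserved.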

ResNet-50 Model
The ResNet-50 model contained 16 blocks with 177 layers, comprising 49 convolutional layers, ReLU, batch normalization, one max-pooling layer, one average-pooling layer, one fully connected layer, and the softmax function. Seven and three classes were produced by the softmax function for the ISIC 2018 and PH2 datasets, respectively. ResNet-50 also contains 23.9 million parameters [44]. Figure 9 describes the basic architecture of ResNet-50 for diagnosing the ISIC 2018 dataset. Table 1 describes the number of layers, the size of each filter, and the parameters of the ResNet-50 model.

AlexNet Model
The AlexNet model contained 25 layers, comprising five convolutional layers, three max-pooling layers, three fully connected layers, a classification-output layer, two dropout layers, and a softmax activation function. Seven and three classes were produced for the ISIC 2018 and PH2 datasets, respectively [45]. AlexNet contained 62 million parameters, 630 million connections, and 650,000 neurons. Figure 10 describes the basic architecture of AlexNet for diagnosing the ISIC 2018 dataset. The number of layers, the size of each filter, and the parameters of the AlexNet model are described in Table 2.
CNNs have many layers that extract feature maps from the input images. Transfer learning was applied to carry the experience gained from pre-training over to the new tasks on the ISIC 2018 and PH2 datasets. With this technique, the knowledge that the CNNs acquired when they were trained on more than a million images spanning about a thousand classes is retained and transferred to solve new problems related to the classification of skin diseases. We aimed to use CNNs for diagnosing skin diseases and to compare the results with those of the traditional neural networks, ANN and FFNN.

Splitting Dataset and Environment Setup
The proposed systems were evaluated on the ISIC 2018 and PH2 datasets. The division of the two datasets is described in Table 3. First, the ISIC 2018 dataset, which contained 1200 images divided into seven diseases, was divided into 80% (960 images) for training and validation (80% and 20%; 768 and 192 images, respectively) and 20% (240 images) for testing. Then, the PH2 dataset, which contained 200 images divided into three diseases, was divided into 80% (160 images) for training and validation (80% and 20%; 128 and 32 images, respectively) and 20% (40 images) for testing.
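The nested 80/20 split can be verified with simple integer arithmetic. The sketch below only reproduces the image counts; it does not shuffle or stratify the data, and the function name `split_counts` is hypothetical.

```python
def split_counts(n_images):
    """Return (train, validation, test) counts for a nested 80/20 split."""
    train_val = n_images * 80 // 100  # 80% for training plus validation
    test = n_images - train_val       # remaining 20% for testing
    train = train_val * 80 // 100     # 80% of the training/validation part
    validation = train_val - train    # remaining 20% of that part
    return train, validation, test


isic_split = split_counts(1200)  # ISIC 2018 subset used in this study
ph2_split = split_counts(200)    # PH2 dataset
```

For 1200 images this gives 768 training, 192 validation, and 240 test images; for 200 images it gives 128, 32, and 40.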
All proposed systems were implemented in MATLAB 2018b on a machine with an Intel® i5 processor, 8 GB of RAM, and a 4 GB NVIDIA GeForce 940MX GPU.

Evaluation Metrics
The performance of the machine learning algorithms (ANN and FFNN) and the deep learning models (ResNet-50 and AlexNet) on the ISIC 2018 and PH2 datasets was assessed with the following statistical measures: accuracy, precision, sensitivity, specificity, and AUC. The following equations describe the evaluation of the proposed systems through a confusion matrix that contains all correctly classified cases (TP and TN) and incorrectly classified cases (FP and FN) [46]:
$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% $$
$$ \text{Precision} = \frac{TP}{TP + FP} \times 100\% $$
$$ \text{Sensitivity} = \frac{TP}{TP + FN} \times 100\% $$
$$ \text{Specificity} = \frac{TN}{TN + FP} \times 100\% $$
where TP represents a diseased case (skin disease) that was correctly classified as diseased, TN represents a normal case that was correctly classified as normal, FN represents a diseased case classified as normal, and FP represents a normal case classified as skin disease.
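The four confusion-matrix measures can be computed directly from the counts, as in the sketch below. The counts used in the example are invented for illustration and are not results from this study; values are returned as fractions rather than percentages.

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, sensitivity, and specificity from counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # true-positive rate (recall)
        "specificity": tn / (tn + fp),  # true-negative rate
    }


# Illustrative counts only: 100 diseased and 100 normal cases.
metrics = binary_metrics(tp=90, tn=80, fp=20, fn=10)
```

With these counts, the accuracy is 0.85, the sensitivity 0.90, and the specificity 0.80; the precision is 90/110.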

Results of the ANN and FFNN Algorithms
Traditional neural networks are good tools for medical image diagnosis. The neural network process consists of training and validation phases, followed by a testing phase that assesses the performance of the system on new samples. Figure 11 describes the training process for the ANN and FFNN algorithms. These algorithms consisted of an input layer with 220 neurons (the number of features extracted in the previous stage), 10 hidden layers in which all the computations were performed, and an output layer containing seven classes for the ISIC 2018 dataset and three classes for the PH2 dataset.

Performance Analysis
The performance of the algorithms was analyzed by calculating the cross-entropy loss and the least-square error between the expected and actual output. Figure 12 describes the errors during the training, validation, and testing phases. The performance of the ANN algorithm for the ISIC 2018 dataset is shown in Figure 12a; the best performance, a value of 0.058987, occurred at epoch 48. Figure 12b shows the performance of the ANN algorithm for the PH2 dataset; the best performance, a value of 0.020612, occurred at epoch 22. The training stage is indicated in blue, the validation stage in green, and the testing stage in red, and the best performance is marked by the crossed lines. The error on the training data continued to decrease as the number of epochs increased. Training was stopped when the validation error stopped improving, and the weights were fixed at the values reached at that point.

Gradient
Figure 13 describes the gradient and validation-check values. Figure 13a shows the gradient of the ANN algorithm for the ISIC 2018 dataset: a gradient of 0.007904 was reached at epoch 54, where the number of validation checks reached six.
Figure 13b shows the gradient of the ANN algorithm for the PH2 dataset: a gradient of 0.28647 was reached at epoch 28, where the number of validation checks reached six.
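The relationship between the best epoch and the stopping epoch can be illustrated with a minimal early-stopping sketch. It assumes that the "validation value of six" denotes six consecutive epochs without improvement in the validation error, which is an interpretation of the text; the function name and the synthetic error curve are hypothetical.

```python
def early_stop(val_errors, max_fail=6):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors."""
    best_error = float("inf")
    best_epoch = 0
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_error, best_epoch, fails = err, epoch, 0
        else:
            fails += 1  # one more validation check without improvement
            if fails >= max_fail:
                return best_epoch, epoch
    return best_epoch, len(val_errors) - 1


# Synthetic curve: the error falls until epoch 48, then rises again.
val_errors = [1.0 - 0.01 * i for i in range(49)] + [0.6 + 0.01 * i for i in range(1, 20)]
best_epoch, stop_epoch = early_stop(val_errors)
```

Under this assumption, a best epoch of 48 leads to stopping at epoch 54, and a best epoch of 22 to stopping at epoch 28, which matches the epochs reported in Figures 12 and 13.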

Regression
Regression is a method for predicting continuous variable(s) based on the values of other variable(s). The closer R is to 1.0, the closer the relationship between the actual and predicted outputs. Figure 14 shows the regression for the ISIC 2018 dataset using the FFNN algorithm. The relationship between the actual and predicted outputs was 92.35% during the training phase, 75.85% during the validation phase, and 78.18% during the testing phase. The overall relationship was 87.30%.
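Assuming that R denotes the Pearson correlation coefficient between the target and predicted outputs (the text does not define it explicitly), it can be computed as in the following sketch. The sample values are placeholders, not data from the study.

```python
import math


def pearson_r(targets, outputs):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)


r_perfect = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
r_inverse = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

A value near 1.0 means the predicted outputs track the targets closely; values near 0 indicate little linear relationship.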

Confusion Matrix
System performance outcomes were represented by a confusion matrix, from which the performance metrics were extracted. All correctly classified samples (TP and TN) and incorrectly classified samples (FP and FN) are displayed in the confusion matrix. This section contains the confusion matrices of the ANN algorithm for the ISIC 2018 and PH2 datasets. The confusion matrix for the ISIC 2018 dataset is shown in Figure 15, where the disease classes are represented as follows: MEL (class 1), VASC (class 2), DF (class 3), NV (class 4), AKIEC (class 5), BCC (class 6), and BKL (class 7). In Figure 16, which corresponds to the PH2 dataset, the disease classes are represented as follows: MEL (class 1), benign disease (class 2), and atypical disease (class 3).
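For a multiclass confusion matrix, the TP, TN, FP, and FN counts of each class are usually obtained one class at a time, treating all other classes as "negative". The sketch below assumes that rows are true classes and columns are predicted classes; the 3 × 3 matrix is an invented example, not the matrix of Figure 16.

```python
def one_vs_rest_counts(cm):
    """Derive per-class TP, FN, FP, TN from a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    counts = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                 # class k samples predicted as other classes
        fp = sum(row[k] for row in cm) - tp  # other classes predicted as class k
        tn = total - tp - fn - fp
        counts.append({"TP": tp, "FN": fn, "FP": fp, "TN": tn})
    return counts


def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


cm = [[5, 1, 0],
      [2, 6, 1],
      [0, 0, 8]]
per_class = one_vs_rest_counts(cm)
accuracy = overall_accuracy(cm)
```

Per-class sensitivity and specificity then follow from these counts in the same way as in the binary case.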
Figure 15 and Table 4 describe the results of the ANN algorithm for the ISIC 2018 dataset during the training, validation, and testing phases, together with the overall results. The algorithm achieved an overall accuracy of 95.3%; the accuracy rates during the training, validation, and testing phases were 98.8%, 88.5%, and 88.5%, respectively. Figure 16 describes the corresponding results of the ANN algorithm for the PH2 dataset. The algorithm reached an overall accuracy of 97%, with accuracies of 98.5%, 93.9%, and 93.9% during the training, validation, and testing phases, respectively.
Table 4 summarizes the performance of the ANN and FFNN algorithms in detecting skin diseases on the ISIC 2018 and PH2 datasets. The ANN algorithm was superior to FFNN in diagnosing diseases in the ISIC 2018 dataset: the ANN algorithm reached an accuracy of 95.3%, precision of 94.63%, sensitivity of 99.18%, specificity of 94.87%, and AUC of 96.93%, whereas the FFNN algorithm reached an accuracy of 95.24%, precision of 91.69%, sensitivity of 98.26%, specificity of 92.21%, and AUC of 95.03%. On the PH2 dataset, the FFNN algorithm outperformed the ANN algorithm: the FFNN algorithm reached an accuracy of 97.91%, precision of 97.09%, sensitivity of 98.68%, specificity of 97.14%, and AUC of 97.89%, whereas the ANN algorithm achieved an accuracy of 97%, precision of 95.97%, sensitivity of 98.45%, specificity of 96.07%, and AUC of 97.23%.

Receiver Operating Characteristic (ROC)
The receiver operating characteristic (ROC) curve summarizes the performance of a system in diagnosing a dataset during the training, validation, and testing phases. Each colored line is the curve for one disease class. The x-axis shows the false-positive rate (FPR), which equals 1 − specificity, and the y-axis shows the true-positive rate (TPR), known as sensitivity. The ROC curves for the seven disease classes of the ISIC 2018 dataset during the training, validation, and testing phases are shown in Figure 17a, and the ROC curves for the three disease classes of the PH2 dataset are shown in Figure 17b. Averaged over the seven classes, the area under the ROC curve (AUC) reached 96.93% for the ISIC 2018 dataset. Averaged over the three classes, it reached 97.23% for the PH2 dataset.
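An ROC curve and its AUC can be computed for one class by sweeping a decision threshold over the classifier scores, as in the sketch below. The labels and scores are invented for illustration; the function names are hypothetical, and the area is obtained with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


labels = [1, 1, 0, 1, 0, 0]              # 1 = diseased, 0 = not this class
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # classifier confidence for the class
curve = roc_points(labels, scores)
area = auc(curve)
```

For a multiclass problem, one such curve is drawn per class (that class against all others), and the per-class AUC values can be averaged.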

Results of the CNN Models
This section evaluates the performance of two CNN models, ResNet-50 and AlexNet, in classifying the images of the ISIC 2018 and PH2 datasets for the early detection of skin diseases. The two datasets were divided into 80% for training and validation (80% and 20%, respectively) and 20% for testing. The image size was standardized to obtain robust results: the ResNet-50 model took input images of 224 × 224 × 3 pixels, whereas the AlexNet model took input images of 227 × 227 × 3 pixels. The outputs of the ResNet-50 and AlexNet models were determined by the softmax activation function, which produced seven and three disease classes for the ISIC 2018 and PH2 datasets, respectively. The performance of the ResNet-50 and AlexNet models depended on the parameters in each layer. For example, network performance depended on the filter size, stride, and padding of the convolutional layers. The extracted feature maps differed from layer to layer depending on the filters used. Classification accuracy was also affected by the size of the pooling layers. The tuning of the CNN models with respect to the optimizer, learning rate, maximum number of epochs, mini-batch size, training time, and validation frequency is given in Table 5. The proposed models were evaluated on the ISIC 2018 dataset, which contained 1200 images divided into seven classes, and the PH2 dataset, which contained 200 images divided into three classes. The results of ResNet-50 were better than those of AlexNet, as shown in Table 6. Thus, the ResNet-50 model contributes substantially to diagnostic accuracy in the early detection of skin diseases. Both models performed better on images from the PH2 dataset than on images from the ISIC 2018 dataset.
With the ISIC 2018 dataset, the ResNet-50 model reached an accuracy of 90%, precision of 91.43%, sensitivity of 89.37%, specificity of 97.84%, and AUC of 85%. With the PH2 dataset, it reached an accuracy of 95.8%, precision of 96.33%, sensitivity of 95.64%, specificity of 98.21%, and AUC of 100%. With the ISIC 2018 dataset, the AlexNet model reached an accuracy of 85.3%, precision of 85.42%, sensitivity of 84.43%, specificity of 97.71%, and AUC of 96.81%. With the PH2 dataset, it reached an accuracy of 91.7%, precision of 92.66%, sensitivity of 91.66%, specificity of 96%, and AUC of 100%.
The confusion matrices of the ResNet-50 model for the early detection of skin diseases using the ISIC 2018 and PH2 datasets are shown in Figure 18. For the ISIC 2018 dataset, the diagnostic accuracy reached by the ResNet-50 model in each disease class was 100% for NV and MEL images, 87.5% for BKL images, 85% for BCC, AKIEC, and VASC images, and 80% for DF images. For the PH2 dataset, the diagnostic accuracy reached by the ResNet-50 model was 100% for MEL images and 85.7% for images of benign and atypical diseases. The confusion matrices of the AlexNet model for the early detection of skin diseases using the ISIC 2018 and PH2 datasets are shown in Figure 19. For the ISIC 2018 dataset, the diagnostic accuracy reached by the AlexNet model in each disease class was 100% for NV images, 98.3% for MEL images, 83.3% for BKL images, 76.7% for BCC images, 73.3% for AKIEC images, 90% for VASC images, and 70% for DF images. For the PH2 dataset, the AlexNet model achieved a diagnostic accuracy of 100% for MEL images and 87.5% for benign and atypical disease images.
The ROC curves and AUC values of the ResNet-50 model for the ISIC 2018 and PH2 datasets are shown in Figure 20. The closer the curve comes to the top-left corner, forming a right angle, the better the result, because the AUC is then close to 100%. The ResNet-50 model reached AUC values of 85% and 100% for the ISIC 2018 and PH2 datasets, respectively. The ROC curves and AUC values of the AlexNet model for the ISIC 2018 and PH2 datasets are shown in Figure 21. The AlexNet model obtained AUC values of 96.81% and 100% for the ISIC 2018 and PH2 datasets, respectively.

Discussion of the Performance of the Proposed Systems
In this study, systems were developed using artificial intelligence techniques (machine learning and deep learning) to diagnose images of the ISIC 2018 and PH2 datasets for the early detection of skin diseases. Each dataset was divided into 80% for the training and validation phases (80% and 20%, respectively) and 20% for the testing phase. First, a machine learning system was developed based on segmentation methods, which separate the lesion area from healthy skin, and on hybrid features extracted using three algorithms: LBP, GLCM, and DWT. Together, the three methods produced 220 features, which were fed into the ANN and FFNN algorithms and processed by 10 hidden layers to diagnose skin diseases. The two algorithms produced seven classes for the ISIC 2018 dataset and three classes for the PH2 dataset. The FFNN algorithm achieved accuracies of 95.24% for the ISIC 2018 dataset and 97.91% for the PH2 dataset, the latter being the highest accuracy on PH2. The ANN algorithm achieved accuracies of 95.3% for the ISIC 2018 dataset and 97% for the PH2 dataset. Second, two deep learning models, ResNet-50 and AlexNet, were used to diagnose the ISIC 2018 and PH2 datasets for the early detection of skin diseases. The ResNet-50 model performed better than the AlexNet model on both datasets. The ResNet-50 model achieved accuracies of 90% for the ISIC 2018 dataset and 95.8% for the PH2 dataset, whereas the AlexNet model obtained 85.3% for the ISIC 2018 dataset and 91.7% for the PH2 dataset. From these results, we concluded that the ANN and FFNN algorithms gave more accurate results than the CNN models.
In the case of the ANN and FFNN algorithms, we used preprocessing and hair removal techniques, which helped us obtain more accurate results. Nevertheless, the remaining proposed systems also gave accurate results for the early detection of skin diseases.
The accuracy rates reached by each system in diagnosing each disease are described in Table 7. In the ISIC 2018 dataset, the best accuracy for diagnosing AKIEC (90%) was reached by the ANN classifier. The best accuracy for diagnosing BCC (100%) and BKL (100%) was obtained by the FFNN classifier. The ANN classifier had the best accuracy (92%) for diagnosing DF. The best accuracy for diagnosing MEL (100%) was reached using the ResNet-50 model. The highest accuracy for diagnosing NV (100%) was reached by the ResNet-50 and AlexNet models. Finally, the best accuracy for diagnosing VASC cases (92%) was reached by the ANN classifier. Regarding the PH2 dataset, the best accuracy for MEL diagnosis (100%) was reached by ResNet-50 and AlexNet. The best accuracy for NV diagnosis (95%) was reached by the ANN classifier. Finally, the best accuracy for diagnosing atypical disease cases (100%) was achieved using the ANN and FFNN classifiers. The performance comparison of the proposed systems for diagnosing skin diseases at the level of each disease is shown in Figure 22.
Figure 22. Performance comparison of the proposed systems for the early diagnosis of each skin disease analyzed in this study.

Comparison with Related Studies
Table 8 compares the performance of the proposed systems with that of previous related studies. The proposed methods show better diagnostic performance than the previous related works. The performance was compared in terms of accuracy, precision, sensitivity, specificity, and AUC; the table shows that the proposed systems report all of these measures, whereas the other studies lack some of them. All previous systems achieved an accuracy between 60% and 89.3%, while the proposed ANN system achieved 95.3% and ResNet-50 achieved 90%. Regarding sensitivity, the previous systems achieved rates between 37.6% and 88.24%, while the proposed ANN system achieved 99.18% and ResNet-50 achieved 89.37%. As for specificity, the previous systems achieved rates between 81% and 95.4%, while the proposed ResNet-50 model achieved 97.84%. Figure 23 compares the performance of the proposed systems with that of some related previous studies.
Figure 23. Comparing the performance of the proposed systems with some relevant previous studies.

Conclusions
Skin diseases are spreading nowadays in many countries due to long-term exposure to the sun and weather changes. Many skin diseases must be diagnosed and treated early to avoid severe consequences for the health of affected individuals. Melanoma (skin cancer) is considered one of the most dangerous types of skin disease, and it must be diagnosed before it penetrates the internal tissues of the skin and spreads from one place to another in the body. In this work, we developed diagnostic systems based on artificial intelligence to diagnose the images of two standard datasets, ISIC 2018 and PH2, for the early detection of skin diseases. The images in the two datasets were divided into 80% for training and validation (80% and 20%, respectively) and 20% for testing. In the first step, ANN and FFNN algorithms were implemented to diagnose the features extracted by hybrid methods (LBP, GLCM, and DWT). The features of the three methods were combined in a feature matrix so that each vector (image) contained 220 essential features representing the disease types. In the second step, two CNN models, ResNet-50 and AlexNet, were implemented based on transfer learning. The results obtained with the traditional neural networks (ANN and FFNN) were compared with those of the CNN models (ResNet-50 and AlexNet), and the ANN and FFNN algorithms performed better than the two CNN models. Despite the optimization techniques applied and the hybrid features extracted by three algorithms, the study encountered some limitations and challenges. The main one is the significant similarity between the features of some diseases, which confuses the classification algorithms when making a diagnosis.
Addressing these limitations in the future will require extracting features with various traditional algorithms and combining them with deep feature maps extracted by CNN models, as well as applying hybrid methods that join machine learning algorithms and deep learning models in two blocks. In the first block, the deep features are extracted by CNN models. The second block is a machine learning algorithm that is fed with the output of the first block to classify the dermoscopy images.
Funding: This research has been funded by Prince Sultan University, Saudi Arabia.

Informed Consent Statement:
This study is based on ISIC 2018 and PH2 datasets publicly available online.