AI Techniques of Dermoscopy Image Analysis for the Early Detection of Skin Lesions Based on Combined CNN Features

Melanoma is one of the deadliest types of skin cancer and leads to death if not diagnosed early. Many skin lesions are similar in the early stages, which causes inaccurate diagnoses. Accurate diagnosis of the types of skin lesions helps dermatologists save patients' lives. In this paper, we propose hybrid systems based on the advantages of fused CNN models. The CNN models receive dermoscopy images of the ISIC 2019 dataset after the lesion areas are segmented and isolated from healthy skin through the Geometric Active Contour (GAC) algorithm. An artificial neural network (ANN) and a Random Forest (RF) receive the fused CNN features and classify them with high accuracy. The first methodology analyzes the area of skin lesions and diagnoses their type early using the hybrid CNN-ANN and CNN-RF models. The CNN models (AlexNet, GoogLeNet and VGG16) receive the lesion area only and produce high-depth feature maps, which are reduced by Principal Component Analysis (PCA) and then classified by the ANN and RF networks. The second methodology analyzes the area of skin lesions and diagnoses their type early using the hybrid CNN-ANN and CNN-RF models based on the features of the fused CNN models. It is worth noting that the features of the CNN models were serially integrated after reducing their high dimensions by PCA. Hybrid models based on fused CNN features achieved promising results for diagnosing dermatoscopic images of the ISIC 2019 dataset and distinguishing skin cancer from other skin lesions. The AlexNet-GoogLeNet-VGG16-ANN hybrid model achieved an AUC of 94.41%, sensitivity of 88.90%, accuracy of 96.10%, precision of 88.69%, and specificity of 99.44%.


Introduction
The largest organ in the human body is the skin. It performs many functions, such as protecting the body from external shocks, regulating temperature, protecting it from attacks by viruses and bacteria, and giving it immunity to resist diseases [1]. Its thickness varies from one area to another, ranging from 0.5 mm in the eyelids to 4 mm in the palms of the hands [2]. It also maintains the internal organs and is considered the first line of defense, protecting the body from harmful sunlight and ultraviolet (UV) rays. It also produces vitamin D through sunlight [3]. Weather from cold to hot and skin types from oily to dry affect skin pigmentation. A sharp decrease in the levels of skin pigmentation leads to skin diseases such as skin cancer, and the cure rate is high if the disease is detected in the first stage. The DNA of skin cells is damaged by exposure to sunlight or ultraviolet radiation [4]. Melanoma is one of the most serious and deadly skin diseases and leads to death without an early diagnosis. Melanoma represents 1% of all types of skin cancer but causes more than 70% of deaths.

• Improving dermatoscopy images using two successive techniques: CLAHE and an average filter
• Segmenting dermatoscopy images of the ISIC 2019 dataset using the GAC algorithm and then feeding them to CNN models
• Analyzing dermatoscopy images for the early diagnosis of skin cancer and its distinction from other skin lesions by the hybrid CNN-ANN and CNN-RF models based on the GAC algorithm
• Analyzing dermatoscopy images for the early diagnosis of skin cancer and its distinction from other skin lesions using the ANN and RF networks based on the fused CNN features.
The rest of the paper is organized as follows: Section 2 discusses techniques and findings from previous studies. Section 3 presents methods for analyzing dermatoscopic images for the early diagnosis of skin lesions. Section 4 presents the findings of the hybrid models. Section 5 discusses the results of the systems and compares their performance. Section 6 concludes the research.

Related Work
Ismail et al. [11] proposed an EfficientNet-B6 model for diagnosing the images of the ISIC 2020 data set as malignant or benign with an accuracy of 97.84%. Since malignant lesions represent 2% of the data set, over-sampling techniques were applied to address the class imbalance.

Enhancement of ISIC 2019 Dermoscopic Images
The first step for all proposed systems is to improve the dermatoscopic images. Dermatoscopy images include noise and artifacts due to the variety of acquisition devices, which negatively affect the subsequent stages of image processing and lead to unreliable results [27]. So, the main purpose of pre-processing is to remove noise and artifacts such as air bubbles, hair, skin lines, low contrast between lesion borders, and light reflections that occur when gel is applied to the skin at the time of image capture. The RGB channels were averaged, and color constancy was adjusted [28].
The presence of some artifacts leads to the occlusion of an essential part of the skin lesion or the extraction of false features, which makes the extracted features incorrect. A 6 by 6 averaging filter was applied to refine the images of the ISIC 2019 dataset, and the pixel value adjacent to the unwanted pixel was calculated using a 2D convolution operator. On each iteration, the operator targets a convolution of 36 pixels divided into one target pixel and 35 adjacent pixels. The average of adjacent pixels is calculated, and its value is substituted for the target pixel as in Equation (1) [29].
where F(x) refers to the output, M refers to the number of pixels in the operator, z(x) refers to the input and z(x − i) refers to the prior input. Due to the low contrast between the edges of the lesions and their periphery, the contrast was improved by the contrast limited adaptive histogram equalization (CLAHE) technique. The basic idea of this technique is to distribute the bright pixels to dark regions based on neighboring pixels. Each pixel is compared with its neighbors in each iteration, and the contrast is adjusted based on the comparison: if the value of the target pixel is greater than its neighbors, the contrast is reduced; conversely, the contrast is increased when the neighboring pixels are greater than the target pixel [30]. The mechanism continues for each pixel in the image until the appearance of the edges of the skin lesions is improved. Figure 1b shows samples from the ISIC 2019 dataset after undergoing the enhancement techniques.
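As a concrete illustration, the 6 × 6 averaging step described above can be sketched in a few lines of NumPy. This is a hedged sketch, not the authors' code; the CLAHE step is omitted here and would typically rely on a library implementation such as OpenCV's `createCLAHE`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def average_filter(img, k=6):
    """Smooth an image with a k-by-k averaging filter, replacing each
    pixel by the mean of its neighbourhood, as in Equation (1)."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    windows = sliding_window_view(padded, (k, k))
    # For an even kernel there is one extra window position per axis;
    # trim the result back to the original image size.
    return windows.mean(axis=(-2, -1))[: img.shape[0], : img.shape[1]]

# Tiny demo on a synthetic 10 x 10 image (stand-in for an ISIC 2019 image).
smooth = average_filter(np.full((10, 10), 7.0))
print(smooth.shape)  # (10, 10)
```

A uniform image passes through unchanged, while isolated noise pixels are spread over their 36-pixel neighbourhood, which is the smoothing effect the paper relies on before segmentation.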

Geometric Active Contour Algorithm
The Geometric Active Contour (GAC) algorithm is a contour model used to segment biomedical images such as skin lesions, extract a region of interest (ROI), and separate it from the healthy regions [31]. The algorithm creates a set of points and moves them along the curve perpendicular to the lesion edges to obtain a smooth curve, so that the movement of the points on the curve is proportional to the curvature of the ROI in the image. The contour is described based on the geometric flow of the curves to detect the lesion area [32]. The geometric flows are conducted according to the external and internal measurements of the ROI. Through the geometric flow of the contour, the geometric lines of the initial curve C0 are determined as in Equation (2).
where g refers to the edge scalar function, k refers to the curvature vector, N→ refers to the normal vector to the curve and v refers to a constant value. The curve continues to move until g reaches zero, which means the curve has reached the edge of the skin lesion. When the curve reaches the edge of the lesion, the parameters are replaced by the Euclidean arc length as in Equation (3).
The Euclidean arc length describes irregular curves in terms of curvature and energy forces. Minimal geometric curve flows are derived through internal and external forces. Equation (4), obtained from the Euler-Lagrange formulation, gives the curve evolution for the ROI.
The lesion area is determined based on the geometrically evolving curve functions of the plane. The minimum internal force is applied through the balloon force, which drives progress along the inner circumference of the lesion. Thus, the Euler-Lagrange expression defines the contour as descending toward the innermost boundary, as in Equation (5) [33].

The contour models describe the curve of the geometric flows to show the geometric features of the edges of the lesion area, as shown in Figure 2. The edges of the skin lesions are defined based on the color gradation of the edges by the active geometric lines. Edge-based geometric models have effective computational capabilities for segmenting lesion areas. There are gaps in some areas due to the graduated curves. Therefore, the geometric models are sensitive to the graduated curves and determine the contour by increasing the weights of the curves and the horizontal width. Geometric models depend on the difference in density inside and outside the contour lines, i.e., the lesions' internal and external contrast.
In this study, the skin lesion regions were segmented and stored in a new folder to be sent to the CNN models to extract features from the lesion regions only (ROI) instead of feeding the CNN models with full dermoscopic images of the diseased and healthy parts.
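For readers who want to experiment with this kind of segmentation, scikit-image ships a morphological variant of the geodesic (geometric) active contour. The sketch below segments a synthetic dark "lesion" on a bright background; it is an illustrative substitute under stated assumptions, not the paper's exact GAC pipeline, and the image, box initialization, and parameter values are all made up for the demo.

```python
import numpy as np
from skimage.segmentation import (inverse_gaussian_gradient,
                                  morphological_geodesic_active_contour)

# Synthetic stand-in for a dermoscopy image: a dark circular "lesion"
# on brighter "skin" (hypothetical data, not an ISIC 2019 image).
h, w = 96, 96
yy, xx = np.mgrid[:h, :w]
img = np.full((h, w), 0.9)
img[(yy - 48) ** 2 + (xx - 48) ** 2 < 20 ** 2] = 0.2

# Edge-stopping function g: close to 0 at strong edges, close to 1 in
# flat regions, so the evolving curve halts at the lesion boundary.
gimg = inverse_gaussian_gradient(img)

# Initial level set: a large box; with balloon = -1 the curve shrinks
# inward until g falls below the threshold at the lesion edge.
init = np.zeros((h, w), dtype=np.int8)
init[8:-8, 8:-8] = 1

mask = morphological_geodesic_active_contour(
    gimg, 120, init_level_set=init, smoothing=1, balloon=-1, threshold=0.69)

print(mask.shape, int(mask.sum()))  # binary mask of the segmented region
```

The resulting binary mask plays the role of the ISIC 2019-ROI images in the paper: only the pixels inside it would be forwarded to the CNN models.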

Extract Deep Feature Maps
CNNs consist of many convolutional and pooling layers ending in a series of fully connected layers. The convolutional layers receive an image of size m × n × z, where m and n are the width and height of the image and z is the number of color channels [34]. The number of convolutional layers differs from one network to another, and each convolutional layer usually consists of many filters of size f × f × z. The number of channels of the input image must match the number of channels of the convolutional filter. The key to convolutional neural networks is the convolution process of superimposing the filter f (t) on the input image x(t) and moving it over the image until all areas of the image are processed (the filter must process all pixels of the image) as in Equation (6) [35].
Each convolutional layer produces feature maps equal to the number of filters in the layer and the new image becomes an entry to the next layer after adding some biases and passing it from auxiliary layers such as ReLU.
where f(t) denotes the filter, x(t) denotes the image input and y(t) denotes the output. After the convolutional layers, the feature maps are high-dimensional, so the data size must be reduced. CNN networks provide pooling layers of two types, max and average, which reduce the size of the inputs through the pooling process [36]. The pooling layers operate similarly to the convolutional layers, performing operations on small regions of the input matrix. Max-pooling selects a window of the matrix and replaces it with its greatest value, as in Equation (7). Average-pooling works the same way, except that the selected window is replaced by its arithmetic average, as in Equation (8).
Thus, the computational cost is reduced, as pooling results in an output matrix with dimensions much smaller than the output matrix of the convolutional layer. At the same time, it helps to obtain and locate dominant features. CNN networks have achieved great success in identifying features from the input images and their high ability to detect complex features effectively because filters act as detectors for hidden and small features such as edges, shapes, colors, structures, and so on [37].
where f indicates the filter applied over the image region, m and n indicate the matrix location, k indicates the window size in pixels and p indicates the stride step. Finally, the output should be a classification of the inputs (probabilities for each class); the high-level features are converted to vectors by fully connected layers [38], followed by the SoftMax activation function, whose number of neurons equals the number of classes. SoftMax labels the features of each image with its appropriate class. CNN is computationally efficient in that the features in one area of the image are often the same as those in another part, which allows the same weights to be used to calculate activations on other regions of the image. Thus, the number of parameters, weights, and links to be trained is reduced [39]. The last convolutional or pooling layers of the AlexNet, GoogLeNet and VGG16 models produce higher-level feature maps of sizes (13, 13, 256), (7, 7, 512) and (7, 7, 512), respectively. Finally, the high-level features are converted to feature vectors using a global average pooling layer, which puts the high-level features into flat vectors of size 4096 for each of the AlexNet, GoogLeNet and VGG16 models. Thus, the ISIC 2019 dataset is represented by a feature matrix of size 25,331 × 4096 for AlexNet, GoogLeNet and VGG16 separately.
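The max- and average-pooling operations of Equations (7) and (8) can be sketched directly in NumPy. This is a minimal illustration with a made-up 4 × 4 input, not the paper's implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pool2d(x, k=2, stride=2, mode="max"):
    """2-D pooling over k-by-k windows with the given stride:
    max-pooling (Equation (7)) or average-pooling (Equation (8))."""
    wins = sliding_window_view(x, (k, k))[::stride, ::stride]
    if mode == "max":
        return wins.max(axis=(-2, -1))
    return wins.mean(axis=(-2, -1))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [0., 1., 9., 2.],
              [1., 0., 3., 4.]])
print(pool2d(x, mode="max"))      # [[4. 8.] [1. 9.]]
print(pool2d(x, mode="average"))  # [[2.5 6.5] [0.5 4.5]]
```

Each 2 × 2 window is collapsed to a single value, which is how the pooling layers shrink a convolutional feature map while keeping its dominant activations.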

Inductive and Deductive Phase
The last stage of medical image processing is classification, which depends on the efficiency of the previous stages. After the enhancement stage and extraction of the ROI (lesion area), the features of the skin lesions are extracted by the CNN models and saved in vectors. The data set's features are represented in a feature matrix, which is the input to the ANN and RF networks. Classification networks include an inductive phase, called training, in which a classification model is built, and a deductive phase, in which new data are tested to measure the system's performance.

ANN Network
The ANN is a type of high-efficiency soft computing. An ANN consists of three basic layer types connected by many interconnected neurons with exact weights. The network can effectively extract information from complex data, analyze it, produce clear and interpretable patterns, and adapt to changing environments. The ANN passes information between layers and neurons and reduces the computational error caused by overlapping affinities between classes. It consists of processing units that send signals to each other through weighted links; each unit k has an activation unit applying weight wjk to the signal from unit j, and a propagation rule that receives external inputs and determines the effective ones. The ANN input layer receives the feature vectors extracted by the CNN models and the hybrid features of the fused CNN models [40]. The input layer consists of units equal in number to the features extracted in the previous stage. The inputs are passed to the hidden layers, in which the calculations required for the task are performed. The accuracy of classifying new data depends on the system's performance when building the training and validation model. The performance of an ANN depends on its internal structure, the number of hidden layers, and their neurons; in this study, the number of hidden units was set to 15, as shown in Figure 3. The network measures its performance through the squared errors between the actual values xi and expected values zi. The network iterates, and in each iteration the weights are adjusted until the network reaches an optimal set of weights through the minimum mean squared error (MSE), as in Equation (9). The output layer contains eight neurons, equal to the dataset's number of classes. The activation function in the output layer maps each feature vector to its appropriate category.
where m denotes the number of samples, xi the actual output, and zi the expected output.
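An ANN of this kind can be sketched with scikit-learn's MLPClassifier. This is a hedged stand-in (the paper's exact ANN implementation is not specified here): the feature matrix, labels, and the two-class setup below are synthetic placeholders for the PCA-reduced CNN features, and the single hidden layer of 15 units mirrors the 15 hidden units mentioned above.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Stand-in for the PCA-reduced CNN feature matrix (rows = images).
X = rng.normal(size=(200, 32))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # hypothetical 2-class labels

# One hidden layer with 15 neurons, trained by minimizing squared/log error
# iteratively, analogous to the MSE-driven weight updates of Equation (9).
clf = MLPClassifier(hidden_layer_sizes=(15,), max_iter=500, random_state=0)
clf.fit(X[:160], y[:160])
print(clf.score(X[160:], y[160:]))  # held-out accuracy
```

In the paper's setting the output layer would instead have eight units, one per ISIC 2019 class, but the training loop is the same.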

Random Forest Network
The random forest algorithm has a superior ability to make effective predictions on biomedical datasets. As its name suggests, it is built by assembling the predictions of many trees. RF aggregates the results of each decision tree and makes a decision based on majority voting, which is called an ensemble learning method. RF selects data points randomly and, based on these points, creates decision trees with a specified number of points. RF uses its hyperparameters to increase the predictive efficiency of the classifier. More decision trees increase the performance and stability of predictions, but they also increase the processing time of the training step, which produces a training model that can be applied to a new data set to test the system and measure its generalization to new data. RF works with the bagging method, based on creating sub-sets of the data, with the final decision based on a majority vote over all decision trees. The mechanism begins with random samples known as bootstrap data (bootstrapping). The models are trained separately, and each decision tree produces a result. In the end, the results are collected (aggregation) and the decision is made according to the majority vote.
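The bagging-plus-majority-vote mechanism described above can be sketched with scikit-learn's RandomForestClassifier. The features and labels below are synthetic placeholders for the PCA-reduced CNN features, and the tree count of 100 is an illustrative assumption, not the paper's setting.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))          # stand-in PCA-reduced CNN features
y = (X[:, 0] > 0).astype(int)           # hypothetical 2-class labels

# Each of the 100 trees is fit on a bootstrap sample of the training rows
# (bagging); prediction aggregates the per-tree votes by majority.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X[:240], y[:240])
print(rf.score(X[240:], y[240:]))  # held-out accuracy
```

Increasing `n_estimators` stabilizes the vote at the cost of training time, which is the trade-off noted in the text.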

This study used CNN networks (AlexNet, GoogLeNet and VGG16) to extract deep feature maps from ISIC 2019 dermoscopy images and classify them using ANN and Random Forest networks. It is worth noting that the performance of the hybrid systems was evaluated using ISIC 2019 images before and after applying the GAC algorithm to extract the ROI.
The ISIC 2019-ROI image classification strategy using a hybrid model of CNN-machine learning based on the GAC segmentation algorithm goes through the following implementation steps, as shown in Figure 4. First, the ISIC 2019 dataset images were improved to remove artifacts. Second, the lesion area was segmented and separated from healthy skin by the GAC algorithm and stored in a new folder called ISIC 2019-ROI. Third, the images of the new ISIC 2019-ROI dataset were fed into the AlexNet, GoogLeNet and VGG16 models separately. Feature maps were extracted from each model by the convolutional and pooling layers and saved as feature matrices of size 25,331 × 4096 for each of the AlexNet, GoogLeNet and VGG16 models. Fourth, the high-dimensional feature matrix was fed into PCA to remove non-significant and redundant features and retain the most representative ones. The PCA method produced highly representative feature matrices of sizes 25,331 × 610, 25,331 × 590, and 25,331 × 640 for the AlexNet, GoogLeNet, and VGG16 models, respectively. Fifth, the representative feature matrix was fed to the ANN and RF networks to train and test their performance.
The ISIC 2019-ROI image classification strategy by machine learning classifiers with combined features of CNN models goes through the following implementation steps, as shown in Figure 5. The first four implementation steps are the same as in the previous strategy. Fifth, the deep feature maps of the CNN models are combined serially: AlexNet-GoogLeNet, GoogLeNet-VGG16, AlexNet-VGG16 and AlexNet-GoogLeNet-VGG16. Thus, the fused feature matrices have sizes of 25,331 × 1200, 25,331 × 1230, 25,331 × 1250 and 25,331 × 1840 for the AlexNet-GoogLeNet, GoogLeNet-VGG16, AlexNet-VGG16 and AlexNet-GoogLeNet-VGG16 models, respectively. Sixth, the highly representative feature matrix is fed to the ANN and RF networks to train and test their performance.
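The PCA-reduction and serial (column-wise) fusion steps can be sketched as follows. This is a hedged sketch with made-up matrix sizes and random data standing in for the two CNN feature matrices; only the mechanics of "reduce each model, then concatenate" follow the paper.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n = 500  # stand-in for the 25,331 ISIC 2019 images

# Hypothetical deep feature matrices from two CNN models.
feats_alexnet = rng.normal(size=(n, 4096))
feats_googlenet = rng.normal(size=(n, 4096))

# Reduce each model's features with PCA, then fuse serially (column-wise),
# mirroring the AlexNet-GoogLeNet fusion described in the text.
pca_a = PCA(n_components=50).fit_transform(feats_alexnet)
pca_g = PCA(n_components=40).fit_transform(feats_googlenet)
fused = np.hstack([pca_a, pca_g])
print(fused.shape)  # (500, 90)
```

The fused matrix is what the ANN and RF classifiers receive; in the paper the per-model component counts (e.g., 610 and 590) are chosen by PCA rather than fixed as here.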

Split of ISIC 2019 Data Set
The performance of the proposed systems in this study was measured on the dermatoscopic images of the ISIC 2019 dataset, which is available online to researchers and experts. The ISIC 2019 dataset contains 25,331 dermatoscopic images distributed unevenly among eight classes (types of skin diseases), of melanocytic and non-melanocytic types, as shown in Table 1. In all proposed strategies, the data set was randomly divided into 80% for training and validation (split 80:20) and 20% for the testing phase. As shown in the table, the data set is unbalanced, highlighting a problem that needs to be addressed.
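The 80:20 split with a further 80:20 training/validation split can be reproduced with scikit-learn. The sketch below operates on image indices only (a placeholder, since the actual images are not loaded here).

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(25331)  # indices standing in for the 25,331 ISIC 2019 images

# 20% held out for the testing phase ...
X_trval, X_test = train_test_split(X, test_size=0.20, random_state=0)
# ... and the remaining 80% split again 80:20 into training and validation.
X_train, X_val = train_test_split(X_trval, test_size=0.20, random_state=0)

print(len(X_train), len(X_val), len(X_test))
```

In practice the split would be stratified by class label to keep the eight unbalanced classes represented in every partition.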


Systems Performance Measures
The confusion matrix is one of the most important standard tools for evaluating the performance of classification systems. It is a square matrix whose number of rows and columns equals the number of classes in the data set, and it records the numbers of correctly and incorrectly classified test samples. The main diagonal represents correctly classified samples, called true positives (TP); for a given class, samples of other classes correctly rejected are true negatives (TN), while incorrectly classified samples are false positives (FP) and false negatives (FN). The performance of the systems is measured through Equations (10)-(14), whose variables are derived from the confusion matrix [41].
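The metrics of Equations (10)-(14) can be derived from a confusion matrix as sketched below. The 2 × 2 matrix used in the demo is made up; the same code applies to the paper's 8 × 8 ISIC 2019 matrices.

```python
import numpy as np

def per_class_metrics(cm):
    """Derive accuracy, sensitivity, specificity and precision from a
    square confusion matrix (rows = true classes, cols = predictions)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                  # correct predictions per class
    fn = cm.sum(axis=1) - tp          # missed samples of each class
    fp = cm.sum(axis=0) - tp          # samples wrongly assigned to class
    tn = cm.sum() - tp - fn - fp      # everything else
    return {
        "accuracy": tp.sum() / cm.sum(),
        "sensitivity": tp / (tp + fn),   # recall per class
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

m = per_class_metrics([[50, 2],
                       [5, 43]])
print(round(m["accuracy"], 3))  # 0.93
```

Averaging the per-class vectors gives the macro-averaged sensitivity, specificity, and precision figures reported in the results tables.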

Balancing Classes of ISIC 2019 Dataset
CNN models face many challenges, including overfitting, because they need a very large data set, which is rarely available for biomedical data. Additionally, an unbalanced data set is a challenge for artificial intelligence models because accuracy becomes biased toward the majority class. Thus, CNN frameworks provide a data augmentation tool to address these challenges. The tool increases the number of training images from the original data set through many operations, such as rotating the images at various angles, resizing, vertical and horizontal translation, vertical and horizontal flipping, displacement, shearing, and others [42]. To obtain balanced classes, the images were increased in seven categories, while the nevi class was not increased because it already contained sufficient images. Additionally, the images of each class were increased by a different amount from the other classes to achieve balance. Table 2 and Figure 6 show the number of images of the ISIC 2019 training data set before and after applying the data augmentation tool. Note that each image in the Scc class was increased 20 times, each image in the Akiec class 15 times, each image in the Bcc class 4 times, each image in the Bkl class 5 times, each image in the Df class 40 times, each image in the Mel class 3 times, and each image in the Vasc class 40 times.
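A minimal sketch of the geometric augmentation operations (rotation and flipping) is shown below in plain NumPy; real pipelines would typically use a framework tool (e.g., Keras's ImageDataGenerator) and also apply shearing, resizing, and translation, which are omitted here.

```python
import numpy as np

def augment(img):
    """Generate simple augmented variants of one image: the four
    90-degree rotations plus horizontal and vertical flips."""
    out = [np.rot90(img, k) for k in range(4)]  # 0/90/180/270 degrees
    out += [np.fliplr(img), np.flipud(img)]
    return out

img = np.arange(9).reshape(3, 3)  # toy stand-in for a dermoscopy image
print(len(augment(img)))  # 6
```

Applying such a generator a different number of times per class (e.g., 40× for Df and Vasc, 3× for Mel) is what balances the minority classes against the nevi class.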

Results of Pre-Trained Deep Learning
This section summarizes the performance results of the pre-trained AlexNet, GoogLeNet and VGG16 models. These models were trained on the ImageNet dataset, which has more than 1,200,000 images belonging to more than 1000 classes. Unfortunately, the ImageNet dataset does not contain most biomedical image types, such as dermatoscopic images of skin lesions. These models transfer the experience gained from training on the ImageNet dataset to perform new dermatoscopic image classification tasks. The input layers receive the skin lesion images of the ISIC 2019 dataset and send them to the convolutional, pooling and auxiliary layers for processing and extraction of the deep and hidden features. Fully-connected layers convert higher-level features into vectors and classify each feature vector into an appropriate class.

Table 3 summarizes the results of the pre-trained AlexNet, GoogLeNet and VGG16 models for the classification of dermatoscopic images of the ISIC 2019 dataset.

Results of Pre-Trained Deep Learning Based on GAC Algorithm
In this section, we summarize the performance results of the pre-trained AlexNet, GoogLeNet and VGG16 models based on the segmentation of dermatoscopy images using the GAC algorithm. The dermatoscopy images of the ISIC 2019 dataset were first segmented, and only the lesion areas were extracted and saved in new folders to be fed into the AlexNet, GoogLeNet and VGG16 models. The input layers receive the segmented skin lesion images of the ISIC 2019 data set and send them to the convolutional, pooling and auxiliary layers for processing and extraction of the deep and hidden features. Fully-connected layers convert higher-level features into vectors and classify each feature vector into an appropriate class. Table 4 and Figure 8 summarize the results of the AlexNet, GoogLeNet and VGG16 models based on the GAC algorithm for the diagnosis of dermatoscopic images of the ISIC 2019 dataset. AlexNet achieved an AUC of 82.93%, sensitivity of 77.96%, accuracy of 92.20%, precision of 79.14%, and specificity of 98.65%. GoogLeNet achieved an AUC of 84.34%, sensitivity of 87.54%, accuracy of 91.80%, precision of 79.56%, and specificity of 98.54%. VGG16 achieved an AUC of 81.88%, sensitivity of 74.58%, accuracy of 90.50%, precision of 76.14%, and specificity of 98.41%.

Results of Hybrid Models of CNN, ANN and RF
This section discusses the summary results of hybrid models combining the CNN models (AlexNet, GoogLeNet and VGG16) with the ANN and RF networks separately for image analysis of the ISIC 2019 dataset of skin lesions. The mechanism of the hybrid models is to segment the lesion area after image enhancement and then extract feature maps through the CNN models. The PCA method keeps the important features and deletes the redundant ones; the feature vectors generated by PCA are sent to the ANN and RF networks for training and performance testing.
The CNN-ANN and RF-CNN hybrid models for image analysis of skin lesions of the ISIC 2019 dataset have high capabilities in distinguishing skin cancer from other skin diseases.

Table 5 and Figure 9 present the performance measurements of the CNN-ANN hybrid models for ISIC 2019 dataset image analysis for the early diagnosis of skin lesions. The CNN-ANN and CNN-RF hybrid models produce confusion matrices that show their performance for the early detection of skin cancer and its distinction from other lesions. Figure 11 presents the confusion matrices of the AlexNet-ANN, GoogLeNet-ANN and VGG16-ANN models for the early diagnosis of skin lesions in the ISIC 2019 dataset.
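The hybrid stage described above (deep CNN features, reduced by PCA, then classified by ANN and RF) can be sketched with scikit-learn. This is an illustrative pipeline only: the random matrix stands in for real CNN feature vectors, and the component count, hidden-layer size and tree count are assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
# Stand-in for deep feature vectors from one CNN (e.g. a fully connected layer)
X = rng.normal(size=(300, 1024))
y = rng.integers(0, 8, size=300)           # 8 ISIC 2019 lesion classes

# PCA keeps the informative components and drops the redundant ones
X_red = PCA(n_components=50, random_state=0).fit_transform(X)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_red, y, test_size=0.2, random_state=0)

# ANN and RF classifiers trained on the reduced feature vectors
ann = MLPClassifier(hidden_layer_sizes=(32,), max_iter=200,
                    random_state=0).fit(X_tr, y_tr)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("ANN acc:", ann.score(X_te, y_te), "RF acc:", rf.score(X_te, y_te))
```

With random stand-in features the accuracies are meaningless; the point is the flow of data from feature maps through PCA into the two classifiers.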

Results of Hybrid Models Based on Fused CNN Features
This section presents the results of hybrid models based on the fused features of the CNN models (AlexNet, GoogLeNet and VGG16) for image analysis of the ISIC 2019 dataset for the early diagnosis and discrimination of skin cancer from other skin lesions. The mechanism of the technique is first to improve the images and then segment the lesion area. The CNN models are fed the lesion-area images of the ISIC 2019 dataset for feature-map extraction through the convolutional and pooling layers. PCA retains the important features and deletes the redundant ones; the feature vectors generated by PCA are then sent to the ANN and RF networks for training and performance testing.
Hybrid models based on the fused features of the CNN models for image analysis of skin lesions in the ISIC 2019 dataset have a high capability for distinguishing skin cancer from other skin diseases. Table 7 and Figure 13 present their performance. The CNN-ANN and CNN-RF hybrid models based on the fused CNN features produce confusion matrices that show their performance for the early detection of skin cancer and its distinction from other lesions.
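Serial fusion here means reducing each model's feature vectors with PCA and then concatenating the reduced vectors into one combined vector per image. A minimal sketch, using random matrices as stand-ins for the three models' feature vectors (the dimensions 4096/1024/4096 match the usual fully-connected-layer sizes of these architectures, but the reduced dimension of 100 per model is an assumption):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n = 500
# Stand-ins for feature vectors extracted by the three CNNs
feats_alexnet = rng.normal(size=(n, 4096))
feats_googlenet = rng.normal(size=(n, 1024))
feats_vgg16 = rng.normal(size=(n, 4096))

def reduce(X, k=100):
    """PCA-reduce one model's features before fusion."""
    return PCA(n_components=k, random_state=0).fit_transform(X)

# Serial fusion: concatenate the three reduced feature sets column-wise
fused = np.concatenate([reduce(feats_alexnet),
                        reduce(feats_googlenet),
                        reduce(feats_vgg16)], axis=1)
print(fused.shape)   # (500, 300)
```

The fused matrix is what the ANN and RF classifiers receive in the AlexNet-GoogLeNet-VGG16 hybrid configurations.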

Discussion and Comparison of the Performance Results of the Systems
Exposure of the skin to ultraviolet radiation damages the DNA of skin cells and can cause skin cancer. Melanoma is one of the deadliest skin lesions, leading to death if not diagnosed and treated early. Many skin lesions have similar clinical characteristics and vital signs, especially in the initial stages, which requires highly experienced dermatologists. Automated systems help diagnose and distinguish skin cancer from other skin lesions. This study focuses on developing several hybrid systems based on fused CNN features. The ISIC 2019 dataset images were optimized, and the GAC method segmented and isolated the lesion area from healthy skin.
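To give a concrete sense of the segmentation step, the sketch below produces a lesion mask by thresholding and keeping the largest connected dark region. This is a deliberately simplified stand-in for illustration, not the GAC contour evolution the paper uses; the threshold value and the synthetic image are invented for the example:

```python
import numpy as np
from scipy import ndimage

def largest_dark_region(gray, thresh=0.5):
    """Simplified lesion mask: threshold the darker pixels and keep the
    largest connected component. A stand-in for illustration only, not
    the GAC algorithm itself."""
    mask = gray < thresh                        # lesions are darker than skin
    labels, n = ndimage.label(mask)             # connected components
    if n == 0:
        return np.zeros_like(mask)
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    return labels == (np.argmax(sizes) + 1)     # largest blob only

# Synthetic example: dark disc (lesion) on bright skin, plus a noise speck
img = np.ones((64, 64))
yy, xx = np.ogrid[:64, :64]
img[(yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2] = 0.2   # lesion
img[2, 2] = 0.1                                         # isolated noise pixel
lesion = largest_dark_region(img)
print(lesion.sum())   # pixel count of the segmented lesion
```

In the paper's pipeline the mask produced by GAC plays the same role: only the pixels inside it are passed on to the CNN models.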
The first methodology for the early recognition of skin cancer among other skin lesions used the pre-trained AlexNet, GoogLeNet and VGG16 models. The results reached by the pre-trained models were not satisfactory, especially for classifying some types of lesions. The pre-trained AlexNet, GoogLeNet, and VGG16 models achieved an accuracy of 88.5%, 88.7%, and 88.3%, respectively.
The second methodology for diagnosing ISIC 2019 images and distinguishing skin cancer from other skin lesions used the AlexNet, GoogLeNet and VGG16 models based on the GAC algorithm, in which the lesion area was segmented from the images and fed to the models. When the CNN models were fed the lesion areas only, the results improved: AlexNet, GoogLeNet, and VGG16 reached an accuracy of 92.2%, 91.8%, and 90.5%, respectively.
The third methodology for the early diagnosis of skin cancer and distinguishing it from other lesions involved the hybrid models CNN-ANN and CNN-RF based on the GAC algorithm. The AlexNet-ANN, GoogLeNet-ANN, and VGG16-ANN hybrid models achieved an accuracy of 94.8%, 93.7%, and 93.6%, respectively, while the AlexNet-RF, GoogLeNet-RF, and VGG16-RF hybrid models achieved an accuracy of 94.3%, 94.9%, and 94.2%, respectively.
The fourth methodology for the early diagnosis of skin cancer and distinguishing it from other lesions involved hybrid models between the fused CNN models and the ANN and RF networks. The CNN features were fused and classified by the ANN and RF networks. The hybrid models AlexNet-GoogLeNet-ANN, GoogLeNet-VGG16-ANN, AlexNet-VGG16-ANN and AlexNet-GoogLeNet-VGG16-ANN achieved an accuracy of 95%, 94.6%, 95.2% and 96.1%, respectively, while the hybrid models AlexNet-GoogLeNet-RF, GoogLeNet-VGG16-RF, AlexNet-VGG16-RF and AlexNet-GoogLeNet-VGG16-RF achieved an accuracy of 95.3%, 95.3%, 94.3% and 95.7%, respectively.
Table 9 and Figure 17 present the performance of all systems for analyzing dermatoscopic images for diagnosing skin cancer in the ISIC 2019 dataset and distinguishing it from other skin lesions. The table reports the overall accuracy and the accuracy for each type of skin lesion in the ISIC 2019 dataset. Classification of the skin lesions by the three pre-trained models, AlexNet, GoogLeNet and VGG16, did not achieve good results, especially for some classes (lesions). When the GAC algorithm was applied to segment the lesion area and the result was fed to the AlexNet, GoogLeNet and VGG16 models, the results improved compared with feeding the models the whole image. CNN-ANN and CNN-RF hybrid models based on the GAC algorithm were then implemented; the classification results on the ISIC 2019 dataset by this hybrid technique are better than those of the pre-trained models. Because of the similar characteristics of skin lesions, and to achieve promising accuracy, CNN-ANN and CNN-RF hybrid models based on fused CNN features were applied. The ANN and RF networks with the fused CNN features, after deletion of the insignificant and repetitive features by PCA, achieved the best results compared with the other strategies.
The improvement in the accuracy of each class is noted as follows: for the Scc class, the accuracy improved from 38.9% by the pre-trained GoogLeNet to 88.1% by the AlexNet-VGG16-ANN hybrid model. The accuracy of the Akiec class improved from 43.7% by the pre-trained VGG16 to 90.8% by the AlexNet-ANN hybrid model. The accuracy of the Bcc class improved from 85.6% by the pre-trained VGG16 to 97.9% by the AlexNet-VGG16-ANN hybrid model. The accuracy of the Bkl class improved from 85.9% by the pre-trained VGG16 to 98.3% by the AlexNet-ANN hybrid model. Class Df accuracy improved from 43.8% by the pre-trained VGG16 to 97.9% by the AlexNet-VGG16-RF hybrid model. The accuracy of the Mel class improved from 87.7% by the pre-trained VGG16 to 97.9% by the AlexNet-GoogLeNet-VGG16-RF hybrid model. Class Nv accuracy improved from 95.3% by the pre-trained VGG16 to 98.1% by the GoogLeNet-RF hybrid model. The accuracy of the Vasc class improved from 64.7% by the pre-trained CNN to 78.4% by the AlexNet-GoogLeNet-VGG16-ANN hybrid model.

Conclusions
Many skin lesions are similar in the early stages, making it difficult to distinguish skin cancer from other skin lesions. Thus, several hybrid systems with fused features were developed based on segmenting the lesion areas and isolating them from the rest of the healthy skin. The images were optimized, and the lesion area was segmented by the GAC algorithm and fed into the AlexNet, GoogLeNet and VGG16 models. The first hybrid model involved dermatoscopic image analysis for the early diagnosis of skin cancer in the ISIC 2019 dataset and its distinction from other skin lesions using CNN-ANN and CNN-RF. The second hybrid model for diagnosing ISIC 2019 dataset images was based on fused CNN features. The AlexNet-GoogLeNet-VGG16-ANN hybrid model extracts the features from the AlexNet, GoogLeNet and VGG16 models separately, reduces their dimensions by eliminating the redundant and unimportant features with PCA, then fuses the features of the three models serially and sends them to an ANN for classification. The AlexNet-GoogLeNet-VGG16-ANN hybrid model achieved an AUC of 94.41%, sensitivity of 88.90%, accuracy of 96.10%, precision of 88.69%, and specificity of 99.44%.
This study aims to develop high-efficiency systems to help physicians diagnose skin diseases and differentiate between skin cancer and other lesions.
A limitation of this method is the imbalance of the dataset, which was addressed with the data augmentation technique.
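A common way to counter class imbalance with augmentation is to oversample minority classes using label-preserving transforms (flips and rotations) until every class matches the majority count. A minimal sketch of that idea, with invented toy images; the specific transforms and balancing target are illustrative assumptions, not the paper's exact augmentation settings:

```python
import numpy as np

def augment_to_balance(images, labels, rng=None):
    """Oversample minority classes with random flips/rotations until every
    class matches the majority-class count. Illustrative sketch only."""
    rng = rng or np.random.default_rng(0)
    labels = np.asarray(labels)
    counts = {c: int((labels == c).sum()) for c in np.unique(labels)}
    target = max(counts.values())
    out_imgs, out_lbls = list(images), list(labels)
    for c, n in counts.items():
        idx = np.flatnonzero(labels == c)
        for _ in range(target - n):
            img = images[rng.choice(idx)]
            op = rng.integers(0, 3)  # pick one label-preserving transform
            img = [np.fliplr(img), np.flipud(img), np.rot90(img)][op]
            out_imgs.append(img)
            out_lbls.append(c)
    return out_imgs, np.array(out_lbls)

# Toy example: class 0 has 3 samples, class 1 has only 1
imgs = [np.zeros((8, 8))] * 3 + [np.ones((8, 8))]
lbls = [0, 0, 0, 1]
aug_imgs, aug_lbls = augment_to_balance(imgs, lbls)
print((aug_lbls == 0).sum(), (aug_lbls == 1).sum())   # 3 3
```

Because the transforms only reorient the lesion, the augmented samples keep their class labels while giving the minority classes more training examples.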
Future work will involve developing systems to classify the ISIC 2019 dataset using fusion features between handcrafted features and CNN models and generalizing the proposed systems to the ISIC 2020 dataset.

Data Availability Statement:
In this work, the data supporting the performance of the proposed systems were obtained from the ISIC 2019 dataset that is publicly available at the following link: https://challenge.isic-archive.com/data/#2019 (accessed on 18 October 2022).