Detection of Skin Cancer Based on Skin Lesion Images Using Deep Learning

An increasing number of genetic and metabolic anomalies have been determined to lead to cancer, which is generally fatal. Cancerous cells may spread to any body part, where they can be life-threatening. Skin cancer is one of the most common types of cancer, and its frequency is increasing worldwide. The main subtypes of skin cancer are squamous and basal cell carcinomas, and melanoma, which is clinically aggressive and responsible for most deaths. Therefore, skin cancer screening is necessary. One of the best methods to accurately and swiftly identify skin cancer is deep learning (DL). In this research, a convolutional neural network (CNN), a deep learning method, was used to detect the two primary types of tumors, malignant and benign, using the ISIC2018 dataset. This dataset comprises 3533 skin lesions, including benign, malignant, nonmelanocytic, and melanocytic tumors. The images were first retouched and improved using ESRGAN. During the preprocessing step, the images were augmented, normalized, and resized. Skin lesion images were classified using a CNN based on an aggregate of results obtained after many repetitions. Then, multiple transfer learning models, such as Resnet50, InceptionV3, and Inception Resnet, were fine-tuned. In addition to experimenting with several models (the designed CNN, Resnet50, InceptionV3, and Inception Resnet), this study's innovation and contribution is the use of ESRGAN as a preprocessing step. Our designed model showed results comparable to those of the pretrained models. Simulations using the ISIC 2018 skin lesion dataset showed that the suggested strategy was successful. The CNN achieved an accuracy of 83.2%, compared with the Resnet50 (83.7%), InceptionV3 (85.8%), and Inception Resnet (84%) models.


Introduction
The uncontrolled growth of tissue in a specific body area is known as cancer [1]. Skin cancer, a disease in which abnormal skin cells grow out of control [2], appears to be one of the most rapidly spreading diseases in the world. Early detection and accurate diagnosis are essential for determining potential cancer therapies. Melanoma, the deadliest form of skin cancer, is responsible for most skin cancer-related deaths in developed countries. The major skin cancer types comprise basal cell carcinoma [3], squamous cell carcinoma [4], Merkel cell cancer [5], dermatofibroma [6], vascular lesion [7], and benign keratosis [8].
Diagnostic imaging assessment plays an important part in diagnosing abnormalities in various regions of the body, such as skin cancer [9], breast cancer [10], brain tumors [11], lung cancer [12], and stomach cancer [13]. According to the GLOBOCAN survey, there were 19.2 million new cancer diagnoses and 9.9 million cancer deaths in 2020. Lung cancer was the leading cause of cancer death (18.2%), followed by colorectal cancer (9.5%), liver cancer (8.4%), stomach cancer (7.8%), breast cancer (6.9%), esophageal cancer (5.5%), and pancreatic cancer (4.7%). The GLOBOCAN survey also points out that more than half of cancer deaths occur in Asia and about 20% occur in Europe. Furthermore, the areas most affected by skin cancer around the globe are shown in Figure 1, with North America accounting for about half of the total.
To improve prognosis and reduce death rates, early skin cancer identification is crucial, yet solid tumor detection typically relies mostly on screening mammography with inadequate sensitivity, which is then validated by clinical specimens. This approach is usually not appropriate for cancer screening or for evaluating treatment response [2,3]. An increasing number of healthcare providers are using artificial intelligence (AI) for medical diagnostics to improve and accelerate diagnostic decision-making [4]. However, despite some current evidence of improvement in this domain, the accurate assessment and adequate reporting of predicted flaws have been entirely or partly ignored by currently available AI research for clinical diagnosis.
Computer-aided diagnosis (CAD) can diagnose various disorders quickly, reliably, and consistently. CAD also provides the option for advanced tumor disease detection and protection that is both precise and cost-effective. Human organ disorders are typically assessed using a variety of imaging technologies, including magnetic resonance imaging (MRI) [5], positron emission tomography (PET) [6], and X-rays [7]. Computed tomography (CT) [8,9], dermatoscopy image analysis, clinical screening, and other approaches were initially used to visually diagnose skin lesions. Dermatologists with little expertise have shown reduced accuracy in skin lesion diagnostics [10][11][12]. The methods physicians use to evaluate and analyze lesion images are time-consuming, complex, subjective, and error-prone, mainly because images of skin lesions are so complicated. Unambiguous identification of lesion pixels is essential for image analysis and for the evaluation of skin lesions. Machine learning approaches in computer vision have led to a significant advance in computer-aided diagnostic and prediction systems for skin cancer detection [13]. Image preprocessing and classification of lesion images are among the main steps of the overall cancer detection and diagnosis process, as outlined in Figure 2 [14].
The exponential growth in processing power has led to tremendous advancements in computer vision technologies, particularly in the development of deep learning models such as CNNs. Early detection of skin cancer is essential. Skin cancer is the second most common cancer (after breast cancer) in women between the ages of 30 and 35, and the most common cancer in women between the ages of 25 and 29, according to Dr. Lee [15], who treats several young patients with skin cancer. Deep learning has outperformed human specialists in many computer vision challenges [15,16], and its use for the early identification of skin cancer can help reduce death rates. Integrating efficient formulations into deep learning techniques makes it possible to achieve state-of-the-art processing and classification accuracy [17][18][19].
To correctly diagnose early cancer signs from lesion images, this study proposes a hybrid DL model for cancer classification and prediction. Preprocessing and classification are the key components of the system under consideration. During the preprocessing phase, the overall intensity of each image is improved to decrease the inconsistencies among images. Each image is additionally resized and normalized to fit the input scale of the training model. Several metrics were used to evaluate the proposed model in the comparison studies, including precision, recall, the F1-score, and the area under the curve (AUC). The publicly available, large-scale ISIC 2018 dataset comprises a large number of lesion images with diagnosed cancer. Pretrained networks such as Resnet50, InceptionV3, and Inception Resnet were employed for comparison. Training was run with varying configurations of training strategies (e.g., validation patience and data augmentation) to improve the proposed technique's generalization and prevent overfitting.
The remainder of this paper is organized as follows: Section 2 summarizes existing investigations, Section 3 describes the methods used to build the cancer dataset and the proposed system's design requirements, Section 4 presents the findings of the study, and Section 5 finishes with the conclusion and suggestions for further studies.

Related Work
Skin cancer is on the upswing, and this has been true for the last 10 years [20]. Because the skin is the body's central part, it is reasonable to assume that skin cancer is the most frequent disease in humans. Timely detection of skin cancer is essential for successful therapy. Skin cancer indications can now be quickly and easily diagnosed using computer-aided techniques. The use of machine aid in the early diagnosis of cancer has opened up a new field of study and demonstrated the ability to eliminate limitations in the manual method. An overview of several relevant studies is presented here to better understand the topic of discussion and to create a vision of the current state of the art. Deep learning techniques have produced outstanding outcomes in several areas compared to other traditional machine learning methodologies. In the last few decades, deep learning has completely transformed the nature of machine learning. The artificial neural network is the most advanced branch of machine learning. The anatomy and operation of the human brain were the source of inspiration for this method [21].
Experts have examined and assessed the strength of the evidence supporting the accuracy of computer-aided techniques [22]. The ScienceDirect, SpringerLink, and IEEE databases were consulted. Skin lesion segmentation and classification approaches were analyzed, outlining their significant limitations. An enhanced melanoma skin cancer diagnosis technique was presented in [23]. An implantation manifold with nonlinear embeddings was used to create synthetic views of melanoma. Using dermatoscopic scans from the publicly accessible PH2 dataset, the data augmentation approach was applied to build a new collection of skin melanoma datasets. The SqueezeNet deep learning model was trained using the enhanced images. The experiments revealed that the accuracy of melanoma identification improved significantly (92.18%). Extracting a skin melanoma (SM) region from a digital dermatoscopy image using the VGG-SegNet algorithm was suggested in [24]. Essential performance parameters were subsequently computed by comparing the extracted segmented SM with the ground truth (GT). The proposed scheme was evaluated and verified on the standard ISIC2016 database.
Scholars have combined human and artificial intelligence to classify skin cancer. A total of 112 German dermatologists and a CNN categorized 300 biopsy-verified skin lesions into five classes. Using gradient boosting, the two separately obtained sets of diagnoses were combined into a unified classifier. Man and machine together obtained 82.95% multiclass accuracy [25]. The deep learning-based InSiNet technique detects benign and malignant tumors [26]. Under similar scenarios, the approach was evaluated on HAM10000 images (ISIC 2018), ISIC 2019, and ISIC 2020. The InSiNet framework outperformed the other approaches, obtaining 94.59%, 91.89%, and 90.549% accuracy on the ISIC 2018, ISIC 2019, and ISIC 2020 datasets, respectively.
To categorize skin melanoma at an early stage, researchers proposed a deep-learning-based methodology, including a region-based convolutional neural network (RCNN) and fuzzy k-means clustering (FKM) [27]. The suggested technique was tested on a variety of clinical photos in order to aid dermatologists in the early detection of this life-threatening condition. The ISIC-2017, PH2, and ISBI-2016 datasets were used to assess the methodology's effectiveness. The findings revealed that it outperformed current state-of-the-art methodologies, with an average accuracy of 95.40%, 93.1%, and 95.6%, respectively.
DL models such as convolutional neural networks (CNNs) have proven themselves superior to more traditional methods in various fields, especially image and feature recognition [28]. Moreover, they have been effectively applied in the medical profession, with phenomenal results and outstanding performance in a variety of challenging situations. Doctors and professionals now have access to a variety of DL-based medical imaging systems to aid in cancer prognosis, treatment, and follow-up assessments.
The Lesion-classifier, relying on pixel-by-pixel classification findings, was presented to categorize skin lesions into melanoma and non-melanoma cases. Skin lesion datasets ISBI2017 and PH2 were used in the investigation to verify efficacy. The experiments showed that the suggested technique had an accuracy rate of 95% on the ISIC 2017 and PH2 datasets [29].
In recent years, various deep learning algorithms have been applied to classify skin cancer, as outlined in Table 1, as well as other existing studies such as [30,31]. Table 1 presents the various methods for predicting cancer. Timely screening and prediction have been found to enhance the probability of proper medication and reduce mortality. However, most of these studies focused solely on applying DL models to actual images rather than preprocessed images, limiting the ultimate classification network's ability to adapt. By altering the framework of pretrained systems via the addition of multiple layers, the present work builds a lightweight skin cancer diagnosis method in order to achieve a higher level of confidence.

Proposed System
A CNN model using images from the image data store is presented schematically to generate discriminative and relevant attribute interpretations for the cancer detection technique, as shown in Algorithm 1. To begin, a basic explanation of the dataset used is provided. The details of the implementation of the proposed model, including the preprocessing techniques and the basic architecture, are then presented.

ISIC 2018 Image Dataset
Data are at the core of DL, representing what these learning techniques run on. Cancer is a unique disease, and there have already been many datasets published. We used lesion images from publicly accessible image databases of identified affected individuals. The ISIC 2018 dataset was utilized for training the proposed approach, which contained 10,015 training and 1512 test images for a total of 11,527 images [30]. ISIC 2018 provided the ground-truth data only for the training set, consisting of seven classes, melanoma, melanocytic nevus, basal cell carcinoma, squamous cell carcinoma, vascular lesions, dermatofibroma, and benign keratosis, as shown in Figure 3.
Step 2: Split (dataset)/training, testing, and validating
Step 3: Train CNN model
Step 4: Train pretrained models (Resnet, Inception, Inception Resnet)
4.1 Fine-tune model parameters (freeze layers, learning rate, epochs, batch size)
Step 5: Compute VPM (confusion matrix, accuracy, precision, ROC, F1, AUC, recall)
Step 6: Evaluation (existing work)
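The evaluation metrics named in Step 5 (confusion matrix, accuracy, precision, recall, F1, and AUC) can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the study's evaluation code: the function names `binary_metrics` and `roc_auc` are hypothetical, and encoding malignant as label 1 and benign as label 0 is an assumption.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics (assumed: 1 = malignant, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn, "accuracy": accuracy,
            "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a test set of a few hundred images; a rank-based formulation would be preferable for much larger sets.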

We applied the proposed CNN model to the ISIC 2018 skin lesion classification challenge test set; our data store consisted of 3533 lesion scans, of which 1760 are benign and 1773 are malignant, and we tested the proposed system using a total of 660 images consisting of 360 benign and 300 malignant cases. The lesion images were acquired from the openly accessible ISIC 2018 data repository [31].
For evaluation, the authors obtained radiological scans from many legitimate databases of cancer incidences; images from this source are used in most cancer diagnostics. The database, which is updated regularly, offers a free library of cancer cases and lesion images. The Kaggle list "Lesion Images" was used to collect lesion images; 3533 images from these sources are included in the ISIC2018 collection [41]. Figure 4 shows various lesion image examples from the ISIC2018 dataset, demonstrating the collection's diversity of patient situations. ISIC2018 was chosen because the library is openly available to the academic community and the general public.

Image Preprocessing
This process involved data augmentation, image enhancement using ESRGAN, image resizing, and normalization.

ESRGAN
Approaches such as the enhanced super-resolution generative adversarial network (ESRGAN) [42] can help improve the detection of skin lesions. This enhanced edition of the super-resolution GAN (SRGAN; Ledig et al.) [43] uses a residual-in-residual dense block instead of a basic residual network or a simple convolution trunk, which improves gradient flow at the finest level of detail. Additionally, the model has no batch normalization layers, which tend to smooth the image. Accordingly, sharp edges and fine image details can be better approximated in the images produced by ESRGAN. When determining whether an image is real or fake, ESRGAN employs a relativistic discriminator (https://arxiv.org/pdf/1807.00734.pdf, accessed on 10 April 2022), which estimates whether a real image is more realistic than a fake one rather than judging each image in isolation. This method yields more accurate results. During adversarial training, the loss function combines the perceptual difference between the real and fake images with the relativistic average loss and the pixelwise absolute difference between the real and fake images. A two-phase training scheme is used to sharpen the generator's skills. In the first phase, the generator is trained to reduce the pixelwise L1 distance between its output and the target high-resolution image, which avoids the poor local minima that can arise when adversarial training begins from a completely random initialization.
In the second phase, the goal is to refine the smallest details of the reconstructed images. For a photorealistic reconstruction, the final model is obtained by interpolating between the L1-trained model and the adversarially trained model.
A discriminator network was trained to distinguish between super-resolved images and actual photographic images. An evolutionary contrast enhancement algorithm, which rearranges the lightness components in the source image's histogram, was used to strengthen the fine details and textures of the lesion image and to correct its poor contrast. As a result, this method enhanced the appearance of borders and arcs in each section of the image, as shown in Figure 5, while simultaneously increasing the image's contrast level.
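The relativistic average loss described above can be made concrete with a short NumPy sketch. It follows the standard relativistic average GAN formulation, in which the discriminator scores a real image relative to the mean score of the fake batch and vice versa. The function names are hypothetical, and the weights `lam` and `eta` are illustrative values taken as an assumption, not settings confirmed for this study.

```python
import numpy as np


def _sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def relativistic_average_losses(real_logits, fake_logits, eps=1e-12):
    """Return (discriminator_loss, generator_adversarial_loss).

    real_logits / fake_logits are raw discriminator outputs C(x) for a batch
    of real and super-resolved images, respectively.
    """
    real_logits = np.asarray(real_logits, dtype=float)
    fake_logits = np.asarray(fake_logits, dtype=float)
    # "How much more realistic is a real image than the average fake one?"
    d_real = _sigmoid(real_logits - fake_logits.mean())
    # "How much more realistic is a fake image than the average real one?"
    d_fake = _sigmoid(fake_logits - real_logits.mean())
    d_loss = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(1.0 - d_real + eps)) - np.mean(np.log(d_fake + eps))
    return float(d_loss), float(g_loss)


def generator_total_loss(perceptual, g_adv, l1, lam=5e-3, eta=1e-2):
    """Combine perceptual, adversarial, and pixelwise L1 terms (weights assumed)."""
    return perceptual + lam * g_adv + eta * l1
```

When the discriminator cannot tell the two batches apart (equal logits), both losses equal 2 ln 2; as it separates them, its own loss falls while the generator's adversarial loss grows.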

Augmentation
For each image in the dataset, augmented images with associated masks were produced using rotation, reflection, shifting, brightness adjustment, and resizing. Detection and assessment are restricted by the poor quality of raw lesion images generated by electronic detectors. Before augmentation, there were 1440 benign and 1197 malignant training images; after augmentation, there were 1760 benign and 1773 malignant images. The imbalanced class distribution was addressed by oversampling the malignant images.
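A minimal NumPy sketch of this kind of augmentation and oversampling is given below. It is not the study's pipeline: the function names, the 90-degree rotation steps, the shift range of ±4 pixels, and the brightness factor range of 0.8 to 1.2 are all assumptions chosen for illustration, and resizing is omitted.

```python
import numpy as np


def augment(image, rng):
    """Return one randomly augmented copy of an HxWxC float image in [0, 1]."""
    # Rotation in 90-degree steps keeps pixel values exact
    # (for non-square images an odd step swaps height and width).
    out = np.rot90(image, k=int(rng.integers(0, 4)), axes=(0, 1))
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # horizontal reflection
    dy, dx = (int(v) for v in rng.integers(-4, 5, size=2))
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))  # circular shift
    out = np.clip(out * rng.uniform(0.8, 1.2), 0.0, 1.0)  # brightness change
    return out


def oversample(images, target_count, rng):
    """Grow a class to target_count images by appending augmented copies."""
    result = list(images)
    while len(result) < target_count:
        source = images[int(rng.integers(0, len(images)))]
        result.append(augment(source, rng))
    return result


rng = np.random.default_rng(0)
malignant = [rng.random((8, 8, 3)) for _ in range(3)]  # stand-in images
balanced = oversample(malignant, target_count=5, rng=rng)
```

With the counts reported above, the same call would grow the malignant class from 1197 to 1773 images and the benign class from 1440 to 1760.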
To avoid biased predictions, the ISIC2018 dataset was split into three mutually exclusive sets (training, validation, and evaluation sets), which also addresses the overfitting issue caused by the small number of training images. The output of the image augmentation process after applying different augmentation parameters is shown in Figure 6.
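One way to obtain three mutually exclusive sets while preserving the class balance is a per-class (stratified) split, sketched below in plain Python. The split fractions are not stated in the text, so the 15% validation and 15% evaluation shares, like the function name `stratified_split`, are assumptions.

```python
import random


def stratified_split(labels, val_frac=0.15, test_frac=0.15, seed=0):
    """Return disjoint index lists (train, val, test), split separately per class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = int(len(indices) * val_frac)
        n_test = int(len(indices) * test_frac)
        val.extend(indices[:n_val])
        test.extend(indices[n_val:n_val + n_test])
        train.extend(indices[n_val + n_test:])
    return train, val, test


# 1760 benign (label 0) and 1773 malignant (label 1) images, as reported above.
labels = [0] * 1760 + [1] * 1773
train_idx, val_idx, test_idx = stratified_split(labels)
```

Because every index is assigned to exactly one list, no image can leak from the training set into the validation or evaluation sets.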

Data Preparation
Image acquisition conditions vary between photos, and some photos in the dataset have low pixel dimensions, so all images must be resized. Consequently, image luminance and size can differ dramatically across the dataset. Each acquisition tool has its own unique set of criteria; hence, the lesion image dataset is likely to contain a variety of images. To make the data consistent and reduce noise, the pixel intensities of all images were normalized to the interval [−1, 1]. Normalization computed using Equation (1) ensured that the model was less susceptible to minor weight changes, facilitating its optimization. In Equation (1), I, I_norm, Min_I, and Max_I represent the image, the normalized image, and the minimum and maximum pixel values, respectively.
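Equation (1) is not reproduced in this text. A min-max mapping onto [−1, 1] that is consistent with the symbols defined above would be I_norm = 2 (I − Min_I) / (Max_I − Min_I) − 1; the sketch below implements that assumed form with NumPy. The function name and the handling of constant images are illustrative choices, not details taken from the study.

```python
import numpy as np


def normalize_to_unit_range(image):
    """Map pixel intensities linearly so that Min_I -> -1 and Max_I -> +1."""
    image = np.asarray(image, dtype=np.float64)
    min_i, max_i = image.min(), image.max()
    if max_i == min_i:
        # A constant image has no contrast to rescale; map it to mid-range.
        return np.zeros_like(image)
    return 2.0 * (image - min_i) / (max_i - min_i) - 1.0


pixels = np.array([[0, 128], [255, 64]])
normalized = normalize_to_unit_range(pixels)
```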
Figure 5. Images after the enhancement process (panels: benign, malignant).

Proposed CNN for ISIC2018 Detection
DL models are difficult to configure because of the large number of hyperparameters and structural choices that must be considered (e.g., learning rate, number of frozen layers, batch size, and number of epochs). Several hyperparameter values were tested to see how they affected the performance of the proposed systems. The proposed CNN model consisted of three layers, as shown in Figure 7. As depicted in Figure 7, the skin cancer detection system employed a transfer learning strategy to learn discriminative and informative feature representations from the preprocessed images in the image dataset.

Figure 7. An illustration of the skin cancer detection technique.
The presented system's core architecture was built on three learning models: Resnet50, InceptionV3, and Inception Resnet.

Resnet50
Resnet50 is a 50-layer residual network [44]. Several difficulties emerged when scholars tried to apply the adage "the deeper the better" to deep learning methods. In comparison to networks having 20-30 layers, the deep network with 52 layers produced subpar outcomes, disproving the theory that "the deeper the network, the higher the network's efficiency". Residual learning, the defining feature of the Resnet50 CNN model, was developed to address this problem. A residual unit consists of conventional layers bypassed by a skip connection, which adds the unit's incoming signal to the output of its last layer. The residual units allowed the training of a 152-layer model that won the ILSVRC 2015 challenge. The residual structure also makes the network easier to train. A top-5 error rate below 3.6% can be achieved with this architecture.
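The skip connection can be illustrated with a deliberately simplified residual unit. The sketch below uses two dense (matrix) layers in place of the convolutional layers of the real network, which is a simplification made here for brevity; the function names are hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def residual_unit(x, w1, w2):
    """Compute relu(F(x) + x), where F(x) = w2 @ relu(w1 @ x) is the learned residual."""
    fx = w2 @ relu(w1 @ x)  # the stacked "conventional" layers
    return relu(fx + x)     # the skip connection adds the input back


x = np.array([1.0, -2.0, 3.0])
zero = np.zeros((3, 3))
passthrough = residual_unit(x, zero, zero)
```

With all weights at zero the unit simply passes its (rectified) input through, which shows why extra residual units need not degrade a network: each unit only has to learn a correction to the identity mapping.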

Inception V3
An essential feature of the Inception module is its capacity to perform multiresolution processing [45]. In standard CNN models, each layer uses kernels with a single receptive field size to capture features. In an Inception model, by contrast, several kernels with different receptive fields are applied in parallel to extract features at various scales. The output of the Inception module is formed by stacking the feature maps extracted in parallel. The subsequent convolutional layer of the CNN then operates on the rich feature maps produced by this merged output. For this reason, the Inception module has proven highly effective in medical imaging, particularly on lesion images [46].
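The parallel-kernel idea can be sketched in one dimension with plain Python. This is an illustration under simplifying assumptions, not the InceptionV3 architecture: the signal, the all-ones kernels of sizes 1, 3, and 5, and the function names are hypothetical.

```python
from typing import List, Sequence


def conv1d_same(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1-D cross-correlation with zero padding so the output length equals the input length."""
    if len(kernel) % 2 == 0:
        raise ValueError("use an odd kernel size for symmetric padding")
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def inception_module(signal: Sequence[float],
                     kernels: Sequence[Sequence[float]]) -> List[List[float]]:
    """Apply every kernel to the same input and stack the resulting feature maps."""
    return [conv1d_same(signal, kernel) for kernel in kernels]


signal = [1.0, 2.0, 3.0, 4.0]
# Hypothetical kernels with receptive fields of 1, 3, and 5 samples.
kernels = [[1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0]]
features = inception_module(signal, kernels)
for size, feature_map in zip((1, 3, 5), features):
    print(size, feature_map)
```

Each row of `features` summarizes the same input at a different scale, and the next layer receives all rows together.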

Inception Resnet
The Resnet and Inception frameworks were combined into one model, Inception Resnet, to classify hyperspectral images. The Inception ResnetV2 convolutional neural network was trained on more than one million images from the ImageNet collection. The network has 164 layers in total and can classify images into 1000 object categories. Consequently, it has learned a diverse set of feature representations. The network accepts a 299-by-299-pixel image as input and outputs a set of class probabilities.

Experimental Results
Experiments were conducted on the ISIC2018 dataset to illustrate the effectiveness of the suggested DL systems and to compare their findings to those of the current state of the art.

Parameter Setting and Experimental Evaluation Index
Simulations on the ISIC2018 dataset were carried out to illustrate the performance of the suggested DL systems and to compare their results to the current state of the art. The TensorFlow Keras implementation of the present scheme was tested on a Linux desktop with an RTX 3060 GPU and 8 GB of RAM. The training and testing sets were separated using a ratio of 80% to 20%, as shown in Figure 8. The training set contained 1760 benign and 1773 malignant images, while the testing set comprised 360 benign and 300 malignant images.
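A randomized 80/20 split, followed by the 10% validation hold-out used during training, can be sketched as follows. The image identifiers, the dataset size of 1000, and the seeds are hypothetical; the authors' actual splitting code is not given in the paper.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split(items: Sequence[T], fraction: float, seed: int) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of items and cut it so the first part holds about `fraction` of them."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical image identifiers standing in for the lesion image files.
image_ids = [f"img_{i:04d}" for i in range(1000)]

train_val, test = split(image_ids, 0.8, seed=0)   # 80% training, 20% testing
train, val = split(train_val, 0.9, seed=1)        # 10% of the training data for validation
print(len(train), len(val), len(test))
```

Fixing the seed keeps the test set identical across runs, so differences between runs come from weight initialization rather than from the data split.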

The training set comprised a randomly selected 80% of the lesion images, and all experiments used this split. During training, 10% of the training data were held out for validation. The weights that achieved the highest accuracy were retained. The suggested architecture was trained on the ISIC2018 dataset with the Adam optimizer, combined with a learning rate schedule that reduces the learning rate when performance stops improving for an extended period (i.e., validation patience). Furthermore, we implemented a batch rebalancing technique to improve the balance of lesion classes within each batch. The hyperparameters and their values used by the Adam optimizer for training are presented in Table 2.
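The learning rate behavior described above (reduce the rate once the validation metric has stopped improving for a set number of epochs) can be sketched in plain Python. Keras offers a `ReduceLROnPlateau` callback for this purpose, but the paper does not state whether it was used; the function below is an independent sketch, and the loss values, `factor`, `patience`, and `min_lr` settings are assumptions rather than the values in Table 2.

```python
from typing import Iterable, List


def plateau_schedule(val_losses: Iterable[float], initial_lr: float,
                     factor: float = 0.1, patience: int = 2,
                     min_lr: float = 1e-6) -> List[float]:
    """Return the learning rate in effect after each epoch.

    The rate is multiplied by `factor` whenever the validation loss has
    failed to improve for `patience` consecutive epochs.
    """
    lr = initial_lr
    best = float("inf")
    stale = 0  # epochs since the last improvement
    history = []
    for loss in val_losses:
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                lr = max(lr * factor, min_lr)
                stale = 0
        history.append(lr)
    return history


# Hypothetical validation losses for eight epochs.
losses = [0.9, 0.7, 0.7, 0.71, 0.6, 0.65, 0.64, 0.66]
rates = plateau_schedule(losses, initial_lr=1e-4, factor=0.5, patience=2)
print(rates)
```

In this made-up trace the rate is halved after epochs 4 and 7, each time following two epochs without a new best validation loss.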

Performance Assessment
This part of the study explains the evaluation metrics used and their outcomes. Classifier accuracy (Acc) is the most commonly used statistic for evaluating classification effectiveness. It is defined as the number of instances (images) categorized correctly divided by the total number of instances (images) in the dataset under analysis, as expressed in Equation (2). Two further metrics are generally used to evaluate image classification systems: precision (Pr) and recall (Rc). Precision is the proportion of images assigned to the positive class that truly belong to it, as expressed in Equation (3). Recall is the proportion of images that truly belong to the positive class that are classified correctly, as expressed in Equation (4). The F-score is the harmonic mean of precision and recall; a higher value indicates better predictive ability. The effectiveness of a system cannot be judged on the basis of precision or recall alone. Equation (5) is the mathematical representation of the F-score (Fs).

$$\mathrm{Acc} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n} \quad (2)$$

$$\mathrm{Pr} = \frac{T_p}{T_p + F_p} \quad (3)$$

$$\mathrm{Rc} = \frac{T_p}{T_p + F_n} \quad (4)$$

$$\mathrm{Fs} = 2 \times \frac{\mathrm{Pr} \times \mathrm{Rc}}{\mathrm{Pr} + \mathrm{Rc}} \quad (5)$$

where $T_p$ indicates a true positive, $T_n$ indicates a true negative, $F_p$ indicates a false positive, and $F_n$ indicates a false negative.
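The four metrics referred to as Equations (2)-(5) can be computed directly from the confusion-matrix counts. The sketch below uses the standard definitions; the counts passed in are made up for illustration (chosen only so that they sum to a test set of 300 malignant and 360 benign images) and are not results from this study.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F-score from confusion-matrix counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


# Hypothetical counts with malignant as the positive class.
metrics = classification_metrics(tp=240, tn=310, fp=50, fn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the F-score is a harmonic mean, it stays low whenever either precision or recall is low, which is why it complements the two individual metrics.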

Performance of Different DCNN Models
Different DCNNs (CNN, Resnet50, Inception, and Inception Resnet) were implemented for training and testing on the ISIC 2018 skin lesion classification challenge dataset. The results of the various assessments of the suggested systems on the ISIC2018 dataset are presented for an 80-20 split between training and testing. This division was chosen to minimize the impact on execution time. The CNN, Resnet50, Inception, and Inception Resnet models were trained for 50 epochs using 10% of the training set as a validation set, with batch sizes ranging from 2 to 32 and learning rates varying from $1 \times 10^{-4}$ to $1 \times 10^{-6}$. Moreover, Resnet50, Inception, and Inception Resnet were fine-tuned by freezing different numbers of layers to achieve the best accuracy. Each parameter configuration was trained three times (runs 1-3, Tables 3-6) to construct the model ensemble. The accuracy fluctuated from run to run because the weights were initialized at random for each run; only the best run outcome was saved. Tables 3-6 show the accuracy results for the proposed models.

Diagnostic effectiveness was assessed using the receiver operating characteristic (ROC) curve and the area under it (AUC). The ROC curve depicts the model's classification performance as a function of two parameters: the true-positive rate and the false-positive rate. The AUC is calculated by summing small trapezoidal segments under the ROC curve. Figure 11 shows the ROC analysis for the CNN model, which had an AUC of 0.83. The best-case ROC outcome for the suggested model after fine-tuning using InceptionV3 is shown in Figure 12.

These results suggest that the proposed strategy could be used in real-world settings to help radiologists diagnose skin cancer more accurately from lesion images, while simultaneously lowering their workload.
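The trapezoidal computation of the AUC mentioned above can be sketched in a few lines of Python. The ROC points below are hypothetical and do not correspond to Figure 11 or Figure 12; in practice they would come from sweeping the decision threshold over the model's predicted scores.

```python
from typing import Sequence


def trapezoidal_auc(fpr: Sequence[float], tpr: Sequence[float]) -> float:
    """Area under an ROC curve given matching false- and true-positive rates."""
    if len(fpr) != len(tpr) or len(fpr) < 2:
        raise ValueError("need at least two matching ROC points")
    points = sorted(zip(fpr, tpr))  # order the points by false-positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Each segment is a trapezoid: width times the mean of its two heights.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points, including the (0, 0) and (1, 1) end points.
fpr = [0.0, 0.2, 0.5, 1.0]
tpr = [0.0, 0.6, 0.9, 1.0]
auc = trapezoidal_auc(fpr, tpr)
chance_auc = trapezoidal_auc([0.0, 1.0], [0.0, 1.0])  # diagonal of a random classifier
print(round(auc, 3), chance_auc)
```

The diagonal gives an area of 0.5, the reference value for a classifier that performs no better than chance.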

Comparison with Other Methods
The efficacy of the suggested method was compared with that of existing methods to better demonstrate its viability. Table 7 shows that our strategy outperformed the other networks. In the proposed approach, the Inception model had an overall accuracy rate of 85.7%, outperforming the existing models.

Discussion
According to our findings, none of the other approaches matched the accuracy of our method. We attribute this to (i) the overall resolution improvement provided by ESRGAN, (ii) fine-tuning to learn dataset-specific features, and (iii) our use of several architectures, each with a different capacity to generalize and adapt to various data. Because the pretrained networks lack features specific to medical images, the transfer learning architectures could not achieve a higher classification accuracy. Despite being better at classifying natural images, Resnet50 had a lower classification accuracy than InceptionV3 on medical images. These findings suggest that shallower networks, such as InceptionV3, have more generalizable features that can be used for a wider variety of imagery. Deeper networks such as Resnet50 and Inception Resnet, on the other hand, learn abstract features that may be applied to any domain. Because the features of InceptionV3 are less semantically tied to natural images, they are more generalizable and adaptable when applied to medical images than those of Resnet50 and Inception Resnet. Furthermore, fine-tuning the networks improved the accuracy of the four models. Compared to Resnet50 and Inception Resnet, InceptionV3 showed the largest increase in accuracy. According to the results of this study, deep networks are more likely to acquire relevant features when fine-tuned on a smaller dataset than shallow networks. The confusion matrices and numerical data shown in Figures 9 and 10 indicate that the suggested procedures were sufficient.

Conclusions and Future Work
By analyzing images of skin lesions, we developed a technique for quickly and accurately classifying lesions as benign or malignant. The suggested system uses image enhancement to boost the luminance of the lesion image and reduce noise. Resnet50, InceptionV3, and Inception Resnet were all trained on top of the preprocessed lesion images to prevent overfitting and to improve the overall capability of the suggested DL methods. The ISIC2018 lesion image dataset was used to test the proposed system's performance. In the proposed approach, the Inception model had an overall accuracy rate of 85.7%, which is comparable to that of experienced dermatologists. In addition to experimenting with several models (the designed CNN, Resnet50, InceptionV3, and Inception Resnet), this study's innovation and contribution are the use of ESRGAN as a preprocessing step. Our designed model showed results comparable to those of the pretrained models. According to the comparative analysis, the proposed system outperformed current models. To establish the effectiveness of the suggested method, tests on a large, complex dataset that includes many cancer cases are needed. In future work, we may employ Densenet, VGG, or AlexNet to analyze the cancer dataset.

Conflicts of Interest:
The authors declare no conflict of interest.