Optimization System Based on Convolutional Neural Network and Internet of Medical Things for Early Diagnosis of Lung Cancer

Recently, deep learning and the Internet of Things (IoT) have been widely used in healthcare monitoring systems for decision making. Disease prediction is one of the emerging applications in current practice. In the method described in this paper, lung cancer prediction is implemented using deep learning and IoT, which is a challenging task in computer-aided diagnosis (CAD). Because lung cancer is a life-threatening disease that must be identified with a high detection rate, disease-related information is obtained from IoT medical devices and transmitted to the server. The medical data are then processed and classified into two categories, benign and malignant, using a multi-layer CNN (ML-CNN) model. In addition, a particle swarm optimization method is used to improve the learning ability (loss and accuracy). This step uses medical data (CT scans and sensor information) based on the Internet of Medical Things (IoMT). For this purpose, sensor and image information from IoMT devices and sensors are gathered, after which the classification is performed. The performance of the proposed technique is compared with well-known existing methods, such as the Support Vector Machine (SVM), probabilistic neural network (PNN), and conventional CNN, in terms of accuracy, precision, sensitivity, specificity, F-score, and computation time. Two lung datasets were used to evaluate the performance: the Lung Image Database Consortium (LIDC) and Linear Imaging and Self-Scanning Sensor (LISS) datasets. Compared to alternative methods, the trial outcomes showed that the suggested technique has the potential to help radiologists make an accurate and efficient early lung cancer diagnosis.
The performance of the proposed ML-CNN was analyzed using Python. Across the tested numbers of instances, the accuracy was 2.5–10.5% higher, the precision 2.3–9.5% higher, the sensitivity 2.4–12.5% higher, and the F-score 2–30% higher than those of the previously known methods, while the error rate was 0.7–11.5% lower and the computation time (170 ms to 400 ms) was shorter. The proposed ML-CNN architecture thus outperforms previous works.


Introduction
The development of Internet of Things (IoT)-based medical services and sensor systems has created numerous new study topics. IoT, cloud services, and intelligent systems (SIs) have all been covered by the current literature, changing traditional health services. A typical computer-aided diagnosis (CAD) pipeline involves the following stages:

• Image Preprocessing: In this step, image quality is improved by removing undesirable distortions and correcting geometric transformations of the image (rotation, scaling, and translation). Essential preprocessing tasks include denoising, contrast enhancement, and normalization [4].

• Feature Extraction: This step converts the given input data into a set of features. In medical image processing, feature extraction starts with an initial set of reliable data, and derived values known as features are computed from it [5]. This makes the classification method much more suitable for prediction. Most medical image classifiers in CAD are based on the result of feature extraction.

• Feature Selection: This step reduces the feature space extracted from the whole image, enhancing the detection rate and reducing the execution and response time. The reduced feature space is obtained by discarding irrelevant, noisy, and redundant features and choosing the feature subset that achieves good performance across all metrics. In this stage, the dimensionality is reduced [6].

• Classification: In this stage, the selected features are used to assign each sample to its desired class. However, classification remains a major challenge in medical image analysis. For automatic classification, optimal sets of features are used, and this step needs further improvement for the better clinical analysis of diseases [7].

Computer-aided diagnostic systems have proven efficient in finding and diagnosing problems more rapidly and accurately. Unfortunately, this diagnostic technique is prone to several flaws and ambiguities, which might lead to an inaccurate prognosis. Hence, IoT sensor information is fused into the classification model. Machine learning (ML) models are currently used to indicate lung disease in computerized tomography (CT) images. The pipeline architecture is given in Figure 1. The following are the paper's significant contributions [8]:

• Improved CNN Model: The CNN's efficiency is optimized to boost the lung cancer diagnostic performance, and its computation is based on various inputs. The CNN calculates the improved range of values based on the extracted features [9].

• Hyperparameter Tuning: The learning parameters for the CNN must be generated on demand. This is treated as an optimization problem in which hyperparameters are tuned using the particle swarm optimization (PSO) algorithm [10][11][12].
• Significant Performance Gains in Diagnosis Complexity: The optimized CNN architecture employs fewer convolutional layers by considering relevant information from the convolutional, fully connected, and max-pool layers. The proposed optimized CNN can reach minimal complexity as a result of the primary structural view of the CNN [13,14].

• Consideration of IoT Sensors and Medical Images: A combination of sensor information and medical images is used to diagnose lung cancer [15,16].

Figure 1 shows the pipeline for IoMT-based lung disease diagnosis together with the design of the research model, giving the scientific effort a defined road map.

Motivation for Lung Cancer Research
People affected by the disease come from many regions of the country and lose their identity, pleasures, and quality of life every day. These facts inspired us to identify and analyze the disease's underlying causes and to address the need for the early detection of lung cancer. Other motivating factors are summarized below: (i) Earlier diagnosis can improve the patient's survival and quality of life.
(ii) For people who have been diagnosed with lung cancer, research can provide a better and longer future.
(iii) It can ultimately increase the number of survivors. In advanced medical imaging, the internal organs and tissues of the body are visualized using one or more specialized diagnostic techniques, referred to as "advanced imaging." Without any human oversight, the CNN automatically finds essential characteristics. The convolutional neural network (CNN) is a multi-layer perceptron-style model that can automatically learn the distribution law of data from a massive dataset.
The convolutional neural network (CNN) uses a significant quantity of data to identify the complicated representation of visual input. It is essentially inspired by the human visual system, including the way humans perceive colors. Convolutional networks treat images as volumes, i.e., three-dimensional objects: Red, Green, and Blue (RGB) encoding and the mixing of these three colors create the color spectrum in digital pictures. Each layer produces a dot product of the input pixels before moving on to the following layer. Along with the fundamental CNN structure, the CNN contains the following levels of activities.
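The per-layer dot product over an RGB volume described above can be sketched as a minimal valid-mode 2D convolution in NumPy. This is an illustrative sketch of the generic operation, not the paper's ML-CNN implementation; the kernel and image shapes are assumed for demonstration.

```python
import numpy as np

def conv2d_single(image, kernel):
    """Valid-mode correlation (the 'convolution' used in CNNs) over an RGB volume."""
    H, W, C = image.shape
    kh, kw, kc = kernel.shape
    assert kc == C, "kernel depth must match the image depth"
    out = np.zeros((H - kh + 1, W - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # dot product between the kernel and the local image patch
            patch = image[y:y + kh, x:x + kw, :]
            out[y, x] = np.sum(patch * kernel)
    return out
```

For a 4 × 4 × 3 all-ones image and a 2 × 2 × 3 all-ones kernel, every output value is the patch size 2 · 2 · 3 = 12, and the output map is 3 × 3.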

In particle swarm optimization (PSO), the potential solutions are particles. Particles traverse the problem space, and each is associated with a fitness value. Each particle keeps track of the coordinates of the best solution it has achieved so far; this fitness value is called pbest. The best value obtained by any particle when the whole population is taken as its neighborhood is the global best, referred to as gbest.
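The pbest/gbest bookkeeping described above can be sketched as a generic PSO minimizer. This is an illustrative sketch, not the paper's tuned variant; the inertia weight `w` and acceleration coefficients `c1`, `c2` are assumed standard values.

```python
import random

def pso(fitness, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
    """Minimize `fitness` over R^dim with a basic particle swarm."""
    # initialize positions in [-1, 1]^dim with zero velocities
    pos = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # per-particle best position
    pbest_val = [fitness(p) for p in pos]
    g = pbest_val.index(min(pbest_val))
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # velocity update: inertia + pull toward pbest + pull toward gbest
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])
            if val < pbest_val[i]:             # update personal best
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:            # update global best
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Minimizing the sphere function `sum(x**2)` with this sketch drives the global best close to the origin within a few dozen iterations; the same loop can score CNN hyperparameter vectors instead of coordinates.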
The rest of the paper is structured as follows: Section 2 discusses the preliminary knowledge about lung cancer diagnostic methods. Section 3 describes the related work, which elaborates on the current state-of-the-art methods for the classification of lung cancer. Section 4 discusses the proposed system model for lung cancer diagnosis. Section 5 discusses the experimental results for the proposed as well as existing methods. Section 6 concludes the paper and outlines future work.

Primary Knowledge
Current classification methods can be divided into two categories: supervised and unsupervised. In SVM, the Gaussian kernel is used to perform the transformation when there is no prior knowledge of the data. Random Forest is also used throughout the cited studies. Because of its efficiency and short processing time, SVM has received growing interest as a technique for finding the best model. However, an SVM algorithm for lung cancer disease does not cope well with small-scale datasets and can still require a long processing time [17].
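For reference, the Gaussian (RBF) kernel mentioned above maps a pair of samples to a similarity score without requiring prior knowledge of the data distribution. A minimal NumPy sketch follows; the bandwidth `sigma` is an assumed illustrative value, not one taken from the cited studies.

```python
import numpy as np

def gaussian_kernel(x1, x2, sigma=1.0):
    """RBF kernel: k(x1, x2) = exp(-||x1 - x2||^2 / (2 * sigma^2))."""
    sq_dist = np.sum((np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)) ** 2)
    return float(np.exp(-sq_dist / (2.0 * sigma ** 2)))
```

Identical inputs give a similarity of 1.0, and the score decays toward 0 as the squared distance grows relative to `sigma`.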
The automatic segmentation of components present inside an image, such as the region of interest in diagnostic images, has improved because of new methodologies combining transfer learning with techniques such as fine-tuning [18]. Deep learning and artificial intelligence techniques are applied to this data collection to develop statistics, which professionals then analyze to help patients lead healthier lives and avoid illness. Patients' data are collected via health records (HRs), IoT sensor systems, personal and mobile phones, internet data, and social networks, which are emphasized among recent innovations. The PH uses artificial intelligence (AI) approaches to enhance illness development modeling, illness prognosis, self-regulation, and clinical intervention procedures using the generated datasets [19]. Recent studies are progressively looking into IoT devices in the healthcare profession, and healthcare diagnosis is routinely aided by them. Lung cancer patients will have a better chance of surviving if they can be diagnosed early. Deep learning and machine learning are frequently employed in the screening and effective identification of lung disease. This study's major goal is to examine the function of deep learning in identifying and diagnosing lung disease. As a result, we employed a convolutional neural network (CNN), a type of deep learning model, to diagnose lung disease using radiography imaging [20]. A CT scan of the lungs can be used to identify lung nodules, localize them, and determine how dangerous they are. An improved CNN has high accuracy and favorable temporal complexity characteristics; however, it is not as sensitive to the setup as prior frameworks.

Related Work
A survey of roughly 36 articles that employed computational methods [21] to forecast various diseases was carried out in this study. The research covers a variety of neural network models used to identify a range of illnesses and identifies gaps that can be filled in the future for detecting lung disease in healthcare IoT. Every method was examined stepwise, and the general disadvantages were identified. Furthermore, the review examined the type of information utilized to predict the disease in question, that is, whether the data were benchmark or manually obtained. Lastly, strategies are defined and categorized based on several accessible approaches. This will aid future research in correctly identifying malignant patients in their initial stages.
In [22], the authors propose an efficient SVM for lung classification, wherein the SVM's characteristics are adjusted, and features are extracted using a refined gray wolf optimization algorithm in combination with a genetic algorithm (GWO-GA). The experimental phase is classified into three categories: parameterization testing, feature engineering, and optimal SVM. A benchmarking image dataset of 50 low-dose and archived lung CT scan images was used to monitor the effectiveness of the provided technique. The given work demonstrated the performance on the entire sample image dataset in various ways. Furthermore, it has an accuracy rate of 93.54%, which is much better than the evaluated approaches.
The current study presents a novel prototype for the feature extraction and classification of computed pulmonary tomography based on the Internet of Things. Combining the backpropagation algorithm with Parzen's probability density, the presented approach employs a software development kit (API) known as the Internet of Medical Things to identify lung images. The method worked well, with more than 98% of lung images classified correctly. The model then moves on to the lung segmentation method, which creates a lung map and uses fine-tuning to detect the pulmonary boundaries on the CT image using the Mask R-CNN network. The hypothesis was proven correct, as the suggested approach outperformed previous efforts in the research, achieving a categorization accuracy of 98.34% [23].
Again, for evaluation, this work utilized computed tomography (CT) images of the lung and a probabilistic neural network (PNN) for diagnosis. After the raw lung images are preprocessed, features are extracted using the Gray-Level Co-Occurrence Matrix (GLCM), and chaotic crow search algorithm (CCSA)-based feature selection is presented. The evaluation metrics employed were precision, sensitivity, predictive value, and accuracy. According to the investigation, on pictures with unevenly distributed cells, the CCSA-based feature selection effectively reaches 90%. When pulmonary development and degeneration are out of proportion, cancers of different kinds arise; cancer cells then grow and expand, eventually creating a tumor. Whenever atmospheric poisons contact lung tissue, they cause harm. The unclear features of the lung cancer nodule make computer-aided automated diagnosis difficult at the moment. A selection approach to extract key characteristics was utilized together with the average accuracy to improve the classification performance. According to the results, the PNN with CCSA-based feature selection outperforms the PNN without it [24].
Researchers proposed a novel Internet of Things (IoT)-based predictive model based on fuzzy-cluster-based enhancement and classification to forecast lung cancer development via ongoing analysis and to enhance health by delivering medical instructions. For successful image segmentation, the fuzzy clustering approach is utilized, which is based on the energy that can be transferred through extraction. In addition, the Fuzzy C-Means clustering technique is employed to classify the transition-area characteristics from the lung cancer image features. This method uses the Otsu thresholding approach to recover the transition zone from a lung image. In addition, the right-hand-side images and the morphology reduction procedure are applied to improve segmentation accuracy [25].
Morphology cleansing and image region filling procedures were conducted on an edge-detected lung image to obtain the image regions. The authors also present a novel continuous classification approach that integrates existing Association Rules Mining (ARM), the conventional Decision Tree (DT) with feature representation, and a CNN. LungNet, a unique combination DNN-network-based framework built using CT scans and healthcare IoT data obtained using a wearable camera, is presented to address the issue. LungNet is made up of a 22-layer CNN model that blends latent factors acquired from CT scan pictures with MIoT data to improve the system's detection ability. This system, which is controlled from a centralized server and was trained using a balanced dataset of 525,000 images, can classify lung disease into five classifications with high precision (96.81%) and a lower false-positive rate (3.35%) than comparable CNN-based classifications. Furthermore, it accurately differentiates phase 1 and phase 2 lung tumors into 1A, 1B, 2A, and 2B sub-classes with a probability of detection of 10.5% [26].
All of the devices in this system are sensitive to volatile organic compounds (VOCs) linked to lung cancer. The device's data had to be normalized before they could be used. Next, to improve classification success, characteristics were extracted from the data, and Principal Component Analysis was performed to eliminate superfluous characteristics. Finally, k-Nearest Neighbor and SVM algorithms were used to classify the extracted characteristics [27].
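The Principal Component Analysis step used above to eliminate superfluous characteristics can be sketched with an eigendecomposition of the covariance matrix. This is a generic illustration, not the cited study's code; the number of retained components `k` is an assumption of the sketch.

```python
import numpy as np

def pca_reduce(X, k):
    """Project samples (rows of X) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                   # center each feature column
    cov = np.cov(Xc, rowvar=False)            # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]] # columns = top-k components
    return Xc @ top                           # reduced representation
```

Because the data are centered before projection, the reduced features have zero mean, which also suits the sensor-data normalization discussed above.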
This research used computed tomography (CT) images to develop automated learning with quantitative analysis for lung identification and therapy. On CT samples acquired from a lung dataset, the proposed model applies Gaussian filtering (GF)-based preprocessing. The samples are then put through the normalized cuts (Ncuts) method, which can spot nodules present in the image. In addition, the oriented FAST and rotated BRIEF (ORB) technique is used as a feature extractor. Finally, the sunflower optimization-based wavelet neural network (SFO-WNN) model is used for lung cancer diagnosis. A descriptive analysis was conducted to evaluate the diagnosis results of the MLDS-LCDC model, and the findings were tested based on many factors [28].
The use of automated and optimal computer-aided diagnosis for lung disease is proposed in this paper. The procedure begins with the normalization and noise removal of the input images in a preliminary step. After that, Kapur entropy maximization and classification techniques are used to segment pulmonary areas. Next, 19 GLCM features were extracted from the segmented images, and the features with the highest priority were then chosen to reduce the program's complexity. The feature selection method uses a new optimization approach known as Enhanced Thermal Exchange Optimization (ITEO), which aims to increase reliability and stability. ITEO uses an optimized artificial neural network to classify images as healthy or malignant [29].
Magnetic resonance imaging (MRI), computed tomography (CT) scans, ultrasonic devices, and positron emission tomography (PET) have all been used widely in the past; nevertheless, these methods have significant disadvantages, including cost and the use of radioactive material. Because of its convenience, accessibility, and low price, sensor synchronization in processing has recently been adopted in advanced screening. In terms of medical use, biosensing technologies are becoming common tools for cancer screening, and intelligent sensor advancement has been realized. This article describes the results of studies on the use of biosensing in the treatment of lung cancer. According to the study, the electrochemical biosensor is the most widely utilized for early-stage lung cancer diagnosis [30].
In this study, the researchers present a new deep learning approach for lung cancer detection, which combines a convolutional neural network (CNN) with kernel k-means clustering. The proposed methodology was tested using the Anti-PD-1 Immunotherapy Lung datasets from The Medical Image Archives (https://www.kaggle.com/datasets/nih-chestxrays/data and https://comport.com/healthcare-it-solutions/medical-image-archives/, accessed on 19 October 2022). The researchers utilized 400 MRI images from these data, which were carefully classified and included 150 routine lung scans and 250 cancer images. The framework was used to analyze the information throughout the first step. The kernel k-means clustering approach flattens neurons in the feature space for each image resulting from the convolution operation in the CNN. The center of each cluster is then determined using this procedure, which provides the predicted classification of each piece of data in the cross-validation. Various data points were used in k-fold cross-validation to assess the effectiveness of the suggested approach. When employing kernel functions with sigma = 0.05 in nine-fold cross-validation, the presented approach achieved the best set of metrics, with 98.85% accuracy, 98.32% sensitivity, 99.40% precision, 99.39% specificity, and a 98.86% F1-measure [31].
Employing dilated SegNet and convolutional neural network (CNN) models, an intensive CAD system for cancer detection with chest tomography images was constructed in this study. To segment the lung from chest CT images, a dilated SegNet model was used, and a CNN model with batch normalization was built to identify the real nodules among all probable nodules. Using example cases from the LUNA16 dataset (https://luna16.grand-challenge.org/, accessed on 19 October 2022), the dilated SegNet and CNN classifiers were trained. The nodule classification was assessed using sensitivity, and the segmentation model's efficiency was quantified by the Dice coefficient. The statistical method confirmed the detection accuracy of the features obtained by CNN classification [32].
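The Dice coefficient used above to quantify segmentation overlap has a standard formulation, sketched here in NumPy as an illustration; the smoothing term `eps` is an assumed guard against empty masks.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |pred ∩ target| / (|pred| + |target|) for binary masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```

Identical masks score close to 1.0, and fully disjoint masks score close to 0.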
Another study provided a combined lung lesion detection and recognition network that can simultaneously identify, segment, and classify pulmonary nodules while accounting for inherent labeling uncertainties in the training dataset. It performs the final identification of nodules and the segmentation of the discovered nodules. For the improved analysis of 3D data, both the nodule identification and categorization subnets of the proposed combined system use a 3D encoder-decoder design. Furthermore, the categorization subnet boosts the categorization model by incorporating characteristics extracted from the detection subnetwork with multiresolution nodule-specific data. When compared to the current backpropagation method, the proposed method provides useful prior knowledge to better optimize the more difficult 3D classifier correlational design and better distinguish worrisome clusters from other tissues [33].
Integrating two feedforward attention processes, one at the nodule level and the other at the item level, a hierarchical attention-based multiple instance learning (HA-MIL) paradigm for diagnosing the disease was developed. The HA-MIL platform was built by combining important parts' representations with nodule information and then combining the nodule representation with lung recognition. The HA-MIL framework outperformed existing techniques, such as instance-space MIL and embedding-space MIL, on the public Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) dataset (https://paperswithcode.com/dataset/lidc-idri, accessed on 19 October 2022), demonstrating the efficacy of hierarchical attention learning. The HA-MIL model also detected the important nodules and attributes by assigning them greater attention weights, according to the analysis of the findings [34].
Traditional Chinese Medicine (TCM) has been proven to be an excellent therapy for treating lung disease, and precise syndrome classification is critical to treatment. The creation of smart TCM syndrome discrimination algorithms is enabled by concrete evidence from TCM treatment instances and the advancement of intelligent systems. This is intended to broaden the scope of TCM's advantages for lung cancers. This study aimed to develop end-to-end TCM diagnostic systems that might mimic lung disease classification. The developed models employed unorganized patient files as inputs to take advantage of data gathered by lung professionals in actual TCM therapy instances. The resulting algorithms outperformed those that used unstructured TCM data [35,36].

Even though the articles in the related work have their significance, they can still improve their accuracy and performance. Through the understanding gained by the analysis of the above articles, this research has identified that ML-CNN with PSO and hyperparameter tuning provides higher accuracy, performance, and precision:

• SVM with gray wolf optimization and the genetic algorithm has 93.54% accuracy;
• R-CNN with the backpropagation algorithm and Parzen's probability density has 98.34% accuracy;
• The Gray-Level Co-Occurrence Matrix and chaotic crow search algorithm with a probabilistic neural network provide 90% accuracy;
• The CNN with Association Rules and the conventional Decision Tree (DT) provides 96.81% accuracy;
• Gaussian filtering with the sunflower optimization-based wavelet neural network has less than 90% accuracy;
• Enhanced Thermal Exchange Optimization with Kapur entropy maximization and classification techniques has less than 90% accuracy;
• The convolutional neural network, in nine-fold cross-validation, provides 98.32% accuracy.
However, our approach, ML-CNN with PSO and hyperparameters, provides 98.85% accuracy, which is the highest among all compared methodologies.

System Model
In this section, the proposed system model is given in detail. Through IoMT technologies, sensors, and devices, the classification of lung cancer is implemented. Figure 2 illustrates the proposed model in detail.



Preprocessing
Firstly, lung images are denoised to improve their quality. In this work, a Bayesian threshold-based Taylor series approach is proposed for denoising noisy pixels into better-quality pixels. This approach is based on the Taylor series since it provides high-resolution wavelet subbands. The Taylor series defines an infinite sum of terms estimated from derivative functions, in which the values are the (varying) pixels; on this basis, the Taylor series expansion is written for each 3D frame. Figure 2 clearly illustrates the system architecture and functions for core comprehension, explaining the ML-CNN and feature extraction models together with all preprocessing concepts.
In general, an image comprises different pixel values since the intensities are not the same in all places of an image. Therefore, the changes in pixel values need particular attention during denoising. From the defined Taylor series, the noise in the subbands is computed and eliminated if it is greater than the estimated Bayesian threshold. Following the BayesShrink rule, the threshold for all subbands is formulated as

T_B = σ_N² / σ_s, (2)

where σ_N represents the noise estimate, and T_B is the threshold obtained for all bands. When σ_s ≠ 0, it is formulated as

σ_s = √(max(σ_y² − σ_N², 0)), (3)

where σ_y is defined as

σ_y² = (1/Z) Σ_{(i,j) ∈ SB_i} Y²(i, j). (4)

The subbands are SB_i = {LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH}, which constitute the total number of subbands, and σ_N is determined by the median estimator

σ_N = median(|Y(i, j)|) / 0.6745, Y(i, j) ∈ HHH. (5)

Upon estimating the T_B value, the obtained values of the Bayesian threshold are arranged, and an expression is then fitted using the curve fitting method (Equation (6)), in which β defines the standard noise deviation and the fitted coefficients are p_1 = 0.9592, p_2 = 3.648, p_3 = −0.138, and q = 0.1245. Next, a BM3D filtering algorithm is applied, whose mathematical formulation is expressed as

Ŷ_SB = T_3D^{−1}(∝(T_3D(Z_SB))), (7)

where T_3D is the unitary 3D transform, Ŷ_SB denotes the stacked subbands, Z is the subband size, and ∝ defines the threshold operator. From this final formulation, each subband is reconstructed, yielding denoised frames for feature extraction.

Redundant frames are then removed using Relative Entropy. The distance between consecutive frames F_i and F_{i+1} is measured as

D_RE(F_i, F_{i+1}) = Σ_n P_i(n) log(P_i(n) / P_{i+1}(n)), (12)

where D_RE(F_i, F_{i+1}) denotes the distance measure of Relative Entropy, and its square root gives the Square Root of Relative Entropy between the frames. The term P_i = {P_i(1), P_i(2), ..., P_i(n)} represents the probability distribution of the frames F_i and F_{i+1} present in a video, obtained from a normalized intensity histogram of n bins. The bin count is n = 256, and k represents the total number of frames present in each shot.
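The Bayesian subband threshold T_B = σ_N²/σ_s described above follows the standard BayesShrink scheme, which can be sketched in NumPy as below. This is an illustration under that assumed standard formulation only; the curve-fitting step and BM3D filtering from the full pipeline are omitted.

```python
import numpy as np

def noise_estimate(diag_subband):
    """Robust median estimator of the noise deviation on the finest diagonal subband."""
    return float(np.median(np.abs(diag_subband)) / 0.6745)

def bayes_threshold(subband, sigma_n):
    """BayesShrink-style threshold T_B = sigma_N^2 / sigma_s for one subband."""
    sigma_y2 = float(np.mean(np.square(subband)))            # raw subband energy
    sigma_s = np.sqrt(max(sigma_y2 - sigma_n ** 2, 0.0))     # signal deviation, clipped at 0
    if sigma_s == 0.0:
        # subband is essentially pure noise: threshold everything away
        return float(np.max(np.abs(subband)))
    return sigma_n ** 2 / sigma_s
```

For example, with a noise estimate of 1.0 and a subband whose mean squared coefficient is 5, the signal deviation is 2 and the threshold is 0.5.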
Upon estimating the distance measures based on Relative Entropy and the Square Root of Relative Entropy, a comparison is made between them along the time direction. From this comparison, alterations in frames are identified, and redundantly occurring frames are eliminated. In the next step, image intensity values are normalized into equal ranges for correct processing at the next level; the normalization range is between 0 and 1. The Z-score normalization function is used for intensity normalization, the performance of which is better than the decimal and min-max normalization techniques. This normalization function is computed as

Y_i = (x_i − μ) / δ,

where Y_i represents the normalized intensity value for the input image, μ is the mean intensity value, δ represents the standard deviation, and x_i represents the intensity at position i. This step is mandatory for all images in both training and testing operations. It minimizes the overhead or complexity for similar medical image retrieval processes with the same properties (size, intensity values, and so on). Table 1 shows the list of extracted features using the above-mentioned methods. The following features were extracted from the lung images:
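The Z-score step described above, followed by rescaling into the [0, 1] range the pipeline expects, can be sketched as follows. This is an illustrative sketch; the final min-max rescaling is an assumption made to land the standardized values in [0, 1].

```python
import numpy as np

def normalize_intensities(image):
    """Z-score standardize an image, then rescale the result into [0, 1]."""
    # Z-score: subtract the mean intensity, divide by the standard deviation
    z = (image - image.mean()) / image.std()
    # min-max rescaling of the standardized values into [0, 1]
    return (z - z.min()) / (z.max() - z.min())
```

Because both steps are affine, the mapping preserves the relative ordering of intensities while forcing every image into the same range, which is what makes training and testing images directly comparable.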

Homogeneity
It is defined as the closeness of the distribution of entries in the matrix. It is computed in all directions of the image.

Entropy
It is defined by the complexity or unevenness of textures, E = −Σ_{i,j} P(i, j) log2 P(i, j), where P(i, j) is the normalized image matrix entry for one pixel coordinate pair and n is the total number of distinct gray levels in the image.

Area
It is computed as the number of pixels in the largest axial slice (lar_aS) multiplied by the pixel resolution.

Aspect Ratio
It is computed from the major-axis and minor-axis lengths.

Roundness
It is computed as the similarity of the lung nodule region to a circular shape, 4πA/L².

Perimeter
It is defined as a structural property of the list of boundary coordinates: the sum of the distances between consecutive coordinates.

Circularity
It is computed from the largest axial slice (lar_aS) of each lung nodule image, where ΔA is the area of one pixel in the shape I(x, y), (X_i, Y_i) are the ith pixel coordinates, A is the nodule area, and L is the length of the nodule region boundary. MA(x, y)[L] is the major-axis length and MI(x, y)[L] is the minor-axis length.

Kurtosis is defined by the 4th moment of the mean, and skewness by the 3rd moment of the mean values. The relative intensity differences in the given region define a further feature, where R_i is a random variable indicating the intensity, H(R_i) is the intensity histogram of the nodule region, l denotes the number of possible intensity levels, and σ represents the standard deviation. After feature extraction, an optimal subset of features is selected, since the classification method cannot process a huge feature set. For this reason, the best sets of features are selected from the extracted feature set by spatial transformer networks.
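A few of the tabulated shape and texture features reduce to one-line formulas. A small sketch (helper names are illustrative):

```python
import math

def roundness(area, boundary_length):
    """Roundness = 4*pi*A / L**2: equals 1.0 for a perfect circle and
    decreases as the nodule region departs from circularity."""
    return 4.0 * math.pi * area / boundary_length ** 2

def aspect_ratio(major_axis_len, minor_axis_len):
    """Ratio of the major to the minor axis length of the nodule region."""
    return major_axis_len / minor_axis_len

def histogram_entropy(p):
    """Entropy -sum(p * log2 p) of a normalized intensity histogram,
    a measure of texture complexity/unevenness."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)
```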
The CNN is a type of neural network with one major drawback: high training complexity. Because training depends on the loss values, frequent retraining is required. To address this issue, the PSO algorithm is proposed in this paper to find the optimal values of the hyperparameters according to the input types and ranges. This integration helps reduce the hardware cost of CNN training and requires fewer epochs for training and testing. Furthermore, this combination mitigates the local-minimum problem of regular backpropagation training. The structure of the CNN is represented as follows:

• Convolution Layers: This is the input layer that acquires the inputs, i.e., signals. This layer normalizes the range of all inputs into a single value for fast processing.
• Pooling Layers: This layer minimizes the input size, which automatically improves performance.
• Activation Layer: This layer uses the ReLU activation function, which handles non-linear cases.
• Fully Connected Layer: This layer uses the softmax activation function to control the output range.

The proposed CNN computes the weight values between neighboring neurons for a given problem.
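To make the layer roles concrete, here is a minimal single-channel forward pass in numpy; the layer sizes and weights are illustrative, not the paper's architecture:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D convolution (cross-correlation, as in CNN libraries)."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling to shrink the feature map."""
    h, w = x.shape
    x = x[:h - h % s, :w - w % s]
    return x.reshape(x.shape[0] // s, s, x.shape[1] // s, s).max(axis=(1, 3))

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def forward(image, kernel, w_fc, b_fc):
    """Convolution -> pooling -> ReLU -> fully connected -> softmax."""
    feat = relu(max_pool(conv2d(image, kernel)))
    logits = w_fc @ feat.ravel() + b_fc
    return softmax(logits)  # e.g. probabilities for benign / malignant
```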
Figure 3 shows a convolutional neural network (CNN), a deep neural network used for classifying and segmenting images. The CNN can be trained with both supervised and unsupervised machine learning techniques. Its built-in convolutional layer reduces the high dimensionality of images without loss of information. Table 2 lists the extracted low-level features, which capture fundamental characteristics automatically retrieved from an image without needing shape information or focusing on proximity. Table 3 shows lung nodules that are very common and typically not a cause for alarm; even so, finding out that one has a lesion on the lung can be frightening.

For the overall CNN layers, the parameters used in training and testing are tuned using the PSO algorithm, which optimizes performance through combined global and local search. The hybrid CNN with the PSO is described below.

Based on the fitness function, the PSO algorithm computes the position of the neighboring particle that is influenced by the current position, and hence each particle is updated accordingly. If a particle p_i(t) lies close to its local best p_best_i, the second term in the velocity-updating rule approaches zero, and for particles p_i(t) lying close to the global best p_best_g, the third term approaches zero. Thus, particles close to their local best or global best can evolve more easily than those far from their best positions. The velocity-updating rule is as follows:

V_i(t + 1) = w·V_i(t) + c_1·r_1·(p_best_i − p_i(t)) + c_2·r_2·(p_best_g − p_i(t)),

where V_i(t + 1) represents the ith particle's velocity, updated at every iteration, and the other variables are the particle's best positions and coefficient variables.

In the softmax layer of the proposed ML-CNN model, entropy functions are used for disease diagnosis and the threshold update for the given set of inputs. For classification into the appropriate classes, the features are extracted, the similarity between probability distributions is calculated using Relative Entropy, and an additional quantity, estimated using the Square Root of Relative Entropy, is measured to determine the threshold values for the classifications. In this work, the features are extracted from left to right for accurate class assignment. For this approach, the values of Relative Entropy and the Square Root of Relative Entropy are determined from the formulations given above. Algorithm 1 presents the ML-CNN classification pseudocode for clarity.
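The particle update can be sketched as follows; the inertia weight w and coefficients c1, c2 are illustrative defaults, and the sphere function stands in for the actual CNN-loss fitness tuned in the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

def pso_step(pos, vel, p_best, g_best, w=0.7, c1=1.5, c2=1.5):
    """One velocity/position update:
    v <- w*v + c1*r1*(p_best - x) + c2*r2*(g_best - x); x <- x + v.
    The second term vanishes near the local best and the third near the
    global best, matching the behaviour described in the text."""
    r1 = rng.random(pos.shape)
    r2 = rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (p_best - pos) + c2 * r2 * (g_best - pos)
    return pos + vel, vel

def optimize(fitness, dim=2, n_particles=30, iters=100):
    """Minimize `fitness` over a toy search space (a stand-in for the
    hyperparameter search space of the ML-CNN)."""
    pos = rng.uniform(-5.0, 5.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    p_best = pos.copy()
    p_fit = np.array([fitness(p) for p in pos])
    g_best = p_best[p_fit.argmin()].copy()
    for _ in range(iters):
        pos, vel = pso_step(pos, vel, p_best, g_best)
        fit = np.array([fitness(p) for p in pos])
        better = fit < p_fit
        p_best[better] = pos[better]
        p_fit[better] = fit[better]
        g_best = p_best[p_fit.argmin()].copy()
    return g_best, p_fit.min()
```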

Implementation
This study involved creating a fast and smart lung cancer detection system for use in the healthcare context. It classifies individuals with lung concerns using a deep-learning-based approach on a lung database. The information on the characteristics of the lung was determined using the ML-CNN approach. These data guide the search space of the ML-CNN technique, which determines the suitable class for the provided inputs using the PSO algorithm. The entire study was carried out in the Python programming language. The results are reported and organized into tables and graphs for comparison with other assessment parameters. The necessary description is provided in Table 4; the CNN learns using hyperparameters tuned by the PSO algorithm, which minimizes iterations and time. These settings are applied to prevent overfitting. The hyperparameters are calculated using shallow-tree and deep-tree models. Since we do not have enough data, we use the four-fold cross-validation method; as such, it is not a heuristic method.
The above values are called hyperparameters and define the training required to build models. Changing the values of the hyperparameters changes the generated model and is generally used to determine how the algorithm learns relationships in the data.
A set of examples that are not part of the training process and do not participate in model tuning is used to evaluate performance on the test set. Because no data leak into the models, this provides a safe estimate of how the model will perform in production.
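The held-out evaluation described above, combined with the four-fold scheme mentioned earlier, can be sketched as:

```python
import numpy as np

def four_fold_splits(n_samples, seed=0):
    """Shuffle sample indices and yield four (train, test) index pairs.
    Each fold is held out exactly once, so test samples never participate
    in training or model tuning."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, 4)
    for k in range(4):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(4) if j != k])
        yield train, test
```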
During implementation, various hierarchical models of the PSO algorithm were adjusted. For these systematic methods, a variety of performance measures were employed, all generated from the CNN model. The performance indicators used in our research are listed below. (i) LISS Dataset: We abbreviate this dataset as "LISS", a public database for research and academic purposes. It contains 511 2D CISLs for 252 patients, 166 3D images for 19 patients, and nine lung nodule signs captured at the Cancer Institute and Hospital of the Chinese Academy of Medical Sciences. These images were captured by GE LightSpeed VCT-64 and Toshiba Aquilion 64-slice CT scanners (Holt, MI, USA). The image format is DICOM 3.0. In the LISS database, the 2D instances cover the nine categories of lung nodule sign images, and the 3D instances cover only one sign (ground-glass opacity (GGO)).

Dataset Description
In LISS, all CT images are assigned their corresponding labels, and each image has a fixed size in pixels. The main reason we chose this database is that it comprises a large number of lung images with different nodule signs. The slice thickness in each class ranges from 0.418 to 1 mm, with a mean of 0.664 mm. The LISS database description is depicted in Table 5, which is categorized into two classes: CT imaging scans and annotated ROIs. In the LISS database, the CT imaging scans are stored in plain text, and the annotated ROI CT images are stored as an individual file per nodule sign. The description of each nodule sign is given below:
• GGO: This sign shows hazy increased attenuation in the lung with preserved vascular and bronchial margins. It is suggestive of lung adenocarcinoma and bronchioloalveolar carcinoma.
• Lobulation: This sign indicates growth of the connective-tissue septae, which contain fibroblasts recognised as coming from the perithymic mesenchyme (malignant lesion).

• Calcification: This is a sign of the deposition of insoluble salts (magnesium and calcium). Its characteristics, such as distribution and morphology, are significant in distinguishing whether a lung nodule is benign or malignant. Popcorn and dense calcifications indicate that the given lung image is benign, whereas central regions of lesions, spotted lesions, and an irregular appearance indicate malignancy. It is determined from lung nodule areas with high-density pixels, with CT scan ranges above 100 HU (Hounsfield Units).
• Cavity and Vacuoles: These are signs of hollow spaces surrounded by tissue; the vacuole is a type of cavity. They are suggestive of bronchioloalveolar carcinoma and adenocarcinoma, and the cavity is connected to tumors greater than 3 mm.
• Spiculation: This sign is caused by stellate distortion in tissue lesions. It results from the intrusion of the cancerous tumor and is one of the common lung nodule signs correlated with a desmoplastic response, with fibrotic strands radiating into the lung parenchyma. It is a sign of a malignant lesion.
• Pleural Indentation: This is a sign of a tumor-affected area, diagnosed in the tissues. The sign is connected with peripheral adenocarcinomas that comprise central or subpleural anthracotic and fibrotic foci.
• Bronchial Mucus Plugs (BMPs): These are also referred to as focal opacities. They vary in density and may have a liquefied density above 100 HU. They are due to allergic bronchopulmonary aspergillosis and represent intrabronchial air that has been replaced by mucus.
• Air Bronchogram: This is a sign of the formation of low-attenuation bronchi over the background of the high-attenuation airless lung. It is due to (1) proximal airway patency, (2) the removal of alveolar air through absorption (atelectasis) or replacement (pneumonia), or (3) the presence of both. In very rare cases, the air displacement is the result of marked interstitial expansion (lymphoma).

• Obstructive Pneumonia: This sign, marked with a rectangle in Figure 5, indicates a small lung nodule volume due to distal collapse, which is mainly caused by proximal bronchial blocking. It is associated with adenocarcinoma and squamous cell carcinoma. Visually, it appears as a cuneate or flabellate area of increased density.

From these descriptions of the lung nodule signs, radiologists have concluded that GGO, Cavity and Vacuoles, Lobulation, Bronchial Mucus Plugs, and Pleural Indentation represent malignant lung nodule lesions, whereas Air Bronchogram, Obstructive Pneumonia, and Calcification may be classified as either benign or malignant lesions. Figure 5 presents the sites of the nine lung nodule signs in images from the database.

(ii) LIDC-IDRI Dataset
The LIDC-IDRI dataset is the world's biggest publicly available database for researchers and consists of thoracic CT images (1018 CT scans) from 1010 patients (Figure 6). These images show the presence of lung cancers together with nodule annotations, i.e., outlines. The nodule annotations are available for all patients; however, diagnostic information is provided for only 157 patients, consisting of nodule ratings, where 0 is the unknown class, 1 is the benign class, 2 is the primary malignant class, and 3 is the metastatic (malignant) class.
The ratings assigned to each image are based on biopsy, surgical resection, progression, and revision of the radiological images. The diagnosis class is divided into two levels: (i) patient level and (ii) nodule level. To build this dataset for radiologists, images from around 1000 patients were obtained over a long period with different CT scanners. We present the diameter size distribution of lung nodules in the LIDC-IDRI database in Table 6. Multiple lung nodules accompany the different slices; each lung nodule slice comprises annotated data, and its presence is based on a 3 mm size threshold.
In the LIDC database, the diagnostic data are the means of ascertaining malignancy. We chose the ratings from the diagnostic data; the ground truth was processed for training in the triplet CBMIR system and was evaluated by comparing the results with the radiologist-given ratings in the database. This database provides annotations for nodules with sizes from 3 mm to 30 mm, shown in the yellow box.

Table 6. LIDC-IDRI database description.

Nodule Description	Number of Nodules
No. of nodules or lesions marked by at least 1 radiologist	7371
No. of nodules ≥3 mm marked by at least 1 radiologist	2669
No. of nodules ≥3 mm marked by all 4 radiologists	928

Performance Measures
Equation (14) shows the accuracy rate, which is the number of correct illness estimates relative to the total number of diagnoses. Equation (15) provides the capacity of a predictive result to reliably recognize persons without any signs of illness risk in the lung, which represents specificity. The capacity of a decision outcome to properly identify persons with the illness is characterized as sensitivity, which is estimated in Figure 8.
Precision for a category is the number of true positives (i.e., risk controls and indicators for people who belong to the true-positive group) relative to the total number of instances the classification model labels as positive, including those incorrectly assigned to the positive class.
The F-score is the harmonic mean of the precision and sensitivity scores, as shown in Figure 11. The processing time delay is the cumulative time lag in learning the computational model and then evaluating it; it can be seen in Equation (15). In the following, the performance of the lung cancer diagnosis model is discussed and compared to existing methods. Table 7 illustrates the formulas for the performance metrics, where TP = true positive, TN = true negative, FP = false positive, and FN = false negative.

The figures illustrate the performance of the model based on accuracy, sensitivity, specificity, precision, and execution time with respect to the number of instances. Based on the results shown in the figures, the proposed ML-CNN method produces the best results for lung cancer diagnosis. As a result of effective data collection, image preprocessing, feature extraction, feature selection, and classification with tuned input parameters, the performance of the proposed ML-CNN method reaches its maximum level. By addressing the limitations of poor-quality images and insufficient data and features, the performance is improved. Furthermore, the filtering accuracy is computed for the proposed method and the existing methods, which supports the classification accuracy when image-based lung cancer diagnosis is the focus.

After the selection of features (visual and semantic), they are subjected to a denoising process to remove any noise present in them. To evaluate the efficiency of the method in this paper, the filtering accuracy was analyzed. This metric defines the range of filtering, i.e., whether the frames with noise are completely removed. The filtering accuracy is formulated as the inverse of the mean square error, FA = 1/MSE, where MSE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)². Here, (y_i − ŷ_i)² is the square of the errors, and (1/n) Σ_{i=1}^{n} is the mean.
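The Table 7 metrics follow directly from the confusion-matrix counts; a sketch using the conventional definitions (with the F-score as the harmonic mean of precision and sensitivity):

```python
def classification_metrics(tp, tn, fp, fn):
    """Confusion-matrix metrics as used in the evaluation. The F-score
    here is the conventional harmonic mean of precision and sensitivity."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # true-positive rate (recall)
    specificity = tn / (tn + fp)          # true-negative rate
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, specificity, f_score
```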
According to the defined formulation, the filtering accuracy was determined and shows good results. The proposed Bayesian threshold method was compared with the previous fast bilateral filter used for removing noise from images. The comparative results of the filtering accuracy are demonstrated in Figures 12 and 13, which confirm the better efficiency of the proposed method. This result is due to the block-matching filter with a Bayesian threshold, which removes errors from all parts of the image. In contrast, a fast bilateral filter focuses on removing only the noise present at the diagonals.
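The filtering-accuracy metric is a one-liner; a sketch assuming a clean reference frame is available for comparison:

```python
import numpy as np

def filtering_accuracy(reference, denoised):
    """Filtering accuracy as the inverse of the mean square error
    between a clean reference frame and its denoised estimate."""
    mse = float(np.mean((np.asarray(reference) - np.asarray(denoised)) ** 2))
    return float("inf") if mse == 0.0 else 1.0 / mse
```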
This is the major reason for the reduced filtering accuracy of fast bilateral filters. The graphical comparison shows a gradual increase in filtering accuracy over time. An increase in this metric also increases the classification accuracy owing to the improved clarity of the images. Next, an optimization-based comparison was implemented for the proposed and previous approaches. In this work, the hyperparameters were tuned using the PSO algorithm, which was compared with traditional algorithms. Table 8 compares the other optimization methods with the PSO, listing their merits and demerits. We analyzed the suggested approach using a variety of PSO-selected attributes to evaluate its overall performance parameters; F1 = 0.6, F2 = 0.5, and h = 1.0 give the maximum degree of precision. Table 9 presents the test results for the various parameters. We ran several preliminary trials to find the best empirical combination of particle and iteration numbers and discovered that 2500 particles and 29 iterations produced the final performance results given in Table 11 and Figure 7. The PSO data, PSO iterations, and characteristics are described in Tables 10-12, which aids the model's ability to forecast results accurately.

Conclusions
Virtual access to health imaging, including lung cancer computed tomography, is one of the benefits of IoT-based healthcare institutions. IoT nodes have made it easier to leverage data collected by various servers to investigate illness trends. As a result, a cancer diagnosis can be made by utilizing this information for CNN training. In this study, we developed a new classifier based on a thorough multi-layer convolutional neural network. The ML-CNN is a general-purpose classifier that may be used to detect and categorize biological imagery; here, it is used to detect and categorize pulmonary nodules in CT scan images. The proposed ML-CNN classifier is trained on two categories: nodule (diseased, malignant) and non-nodule (non-diseased, benign or normal). Firstly, the sensor information is preprocessed using the normalization technique and then forwarded to the ML-CNN with PSO for feature extraction and classification. Secondly, the lung images are preprocessed using a Taylor-series-based Bayesian threshold for noise reduction, and Z-score normalization is used to normalize the intensity values. Then, lung image information such as color, shape, and intensity is extracted, and the performance is improved by tuning the hyperparameters using the PSO approach.
The ML-CNN with PSO provides accuracy, precision, sensitivity, specificity, and F-measure values of 98.45%, 98.89%, 98.45%, 98.62%, and 98.85%, respectively. The hybrid method gives high precision with better convergence results when compared to other methods.
In deep learning, hyperparameters are parameters whose values control the learning process. If we want to capture hidden aspects of the CNN inputs, such as color, shape, and intensity, better performance is achieved by adjusting the hyperparameters. Model hyperparameters are set manually and are used to help estimate the model parameters; they are the part of machine learning that must be fine-tuned by hand.
Tuning the hyperparameters with PSO improves the accuracy, precision, sensitivity, specificity, and F-measure, which helps the CNN in feature extraction and classification.
We plan to employ more lung datasets in future research and apply hypotheses via fine-tuning and transfer learning to categorize different types of CT images, including melanoma, mammogram, and heart images. The model's generalization for diverse types of medical imaging will also be evaluated in the continued study.