1. Introduction
Cancer is a disease in which cells grow without control and spread through the body [1]. Cancer cells can migrate and invade nearby tissues, and they can develop into tumors that are either benign or malignant. Malignant tumors invade nearby tissues and spread throughout the body, where they can form additional tumors [2]. According to the World Health Organization (WHO), there were about 20 million new cancer cases and 9.7 million cancer deaths [3]. According to the IARC’s Global Cancer Observatory, lung, breast, and colorectal cancers were the three most prevalent forms of cancer worldwide [4]. Lung and colon cancer are among the deadliest cancers worldwide [5]. Detecting cancer early lets patients and their carers plan for the future and make informed treatment choices, and drugs and treatments work best when given early in the disease process. This underlines how important early and accurate cancer detection is for improving patient outcomes [6]. Many diagnostic models and decision support systems have been built in recent years to help clinicians locate and diagnose diseases more accurately. Artificial neural networks (ANNs) are particularly useful for medical data analysis and underpin many decision support systems, since they can make predictions while processing large amounts of data [7].
This paper presents a new method that uses histopathological images for the early diagnosis of lung and colon cancer by combining the strengths of CNNs, PSO, and ANNs. PSO is an effective way to find optimal solutions in complex, high-dimensional search spaces [8]. The algorithm is based on the behavior of a social swarm and is used here to tune the neural network’s design, configuration, and hyperparameters, which enhances the effectiveness of cancer diagnosis and the model’s ability to derive significant insights from histological images [9].
CNNs are among the newest approaches to detecting cancer. They can automatically extract features from training images, which is essential for building disease detection models [10]. The VGG19 model was employed in this paper to extract strong features from lung and colon cancer images. CNNs are good at feature extraction, relying less on hand-crafted feature engineering and more on the significant elements of histopathological images. We employ CNN architectures such as VGG16 and VGG19 to analyze medical images because they extract features hierarchically; adding PSO to these models further improves the classification of medical data [11]. Feature selection is a crucial part of medical imaging because histopathological images can contain many different features. We select the most essential features from the feature set produced by the VGG19 model using the SMA, which simplifies the data and makes the model more efficient [12]. The experimental findings demonstrate that the SMA outperforms other approaches in selecting accurate features in medical images.
The primary contributions of this study are the following:
- (1) It uses VGG19 as a CNN model to design a trainable feature extractor, which automatically extracts high-level features from the original images, making them distinguishable.
- (2) It uses SMA to select important features, lower data dimensionality, and improve model interpretability.
- (3) It uses PSO to optimize the ANN, designing a trainable high-precision output classifier, and conducts experiments to verify its effectiveness.
- (4) Combining the advantages of DL and ML, we propose a new hybrid CNN-PSO-ANN model to improve the accuracy of medical image classification.
This paper’s remaining sections are arranged as follows. Section 1 reviews work related to the PSO technique embedded in ANN models. Section 2 presents the techniques included in the suggested algorithm. Section 3 discusses the experimental findings and contrasts them with related work. Section 4 provides a discussion of the suggested model. Section 5 presents the conclusion and prospective research directions.
2. Materials and Methods
The proposed framework, described in this section, is partitioned into six stages. In the first stage, the selected histopathological medical datasets are input for analysis; in the second stage, the datasets undergo preprocessing; in the third stage, features are extracted using VGG19 to support classification; and, in the fourth stage, features are reduced and selected using a feature selection approach, SMA. In the fifth stage, the significant features are classified using an ANN classifier optimized by PSO, and, finally, the suggested model (CNN-PSO-ANN) is assessed using performance measures (accuracy, RMSE, and MAE).
Figure 1 demonstrates the overall workflow for the proposed system. The created model attempts to improve the ANN model’s ability to diagnose medical images by leveraging the PSO optimizer’s behavior to discover the optimal ANN parameters.
2.1. Medical Database
The medical dataset used in this work was taken from Kaggle [19]. The images were produced from a unique sample of HIPAA-compliant and validated sources. Andrew Borkowski and his colleagues at the James A. Haley Veterans’ Hospital, Tampa, Florida, compiled the dataset, which originally contained 750 images of lung tissue (250 benign lung tissue, 250 lung adenocarcinomas, and 250 lung squamous cell carcinomas) and 500 images of colon tissue (250 benign colon tissue and 250 colon adenocarcinomas). The dataset was augmented to 25,000 images using the Augmentor package. The resulting LC25000 collection comprises five classes, each with 5000 images: lung benign tissue, lung adenocarcinoma, lung squamous cell carcinoma, colon adenocarcinoma, and colon benign tissue. Every class is balanced, with the same number of histological images. All images are 768 × 768 pixels and saved in JPEG format.
The datasets were partitioned, with 70% used for system training and validation, while the remaining 30% was reserved as a test dataset to assess system performance.
Table 2 displays the distribution of the dataset images utilized for classification. A random sample of the database’s images is shown in Figure 2.
2.2. Database Preprocessing
Preprocessing is a critical step in attaining reliable results [20]. The acquired histopathological images undergo a series of preprocessing procedures. Image preprocessing entails resizing and normalizing images to enhance their quality and consistency. All images were scaled from their original 768 × 768 size to 224 × 224 to fit the VGG19 input.
Normalization is a fundamental step in training deep neural networks. It removes unwanted traits and redundant data while standardizing the input [21]. Pixel values are scaled to a predefined range of [0, 1] to improve model convergence during training, to decrease biases induced by varying illumination conditions, and to attain consistency in pixel intensity across the dataset. Correct preprocessing ensures the model can derive relevant properties from the data [22].
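The two preprocessing operations above can be sketched in a few lines. The paper’s pipeline was implemented in MATLAB, so the following Python/NumPy version is only an illustrative equivalent; the nearest-neighbour resizing is an assumption (VGG-style pipelines often use bilinear interpolation instead):

```python
import numpy as np

def preprocess(image, size=224):
    """Resize a square RGB image to size x size (nearest-neighbour sampling)
    and scale pixel intensities from [0, 255] to [0, 1]."""
    h, w, _ = image.shape
    rows = np.arange(size) * h // size          # source row for each output row
    cols = np.arange(size) * w // size          # source column for each output column
    resized = image[rows][:, cols]              # nearest-neighbour resize
    return resized.astype(np.float64) / 255.0   # normalize to [0, 1]

# Example: a synthetic 768 x 768 RGB image, as in the LC25000 dataset.
img = np.random.randint(0, 256, (768, 768, 3), dtype=np.uint8)
out = preprocess(img)
```

After this step, `out` has the (224, 224, 3) shape expected by the VGG19 input layer, with all values in [0, 1].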
2.3. VGG19 Model
VGG19 is a deep CNN with 19 weight layers: 16 convolutional layers and three fully connected (FC) layers. The structure follows a simple, repeatable pattern, which makes it easy to comprehend and apply. The primary components of the VGG19 architecture are as follows: convolutional layers use 3 × 3 filters with a stride of one and padding of one to maintain spatial resolution; the ReLU (Rectified Linear Unit) activation function is applied after every convolutional layer to introduce nonlinearity; and pooling layers employ max pooling with a 2 × 2 filter and a stride of 2 to reduce spatial dimensions. VGG19 is made up of five blocks. Blocks 1 and 2 each have two convolution layers and a max pooling layer, while Blocks 3, 4, and 5 each have four convolution layers and a max pooling layer. At the end of the network, there are three FC layers for classification, and the final Softmax layer outputs class probabilities [23,24]. The network is fed an RGB image of fixed size (224 × 224), giving an input of shape (224, 224, 3). The entire image is covered by 3 × 3 kernels with a stride of one pixel, and spatial padding preserves the image’s resolution. A 2 × 2 pixel window with a stride of two is used for max pooling, followed by ReLU, whose nonlinearity improves the model’s computational speed and classification accuracy.
VGG19 utilizes convolutional layers with ReLU activations and max pooling layers to extract features and reduce spatial dimensions. The final fully connected layers also use ReLU activations. The first fully connected layer computes 4096 × 25,088 learnable weights and a 4096 × 1 bias term. A dropout layer with a rate of 50% is placed between the fully connected layers. The last layer has 1000 × 4096 learnable weights. The feature map produced at the FC7 layer has dimensions 1 × 1 × 4096, while at the FC8 layer it is 1 × 1 × 1000 [25].
Feature concatenation merges two feature spaces into a single vector that emphasizes the maximum values. While it enhances accuracy, it also increases prediction and training times. Therefore, in this work, we used the FC8 layer (1 × 1 × 1000) to extract features, which reduces training time, rather than the FC7 layer (1 × 1 × 4096), which increases it. Moreover, increasing the number of features may reduce accuracy.
Figure 3 illustrates the VGG19 architecture for feature extraction.
The feature extraction from the model, depicted in Figure 3, is applied to all layers except the final classification layer. The resultant feature representation was transformed into a 1 × 1000 dimensional vector and, after feature reduction, input into the CNN-ANN and CNN-PSO-ANN classifiers.
Figure 4 depicts the detailed architecture of the first part of MATLAB 2023 implementation for the VGG19 model (layer numbers and the model parameters).
Figure 5 illustrates the part of the MATLAB implementation for the VGG19 model, showing the dimensions of the FC7 and FC8 layers.
2.4. Slime Mold Algorithm (SMA)
Feature selection constitutes a significant challenge in ML and pattern recognition. The chosen attributes substantially influence the system’s performance, precision, and efficacy. Feature selection is perhaps the most significant aspect of data mining and intelligent modeling, since irrelevant or partially relevant features can harm system performance. One of the most significant steps in creating intelligent learning systems is incorporating feature selection algorithms; employing a suitable feature set considerably reduces the computational cost of optimal system training when the input feature space is high-dimensional [26]. In this study, we employed the SMA for feature selection. The SMA is a novel population-based metaheuristic that mimics the swarming behavior of slime mold [27]. It is regarded as one of the best algorithms in the field of intelligent optimization, and the numerical experimental outcomes revealed that it performs well in key feature selection.
SMA was inspired by the cognitive activity of a fungus known as slime mold [28]. Molds exhibit cognitive behavior and can execute routing rapidly and accurately. This kind of mold is a unicellular organism that aggregates to form a multicellular reproductive structure. Slime molds lack brains, yet they behave intelligently and navigate complex pathways effectively, meticulously assessing the nutritional value of food and its associated risks. The initial positions of the molds (candidate feature subsets) are generated randomly using Equation (1) [26,29]:

X(i) = rand · (ub − lb) + lb,  (1)

where ub and lb denote the upper and lower boundaries of every solution or chosen feature, respectively. The fitness function value of every mold is calculated; the one with the best value is selected as the reference, and its location X_b is identified as the feature of interest. Slime molds utilize the airborne scent of their prey to navigate toward and locate it. Equation (2) describes this approach behavior:

X(t + 1) = X_b(t) + vb · (W · X_A(t) − X_B(t)), if r < p
X(t + 1) = vc · X(t), if r ≥ p,  (2)

where vb is a parameter within the interval [−a, a], vc diminishes linearly from 1 to 0, t indicates the present iteration, X_b signifies the position with the maximum odor concentration found so far, X is the slime mold’s location vector, X_A and X_B are two individuals chosen at random from the current population, and W is the slime mold’s weight. The parameter p is found from Equation (3):

p = tanh |S(i) − DF|,  (3)

where S(i), i ∈ {1, 2, ⋯, n}, denotes the fitness of X, DF stands for the maximum fitness achieved over all iterations, and a is provided by Equation (4):

a = arctanh(−(t/T) + 1),  (4)

where T is the maximum number of iterations. Additionally, Equations (5) and (6) are employed to determine the weight W:

W(SmellIndex(i)) = 1 + r · log((bF − S(i))/(bF − wF) + 1), if S(i) ranks in the swarm’s first half
W(SmellIndex(i)) = 1 − r · log((bF − S(i))/(bF − wF) + 1), otherwise,  (5)

SmellIndex = sort(S).  (6)

In this context, the condition in Equation (5) states that S(i) ranks in the swarm’s first half, r is a random number between zero and one, bF is the optimal fitness value attained in this iteration, wF is the worst fitness value recorded throughout the iterations, and SmellIndex is the sequence of fitness values, arranged ascendingly, with the minimum value being particularly important. Furthermore, the location of the slime mold is revised using Equation (7):

X* = rand · (UB − LB) + LB, if rand < z
X* = X_b(t) + vb · (W · X_A(t) − X_B(t)), if r < p
X* = vc · X(t), if r ≥ p,  (7)

where LB and UB denote the lower and upper limits of the feature range, while rand and r indicate random values within the interval [0, 1]. The value of z is delineated in the parameter-setting test. This procedure is reiterated until the termination criterion is met, after which the output X*, denoting the location of the optimum features, is obtained [26].
Feature selection reduces dimensionality by deleting unimportant and repeated elements. Based on this, SMA was utilized to choose the most beneficial and significant features from the 1000 features retrieved from the VGG19 model for cancer disease categorization. As a result, 56 key features were selected for the diagnosis of lung and colon cancer disease categories.
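The selection loop described above can be sketched as follows. This is a deliberately simplified Python/NumPy rendition of SMA over a binary feature mask; the population size, iteration count, the z threshold, and the toy fitness function are illustrative choices, not the paper’s settings:

```python
import numpy as np

def sma_select(fitness, dim, pop=20, iters=50, z=0.03, seed=0):
    """Simplified Slime Mould Algorithm for feature selection on [0,1]^dim.
    A feature is 'selected' when its position component exceeds 0.5.
    `fitness` is minimised. Sketch only, not the paper's exact implementation."""
    rng = np.random.default_rng(seed)
    X = rng.random((pop, dim))                          # Eq. (1): random initial positions
    fit = np.array([fitness(x > 0.5) for x in X])
    best_x, best_f = X[fit.argmin()].copy(), fit.min()  # DF: best fitness so far
    for t in range(1, iters + 1):
        order = np.argsort(fit)                         # sort ascending (best first)
        X, fit = X[order], fit[order]
        bF, wF = fit[0], fit[-1]
        r = rng.random((pop, dim))
        lg = np.log10((fit[:, None] - bF) / (wF - bF + 1e-12) + 1.0)
        W = np.empty((pop, dim))                        # Eqs. (5)-(6): odour weights
        half = pop // 2
        W[:half] = 1.0 + r[:half] * lg[:half]
        W[half:] = 1.0 - r[half:] * lg[half:]
        a = np.arctanh(max(1.0 - t / iters, 1e-12))     # Eq. (4)
        vc_range = 1.0 - t / iters                      # vc shrinks linearly from 1 to 0
        Xn = X.copy()
        for i in range(pop):
            if rng.random() < z:                        # Eq. (7): random restart branch
                Xn[i] = rng.random(dim)
                continue
            p = np.tanh(abs(fit[i] - best_f))           # Eq. (3)
            vb = rng.uniform(-a, a, dim)
            vc = rng.uniform(-vc_range, vc_range, dim)
            A, B = rng.integers(0, pop, 2)
            approach = X[0] + vb * (W[i] * X[A] - X[B])  # Eq. (2), r < p branch
            Xn[i] = np.where(rng.random(dim) < p, approach, vc * X[i])
        X = np.clip(Xn, 0.0, 1.0)
        fit = np.array([fitness(x > 0.5) for x in X])
        if fit.min() < best_f:
            best_f, best_x = fit.min(), X[fit.argmin()].copy()
    return best_x > 0.5, best_f

# Toy demonstration: the ideal mask keeps the first three of eight features,
# and fitness counts disagreements with that ideal.
def toy_fitness(mask):
    ideal = np.zeros(8, dtype=bool)
    ideal[:3] = True
    return int((mask != ideal).sum())

mask, best_f = sma_select(toy_fitness, dim=8)
```

In the paper the fitness would instead score a classifier trained on the masked 1000-dimensional VGG19 feature vectors.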
Running a model without feature selection and cross-validation typically results in diminished accuracy, as the model may not be optimized, potentially leading to overfitting or underfitting. The SMA feature selection method was applied only to the training data to identify the optimal subset of features. We used holdout cross-validation, which helps evaluate the robustness of the model by splitting the dataset into a ‘train’ and a ‘validation’ set. The training set is used for model training, whereas the validation set assesses the model’s performance on previously unobserved data. Holdout validation is simple, quick to implement, and computationally efficient; in practice, it does not take long to train, which helps to create a time-aware model, and it is suitable for large datasets.
To correctly integrate feature selection with a single holdout cross-validation, the feature selection process itself must be performed exclusively on the training data, and the independent holdout test set should be used only for the final, unbiased performance evaluation. A multi-step holdout approach provides a robust framework for this integration.
Algorithm 1 describes the steps of the proper integration of feature selection and holdout cross-validation to ensure the model generalizes well to unseen data and avoids data leakage (where information from the test set influences the training phase):
| Algorithm 1: Integrated Feature Selection and Holdout Cross-Validation |
Step 1: Initial Data Split. Divide the full original dataset into two main subsets: a Training Set (70%), which is used for model training and feature selection, and a Testing Set (30%), a completely unseen dataset used only for a final, unbiased performance estimate of the chosen model and feature set.
Step 2: Iterative Feature Selection and Model Training. The training data is further split into training and validation sets (we select a validation ratio of 20%). Iterate through potential feature selection strategies or subsets on the training set: train the model on the training data, evaluate its performance on the validation set, and select the feature subset that yields the best performance on the validation set.
Step 3: Final Model Training. Retrain the model on the full training set using the selected feature subset.
Step 4: Final Model Evaluation. Evaluate the final model once on the held-out testing set. |
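Algorithm 1 can be expressed as a compact pipeline. The Python sketch below uses `select_features`, `train_model`, and `score` as hypothetical placeholder callables (not the paper’s routines) to show where each split happens so that the test set never influences feature selection:

```python
import numpy as np

def holdout_pipeline(X, y, select_features, train_model, score, seed=0):
    """Steps 1-4 of Algorithm 1. The three callables are placeholders."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(0.30 * len(X))                       # Step 1: 70/30 outer split
    test, train = idx[:n_test], idx[n_test:]
    n_val = int(0.20 * len(train))                    # Step 2: 20% inner validation split
    val, sub = train[:n_val], train[n_val:]
    mask = select_features(X[sub], y[sub], X[val], y[val])  # chosen on training data only
    model = train_model(X[train][:, mask], y[train])  # Step 3: refit on all training data
    return score(model, X[test][:, mask], y[test])    # Step 4: one unbiased test estimate

# Toy demonstration with a nearest-class-mean classifier on separable data.
def select_all(Xf, yf, Xv, yv):
    return np.ones(Xf.shape[1], dtype=bool)           # placeholder: keep every feature

def train_mean(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def score_acc(model, X, y):
    classes = sorted(model)
    dists = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return float((np.array(classes)[dists.argmin(axis=0)] == y).mean())

rng = np.random.default_rng(2)
y_demo = rng.integers(0, 2, 200)
X_demo = rng.normal(size=(200, 4)) + 3.0 * y_demo[:, None]
acc = holdout_pipeline(X_demo, y_demo, select_all, train_mean, score_acc)
```

The key design point is that `select_features` only ever sees training and validation rows; the `test` indices appear for the first time in the final `score` call.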
The chosen features were fed into the CNN-ANN and CNN-PSO-ANN classifiers utilized in this work to diagnose cancer in the medical databases (lung and colon cancer). The SMA trials on the datasets revealed that 100 iterations, as illustrated in Figure 6, were sufficient for the algorithm to find the ideal features, with an accuracy of up to 97%.
2.5. Artificial Neural Network (ANN)
An ANN is a data-driven approach that mathematically replicates natural neuronal networks. Its neurons are typically arranged and connected in three kinds of layers: the input layer, the output layer, and one or more intermediate (hidden) layers. Additional elements of the ANN architecture, such as activation functions, learning rules, and connection patterns, must also be designed [30]. Layers are interconnected by nodes, facilitating the transmission of information signals from one layer to the next. The learning process comprises two phases: forward propagation and backward propagation. The first phase involves receiving external signals at the input layer, which processes them and transmits them to the hidden layer. In the hidden layer, the incoming data is processed by bias, summation, and activation functions. No definitive guideline exists for establishing the number of hidden layers; the trial-and-error approach typically employed until the desired output is achieved is used in this work. The second phase occurs when the output layer fails to produce the desired result, which triggers backpropagation [31]. During backpropagation (BP), the expected error values in each hidden layer are calculated backwards, which changes the weights of the previous layers and leads to a consistent reduction in total error. This process persists until the error is minimized to an acceptable threshold [16].
Figure 7 depicts the architecture of the ANN model utilized to diagnose lung and colon cancer using medical information. In Figure 7, the inputs denote the features (F1, F2, …, Fn) extracted from the VGG19 model and reduced by the SMA, where n signifies the number of features (56 features for the data characteristics). The network comprises five hidden layers with a single output. The variable w represents the network weights, b denotes the bias, and y indicates the network output.
It is necessary to make careful adjustments to ANN hyperparameters, which include the number of hidden layers and neurons. To acquire values that are suitable for these hyperparameters, techniques of cross-validation are applied.
In the ANN, hidden and output neurons use sigmoid-type activation functions (tansig). In most cases, the activation function is identical for each hidden neuron; the choice is made based on the model’s purpose or prediction type. An activation function adds nonlinearity to a neural network, and the sigmoid is one of the most common activation functions used. The ANN’s outputs are determined by the weights, bias settings, and inputs. BP is widely utilized in training [16], and PSO algorithms are an effective approach for training optimization.
2.6. Particle Swarm Optimization (PSO)
PSO is a population-based optimization methodology that, by imitating swarm social behavior, effectively explores the parameter space for optimal configurations. PSO is motivated by ecological and, especially, social phenomena found in nature [32]. It simulates a swarm of particles searching the search space for a solution. To discover better solutions, the particles follow areas of high fitness as they move through the search space; each particle’s migration is influenced by both its own best position so far and the swarm’s overall best position. The algorithm updates every particle’s velocity based on social, cognitive, and inertia components. PSO has been applied to a wide variety of optimization problems due to its ease of use and reduced processing requirements compared to conventional direct search techniques, and a number of changes and enhancements in recent years have improved its performance and expanded its range of applications [33].
One of the most appealing aspects of PSO is how easily it can be adapted to handle constraints. The approach is also noted for its rapid convergence and for not requiring the objective function’s gradient. PSO has various advantages, including the fact that it needs only a limited number of parameters and is straightforward to implement, making it suited for a wide range of optimization domains [34]. According to PSO’s central principle, every particle is only aware of its current velocity, its best configuration so far (pBest), and the swarm’s global best (gBest). On each cycle, particles adjust their velocities to move closer to their pBest and gBest. Every particle’s velocity v is updated by Equation (8) [32]:

v_ij(t + 1) = w · v_ij(t) + c1 · r1 · (pBest_ij − x_ij(t)) + c2 · r2 · (gBest_j − x_ij(t)),  (8)

where v_ij is the j-th dimension velocity of the i-th particle, x_ij is its present position, and w is a momentum (inertia) constant that regulates how much the velocity at the prior time step influences the velocity at the present step. c1 and c2 are predefined constants, whereas r1 and r2 are random numbers from [0, 1]. Altering the values of c1 and c2 changes the algorithm’s exploration and exploitation capabilities. Lastly, the position of the i-th particle in the j-th dimension is updated by Equation (9):

x_ij(t + 1) = x_ij(t) + v_ij(t + 1).  (9)
The primary steps in the PSO algorithm are as follows:
- 1. Set the particle population to initial values.
- 2. Assess the fitness of the population.
- 3. Record the best solution.
- 4. Repeat:
- (a) Update the position and velocity of every particle based on Equations (8) and (9).
- (b) Calculate each particle’s fitness value within the population.
- (c) Update the best solution.
- 5. Continue until a termination criterion is satisfied.
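The five steps above translate directly into code. The following Python/NumPy sketch minimises a simple sphere function; the swarm size, inertia weight w = 0.7, and c1 = c2 = 1.5 are illustrative defaults, not the paper’s tuned values:

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO: Eq. (8) velocity update and Eq. (9) position update."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))      # step 1: random initial positions
    v = np.zeros((n_particles, dim))
    pbest = x.copy()                                # steps 2-3: evaluate, record bests
    pbest_f = np.array([f(p) for p in x])
    g = pbest_f.argmin()
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    for _ in range(iters):                          # step 4: repeat
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # Eq. (8)
        x = x + v                                                   # Eq. (9)
        fx = np.array([f(p) for p in x])
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest_f.argmin()
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    return gbest, gbest_f                           # step 5: best solution found

# Usage: minimise the 5-dimensional sphere function sum(x^2), optimum at 0.
best, best_f = pso(lambda p: float(np.sum(p ** 2)), dim=5)
```

In the paper, `f` would instead be the cross-validated error of an ANN built from the particle’s parameters.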
PSO works by maintaining a set of candidate solutions in the search space. In every iteration, the objective function being optimized assesses each candidate solution to determine its fitness. All candidate solutions can be pictured as particles “flying” around the fitness landscape to find the maximum or minimum of the objective function [35]. A collection of potential solutions is first chosen at random from the search space, which contains all possible solutions. Since the PSO method lacks knowledge of the underlying objective function, it cannot determine whether each candidate solution is close to or distant from a local or global optimum [36]. The algorithm evaluates its candidate solutions using the objective function and acts on the resulting fitness values. Every particle maintains its position, velocity, assessed fitness, and proposed solution. The individual’s best position is the candidate solution that achieved its best fitness, and the particle also retains that best fitness: the highest fitness value attained thus far in the algorithm’s execution [37].
Finally, the PSO algorithm retains the best global fitness, which is the best fitness value attained by any particle, and the corresponding best global candidate solution: the position that achieved this fitness. In this study, PSO is used to tune the hyperparameters of ANN models for cancer classification, which improves the overall accuracy of the diagnostic system.
2.7. ANN Optimized by PSO (PSO-ANN)
The PSO’s global best position provides the parameters of the ANN model, thereby enhancing its efficiency on a dataset of histopathological images. The PSO technique is utilized to optimize the neural network’s design and hyperparameters. Important aspects that PSO optimizes include learning rates, activation functions, the number of hidden layers, and the number of neurons in every layer. This automation streamlines the optimization process, reducing the need for manual adjustments.
A model of neural networks is generated using hyperparameters and optimized architecture generated by the PSO method. The neural network is trained using medical lung and colon data that has been preprocessed and feature extracted. During training, the model’s weights and biases are changed to allow it to learn cancer-related patterns and traits.
Figure 9 shows the flowchart for the hybrid PSO-ANN model.
As illustrated in Figure 8, the initial stage involves inputting the data extracted from the VGG19 model and reduced using the SMA approach. The initial values are then created to establish the ANN model, after which the training parameters and options are specified. Testing and training of the ANN model follow.
After determining the best solution (the ANN model’s optimal parameters) using the PSO optimizer, we train and evaluate the model to ensure that the proposed system (CNN-PSO-ANN) outperforms the CNN-ANN model in disease prediction. In this study, we examined how the PSO algorithm initializes its parameters. It is crucial to realize that these parameters are crucial to the construction of the model. The tuning parameter range of the PSO-ANN is displayed in Table 3.
A number of trial-and-error iterations were employed to determine the ideal parameter values to enhance the model.
Table 4 shows the training and testing parameters for the ANN model.
The essential steps in the PSO-based parameter optimization method are outlined as follows:
Step 1: Initialize the PSO settings with a population of random particles and velocities.
Step 2: Train the ANN model and evaluate its fitness function. The current particle’s c and r properties are used to train the ANN model. The fitness function is tested using the 10-fold cross-validation method: ten mutually exclusive subsets of approximately equal size are randomly chosen from the training dataset; nine of these subsets are used for training, while the tenth is used for testing. Each subset is tested exactly once during the ten iterations of this procedure.
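The 10-fold evaluation inside Step 2 can be sketched as follows (Python; `evaluate` is a hypothetical callable standing in for training the ANN with the particle’s parameters and returning its error on the held-out fold):

```python
import numpy as np

def kfold_indices(n, k=10, seed=0):
    """Split n sample indices into k mutually exclusive folds of
    approximately equal size."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

def cross_val_fitness(folds, evaluate):
    """Average validation error over k rounds: each fold is the test set
    once, and the remaining folds form the training set."""
    errors = []
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        errors.append(evaluate(train, test))
    return float(np.mean(errors))

# Usage: 25 samples, 10 folds; the dummy `evaluate` just reports fold size / n.
folds = kfold_indices(25, k=10)
avg = cross_val_fitness(folds, lambda train, test: len(test) / 25)
```

Each particle’s fitness in the PSO loop would be one such averaged validation error.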
Equations (10) and (11) define the fitness function as the validation error of the cross-validation procedure on the training dataset; solutions with higher accuracy have lower fitness values [14,38]:

Accuracy_validation = N_t / (N_t + N_f),  (10)

Fitness = 1 − Accuracy_validation,  (11)

where N_t and N_f refer to the number of true and false classifications, respectively.
Step 3: Use the fitness function values to update the particles’ global and personal best positions.
Step 4: Repeat steps 2–3 until the termination criteria are met.
The proposed model has 56 neurons in the input layer, each representing an attribute from the lung and colon cancer datasets. There are five hidden layers, and the output layer corresponds to the class labels. The weights of the neurons are calculated using the PSO technique, and the optimal weights are used to train the NN.
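To make the weight-optimisation step concrete, the sketch below uses PSO to train a tiny 2-4-1 network on XOR in Python/NumPy. The network size, swarm settings, and task are illustrative stand-ins for the paper’s 56-input, five-hidden-layer ANN:

```python
import numpy as np

# Toy task: XOR, with a 2-4-1 network whose 17 weights form one PSO particle.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([0.0, 1.0, 1.0, 0.0])

def forward(theta, X):
    """Unpack a flat particle into W1 (2x4), b1 (4), W2 (4), b2 (scalar)."""
    W1, b1 = theta[:8].reshape(2, 4), theta[8:12]
    W2, b2 = theta[12:16], theta[16]
    h = np.tanh(X @ W1 + b1)                         # tansig hidden activation
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))      # sigmoid output

def loss(theta):
    return float(np.mean((forward(theta, X) - y) ** 2))

rng = np.random.default_rng(1)
n, dim = 40, 17
pos = rng.uniform(-2, 2, (n, dim))
vel = np.zeros((n, dim))
pbest, pbest_f = pos.copy(), np.array([loss(p) for p in pos])
g = pbest_f.argmin()
gbest, gbest_f = pbest[g].copy(), pbest_f[g]
for _ in range(300):
    r1, r2 = rng.random((2, n, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    f = np.array([loss(p) for p in pos])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    g = pbest_f.argmin()
    if pbest_f[g] < gbest_f:
        gbest, gbest_f = pbest[g].copy(), pbest_f[g]
pred = (forward(gbest, X) > 0.5).astype(float)
```

No gradients are computed anywhere: PSO treats the network purely as a black-box loss, which is exactly why it can also tune non-differentiable hyperparameters such as layer counts.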
2.8. Model Evaluation
Accuracy, RMSE, and MAE are statistical measures used to evaluate the forecasting capabilities and performance of the CNN-PSO-ANN model, in addition to the metrics obtained from the confusion matrix. We discuss these metrics briefly [39,40].
Accuracy: This indicator measures the proportion of correctly classified cases.
RMSE: This measures the average magnitude of the error between the predicted and actual values; its range is (0, +∞), and the lower the RMSE value, the more accurate the prediction model. It is computed using Equation (12) [41]:

RMSE = sqrt( (1/n) · Σ_{i=1}^{n} (y_i − ŷ_i)² ),  (12)

MAE: This measures the average absolute difference between the predicted and actual values; it is commonly referred to as the mean absolute deviation (MAD). The MAE range is (0, +∞), and a lower MAE value means that the prediction model is more accurate. It is computed using Equation (13) [41]:

MAE = (1/n) · Σ_{i=1}^{n} |y_i − ŷ_i|,  (13)

where y_i represents the actual value, ŷ_i the predicted value, and n the number of collected data points.
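Both error measures are one-liners in practice; a Python/NumPy version:

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error, Eq. (12)."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def mae(actual, predicted):
    """Mean absolute error, Eq. (13)."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(a - p)))

# Usage on a small example with one error of magnitude 2.
e_rmse = rmse([1, 2, 3], [1, 2, 5])   # sqrt(4/3)
e_mae = mae([1, 2, 3], [1, 2, 5])     # 2/3
```

Note that RMSE is always at least as large as MAE, since squaring penalises large deviations more heavily.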
The model for histopathological image diagnosis was also evaluated with the measures of accuracy, precision, recall, F1-score, and AUC, indicated by Equations (14)–(18). The equations use the variables TP and TN, representing the counts of correctly identified positive and negative samples, and FP and FN, representing the counts of misclassified samples [42]:

Accuracy = (TP + TN) / (TP + TN + FP + FN),  (14)

Precision = TP / (TP + FP),  (15)

Recall = TP / (TP + FN),  (16)

F1-score = 2 · (Precision · Recall) / (Precision + Recall),  (17)

AUC = area under the ROC curve (TPR plotted against FPR).  (18)

All variables are derived from the confusion matrix, which is created to evaluate the model’s performance.
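The count-based metrics in Equations (14)–(17) reduce to a few arithmetic expressions (plain Python sketch; AUC, Equation (18), needs the full score distribution rather than the four counts):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Usage with illustrative counts (not the paper's confusion matrix).
acc, prec, rec, f1 = classification_metrics(tp=90, tn=85, fp=15, fn=10)
```

For a multi-class problem like LC25000, these quantities are computed per class by treating that class as positive and all others as negative.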
3. Results
This section presents the experimental and performance analysis results for the proposed model, as well as a comparison of the findings. All algorithms utilized on the chosen histopathological medical datasets in this study were executed utilizing the MATLAB 2023 programming language on a laptop equipped with an Intel(R) Core (TM) i7-10510U CPU and 16 GB of RAM.
This section discusses the efficacy of the suggested CNN-PSO-ANN-based strategy for diagnosing and forecasting lung and colon cancer. The outcomes of the proposed model are validated using the cross-validation method.
The significant metrics are utilized to assess the performance of the CNN-PSO-ANN diagnostic model. The outcomes of the CNN-ANN model and the suggested model are compared.
Table 5 shows the outcomes of the evaluation parameters (accuracy, RMSE, and MAE) acquired using the suggested CNN-PSO-ANN model architecture in conjunction with the CNN-ANN model, implemented on the medical datasets chosen for this study’s performance assessment.
As demonstrated in Table 5, the CNN-PSO-ANN hybrid approach outperforms CNN-ANN in terms of accuracy, achieving a remarkable 98.8% versus 94.1% for CNN-ANN. This indicates the hybrid model’s capacity to accurately predict diseases, with a large proportion of instances correctly classified.
Furthermore, a confusion matrix is created to assess the performance of the CNN-PSO-ANN technique. The measures obtained from the confusion matrix are utilized to assess the effectiveness of the suggested diagnostic model.
Figure 10 depicts the confusion matrices for the suggested diagnostic approach for the early identification of lung and colon cancer disorders.
Table 6 shows the accuracy, precision, recall, F1-score, and AUC value obtained for each class for LC25000 dataset classification using the models (CNN-PSO-ANN and CNN-ANN).
In addition, we use ROC curves with AUC values to evaluate the performance of the suggested model on the LC25000 dataset.
The receiver operating characteristic (ROC) curve is a standard tool for evaluating how well the ANN performs on the LC25000 dataset for cancer diagnosis. It plots the true positive rate (TPR), the percentage of positive observations correctly predicted to be positive, against the false positive rate (FPR), the percentage of negative observations incorrectly predicted to be positive. The area under the curve (AUC) ranges from zero to one; predictive value increases as it nears one and diminishes as it nears zero. The ROC curves and AUC values for each class in the LC25000 dataset are shown in Figures 11 and 12.
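For a single class, the AUC can be computed directly from predicted scores by sweeping a threshold over the sorted scores and applying the trapezoidal rule (plain Python sketch, assuming binary labels with at least one positive and one negative, and no tied scores):

```python
def roc_auc(scores, labels):
    """Area under the ROC curve: sweep a threshold down the sorted scores
    and accumulate trapezoids between successive (FPR, TPR) points."""
    pairs = sorted(zip(scores, labels), reverse=True)   # highest score first
    P = sum(labels)                                     # number of positives
    N = len(labels) - P                                 # number of negatives
    tp = fp = 0
    auc = 0.0
    prev_fpr = prev_tpr = 0.0
    for _, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        fpr, tpr = fp / N, tp / P
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0   # trapezoid area
        prev_fpr, prev_tpr = fpr, tpr
    return auc

# Usage: perfectly ranked scores give AUC = 1; one inversion lowers it.
perfect = roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
partial = roc_auc([0.8, 0.6, 0.4, 0.2], [1, 0, 1, 0])
```

Equivalently, the AUC is the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one.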
For further evaluation, the suggested model (CNN-PSO-ANN) was trained by dividing the dataset using 5k-fold validation.
Table 7 shows the evaluation metrics (accuracy, RMSE, and MAE) obtained for the proposed CNN-PSO-ANN model on the histopathological medical datasets.
Figure 13 summarizes the confusion matrix obtained from training the CNN-PSO-ANN model on the LC25000 dataset using 5-fold cross-validation.
Table 8 shows the CNN-PSO-ANN performance evaluation metrics (accuracy, recall, precision, F1-score, and AUC) obtained by applying 5-fold cross-validation on the LC25000 dataset.
The ROC curves and AUC values for the CNN-PSO-ANN model under 5-fold cross-validation on the LC25000 dataset are shown in
Figure 14.
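The 5-fold protocol used above follows the usual pattern: each fold serves once as the held-out test set while the remaining folds train the model. A minimal sketch, with a deliberately trivial majority-class "model" standing in for the actual CNN-PSO-ANN:

```python
import numpy as np

def kfold_split(n, k=5, seed=0):
    """Shuffle sample indices and split them into k roughly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

def cross_validate(X, y, train_fn, eval_fn, k=5, seed=0):
    """Each fold is the test set once; the other folds form the training set."""
    folds = kfold_split(len(y), k, seed)
    scores = []
    for i in range(k):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = train_fn(X[train_idx], y[train_idx])
        scores.append(eval_fn(model, X[folds[i]], y[folds[i]]))
    return float(np.mean(scores))

# Toy usage: a majority-class "model" on a balanced dummy dataset
X = np.zeros((20, 1))
y = np.array([0] * 10 + [1] * 10)
score = cross_validate(X, y,
                       train_fn=lambda X, y: np.bincount(y).argmax(),
                       eval_fn=lambda m, X, y: float(np.mean(y == m)))
```

Averaging the per-fold scores, as in the last step, yields the cross-validated figures reported in Tables 7 and 8.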
The CNN-ANN and CNN-PSO-ANN models for LC25000 medical dataset diagnosis are developed in MATLAB. They take as input the features extracted by the VGG19 model and reduced by the SMA feature selection technique, and produce the disease classes as output.
Figure 15 depicts the dataset-specific ANN model structure.
The ANN receives the essential features, is trained, adjusts its weights during validation, and is assessed on the test sets, achieving favorable outcomes in the early detection and differentiation of lung and colon cancer.
According to the experiments, 20 training iterations of the PSO method were sufficient to optimize the ANN and reach the highest accuracy of 98.8%, as shown in
Figure 16.
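The PSO loop behind this optimization can be sketched in a few lines. This is an illustrative minimizer on a toy sphere objective, not the paper's actual ANN-parameter search; the inertia and acceleration coefficients (w, c1, c2) and the search bounds are assumed standard values.

```python
import numpy as np

def pso(fitness, dim, n_particles=10, iters=20, seed=0,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    """Minimal PSO minimizer: each particle tracks its personal best and
    is pulled toward it and toward the swarm-wide global best."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_f = np.array([fitness(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()          # global best position
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([fitness(p) for p in x])
        better = f < pbest_f                    # update personal bests
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[pbest_f.argmin()].copy()      # update global best
    return g, float(pbest_f.min())

# Toy usage: minimize the sphere function in 3 dimensions with 20 iterations,
# mirroring the iteration budget reported in the text
best, best_f = pso(lambda p: np.sum(p ** 2), dim=3)
```

In the paper's setting, the fitness would instead score an ANN configuration on validation data, with 20 such iterations proving sufficient.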
Table 9 displays the outcomes of the comparison between the suggested model and other related works that used the same dataset to detect diseases.
From
Table 9, we observe that the proposed CNN-PSO-ANN model performed better than the other models in classifying the selected medical database when compared with recent research using the same LC25000 dataset. For example, in terms of accuracy, CNN-PSO-ANN achieves a markedly higher value (98.8%) than the comparative algorithms, confirming the strong performance of the proposed model.
This result establishes the efficacy of the hybrid CNN-PSO-ANN model, which combines ML and DL, highlighting its robustness and generalizability as a powerful model for multi-cancer diagnostic support systems, paving the way for more accurate and reliable computer-aided diagnosis.
4. Discussion
In this paper, an optimized model is developed for the diagnosis of the histopathological medical dataset LC25000, which contains images of lung and colon cancer. The data were divided into two parts: 70% for training the model and 30% for testing its accuracy and efficiency.
The proposed CNN-PSO-ANN integrates the VGG19 model for feature extraction, the SMA approach for feature selection, the PSO algorithm as an optimizer, and the ANN classifier. The SMA approach was employed to select features from those extracted by the pre-trained VGG19 model, and it yielded favorable outcomes, finding near-optimal feature subsets for the datasets with an accuracy of up to 97% within 100 iterations.
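The wrapper-style feature selection described above can be sketched as follows. SMA (the slime mould algorithm) is a swarm metaheuristic; for brevity this sketch substitutes a plain random-search over binary feature masks with a simple correlation-based score, so both the search strategy and the scoring function are simplified stand-ins, not the authors' method.

```python
import numpy as np

def score_subset(Xs, y):
    # Mean absolute correlation of each selected feature with the label
    # (a simplified stand-in for the classifier-accuracy fitness SMA uses)
    return float(np.mean([abs(np.corrcoef(Xs[:, j], y)[0, 1])
                          for j in range(Xs.shape[1])]))

def wrapper_select(X, y, score_fn, iters=100, seed=0):
    """Random-search binary wrapper: sample feature masks and keep the
    best-scoring subset (standing in for SMA's swarm-based search)."""
    rng = np.random.default_rng(seed)
    best_mask = np.ones(X.shape[1], dtype=bool)   # start from all features
    best_score = score_fn(X, y)
    for _ in range(iters):
        mask = rng.random(X.shape[1]) < 0.5
        if not mask.any():
            continue
        s = score_fn(X[:, mask], y)
        if s > best_score:
            best_mask, best_score = mask, s
    return best_mask, best_score

# Toy data: two informative features tied to the label, three noise features
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)
informative = y[:, None] + 0.1 * rng.normal(size=(200, 2))
noise = rng.normal(size=(200, 3))
X = np.hstack([informative, noise])
mask, score = wrapper_select(X, y, score_subset)
```

On this toy data the search reliably keeps at least one informative feature, since any mask that beats the full-set score must include one.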
The VGG19 model is a deep learning architecture that yields strong results for feature extraction from medical datasets. The PSO method proves effective as an optimizer for tuning the ANN parameters, and the ANN was used to classify the dataset. The efficacy of the proposed CNN-PSO-ANN was assessed using the statistical performance criteria accuracy, RMSE, MAE, precision, recall, F1-score, and AUC.
Additionally, a comparison between the proposed CNN-PSO-ANN and the CNN-ANN model revealed that CNN-PSO-ANN outperforms CNN-ANN.
The performance measures for the proposed CNN-PSO-ANN reached accuracy (98.8), RMSE (0.02939), MAE (0.0259), precision (98.5), recall (98.9), F1-score (98.7), and AUC (0.999), while the performance measures for the CNN-ANN reached accuracy (94.1), RMSE (0.1466), MAE (0.1466), precision (94.1), recall (94.7), F1-score (94.3), and AUC (0.990).
The experimental outcomes indicated that the CNN-PSO-ANN model demonstrates a significant enhancement in accuracy relative to the CNN-ANN model.
For further evaluation, the proposed CNN-PSO-ANN model was trained using 5-fold cross-validation, during which it achieved good results: 98.01, 0.0784, and 0.0123 for accuracy, RMSE, and MAE, respectively, and 97.9, 98.5, 97.0, and 0.997 for recall, precision, F1-score, and AUC, respectively.
Although the proposed model performs effectively on the histopathological image datasets used in this study for diagnostic evaluation, it is not without drawbacks. While it achieves good classification accuracy on the chosen medical datasets, its performance on other datasets might be lower, since factors such as labeling and noise cause scanned images from different datasets to vary. This problem can be mitigated by training the system on scanned images acquired at different locations and times. In the future, we intend to work with a larger number of medical image datasets to demonstrate the usefulness of the proposed classifier, experiment with novel approaches for extracting features from medical images, and explore hybrid algorithms for training model parameters to achieve better outcomes and diagnose more diseases.
5. Conclusions
This paper presents a novel hybrid CNN-PSO-ANN model integrating DL and ML to address the problem of medical image classification. This study examines the effectiveness of an ANN integrated with the PSO technique for medical data diagnosis and evaluates the performance of the VGG19 model in feature extraction. The proposed CNN-PSO-ANN was compared with the regular CNN-ANN model and yielded superior results: the CNN-PSO-ANN model achieved an accuracy of 98.8%, while the CNN-ANN model achieved an accuracy of 94.1%. The tuning of parameters in a CNN-ANN significantly influences the attainment of enhanced performance. The neural network parameters (five neurons in the hidden layer, 100 training cycles, a learning rate of 0.1, and tansig as the transfer function) were adjusted using a PSO optimizer.
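The tuned network's forward pass can be sketched with these settings. This is a NumPy illustration only (the paper's network is built in MATLAB); the softmax output over five classes assumes the five LC25000 categories, and the input width and random weights are placeholders.

```python
import numpy as np

def tansig(x):
    # MATLAB's tansig transfer function, mathematically identical to tanh
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def forward(x, W1, b1, W2, b2):
    """Single-hidden-layer ANN: tansig hidden units, softmax output."""
    h = tansig(W1 @ x + b1)
    z = W2 @ h + b2
    e = np.exp(z - z.max())        # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 8, 5, 5    # 5 hidden neurons per the PSO tuning
W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)
probs = forward(rng.normal(size=n_in), W1, b1, W2, b2)
```

Training would adjust W1, b1, W2, and b2 over the 100 cycles at a learning rate of 0.1, as stated above; only the architecture is sketched here.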
Thus, the following accomplishments may be mentioned:
Analyzing a large medical image database to diagnose two different types of disorders, broadening the scope of the suggested system.
Extracting a comprehensive collection of automated deep features from medical datasets (lung and colon images) and utilizing the pre-trained CNN model VGG19 to help achieve satisfactory classification results.
The SMA approach is used as a feature selection method to select the best features and increase classification accuracy.
A hybrid model (CNN-PSO-ANN) was developed by combining a CNN (VGG19), PSO optimizer, and ANN classifier for disease diagnosis.
The hybrid CNN-PSO-ANN model improved prediction efficiency and accuracy, outperforming the CNN-ANN model. The findings of this study make a substantial contribution to the existing literature on illness diagnosis and the value of early disease prediction.
Statistical performance measures, such as MSE, RMSE, and MAE, as well as accuracy and the confusion matrix, were used to investigate and evaluate the proposed CNN-PSO-ANN's performance.
The proposed approach has notable limitations, even though it performs well on medical datasets for diagnostic assessment. Because labels, noise, and other factors cause the images in different datasets to vary, the suggested method may perform poorly on other medical datasets, despite its high classification accuracy (98.8%) on the selected ones.
To address this problem, scanned images gathered at different times and locations must be used to train the suggested system. In the future, we want to work on a larger set of medical imaging datasets to demonstrate the value of the proposed classifier, develop new approaches for feature extraction from medical images, apply different hybrid algorithms to train the model parameters for improved outcomes, and utilize the suggested methods to detect additional diseases.
Finally, our thorough examination of the proposed model utilizing the VGG19 model, ANN, and PSO has provided substantial insights into the usefulness of DL and ML in image classification. The statistical results reached 98.8, 0.02939, and 0.0259 for accuracy, RMSE, and MAE, respectively, and 98.5, 98.9, 98.7, and 0.999 for precision, recall, F1-score, and AUC, respectively.
The proposed model indicates potential utility in clinical applications, making it a useful tool for supporting radiologists in the speedy and precise identification of various cancer diseases.