COVID-19 Case Recognition from Chest CT Images by Deep Learning, Entropy-Controlled Firefly Optimization, and Parallel Feature Fusion

In healthcare, a multitude of data is collected from medical sensors and devices, such as X-ray machines, magnetic resonance imaging (MRI), computed tomography (CT), and so on, that can be analyzed by artificial intelligence methods for the early diagnosis of diseases. Recently, the outbreak of the COVID-19 disease caused many deaths. Computer vision researchers support medical doctors by applying deep learning techniques to medical images to diagnose COVID-19 patients, and various methods have been proposed for COVID-19 case classification. In this work, a new automated technique is proposed using parallel fusion and optimization of deep learning models. The proposed technique starts with contrast enhancement using a combination of top-hat and Wiener filters. Two pre-trained deep learning models (AlexNet and VGG16) are employed and fine-tuned according to the target classes (COVID-19 and healthy). Features are extracted and fused using a parallel fusion approach, named parallel positive correlation. Optimal features are selected using the entropy-controlled firefly optimization method, and the selected features are classified using machine learning classifiers such as the multiclass support vector machine (MC-SVM). Experiments carried out on the Radiopaedia database achieved an accuracy of 98%. Moreover, a detailed analysis shows the improved performance of the proposed scheme.


Introduction
At the end of 2019, a new illness originated from a coronavirus appeared in the Hubei province of China and rapidly spread worldwide in 2020 [1]. This disease was named COVID-19 by the World Health Organization (WHO) in February 2020 [2]. COVID-19 disease is caused by the virus named SARS-CoV-2 [3]. This disease may cause organ failure and respiratory difficulties in severe cases [4]. In addition to the medical impact, the disease had a significant effect on the global economy and the environment [5].
The typical reverse transcription polymerase chain reaction (RT-PCR) test is a tedious procedure for recognizing COVID-19 [6]. Artificial intelligence (AI) techniques have been deployed to combat the epidemic caused by COVID-19 and its negative consequences [7], specifically for medical diagnostics [8]. Using deep learning (DL), a modern form of machine learning, the disease can be detected and identified at early stages by a model-based classifier. One such model used 17 convolutional layers with a filtering process on each layer and achieved 98.08% accuracy for two classes and 87.02% for multi-class classification. In [39], researchers combined a CNN with long short-term memory (LSTM) to automatically detect COVID-19 in X-ray frames. This model extracts features with the CNN, and the LSTM detects infection from the extracted features. The maximum accuracy achieved with this model is 99.4%, with an AUC of 99.9%.
In [40], the effectiveness of few-shot learning in U-Net architectures was investigated, which allows for dynamic fine-tuning of the network weights when few new samples are introduced into the U-Net. The results of the experiments show that the accuracy of segmenting COVID-19-infected lung areas improved. In [41], the X-ray image features were extracted using the histogram of oriented gradients (HOG) and fused with the CNN features to construct the classification model. For enhanced edge retention and image denoising, the modified anisotropic diffusion filtering (MADF) technique was used. The substantial fracture zone in the raw X-ray images was identified using a watershed segmentation approach. With a testing accuracy of 99.49%, specificity of 95.7%, and sensitivity of 93.65%, this ensured satisfactory performance in recognizing COVID-19. In [42], a novel probabilistic model was created based on a linear combination of Gaussian distributions (LCG). The authors modified the standard expectation-maximization (EM) algorithm to estimate both dominant and subdominant Gaussian components, which are used to sequentially refine the final estimated joint density. The approach was used to segment the COVID-19-affected lung region in 3D CT scans. In [43], flu symptoms, throat discomfort, immune status, diarrhea, voice type, breathing difficulty, chest pain, and other symptoms were employed to predict the likelihood of COVID-19 infection using machine learning methods, which achieved a prediction accuracy of more than 97%.
An automated system is required to identify COVID-19 cases based on chest images; imaging is the cheapest method compared with the COVID-19 test (RT-PCR). However, manual inspection of these images is a hectic and time-consuming process, and an experienced radiologist is always required for correct identification. Therefore, it is essential to identify these scans using an automated technique as early as possible. Computerized methods help radiologists in clinics support their manual results and detect COVID-19. In this paper, we propose a fully automated system using the fusion of features from two deep learning networks. Our significant contributions in this work are as follows:
• A hybrid contrast enhancement technique is proposed by sequentially employing linear filters.
• Transfer learning is performed by fine-tuning the parameters of two deep CNN models.
• Features are extracted from both models, and an entropy-controlled firefly optimization algorithm is implemented for optimal feature selection.
• Selected optimal features are fused using a parallel positive correlation approach.
The rest of the manuscript is organized as follows. The proposed methodology (i.e., a technique for contrast enhancement, deep learning features, entropy-controlled Firefly based selection of best features, and fusion) is presented in Section 2. The results are discussed in Section 3. Finally, the conclusion of this technique is given in Section 4.

Methodology
The proposed COVID-19 classification method using optimal deep learning feature fusion is presented in this section with detailed visual effects and mathematical descriptions. Figure 1 shows the proposed architecture of the COVID-19 classification. This figure explains that, initially, the images are acquired from the Internet and labeled as COVID-19-infected and normal according to the given details. After that, a new hybrid approach is proposed for contrast enhancement. Features are extracted from both models and optimized using a novel entropy-controlled Firefly algorithm. Selected optimal features are fused using a new approach, named parallel positive correlation. Finally, the MC-SVM is used for the classification into normal or COVID-19-infected cases.

Dataset Preparation
The first step in any computerized approach depends on the nature of the database. In this paper, chest CT images of COVID-19-positive and normal (healthy) subjects are considered for classification. We collected a total of 2500 COVID-19 images of 90 patients from the Radiopaedia database, on which more than 100 chest CT cases are available. We considered the images of the first 90 patients for the COVID-19-positive class. We also collected 2000 images from the same website for normal (healthy) patients. All images are in grayscale format. We performed pre-processing and resized the images to a dimension of 512 × 512. Later, we enlarged the dataset using data augmentation, so that the number of images in each class is 6000. In Figure 2, some sample images are illustrated.
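The paper does not list the specific augmentation operations used. A minimal numpy-only sketch of the resize step and a few common augmentations (flips and 90-degree rotations, assumed here for illustration) might look like:

```python
import numpy as np

def resize_nn(img, size=(512, 512)):
    """Nearest-neighbour resize of a grayscale image to the target size."""
    rows = (np.arange(size[0]) * img.shape[0] / size[0]).astype(int)
    cols = (np.arange(size[1]) * img.shape[1] / size[1]).astype(int)
    return img[np.ix_(rows, cols)]

def augment(img):
    """Simple augmentations (flips, rotation) that enlarge the dataset."""
    return [img, np.fliplr(img), np.flipud(img), np.rot90(img)]

# A toy grayscale slice stands in for a real CT scan.
scan = np.random.randint(0, 256, (640, 480), dtype=np.uint8)
resized = resize_nn(scan)
samples = augment(resized)
print(len(samples), samples[0].shape)  # 4 (512, 512)
```

In practice, a library resize (e.g., with anti-aliasing) would replace the nearest-neighbour helper; the sketch only illustrates the pipeline.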


Contrast Enhancement
The enhancement of input image contrast is an important and useful step to improve an image's visual quality [44][45][46]. The primary motivation of this step is to visualize the COVID-19-positive images with more clarity. A hybrid technique is proposed in this paper, based on the combination of two filters: (i) top-hat filtering and (ii) Wiener filter. The output of both filters is passed in a new activation function for final enhancement.

Given ℧ is a database of n images and ℧ ∈ ℝⁿ, each image is represented by Iₙ(x, y), with (x, y) ∈ ℝ. Each image Iₙ(x, y) has a dimension of N × M, where N = M = 512, and each image in the database ℧ is grayscale. Consider that e is a structuring element with a value of 21 and ∘ is an opening operator; then, the top-hat filtering operation is defined as follows:

I_top(x, y) = Iₙ(x, y) − (Iₙ ∘ e)(x, y)

The contrast of the image is enhanced using this filter. Next, the Wiener filter is employed for the removal of noise from the image. This filter minimizes the mean square error (MSE) between the estimated random process and the desired process, producing W_mse(x, y). Here, ∆ is a constant whose value is initialized as 1. The resultant values of I_top(x, y) and W_mse(x, y) are passed into the proposed activation function for the final enhancement. The output of this function is presented in Figure 3: the original CT images are illustrated in the first row, and the bottom row shows the intensified images. These resultant images demonstrate that the infected regions are visualized with more clarity. The enhanced images are used in the next stage for learning a model.
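The top-hat and Wiener steps map directly onto standard SciPy operations. In the sketch below, the structuring-element size of 21 and ∆ = 1 follow the paper, while the final additive combination merely stands in for the unpublished activation function and is purely an assumption:

```python
import numpy as np
from scipy.ndimage import white_tophat
from scipy.signal import wiener

def enhance(img, struct_size=21, delta=1.0):
    """Hybrid enhancement: white top-hat boosts bright details, the Wiener
    filter suppresses noise; the combination rule below is an assumption,
    since the paper's activation function is not given."""
    img = img.astype(float)
    i_top = white_tophat(img, size=struct_size)   # I_top(x, y)
    w_mse = wiener(img, mysize=5)                 # W_mse(x, y)
    return np.clip(img + i_top - delta * (img - w_mse), 0, 255)

ct = np.random.randint(0, 256, (512, 512)).astype(float)
out = enhance(ct)
print(out.shape)  # (512, 512)
```

The clipping keeps the enhanced image in the valid 8-bit intensity range before it is fed to the CNNs.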


Modified AlexNet Deep Learning Model
To perform computer vision tasks such as object detection and classification, AlexNet [47] is a widely used deep convolutional neural network (CNN) capable of attaining high accuracies on challenging datasets. It has eight layers in depth: five convolutional layers and three fully connected layers, the last with a 1000-way Softmax. The filter sizes used in the convolutional layers include 11 × 11 and 5 × 5. Rectified linear units (ReLUs) are used as the activation function owing to their lower computational cost, and they are applied after every convolutional layer. This model was trained on the challenging ImageNet [48] dataset with 1000 object classes. The input size of the CNN model is 227 × 227 × 3. The model uses dropout regularization with a rate of 0.5 to cope with overfitting, which increases the training time.
In this work, we fine-tuned the AlexNet model by eliminating the last layer and adding a new layer consisting of two target classes: COVID-19 and normal (healthy). The fine-tuned model was trained through transfer learning (TL) [37], leading to a new modified target model, shown in Figure 4. Features are extracted from layer FC7 and saved in a new matrix of dimension N × 4096; this feature matrix is denoted by Φ^{k1}_N, where k1 denotes the feature vector length and N represents the number of images.

Modified VGG16 Deep Learning Model
The VGG16 [49] convolutional neural network (CNN) is trained on the extensive ImageNet [48] image database, which has over a million images and 1000 classes. The model achieved 92.7% top-5 accuracy on the ImageNet image recognition challenge. The input size for VGG16 is 224 × 224 × 3. This model addressed deficiencies in AlexNet by reducing the filter size in the first convolutional layers from 11 × 11 and 5 × 5 to 3 × 3. The input image passes through multiple convolutional layers with filter sizes of 3 × 3 and 1 × 1, with the stride fixed at 1 pixel. Pooling is performed by five pooling layers with a 2 × 2 filter and a stride of 2. Three fully connected layers follow the stack of convolutional layers: the first two FC layers have 4096 features each, and the last fully connected layer corresponds to the 1000 classes of the ImageNet database for which the network was trained.
We fine-tuned this model by replacing the last classification layer with a new layer of two output classes: COVID-19 and normal. The fine-tuned model was trained through TL, leading to a new target model; the modified VGG16 model is shown in Figure 5. This target model is then used for feature extraction. Features are extracted from the seventh FC layer, yielding a feature vector of dimension N × 4096; this feature matrix is denoted by Φ^{k2}_N, where k2 denotes the feature vector length and N represents the number of images.

Feature Selection
In the last decade, feature selection techniques have shown great success in computer vision, particularly in medical imaging, making systems more efficient [50,51]. Unlike feature reduction techniques (such as principal component analysis, PCA) [52], feature selection does not alter the features; subsets of features are selected from the input feature vector for the classification task. This is the primary motivation behind the use of feature selection.
We implemented an entropy-controlled firefly algorithm (FA) for optimal feature selection. Initially, features are selected through the FA; later, an entropy-based activation function is proposed, and the features are passed through it for the final selection phase. FA is a contemporary and widely used metaheuristic optimization approach, developed by Yang et al. [53], which originated from the glowing behavior of fireflies. Different species of fireflies have particular flashing sequences, produced by bioluminescence. The flashing pattern has two fundamental functions: prey attraction and attraction of a mating partner. FA adopts the flashing behavior of fireflies for the optimization of multimodal problems, and it achieved robust performance compared with particle swarm optimization (PSO) and the genetic algorithm (GA) [54].
Three main steps define FA: (i) a firefly appeals to all other fireflies, and the appeal is not gender-specific; (ii) the attractiveness of fireflies is proportional to their brightness, so a glowing firefly attracts a firefly with lower brightness, and greater luminosity leads to a smaller distance between fireflies; and (iii) the brightness of a firefly is mapped through a fitness function. The luminosity of a firefly with origin brightness Y₀ is expressed as follows:

Y(s) = Y₀ e^(−δ s²)

where Y₀ describes the brightness at the origin, s is the distance between two fireflies, and δ is the light absorption coefficient responsible for the luminous intensity. As brightness and attractiveness are proportional to each other, the attraction T can be expressed as follows:

T(s) = T₀ e^(−δ s²)

where T₀ is the attractiveness when s = 0. The movement of firefly l towards firefly m is expressed as follows:

x_l^(z+1) = x_l^z + T₀ e^(−δ s_lm²)(x_m^z − x_l^z) + φ (Rand − 0.5)

where φ describes the randomness parameter, z is the iteration number, and Rand generates a random number between 0 and 1. The distance between the lth and mth firefly is denoted by s_lm and is computed as the Euclidean distance:

s_lm = ‖x_l − x_m‖

Based on this distance, the minimum-distance features are evaluated; for the evaluation, an MC-SVM classifier is utilized, and the next iteration is performed based on the error rate. In this paper, we selected a total iteration number of z = 100. After all iterations, optimal vectors were obtained with dimensions of N × 1746 and N × 1822 for feature vectors Φ^{k1}_N and Φ^{k2}_N, respectively. In the second stage, the features are further refined using the entropy-based activation function, where k ∈ {k1, k2}, yielding the final optimal selections for Φ^{k1}_N and Φ^{k2}_N, respectively. In this paper, the lengths of the optimal feature vectors after applying the activation function are N × 1346 and N × 1322, respectively.
The details are explained and given in Algorithm 1.
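Algorithm 1 is not reproduced here, but the selection loop can be sketched in numpy. Note the hypothetical pieces: the fitness below is a cheap stand-in for the paper's MC-SVM error rate, and the entropy threshold (mean entropy of the selected features) is an assumed criterion:

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy_of(col, bins=16):
    """Shannon entropy of a (binned) feature column."""
    p, _ = np.histogram(col, bins=bins)
    p = p[p > 0] / p.sum()
    return -(p * np.log2(p)).sum()

def fitness(mask, X, y):
    """Stand-in fitness; the paper instead evaluates subsets with an MC-SVM."""
    if mask.sum() == 0:
        return 0.0
    score = X[:, mask].mean(axis=1)
    return abs(np.corrcoef(score, y)[0, 1])

def firefly_select(X, y, n_flies=10, iters=20, delta=1.0, phi=0.1):
    d = X.shape[1]
    pos = rng.random((n_flies, d))            # continuous firefly positions
    for _ in range(iters):
        bright = np.array([fitness(p > 0.5, X, y) for p in pos])
        for l in range(n_flies):
            for m in range(n_flies):
                if bright[m] > bright[l]:     # l moves toward brighter m
                    s2 = np.sum((pos[l] - pos[m]) ** 2)
                    # attractiveness T = T0 * exp(-delta * s^2), with T0 = 1
                    pos[l] += np.exp(-delta * s2) * (pos[m] - pos[l]) \
                              + phi * (rng.random(d) - 0.5)
    best = pos[np.argmax([fitness(p > 0.5, X, y) for p in pos])] > 0.5
    if not best.any():
        return best
    # Entropy-controlled refinement (assumed criterion): keep selected
    # features whose entropy is at least the mean entropy of the selection.
    ent = np.array([entropy_of(X[:, j]) for j in range(d)])
    return best & (ent >= ent[best].mean())

X = rng.random((50, 30))
y = (X[:, 0] > 0.5).astype(float)
mask = firefly_select(X, y)
print(mask.shape)
```

Positions above 0.5 are interpreted as "feature selected", a common way to binarize the continuous firefly updates for feature selection.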



Feature Fusion and Classification
Feature fusion is an important method in pattern recognition [55]. It is used to combine or aggregate features originating from multiple inputs such as different types of images, different feature generation methods, or different layers of trained deep learning models [56,57]. Feature fusion is an important step in the proposed methodology, in which we fuse the information of both selected optimal deep feature vectors.
In this paper, we propose a new fusion approach, named parallel positive correlation. Initially, both vectors' lengths were equalized according to the size of the maximum length vector. As the length of Φ k1 N is higher than vector Φ k2 N , we performed zero padding. Based on the zero padding, we made the length of both vectors equal and then determined the correlation between the pair of features as i and j. The positively correlated features are selected for each i and j. The positive correlation denotes the features that have a correlation value close to one.
In the output, a vector of dimension N × 1346 was obtained for the final classification. The multiclass SVM (MC-SVM) [58] was utilized as the classifier for the final feature classification.
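A numpy sketch of this fusion step is given below. The paper only specifies zero padding and the selection of positively correlated feature pairs; the columnwise pairing, the 0.5 correlation threshold, and the averaging of matched pairs are all assumptions made for illustration:

```python
import numpy as np

def parallel_positive_correlation_fuse(phi1, phi2, thresh=0.5):
    """Zero-pad the shorter feature matrix to the longer one's width, then
    keep feature pairs (i, j) whose Pearson correlation is positive
    (close to one); matched pairs are averaged (assumed combination rule)."""
    n, k1 = phi1.shape
    k2 = phi2.shape[1]
    k = max(k1, k2)
    p1 = np.pad(phi1, ((0, 0), (0, k - k1)))
    p2 = np.pad(phi2, ((0, 0), (0, k - k2)))
    fused = []
    for i in range(k):
        a, b = p1[:, i], p2[:, i]
        if a.std() == 0 or b.std() == 0:      # skip zero-padded columns
            continue
        r = np.corrcoef(a, b)[0, 1]
        if r > thresh:                        # positively correlated pair
            fused.append((a + b) / 2.0)
    return np.array(fused).T if fused else np.empty((n, 0))

phi_k1 = np.random.rand(20, 8)
phi_k2 = 0.9 * phi_k1[:, :6] + 0.1 * np.random.rand(20, 6)  # correlated copy
fused = parallel_positive_correlation_fuse(phi_k1, phi_k2)
print(fused.shape[0])  # 20
```

The fused matrix keeps one row per image, so it can be passed directly to the MC-SVM classifier.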

Results and Analysis
For the experiment, we collected data of 90 patients. Half of the images are used to train the model, while the other half are used for testing; tenfold cross-validation is performed for all the results. The other deep learning parameters, namely the learning rate, mini-batch size, number of epochs, and learning method, are 0.001, 64, 200, and stochastic gradient descent, respectively. Multiple classifiers are utilized in the experiments, including naïve Bayes, fine tree, ensemble learning, and decision trees. Each classifier's performance is computed through several measures: sensitivity rate, precision rate, F1-score, accuracy, and false negative rate (FNR). Moreover, the computational time is also calculated to analyze the proposed method in a real-time testing phase.
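The reported measures follow their standard confusion-matrix definitions, sketched below with hypothetical counts (not the paper's actual confusion matrix):

```python
# Performance measures used in the paper, computed from a 2x2 confusion matrix.
def metrics(tp, fn, fp, tn):
    sensitivity = tp / (tp + fn)                  # a.k.a. recall
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    fnr = fn / (tp + fn)                          # false negative rate
    return sensitivity, precision, f1, accuracy, fnr

# Hypothetical counts for a COVID-19 (positive) vs. normal (negative) split.
sens, prec, f1, acc, fnr = metrics(tp=97, fn=3, fp=1, tn=99)
print(round(acc, 3), round(fnr, 3))  # 0.98 0.03
```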
All the simulations are conducted in MATLAB R2020b (MathWorks Inc., Natick, MA, USA) using a desktop computer with an Intel Core i7 CPU, a 512 GB SSD, 32 GB RAM, and a 16 GB GPU.

Results
The results of the proposed method for several classifiers, including MC-SVM, DT (decision tree), LDA (linear discriminant analysis), KNB (kernel naïve Bayes), QSVM (quadratic SVM), F-KNN, cosine KNN, and EBT (ensemble boosted tree), are presented in Table 1. The highest accuracy, 98%, is achieved by MC-SVM, with a sensitivity rate of 98%, a precision rate of 98.05%, an F1-score of 98.025%, and an AUC of 0.99, while the computational time is 12.416 (seconds). The accuracy achieved by the DT classifier is 94.4%, with an FNR of 5.6%, which is 3.6% higher than that of MC-SVM; this classifier's computational time is 13.522 (seconds), which is higher than that of MC-SVM. Similarly, the accuracies achieved by LDA, KNB, QSVM, F-KNN, cosine KNN, and EBT are 94.2%, 94.8%, 97.6%, 96.9%, 96.5%, and 96.3%, with FNR rates of 5.8%, 5.2%, 2.4%, 3.1%, 3.5%, and 3.7%, respectively. Based on the accuracy and FNR, it is observed that the proposed method performs best with MC-SVM. The minimum computational time, 12.115 (seconds), is noted for F-KNN; however, this classifier's accuracy is less than that of MC-SVM, and the time difference between the two classifiers is minimal. Moreover, scatter plots and a confusion matrix are given for verification of the accuracy achieved by MC-SVM. The scatter plots are illustrated in Figure 6; note that the left scatter plot is the original and the right one is predicted by the MC-SVM classifier. The confusion matrix of the classification results using MC-SVM is given in Figure 7, which shows that the correct prediction rate for COVID-19 is 97%.
We performed separate experiments to compare the proposed method's results with its previous steps (i.e., original feature extraction and optimal deep feature selection without fusion). These experiments support the performance of our proposed method. The results of the original deep features are tabulated in Table 2, which shows the results calculated for both deep models (AlexNet and VGG16) for all selected classifiers. For AlexNet model features, MC-SVM attains the best accuracy of 94.4%, while the error rate and computational time are 5.6% and 39.366 (seconds), respectively. For VGG16, MC-SVM gives the best results of 92.4%, while the error rate and computation time are 7.6% and 42.896 (seconds), respectively. It is noted that the performance of AlexNet is better in terms of accuracy and time; however, the accuracy of VGG16 is close to that of AlexNet. The accuracy of the other listed classifiers is also presented in this table. Based on these values, it is noted that the performance of AlexNet model features is better. Overall, the MC-SVM accuracy is better, but this accuracy is 4% less than the proposed technique's accuracy. Moreover, the time consumption of each classifier is three times higher as compared with that in Table 1.
The confusion matrix of MC-SVM using the original AlexNet and VGG16 features is illustrated in Figure 8. The figure shows that the correct recognition rate of COVID-19 is 94.4% and 88.6%, respectively.
The results using the optimal deep features are tabulated in Table 3. MC-SVM achieved the highest accuracies of 96.2% and 94.2% for the AlexNet optimal and VGG16 optimal vectors, respectively, with error rates of 3.8% and 5.8% and computational times of 14.277 (seconds) and 15.004 (seconds), respectively. Compared with the accuracy, error rate, and computation time achieved with the original deep features (tabulated in Table 2), the accuracy of the optimal deep features is improved.
Moreover, the time decreased by almost threefold. The confusion matrix of the MC-SVM results for this experiment is illustrated in Figure 9. In addition, the results for the other classifiers are also presented in Table 3 and compared with Table 1. Note that the optimal deep features provide better performance; however, each individual deep vector's accuracy is less than that of the proposed scheme, as tabulated in Table 1. The comparison between Tables 1 and 3 shows that the accuracy of the proposed scheme is almost 2% better, and the time is nearly the same.

Analysis and Comparison
The performance of the proposed method with a combination of several features is analyzed in this section. The primary aim of this step is to support the proposed accuracy based on the strength of each involved step. As shown in Figure 1, the implemented method has four fundamental steps (i.e., contrast enhancement, deep learning feature extraction, feature selection, and fusion). The results for each step are presented in Table 4. This table compares the effects of the proposed method with combinations of the previous steps. Initially, the AlexNet features are computed from contrast-enhanced images, achieving an accuracy of 94.4%. In the next experiment, the AlexNet features are extracted without applying contrast enhancement to the images, achieving an accuracy of 91.7%. This demonstrates that the use of contrast-enhanced images for AlexNet training improved the deep features. Similarly, the experiments are performed on the VGG16 model with and without contrast-enhanced images, achieving accuracies of 92.4% and 90.3%, respectively. The proposed optimal feature selection approach is then applied to both vectors and achieves accuracies of 96.2% and 94.2%, respectively. This shows that the accuracy is significantly increased after employing the optimal feature selection approach. Finally, the experiment is performed using the proposed scheme, which achieves an accuracy of 98%, showing the strength of the proposed method.
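The stage-wise gains described above can be tallied directly from the quoted accuracies (a minimal sketch using only the values stated in the text):

```python
# Accuracy (%) at each pipeline stage, as quoted in the text (Table 4).
accuracy = {
    "AlexNet, raw": 91.7,
    "AlexNet, enhanced": 94.4,
    "VGG16, raw": 90.3,
    "VGG16, enhanced": 92.4,
    "AlexNet, optimal": 96.2,
    "VGG16, optimal": 94.2,
    "proposed fusion": 98.0,
}
# Gain from contrast enhancement alone, per model:
print(round(accuracy["AlexNet, enhanced"] - accuracy["AlexNet, raw"], 1))  # 2.7
print(round(accuracy["VGG16, enhanced"] - accuracy["VGG16, raw"], 1))      # 2.1
# Further gain from optimal feature selection, then from fusion:
print(round(accuracy["AlexNet, optimal"] - accuracy["AlexNet, enhanced"], 1))  # 1.8
print(round(accuracy["proposed fusion"] - accuracy["AlexNet, optimal"], 1))    # 1.8
```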
A confidence-interval-based analysis is also conducted for the proposed method. The proposed method was executed 100 times and obtained a minimum accuracy of 96.9% and a maximum accuracy of 98%. From these values, the calculated standard deviation is 0.55, the variance is 0.3025, and the standard error of the mean (SEM) is 0.3889. Using these values, the confidence interval is plotted in Figure 10. Note that the margin of error (MOE) for the 95% (1.960σ_x̄) confidence level is 97.45 ± 0.762 (±0.78%), while the accuracy of the proposed method is almost consistent over several iterations.
For comparison with other neural network models, we implemented several pre-trained models and performed experiments. The results are plotted in Figure 11, which shows that the proposed method outperforms the other selected deep learning models. Moreover, the results are also computed for several training/testing ratios to justify the selection of the 50:50 ratio. Researchers normally use a 70:30 ratio; however, for a fair training and testing process, the 50:50 split is preferable. We calculated the results for the ratios 80:20, 70:30, 60:40, 50:50, 40:60, and 30:70 and obtained the most stable results for the 50:50 ratio. From Figure 12, it is clearly noted that the accuracy degrades for the 70:30, 60:40, 40:60, and 30:70 ratios. Hence, the accuracy achieved with the 50:50 ratio was found to be much better.
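The reported interval arithmetic can be reproduced from the quoted SEM and z-value (a sketch; 1.960 is the standard two-sided z-score for the 95% confidence level):

```python
# Recompute the 95% confidence interval reported above
# (mean accuracy 97.45%, SEM 0.3889, z = 1.960).
mean_acc = 97.45
sem = 0.3889
z = 1.960

moe = z * sem  # margin of error
print(f"MOE = {moe:.3f}")                                        # MOE = 0.762
print(f"95% CI = [{mean_acc - moe:.2f}, {mean_acc + moe:.2f}]")  # [96.69, 98.21]
print(f"relative MOE = {100 * moe / mean_acc:.2f}%")             # 0.78%
```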
Finally, the accuracy of the proposed method is compared with that of existing techniques in Table 5. In this table, an accuracy of 94.76% is achieved by [22], who used CT images of two classes, COVID-19 and normal, for classification. The remaining articles used the same CT images for binary classification and achieved accuracies of 96.97% [59], 95.60% [60], and 95.1% [29]. The proposed method achieved an accuracy of 98%, which is an improvement over the existing techniques.

Conclusions
In this work, a new fully automated deep learning feature-fusion-based method is proposed for the classification of chest CT images from COVID-19-infected and healthy subjects. In the proposed method, the first step is collecting a database from the Internet. The images in this database have low contrast; therefore, we implemented a new hybrid contrast enhancement method. This step plays a key role in the next step of obtaining useful features. Fine-tuning of two deep CNN models is performed according to the output classification classes. Transfer learning is employed on the modified fine-tuned models for training and deep feature extraction. The extracted features of both layers included some redundant information, which misleads the classification process. Therefore, we proposed an entropy-controlled firefly algorithm for robust feature selection. The individual optimal features did not achieve the target accuracy; therefore, we employed a new concatenation technique called parallel positive correlation. The final features are classified using MC-SVM, achieving an accuracy of 98%. The redundant features that still exist in this work are the limitation of the above-mentioned method.
