Abstract
Background and motivation: Lung computed tomography (CT) is a high-resolution technique well adopted in the intensive care unit (ICU) for COVID-19 disease control classification. Most artificial intelligence (AI) systems do not undergo generalization and are typically overfitted. Such trained AI systems are not practical for clinical settings and therefore do not give accurate results when executed on unseen data sets. We hypothesize that ensemble deep learning (EDL) is superior to deep transfer learning (TL) in both non-augmented and augmented frameworks. Methodology: The system consists of a cascade of quality control, ResNet–UNet-based hybrid deep learning for lung segmentation, and seven TL-based classification models followed by five types of EDLs. To prove our hypothesis, five kinds of data combinations (DC) were designed using two multicenter cohorts—Croatia (80 COVID) and Italy (72 COVID and 30 controls)—leading to 12,000 CT slices. As part of generalization, the system was tested on unseen data and statistically tested for reliability/stability. Results: Using the K5 (80:20) cross-validation protocol on the balanced and augmented dataset, the five DC datasets improved TL mean accuracy by 3.32%, 6.56%, 12.96%, 47.1%, and 2.78%, respectively. The five EDL systems showed accuracy improvements of 2.12%, 5.78%, 6.72%, 32.05%, and 2.40%, thus validating our hypothesis. All statistical tests proved positive for reliability and stability. Conclusion: EDL showed superior performance to TL systems on both (a) unbalanced and unaugmented and (b) balanced and augmented datasets, for both (i) seen and (ii) unseen paradigms, validating both our hypotheses.
1. Introduction
The COVID-19 pandemic has caused significant disruptions and health concerns worldwide and has worsened traditional diseases since its emergence in late 2019. Efforts to control its spread have included non-pharmaceutical interventions, such as social distancing, mask-wearing, and quarantine measures, as well as the development and administration of vaccines, which have proven effective in reducing the severity of the disease and preventing hospitalization and death [1,2,3,4,5,6].
Ongoing research and analysis are needed to better understand the effectiveness of various control measures and their impact on reducing the spread of COVID-19. There are several motivations for researching COVID-19 and its control measures. First, COVID-19 is a novel virus, and there is still much to learn about its transmission, symptoms, and long-term effects [1,7]. Second, research can help to fill these knowledge gaps and inform public health strategies. Third, COVID-19 has highlighted existing health disparities and inequities, and research can help to identify and address these issues in the context of the response to the pandemic [8,9]. Lastly, the COVID-19 pandemic has spurred artificial intelligence innovation and collaboration in fields such as medicine, epidemiology, and public health. Research can help to build on these developments and inform future responses to similar global health crises [3,6,10,11,12].
Supercomputers and graphics processing units (GPUs) ease the burden on researchers detecting diseases in medical imaging [5,13,14,15], e.g., pneumonia [5,16]. Transfer learning (TL), ensemble deep learning (EDL), and hybrid deep learning (HDL) are novel methods that achieve better accuracy faster than traditional methods [17,18,19,20]. Hospitals, labs, institutes, professors, and doctors are not only adopting these new paradigms but also collaborating to help patients. There is variability in the design of studies looking at COVID-19 and its control measures [1,15,21,22,23,24], which can make it challenging to compare and draw conclusions from different studies. Some studies may have limited generalizability, as they may be conducted in specific populations and may not apply to other populations. The emergence of new variants of the virus may affect the effectiveness of existing control measures. Despite these limitations, ongoing research is critical for understanding and mitigating the impacts of COVID-19 and developing effective control measures.
Researchers face challenges in obtaining a COVID-19 image dataset of sufficient volume. X-ray images are noisy and do not delineate infected lung areas as clearly as CT images. The CroMED and NovMED datasets have helped this research to detect infected COVID regions, but they must be processed with the correct models. Several machine learning (ML) and deep learning (DL) models have been published; ML models are mostly used for classification, while DL models perform both feature extraction and classification, making DL models more suitable than ML models for COVID CT image datasets [7,25]. Current DL models are already trained and tested on the ImageNet dataset with good accuracy. These models could be trained from scratch on CT images, but this would be a very slow and non-novel process. This challenge leads us to use TLs and EDLs: TLs are faster than traditional DL methodologies, and EDLs are stronger than TLs. Still, researchers are uncertain about the correct data size for DLs. The previous state of the art has shown that data augmentation and data balancing play a significant role in achieving better accuracy. Most AI systems are overfitted and never generalized; such systems memorize rather than generalize, are not practical for clinical settings, and therefore do not give accurate results on unseen data sets. This is the fundamental motivation of this study. We specifically address a novel system design, a cascade of three major AI systems for a multicenter data set design aimed squarely at unseen analysis towards generalization. To our knowledge, no previous system combines HDL-based segmentation with seven TL and five EDL classifiers, designed and tested on five types of multicenter COVID + control data combinations and applied to "unseen analysis" to establish generalization over memorization.
Based on the limitations of current research, we hypothesize two points to improve detection accuracy. First, the mean accuracy of EDLs is better than the mean accuracy of TLs. Second, balanced and augmented data give better results than data without augmentation. We studied 275 published journal and conference papers from IEEE, ScienceDirect, Springer, and MDPI. After this, we finalized EfficientNetV2M, InceptionV3, MobileNetV2, ResNet152, ResNet50, VGG16, VGG19, and Xception for our research work. These TL combinations generated EDL models that could improve COVID-19 detection accuracy [7,25]. Five EDL models and seven TLs are consistently used over the dataset combinations (DC) taken from Croatia and Italy.
The layout of this study is as follows: Section 2 presents the related literature. We discuss recent research and its accuracy on currently available datasets. Section 3 is a methodology in which the architecture and approach of research are included. The results and performance evaluation based on methodology and different performance metrics are presented in Section 4. Section 5 presents the system reliability and explainability. The critical discussion is presented in Section 6, and finally, the study concludes in Section 7.
2. Background Literature
The COVID-19 pandemic has led to an unprecedented global health crisis, with a significant impact on public health and on social and economic aspects of life [9,25]. One of the primary challenges faced by healthcare professionals during the pandemic has been the early and accurate diagnosis of COVID-19 patients. CT scans are one of the most reliable and widely used methods for the diagnosis of COVID-19 owing to their high sensitivity and specificity. With the advent of DL-based AI models, researchers have been able to develop automated diagnostic tools that can help healthcare professionals to diagnose COVID-19 patients more accurately and efficiently. Several studies have been conducted to develop and evaluate AI-based models for the diagnosis of COVID-19 using CT scans. For instance, in a study conducted by Gozes et al. [26], a DL-based model was developed and evaluated using a dataset of ~1500 CT scans. The study reported an overall sensitivity of 98% and a specificity of 92%, indicating that the model could accurately distinguish COVID-19 patients from non-COVID-19 patients. In a more recent study by Li et al. [27], a DL-based model was developed and evaluated using a dataset of 1684 CT scans obtained from 468 COVID-19 patients and 1216 non-COVID-19 patients. The model achieved an overall accuracy of 91.4%, indicating that it could accurately distinguish COVID-19 patients from non-COVID-19 patients.
Other studies have also explored the use of AI-based models for COVID-19 diagnosis using CT scans. Alshazly et al. [28] used DenseNet169 and DenseNet201 to evaluate 746 CT scan images. The authors achieved an accuracy of 91.2%, an F1-score of 90.8%, and an AUC of 0.91 with DenseNet169, and an accuracy of 92.9%, an F1-score of 92.5%, and an AUC of 0.93 with DenseNet201. Cruz et al. [29] conducted another study using 746 CT scans. The best model was DenseNet161, with an accuracy of 82.76%, a precision of 85.39%, and an AUC of 0.89; the second-best model was VGG16, with an accuracy of 81.77%, a precision of 79.05%, and an AUC of 0.9. Shaik et al. [30], Huang et al. [31], and Xu et al. [32] used TL-based MobileNetV2 on the COVID-CT dataset, TL-based MobileNetV2 on SARS-CoV2, and TL-based EfficientNetV2M on COVID-CT, achieving accuracies of 97.38%, 88.67%, and 95.66%, respectively. EDL has a major role in improving detection accuracy, and there are several popular EDL paradigms on CT datasets. Pathan et al. [33], Kundu et al. [34], and Tao et al. [35] used EDL models to achieve better accuracy than TL models; all three studies reported accuracies of more than 97%. In recent years, some authors have also used ensemble methods to detect COVID and non-COVID patients on X-ray and CT datasets [35,36,37,38,39,40]. These studies utilized at most two ensemble methods, and the datasets were not large. Some other analyses were also missing, such as results with and without data augmentation, unseen data analysis, and the rationale for combining all TLs. In addition to CT data classification, ensemble methods have also supported other medical sectors, i.e., breast cancer identification in early stages [41,42,43,44,45], brain tumor identification and segmentation [46,47,48,49], heart disease [49,50,51], and diabetic patient identification and treatment [52,53,54,55,56]. Statistical analysis has a vital role in finding reliable systems [57,58,59,60,61,62,63,64].
This analysis helps researchers to utilize models for real-world applications [65,66,67,68,69]. Current research has utilized EDL and TL models with accuracies above 90%. For instance, the Multi-DL RADIC model [70] demonstrated a remarkable accuracy of 99.4%. Similarly, the Multi-Deep system attained a high accuracy of 94.7% in [71], while an explainable CNN model achieved 95% accuracy [72]. Moreover, the use of transfer learning (TL) with VGG19 resulted in 94.52% accuracy [73], while DL using wavelets achieved an impressive accuracy of 99.7% [74].
The emergence of a novel era of Internet of Things (IoT) and EDL has further boosted research efforts, leading to remarkable accuracy rates. For instance, in [75], researchers achieved an accuracy rate of 98.56% using EDL in IoT. Stacked EDL also demonstrated promising results, achieving an accuracy rate of 93.57% [76]. Additionally, the best EDNC model achieved an accuracy rate of 97.75% [77], while a FUSI-CAD system based on a CNN model attained an impressive 99% accuracy rate [78]. These results highlight the effectiveness of various TL and EDL models in achieving high accuracy rates, which is crucial for COVID patients’ detection using CT dataset.
After reviewing the literature, we concluded that there is a need for a study with superior performance analysis that is more generalized through unseen data analysis, that is checked for reliability, and that is stable on both augmented and non-augmented data sets. Furthermore, our COVLIAS system also underwent scientific validation.
Although these AI-based models need further validation, they could potentially play a crucial role in the fight against the COVID-19 pandemic, especially in resource-limited settings where access to diagnostic tools is limited. Further, we noticed that none of the models extended the role of EDL over TL models while retaining HDL-based segmentation of CT scans, which is a superior method since it undergoes quality control. Lastly, there has been no attempt at a generalizability or cross-domain paradigm in which testing is conducted on an "unseen dataset" taken from other clinical centers, unlike the seen paradigm, where both training and testing use data from the same hospital settings. Our study exclusively addresses "unseen analysis" and tests the reliability and stability of the system design.
3. Methodology
In this section, we discuss image acquisition and data demography, overall architecture, HDL-based segmentation, TL-based classification approach, and EDL paradigm for classification. These subsections present the complete process to achieve our hypothesis.
3.1. Image Acquisition and Data Demographics
In this research work, we utilized two distinct cohorts from different countries. This dataset has already been validated by radiologists and doctors who are also co-authors in this paper. The first cohort, referred to as the experimental data set, consists of 80 CroMED COVID-19-positive individuals, with 57 males and the remainder female. Sample images are in Figure 1. An RT-PCR test was conducted to confirm the presence of COVID-19 in the selected cohort, with an average value of around 4 for ground-glass opacity (GGO), consolidation, and other opacities. Of the 80 CroMED patients, 83% had a cough, 60% had dyspnea, 50% had hypertension, 8% were smokers, 12% had a sore throat, 15% were diabetic, and 3.7% had COPD. Out of the total cohort, 17 patients were admitted to the intensive care unit (ICU), and 3 patients died due to COVID-19 infection [2,79,80].
Figure 1.
Raw “COVID-19 CT slices” patient images taken from CroMED dataset.
The second data set (Figure 2) included 72 NovMED COVID-19-positive individuals, 47 males and the remainder female. An RT-PCR test was conducted to confirm the presence of COVID-19 in the selected cohort, with an average value of approximately 2.4 for GGO, consolidation, and other opacities. Of the 72 NovMED patients, 61% had a cough, 9% had a sore throat, 54% had dyspnea, 42% had hypertension, 12% were diabetic, 11% had COPD, and 11% were smokers. In total, 10 patients died due to COVID-19 infection in this cohort. Figure 3 shows the NovMED (control) dataset from Italy. The COVID (Croatia) dataset had dimensions of 512 × 512 and 5396 raw images, COVID (ITA) had dimensions of 768 × 768 and 5797 raw images, and control (Italy) had dimensions of 768 × 768 and 1855 raw images.
Figure 2.
Raw “COVID-19 CT slices” images taken from NovMED dataset.
Figure 3.
Raw “Control CT slices” images taken from NovMED dataset.
The CT dataset was acquired using a 64-detector FCT Speedia HD scanner (Fujifilm Corporation, Tokyo, Japan, 2017). The NovMED dataset, consisting of 72 COVID-19-positive individuals, was obtained from the Department of Radiology at Novara Hospital, Italy. The CT scans were performed using a 128-slice multidetector row CT scanner (Philips Ingenuity Core, Philips Healthcare). The patients were required to have a positive RT-PCR test for COVID-19 as well as symptoms such as fever, cough, and shortness of breath. No contrast agent was administered during the acquisition, and a lung kernel with a 768 × 768 matrix together with a soft-tissue kernel was utilized to obtain 1 mm thick slices. The CT scans were performed at 120 kV and 226 mAs/slice using Philips's automated tube current modulation (Z-DOM), with a spiral pitch factor of 1.08, a 0.5 s gantry rotation time, and a 64 × 0.625 detector configuration [80]. Appendix A has more samples of the CroMED (COVID), NovMED (COVID), and NovMED (control) datasets. Selection criteria for both the CroMED and NovMED datasets required CT scans free of metallic items and of high quality, without external artefacts or blur caused by patient movement during the scanning procedure. In this group, the average patient's CT volume had about 300 slices. During slice selection, the slices with the greatest lung area were chosen by one of the senior radiologists (K.V.).
Balancing rationale: CroMED (COVID) consisted of 5396 images, while for NovMED (COVID), the data set consisted of 5797 images. NovMED (Control) consisted of 1855 images. Note that there were few control data points. The augmentation procedure consisted of increasing the COVID data two times and control data six times. Thus, the total numbers of images were changed to 10,792 (5396 × 2), 11,594 (5797 × 2), and 11,130 (1855 × 6), respectively. This was for balancing the data sets for COVID and the controls, and this made the control data sets nearly the same as the COVID data sets.
Folding rationale: The COVID data were increased two-fold. This was based on the sample size computation (so-called power analysis, discussed in the methodology section), the objective of which was to improve the accuracy. For the best accuracy, at least 8100 COVID images were needed. Thus, we increased the COVID data sets two-fold, i.e., to 10,792 (5396 × 2) and 11,594 (5797 × 2). Subsequently, the control set was balanced by increasing it six-fold, i.e., to 11,130 (1855 × 6). Table 1 depicts the distribution of the dataset.
Table 1.
Summary of CroMED (COVID), NovMED (COVID), and NovMED (control) dataset information for the experiment.
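The balancing arithmetic above can be summarized in a few lines (the counts and fold factors are taken from the text; the dictionary keys are our own illustrative labels):

```python
# Raw slice counts from the three cohorts (Table 1).
raw_counts = {
    "CroMED_COVID": 5396,
    "NovMED_COVID": 5797,
    "NovMED_control": 1855,
}

# COVID classes are doubled (power analysis requires >= 8100 images),
# and the control class is increased six-fold, so all three classes
# end up at nearly the same size.
factors = {"CroMED_COVID": 2, "NovMED_COVID": 2, "NovMED_control": 6}

balanced = {name: n * factors[name] for name, n in raw_counts.items()}
print(balanced)
# {'CroMED_COVID': 10792, 'NovMED_COVID': 11594, 'NovMED_control': 11130}
```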
3.2. Overall Pipeline of the System
The proposed overall architecture is portrayed in Figure 4. In this architecture, the CT machine operator and doctors contributed the raw lung images stored for research purposes. These raw images were subjected to HDL segmentation to produce segmented data, resulting in a clear and distinct image of the lung. The latest advancements in segmentation have yielded better results than working with raw images. We utilized both TLs and EDLs to detect COVID-19 and control cases with high accuracy, as shown in Figure 4. In this study, we hypothesized that the mean accuracy of EDLs is superior to that of TLs. Additionally, we hypothesized that the mean accuracy of models with balanced and augmented input data would be greater than that of models without augmentation, for both TLs and EDLs. We conducted scientific validation and statistical analysis and computed precision, recall, F1-score, and AUC to evaluate the performance of the models.
Figure 4.
Overall pipeline consisting of CT volume acquisition, HDL-based segmentation, transfer learning (TL1–TL7), and ensemble-deep-learning-based (EDL1–EDL5) classification.
3.3. Hybrid Deep Learning Architecture of CT Lung Segmentation
After data acquisition, the raw input images were passed to the HDL model for segmentation. Image segmentation breaks an image into segments, each corresponding to a desired class in the image. The approach utilized for segmentation depends on the specific application and the characteristics of the image being segmented. The study by Suri et al. [80] in the literature review has shown that HDLs are better than solo segmentation models. In the same spirit, ResNet–UNet was exclusively adopted for lung segmentation after pre-processing and quality control [81,82,83,84,85,86]. The ResNet–UNet-based HDL model is composed of 165 layers with ~16.5 million parameters. The final trained model size was 188 megabytes. Using a cutoff of 80%, the model had Dice and Jaccard scores of 0.83 and 0.71, respectively.
These segmented images are the inputs for seven types of TL models and five EDL models in five different input data combinations (DC), both (i) without data augmentation and (ii) with balanced and augmented data, in predicting the presence of COVID-19 in three different datasets: CroMED (COVID), NovMED (COVID), and NovMED (control). ResNet helps in solving the vanishing gradient problem of previous models using skip connections. The convolutional neural network (CNN) layers in ResNet downsample features using a stride of two. The UNet-based architecture addresses the semantic segmentation problem. We have therefore used the combination of ResNet and UNet architectures to build HDL-based segmentation, as shown in Figure 5. This amalgamation paradigm effectively segments the lungs in COVID-19 and control CT scans.
Figure 5.
ResNet–UNet HDL architecture for lung segmentation [80].
To balance the control and COVID classes, 3× augmentation of the control class was carried out using vertical and horizontal flips and 45-degree rotation. After class balancing, in all five DC scenarios the data were further augmented two-fold using vertical and horizontal flips and 30-degree rotation. The augmented data were analyzed over seven TLs and five EDLs in all five DCs.
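The flip-and-rotate augmentation described above can be sketched as follows, assuming each CT slice is a 2-D NumPy array (the function name is our own, and SciPy is assumed available for the non-90-degree rotation); this is an illustration, not the authors' code:

```python
import numpy as np
from scipy.ndimage import rotate


def augment(slice_2d, angle):
    """Return flipped and rotated copies of one CT slice.

    Vertical/horizontal flips plus a fixed-angle rotation, as used for the
    45-degree (class balancing) and 30-degree (2x augmentation) passes.
    """
    flipped_v = np.flipud(slice_2d)  # vertical flip
    flipped_h = np.fliplr(slice_2d)  # horizontal flip
    # reshape=False keeps the original slice dimensions after rotation
    rotated = rotate(slice_2d, angle=angle, reshape=False, order=1)
    return flipped_v, flipped_h, rotated


img = np.random.rand(180, 180)       # one resized CT slice
fv, fh, r45 = augment(img, angle=45)
```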
3.4. Transfer-Learning-Based Architecture for Classification
Transfer learning is one of the premier methods for classification and offers several advantages over scratch-trained DL classification [27,87,88]. The seven TL models adopted were EfficientNetV2M, InceptionV3, MobileNetV2, ResNet152, ResNet50, VGG16, and VGG19, all pre-trained on the ImageNet dataset. For each TL model, we removed the top (classification) layers and added a flatten layer, a dense layer, a dropout layer, and an L2 regularizer. The flatten layer [89] converts the multidimensional output of the previous layer into a 1-dimensional vector and passes the values to a dense layer with a ReLU activation function and L2 regularization with a strength of 0.001. This regularizer prevents overfitting; dropout is another method that reduces overfitting. Finally, a fully connected layer with two classes and a sigmoid activation function is applied to the output of the previous layer. The output of this last layer represents the predicted probabilities for the two classes in the classification problem, COVID vs. control. All the architectures used in this work are shown in Appendix B. We used these TL models for their ability to bypass the long training time of scratch-based network designs [13,90].
3.5. Ensemble Deep Learning Architectures for Classification
Ensembling is the area of machine learning that combines weak learners to make them stronger. We used the soft-max voting ensemble method, in which the summed predicted scores determine the class of the ensemble prediction. We also propose a novel algorithm (Algorithm 1) to generate five EDLs from the TLs. The EDL generator applies a combination method over the TL prediction scores to create five EDLs from the seven TLs. The accuracy of an EDL is better than that of a solo deep learning architecture.
| Algorithm 1: EDL generator |
| Result: Combination of TL models score generates five EDLs |
| Input: TL Model predicted score |
| EDL = [ ]; |
| While1 len(EDL) < 6 do: |
| While2 i in range(2, 8) do: |
| While3 k in combinations(TL, i) do: |
| NewEDL = GenerateEDL(k); |
| // GenerateEDL ensembles the predicted scores of the TLs in k |
| If ACC of NewEDL > ACC of every constituent TL then |
| EDL.append(NewEDL); |
| Else |
| Print("Try new combination") |
| End // End of If |
| End // End of While3 |
| End // End of While2 |
| End // End of While1 |
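A minimal, runnable sketch of Algorithm 1 using soft voting over predicted scores follows; the model names and toy probabilities in the usage example are illustrative, not outputs from the study:

```python
import numpy as np
from itertools import combinations


def soft_vote(scores):
    """Average the members' predicted probabilities and threshold at 0.5."""
    return (np.mean(scores, axis=0) >= 0.5).astype(int)


def accuracy(pred, labels):
    return float(np.mean(pred == labels))


def generate_edls(tl_scores, labels, n_edl=5):
    """Greedy sketch of Algorithm 1: a TL combination is kept as an EDL only
    if its soft-voted accuracy beats every constituent TL's accuracy.

    tl_scores: dict mapping TL name -> array of predicted probabilities.
    labels:    array of gold-standard class labels (0 = control, 1 = COVID).
    """
    names = list(tl_scores)
    tl_acc = {m: accuracy((tl_scores[m] >= 0.5).astype(int), labels)
              for m in names}
    edls = []
    for size in range(2, len(names) + 1):   # combinations of 2..n TLs
        for combo in combinations(names, size):
            scores = np.stack([tl_scores[m] for m in combo])
            acc = accuracy(soft_vote(scores), labels)
            if acc > max(tl_acc[m] for m in combo):
                edls.append((combo, acc))
            if len(edls) == n_edl:
                return edls
    return edls


# Toy example: two weak TLs that each misclassify one slice; their soft vote
# corrects both mistakes, so the combination qualifies as an EDL.
labels = np.array([0, 1, 1, 0])
m1 = np.array([0.4, 0.6, 0.4, 0.4])   # hypothetical TL1 scores
m2 = np.array([0.4, 0.4, 0.9, 0.4])   # hypothetical TL2 scores
edls = generate_edls({"TL1": m1, "TL2": m2}, labels, n_edl=1)
```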
After obtaining the segmented images, the TL and EDL models performed the task of accurately detecting COVID vs. control (Figure 6). First, the segmented data were preprocessed for all five data combinations. Parallel execution of the models at the original data size caused an out-of-memory failure on our GPU, so the input size was reduced to 180 × 180. The COVID and control classes were then balanced by augmenting the control class 3×. Once the data were balanced, we augmented them 2× to increase the size of the data; this second augmentation also allowed us to check the augmentation effect. Seven TLs were used for feature extraction, with a sigmoid function for binary classification. TL combinations generate the EDLs, which use soft-max voting on the predicted scores to detect COVID vs. control.
Figure 6.
TL and EDL classification process on segmented images.
3.6. Loss Function
Cross-entropy (CE) loss functions are frequently used for DL models with two or more classes. The CE loss, L_CE, depends on the probability p predicted by the AI model and the gold-standard label y (1 or 0), as shown in Equation (1):

L_CE = −[y log(p) + (1 − y) log(1 − p)]   (1)
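The binary CE loss can be sketched in a few lines (an illustrative implementation; the epsilon guard against log(0) is our addition):

```python
import math


def binary_cross_entropy(y, p, eps=1e-12):
    """Equation (1): CE loss for gold label y in {0, 1} and predicted
    probability p; eps guards against log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


binary_cross_entropy(1, 0.9)  # ~0.105: confident correct prediction, low loss
binary_cross_entropy(1, 0.1)  # ~2.303: confident wrong prediction, high loss
```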
3.7. Performance Metric
We used true positive (TP), true negative (TN), false positive (FP), and false negative (FN) counts to estimate the various performance evaluation metrics: accuracy (Equation (2)), recall (Ɍ) (Equation (3)), precision (Equation (4)), and F1-score (Ƒ) (Equation (5)). After calculating the accuracy of the TL and EDL models, we calculated the mean accuracy of the TLs (Equation (6)) and the mean accuracy of the EDLs (Equations (7) and (8)). In these equations, "n" is the number of TLs, and "N" is the number of EDLs. The Dice and Jaccard coefficients were calculated using Equations (9) and (10), where ƶ is the set of wanted items and ƴ the set of found items. The probability curve ROC (receiver operating characteristic) and the degree of separability AUC (area under the curve) were also calculated for each model. In the standard deviation (σ), each value from the population is denoted by x and µ is the population mean; N is the size of the population.
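The metrics in Equations (2)-(5) and (9)-(10) can be written directly from their definitions (a minimal illustration, not the authors' code; the set-based Dice/Jaccard sketch treats the wanted and found items as Python sets):

```python
def classification_metrics(tp, tn, fp, fn):
    """Equations (2)-(5): accuracy, recall, precision, and F1-score
    computed from the confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f1


def dice_jaccard(wanted, found):
    """Equations (9)-(10): Dice and Jaccard between the set of wanted items
    (gold-standard pixels) and the set of found items (segmented pixels)."""
    inter = len(wanted & found)
    dice = 2 * inter / (len(wanted) + len(found))
    jaccard = inter / len(wanted | found)
    return dice, jaccard
```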
3.8. Experimental Protocol
3.8.1. Five Data Combinations
For the robust design of the classification system, we designed five types of data combination scenarios. These are based on training and testing data taken from two countries, namely, Croatia and Italy.
- DC1: Training validation and testing using both CroMED (COVID) and NovMED (control).
- DC2: Training validation and testing on both NovMED (COVID) and NovMED (control).
- DC3: Training validation using CroMED (COVID) and NovMED (control) and testing on NovMED (COVID) and NovMED (control).
- DC4: Training validation using NovMED (COVID) and NovMED (control) and testing on CroMED (COVID) and NovMED (control).
- DC5: Training validation and testing on mixed data in which COVID CT scans from Croatia and Italy are mixed; the control of Italy was used.
3.8.2. Experiment 1: Transfer Learning Models using Lung Segmented Data
This experiment consists of running seven types of TL models, namely, EfficientNetV2M, InceptionV3, MobileNetV2, ResNet152, ResNet50, VGG16, and VGG19, all pre-trained on the ImageNet dataset, for the classification of segmented lung data into COVID vs. controls. Lung segmentation was conducted using ResNet–UNet, and the segmented images were input to the TLs. The experiment highlights the effectiveness of TL for improving model accuracy on HDL-segmented data. The lung segmented data were split in the ratio 80:10:10 for training, validation, and testing, respectively. Models were saved after training and validation and later tested on 10% of the dataset under the five input data combinations. These TLs then predict COVID vs. control.
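A minimal sketch of the 80:10:10 split described above, assuming slices are addressed by integer indices (the function name and seed are illustrative, not from the study):

```python
import numpy as np


def split_80_10_10(n_samples, seed=0):
    """Shuffle sample indices and split them 80/10/10 into
    train/validation/test partitions, as used for the TL experiments."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.8 * n_samples)
    n_val = int(0.1 * n_samples)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])


train_idx, val_idx, test_idx = split_80_10_10(12000)
```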
3.8.3. Experiment 2: Ensemble Deep Learning for Classification
Here, TLs are combined to design EDLs to achieve even higher accuracy in detecting COVID-19 vs. control [91,92,93,94,95,96,97,98]. For DC1, five EDLs were generated from the TL models: EDL1: VGG19 + VGG16, EDL2: InceptionV3 + VGG19, EDL3: VGG19 + EfficientNetV2M, EDL4: InceptionV3 + EfficientNetV2M, and EDL5: ResNet50 + EfficientNetV2M. For DC2: EDL1: ResNet50 + ResNet152, EDL2: VGG16 + EfficientNetV2M, EDL3: VGG19 + EfficientNetV2M, EDL4: VGG16 + MobileNetV2, and EDL5: VGG19 + MobileNetV2. For DC3: EDL1: ResNet50 + MobileNetV2, EDL2: ResNet50 + InceptionV3, EDL3: InceptionV3 + VGG19, EDL4: InceptionV3 + MobileNetV2 + ResNet152, and EDL5: InceptionV3 + EfficientNetV2M + ResNet152. For DC5: EDL1: EfficientNetV2M + ResNet50, EDL2: MobileNetV2 + ResNet50, EDL3: ResNet50 + ResNet152, EDL4: ResNet152 + EfficientNetV2M, and EDL5: MobileNetV2 + VGG19.
3.8.4. Experiment 3: Effect of EDL Classification over TL Classification with Augmentation
This experiment shows the effect of EDL classification over TL classification on unaugmented and augmented data [99,100,101,102,103]. The mean EDL accuracy and mean TL accuracy were verified after balancing and augmentation.
3.8.5. Experiment 4: Unseen Data Analysis
Training on one combination of data and testing on another were carried out here. We analyzed the models' performance on unseen data to evaluate their generalizability [104,105,106,107,108,109,110,111]. The results showed that the models performed well on unseen data, indicating their potential for real-world applications. Input data scenarios DC3 and DC4 are examples of unseen data analysis.
3.9. Experimental Setup
We used the Idaho State University (ISU) GPU cluster to execute all models on DC1 to DC5. TensorFlow 2.0 libraries were used to design the software, and results were also evaluated using MedCalc v12.5 statistical software [112,113,114,115,116,117]. Common hyperparameters across the TL models were: optimizer: Adam; learning rate: 0.0001; loss function: categorical cross-entropy; regularizer: L2 (0.01); dropout: 0.5; batch size: 32; classification activation function: sigmoid; other layers' activation function: ReLU; epochs: 25.
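For reference, the common hyperparameters listed above can be collected into a single configuration dictionary (an illustrative sketch; the key names are our own, not from the study's code):

```python
# Common TL hyperparameters from the experimental setup, in one place.
TL_HYPERPARAMETERS = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "loss": "categorical_crossentropy",
    "l2_regularizer": 0.01,
    "dropout": 0.5,
    "batch_size": 32,
    "output_activation": "sigmoid",
    "hidden_activation": "relu",
    "epochs": 25,
}
```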
3.10. Power Analysis
We calculated the sample size using the conventional method [118,119,120]. The formula for the required sample size, n, is as follows:

n = (z*)² × p̂ × (1 − p̂) / MoE²   (11)

where z* is the z-score corresponding to the desired level of confidence, MoE is the margin of error (half the width of the confidence interval), and p̂ is the estimated proportion of the characteristic in the population. Using MedCalc software, we calculated the required values and substituted them into Equation (11). A sample size of at least 8100 is needed to estimate the proportion of the characteristic in the population with a 95% confidence interval of 0.963 to 0.978 and an MoE of 0.0075.
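Equation (11) can be evaluated as a short function (a generic sketch, not the authors' MedCalc computation; the example inputs below are textbook defaults, not the study's values):

```python
import math


def required_sample_size(z, p_hat, moe):
    """Equation (11): n = z^2 * p_hat * (1 - p_hat) / MoE^2,
    rounded up to the next whole sample."""
    return math.ceil(z**2 * p_hat * (1 - p_hat) / moe**2)


# Textbook example: 95% confidence (z = 1.96), p_hat = 0.5, MoE = 0.05
required_sample_size(1.96, 0.5, 0.05)  # -> 385
```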
4. Results and Performance Evaluation
To verify both hypotheses, we conducted four experiments on the five DC scenarios. ResNet–UNet, a hybrid deep learning model, was used to segment the raw data. Raw CroMED (COVID), NovMED (COVID), and NovMED (control) images are shown alongside their segmented counterparts. We randomly selected four sample images from CroMED (COVID) and passed them through ResNet–UNet; the output segmented images are placed below the raw images in Figure 7. NovMED (COVID) and NovMED (control) images are presented in the same manner in the same figure. All five DCs utilized the seven transfer learning models and five ensemble deep learning models over CroMED (COVID), NovMED (COVID), and NovMED (control). The seven TLs are EfficientNetV2M, InceptionV3, MobileNetV2, ResNet152, ResNet50, VGG16, and VGG19, and combinations of TL models with the soft-voting ensemble method generate the EDL models. Training accuracy and loss plots for ResNet–UNet at each epoch are shown in Figure 8.
Figure 7.
Raw images and corresponding segmented images after ResNet–UNet.
Figure 8.
Training accuracy and loss plot for ResNet–UNet.
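The soft-voting fusion used to build the EDL models from TL constituents can be sketched as follows, assuming each constituent model outputs per-class probabilities; the probability arrays here are illustrative:

```python
import numpy as np

def soft_vote(prob_maps: list) -> np.ndarray:
    """Soft-voting ensemble: average the class-probability outputs of the
    constituent TL models, then take the argmax class per sample."""
    mean_probs = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return np.argmax(mean_probs, axis=1)

# Two hypothetical TL models disagree on sample 0; the more confident wins.
p_model_a = np.array([[0.9, 0.1], [0.2, 0.8]])
p_model_b = np.array([[0.4, 0.6], [0.3, 0.7]])
print(soft_vote([p_model_a, p_model_b]))  # → [0 1]
```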
4.1. PE for HDL Lung Segmentation
Figure 9 depicts a cumulative frequency plot for Dice (left) and Jaccard (right) for ResNet–UNet when computed against medical doctor (MD) 1. The correlation coefficients and Bland–Altman (BA) plots for MD1 and MD2 are shown in Figure 10 and Figure 11. The correlation coefficient graph depicts the strength of the relationship between the ResNet–UNet output and the doctors' tracings, while the BA plots show the agreement between ResNet–UNet and each doctor. After segmentation, the TL and EDL models use these segmented images for classification. We designed five scenarios for classification to prove our hypothesis. The evaluation metrics used to compare the models include mean accuracy (Mean ACC), standard deviation (SD), mean predicted score (Mean PR), area under the curve (AUC), p-value, precision, recall, and F1 score.
Figure 9.
Cumulative frequency plot for Dice (left) and Jaccard (right) for ResNet–UNet when computed against MD 1.
Figure 10.
Correlation coefficient plot for left: ResNet–UNet vs. MD 1 and right: ResNet–UNet vs. MD 2.
Figure 11.
BA plot for ResNet-UNet using MD 1 (left) vs. MD 2 (right).
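As a minimal sketch, the Dice and Jaccard metrics underlying Figure 9 can be computed from a predicted binary lung mask and an MD-traced mask as follows (the tiny masks are illustrative):

```python
import numpy as np

def dice_and_jaccard(pred: np.ndarray, truth: np.ndarray):
    """Overlap between a binary AI mask and an MD-traced binary mask.
    Dice = 2|A ∩ B| / (|A| + |B|); Jaccard = |A ∩ B| / |A ∪ B|."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    dice = 2.0 * inter / (pred.sum() + truth.sum())
    jaccard = inter / np.logical_or(pred, truth).sum()
    return float(dice), float(jaccard)

pred = np.array([[1, 1, 0], [0, 1, 0]])   # 3 foreground pixels
truth = np.array([[1, 1, 0], [0, 0, 0]])  # 2 foreground pixels, 2 overlap
d, j = dice_and_jaccard(pred, truth)
print(round(d, 3), round(j, 3))  # → 0.8 0.667
```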
4.2. Results of Experiment 1: Transfer Learning Models using Lung Segmented Data
In Experiment 1, we performed the TL operations using ResNet–UNet-segmented data. The detailed results for all five DC scenarios follow.
- DC1 results: Table 2 and Figure 12 show that the best accuracy, 97.93% without augmentation and 99.93% with augmentation, is achieved by MobileNetV2. The mean accuracy of all seven TLs is 93.91% without augmentation and 97.03% with augmentation. For TL6 (VGG16), the accuracy improves from 90.20% (before augmentation) to 95.61% (after augmentation) using the DC1 data combination, an improvement of 5.41%. TL2 (InceptionV3) had accuracies of 93.60% (before augmentation) and 93.97% (after augmentation), an improvement of 0.37%. Therefore, augmentation has different effects on TL-based classifiers: it is more pronounced in TL6, unlike in TL2. Table 3 shows that COVID precision is significantly increased or comparable after balancing the data.
Table 2. Comparative TL statistical analysis using with/without augmentation on DC1.
Figure 12. Comparison of mean TL accuracy with/without augmentation. TL1: EfficientV2M, TL2: InceptionV3, TL3: MobileNetV2, TL4: ResNet152, TL5: ResNet50, TL6: VGG16, TL7: VGG19 using DC1.
Table 3. Comparative precision, recall, and F1 score analysis of COVID and control classes with/without augmentation on DC1. - DC2 results: Table 4 and Figure 13 show that the best accuracy without augmentation, 90.84%, is achieved by InceptionV3, and the best with augmentation, 93.92%, is achieved by EfficientNetV2M. The mean accuracy of all seven TLs is 84.41% without augmentation and 89.85% with augmentation. For TL4 (ResNet152), the accuracy improves from 78.16% (before augmentation) to 87.40% (after augmentation) using the DC2 data combination, an improvement of 11.82%. TL6 (VGG16) had accuracies of 85.6% (before augmentation) and 84.05% (after augmentation), so there was no improvement. Therefore, augmentation has different effects on TL-based classifiers: it is more pronounced in TL4, unlike in TL6. Table 5 shows the effect of augmentation on COVID precision, recall, and F1 score; these are significantly increased or comparable after balancing the data.
Table 4. Comparative TL statistics analysis with/without augmentation on DC2.
Figure 13. Comparison of mean TL accuracy with/without augmentation. TL1: EfficientV2M, TL2: InceptionV3, TL3: MobileNetV2, TL4: ResNet152, TL5: ResNet50, TL6: VGG16, TL7: VGG19 using DC2.
Table 5. Comparative precision, recall, and F1 score analysis of COVID and control classes with/without augmentation in DC2-TL. - DC3 results: Table 6 and Figure 14 show that the best accuracies, 85.40% without augmentation and 91.41% with augmentation, are achieved by EfficientNetV2M. The mean accuracy of all seven TLs is 72.90% without augmentation and 82.355% with augmentation. For TL5 (ResNet50), the accuracy improves from 67.17% (before augmentation) to 80% (after augmentation) using the DC3 data combination, an improvement of 19.10%. TL2 (InceptionV3) had accuracies of 67.58% (before augmentation) and 76.43% (after augmentation), an improvement of 13.09%. Therefore, augmentation has different effects on TL-based classifiers: it is more pronounced in TL5, unlike in TL2. The augmentation and balancing effects are visible in Table 7, which shows that better results can be achieved after balancing the data.
Table 6. Comparative TL statistics analysis with/without augmentation on DC3.
Figure 14. Comparison of mean TL Accuracy with/without augmentation. TL1: EfficientV2M, TL2: InceptionV3, TL3: MobileNetV2, TL4: ResNet152, TL5: ResNet50, TL6: VGG16, TL7: VGG19 using DC3.
Table 7. Comparative precision, recall, and F1 score analysis of COVID and control classes with/without augmentation in DC3. - DC4 results: Table 8 and Figure 15 show that the best accuracies, 69.40% without augmentation and 81.05% with augmentation, are achieved by VGG19. The mean accuracy of all seven TLs is 47.85% without augmentation and 70.39% with augmentation. The augmentation effect was also visible for TL3 (MobileNetV2), which had the lowest accuracy, 27.97%, before augmentation and 52.68% after augmentation, an improvement of 92.5%. Table 9 shows the augmentation effect on precision, recall, and F1 score.
Table 8. Comparative TL statistics analysis with/without augmentation on DC4.
Figure 15. Comparison of mean TL accuracy with/without augmentation. TL1: EfficientV2M, TL2: InceptionV3, TL3: MobileNetV2, TL4: ResNet152, TL5: ResNet50, TL6: VGG16, TL7: VGG19 using DC4.
Table 9. Comparative precision, recall, and F1 score analysis of COVID and control classes with/without augmentation in DC4. - DC5 results: Table 10 and Figure 16 show that the best accuracy without augmentation, 95.10%, is achieved by InceptionV3, and the best with augmentation, 95.28%, is achieved by VGG16. The mean accuracy of all seven TLs is 91.22% without augmentation and 93.76% with augmentation. For TL6 (VGG16), the accuracy improves from 86.81% (before augmentation) to 95.28% (after augmentation) using the DC5 data combination, an improvement of 9.75%. TL3 (MobileNetV2) had accuracies of 92.95% (before augmentation) and 89.07% (after augmentation), so there was no improvement. Therefore, augmentation has different effects on TL-based classifiers: it is more pronounced in TL6, unlike in TL3. For most TL models, the improvements in precision, recall, and F1 score after balancing and augmenting the data can be seen in Table 11.
Table 10. Comparative TL statistics analysis with/without augmentation on DC5.
Figure 16. Comparison of mean TL accuracy with/without augmentation. TL1: EfficientV2M, TL2: InceptionV3, TL3: MobileNetV2, TL4: ResNet152, TL5: ResNet50, TL6: VGG16, TL7: VGG19 using DC5.
Table 11. Comparative precision, recall, and F1 score analysis of COVID and control classes with/without augmentation in DC5. - Table 3, Table 5, Table 7, Table 9 and Table 11 also present the precision comparisons for Experiment 1, verifying Hypothesis 2: data augmentation improves the performance of the TL models. p-values based on the Mann–Whitney test were computed for all data combinations.
4.3. Results of Experiment 2: Ensemble Deep Learning for Classification
In Experiment 2, we performed the EDL operations for accurate classification of COVID and control. These EDLs are created using TL models. Following are the detailed results for all five DC scenarios.
- DC1 results: Table 12 and Figure 17 show that the mean accuracy of all EDLs without augmentation is 95.05% and is 97.07% with augmentation.
Table 12. Comparative EDL statistics analysis with/without augmentation on DC1.
Figure 17. Comparison of mean EDL accuracy with/without augmentation. EDL1: VGG19 + VGG16, EDL2: InceptionV3 + VGG19, EDL3: VGG19 + EfficientNetV2M, EDL4: InceptionV3 + EfficientNetV2M, EDL5: ResNet50 + EfficientNetV2M using DC1. - DC2 results: Table 13 and Figure 18 show that the mean accuracy of all EDLs without augmentation is 87.63% and is 92.70% with augmentation.
Table 13. Comparative EDL statistics analysis with/without augmentation on DC2.
Figure 18. Comparison of mean EDL accuracy with/without augmentation. EDL1: ResNet50 + ResNet152, EDL2: VGG16 + EfficientNetV2M, EDL3: VGG19 + EfficientNetV2M, EDL4: VGG16 + MobileNetV2, EDL5: VGG19 + MobileNetV2 using DC2. - DC3 results: Table 14 and Figure 19 show that the mean accuracy of all EDLs without augmentation is 75.88% and is 80.98% with augmentation.
Table 14. Comparative EDL statistics analysis with/without augmentation on DC3.
Figure 19. Comparison of mean EDL accuracy with/without augmentation. EDL1: ResNet50 + MobileNetV2, EDL2: ResNet50 + InceptionV3, EDL3: InceptionV3 + VGG19, EDL4: InceptionV3 + MobileNetV2 + ResNet152, EDL5: InceptionV3 + EfficientNetV2M + ResNet152 using DC3. - DC4 results: Table 15 and Figure 20 show that the mean accuracy of all EDLs without augmentation is 59.99% and is 79.22% with augmentation.
Table 15. Comparative EDL statistics analysis using with/without augmentation on DC4.
Figure 20. Comparison of mean EDL accuracy with/without augmentation. EDL1: VGG19 + EfficientNetV2M, EDL2: VGG16 + EfficientNetV2M, EDL3: Xception + ResNet152, EDL4: Xception + ResNet50, EDL5: VGG16 + VGG19 + EfficientNetV2M using DC4. - DC5 results: Table 16 and Figure 21 show that the mean accuracy of all EDLs without augmentation is 93.39% and is 95.64% with augmentation.
Table 16. Comparative EDL statistics analysis with/without augmentation on DC5.
Figure 21. Comparison of mean EDL accuracy with/without augmentation. EDL1: EfficientNetV2M + ResNet50, EDL2: MobileNetV2 + ResNet50, EDL3: ResNet50 + ResNet152, EDL4: ResNet152 + EfficientNetV2M, EDL5: MobileNetV2 + VGG19 using DC5.
4.4. Results of Experiment 3: EDL vs. TL Classification with Augmentation
In Experiment 3, we verified the effect of augmentation on EDLs versus TLs in all five DC scenarios. Figure 22 shows the results on unaugmented data, where we observed an accuracy improvement of 5.54% for EDLs over TLs. Similarly, Figure 23 shows an accuracy improvement of 2.82% for EDLs over TLs on balanced and augmented data. This verifies Hypothesis 1.
Figure 22.
Accuracy plot of TL vs. EDL without augmentation for DC1, DC2, DC3, DC4, and DC5.
Figure 23.
Accuracy plot of TL vs. EDL with augmentation for DC1, DC2, DC3, DC4, and DC5.
4.5. Results of Experiment 4: Unseen Data Analysis
In Experiment 4, we performed the unseen data analysis. In the DC3 scenario, training was performed on CroMED (COVID) and testing on NovMED (COVID); similarly, in the DC4 scenario, training was performed on NovMED (COVID) and testing on CroMED (COVID). As shown in Figure 22 and Figure 23, even in the unseen data analysis, both of our hypotheses are proven correct.
The comparative graph of mean TL accuracy and mean EDL accuracy proves both of our hypotheses. First, the mean accuracy of EDLs is better than the mean accuracy of TLs. Second, balanced and augmented data give better results compared to those without augmentation. We have also presented the standard deviation, mean predicted score, AUC, and p-value for all input data scenarios. DC1, DC2, DC3, DC4, and DC5 TL models with data augmentation and balance improved mean accuracy by 3.32%, 6.56%, 12.96%, 47.1%, and 2.78%, respectively. Similarly, the five EDLs’ accuracies increased by 2.12%, 5.78%, 6.72%, 32.05%, and 2.40%, respectively.
4.6. Receiver Operating Characteristics
We calculated the AUC from the ROC curves of our models to assess their discriminative performance. Figure 24, Figure 25, Figure 26, Figure 27 and Figure 28 show the ROC curves of TL1: EfficientV2M, TL2: InceptionV3, TL3: MobileNetV2, TL4: ResNet152, TL5: ResNet50, TL6: VGG16, and TL7: VGG19 for input data scenarios DC1, DC2, DC3, DC4, and DC5, respectively.
Figure 24.
ROC of seven TLs using DC1 with augmentation and without augmentation.
Figure 25.
ROC of seven TLs using DC2 with augmentation and without augmentation.
Figure 26.
ROC of seven TLs using DC3 with augmentation and without augmentation.
Figure 27.
ROC of seven TLs using DC4 with augmentation and without augmentation.
Figure 28.
ROC of seven TLs using DC5 with augmentation and without augmentation.
Overall, the results show that deep learning models based on transfer learning and ensemble methods achieve high accuracy in detecting COVID-19. Among the transfer learning models, MobileNetV2 outperforms the other models in terms of accuracy and AUC in all five cases. In addition, the ensemble models show better performance than the individual transfer learning models. ROC curves for the EDLs can be generated in the same manner as for the TLs. The AUC-ROC values of all EDLs for all five data combinations are given in the tables of Section 4.3. The ROC for one of the data combinations, DC1 with augmentation, is depicted in Figure 29; it shows that most of the EDL AUC values are better than or equal to those of their constituent models. ROC curves for data combinations DC2 through DC5 are given in Appendix C.

Figure 29.
ROC of five EDLs using DC1 with augmentation.
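The AUC values reported in these ROC figures can also be computed without plotting, using the rank interpretation of AUC (the probability that a random positive is scored above a random negative); a minimal NumPy sketch with illustrative scores:

```python
import numpy as np

def auc_from_scores(y_true: np.ndarray, y_score: np.ndarray) -> float:
    """AUC as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample;
    this equals the area under the empirical ROC curve."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()   # positive ranked above
    ties = (pos[:, None] == neg[None, :]).sum()     # ties count as one half
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
print(auc_from_scores(y_true, y_score))  # → 0.75
```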
5. System Reliability
Statistical Test
Paired t-test, Mann–Whitney, and Wilcoxon tests were performed to check the reliability of the system in all five input data scenarios. The p-value was less than 0.0001 in all five cases, which shows that our proposed system is highly reliable for real-world applications. The tests were performed using Python and MedCalc software. The tables in the Results section also report Mann–Whitney p-values for all TLs and EDLs using DC1, DC2, DC3, DC4, and DC5. Similarly, paired t-tests and Wilcoxon tests were performed, and their summaries are given in Table 17 and Table 18. Table 17 shows the three statistical tests (paired t-test, Mann–Whitney, and Wilcoxon) for the seven TL models (EfficientV2M, InceptionV3, MobileNetV2, ResNet152, ResNet50, VGG16, and VGG19). As seen in Table 17, all the TL models (TL1–TL7) exhibit p-values < 0.0001, which clearly demonstrates the TL models' reliability and stability under the null-hypothesis definition. Table 18 presents the three statistical tests for the five EDL models (EDL1–EDL5). As seen in Table 18, all the EDL models exhibit p-values < 0.0001, which clearly demonstrates the EDL models' reliability and stability under the null-hypothesis definition. Note that our results are consistent with our previous studies [80,121,122,123,124,125,126].
Table 17.
Summary of paired t-test, Mann–Whitney, and Wilcoxon tests for TL models using five data combinations.
Table 18.
Summary of paired t-test, Mann–Whitney, and Wilcoxon tests for EDL models using five data combinations.
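The three reliability tests can be reproduced in Python with SciPy; the paired accuracy scores below are synthetic illustrations (not the study's data), constructed so that the augmented runs score consistently higher:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical paired accuracy scores for one model across 30 runs,
# before and after augmentation (illustrative values only).
acc_without = rng.normal(0.90, 0.01, size=30)
acc_with = acc_without + rng.normal(0.03, 0.005, size=30)

t_p = stats.ttest_rel(acc_with, acc_without).pvalue      # paired t-test
mw_p = stats.mannwhitneyu(acc_with, acc_without).pvalue  # Mann–Whitney U
wx_p = stats.wilcoxon(acc_with, acc_without).pvalue      # Wilcoxon signed-rank
print(t_p < 0.0001, wx_p < 0.0001)
```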
6. Discussion
The proposed system has been trained on multicenter data using different acquisition machines and incorporated superior quality control techniques, class balancing using augmentation, and ResNet–UNet HDL segmentation. It uses seven types of TL classifiers and five types of EDL-based fusion to make accurate predictions. The model employed uniquely designed data systems, a generalized cross-validation protocol, and performance evaluation of HDL segmentation, TL classification, and EDL systems. It was also tested for reliability analysis and stability analysis and benchmarked against previous TL and EDL research work.
6.1. Principal Findings
Explainable transfer learning (TL) and ensemble deep learning (EDL) models accurately predicted the presence of COVID-19 in the Croatian and Italian datasets, supporting both hypotheses. This architecture demonstrated the behavior of the models under data augmentation and balancing: TL accuracy with augmentation and balancing outperformed that without augmentation. Some TL and EDL models outperformed the benchmark in most cases in accuracy, precision, recall, F1 score, and AUC. The proposed method, which uses ResNet–UNet for segmentation and TL and EDL models for classification, is a promising approach for identifying COVID-19 in CroMED (COVID), NovMED (COVID), and NovMED (control). It is a novel approach that uses HDL segmentation for ensemble-based classification. Overall, these findings suggest that ensemble deep learning models can be useful tools for identifying COVID-19 and controlling its spread. The unseen-data analyses in data combinations three and four (DC3 and DC4) show that this infrastructure could be used in real-world applications. The novelties can be summarized as (i) implementation of ResNet–UNet-based HDL segmentation; (ii) execution of seven types of TL classifiers and design of five types of EDL-based fusion; (iii) design of five types of data systems; (iv) generalized COVLIAS system design using unseen data; (v) reliability analysis; and (vi) stability analysis. The methods applied in this study have created an effective and robust system with better performance metrics than existing published models.
6.2. Benchmarking
We reviewed several papers and selected some recent ones for benchmarking; these papers include the COVID-CT dataset and the SARS-CoV-2 dataset [127,128,129,130,131,132,133,134,135,136,137]. Our proposed models used the CroMED (COVID), NovMED (COVID), and NovMED (control) datasets. Seven state-of-the-art transfer learning models, including DenseNet201, DenseNet169, DenseNet161, DenseNet121, VGG16, MobileNetV2, and EfficientNetV2M, have been used on COVID datasets and compared with our best proposed model based on MobileNetV2. We evaluated the models by their accuracy, precision, recall, F1 score, p-value, and AUC and compared the results to the previous benchmark studies. Our experimental results showed that our proposed method, which used MobileNetV2 on Dataset 1 (CroMED (COVID) and NovMED (control)), outperformed all other models, with an accuracy of 99.99%, precision and recall of 100%, F1 score of 100%, and AUC of 1.0; the comparison is presented in Table 19. The second-best model was DenseNet121 by Xu et al. [32], which achieved an accuracy of 99.44% on the COVIDx-CT 2A dataset. We also compared our best TL with other existing models proposed by Alshazly et al. [28], Cruz et al. [45], Shaik et al. [30], and Huang et al. [31], who achieved accuracies of 92.9%, 82.76%, 97.38%, and 95.66%, respectively. Our results demonstrate the effectiveness of TL in developing accurate and efficient models for COVID-19 diagnosis using CT images, and our findings highlight the importance of using larger and more diverse datasets for training DL models for medical image analysis. As with the TL model comparison, we also compared our proposed EDL models to state-of-the-art EDL models, evaluating them by accuracy, precision, recall, F1 score, and AUC against the previous benchmark studies.
The ensemble model, a combination of ResNet152 + MobileNetV2, outperformed all other models, with an accuracy of 99.99%, precision and recall of 100%, F1 score of 100%, AUC of 1.0, and a p-value of less than 0.0001. The second-best model, with an accuracy of 99.05% and an F1 score of 98.59%, was proposed by Toa et al. [35]. The other ensemble models, shown in Table 20, perform considerably worse than our proposed model; these EDL models were proposed by Pathan et al. [33], Kundu et al. [34], Cruz et al. [29], Shaik et al. [30], Khanibadi et al. [138], Lu et al. [139], and Huang et al. [31]. We also performed scientific validation, which is missing in the other models.
Table 19.
Transfer-learning-based models’ comparison.
Table 20.
Ensemble-deep-learning-based models’ comparison.
These TL and EDL results have proven that the novelties applied in this study (ResNet–UNet HDL segmentation, seven types of TL classifiers, five types of EDL-based fusion, five types of data systems, a generalized COVLIAS system design using unseen data, reliability analysis, and stability analysis) have created an effective and robust system with better performance metrics than existing published models.
6.3. A Special Note on EDL
Ensemble-based models can address some of the limitations and weaknesses in the current published research on COVID-19 and its control measures. Ensemble deep learning models combine multiple models to make more accurate predictions than any single model alone. This approach can improve the generalizability and robustness of predictions, which is particularly useful in the context of COVID-19 research. Ensemble models succeed when the amalgamation of features or predicted scores improves accuracy; however, if there is bias in the data, EDL models also struggle.
6.4. Strengths, Weaknesses, and Extensions
The study compares seven transfer learning and five ensemble deep learning models in predicting the presence of COVID-19, providing a comprehensive evaluation of different approaches. This work uses data augmentation and balanced data to improve the performance of the models, which can be a valuable technique for improving model accuracy. Our research outperforms the benchmark results in most cases, indicating that the proposed models are effective in predicting the presence of COVID-19. The study only uses three datasets, CroMED (COVID), NovMED (COVID), and NovMED (control), which limits the generalizability of the results. It does not compare the proposed models to other COVID-19 prediction models that may have been developed outside of the benchmark studies. Future work could investigate the impact of other segmentation methods on the accuracy of the models. Transformers could also be added for segmentation and detection of COVID-19 [140,141,142,143,144]. While the system is generalized, it lacks explainability of the AI models, so-called explainable AI (XAI). The system also lacks the superposition of heatmaps on the lung CT images, which could show where COVID-19 lesions are present, especially using the TL models applied to the HDL-segmented lung outputs. Previous systems have used heatmaps [5,121,122], but not in the cascaded framework of HDL + TL + EDL in the multicenter paradigm. Since the field of immunology discusses lung damage causing different kinds of pneumonia, the current paradigm of COVID/control binary classification can be extended to a multiclass framework. Our group has several studies that followed multiclass classification using an AI framework [145,146,147,148,149]. Our system can therefore be extended as we acquire clinical data for different kinds of pneumonia.
7. Conclusions
In this research work, we had two hypotheses. The first was that mean TL accuracy with augmentation is better than without augmentation, which was proven in all five input data scenarios: the DC1, DC2, DC3, DC4, and DC5 TL models with data augmentation and balancing improved mean accuracy by 3.32%, 6.56%, 12.96%, 47.1%, and 2.78%, respectively. The second was that weaker learners become stronger in the ensemble process, so that mean EDL accuracy would exceed mean TL accuracy, which is visible in the performance evaluation. The transfer learning models generated ROC curves, which are useful for identifying the better models. Three statistical tests showed p-values of less than 0.0001 for all models, indicating that the system is highly reliable. We also compared our results to the benchmark results on the COVID dataset. The ensemble model, a combination of ResNet152 and MobileNetV2, outperformed all other models, with an accuracy of 99.99%, precision and recall of 100%, F1 score of 100%, AUC of 1.0, and a p-value of less than 0.0001; the second-best benchmark model achieved 99.05% accuracy and a 98.59% F1 score. Our findings not only support both hypotheses but also show that the proposed methodology outperforms the benchmark performance indicators.
Some future work can also be considered. We used a soft-voting method in the ensemble process; fusing features before prediction is also an option. Statistical tests could then confirm system reliability, and heatmaps of the ensemble models could also be generated.
Author Contributions
Conceptualization, J.S.S. and A.K.D.; data curation, G.L.C., A.C., A.P., P.S.C.D., S.A., L.M., N., N.S. (Narpinder Singh), D.W.S., J.R.L., I.M.S., N.S. (Neeraj Sharma), G.T., M.M.F., A.A., G.D.K., N.N.K., K.V., M.K., M.A.-M., A.E.-B., M.K.K. and K.V.; formal analysis, J.S.S., N.N.K. and L.S.; investigation, J.S.S. and M.M.F.; methodology, J.S.S. and A.K.D.; project administration, J.S.S.; software, A.K.D. and S.A.; supervision, J.S.S. and L.S.; validation, A.K.D. and J.S.S.; visualization, A.K.D., A.J., S.Y. and S.A.; writing—original draft, A.K.D.; writing—review and editing, J.S.S., A.K.D., G.L.C., A.C., A.P., P.S.C.D., S.A., L.M., N., N.S. (Narpinder Singh), S.Y., A.J., A.K., M.K.K., D.W.S., J.R.L., I.M.S., N.S. (Neeraj Sharma), G.T., M.M.F., A.A., G.D.K., N.N.K., K.V., M.K., M.A.-M., A.E.-B. and L.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest. GBTI deals in lung image analysis and Jasjit S. Suri is affiliated with GBTI.
Appendix A
Appendix A includes three figures: Figure A1, Figure A2 and Figure A3. These are sample images from the datasets. Figure A1 shows CroMED (COVID), Figure A2 depicts NovMED (control), and Figure A3 shows NovMED (COVID).
Figure A1.
Raw “COVID-19 CT slices” taken from CroMED Dataset.

Figure A2.
Raw “Control CT slices” taken from NovMED Dataset.

Figure A3.
Raw “COVID-19 CT slices” taken from NovMED dataset.

Appendix B
Appendix B includes the seven fine-tuned transfer learning models in Figure A4, Figure A5, Figure A6, Figure A7, Figure A8, Figure A9 and Figure A10, showing TL1: EfficientV2M, TL2: InceptionV3, TL3: MobileNetV2, TL4: ResNet152, TL5: ResNet50, TL6: VGG16, and TL7: VGG19, respectively.
EfficientNetV2M, Figure A4, has 54.1 million parameters. By default, its input size is 480 × 480, and it is trained on the ImageNet dataset. It has a softmax activation function to classify one thousand classes. We removed the top layer; flattened the model output; and added three dense layers, two dropout layers, and L2 regularizers to avoid overfitting. The sigmoid activation function helps us to classify COVID and control classes.
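A minimal Keras sketch of this fine-tuning recipe follows, using MobileNetV2 as the backbone (the same pattern applies to all seven backbones). The dense-layer widths (256 and 128) are illustrative assumptions, and `weights=None` is used here only to avoid downloading the ImageNet weights that the paper uses:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# Backbone with the top (1000-class softmax) layer removed.
base = tf.keras.applications.MobileNetV2(
    include_top=False, weights=None, input_shape=(224, 224, 3))

# Flatten, then three dense layers with two dropout layers and L2
# regularizers, ending in a sigmoid head for COVID vs. control.
x = layers.Flatten()(base.output)
x = layers.Dense(256, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01))(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(128, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01))(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(2, activation="sigmoid")(x)
model = models.Model(base.input, outputs)
```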
Figure A4.
EfficientNetV2M transfer learning model.

InceptionV3, Figure A5, has 23.83 million parameters. By default, its input size is 299 × 299, and it is trained on the ImageNet dataset. It has a softmax activation function to classify one thousand classes. We removed the top layer; flattened the model output; and added three dense layers, two dropout layers, and L2 regularizers to avoid overfitting. The sigmoid activation function helps us to classify COVID and control classes.
Figure A5.
InceptionV3 transfer learning model.

MobileNetV2, Figure A6, has 4.3 million parameters. By default, its input size is 224 × 224, and it is trained on the ImageNet dataset. It has a softmax activation function to classify one thousand classes. We removed the top layer; flattened the model output; and added three dense layers, two dropout layers, and L2 regularizers to avoid overfitting. The sigmoid activation function helps us to classify COVID and control classes.
Figure A6.
MobileNetV2 transfer learning model.

ResNet50, Figure A7, has 25.56 million parameters. By default, its input size is 224 × 224, and it is trained on the ImageNet dataset. It has a softmax activation function to classify one thousand classes. We removed the top layer; flattened the model output; and added three dense layers, two dropout layers, and L2 regularizers to avoid overfitting. The sigmoid activation function helps us to classify COVID and control classes.
Figure A7.
ResNet50 transfer learning model.

ResNet152, Figure A8, has 60.19 million parameters. By default, its input size is 224 × 224, and it is trained on the ImageNet dataset. It has a softmax activation function to classify one thousand classes. We removed the top layer; flattened the model output; and added three dense layers, two dropout layers, and L2 regularizers to avoid overfitting. The sigmoid activation function helps us to classify COVID and control classes.
Figure A8.
ResNet152 transfer learning model.

VGG16, Figure A9, has 138.3 million parameters. By default, its input size is 224 × 224, and it is trained on the ImageNet dataset. It has a softmax activation function to classify one thousand classes. We removed the top layer; flattened the model output; and added three dense layers, two dropout layers, and L2 regularizers to avoid overfitting. The sigmoid activation function helps us to classify COVID and control classes.
Figure A9.
VGG16 transfer learning model.

VGG19, Figure A10, has 143.7 million parameters. By default, its input size is 224 × 224, and it is trained on the ImageNet dataset. It has a softmax activation function to classify one thousand classes. We removed the top layer; flattened the model output; and added three dense layers, two dropout layers, and L2 regularizers to avoid overfitting. The sigmoid activation function helps us to classify COVID and control classes.
Figure A10.
VGG19 transfer learning model.

Appendix C
In this section, Figure A11, Figure A12, Figure A13 and Figure A14 depict the ROC curves of the EDLs using DC2, DC3, DC4, and DC5, respectively. The ROC curves of the EDLs using DC1 were already presented in the Results section. These ROC curves indicate that the mean AUC of the EDLs is better than that of the TL models.

Figure A11.
ROC of five EDLs using DC2 with augmentation.


Figure A12.
ROC of five EDLs using DC3 with augmentation.

Figure A13.
ROC of five EDLs using DC4 with augmentation.

Figure A14.
ROC of five EDLs using DC5 with augmentation.

References
- Congiu, T.; Demontis, R.; Cau, F.; Piras, M.; Fanni, D.; Gerosa, C.; Botta, C.; Scano, A.; Chighine, A.; Faedda, E. Scanning electron microscopy of lung disease due to COVID-19—A case report and a review of the literature. Eur. Rev. Med. Pharmacol. Sci. 2021, 25, 7997–8003. [Google Scholar]
- Suri, J.S.; Agarwal, S.; Elavarthi, P.; Pathak, R.; Ketireddy, V.; Columbu, M.; Saba, L.; Gupta, S.K.; Faa, G.; Singh, I.M.; et al. Inter-Variability Study of COVLIAS 1.0: Hybrid Deep Learning Models for COVID-19 Lung Segmentation in Computed Tomography. Diagnostics 2021, 11, 2025. [Google Scholar] [CrossRef] [PubMed]
- Suri, J.S.; Agarwal, S.; Gupta, S.; Puvvula, A.; Viskovic, K.; Suri, N.; Alizad, A.; El-Baz, A.; Saba, L.; Fatemi, M.; et al. Systematic Review of Artificial Intelligence in Acute Respiratory Distress Syndrome for COVID-19 Lung Patients: A Biomedical Imaging Perspective. IEEE J. Biomed. Health Inf. 2021, 25, 4128–4139. [Google Scholar] [CrossRef] [PubMed]
- Munjral, S.; Ahluwalia, P.; Jamthikar, A.D.; Puvvula, A.; Saba, L.; Faa, G.; Singh, I.M.; Chadha, P.S.; Turk, M.; Johri, A.M. Nutrition, atherosclerosis, arterial imaging, cardiovascular risk stratification, and manifestations in COVID-19 framework: A narrative review. Front. Biosci. 2021, 26, 1312–1339. [Google Scholar]
- Sanagala, S.S.; Nicolaides, A.; Gupta, S.K.; Koppula, V.K.; Saba, L.; Agarwal, S.; Johri, A.M.; Kalra, M.S.; Suri, J.S. Ten Fast Transfer Learning Models for Carotid Ultrasound Plaque Tissue Characterization in Augmentation Framework Embedded with Heatmaps for Stroke Risk Stratification. Diagnostics 2021, 11, 2109. [Google Scholar] [CrossRef] [PubMed]
- Jena, B.; Saxena, S.; Nayak, G.K.; Saba, L.; Sharma, N.; Suri, J.S. Artificial intelligence-based hybrid deep learning models for image classification: The first narrative review. Comput. Biol. Med. 2021, 137, 104803. [Google Scholar] [CrossRef]
- Gerosa, C.; Faa, G.; Fanni, D.; Manchia, M.; Suri, J.; Ravarino, A.; Barcellona, D.; Pichiri, G.; Coni, P.; Congiu, T. Fetal programming of COVID-19: May the barker hypothesis explain the susceptibility of a subset of young adults to develop severe disease. Eur. Rev. Med. Pharmacol. Sci. 2021, 25, 5876–5884. [Google Scholar]
- Suri, J.S.; Agarwal, S.; Saba, L.; Chabert, G.L.; Carriero, A.; Paschè, A.; Danna, P.; Mehmedović, A.; Faa, G.; Jujaray, T. Multicenter Study on COVID-19 Lung Computed Tomography Segmentation with varying Glass Ground Opacities using Unseen Deep Learning Artificial Intelligence Paradigms: COVLIAS 1.0 Validation. J. Med. Syst. 2022, 46, 62. [Google Scholar] [CrossRef]
- Suri, J.S.; Puvvula, A.; Biswas, M.; Majhail, M.; Saba, L.; Faa, G.; Singh, I.M.; Oberleitner, R.; Turk, M.; Chadha, P.S. COVID-19 pathways for brain and heart injury in comorbidity patients: A role of medical imaging and artificial intelligence-based COVID severity classification: A review. Comput. Biol. Med. 2020, 124, 103960. [Google Scholar] [CrossRef]
- Shalbaf, A.; Vafaeezadeh, M. Automated detection of COVID-19 using ensemble of transfer learning with deep convolutional neural network based on CT scans. Int. J. Comput. Assist. Radiol. Surg. 2021, 16, 115–123. [Google Scholar]
- Suri, J.S.; Maindarkar, M.A.; Paul, S.; Ahluwalia, P.; Bhagawati, M.; Saba, L.; Faa, G.; Saxena, S.; Singh, I.M.; Chadha, P.S. Deep Learning Paradigm for Cardiovascular Disease/Stroke Risk Stratification in Parkinson’s Disease Affected by COVID-19: A Narrative Review. Diagnostics 2022, 12, 1543. [Google Scholar] [CrossRef] [PubMed]
- Saba, L.; Sanagala, S.S.; Gupta, S.K.; Koppula, V.K.; Laird, J.R.; Viswanathan, V.; Sanches, M.J.; Kitas, G.D.; Johri, A.M.; Sharma, N. A Multicenter Study on Carotid Ultrasound Plaque Tissue Characterization and Classification Using Six Deep Artificial Intelligence Models: A Stroke Application. IEEE Trans. Instrum. Meas. 2021, 70, 1–12. [Google Scholar] [CrossRef]
- Skandha, S.S.; Nicolaides, A.; Gupta, S.K.; Koppula, V.K.; Saba, L.; Johri, A.M.; Kalra, M.S.; Suri, J.S. A hybrid deep learning paradigm for carotid plaque tissue characterization and its validation in multicenter cohorts using a supercomputer framework. Comput. Biol. Med. 2022, 141, 105131. [Google Scholar] [CrossRef]
- Saba, L.; Biswas, M.; Suri, H.S.; Viskovic, K.; Laird, J.R.; Cuadrado-Godia, E.; Nicolaides, A.; Khanna, N.N.; Viswanathan, V.; Suri, J.S. Ultrasound-based carotid stenosis measurement and risk stratification in diabetic cohort: A deep learning paradigm. Cardiovasc. Diagn. Ther. 2019, 9, 439–461. [Google Scholar] [CrossRef] [PubMed]
- Agarwal, M.; Saba, L.; Gupta, S.K.; Carriero, A.; Falaschi, Z.; Paschè, A.; Danna, P.; El-Baz, A.; Naidu, S.; Suri, J.S. A novel block imaging technique using nine artificial intelligence models for COVID-19 disease classification, characterization and severity measurement in lung computed tomography scans on an Italian cohort. J. Med. Syst. 2021, 45, 28. [Google Scholar] [CrossRef] [PubMed]
- Biswas, M.; Kuppili, V.; Saba, L.; Edla, D.R.; Suri, H.S.; Cuadrado-Godia, E.; Laird, J.R.; Marinhoe, R.T.; Sanches, J.M.; Nicolaides, A.; et al. State-of-the-art review on deep learning in medical imaging. Front. Biosci. 2019, 24, 392–426. [Google Scholar]
- LeCun, Y.; Denker, J.; Solla, S. Optimal brain damage. In Advances in Neural Information Processing Systems, Proceedings of the Neural Information Processing Systems Conference, Denver, CO, USA, 27–30 November 1989; Massachusetts Institute of Technology Press: Cambridge, MA, USA, 1989; Volume 2. [Google Scholar]
- Taylor, R. Interpretation of the correlation coefficient: A basic review. J. Diagn. Med. Sonogr. 1990, 6, 35–39. [Google Scholar] [CrossRef]
- Kozek, T.; Roska, T.; Chua, L.O. Genetic algorithm for CNN template learning. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 1993, 40, 392–402. [Google Scholar] [CrossRef]
- Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; IEEE: Washington, DC, USA, 1995; pp. 1942–1948. [Google Scholar]
- Acharya, U.R.; Kannathal, N.; Ng, E.; Min, L.C.; Suri, J.S. Computer-based classification of eye diseases. In Proceedings of the 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New York, NY, USA, 30 August–3 September 2006; IEEE: Washington, DC, USA, 2006; pp. 6121–6124. [Google Scholar]
- Molinari, F.; Liboni, W.; Pavanelli, E.; Giustetto, P.; Badalamenti, S.; Suri, J.S. Accurate and automatic carotid plaque characterization in contrast enhanced 2-D ultrasound images. In Proceedings of the 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Lyon, France, 22–26 August 2007; IEEE: Washington, DC, USA, 2007; pp. 335–338. [Google Scholar]
- Acharya, U.R.; Faust, O.; Sree, S.V.; Alvin, A.P.C.; Krishnamurthi, G.; Sanches, J.; Suri, J.S. Atheromatic™: Symptomatic vs. asymptomatic classification of carotid ultrasound plaque using a combination of HOS, DWT & texture. In Proceedings of the 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Boston, MA, USA, 30 August–3 September 2011; IEEE: Washington, DC, USA, 2011; pp. 4489–4492. [Google Scholar]
- El-Baz, A.; Gimel’farb, G.; Suri, J.S. Stochastic Modeling for Medical Image Analysis, 1st ed.; CRC Press: Boca Raton, FL, USA, 2015. [Google Scholar]
- Murgia, A.; Erta, M.; Suri, J.S.; Gupta, A.; Wintermark, M.; Saba, L. CT imaging features of carotid artery plaque vulnerability. Ann. Transl. Med. 2020, 8, 1261. [Google Scholar] [CrossRef]
- Gozes, O.; Frid-Adar, M.; Greenspan, H.; Browning, P.D.; Zhang, H.; Ji, W.; Bernheim, A.; Siegel, E. Rapid ai development cycle for the coronavirus (COVID-19) pandemic: Initial results for automated detection & patient monitoring using deep learning ct image analysis. arXiv 2020, arXiv:2003.05037. [Google Scholar]
- Li, C.; Yang, Y.; Liang, H.; Wu, B. Transfer learning for establishment of recognition of COVID-19 on CT imaging using small-sized training datasets. Knowl.-Based Syst. 2021, 218, 106849. [Google Scholar] [CrossRef]
- Alshazly, H.; Linse, C.; Barth, E.; Martinetz, T. Explainable COVID-19 detection using chest CT scans and deep learning. Sensors 2021, 21, 455. [Google Scholar] [CrossRef] [PubMed]
- Cruz, J.F.H.S. An ensemble approach for multi-stage transfer learning models for COVID-19 detection from chest CT scans. Intell.-Based Med. 2021, 5, 100027. [Google Scholar] [CrossRef] [PubMed]
- Shaik, N.S.; Cherukuri, T.K. Transfer learning based novel ensemble classifier for COVID-19 detection from chest CT-scans. Comput. Biol. Med. 2022, 141, 105127. [Google Scholar] [CrossRef] [PubMed]
- Huang, M.-L.; Liao, Y.-C. Stacking Ensemble and ECA-EfficientNetV2 Convolutional Neural Networks on Classification of Multiple Chest Diseases Including COVID-19. Acad. Radiol. 2022, in press. [CrossRef] [PubMed]
- Xu, Y.; Lam, H.-K.; Jia, G.; Jiang, J.; Liao, J.; Bao, X. Improving COVID-19 CT classification of CNNs by learning parameter-efficient representation. Comput. Biol. Med. 2023, 152, 106417. [Google Scholar] [CrossRef] [PubMed]
- Pathan, S.; Siddalingaswamy, P.; Kumar, P.; MM, M.P.; Ali, T.; Acharya, U.R. Novel ensemble of optimized CNN and dynamic selection techniques for accurate COVID-19 screening using chest CT images. Comput. Biol. Med. 2021, 137, 104835. [Google Scholar] [CrossRef]
- Kundu, R.; Singh, P.K.; Mirjalili, S.; Sarkar, R. COVID-19 detection from lung CT-Scans using a fuzzy integral-based CNN ensemble. Comput. Biol. Med. 2021, 138, 104895. [Google Scholar] [CrossRef]
- Zhou, T.; Lu, H.; Yang, Z.; Qiu, S.; Huo, B.; Dong, Y. The ensemble deep learning model for novel COVID-19 on CT images. Appl. Soft Comput. 2021, 98, 106885. [Google Scholar] [CrossRef] [PubMed]
- Tang, S.; Wang, C.; Nie, J.; Kumar, N.; Zhang, Y.; Xiong, Z.; Barnawi, A. EDL-COVID: Ensemble deep learning for COVID-19 case detection from chest X-ray images. IEEE Trans. Ind. Inform. 2021, 17, 6539–6549. [Google Scholar] [CrossRef]
- Ray, E.L.; Wattanachit, N.; Niemi, J.; Kanji, A.H.; House, K.; Cramer, E.Y.; Bracher, J.; Zheng, A.; Yamana, T.K.; Xiong, X. Ensemble forecasts of coronavirus disease 2019 (COVID-19) in the US. medRxiv 2020. [Google Scholar] [CrossRef]
- Batra, R.; Chan, H.; Kamath, G.; Ramprasad, R.; Cherukara, M.J.; Sankaranarayanan, S.K. Screening of therapeutic agents for COVID-19 using machine learning and ensemble docking studies. J. Phys. Chem. Lett. 2020, 11, 7058–7065. [Google Scholar] [CrossRef] [PubMed]
- Aversano, L.; Bernardi, M.L.; Cimitile, M.; Pecori, R. Deep neural networks ensemble to detect COVID-19 from CT scans. Pattern Recognit. Lett. 2021, 120, 108135. [Google Scholar] [CrossRef] [PubMed]
- Al, A.; Kabir, M.R.; Ar, A.M.; Nishat, M.M.; Faisal, F. COVID-EnsembleNet: An ensemble based approach for detecting COVID-19 by utilising chest X-Ray images. In Proceedings of the 2022 IEEE World AI IoT Congress (AIIoT), Seattle, WA, USA, 6–9 June 2022; IEEE: Washington, DC, USA, 2022; pp. 351–356. [Google Scholar]
- Hosni, M.; Abnane, I.; Idri, A.; de Gea, J.M.C.; Alemán, J.L.F. Reviewing ensemble classification methods in breast cancer. Comput. Methods Programs Biomed. 2019, 177, 89–112. [Google Scholar] [CrossRef]
- Lavanya, D.; Rani, K.U. Ensemble decision tree classifier for breast cancer data. Int. J. Inf. Technol. Converg. Serv. 2012, 2, 17–24. [Google Scholar] [CrossRef]
- Wang, H.; Zheng, B.; Yoon, S.W.; Ko, H.S. A support vector machine-based ensemble algorithm for breast cancer diagnosis. Eur. J. Oper. Res. 2018, 267, 687–699. [Google Scholar] [CrossRef]
- Abdar, M.; Zomorodi-Moghadam, M.; Zhou, X.; Gururajan, R.; Tao, X.; Barua, P.D.; Gururajan, R. A new nested ensemble technique for automated diagnosis of breast cancer. Pattern Recognit. Lett. 2020, 132, 123–131. [Google Scholar] [CrossRef]
- Hsieh, S.-L.; Hsieh, S.-H.; Cheng, P.-H.; Chen, C.-H.; Hsu, K.-P.; Lee, I.-S.; Wang, Z.; Lai, F. Design ensemble machine learning model for breast cancer diagnosis. J. Med. Syst. 2012, 36, 2841–2847. [Google Scholar] [CrossRef]
- Hu, J.; Gu, X.; Gu, X. Mutual ensemble learning for brain tumor segmentation. Neurocomputing 2022, 504, 68–81. [Google Scholar] [CrossRef]
- Noreen, N.; Palaniappan, S.; Qayyum, A.; Ahmad, I.; Alassafi, M.O. Brain tumor classification based on fine-tuned models and the ensemble method. Comput. Mater. Contin. 2021, 67, 3967–3982. [Google Scholar] [CrossRef]
- Kang, J.; Ullah, Z.; Gwak, J. MRI-based brain tumor classification using ensemble of deep features and machine learning classifiers. Sensors 2021, 21, 2222. [Google Scholar] [CrossRef]
- Younis, A.; Qiang, L.; Nyatega, C.O.; Adamu, M.J.; Kawuwa, H.B. Brain tumor analysis using deep learning and VGG-16 ensembling learning approaches. Appl. Sci. 2022, 12, 7282. [Google Scholar] [CrossRef]
- Almulihi, A.; Saleh, H.; Hussien, A.M.; Mostafa, S.; El-Sappagh, S.; Alnowaiser, K.; Ali, A.A.; Hassan, M. Ensemble Learning Based on Hybrid Deep Learning Model for Heart Disease Early Prediction. Diagnostics 2022, 12, 3215. [Google Scholar] [CrossRef] [PubMed]
- Karadeniz, T.; Maraş, H.H.; Tokdemir, G.; Ergezer, H. Two Majority Voting Classifiers Applied to Heart Disease Prediction. Appl. Sci. 2023, 13, 3767. [Google Scholar] [CrossRef]
- Fitriyani, N.L.; Syafrudin, M.; Alfian, G.; Rhee, J. Development of disease prediction model based on ensemble learning approach for diabetes and hypertension. IEEE Access 2019, 7, 144777–144789. [Google Scholar] [CrossRef]
- Reddy, G.T.; Bhattacharya, S.; Ramakrishnan, S.S.; Chowdhary, C.L.; Hakak, S.; Kaluri, R.; Reddy, M.P.K. An ensemble based machine learning model for diabetic retinopathy classification. In Proceedings of the 2020 International Conference on Emerging Trends in Information Technology and Engineering (IC-ETITE), Vellore, India, 24–25 February 2020; IEEE: Washington, DC, USA, 2020; pp. 1–6. [Google Scholar]
- Jiang, H.; Yang, K.; Gao, M.; Zhang, D.; Ma, H.; Qian, W. An interpretable ensemble deep learning model for diabetic retinopathy disease classification. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; IEEE: Washington, DC, USA, 2019; pp. 2045–2048. [Google Scholar]
- Mahesh, T.; Kumar, D.; Kumar, V.V.; Asghar, J.; Bazezew, B.M.; Natarajan, R.; Vivek, V. Blended ensemble learning prediction model for strengthening diagnosis and treatment of chronic diabetes disease. Comput. Intell. Neurosci. 2022, 2022, 4451792. [Google Scholar] [CrossRef] [PubMed]
- Antal, B.; Hajdu, A. An ensemble-based system for automatic screening of diabetic retinopathy. Knowl.-Based Syst. 2014, 60, 20–27. [Google Scholar] [CrossRef]
- Thirion, J.-P.; Prima, S.; Subsol, G.; Roberts, N. Statistical analysis of normal and abnormal dissymmetry in volumetric medical images. Med. Image Anal. 2000, 4, 111–121. [Google Scholar] [CrossRef]
- Cootes, T.F.; Taylor, C.J. Statistical models of appearance for medical image analysis and computer vision. In Medical Imaging 2001: Image Processing; SPIE: Philadelphia, PA, USA, 2001; pp. 236–248. [Google Scholar]
- Heimann, T.; Meinzer, H.-P. Statistical shape models for 3D medical image segmentation: A review. Med. Image Anal. 2009, 13, 543–563. [Google Scholar] [CrossRef]
- Duncan, J.S.; Ayache, N. Medical image analysis: Progress over two decades and the challenges ahead. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 85–106. [Google Scholar] [CrossRef]
- Tang, L.L.; Balakrishnan, N. A random-sum Wilcoxon statistic and its application to analysis of ROC and LROC data. J. Stat. Plan. Inference 2011, 141, 335–344. [Google Scholar] [CrossRef] [PubMed]
- Pérez, N.P.; López, M.A.G.; Silva, A.; Ramos, I. Improving the Mann–Whitney statistical test for feature selection: An approach in breast cancer diagnosis on mammography. Artif. Intell. Med. 2015, 63, 19–31. [Google Scholar] [CrossRef] [PubMed]
- Martínez-Murcia, F.J.; Górriz, J.M.; Ramirez, J.; Puntonet, C.G.; Salas-Gonzalez, D.; Alzheimer’s Disease Neuroimaging Initiative. Computer aided diagnosis tool for Alzheimer’s disease based on Mann–Whitney–Wilcoxon U-test. Expert Syst. Appl. 2012, 39, 9676–9685. [Google Scholar] [CrossRef]
- Lin, Y.; Su, J.; Li, Y.; Wei, Y.; Yan, H.; Zhang, S.; Luo, J.; Ai, D.; Song, H.; Fan, J. High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss. arXiv 2022, arXiv:2211.02419. [Google Scholar]
- Chin, M.H.; Muramatsu, N. What is the quality of quality of medical care measures?: Rashomon-like relativism and real-world applications. Perspect. Biol. Med. 2003, 46, 5–20. [Google Scholar] [CrossRef] [PubMed]
- Li, L.; Wang, P.; Yan, J.; Wang, Y.; Li, S.; Jiang, J.; Sun, Z.; Tang, B.; Chang, T.-H.; Wang, S. Real-world data medical knowledge graph: Construction and applications. Artif. Intell. Med. 2020, 103, 101817. [Google Scholar] [CrossRef] [PubMed]
- Zhao, W.; Wang, C.; Nakahira, Y. Medical application on internet of things. In Proceedings of the IET international conference on communication technology and application (ICCTA 2011), Beijing, China, 14–16 October 2011; IET: London, UK, 2011; pp. 660–665. [Google Scholar]
- Endrei, D.; Molics, B.; Ágoston, I. Multicriteria decision analysis in the reimbursement of new medical technologies: Real-world experiences from Hungary. Value Health 2014, 17, 487–489. [Google Scholar] [CrossRef]
- Hudson, J.N. Computer-aided learning in the real world of medical education: Does the quality of interaction with the computer affect student learning? Med. Educ. 2004, 38, 887–895. [Google Scholar] [CrossRef]
- Attallah, O. RADIC: A tool for diagnosing COVID-19 from chest CT and X-ray scans using deep learning and quad-radiomics. Chemom. Intell. Lab. Syst. 2023, 233, 104750. [Google Scholar] [CrossRef]
- Attallah, O.; Ragab, D.A.; Sharkas, M. MULTI-DEEP: A novel CAD system for coronavirus (COVID-19) diagnosis from CT images using multiple convolution neural networks. PeerJ 2020, 8, e10086. [Google Scholar] [CrossRef]
- Mercaldo, F.; Belfiore, M.P.; Reginelli, A.; Brunese, L.; Santone, A. Coronavirus COVID-19 detection by means of explainable deep learning. Sci. Rep. 2023, 13, 462. [Google Scholar] [CrossRef] [PubMed]
- Shah, V.; Keniya, R.; Shridharani, A.; Punjabi, M.; Shah, J.; Mehendale, N. Diagnosis of COVID-19 using CT scan images and deep learning techniques. Emerg. Radiol. 2021, 28, 497–505. [Google Scholar] [CrossRef] [PubMed]
- Attallah, O.; Samir, A. A wavelet-based deep learning pipeline for efficient COVID-19 diagnosis via CT slices. Appl. Soft Comput. 2022, 128, 109401. [Google Scholar] [CrossRef] [PubMed]
- Kini, A.S.; Reddy, A.N.G.; Kaur, M.; Satheesh, S.; Singh, J.; Martinetz, T.; Alshazly, H. Ensemble deep learning and internet of things-based automated COVID-19 diagnosis framework. Contrast Media Mol. Imaging 2022, 2022, 7377502. [Google Scholar] [CrossRef]
- Li, X.; Tan, W.; Liu, P.; Zhou, Q.; Yang, J. Classification of COVID-19 chest CT images based on ensemble deep learning. J. Healthc. Eng. 2021, 2021, 5528441. [Google Scholar] [CrossRef]
- Yang, L.; Wang, S.-H.; Zhang, Y.-D. EDNC: Ensemble deep neural network for COVID-19 recognition. Tomography 2022, 8, 869–890. [Google Scholar] [CrossRef]
- Ragab, D.A.; Attallah, O. FUSI-CAD: Coronavirus (COVID-19) diagnosis based on the fusion of CNNs and handcrafted features. PeerJ Comput. Sci. 2020, 6, e306. [Google Scholar] [CrossRef] [PubMed]
- Suri, J.S.; Agarwal, S.; Pathak, R.; Ketireddy, V.; Columbu, M.; Saba, L.; Gupta, S.K.; Faa, G.; Singh, I.M.; Turk, M. COVLIAS 1.0: Lung Segmentation in COVID-19 Computed Tomography Scans Using Hybrid Deep Learning Artificial Intelligence Models. Diagnostics 2021, 11, 1405. [Google Scholar] [CrossRef] [PubMed]
- Suri, J.S.; Agarwal, S.; Chabert, G.L.; Carriero, A.; Paschè, A.; Danna, P.S.; Saba, L.; Mehmedović, A.; Faa, G.; Singh, I.M. COVLIAS 2.0-cXAI: Cloud-based explainable deep learning system for COVID-19 lesion localization in computed tomography scans. Diagnostics 2022, 12, 1482. [Google Scholar] [CrossRef]
- Ibrahim, D.A.; Zebari, D.A.; Mohammed, H.J.; Mohammed, M.A. Effective hybrid deep learning model for COVID-19 patterns identification using CT images. Expert Syst. 2022, 39, e13010. [Google Scholar] [CrossRef] [PubMed]
- Afshar, P.; Heidarian, S.; Naderkhani, F.; Rafiee, M.J.; Oikonomou, A.; Plataniotis, K.N.; Mohammadi, A. Hybrid deep learning model for diagnosis of COVID-19 using CT scans and clinical/demographic data. In Proceedings of the 2021 IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 19–22 September 2021; IEEE: Washington, DC, USA, 2021; pp. 180–184. [Google Scholar]
- Chola, C.; Mallikarjuna, P.; Muaad, A.Y.; Benifa, J.B.; Hanumanthappa, J.; Al-antari, M.A. A hybrid deep learning approach for COVID-19 diagnosis via CT and X-ray medical images. Comput. Sci. Math. Forum 2021, 2, 13. [Google Scholar]
- Liang, S.; Zhang, W.; Gu, Y. A hybrid and fast deep learning framework for COVID-19 detection via 3D Chest CT Images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 508–512. [Google Scholar]
- R-Prabha, M.; Prabhu, R.; Suganthi, S.; Sridevi, S.; Senthil, G.; Babu, D.V. Design of hybrid deep learning approach for COVID-19 infected lung image segmentation. J. Phys. Conf. Ser. 2021, 2040, 012016. [Google Scholar] [CrossRef]
- Wu, D.; Gong, K.; Arru, C.D.; Homayounieh, F.; Bizzo, B.; Buch, V.; Ren, H.; Kim, K.; Neumark, N.; Xu, P. Severity and consolidation quantification of COVID-19 from CT images using deep learning based on hybrid weak labels. IEEE J. Biomed. Health Inform. 2020, 24, 3529–3538. [Google Scholar] [CrossRef] [PubMed]
- Loey, M.; Manogaran, G.; Khalifa, N.E.M. A deep transfer learning model with classical data augmentation and CGAN to detect COVID-19 from chest CT radiography digital images. Neural Comput. Appl. 2020, 1–13, online ahead of print. [Google Scholar] [CrossRef] [PubMed]
- Ahuja, S.; Panigrahi, B.K.; Dey, N.; Rajinikanth, V.; Gandhi, T.K. Deep transfer learning-based automated detection of COVID-19 from lung CT scan slices. Appl. Intell. 2021, 51, 571–585. [Google Scholar] [CrossRef]
- Arora, V.; Ng, E.Y.-K.; Leekha, R.S.; Darshan, M.; Singh, A. Transfer learning-based approach for detecting COVID-19 ailment in lung CT scan. Comput. Biol. Med. 2021, 135, 104575. [Google Scholar] [CrossRef]
- Das, S.; Nayak, G.K.; Saba, L.; Kalra, M.; Suri, J.S.; Saxena, S. An artificial intelligence framework and its bias for brain tumor segmentation: A narrative review. Comput. Biol. Med. 2022, 143, 105273. [Google Scholar] [CrossRef]
- Alakus, T.B.; Turkoglu, I. Comparison of deep learning approaches to predict COVID-19 infection. Chaos Solitons Fractals 2020, 140, 110120. [Google Scholar] [CrossRef] [PubMed]
- Oh, Y.; Park, S.; Ye, J.C. Deep learning COVID-19 features on CXR using limited training data sets. IEEE Trans. Med. Imaging 2020, 39, 2688–2700. [Google Scholar] [CrossRef]
- Liu, H.; Shao, M.; Li, S.; Fu, Y. Infinite ensemble for image clustering. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1745–1754. [Google Scholar]
- Orchard, J.; Mann, R. Registering a multisensor ensemble of images. IEEE Trans. Image Process. 2009, 19, 1236–1247. [Google Scholar] [CrossRef]
- Jiang, Y.; Zhou, Z.-H. SOM ensemble-based image segmentation. Neural Process. Lett. 2004, 20, 171–178. [Google Scholar] [CrossRef]
- Chaeikar, S.S.; Ahmadi, A. Ensemble SW image steganalysis: A low dimension method for LSBR detection. Signal Process. Image Commun. 2019, 70, 233–245. [Google Scholar] [CrossRef]
- Li, B.; Goh, K. Confidence-based dynamic ensemble for image annotation and semantics discovery. In Proceedings of the Eleventh ACM International Conference on Multimedia, Berkeley, CA, USA, 2–8 November 2003; pp. 195–206. [Google Scholar]
- Varol, E.; Gaonkar, B.; Erus, G.; Schultz, R.; Davatzikos, C. Feature ranking based nested support vector machine ensemble for medical image classification. In Proceedings of the 2012 9th IEEE international symposium on biomedical imaging (ISBI), Barcelona, Spain, 2–5 May 2012; IEEE: Washington, DC, USA, 2012; pp. 146–149. [Google Scholar]
- Wu, S.; Zhang, H.; Valiant, G.; Ré, C. On the generalization effects of linear transformations in data augmentation. In Proceedings of the International Conference on Machine Learning, Virtual, 13–18 July 2020; pp. 10410–10420. [Google Scholar]
- Perez, L.; Wang, J. The effectiveness of data augmentation in image classification using deep learning. arXiv 2017, arXiv:1712.04621. [Google Scholar]
- Van Dyk, D.A.; Meng, X.-L. The art of data augmentation. J. Comput. Graph. Stat. 2001, 10, 1–50. [Google Scholar] [CrossRef]
- Aquino, N.R.; Gutoski, M.; Hattori, L.T.; Lopes, H.S. The effect of data augmentation on the performance of convolutional neural networks. J. Braz. Comput. Soc. 2017. [Google Scholar]
- Tellez, D.; Litjens, G.; Bándi, P.; Bulten, W.; Bokhorst, J.-M.; Ciompi, F.; Van Der Laak, J. Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology. Med. Image Anal. 2019, 58, 101544. [Google Scholar] [CrossRef] [PubMed]
- Parmar, C.; Barry, J.D.; Hosny, A.; Quackenbush, J.; Aerts, H.J. Data Analysis Strategies in Medical Imaging. Clin. Cancer Res. 2018, 24, 3492–3499. [Google Scholar] [CrossRef]
- Lai, M. Deep learning for medical image segmentation. arXiv 2015, arXiv:1505.02000. [Google Scholar]
- Abdollahi, B.; Tomita, N.; Hassanpour, S. Data augmentation in training deep learning models for medical image analysis. In Deep Learners and Deep Learner Descriptors for Medical Applications; Springer: Cham, Switzerland, 2020; pp. 167–180. [Google Scholar]
- Chen, H.; Sung, J.J. Potentials of AI in medical image analysis in Gastroenterology and Hepatology. J. Gastroenterol. Hepatol. 2021, 36, 31–38. [Google Scholar] [CrossRef] [PubMed]
- Roth, H.R.; Oda, H.; Zhou, X.; Shimizu, N.; Yang, Y.; Hayashi, Y.; Oda, M.; Fujiwara, M.; Misawa, K.; Mori, K. An application of cascaded 3D fully convolutional networks for medical image segmentation. Comput. Med. Imaging Graph. 2018, 66, 90–99. [Google Scholar] [CrossRef]
- Gu, R.; Zhang, J.; Huang, R.; Lei, W.; Wang, G.; Zhang, S. Domain composition and attention for unseen-domain generalizable medical image segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, 27 September–1 October 2021; Part III 24. Springer: Berlin/Heidelberg, Germany, 2021; pp. 241–250. [Google Scholar]
- Kumar, A.; Kim, J.; Lyndon, D.; Fulham, M.; Feng, D. An ensemble of fine-tuned convolutional neural networks for medical image classification. IEEE J. Biomed. Health Inform. 2016, 21, 31–40. [Google Scholar] [CrossRef]
- Ghesu, F.C.; Georgescu, B.; Mansoor, A.; Yoo, Y.; Gibson, E.; Vishwanath, R.; Balachandran, A.; Balter, J.M.; Cao, Y.; Singh, R. Quantifying and leveraging predictive uncertainty for medical image assessment. Med. Image Anal. 2021, 68, 101855. [Google Scholar] [CrossRef]
- Adams, M.; Chen, W.; Holcdorf, D.; McCusker, M.W.; Howe, P.D.; Gaillard, F. Computer vs. human: Deep learning versus perceptual training for the detection of neck of femur fractures. J. Med. Imaging Radiat. Oncol. 2019, 63, 27–32. [Google Scholar] [CrossRef]
- González, G.; Washko, G.R.; Estépar, R.S.J. Deep learning for biomarker regression: Application to osteoporosis and emphysema on chest CT scans. In Medical Imaging 2018: Image Processing; SPIE: Philadelphia, PA, USA, 2018; pp. 372–378. [Google Scholar]
- Li, P.; Liu, Q.; Tang, D.; Zhu, Y.; Xu, L.; Sun, X.; Song, S. Lesion based diagnostic performance of dual phase 99m Tc-MIBI SPECT/CT imaging and ultrasonography in patients with secondary hyperparathyroidism. BMC Med. Imaging 2017, 17, 60. [Google Scholar] [CrossRef] [PubMed]
- Zhou, X.; Ma, C.; Wang, Z.; Liu, J.-L.; Rui, Y.-P.; Li, Y.-H.; Peng, Y.-F. Effect of region of interest on ADC and interobserver variability in thyroid nodules. BMC Med. Imaging 2019, 19, 55. [Google Scholar] [CrossRef]
- Shia, W.-C.; Chen, D.-R. Classification of malignant tumors in breast ultrasound using a pretrained deep residual network model and support vector machine. Comput. Med. Imaging Graph. 2021, 87, 101829. [Google Scholar] [CrossRef] [PubMed]
- Lahoud, P.; EzEldeen, M.; Beznik, T.; Willems, H.; Leite, A.; Van Gerven, A.; Jacobs, R. Artificial intelligence for fast and accurate 3-dimensional tooth segmentation on cone-beam computed tomography. J. Endod. 2021, 47, 827–835. [Google Scholar] [CrossRef] [PubMed]
- Skandha, S.S.; Gupta, S.K.; Saba, L.; Koppula, V.K.; Johri, A.M.; Khanna, N.N.; Mavrogeni, S.; Laird, J.R.; Pareek, G.; Miner, M.; et al. 3-D optimized classification and characterization artificial intelligence paradigm for cardiovascular/stroke risk stratification using carotid ultrasound-based delineated plaque: Atheromatic™ 2.0. Comput. Biol. Med. 2020, 125, 103958. [Google Scholar] [CrossRef]
- Jamthikar, A.; Gupta, D.; Khanna, N.N.; Saba, L.; Laird, J.R.; Suri, J.S. Cardiovascular/stroke risk prevention: A new machine learning framework integrating carotid ultrasound image-based phenotypes and its harmonics with conventional risk factors. Indian Heart J. 2020, 72, 258–264. [Google Scholar] [CrossRef] [PubMed]
- Jamthikar, A.; Gupta, D.; Khanna, N.N.; Saba, L.; Araki, T.; Viskovic, K.; Suri, H.S.; Gupta, A.; Mavrogeni, S.; Turk, M. A low-cost machine learning-based cardiovascular/stroke risk assessment system: Integration of conventional factors with image phenotypes. Cardiovasc. Diagn. Ther. 2019, 9, 420. [Google Scholar] [CrossRef]
- Agarwal, M.; Agarwal, S.; Saba, L.; Chabert, G.L.; Gupta, S.; Carriero, A.; Pasche, A.; Danna, P.; Mehmedovic, A.; Faa, G. Eight pruning deep learning models for low storage and high-speed COVID-19 computed tomography lung segmentation and heatmap-based lesion localization: A multicenter study using COVLIAS 2.0. Comput. Biol. Med. 2022, 146, 105571. [Google Scholar] [CrossRef] [PubMed]
- Suri, J.S.; Agarwal, S.; Chabert, G.L.; Carriero, A.; Paschè, A.; Danna, P.S.; Saba, L.; Mehmedović, A.; Faa, G.; Singh, I.M. COVLIAS 1.0 Lesion vs. MedSeg: An Artificial Intelligence Framework for Automated Lesion Segmentation in COVID-19 Lung Computed Tomography Scans. Diagnostics 2022, 12, 1283. [Google Scholar] [CrossRef] [PubMed]
- Jain, P.K.; Sharma, N.; Saba, L.; Paraskevas, K.I.; Kalra, M.K.; Johri, A.; Laird, J.R.; Nicolaides, A.N.; Suri, J.S. Unseen artificial intelligence—Deep learning paradigm for segmentation of low atherosclerotic plaque in carotid ultrasound: A multicenter cardiovascular study. Diagnostics 2021, 11, 2257. [Google Scholar] [CrossRef]
- Suri, J.S.; Agarwal, S.; Gupta, S.K.; Puvvula, A.; Biswas, M.; Saba, L.; Bit, A.; Tandel, G.S.; Agarwal, M.; Patrick, A. A narrative review on characterization of acute respiratory distress syndrome in COVID-19-infected lungs using artificial intelligence. Comput. Biol. Med. 2021, 130, 104210. [Google Scholar] [CrossRef]
- Verma, A.K.; Kuppili, V.; Srivastava, S.K.; Suri, J.S. An AI-based approach in determining the effect of meteorological factors on incidence of malaria. Front. Biosci.-Landmark 2020, 25, 1202–1229. [Google Scholar]
- Verma, A.K.; Kuppili, V.; Srivastava, S.K.; Suri, J.S. A new backpropagation neural network classification model for prediction of incidence of malaria. Front. Biosci.-Landmark 2020, 25, 299–334. [Google Scholar]
- Krammer, F. SARS-CoV-2 vaccines in development. Nature 2020, 586, 516–527. [Google Scholar] [CrossRef]
- Hasöksüz, M.; Kilic, S.; Sarac, F. Coronaviruses and SARS-CoV-2. Turk. J. Med. Sci. 2020, 50, 549–556. [Google Scholar] [CrossRef]
- Kim, D.; Lee, J.-Y.; Yang, J.-S.; Kim, J.W.; Kim, V.N.; Chang, H. The architecture of SARS-CoV-2 transcriptome. Cell 2020, 181, 914–921.e910. [Google Scholar] [CrossRef]
- Amanat, F.; Krammer, F. SARS-CoV-2 vaccines: Status report. Immunity 2020, 52, 583–589. [Google Scholar] [CrossRef]
- Lu, X.; Zhang, L.; Du, H.; Zhang, J.; Li, Y.Y.; Qu, J.; Zhang, W.; Wang, Y.; Bao, S.; Li, Y. SARS-CoV-2 infection in children. N. Engl. J. Med. 2020, 382, 1663–1665. [Google Scholar] [CrossRef]
- Ludwig, S.; Zarbock, A. Coronaviruses and SARS-CoV-2: A brief overview. Anesth. Analg. 2020, 131, 93–96. [Google Scholar] [CrossRef]
- Jackson, C.B.; Farzan, M.; Chen, B.; Choe, H. Mechanisms of SARS-CoV-2 entry into cells. Nat. Rev. Mol. Cell Biol. 2022, 23, 3–20. [Google Scholar] [CrossRef]
- Pedersen, S.F.; Ho, Y.-C. SARS-CoV-2: A storm is raging. J. Clin. Investig. 2020, 130, 2202–2205. [Google Scholar] [CrossRef]
- Yang, X.; He, X.; Zhao, J.; Zhang, Y.; Zhang, S.; Xie, P. COVID-CT-dataset: A CT scan dataset about COVID-19. arXiv 2020, arXiv:2003.13865. [Google Scholar]
- Ter-Sarkisov, A. COVID-CT-Mask-Net: Prediction of COVID-19 from CT scans using regional features. Appl. Intell. 2022, 52, 9664–9675. [Google Scholar] [CrossRef] [PubMed]
- Shakouri, S.; Bakhshali, M.A.; Layegh, P.; Kiani, B.; Masoumi, F.; Nakhaei, S.A.; Mostafavi, S.M. COVID19-CT-dataset: An open-access chest CT image repository of 1000+ patients with confirmed COVID-19 diagnosis. BMC Res. Notes 2021, 14, 178. [Google Scholar] [CrossRef] [PubMed]
- Khaniabadi, P.M.; Bouchareb, Y.; Al-Dhuhli, H.; Shiri, I.; Al-Kindi, F.; Khaniabadi, B.M.; Zaidi, H.; Rahmim, A. Two-step machine learning to diagnose and predict involvement of lungs in COVID-19 and pneumonia using CT radiomics. Comput. Biol. Med. 2022, 150, 106165. [Google Scholar] [CrossRef] [PubMed]
- Lu, H.; Dai, Q. A self-supervised COVID-19 CT recognition system with multiple regularizations. Comput. Biol. Med. 2022, 150, 106149. [Google Scholar] [CrossRef]
- Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-unet: Unet-like pure transformer for medical image segmentation. In Proceedings of the Computer Vision–ECCV 2022 Workshops, Tel Aviv, Israel, 23–27 October 2022; Part III. Springer: Berlin/Heidelberg, Germany, 2023; pp. 205–218. [Google Scholar]
- Sha, Y.; Zhang, Y.; Ji, X.; Hu, L. Transformer-unet: Raw image processing with unet. arXiv 2021, arXiv:2109.08417. [Google Scholar]
- Torbunov, D.; Huang, Y.; Yu, H.; Huang, J.; Yoo, S.; Lin, M.; Viren, B.; Ren, Y. Uvcgan: Unet vision transformer cycle-consistent gan for unpaired image-to-image translation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 702–712. [Google Scholar]
- Yan, X.; Tang, H.; Sun, S.; Ma, H.; Kong, D.; Xie, X. AFTer-UNet: Axial fusion transformer UNet for medical image segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2022; pp. 3971–3981. [Google Scholar]
- Xie, Y.; Zhang, J.; Shen, C.; Xia, Y. Cotr: Efficiently bridging cnn and transformer for 3d medical image segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2021: 24th International Conference, Strasbourg, France, 27 September–1 October 2021; Part III 24. Springer: Berlin/Heidelberg, Germany, 2021; pp. 171–180. [Google Scholar]
- Johri, A.M.; Singh, K.V.; Mantella, L.E.; Saba, L.; Sharma, A.; Laird, J.R.; Utkarsh, K.; Singh, I.M.; Gupta, S.; Kalra, M.S. Deep learning artificial intelligence framework for multiclass coronary artery disease prediction using combination of conventional risk factors, carotid ultrasound, and intraplaque neovascularization. Comput. Biol. Med. 2022, 150, 106018. [Google Scholar] [CrossRef] [PubMed]
- Suri, J.S.; Bhagawati, M.; Paul, S.; Protogerou, A.D.; Sfikakis, P.P.; Kitas, G.D.; Khanna, N.N.; Ruzsa, Z.; Sharma, A.M.; Saxena, S. A powerful paradigm for cardiovascular risk stratification using multiclass, multi-label, and ensemble-based machine learning paradigms: A narrative review. Diagnostics 2022, 12, 722. [Google Scholar] [CrossRef] [PubMed]
- Jamthikar, A.D.; Gupta, D.; Mantella, L.E.; Saba, L.; Laird, J.R.; Johri, A.M.; Suri, J.S. Multiclass machine learning vs. conventional calculators for stroke/CVD risk assessment using carotid plaque predictors with coronary angiography scores as gold standard: A 500 participants study. Int. J. Cardiovasc. Imaging 2021, 37, 1171–1187. [Google Scholar] [CrossRef]
- Tandel, G.S.; Balestrieri, A.; Jujaray, T.; Khanna, N.N.; Saba, L.; Suri, J.S. Multiclass magnetic resonance imaging brain tumor classification using artificial intelligence paradigm. Comput. Biol. Med. 2020, 122, 103804. [Google Scholar] [CrossRef] [PubMed]
- Jain, P.K.; Sharma, N.; Kalra, M.K.; Viskovic, K.; Saba, L.; Suri, J.S. Four types of multiclass frameworks for pneumonia classification and its validation in X-ray scans using seven types of deep learning artificial intelligence models. Diagnostics 2022, 12, 652. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).