1. Introduction
The rapid growth of Deep Learning (DL) technology supports the development of accurate diagnostic tools trained on labelled Medical Imaging data. However, the results should be trustworthy and close to manual diagnosis. The main advantage of DL methods is that feature extraction is performed automatically by the convolutional layers, and they outperform traditional classification systems [1,2]. Many DL methods have been proposed, such as Long Short Term Memory (LSTM) [3], Recurrent Neural Network (RNN), Deep Belief Network (DBN) [4], Convolutional Neural Network (CNN) and Capsule Network [5]. CNNs are loosely inspired by the human brain and play a significant role in pandemic detection, as they do not need manual feature extraction. Hidden layers with feature-learning capability can achieve high sensitivity and specificity in classifying or diagnosing diseases [6].
During the pandemic, researchers have been working on different Medical Imaging modalities and processing these data via DL models. A CNN is an important class of DL model in which the input image, as a pixel array, is passed through successive layers for processing. The convolutional layers mainly perform feature extraction. The properties of input images are learned by applying filters of different sizes, also called kernels. After several convolutional layers, a feature map is generated. At this layer, ReLU and sigmoid are commonly used activation functions [7]. The pooling layers are used for size reduction or down-sampling and are usually applied between two convolutional layers. Their main function is to lower the computational cost by reducing the size of the feature map. Max pooling and average pooling are the commonly used pooling layers. The dense, or fully connected, layer is placed before the output layer, which contains the softmax function to perform classification. In a dense layer, each input from the previous layer is connected to each neuron, which is why it is called fully connected.
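The layer sequence described above (convolution, ReLU, max pooling, dense layer, softmax) can be illustrated with a minimal pure-Python sketch. The toy image, the kernel and the dense-layer weights are made-up values chosen for illustration only; they are not taken from any model in this study, and a real CNN would learn them during training.

```python
import math

def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation of image with kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]

def dense(inputs, weights, biases):
    """Fully connected layer: every input feeds every output neuron."""
    return [sum(x * w for x, w in zip(inputs, w_row)) + b
            for w_row, b in zip(weights, biases)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 5x5 "image" with a vertical edge, and a 2x2 edge-detecting kernel
# (both hypothetical).
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
kernel = [[-1, 1], [-1, 1]]

feature_map = relu(conv2d(image, kernel))   # 4x4 feature map
pooled = max_pool(feature_map, size=2)      # down-sampled to 2x2
flat = [v for row in pooled for v in row]   # flattened to 4 values
weights = [[0.5, -0.5, 0.5, -0.5], [-0.5, 0.5, -0.5, 0.5]]  # 2 output classes
biases = [0.0, 0.0]
probs = softmax(dense(flat, weights, biases))
```

Here the pooling step halves each spatial dimension of the feature map, and `probs` holds one probability per class that sums to one.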
Medical Imaging plays an important role in disease diagnosis, in which healthy and infected CT scans or X-rays are compared and analyzed by expert radiologists [8]. Various studies show that different imaging modalities have their own merits and demerits regarding health risks, cost, sensitivity, specificity and accuracy. CT scans have high sensitivity but low specificity, which means that they have high accuracy on positive cases while having low accuracy on other classes [9].
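Sensitivity, specificity and accuracy are all derived from the counts of a binary confusion matrix. The sketch below uses hypothetical counts (not results from any cited study) to show how a test can be accurate on positive cases while performing poorly on negative ones.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of infected cases detected
    specificity = tn / (tn + fp)   # fraction of healthy cases correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 100 infected and 100 healthy scans.
sens, spec, acc = diagnostic_metrics(tp=95, fp=40, tn=60, fn=5)
```

With these made-up counts, sensitivity is 0.95 while specificity is only 0.60, so the overall accuracy (0.775) hides a large number of false positives.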
CT scans are 360-degree cross-sectional images generated by CAT scanners. These scans are a series of X-rays taken from various angles, providing a more detailed visualization of bones, tissues and internal organs. However, repeated CT sessions are harmful to patients because of prolonged exposure to X-ray radiation [10]. X-rays are safer, efficient and cost-effective for pandemic patients and give quick diagnoses. They are the first tool that doctors recommend for diagnosis at an initial stage, and X-ray machines are readily available in hospitals. However, they give low accuracy in some cases [6].
Ultrasound images are generated by a transducer, which uses high-frequency sound waves to create images of internal organs and their movement. After these waves are reflected by the body, the echo is recorded [11]. Unlike CT scans and X-rays, ultrasound involves no ionizing radiation and hence no associated cancer risk. For the diagnosis of the pandemic, lung ultrasound is a recommended tool, as it helps in visualizing the condition of the lungs. As the disease progresses from moderate to severe infection, B-line artifacts appear in ultrasound images, and they increase as the severity of the illness increases [12]. This visualization is also useful for grouping patients according to their respiratory condition.
Artificial Intelligence (AI)-based automated systems using different imaging modalities help clinicians diagnose various lung diseases by providing a second opinion. It is challenging for radiologists and clinicians to differentiate, on the basis of medical images, between disorders with similar patterns, such as the pandemic disease and other diseases like viral pneumonia, bacterial pneumonia and influenza [13].
Medical Imaging is useful for diagnosing and classifying various chronic diseases such as diabetes, lung cancer, heart disease, brain stroke and pandemic-related diseases. However, reading scans manually is a time-consuming and error-prone task. Therefore, researchers are moving towards DL-based automated image analysis systems. The emergence of ML and DL for disease detection and prediction plays a significant role in healthcare. The rising scope of these technologies also encourages researchers to play a major role in pandemic detection.
In this study, a transfer-learning-based stack ensemble architecture is proposed, using Medical Imaging datasets of three modalities (CT scans, X-ray and Ultrasound) and applying various CNN architectures for an accurate and reliable diagnosis. The datasets comprise pandemic-positive and pandemic-negative samples. These results might help in the early diagnosis of pandemic patients. The contributions of this research are summarized as follows:
Large publicly available multimodal datasets (Lung CT-scan, Chest X-ray, Lung Ultrasound) for pandemic detection are considered, which are taken from multiple online repositories.
Fifteen state-of-the-art fine-tuned pre-trained CNN models are applied to all three datasets and their performance is evaluated and compared.
The Transfer-Learning-based Stack Ensembling approach is proposed using the fine-tuned models to improve the accuracy of diagnosis on all three datasets.
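The general idea of stack ensembling can be sketched as follows: the predictions of several base models on held-out data become the input features of a second-level (meta) learner. The sketch below is only an illustration under stated assumptions. It uses a logistic-regression meta-learner trained by gradient descent, which is a choice made here and not necessarily the meta-learner used in this study, and the base-model probabilities are made-up numbers standing in for the outputs of three hypothetical fine-tuned CNNs.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_learner(base_probs, labels, lr=0.5, epochs=2000):
    """Logistic-regression meta-learner trained with batch gradient descent.

    base_probs[i][k] is base model k's predicted probability of the positive
    class for sample i; labels[i] is 0 or 1.
    """
    n_models = len(base_probs[0])
    weights = [0.0] * n_models
    bias = 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for x, y in zip(base_probs, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for k in range(n_models):
                grad_w[k] += err * x[k]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def stack_predict(x, weights, bias):
    """Meta-learner probability of the positive class for one sample."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

# Made-up hold-out predictions from three hypothetical base CNNs.
holdout_probs = [
    [0.9, 0.8, 0.6],
    [0.8, 0.7, 0.9],
    [0.7, 0.9, 0.4],
    [0.2, 0.3, 0.6],
    [0.1, 0.4, 0.3],
    [0.3, 0.2, 0.7],
]
holdout_labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_meta_learner(holdout_probs, holdout_labels)
positive_score = stack_predict([0.85, 0.75, 0.5], weights, bias)
negative_score = stack_predict([0.15, 0.25, 0.5], weights, bias)
```

In this toy data the third base model is uninformative, so the meta-learner ends up relying mainly on the first two. Learning how much to trust each base model is what distinguishes stacking from a fixed averaging rule.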
The rest of the paper is organized as follows. In Section 2, the related work is discussed in detail. Section 3 presents our proposed approach, which contains the description of the dataset and the working of the proposed architecture. Section 4 presents the experimental results and comparative analysis. The conclusions and future directions of the research are presented in Section 5.
2. Related Work
DL is a sub-branch of ML that deals with algorithms inspired by the structure and function of the brain, called Artificial Neural Networks (ANNs). Although Medical Imaging is useful in disease prediction and classification, reading scans manually is time-consuming. Therefore, researchers are moving towards DL-based automated image analysis systems, which have vast applications in the healthcare sector, particularly in disease diagnosis and severity prediction. In [14], the authors proposed a quick automatic prediction system for pandemic patients using X-ray images. The pre-trained models comprise InceptionV3, ResNet50, InceptionResNetV2, ResNet152 and ResNet101. Among these models, ResNet50 achieves the highest accuracy, i.e., 98%. In reference [15], DL-based CNN models are applied to a dataset of 6432 X-ray scans. Three models (InceptionV3, Xception and ResNeXt) are evaluated and compared, with the Xception model achieving the highest accuracy, i.e., 97%. However, they used a dataset that is unbalanced between pandemic-positive and pandemic-negative samples. In [16], the authors proposed a system for early prediction of pandemics using X-ray radiographs by applying different AI techniques. CNN is applied in two ways. First, it is used for classification by using the softmax layer. In the second scenario, it is applied for feature extraction. These features are then passed to other classifiers, i.e., SVM and RF.
A. Gautam proposed a novel 13-layer CNN architecture for classifying brain CT scan imaging data into three categories: hemorrhagic stroke, ischemic stroke and normal [17]. A quadtree-based fusion technique is applied to improve the contrast of 2D slices containing stroke. The proposed model comprises two convolutional layers and two dense layers to make the model efficient and reduce computation time. Transfer learning-based DL algorithms are presented in [18] for classifying brain tumours as malignant or benign by using an open-source brain tumour MRI dataset. Various pre-trained models are utilized to achieve better accuracy. The TCIA dataset is used in this study, which consists of 224 benign images and 472 malignant images. In [19], a CNN-based DL model is proposed for classifying brain tumour types by using two publicly available brain tumour MRI datasets. The datasets comprise 73 and 233 patients, respectively. The model classifies meningioma, glioma and pituitary tumour on D1, while multi-class classification is performed for characterizing different grades of glioma tumour on D2. The proposed model achieves 96% and 98% test accuracy on D1 and D2, respectively.
Every year, about 123,000 new instances of skin cancer are detected throughout the world, making it a serious public health issue. Melanoma is the deadliest form of skin cancer, accounting for over 9000 fatalities annually in the United States. Balazs Harangi [20] proposed a weighted average ensemble architecture based on CNNs to classify dermoscopy images. Multiclass classification is performed to distinguish seborrheic keratosis, nevus and melanoma lesions. The proposed fusion-based ensemble architecture achieves better results than the individual CNNs, with an AUC score of 0.89. In [21], an automated diagnosis system is proposed to classify nevus, melanoma and atypical nevus lesions. Transfer learning is applied by utilizing the AlexNet architecture and appending a dense layer with a softmax function. The Ph2 dataset is used for training and testing the AlexNet-based architecture. The proposed model achieves 98% test accuracy.
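The weighted average ensembling mentioned for [20] can be sketched generically: each model's class-probability vector is scaled by a weight, and the weighted vectors are summed and normalized. The probabilities and weights below are invented for illustration and do not reproduce the cited architecture or its results.

```python
def weighted_average_ensemble(prob_lists, weights):
    """Fuse per-model class probabilities with a normalized weighted average."""
    total = sum(weights)
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]

classes = ["melanoma", "nevus", "seborrheic keratosis"]

# Hypothetical softmax outputs of three CNNs for one dermoscopy image.
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
# Hypothetical weights, e.g. proportional to each model's validation score.
model_weights = [0.5, 0.2, 0.3]

fused = weighted_average_ensemble(model_probs, model_weights)
predicted = classes[max(range(len(fused)), key=fused.__getitem__)]
```

Although the second model favours "nevus", its low weight means the fused prediction follows the two more trusted models.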
Lung cancer is the most dreadful cancer, resulting in a large number of deaths globally. The only way to increase a patient’s probability of survival is to detect lung cancer early. A new automated diagnostic classification system for CT scans of the lungs was developed in [22]. CT scans of lungs were evaluated using an Optimal Deep Neural Network (ODNN). The LDR approach was used for dimensionality reduction of deep features to classify lung nodules as malignant or benign. The deep neural network is optimized by using the Modified Gravitational Search Algorithm (MGSA). The suggested classifier has a sensitivity of 96.2%, a specificity of 94.2% and an accuracy of 94.56%.
Many custom DL models and architectures have been proposed to achieve more accurate and reliable results, as they are designed for a specific purpose. They are developed either by building a novel neural network on top of existing DL models or by combining existing DL models. For example, in reference [23], the CoroNet architecture, based on the Xception DL model, is proposed for the detection of a pandemic. The model is applied to two publicly available datasets and achieves 89% and 95% accuracy, respectively. In [24], the authors proposed a Bayesian CNN and discussed how a drop-weight-based CNN estimates the uncertainty in DL models.
A new DL framework, COVIDX-Net, is proposed in [25] for automatically detecting a pandemic. This framework consists of seven different architectures applied on X-ray scans of 50 patients, achieving 91% accuracy and 89% F1-score. In [26], SqueezeNet with Bayesian optimization is proposed for pandemic detection. This study used a lightweight network design, a non-public augmented X-ray dataset and fine-tuned hyper-parameters to achieve good performance. In [27], the authors worked on early detection of a pandemic from X-ray images by applying different pre-trained models. VGG18 achieves the highest accuracy, i.e., 80%.
In [28], the authors worked on patients’ CT images and their clinical reports. CNN models are applied to CT scans and ML models are applied to the clinical data of patients. A joint AI model is proposed for integrating CT scans and clinical data and achieves 0.84 AUC. In [9], the authors implement sixteen pre-trained CNN models on large chest CT-scan datasets. This study achieves high performance with DenseNet121 and argues that better classification results can be achieved without augmentation and by inputting whole slices of CT scans. The pre-trained DenseNet model is applied for classification purposes. Another study [29] applied transfer learning to CT slices. Ten well-known pre-trained models are implemented, among which ResNet101 and Xception achieved the best performance. However, they used small training and test datasets.
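Several of the studies above report the area under the ROC curve (AUC), such as the 0.84 AUC of the joint model in [28]. As a reminder of what this number measures, the sketch below computes AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, counting ties as one half. The scores and labels are made up for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for three positive and three negative patients.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

Unlike accuracy, this measure does not depend on a single decision threshold, which is one reason it is often reported for unbalanced medical datasets.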
Pathak et al. proposed a deep bidirectional LSTM network [30]. The Mixture Density Network (MDN) [31] is embedded along with the LSTM, which contains the output layer and hidden layers to perform classification. They achieved 98% accuracy. However, their dataset size was small. In reference [32], a disease detection and severity classification method is proposed using CT scans. The model classifies the severity as mild, moderate or severe. UNet, deep Encoder–Decoder CNN [33] and Feature Pyramid Network (FPN) [34] are applied for lung segmentation and detection of disease.
U. Özkaya et al. [35] proposed an automated system for early diagnosis of the pandemic in which a CNN architecture is utilized for feature extraction from CT images. Features are combined with the data fusion technique and then two subsets of data containing 16 × 16 and 32 × 32 patches are obtained. SVM is applied to perform the final classification on the patch datasets containing 3000 positive examples. In [36], an early screening method is proposed in which the ResNet18 architecture is enhanced by embedding the location-attention mechanism in the dense layer of the CNN; this achieves 86% accuracy. A ten-layer CTnet-10 architecture is proposed in [37] using CT scan samples and achieves 81% accuracy. A DL architecture using the concept of transfer learning is proposed in [38]. Multiclass classification is performed by classifying normal, pneumonia and pandemic cases with an accuracy of 96%. They also claimed that the proposed model is more sensitive than radiologists in screening and prediction. However, they used a small and unbalanced dataset that contains only 140 pandemic-infected samples.
EfficientCovidNet is proposed in [39] by appending six new blocks to the existing EfficientNetB0 architecture, achieving 87% accuracy. They also performed cross-dataset testing, which reduced the classification accuracy to 56%. Various ensemble approaches are applied to enhance the performance of DL models. Ensembling combines the contributions of the base learners to give more accurate results and lower variance in prediction errors. A multiple-kernel Extreme Learning Machine (ELM)-based DL architecture is proposed in [40], in which DenseNet201 is applied for feature extraction and the final result is obtained from majority-voting ensembling of multiple ELM classifiers. In [41], a majority-voting ensemble of five DL model architectures is proposed to enhance the performance of pre-trained models. This approach achieves 85% on test data of CT-Scan samples.
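Majority-voting ensembling, as used in [40] and [41], assigns each sample the label predicted by most of the base models. A minimal sketch is shown below; the votes are invented, and the tie-breaking rule (first label in the list wins) is an arbitrary choice made for this example.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most models.

    Ties are broken in favour of the tied label that appears first in
    the list, which is an arbitrary choice made for this sketch.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical predictions of five models for two CT scans.
votes_per_scan = [
    ["positive", "positive", "negative", "positive", "negative"],
    ["negative", "negative", "negative", "positive", "negative"],
]
final_labels = [majority_vote(v) for v in votes_per_scan]
```

Using an odd number of base models, as in the five-model ensemble of [41], avoids ties in binary classification.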
A five-block multi-scale Deep Neural Network (DNN) is proposed in [42] for the detection of COVID-19 using chest radiographs. The study presented an IoT framework to remotely provide a fast diagnosis to COVID-19 patients. The multi-scale sampling enabled efficient feature learning using convolutional filters of different sizes and achieved remarkable results. However, they considered a single modality, i.e., chest X-ray. Vyas et al. [43] presented a comparison of different feature extraction techniques and performed classification using state-of-the-art ML models to predict COVID-19 positive or negative status. Their findings show that Local Binary Pattern (LBP) feature extraction along with a Gradient Boosting (GB) classifier achieves the best results, i.e., 94% accuracy. However, they considered small training samples and their results can be improved further.
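For readers unfamiliar with LBP, the basic 3 × 3 variant compares each pixel with its eight neighbours and encodes the comparisons as an 8-bit number. The histogram of these codes then serves as a texture feature vector for a classifier such as gradient boosting. The sketch below is a generic illustration with a made-up patch. The neighbour ordering and the `>=` comparison are conventions chosen here, and it is not claimed to match the exact LBP variant used in [43].

```python
def lbp_code(patch):
    """8-neighbour LBP code for the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for i in range(1, len(image) - 1):
        for j in range(1, len(image[0]) - 1):
            patch = [row[j - 1:j + 2] for row in image[i - 1:i + 2]]
            hist[lbp_code(patch)] += 1
    return hist

# Made-up grey-level patch; the centre pixel has intensity 6.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 6, 3],
]
code = lbp_code(patch)
features = lbp_histogram(patch)
```

For this patch only the neighbours 9, 7 and 6 are at least as bright as the centre, which sets bits 1, 3 and 5 and gives a code of 42.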
Image preprocessing is an essential step in DL, as the performance of DL models depends heavily on it. Many segmentation and Region of Interest (ROI) extraction techniques have been proposed. In reference [44], a hybrid image enhancement based on guided and matched filtering techniques is applied to fundus images to extract blood vessels and outperforms state-of-the-art techniques. Seal et al. [45] applied three different probabilistic and predictive models for the detection of liver cancer. They first segmented the lesions using the fuzzy C-means clustering technique. Their findings conclude that Logistic Regression (LR) identified more significant features and achieved better accuracy than Linear Discriminant Analysis (LDA) and Multilayer Perceptron (MLP). A novel DL architecture based on a correlation mechanism is proposed in [46] for the detection of brain tumours using brain CT scans. The authors combine a traditional CNN with a supporting neural network that helps in finding the most suitable filters for convolutional and pooling layers. Their findings show that the proposed neural classifier is faster and achieves good accuracy.
Lung ultrasound (LUS) is a convenient, easy-to-sterilize and low-cost imaging modality that can be used to diagnose various lung diseases [47]. Very few studies, until this research, have used LUS for pandemic screening. In reference [48], four pre-trained models are utilized for the detection of the pandemic and pandemic/pneumonia classification using publicly available ultrasound frames. The highest accuracy achieved is 89%. A fine-tuned VGG model is applied in [13] to all three modalities. They achieve good results with LUS images. However, they did not achieve good results for X-ray and CT scans. A CNN with a multi-layer fusion approach is proposed for pandemic screening from LUS images in [49]. In [50], a novel DL network based on Spatial Transformer Networks (STNs) is proposed, which predicts the severity score for the pandemic and also performs pixel-level identification of regions.
Much research has been conducted so far on pandemic diagnosis, and various studies have achieved good results. However, many of them use small datasets for training their models, which affects the overall performance and leads to overfitting. Moreover, to the best of our knowledge, very few studies have utilized lung ultrasound data for pandemic diagnosis. Lung ultrasound is a cheaper and safer solution for patients with severe lung conditions. So, in this research, we used the largest publicly available datasets for three modalities (X-ray, CT-scan, Ultrasound) and applied various models to find the most accurate solution. The unique features of this research are presented in Table 1.