Study of Different Deep Learning Approach with Explainable AI for Screening Patients with COVID-19 Symptoms: Using CT Scan and Chest X-ray Image Dataset

The COVID-19 outbreak has caused more than 100,000 deaths so far in the USA alone. Initial screening of patients with COVID-19 symptoms is necessary to control the spread of the disease. However, testing with the available kits is becoming laborious because of the growing number of patients. Some studies have proposed CT scan or chest X-ray images as an alternative. To conduct a large number of tests simultaneously, it is essential to use every available resource rather than relying on either CT scans or chest X-rays alone. This study therefore aims to develop a deep learning-based model that detects COVID-19 patients with better accuracy on both CT scan and chest X-ray image datasets. Eight deep learning approaches (VGG16, InceptionResNetV2, ResNet50, DenseNet201, VGG19, MobileNetV2, NasNetMobile, and ResNet152V2) were tested on two datasets: one with 400 CT scan images and another with 400 chest X-ray images. In addition, Local Interpretable Model-agnostic Explanations (LIME) is used to interpret the models' predictions. The test results demonstrate that LIME can identify the top features behind a prediction, which helps build a trustworthy AI framework for distinguishing patients with COVID-19 symptoms from other patients.


Introduction
arXiv:2007.12525v1 [eess.IV] 24 Jul 2020
The novel coronavirus disease, known as COVID-19, created a colossal health crisis worldwide in 2020. The virus that causes this disease is known as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [4]. Because the virus spreads very fast, the number of infected people is increasing day by day, while test kits and hospital capacity are limited. While many COVID-19 cases exhibit mild symptoms, a small percentage of patients suffer from severe or critical conditions [5]. In more serious cases, the infection can cause pneumonia, severe acute respiratory syndrome, multi-organ failure, and death [6]. In developed countries such as the USA, the UK, and Italy, health systems have been overwhelmed by the expanding demand for intensive care units, as those units filled with COVID-19 patients with severe medical conditions [7]. Because COVID-19 is transmitted through social contact, screening patients with COVID-19 symptoms is the first and foremost step. According to the most recent guidelines issued by the Chinese government, a COVID-19 diagnosis should be confirmed by gene sequencing of respiratory or blood samples, a key marker for reverse transcription-polymerase chain reaction (RT-PCR) [8]. However, while patients wait for their test results, they may infect many more people before being moved to isolation. Therefore, the earlier COVID-19 patients can be detected, the faster they can be isolated to reduce community spread of the disease. Findings such as airspace opacities, ground-glass opacity (GGO), and later consolidation are important signs in the lung and play an essential role in detecting COVID-19 patients [9]. Because of the limited supply of test kits, several studies have proposed alternatives such as CT scan or chest X-ray images for early detection of COVID-19 patients [10,11,8,12].
X-ray machines are used to examine affected parts of the body for conditions such as fractures, bone dislocations, lung infections, pneumonia, and tumors. CT scanning, on the other hand, is an advanced form of X-ray imaging that examines the very soft structures of the active body part and produces clearer images of the soft inner tissues and organs [11]. X-ray imaging is quicker, simpler, less expensive, and less harmful than CT [8]. Pneumonia is one of the most significant manifestations of COVID-19, and, as with SARS, it opens holes in the lungs that give them a "honeycomb-like appearance." Both CT scans and chest X-rays are used to diagnose pneumonia. Thus, chest X-ray or CT scans could be the best choice as an early screening method [8]. [13] developed a deep learning-based model for segmenting infection sites in the lung using chest CT. Butt & Gill (2020) developed early detection techniques to distinguish between COVID-19 pneumonia and influenza patients using deep learning [12]. [14] also used deep learning techniques to extract COVID-19-related features from CT images. Similarly, some studies have used chest X-rays to detect MERS-CoV and SARS-CoV, known as cousins of COVID-19 [15]. A study conducted by Kapur (2008) used data mining techniques to distinguish between pneumonia and SARS based on X-ray images [16]. [8] proposed three different convolutional neural network-based models (ResNet50, InceptionV3, and InceptionResNetV2) to identify patients infected with coronavirus pneumonia based on chest X-ray images. To take existing work one step further, this experiment proposes eight different deep learning models (VGG16 [17], InceptionResNetV2 [18], ResNet50 [19], DenseNet201 [20], VGG19 [17], MobileNetV2 [21], NasNetMobile [22], and ResNet152V2 [23]) for the detection of COVID-19 patients using CT scan and chest X-ray images.
The novelty of this research is summarized as follows: 1) Eight different convolutional neural network-based models were proposed and tested on CT scan and chest X-ray image datasets. 2) A comparative analysis was carried out in terms of accuracy, precision, recall, and f1-score. 3) The results showed that MobileNetV2 and NasNetMobile outperformed all other models on both the CT scan and chest X-ray image datasets. 4) Finally, the models were explained with the help of Local Interpretable Model-agnostic Explanations (LIME). Several studies considered X-ray images owing to their lower sensitivity compared to CT scan images. A recent study showed that the early or mild symptoms of COVID-19 patients can be detected using X-ray images [9]. In that study, the X-ray reports of 69% of the patients showed abnormalities at the time of admission, while 80% of the patients showed symptoms after hospitalization. Because the disease broke out worldwide in 2020, there was not enough data at that time for extensive study. For example, [24] experimented on a dataset that combined 70 COVID-19 images from one source [25] with non-COVID-19 images from the Kaggle chest X-ray dataset. They proposed a Bayesian CNN model, which improved the detection rate from 85.7% to 92.9% in combination with the VGG16 model [26]. Similarly, [8] proposed modifications of three pre-trained models (ResNet50, InceptionV3, and InceptionResNetV2) for chest X-ray images. However, they used only 100 images in that experiment, and the dataset combined 50 of Kaggle's chest X-ray images (Pneumonia) as COVID-19 patients with 50 normal chest X-ray images as Non-COVID-19 patients. The experimental results showed that, in terms of accuracy, ResNet50 (98%) outperformed InceptionV3 (97%) and InceptionResNetV2 (87%). Additionally, [27] presented a ResNet model in work that treated data imbalance as one of the primary concerns.
They used 70 COVID-19 patients and 1008 Non-COVID-19 pneumonia patients from different data sources. The evaluation showed 96% sensitivity, 70.7% specificity, and an AUC of 0.952 for ResNet. [14] introduced a deep CNN-based model known as COVID-Net, which attained 83.5% accuracy in detecting COVID-19 patients from X-ray images. Their study used a dataset of 5941 images. The dataset includes chest X-ray images of 1203 healthy people, 931 people with bacterial pneumonia, 660 patients with viral pneumonia, and 45 patients with COVID-19. In general, most current research uses chest X-ray images, with datasets of varying sizes. Although deep learning methods have shown promising results on chest X-ray images, it is difficult to conclude that the same models will do as well on CT scan images. Moreover, other deep learning models need to be tested before concluding that models such as ResNet50 or modified VGG16 are the best so far for the initial screening of patients with COVID-19 symptoms.

CT based screening
A number of studies have used chest CT images to distinguish between COVID-19 and Non-COVID-19 patients [28,27,29,30,12,31]. [28] used UNet++ to classify COVID-19 and Non-COVID-19 patients using 132 sample images. The dataset includes 51 COVID-19, 55 Non-COVID-19, 16 viral pneumonia, and 11 non-pneumonia patients. Their work also revealed that artificial intelligence can reduce radiologists' reading time by up to 65%. Another work, by [27], used the UNet model for lung segmentation. They used 540 images, which include 313 images of COVID-19 patients and 229 images of Non-COVID-19 patients. In addition, a 3D CNN-based model was proposed, which achieved a sensitivity of 90.7% and a specificity of 91.1%. Besides 3D networks, [29] used the ResNet152V2 model to detect COVID-19 patients with a dataset of 1881 images (496 COVID-19 and 1385 Non-COVID-19). Their model achieved a sensitivity of 94.1% and a specificity of 95.5%. Several studies employed ResNet50 to detect COVID-19 patients from chest CT scans [30,12]. For example, [30] considered 88 COVID-19, 100 bacterial pneumonia, and 55 viral pneumonia images to distinguish COVID-19 patients from other patients with ResNet50 on CT scan images. Using this technique, they achieved 86% accuracy. [31] used ResNet-50 on a dataset containing 468 COVID-19 patients and 2996 other patients. Their experimental results showed a sensitivity of 90% and a specificity of 96% on the CT scan image dataset. However, it would not be surprising if, in the near future, handling large numbers of patients required a system that can screen COVID-19 patients using both CT scan and chest X-ray images.

Explainable AI
Explanatory artificial intelligence (EAI) has recently been explored because of its ability to provide insights into the behavior and reasoning of sophisticated machine learning models [32]. Several studies have shown that models such as decision trees and linear models can be explained in a way that is easily understandable and interpretable for humans [33]. [34] proposed Local Interpretable Model-agnostic Explanations (LIME), a novel explanation approach that can explain the predictions of any classifier in an interpretable manner. In their work, they explained the predictions of Google's pre-trained Inception neural network on arbitrary images. Using LIME, it is possible to understand, visualize, and interpret any deep learning model used for image classification [35]. [36] demonstrated that explainable AI helps to facilitate the adoption of AI/ML in the medical domain. Thus, an AI that explains the top features of X-ray images might play a crucial role in helping radiologists distinguish COVID-19 patients from other patients. In general, this investigation found that most studies considered either chest X-ray or CT scan image analysis with only a couple of deep learning models because of time constraints. When this experiment was conducted, however, no previous work had considered both CT scan and chest X-ray images with different deep learning approaches to find the best model in both scenarios with the help of AI explanation. This research therefore aims to provide some insights on these issues by testing eight different deep learning models on both datasets. The experimental results will help researchers and users develop a universal COVID-19 screening system based on deep learning that works well on both CT scan and chest X-ray images.

Methodology
Over the years, many studies have used Kaggle datasets as a reliable source for experiments [37,38,39,40,41]. Several deep learning approaches, such as VGG16 and ResNet50, became popular through their use in Kaggle dataset competitions, and those datasets were later used in various experiments [42,43,44]. During this pandemic, some of the literature has relied on images (i.e., chest X-ray, CT scan) acquired from Kaggle datasets to develop AI-based screening systems for patients with COVID-19 symptoms [45,14,46,8,47]. This research also used images (chest X-ray and CT scan images) obtained from Kaggle datasets [48]. However, the dataset is a combination of multiple datasets. For example, 400 chest X-ray images of both COVID-19 and Non-COVID-19 patients were collected from one source, and 400 CT scan images were collected from another source. Both datasets were then examined separately with different deep learning techniques: VGG16 [17], InceptionResNetV2 [18], ResNet50 [19], DenseNet201 [20], VGG19 [17], MobileNetV2 [21], NasNetMobile [22], and ResNet152V2 [23]. Table 1 summarizes the two datasets used in this experiment. 80% of the images were used for training and 20% for testing. Table 1 shows that, for each dataset, 320 images were used for the train set and 80 images for the test set.
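As a minimal sketch of the 80/20 split described above, the following Python snippet divides 400 items into 320 training and 80 test items. The file names, the random shuffle, and the seed are assumptions for illustration; the text does not state how the split was drawn.

```python
import random

def split_dataset(items, train_fraction=0.8, seed=0):
    """Shuffle items reproducibly and split them into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the 400 images of one dataset.
image_paths = [f"image_{i:03d}.png" for i in range(400)]
train_paths, test_paths = split_dataset(image_paths)
print(len(train_paths), len(test_paths))
```

The same function would be applied separately to the CT scan and chest X-ray datasets, since the two were examined independently.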

Pre-trained ConvNet
Instead of developing a deep learning model from scratch, a more rational approach is to build on existing, proven models [49]. In this work, we used the weights of models pre-trained on ImageNet [50], which were developed as standard image recognition techniques [51,52,53,54]. This technique is known as transfer learning, a process in which a model trained on one problem is reused for other, similar problems [55]. Its main advantages are that it is less time-consuming and can achieve higher accuracy with limited data [56]. Because datasets are still limited and developing during this uncertain situation, this research, like other studies [47,8,57,46,58,59], considers a pre-trained model to be the right choice. The primary model architecture contains three components: a pre-trained network, a modified head, and a prediction class (inspired by [27]). We employed the pre-trained network to extract high-level features, which are passed to the modified network and then to the classification head.
Most deep CNN models [20,21,22] consist of a number of convolution layers, each followed by a pooling layer (e.g., max pooling [60] or average pooling [61]). Figure 1 illustrates the modified architecture for VGG16. The architecture contains 16 [62] CNN layers with different filter numbers, sizes, and stride values. If we consider c_1 as the input layer, the layout of our proposed VGG16 model follows the architecture in Figure 1.
Figure 1: VGG16 architecture implemented during this experiment [17]
A robust model also relies on proper feature extraction techniques [63]. Let I denote the input image and K the kernel. Then the two-dimensional convolution operation can be expressed as follows [49]:
$$S(i, j) = (I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m, j - n) \tag{1}$$
where * represents the discrete convolution operation [49]. The kernel K slides over the image according to the stride parameter. The rectified linear unit (ReLU) is used as the activation function in the dense layer. The ReLU function is calculated with the following equation [49]:
$$f(x) = \max(0, x) \tag{2}$$
During this experiment, a (3,3) convolution filter with a (4,4) pool size is used for feature extraction [17]. Figure 2 illustrates the flow of an input image through the convolutional and max pooling layers.
Figure 2: An illustration of convolutional and maxpooling layer operations [49]
A deep learning model contains several tunable parameters (e.g., number of neurons, number of hidden layers) [64]. Because we used pre-trained models, only the batch size, the number of epochs, and the learning rate were tuned, rather than the number of neurons and hidden layers. However, manually tuning those parameters is time-consuming and inefficient [65,66]. A better approach is to use a grid search [67].
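To make the convolution, ReLU, and max pooling steps described above concrete, here is a small pure-Python sketch on a toy 3x3 input. It is an illustration under simplifying assumptions (stride 1, no padding, a single channel, a 2x2 kernel and pool window), not the implementation used in the experiment. Deep learning frameworks typically compute cross-correlation, that is, the same sliding sum without flipping the kernel.

```python
def conv2d(image, kernel):
    """'Valid' 2-D discrete convolution: the kernel is flipped, then slid with stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    # Index the kernel from its far corner to apply the flip.
                    total += image[i + a][j + b] * kernel[kh - 1 - a][kw - 1 - b]
            row.append(total)
        out.append(row)
    return out

def relu(feature_map):
    """Element-wise max(0, x)."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size):
    """Non-overlapping max pooling with a square window."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[i + a][j + b] for a in range(size) for b in range(size))
         for j in range(0, w - size + 1, size)]
        for i in range(0, h - size + 1, size)
    ]

# Toy values chosen only for illustration.
image = [[1, 2, 0],
         [0, 5, 3],
         [4, 0, 1]]
kernel = [[1, 0],
          [0, -1]]

features = conv2d(image, kernel)   # [[4, 1], [0, -4]]
activated = relu(features)         # [[4, 1], [0, 0]]
pooled = max_pool(activated, 2)    # [[4]]
print(features, activated, pooled)
```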
We randomly selected the following parameter values for the grid search:
Learning rate = [0.001, 0.01, 0.1]
Epochs = [10, 20, 30, 40, 50]
Batch size = [5, 10, 15, 20]
Using the grid search, we found the following to be the best-performing parameters:
Learning rate = 0.001
Epochs = 30
Batch size = 5
During the training phase, an optimization algorithm must be chosen to optimize the model [68]. Some of the most popular optimization algorithms are the adaptive learning rate optimization algorithm (Adam) [69], stochastic gradient descent (SGD) [70], and root mean square propagation (RMSprop) [71]. To keep our experiment simple, we selected Adam as our optimization algorithm because of its effectiveness in binary image classification [72,73]. Finally, the overall results were statistically analyzed based on accuracy, precision, recall, and f1-score [74]:
$$\text{Accuracy} = \frac{t_p + t_n}{t_p + t_n + f_p + f_n} \tag{3}$$
$$\text{Precision} = \frac{t_p}{t_p + f_p} \tag{4}$$
$$\text{Recall} = \frac{t_p}{t_p + f_n} \tag{5}$$
$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6}$$
where
True Positive (t_p) = COVID-19 patient classified as patient
False Positive (f_p) = Healthy people classified as patient
True Negative (t_n) = Healthy people classified as healthy
False Negative (f_n) = COVID-19 patient classified as healthy.
Figure 3 shows the overall flow diagram of the experiment. The best model was selected based on the statistical analysis of the CT scan and chest X-ray image datasets.
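The grid search described above can be sketched as an exhaustive loop over the 3 x 5 x 4 = 60 parameter combinations. In the sketch below, `validation_score` is a hypothetical placeholder, constructed only so that the example runs and peaks at the reported optimum. In a real run it would train a model with the given settings and return its validation accuracy.

```python
import itertools
import math

learning_rates = [0.001, 0.01, 0.1]
epoch_options = [10, 20, 30, 40, 50]
batch_sizes = [5, 10, 15, 20]

def validation_score(lr, epochs, batch_size):
    """Hypothetical placeholder; a real run would train and evaluate a model here."""
    return -(abs(math.log10(lr) + 3) + abs(epochs - 30) / 10 + abs(batch_size - 5) / 5)

# Enumerate every combination and keep the one with the highest score.
grid = list(itertools.product(learning_rates, epoch_options, batch_sizes))
best = max(grid, key=lambda params: validation_score(*params))
print(len(grid), best)
```

Because every combination requires a full training run, the cost of the search grows with the product of the list lengths, which is one reason to keep each list short.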

Result
During this experiment, overall accuracy, precision, recall, and f1-score were measured for eight different deep learning approaches on CT scan and X-ray images using equations (3), (4), (5), and (6). We used 80% of the data for training and 20% for testing, which is one of the most commonly used splits in data mining [75,76,77]. We ran the experiment twice and report the average of the two runs for the evaluation on the train and test sets (inspired by [27]). Table 2 summarizes the average accuracy, precision, recall, and f1-score on the train set for the eight pre-trained deep learning models used in this experiment. The results show that, among all models, MobileNetV2 performed best in terms of accuracy (99%), precision (99%), recall (99%), and f1-score (99%), and ResNet50 performed worst, with 56% accuracy, 71% precision, 56% recall, and 47% f1-score.
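The four metrics named above can be computed directly from confusion-matrix counts using their standard definitions, as in this short sketch. The counts are hypothetical, chosen so that 8 of 80 test images are misclassified. They are not taken from the paper's tables.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and f1-score from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical split of 80 test images with 8 misclassifications.
metrics = classification_metrics(tp=36, fp=4, tn=36, fn=4)
print(metrics)
```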

Confusion Matrix
The confusion matrix was calculated on the test set to make each model's performance easier to understand. Figure 5 shows the numbers of correctly and incorrectly classified chest X-ray images for the eight deep learning models used in this work. The results show that most of the models performed better on chest X-ray images than on CT scan images. Among all of the models, NasNetMobile classified the X-ray images of both COVID-19 and other patients with 100% accuracy. Of the other models, ResNet152V2 misclassified one image, and VGG16, InceptionResNetV2, DenseNet201, and MobileNetV2 each misclassified two images. Meanwhile, ResNet50 and VGG19 performed poorly, misclassifying 29 and 7 chest X-ray images, respectively.

Models' Performance
Each model's performance was monitored at every epoch for both CT scan and chest X-ray images. Accuracy and loss were observed on both the train and test sets. However, for DenseNet201 and ResNet152V2, training accuracy and validation accuracy started to diverge after several epochs. For example, for ResNet152V2, while training accuracy reached 95%, validation accuracy fluctuated between 75% and 80% and then started to drop. Apart from this, for ResNet50, both training and validation accuracy were unsteady from epoch 15 until epoch 30, and the model was unable to achieve an accuracy of more than 65%. Figure 7 shows each model's training and validation loss at every epoch. It demonstrates that training and validation loss follow an almost identical pattern for VGG16, InceptionResNetV2, ResNet50, VGG19, MobileNetV2, and NasNetMobile. For DenseNet201 and ResNet152V2, however, validation and training loss began to diverge after several epochs. For instance, for ResNet152V2, after 10 epochs the training loss decreased continuously while the validation loss started to increase, rising to 50%. On the other hand, models such as DenseNet201, VGG19, MobileNetV2, and ResNet152V2 also showed promising results, even though their performance fluctuated every five epochs. Among all of the models, ResNet50 showed the worst performance: after 25 epochs, its training accuracy dropped significantly from 90% to 60% while validation accuracy was still rising. Based on the overall performance, it can be concluded that, in terms of accuracy on the train and validation sets, VGG16, InceptionResNetV2, and NasNetMobile showed more stability and better accuracy than the other five pre-trained ConvNets.

Confidence Interval
The confidence interval (CI) was measured using two common methods, the Wilson score [2] and the Bayesian interval [1]; both methods are widely used and perform well on small datasets [3]. Table 6 presents the 95% CI for model accuracy on the test set for CT scan and chest X-ray images. On the CT scan image dataset, ResNet50 has the lowest accuracy, ranging from 0.441 to 0.654 (Wilson score) and from 0.441 to 0.656 (Bayesian interval); in contrast, NasNetMobile has the highest accuracy, ranging from 0.815 to 0.948 and from 0.820 to 0.952, respectively. On the chest X-ray image dataset, the accuracy of VGG16, InceptionResNetV2, DenseNet201, and MobileNetV2 ranged from 0.913 to 0.993 and from 0.922 to 0.995 using the Wilson score and the Bayesian interval, respectively. Among all the models, the highest accuracy was obtained for NasNetMobile and the lowest for ResNet50.
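As a worked illustration of the Wilson score interval mentioned above, the following sketch implements the standard formula with the Python standard library only. The counts are inferred from the reported results (72 of 80 CT test images and 80 of 80 X-ray test images classified correctly by NasNetMobile). The function name and structure are our own, not the authors' code.

```python
import math
from statistics import NormalDist

def wilson_interval(correct, total, confidence=0.95):
    """Wilson score interval for a binomial proportion (here, test accuracy)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half), min(1.0, center + half)

ct_low, ct_high = wilson_interval(72, 80)      # 8 misclassifications out of 80
xray_low, xray_high = wilson_interval(80, 80)  # no misclassifications
print(round(ct_low, 3), round(ct_high, 3), round(xray_low, 3), round(xray_high, 3))
```

With these inputs the sketch gives roughly 0.815 to 0.948 for the CT case, in line with the Wilson range reported for NasNetMobile. Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] even when the observed accuracy is 100%.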

Discussion
During this experiment, extensive analyses were done, considering both CT scan images and chest X-ray images using eight different deep learning approaches. One of this section's main goals is to find out the best model considering both CT scan and chest X-ray images.

Best Model on CT Scan Image Dataset
The best models for this experiment were selected based on the following factors: accuracy, precision, recall, f1-score, train and validation loss, and confusion matrix performance. On the CT scan image dataset, MobileNetV2 showed the highest accuracy, precision, recall, and f1-score on the train set, at 99%. On the test set, however, NasNetMobile outperformed all other models with 90% accuracy, precision, recall, and f1-score. The confusion matrix results likewise showed that NasNetMobile performed better than any other model, with only 8 misclassifications out of 80 images, whereas MobileNetV2 misclassified 13 of the 80 images. Additionally, VGG16, InceptionResNetV2, VGG19, MobileNetV2, and NasNetMobile performed better in training and validation accuracy than the other models. In terms of training and validation loss, by contrast, VGG16, InceptionResNetV2, ResNet50, VGG19, MobileNetV2, and NasNetMobile showed better results. Table 7 shows that MobileNetV2 outperformed all other models in terms of accuracy, precision, recall, and f1-score on the train set, as did NasNetMobile on the test set. Considering the confusion matrix, however, NasNetMobile outperformed all other methods. Additionally, the misclassification difference between MobileNetV2 and NasNetMobile is just one. Thus, considering all of these factors, it can be concluded that both MobileNetV2 and NasNetMobile are the best models on the CT scan image dataset.

Best Model on X-ray Image Dataset
The best model on the X-ray image dataset was chosen by following the same procedure as for the CT scan image dataset. Based on overall performance on the train set, VGG16, DenseNet201, MobileNetV2, NasNetMobile, and ResNet152V2 outperformed the other models in terms of accuracy, precision, recall, and f1-score. On the test set, NasNetMobile outperformed all other models. Apart from this, based on the confusion matrix results, NasNetMobile surpassed all other models with zero misclassifications. However, in terms of training and validation accuracy at each epoch, VGG16, InceptionResNetV2, and NasNetMobile outperformed the other models. Additionally, in terms of training and validation loss, VGG16, InceptionResNetV2, VGG19, and NasNetMobile showed better results than the other models. To find the best model, a comparison is made in Table 8. Table 8 shows that NasNetMobile outperformed all of the models across all performance measures: accuracy (100%), precision (100%), recall (100%), f1-score (100%), confusion matrix (100%), and loss.

Table 9 shows that MobileNetV2 outperformed NasNetMobile in terms of accuracy, precision, recall, and f1-score. However, the misclassification rate of MobileNetV2 (11.25%) is slightly higher than that of NasNetMobile (10%). Because our dataset is small, this difference in error rate may not be significant; for a larger dataset, however, the misclassification rate may have a significant impact.

Models Average Accuracy
Average accuracy was calculated by averaging the training and testing accuracy of each model. Table 10 shows the average accuracy for the CT scan and chest X-ray image datasets. The results show that almost all models performed better on the X-ray image dataset than on the CT scan dataset. The average accuracy across all models on the CT scan and X-ray image datasets is 82.94% and 93.94%, respectively. In this work, we also tried to understand how each layer processes the actual image. Figure 10 shows CT scan images at different layers. Note that only a few of the layers from VGG16 are shown here to give some insight.
Figure 11 shows the activity of different layers of ResNet50 on chest X-ray images. The region

Models Interpretability with LIME
A pre-trained CNN model extracts sophisticated features from the images, some of which may be unnecessary. Training such a model is therefore often computationally expensive. Additionally, a large set of features makes it challenging to understand which features are essential for predictions. This paper also addresses these issues in developing a model for screening COVID-19 patients.
Figure 11: Heat map of class activation of chest X-ray image on different layer acquired by ResNet50
To identify which specific features help the deep learning models (MobileNetV2, NasNetMobile) differentiate between COVID-19 and Non-COVID-19 patients, LIME was used. Local Interpretable Model-agnostic Explanations (LIME) is a procedure that helps to understand how the input features of a deep learning model affect its predictions. For example, for image classification, LIME finds the set of super-pixels with the strongest association with a prediction label [78]. LIME generates explanations by creating a new dataset of random perturbations (with their respective predictions) around the instance being explained and then fitting a weighted local surrogate model. This local model is usually a simpler model with intrinsic interpretability, such as a linear regression model. LIME creates perturbations by turning some of the super-pixels in the image on and off. The quickshift algorithm was used to compute the super-pixels, with the parameters shown in Table 11. Figure 12 shows the output after computing the super-pixels on a sample chest CT scan image. Figure 15 depicts the overall interpretability of image classification with LIME on chest X-ray images, step by step. The prediction was made using NasNetMobile. Using LIME, it was possible to identify the top features that help distinguish COVID-19 patients from other patients in chest X-ray images. In brief, based on the overall experiment, this study found that, among all eight deep learning models, MobileNetV2 and NasNetMobile performed better on both the CT scan and chest X-ray image datasets. Additionally, all deep learning models performed better on the chest X-ray image dataset than on the CT scan images, with on average 8% higher accuracy.
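The perturb-and-fit idea described above can be sketched in a few dozen lines of pure Python. This is a deliberately simplified illustration, not the LIME library and not the pipeline used in the experiment: `toy_classifier` is a hypothetical stand-in for a CNN, the four "super-pixels" are abstract on/off switches rather than image segments, and the distance measure, kernel width, and sample count are assumptions.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def explain(black_box, n_superpixels, n_samples=200, kernel_width=0.25, seed=0):
    """Fit a weighted linear surrogate around the unperturbed instance."""
    rng = random.Random(seed)
    # Each sample switches super-pixels on (1) or off (0) at random.
    masks = [[rng.randint(0, 1) for _ in range(n_superpixels)] for _ in range(n_samples)]
    masks.append([1] * n_superpixels)  # the unperturbed instance itself
    preds = [black_box(mask) for mask in masks]
    # Samples closer to the original (fewer super-pixels off) get larger weights.
    weights = []
    for mask in masks:
        distance = 1 - sum(mask) / n_superpixels
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
    # Weighted least squares via the normal equations (X^T W X) beta = X^T W y.
    rows = [[1.0] + [float(v) for v in mask] for mask in masks]
    d = n_superpixels + 1
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows)) for j in range(d)]
            for i in range(d)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(weights, rows, preds)) for i in range(d)]
    beta = solve_linear_system(xtwx, xtwy)
    return beta[0], beta[1:]

def toy_classifier(mask):
    """Hypothetical stand-in for a CNN's predicted COVID-19 probability."""
    return 0.1 + 0.6 * mask[0] + 0.3 * mask[2]

intercept, coefficients = explain(toy_classifier, n_superpixels=4)
ranking = sorted(range(4), key=lambda i: abs(coefficients[i]), reverse=True)
print(ranking[:2], [round(c, 3) for c in coefficients])
```

In this toy setting the surrogate recovers the two influential super-pixels (indices 0 and 2), which is the kind of ranking LIME uses to highlight the top features of an image.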
This research shows that existing deep learning approaches could be an alternative solution for detecting COVID-19 patients. However, a proper screening system should also be developed on the basis of the opinions of experts such as doctors and radiologists.

Conclusion
During this pandemic, there are many countries and places where it is difficult to test enough patients with existing test kits or CT scan images because of expense and time constraints. Thus, as a helping hand, a comprehensive study was conducted on CT scan (400 samples) and chest X-ray (400 samples) image datasets to distinguish patients with COVID-19 symptoms from other patients. Additionally, the top features that differentiate COVID-19 patients from other patients were analyzed using LIME. The experimental results revealed that existing deep learning models performed better on chest X-ray images than on CT scan images. Moreover, a chest X-ray takes less time and is more cost-efficient. Thus, a chest X-ray could be an alternative approach to resolving this shortage. In this study, eight different deep learning models (VGG16, InceptionResNetV2, ResNet50, DenseNet201, VGG19, MobileNetV2, ResNet152V2, and NasNetMobile) were used and analyzed. The research outcome showed that, on the CT scan image dataset, MobileNetV2 and NasNetMobile outperformed all other models, and that NasNetMobile is the best model on the chest X-ray image dataset. However, because the dataset is comparatively small, the accuracy obtained from the models may not represent the exact accuracy on a large scale. Therefore, the 95% CI for accuracy on the test set was measured for all the models. The results showed that NasNetMobile outperformed all other models: its 95% CI for accuracy ranges from 81.5% to 95.2% on the CT scan dataset and from 95.4% to 100% on the chest X-ray image dataset. The experimental results may bolster other current studies, which proposed that it is possible to develop an initial COVID-19 screening system using a deep learning approach.
Figure 15: Overall prediction analysis using LIME
Given the short time frame and the pandemic situation, we hope our study will give some insights to researchers and developers who are actively looking for alternative screening procedures that use both CT scan and chest X-ray image datasets for COVID-19 patients. Further study includes, but is not limited to, understanding the performance of deep learning models on highly imbalanced data, model performance on a larger dataset, checking for data bias [79], parameter tuning, and developing a decision support system.