Diagnostic Approach for Accurate Diagnosis of COVID-19 Employing Deep Learning and Transfer Learning Techniques through Chest X-ray Images Clinical Data in E-Healthcare

COVID-19 is a transmissible disease and a leading cause of death for a large number of people worldwide. The disease, caused by SARS-CoV-2, spreads rapidly and quickly affects the human respiratory system. It is therefore necessary to diagnose it at an early stage for proper treatment, recovery, and control of its spread, and an automatic diagnosis system is needed for COVID-19 detection. Methods based on artificial intelligence techniques can diagnose COVID-19 from chest X-ray images effectively, but the existing diagnosis methods suffer from insufficient accuracy. To handle this problem, we propose an efficient and accurate diagnosis model for COVID-19. In the proposed method, a two-dimensional Convolutional Neural Network (2DCNN) is designed for COVID-19 recognition from chest X-ray images. The weights of a pre-trained ResNet-50 transfer learning (TL) model are transferred to the 2DCNN model to enhance its training, and the model is fine-tuned on chest X-ray image data for the final multi-classification that diagnoses COVID-19. The integrated model is referred to as R2DCNNMC. In addition, data augmentation by transformation (rotation) is used to increase the data set size for effective training of the R2DCNNMC model. The experimental results demonstrate that the proposed R2DCNNMC model achieved 98.12% classification accuracy on the CRD data set and 99.45% classification accuracy on the CXI data set, which is higher than the baseline methods. This approach has high performance and could be used for COVID-19 diagnosis in E-Healthcare systems.


Introduction
COVID-19 is a transmissible illness caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1]. COVID-19 spreads very quickly, and numerous people have suffered and died in this global pandemic. Efficient and accurate identification of COVID-19 is a major challenge for researchers and medical experts, and effective diagnosis technologies are needed for treatment and recovery at an early stage. Coronaviruses are a large family of viruses, and SARS-CoV-2 is a ribonucleic acid (RNA) virus that belongs to this family. COVID-19 can be diagnosed through different methods, such as medical symptoms (fever, cough, dyspnea, and pneumonia), epidemiological history, positive pathogenic testing, and positive chest X-ray and CT images. However, two virus detection methods are used: detection through nucleic [...] two CT scan image data sets, COVIDx CT-2A and COVID-CT, were incorporated for evaluation of the proposed model. The proposed method was evaluated using different evaluation metrics; in terms of accuracy, among the CNN architectures, VGG19 obtained 98.87% on the COVIDx CT-2A data set.
Gunraj et al. [30] proposed an improved deep-learning-based diagnosis system (COVID-Net CT-2) for COVID-19 identification using clinical CT scan image data. The method was evaluated using different evaluation metrics and achieved 98.1% accuracy. Hu et al. [31] proposed a COVID-19 identification method using a weakly supervised deep learning strategy and evaluated it on chest CT scan image data; the method achieved high predictive performance.
Khalifa et al. [32] proposed a COVID-19 diagnosis method using generative adversarial networks (GAN) with fine-tuned deep transfer learning. The method was evaluated on chest X-ray image data. They used 10% of the data set for training and generated 90% of the training data using the proposed GAN model. Different transfer learning models, such as ResNet18, SqueezeNet, GoogLeNet, and AlexNet, were used for the detection of pneumonia. Different performance evaluation metrics were used for model evaluation; in terms of accuracy, the proposed method obtained 99%. Wang et al. [33] proposed a deep convolutional neural network for the diagnosis of COVID-19 using chest X-ray image data. The proposed model achieved 93.3% accuracy.
In this research paper, we propose the R2DCNNMC model for the diagnosis of COVID-19. In designing the method, we incorporated a deep learning two-dimensional Convolutional Neural Network (2DCNN) model to extract deep features from chest X-ray image data and used these features for the final classification. In addition, transfer learning and data augmentation techniques were employed to improve the training of the 2DCNN model. Furthermore, we used the hold-out cross-validation technique for hyperparameter tuning and best model selection. Performance evaluation metrics were computed to evaluate the model, and the accuracy of the baseline methods is compared with that of the proposed R2DCNNMC model. The remainder of the manuscript is arranged as follows: Section 2 discusses the data sets used in the work and the methodology of the proposed method. Experiments are carried out and discussed in Section 3. Conclusions and future work are reported in Section 4.

Proposed Method Background
The background of the method is described in detail in the subsections below.

Convolutional Neural Network (CNN) Architecture for Multi-Classification
Recently, CNN models have produced significant results in different areas, such as NLP, image classification, and diagnosis systems [34]. In contrast to multilayer perceptrons (MLPs), a CNN reduces the number of neurons and parameters, which results in lower complexity and faster adaptation. The CNN model has significant applications in medical image classification [34]. Here, we discuss the fundamental structure of the CNN model. The CNN is a type of Feed-Forward Neural Network (FFNN) and a DL model. Convolution operations can capture translation invariance, which means that the filter is independent of position, and this reduces the number of parameters. The CNN model has three kinds of layers: convolutional, pooling, and fully connected. These layers perform feature extraction, dimensionality reduction, and classification, respectively. During the forward pass, the filter slides over the input volume and computes the activation map: at each position, the point-wise products of the filter and input values are summed to give the activation at that point. The sliding filter is implemented by convolution, which, being a linear operator, can be expressed as a dot product for efficient implementation. Let x be the input and w the kernel function; the convolution (x * w)(a) over the time index t can be expressed mathematically as in Equation (1):
$$(x * w)(a) = \int x(t)\, w(a - t)\, dt \tag{1}$$
where $a \in \mathbb{R}^n$ for any $n \geq 1$. When the parameter t is discrete, the convolution can be expressed as in Equation (2):
$$(x * w)(a) = \sum_{t} x(t)\, w(a - t) \tag{2}$$
In CNN models, however, two- or three-dimensional convolutions are usually used. In this work, we used a two-dimensional convolutional CNN model for our multi-classification problem. For a two-dimensional image I as input and a two-dimensional kernel K, the convolution can be expressed mathematically as in Equation (3):
$$(I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m, j - n) \tag{3}$$
Additionally, to introduce non-linearities, two activation functions are used: ReLU and softmax. The ReLU activation function is expressed in Equation (4):
$$\mathrm{ReLU}(x) = \max(0, x) \tag{4}$$
The gradient is $\mathrm{ReLU}'(x) = 1$ for $x > 0$ and $\mathrm{ReLU}'(x) = 0$ for $x < 0$. ReLU converges better than sigmoid non-linearities. The second activation function is softmax, which is expressed mathematically in Equation (5):
$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j} e^{x_j}} \tag{5}$$
The softmax non-linearity is suitable when the output must be assigned to one of more than two classes.
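The two-dimensional convolution of Equation (3) and the ReLU and softmax activations can be illustrated with a minimal sketch. It assumes plain Python lists as images and a "valid" output region (positions where the kernel lies fully inside the image); the function names are hypothetical and are not taken from the authors' implementation.

```python
import math


def conv2d_valid(image, kernel):
    """Discrete 2D convolution, restricted to positions where the kernel
    fits entirely inside the image ("valid" output)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    # Flipping the kernel in both axes gives a true convolution; without
    # the flip, the loop below would compute cross-correlation instead.
    flipped = [row[::-1] for row in kernel[::-1]]
    return [
        [
            sum(image[i + m][j + n] * flipped[m][n]
                for m in range(k_h) for n in range(k_w))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(x):
    """ReLU activation: max(0, x)."""
    return max(0.0, x)


def softmax(scores):
    """Softmax over a list of class scores, shifted by the maximum score
    for numerical stability (the shift does not change the result)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Many deep learning frameworks compute cross-correlation (no kernel flip) under the name convolution; because the kernel values are learned, the two conventions are equivalent in practice.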
The pooling layers of the CNN model output a statistical summary of their inputs and resize the output shape without losing necessary information. There are different types of pooling; we use max pooling, which outputs the maximum value within a rectangular neighborhood of each point (i, j) of each 2D input feature map. The last layer is a fully connected (FC) layer with input size n and output size m. Its parameters are a weight matrix $W \in M_{m,n}$ with m rows and n columns, and a bias vector $b \in \mathbb{R}^m$. Given an input vector $x \in \mathbb{R}^n$, the output of the FC layer with activation function f is expressed mathematically in Equation (6):
$$\mathrm{FC}(x) = f(Wx + b) \tag{6}$$
In Equation (6), Wx is the matrix product, and the function f is applied component-wise. Fully connected layers are used for classification problems and are generally attached at the top of the CNN model; for this, the CNN output is flattened into a single vector. Our proposed 2D CNN model has three 2D convolution layers, each followed by an activation layer and a max-pooling layer, with the FC layer last. Furthermore, we use the Stochastic Gradient Descent (SGD) optimization algorithm for model optimization. The structure of our CNN model is given in Table 1.
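The max-pooling and fully connected steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: pooling windows are taken to be non-overlapping squares (stride equal to the window size), and the function names are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Max pooling over non-overlapping size x size windows.
    Rows and columns that do not fill a whole window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def flatten(feature_map):
    """Flatten a 2D feature map into a single vector (row-major order)."""
    return [value for row in feature_map for value in row]


def fully_connected(weights, bias, x, activation):
    """FC layer f(Wx + b): weights has m rows and n columns, bias has m
    entries, x has n entries, and activation is applied component-wise."""
    return [
        activation(sum(w * v for w, v in zip(row, x)) + b_i)
        for row, b_i in zip(weights, bias)
    ]
```

For example, a 4 x 4 feature map pooled with a 2 x 2 window yields a 2 x 2 map, which flattens to a vector of length n = 4 that the FC layer maps to m outputs.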

Transfer Learning to Improve 2DCNN Model Predictive Performance
To improve the predictive capability of the 2DCNN model, we employed transfer learning with the ResNet-50 model. Transfer learning (TL) techniques are widely used in image classification tasks [20], COVID-19 sub-type recognition [35], and medical image filtering [36]. In this study, we incorporated the pre-trained ResNet-50 CNN model to enhance the predictive performance of the proposed 2DCNN model. The ResNet-50 model is pre-trained on the ImageNet data set; the weights of its trained parameters are transferred to our 2DCNN model, which is then fine-tuned on the chest X-ray images for the final classification.
The structure of ResNet-50 has five stages, each with a convolution block and an identity block. Each convolution block has three convolution layers, and each identity block also has three convolution layers. ResNet-50 is a variant of the ResNet model that has 48 convolution layers along with one max-pool and one average-pool layer. The ResNet-50 model has more than 74,917,380 trainable parameters. The architecture of ResNet-50 is given in Figure 1.

Cross Validation Criteria
The hold-out cross-validation mechanism is used for model training and validation [5,8]. In this study, the chest X-ray image data sets were divided into 70% for training and 30% for testing of the model in all experiments.
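A 70%/30% hold-out split can be sketched as below. The text does not state whether the split was stratified by class or how it was seeded, so the plain random shuffle, the seed, and the function name are assumptions made for illustration only.

```python
import random


def hold_out_split(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples once and split them into training and test
    parts. The seed only makes the illustrative split reproducible."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

Every sample lands in exactly one of the two parts, so the test part is never seen during training.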

Model Assessment Criteria
In this work, important assessment measures [7] are used to evaluate the proposed method; they are expressed mathematically in Equations (7)-(12), respectively.
We designed the 2DCNN model for COVID-19 detection from chest X-ray image data. To improve its predictive performance, we used data augmentation and transfer learning (TL) with the pre-trained CNN architecture ResNet-50 [37]. The ImageNet data set was used to pre-train ResNet-50, and the resulting weights (trained parameters) are transferred for the training of our 2DCNN model. The chest X-ray data set is used for fine-tuning the 2DCNN model and for the final multi-classification. Thus, an integrated (ResNet-50+2DCNN) multi-classification (R2DCNNMC) model for COVID-19 diagnosis is proposed.
A hold-out cross-validation (CV) mechanism is used in the proposed R2DCNNMC model, with 70% of the data used for training and 30% for testing. The integration of transfer learning greatly enhanced the predictive performance of the 2DCNN model. The performance of the proposed R2DCNNMC model was evaluated using the evaluation metrics. The pseudocode for the R2DCNNMC model is given in Algorithm 1, and a flow chart is shown in Figure 2.
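As a minimal sketch of the assessment measures, assuming that Equations (7)-(12) take the standard confusion-matrix forms for accuracy, specificity, sensitivity, precision, MCC, and F1-score, the measures can be computed from true/false positive and negative counts. The function name is hypothetical; AUC is omitted because it requires ranked scores rather than counts; and for the multi-class case the counts would be taken per class (one-vs-rest), which is also an assumption.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix measures for one class.
    tp/tn/fp/fn are the true/false positive/negative counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "specificity": tn / (tn + fp),
        "sensitivity": sensitivity,
        "precision": precision,
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }
```

The sketch does not guard against empty classes; a zero denominator would need separate handling in practice.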

Algorithm 1 Proposed R2DCNNMC model for COVID-19 diagnosis.
Input: E: Number of epochs; w: Transfer learning model parameters; η: Learning rate; b: Batch size; X_train: Chest X-ray images training data set; X_test: Chest X-ray images test data set; X: ImageNet data set
Output: P_test: The performance metrics on the test data set
Initialize transfer learning model (ResNet-50) parameters w
Transfer Learning:
for local epoch e ← 1 to E do
    for b = (x, y) ∈ random batch from X_i do
        Optimize model parameters
Initialize upper layers of classification model parameters θ with trained transfer model parameters w and freeze
Multi-Classification Training:
Pre-process chest X-ray images data set
X_train ← preData(X_train)
X_test ← preData(X_test)
while θ has not converged do
    for local epoch e ← 1 to E do
        for s = (x, y) ∈ random batch from X_train do
            Update model parameters
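The second phase of Algorithm 1 (initialize with transferred parameters, freeze them, then update the remaining parameters by mini-batch SGD) can be illustrated with a toy sketch. It is not the authors' implementation: a scalar "backbone" weight stands in for the transferred ResNet-50 parameters w, a scalar "head" weight stands in for the trainable part of θ, the loss is a mean squared error, and all names are hypothetical.

```python
import random


def predict(params, x):
    # "backbone" plays the role of the transferred feature extractor,
    # "head" the role of the newly trained classification layer.
    return params["head"] * (params["backbone"] * x)


def gradients(params, batch):
    """Gradients of the mean squared error over a batch of (x, y) pairs."""
    grads = {"backbone": 0.0, "head": 0.0}
    for x, y in batch:
        err = predict(params, x) - y
        grads["backbone"] += 2 * err * params["head"] * x / len(batch)
        grads["head"] += 2 * err * params["backbone"] * x / len(batch)
    return grads


def sgd_step(params, grads, lr, frozen):
    """One SGD update; parameters named in `frozen` are left unchanged."""
    return {
        name: value if name in frozen else value - lr * grads[name]
        for name, value in params.items()
    }


def fine_tune(transferred_backbone, train_data, epochs, lr, batch_size, seed=0):
    """Start from the transferred weight, freeze it, and train the head
    on random mini-batches for a fixed number of epochs."""
    params = {"backbone": transferred_backbone, "head": 0.0}
    frozen = {"backbone"}
    rng = random.Random(seed)
    data = list(train_data)
    for _ in range(epochs):
        rng.shuffle(data)  # random batches, reshuffled every epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            params = sgd_step(params, gradients(params, batch), lr, frozen)
    return params
```

In a real pipeline the frozen part would be the layers loaded from the pre-trained network and the loss would be a classification loss over the softmax outputs; the freezing logic is the same.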

Experimental Setup
To implement the proposed R2DCNNMC model, we performed various experiments. Two chest X-ray data sets were used for model validation, and the hold-out cross-validation technique was used for model training and validation. Model assessment measures were computed for model evaluation. In addition, the Stochastic Gradient Descent (SGD) optimization algorithm was used to optimize the proposed model. The other parameters, namely learning rate α (SGD) = 0.0001, epochs = 120, batch size = 100, mini-batch size = 9, outer activation function = softmax, and inner activation function = ReLU, were used in all experiments. The parameters of the proposed R2DCNNMC model are defined in Table 2. For hardware, all experiments were run on a laptop with an Intel Core i5, 64 GB RAM, and a GPU. Python v3.7 was used for simulations, and the proposed model was developed in the Keras framework v2.2.4 with TensorFlow v1.12 as the back end. All experiments were repeated many times to produce stable results.
Two data sets are used in this research for the evaluation of the proposed R2DCNNMC model. Before applying these data sets to the model, we need to perform some pre-processing operations on both so that the model is suitably trained for effective performance. The COVID-19-Radiography-Dataset (CRD) is a data set of chest X-ray images of COVID-19 positive cases along with Normal, Viral Pneumonia, and Lung Opacity images. This data set includes 3616 COVID-19 positive, 10,192 Normal, 6012 Lung Opacity, and 1345 Viral Pneumonia images, for a total of 21,165 images.
To increase the data set size for effective training of the 2DCNN model, we used data augmentation, augmenting the original data set with a random transformation (rotation). All images were rotated by an angle of 45 degrees along the X-axis, and the augmented images were added to the original data set. Thus, the total number of images in the new data set is 42,330. The data augmentation technique was also applied to the second chest X-ray (Covid-19 and Pneumonia) (CXI) data set. This data set has 6432 chest X-ray images belonging to three classes (COVID19, Normal, PNEUMONIA): 576 COVID19, 1583 Normal, and 4273 PNEUMONIA images.
After data augmentation, the new CXI data set contains 12,864 images. The proposed model was trained on both the original and the augmented data sets in all experiments. The hold-out cross-validation method was used for training and validation because the data sets are now large enough that it does not cause computational complexity problems, and the model can fit well and achieve high performance. Images from the CRD and CXI data sets are shown in Figures 3 and 4.
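The rotation-based augmentation can be sketched as follows. The text specifies a 45-degree rotation but not the interpolation or border handling, so this sketch assumes an in-plane rotation about the image centre with nearest-neighbour sampling and a constant fill value for pixels that fall outside the source image; the function names are hypothetical.

```python
import math


def rotate_image(image, angle_degrees, fill=0):
    """Rotate a 2D list-of-lists image about its centre, keeping the
    original shape. Uses inverse mapping with nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [[fill] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # For each output pixel, find the source pixel that maps to it.
            y, x = i - cy, j - cx
            src_x = cos_t * x + sin_t * y + cx
            src_y = -sin_t * x + cos_t * y + cy
            si, sj = int(round(src_y)), int(round(src_x))
            if 0 <= si < height and 0 <= sj < width:
                rotated[i][j] = image[si][sj]
    return rotated


def augment_with_rotation(images, labels, angle_degrees=45):
    """Append one rotated copy of every image, doubling the data set.
    Each rotated copy keeps the label of its source image."""
    rotated = [rotate_image(img, angle_degrees) for img in images]
    return images + rotated, labels + labels
```

Adding one rotated copy per image doubles the data set, which matches the reported growth from 21,165 to 42,330 images for CRD.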

2DCNN Model Performance Evaluation on Original and Augmented Data Sets
The predictive performance of the 2DCNN model was checked on the original and augmented versions of the two chest X-ray data sets. The 2DCNN model was trained on both types of data sets along with the other necessary hyperparameters. The SGD optimization algorithm with a learning rate (LR) of 0.0001 was used for model optimization. The number of epochs and the batch size were 100 and 120, respectively, for all experiments. The results are reported in Table 3, which presents the performance of the 2DCNN model on the original and augmented COVID-19-Radiography (CRD) chest X-ray data sets. According to Table 3, the 2DCNN model on the original CRD data set achieved 95.20% Accuracy, 97.00% Specificity, 80.25% Sensitivity/Recall, 92.40% Precision, 93.00% MCC, 95.09% F1-score, and 96.00% AUC, respectively. On the augmented CRD data set, the 2DCNN model performed better than on the original data set, achieving 96.00% Accuracy, 96.45% Specificity, 97.00% Sensitivity/Recall, 97.43% Precision, 96.33% MCC, 96.52% F1-score, and 97.23% AUC, respectively. The accuracy of the 2DCNN model increased from 95.20% to 96.45% when the model was trained on the augmented CRD data set. Similarly, the AUC value of the 2DCNN model increased from 96.00% to 97.23%. The other evaluation metrics also improved with data augmentation.
With the augmented chest X-ray COVID-19 and Pneumonia (CXI) data set, the 2DCNN model achieved 97.65% Accuracy, 99.10% Specificity, 97.86% Sensitivity/Recall, 99.80% Precision, 99.87% MCC, 97.73% F1-score, and 99.23% AUC, respectively. Owing to data augmentation, the 2DCNN was trained effectively, which ultimately increased its predictive performance. With data augmentation, the accuracy increased from 97.02% to 97.65%, which demonstrates that the predictive capability of the model increased. Similarly, the MCC value increased from 99.26% to 99.87%, and the AUC value improved from 99.00% to 99.23%.

ResNet-50 Model Performance Evaluation on Original and Augmented Data Sets
The performance of the ResNet-50 model was evaluated on the original and augmented versions of the two chest X-ray data sets. The ResNet-50 transfer learning CNN model was trained on both types of data sets along with the other required hyperparameters. The SGD optimization algorithm with an LR of 0.0001 was used for model optimization. The number of epochs and the batch size were 100 and 120, respectively, for all experiments. To evaluate model performance, different evaluation metrics were computed and are reported in Table 4. The accuracy of the ResNet-50 model increased from 94.03% to 95.20% when the model was trained on the augmented CRD chest X-ray data set. Similarly, the AUC value of the ResNet-50 model increased from 94.20% to 95.00%. The other evaluation metrics also improved with data augmentation.
The ResNet-50 model performance has also been evaluated by using chest X-ray COVID-19 and Pneumonia (CXI) data set. According to Table 4

R2DCNNMC Performance Evaluation on Original and Augmented Data Sets
The performance of the R2DCNNMC model was evaluated on the original and augmented versions of the two chest X-ray data sets. The R2DCNNMC model was trained on both types of data sets along with the other required hyperparameters. The SGD optimization algorithm with an LR of 0.0001 was used for model optimization. The number of epochs and the batch size were 100 and 120, respectively, for all experiments. For training and validation of the model, 70% and 30% of the data were used, respectively. To evaluate model performance, different assessment measures were computed and are reported in Table 5.
The performance of the R2DCNNMC model on the original and augmented COVID-19-Radiography (CRD) chest X-ray data sets is reported in Table 5. According to Table 5, the R2DCNNMC model on the original CRD data set achieved 97.66% Accuracy, 99.00% Specificity, 89.18% Sensitivity/Recall, 99.10% Precision, 99.30% MCC, 98.00% F1-score, and 97.03% AUC, respectively. On the augmented CRD data set, the R2DCNNMC model performed better than on the original data set, achieving 98.12% Accuracy, 99.28% Specificity, 93.00% Sensitivity/Recall, 99.56% Precision, 99.70% MCC, 98.23% F1-score, and 98.60% AUC, respectively. The accuracy of the R2DCNNMC model increased from 97.66% to 98.12% when the model was trained on the augmented CRD data set. Similarly, the AUC value of the R2DCNNMC model increased from 97.03% to 98.60%. The other evaluation metrics also improved with data augmentation.
The performance of the R2DCNNMC model was also checked on the chest X-ray Covid-19 and Pneumonia (CXI) data set. According to Table 5, the R2DCNNMC model obtained 98.17% Accuracy, 100.00% Specificity, 96.25% Sensitivity/Recall, 99.24% Precision, 99.70% MCC, 99.46% F1-score, and 99.23% AUC, respectively, on the original data set. With the augmented CXI data set, the R2DCNNMC model achieved 99.45% Accuracy, 99.63% Specificity, 96.99% Sensitivity/Recall, 100.00% Precision, 99.83% MCC, 99.78% F1-score, and 99.90% AUC, respectively. Owing to data augmentation, the R2DCNNMC was trained effectively, which ultimately increased its predictive performance. With data augmentation, the accuracy increased from 98.17% to 99.45%, which demonstrates that the predictive capability of the model increased. Similarly, the MCC value increased from 99.70% to 99.83%, and the AUC value improved from 99.23% to 99.90%.
In Table 6, we compare the accuracy of the proposed R2DCNNMC model with that of the baseline methods. Table 6 shows that the proposed R2DCNNMC model achieved 98.12% accuracy on the CRD data set, which is higher than the baseline methods. Similarly, the proposed R2DCNNMC model achieved 99.45% accuracy on the CXI data set, which is higher than the baseline models. The excellent predictive performance of the proposed model demonstrates that it correctly detected COVID-19 and that it can be easily deployed in E-healthcare for COVID-19 diagnosis.
COVID-19 is rapidly spreading, and many people are suffering and dying as a result of this global pandemic. Accurate and timely diagnosis is a significant medical challenge for effective COVID-19 control and treatment. Various techniques are used to control and diagnose this disease. Soft computing-based COVID-19 diagnosis methods are widely used, and numerous AI-based methods have been proposed by various researchers. However, these methods continue to suffer from a lack of accuracy in diagnosing COVID-19 patients.
The COVID-19 disease has a significant impact on the human respiratory system, and the lungs lose functionality quickly. Thus, using chest X-ray images to diagnose COVID-19 patients is an appropriate method that clinical professionals typically use. However, due to human error, medical doctors' interpretation of chest X-ray images to diagnose COVID-19 is insufficiently accurate. As a result, AI-based interpretation methods for distinguishing between normal and COVID-19 patient chest X-ray images are more effective.
Deep-learning-based detection of COVID-19 from chest X-ray images is important for accurate diagnosis. The CNN model has significant applications in medical image classification [34]. It extracts deep features from image data, and these features help in the final classification.
To tackle the problem of accurate COVID-19 diagnosis, in this research study we have proposed a model that employs CNN, data augmentation, and transfer learning techniques. The CNN model is used for deep feature extraction and classification. Data augmentation and transfer learning techniques are used to improve the predictive capability of the CNN model. Two COVID-19 chest X-ray image data sets are used for validation of the proposed model. These data sets are not sufficient for effective training of the model. Hence, we used the data augmentation [47] technique to increase the size of these data sets, train the model effectively, and achieve excellent performance. The experimental results show that the proposed model obtained high performance on both the original and augmented data sets as compared to baseline methods. The major findings of this study are as follows: Firstly, the accuracy of the 2DCNN model increased from 95.20% to 96.45% when the model was trained on the augmented CRD data set. Similarly, the AUC value of the 2DCNN model increased from 96.00% to 97.23%. For the 2DCNN model with the augmented CXI data set, the accuracy improved from 97.02% to 97.65%, the MCC value increased from 99.26% to 99.87%, and the AUC value improved from 99.00% to 99.23%. Thus, these results demonstrate that the predictive capability of the model increased with data augmentation.
Secondly, with transfer learning techniques incorporated into the 2DCNN model, the accuracy increased from 97.66% to 98.12% on the CRD data set and from 98.17% to 99.45% on the CXI data set.
Thirdly, the proposed model (R2DCNNMC) obtained 98.12% classification accuracy on the CRD data set and 99.45% classification accuracy on the CXI data set, which is higher than the baseline methods. Owing to the higher predictive performance of the proposed model, we recommend it for accurate diagnosis of COVID-19 in E-healthcare.

Conclusions
Deep learning algorithms, particularly convolutional neural networks, are commonly used to analyze medical image data. The accurate diagnosis of COVID-19 is a critical issue, and a new accurate diagnosis method is needed to address it. Hence, to diagnose COVID-19 accurately, we have proposed the R2DCNNMC model, which is based on deep learning and transfer learning. In designing the proposed model, we used a 2DCNN model for deep feature extraction and classification of chest X-ray image data for the recognition of COVID-19. Two data sets were used for the validation of the proposed model. Furthermore, data augmentation techniques were used to increase the data set size for effective training of the proposed model. In addition, hold-out cross-validation was applied and model assessment measures were computed for model evaluation.
The experimental results demonstrate that the proposed R2DCNNMC diagnosis model achieved very high performance, with 98.12% classification accuracy on the CRD data set and 99.45% classification accuracy on the CXI data set, which is higher than the baseline methods. We recommend the proposed method for effective COVID-19 identification in E-healthcare due to its high predictive performance. In the future, we will use advanced models of transfer learning, federated learning, and deep learning, as well as other types of data sets, to diagnose COVID-19.
Data Availability Statement: The data sets used in this study are available in public repositories.

Conflicts of Interest:
The authors declare that they have no conflict of interest.

Abbreviations
The following section describes the mathematical notations and abbreviations used in this work.

X: Data set
X_train: Chest X-ray training data set
X_test: Chest X-ray test data set
Y: Predicted output classes label
X: ImageNet data set
b: Batch size