Deep MLP-CNN Model Using Mixed-Data to Distinguish between COVID-19 and Non-COVID-19 Patients

The limitations and high false-negative rates (30%) of COVID-19 test kits have been a prominent challenge during the 2020 coronavirus pandemic. Manufacturing those kits and performing the tests require extensive resources and time. Recent studies show that radiological images like chest X-rays can offer a more efficient solution and faster initial screening of COVID-19 patients. In this study, we develop a COVID-19 diagnosis model using a Multilayer Perceptron and Convolutional Neural Network (MLP-CNN) for mixed data (numerical/categorical and image data). The model predicts and differentiates between COVID-19 and non-COVID-19 patients, such that early diagnosis of the virus can be initiated, leading to timely isolation and treatments to stop further spread of the disease. We also explore the benefits of using numerical/categorical data in association with chest X-ray images for screening COVID-19 patients, considering both balanced and imbalanced datasets. Three different optimization algorithms are used and tested: adaptive learning rate optimization algorithm (Adam), stochastic gradient descent (Sgd), and root mean square propagation (Rmsprop). Preliminary computational results show that, on a balanced dataset, a model trained with Adam can distinguish between COVID-19 and non-COVID-19 patients with the highest accuracy (96.3%). On the imbalanced dataset, the model trained with Rmsprop outperformed all other models by achieving an accuracy of 95.38%. Additionally, our proposed model outperformed selected existing deep learning models (considering only chest X-ray or CT scan images) by producing an overall average accuracy of 94.6% ± 3.42%.


Introduction
With the advent of the novel coronavirus (SARS-CoV-2) in December 2019, first detected in the city of Wuhan, China, there was a major outbreak of the associated disease (COVID-19), which causes severe acute respiratory syndrome. More importantly, this virus can be transmitted directly from human to human, making it difficult to contain. COVID-19 was rapidly observed in virtually all countries, triggering a severe public health crisis worldwide [1,2]. As a consequence, the World Health Organization (WHO) recognized this public health emergency as an ongoing pandemic on 11 March 2020 [3]. Coronaviruses (CoV) belong to a large family of viruses that cause diseases ranging from the common cold to Middle East Respiratory Syndrome (MERS-CoV) and Severe Acute Respiratory Syndrome (SARS-CoV) [4].
As of 30 August 2020, the number of coronavirus cases worldwide was approximately 25.3 million, with the total number of deaths surpassing 849,958 and an associated mortality rate [...] Goshal and Tucker (2020) and Wang and Wong (2020) also developed a Convolutional Neural Network (CNN) to classify COVID-19 and non-COVID-19 cases using X-ray images, with approximately 92.9% and 83.5% accuracy, respectively [36,37].
Additionally, numerous other recent studies have been carried out with CT images using several deep learning models [38][39][40][41]. Likewise, machine learning (ML) algorithms using numerical/categorical data have also been utilized for the diagnosis of COVID-19. A number of studies [34,39,42] developed machine learning models based on Lasso regression and multivariate logistic regression for early identification of COVID-19 patients. Some of the more significant factors in these studies were age, temperature, heart rate, blood pressure, fever, sex, uric acid, triglyceride, and serum potassium.
Even though the Centers for Disease Control and Prevention (CDC) currently does not recommend it, many studies in this field of research use chest radiography or CT scan images to diagnose COVID-19 [43,44]. For instance, a recent report in the journal Applied Radiology (22 March 2020) [44] claimed that using radiological images alone leads to patients with ARDS (acute respiratory distress syndrome [45]) or SARS (severe acute respiratory syndrome [46]) being detected as COVID-19 patients, which is a drawback since the diseases are misclassified. Articles by Greenfieldboyce [47] and Jewell [48] suggested that patient information such as age, gender, temperature, and chronic disease history is a significant predictor for identifying COVID-19 patients. Keeping this in mind, some studies in this field also use numerical or categorical information such as age, gender, body temperature, and chronic disease history for the diagnosis of COVID-19. For instance, Bai et al. (2020) use CT images (image data) and a combination of demographics, signs, and symptoms (numerical/categorical data) to establish an Artificial Intelligence (AI) model that predicts which patients with mild symptoms have potential for malignant progression [49]. However, none of the previous studies considered numerical data, categorical data, and chest X-ray images in combination. Thus, developing a model that combines numerical/categorical data with chest X-ray images may create a new reliable alternative to screen patients with COVID-19 symptoms.
Taking these opportunities into account, our study focuses on mixed-data analysis using both image and numerical/categorical data to assist the early diagnosis of COVID-19 patients using a deep learning approach. A deep Multilayer Perceptron-Convolutional Neural Network (Deep MLP-CNN) model is proposed, considering the age, gender, temperature, and chest X-ray images of patients. The model was tested under two conditions: a balanced dataset (containing 13 COVID-19 and 13 non-COVID-19 patients), henceforth referred to as Study One, and an imbalanced dataset (containing 112 COVID-19 and 30 non-COVID-19 patients), referred to as Study Two.

Dataset and Methodology
We adopted a COVID-19 dataset containing both X-ray images and numerical/categorical data for each patient, collected from the open-source GitHub repository shared by Dr. Joseph Cohen [50]. This database is continuously being updated with data shared by several entities around the world and has been used by many studies for detecting COVID-19 patients with various data mining techniques. At the time of our study, the dataset contained data from 184 different patients with information such as age, gender, temperature, survival, intubation, partial pressure of oxygen dissolved in the blood (PO2), and classification as COVID-19, SARS, Pneumocystis, E. coli, Streptococcus, or "no findings" patients. For simplicity, we organized the dataset into two groups: COVID-19 patients, and all others as non-COVID-19 patients (Figure 1). One of the challenges associated with this dataset was the missing data for select parameters across patients. In consideration of that limitation, for Study One (balanced dataset), a small dataset was set up with 13 COVID-19 and 13 non-COVID-19 patients considering age, gender, temperature, and chest X-ray images as variables. Since there were numerous missing entries in the temperature column, only rows with complete information for the aforementioned variables were taken into account. No statistically significant difference (p values were obtained using chi-square tests (*) and t-tests (**)) was found between the COVID-19 (6 female, 7 male) and non-COVID-19 (5 female, 8 male) groups in terms of sex distribution (p * = 0.69 > 0.05) or mean age and temperature (p ** = 0.49 > 0.05). In contrast, for Study Two the size of the dataset was enlarged by ignoring the "Temperature" column entirely. In this case, an imbalanced dataset was constructed with information from 142 patients (112 COVID-19, 30 non-COVID-19) to compare and contrast the model's performance under class imbalance.
No statistically significant difference was observed between the COVID-19 and non-COVID-19 groups with regard to sex distribution (p * = 0.34 > 0.05) or mean age (p ** = 0.06 > 0.05). Table 1 summarizes the datasets used for both studies. The MLP-CNN models were implemented, and their computational times measured, using Anaconda modules with Python 3.7 on an office-grade laptop with common specifications (Windows 10, Intel Core i7-7500U, and 16 GB of RAM).
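As a sanity check on the group comparison above, the sex-distribution test for Study One can be recomputed from the counts given in the text (6 female and 7 male COVID-19 patients; 5 female and 8 male non-COVID-19 patients). The sketch below is not the authors' code: it assumes a Pearson chi-square test without continuity correction, and the function name `chi_square_2x2` is illustrative only.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: COVID-19, non-COVID-19; columns: female, male (Study One counts).
study_one_sex = [[6, 7], [5, 8]]
chi2, p_value = chi_square_2x2(study_one_sex)
print(f"chi2 = {chi2:.3f}, p = {p_value:.2f}")
```

Under these assumptions the computed p value rounds to 0.69, which agrees with the value reported for the sex distribution in Study One.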

Proposed Model
Neural Networks (NN) have recently shown more promising results than traditional machine learning (ML) algorithms like linear regression, logistic regression, and Random Forest on high-dimensional datasets, primarily when they contain combined numerical, categorical, and image data [51]. Classical ML approaches may perform better with a small dataset, as they are computationally inexpensive and easily interpretable. However, once the size of the data increases (big data), handling it becomes challenging for traditional ML approaches. Conversely, deep NN methods offer an opportunity to develop more robust models that perform well on both small and large datasets, mainly due to recent advancements in NN approaches such as transfer learning, Recurrent Neural Networks (RNN), and CNN. Additionally, classical ML approaches often require sophisticated feature engineering or dimensionality reduction [52]. In contrast, deep NN methods provide better feature engineering, can be implemented directly, and achieve good results [53].
We developed a deep learning-based model inspired by [54]. Our choice of this architecture was motivated by its predictive performance on visual and textual features, addressed in many recent papers [55][56][57][58]. Our proposed model is a combination of a Multilayer Perceptron (MLP) and a Convolutional Neural Network (CNN). The MLP was used to handle the numerical/categorical data, while the CNN was used to extract features from the X-ray images. Parameter tuning was performed to improve the model's performance, mainly on the number of hidden layers, number of neurons, epochs, and batch size. At first, the number of hidden layers and the number of neurons were set randomly; the optimal parameters were later determined using the grid search method. The parameters selected by grid search were as follows: Learning Rate = 0.001, Batch Size = 5, Epochs = 50. Finally, the proposed MLP model was combined with the CNN architecture, as suggested by [54]. As shown in Figure 2, the highest accuracy (100%) was achieved at 50 epochs, while the training loss was reduced by up to 85%. Table 2 shows how different numbers of neurons and hidden layers affect the MLP models. Based on our experiment, with two hidden layers and four neurons, it is possible to achieve 100% accuracy while reducing the loss by up to 100%.
Table 2. Model performance with different numbers of neurons and hidden layers.
To obtain the best model, optimization algorithms need to be applied during the training phase [59]. For that purpose, we tested three popular optimization algorithms: adaptive learning rate optimization algorithm (Adam) [60], stochastic gradient descent (Sgd) [61], and root mean square propagation (Rmsprop) [62].
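The grid search mentioned above amounts to an exhaustive loop over hyperparameter combinations. The following is a minimal sketch rather than the authors' implementation: the candidate values in `param_grid` are hypothetical (the text reports only the selected values), and `placeholder_score` is an arbitrary stand-in for "train the model and return its validation accuracy", deliberately constructed to peak at the reported settings so the example runs end to end.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Return the best-scoring parameter combination and its score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical candidate values; only the selected ones appear in the text.
param_grid = {
    "learning_rate": [0.01, 0.001, 0.0001],
    "batch_size": [5, 10, 20],
    "epochs": [25, 50, 100],
}


def placeholder_score(params):
    # Stand-in for real training and validation. It is NOT a measured
    # result; it simply peaks at the settings reported in the text.
    return -(
        abs(params["learning_rate"] - 0.001)
        + abs(params["batch_size"] - 5)
        + abs(params["epochs"] - 50)
    )


best_params, best_score = grid_search(param_grid, placeholder_score)
print(best_params)
```

In practice, `score_fn` would train the MLP-CNN with the given settings and return a cross-validated metric, which makes each grid point as expensive as a full training run.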

How Our Proposed MLP-CNN Model Works
We applied the Rectified Linear Unit (ReLU) as the activation of each neuron in the input and hidden layers and utilized the "linear" function in the final layer [63]. The first input layer of the MLP consists of eight neurons and takes the numerical/categorical data as a one-dimensional array. The hidden layer consists of four neurons and the final layer consists of one neuron. The proposed CNN model, in turn, contains three convolution layers along with three pooling layers (Max Pooling). The first hidden layer is a convolutional layer with 16 feature maps, each with a kernel size of 64 pixels and a ReLU activation function. Then, we defined a pooling layer that takes the maximum value, configured with a pool size of (2,2). The pooling layers are followed by a dense layer with 16 neurons and a ReLU activation function. The next layer is another dense layer with four neurons. Two individual outputs emerge from the two separate models: one from the MLP model and the other from the CNN model. Both outputs are concatenated and treated as a single input, which is followed by two additional dense layers consisting of four neurons. The Keras functional API was utilized to concatenate the MLP and the CNN models, as it makes it possible to develop models with multiple inputs and outputs. Typically, such models merge inputs from different layers using an additional layer that combines several tensors, as shown in Figure 3, which illustrates the overall diagram of our proposed end-to-end model. In summary, we encoded the numerical/categorical inputs and the chest X-ray input as vectors, then concatenated these vectors. Finally, the output layer has one neuron for the two classes and a linear activation function to provide probability-like predictions for each class.
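The two-branch architecture described above can be sketched with the Keras functional API. This is an illustrative reconstruction under stated assumptions, not the authors' code: the input image shape (64, 64, 3), the 3x3 convolution kernels, the filter counts of the second and third convolution layers (32 and 64), the reading of "two additional dense layers" as two layers of four neurons each, and the helper names `build_mlp` and `build_cnn` are all assumptions that the text does not specify.

```python
from tensorflow.keras.layers import (
    Conv2D,
    Dense,
    Flatten,
    Input,
    MaxPooling2D,
    concatenate,
)
from tensorflow.keras.models import Model


def build_mlp(num_features):
    """MLP branch for the numerical/categorical inputs."""
    inputs = Input(shape=(num_features,))
    x = Dense(8, activation="relu")(inputs)
    x = Dense(4, activation="relu")(x)
    return Model(inputs, x)


def build_cnn(image_shape, filters=(16, 32, 64)):
    """CNN branch: three convolution + max-pooling stages, then dense layers."""
    inputs = Input(shape=image_shape)
    x = inputs
    for n_filters in filters:
        # Kernel size (3, 3) is an assumption; only the first layer's
        # 16 feature maps and the (2, 2) pool size come from the text.
        x = Conv2D(n_filters, (3, 3), padding="same", activation="relu")(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Flatten()(x)
    x = Dense(16, activation="relu")(x)
    x = Dense(4, activation="relu")(x)
    return Model(inputs, x)


# Three tabular features (age, gender, temperature) and an assumed image size.
mlp = build_mlp(num_features=3)
cnn = build_cnn(image_shape=(64, 64, 3))

# Concatenate the two branch outputs and add the shared head.
merged = concatenate([mlp.output, cnn.output])
x = Dense(4, activation="relu")(merged)
x = Dense(4, activation="relu")(x)
output = Dense(1, activation="linear")(x)

model = Model(inputs=[mlp.input, cnn.input], outputs=output)
model.summary()
```

The model takes a list of two arrays at training time, for example `model.fit([tabular_array, image_array], labels, ...)`. The sketch keeps the linear output activation stated in the text; a sigmoid output is the more common choice for binary classification when calibrated probabilities are needed.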

Experiment Setup
The performance of the model was evaluated using 5-fold cross-validation for both studies (Study One and Study Two). The experiment was repeated five times (as shown in Figure 4), and the overall performance of the model was computed by averaging the outcomes of all five folds. The results are presented in terms of accuracy, precision, recall, and F1 score with 95% confidence intervals [64]:
\[ \text{Accuracy} = \frac{t_p + t_n}{t_p + t_n + f_p + f_n} \]
\[ \text{Precision} = \frac{t_p}{t_p + f_p} \]
\[ \text{Recall} = \frac{t_p}{t_p + f_n} \]
\[ \text{F1 score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]
where:
• True Positive ($t_p$) = COVID-19 patients classified as patients
• False Positive ($f_p$) = Healthy individuals classified as patients
• True Negative ($t_n$) = Healthy individuals classified as healthy
• False Negative ($f_n$) = COVID-19 patients classified as healthy
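The four metrics and their averaging across folds can be illustrated with a short sketch. The per-fold confusion counts below are hypothetical and serve only to exercise the code; they are not results from this study. The confidence interval uses a normal approximation (1.96 standard errors), which is an assumption, since the text does not state how the intervals were computed.

```python
import math
from statistics import mean, stdev


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mean_with_ci95(values):
    """Mean and half-width of a normal-approximation 95% interval."""
    half_width = 1.96 * stdev(values) / math.sqrt(len(values))
    return mean(values), half_width


# Hypothetical (tp, fp, tn, fn) counts for five folds, for illustration only.
fold_counts = [(4, 0, 2, 0), (3, 1, 2, 0), (4, 0, 1, 1), (3, 0, 2, 0), (3, 0, 2, 0)]
fold_metrics = [classification_metrics(*counts) for counts in fold_counts]

for name in ("accuracy", "precision", "recall", "f1"):
    avg, half_width = mean_with_ci95([m[name] for m in fold_metrics])
    print(f"{name}: {avg:.3f} +/- {half_width:.3f}")
```

With only five folds, a Student's t multiplier would give a wider interval than the normal approximation used here.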

Computational Results
At first, as a means of identifying appropriate training and testing set ratios for validation, we split our data into the following training set/testing set ratios: 75:25, 70:30, 60:40, 85:15, and 80:20. Such split ratios are commonly used in deep learning techniques for model evaluation and validation [65][66][67]. The best results in terms of training and testing accuracy were found when the dataset was split randomly into 80% and 20% for training and testing sets, respectively. To exemplify that, Table 3 presents the performance of our proposed models with different ratios of randomly split data between training and testing. Since the dataset is comparatively small, reducing the training data also reduces the model's ability to achieve better accuracy. In contrast, enlarging the training set leaves too few datapoints for testing to confidently measure the model's overall performance. The training stage was carried out for no more than 50 epochs to avoid overfitting. A graphical illustration of the model's overall performance using Adam and Rmsprop is presented for the 5th fold in Figure 5. Each model's average performance on both the balanced (Study One) and imbalanced (Study Two) datasets, along with 95% confidence intervals, is displayed in Table 4. For the balanced dataset, Adam had the highest accuracy (96.3%), precision (97.2%), recall (96.3%), and F1 score (96.4%) compared to the other two models, trained with Rmsprop and Sgd. Rmsprop outperformed all other models on the imbalanced dataset. Considering the overall performance on both datasets (average of both studies), the model trained with Adam is the best in terms of accuracy (94.6% ± 3.4%), precision (93.5% ± 3.7%), recall (94.5% ± 3.5%), and F1 score (93.5% ± 3.7%). Overall execution time for both datasets is shown in Figure 6.
The lowest registered execution time was 53 s for the model trained with Rmsprop on the balanced dataset, whereas the maximum execution time was 79 s when the model was trained with Sgd. Conversely, for the imbalanced dataset, Adam showed the lowest execution time of 138 s, while Rmsprop displayed the maximum execution time of 163 s. In conclusion, when both studies are considered together, the average execution time for Adam was the lowest of the three. To evaluate the predictive performance of each model, confusion matrices were generated. Figure 7 shows confusion matrices for the models trained with Adam, Rmsprop, and Sgd, respectively, for fold 5. In Study One, the test set contained 6 patients, of whom 4 were COVID-19 and 2 were non-COVID-19.
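The random train/test splits compared in this section can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the test-set size is rounded up, which is consistent with the six-patient test set reported for the 26 patients of Study One at an 80:20 ratio, and the function name and seed are hypothetical.

```python
import math
import random


def train_test_split_indices(n_samples, test_ratio, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Rounding the test size up is an assumption of this sketch.
    n_test = math.ceil(n_samples * test_ratio)
    return indices[n_test:], indices[:n_test]


# Study One has 26 patients; try each ratio listed in the text.
for test_ratio in (0.25, 0.30, 0.40, 0.15, 0.20):
    train_idx, test_idx = train_test_split_indices(26, test_ratio)
    print(f"test ratio {test_ratio:.2f}: "
          f"{len(train_idx)} train / {len(test_idx)} test")
```

With so few test patients, a single misclassification shifts test accuracy by roughly 17 percentage points at the 80:20 ratio, which is why averaging over several folds matters for this dataset.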

Discussion
In this study, we proposed and evaluated an MLP-CNN based model that can distinguish between patients with and without COVID-19, and demonstrated the advantage of combined MLP-CNN models over traditional CNN or MLP models used exclusively for that purpose. Our combined model achieved an accuracy of around 96.3% (using the Adam optimization algorithm), compared with a few published studies that used only CNN [36,37,39] or traditional ML [41] approaches. On the one hand, MLP models are fast and time-efficient when used with numerical and/or categorical data only. On the other hand, CNN models are notably more accurate in extracting useful features from chest X-ray images for respiratory disease diagnosis. For instance, Wang and Wong [37] and Khan et al. [68] used CNN-based approaches to detect the onset of COVID-19 disease using chest X-ray images and achieved an accuracy of 83.5% and 89.6%, respectively. In comparison, our combined model demonstrated an accuracy of around 96.3% on the balanced dataset and around 95.4% on the imbalanced dataset.
In Study One, our model learned from only 26 subjects (13 COVID-19 and 13 non-COVID-19), which represents 18% of the data used by Zhang et al. [69] and 2% of the data used by Shi et al. [41] (see Tables 5 and 6). Therefore, our proposed model may be used as a useful computer-aided diagnosis tool for low-cost and fast COVID-19 screening when only small datasets are available.
Table 5. Comparison of the proposed COVID-19 diagnostic method (MLP-CNN) with other deep learning methods developed using chest X-ray images.
Both balanced and imbalanced data were considered for the experiments, achieving an average accuracy of around 95% (96.3% from Study One and 95.4% from Study Two). Finally, our model can be easily adopted by healthcare professionals, as it is cost- and time-effective, which accelerates COVID-19 screening procedures and enables patients with the disease to be isolated at earlier stages. Real-time screening of COVID-19 patients using MLP-CNN approaches might be possible with minimal human interaction, provided that chest X-ray images and other relevant information such as the age, gender, and temperature of the respective patients are available. Additionally, AI-based screenings can be tailored to present a low degree of complexity to the end user, and may not require training technicians in the complex computational tools described herein. We identify the following limitations of our study, which present immediate opportunities for future investigations: 1. the size of the dataset adopted is comparatively small, and 2. only four numerical and categorical parameters were considered.

Conclusions
In this study, we proposed an MLP-CNN based model for early diagnosis of patients with COVID-19 symptoms considering mixed input data, specifically numerical/categorical data (age, gender, and temperature) and image data (chest X-ray images). Our results have shown that using input data of mixed nature enables the development of highly accurate models with small and balanced datasets (96.30% accuracy) for COVID-19 patient identification. Moreover, on larger and imbalanced datasets, our model performed notably well (95.4% accuracy) compared to similar models proposed by other authors, as shown in Table 6.
In conclusion, our study provides valuable insights into the development of a more robust screening system that supports healthcare providers in the identification of COVID-19 patients, such that individuals carrying the disease can be screened and isolated at an earlier stage. Our contributions are in line with the focus areas of global-scale initiatives such as the Rapid Assistance in Modelling the Pandemic (RAMP) [70] and associated literature focusing on the modeling of the pandemic. Specifically, we have developed a tool that partially addresses key points and opportunities in COVID-19 research, especially those with respect to medical care and monitoring of the contagion, as discussed by Bellomo et al. (2020) [71]. Future studies should apply these methods to larger datasets with more images and complete patient information, work with highly imbalanced data, apply mixed-data analysis using kernel methods, and consider data containing the geographical location of patients.