An Improved Machine-Learning Approach for COVID-19 Prediction Using Harris Hawks Optimization and Feature Analysis Using SHAP

A healthcare monitoring system needs the support of recent technologies such as artificial intelligence (AI), machine learning (ML), and big data, especially during the COVID-19 pandemic. This global pandemic has already taken millions of lives. Both infected and uninfected people have generated big data where AI and ML can use to combat and detect COVID-19 at an early stage. Motivated by this, an improved ML framework for the early detection of this disease is proposed in this paper. The state-of-the-art Harris hawks optimization (HHO) algorithm with an improved objective function is proposed and applied to optimize the hyperparameters of the ML algorithms, namely HHO-based eXtreme gradient boosting (HHOXGB), light gradient boosting (HHOLGB), categorical boosting (HHOCAT), random forest (HHORF) and support vector classifier (HHOSVC). An ensemble technique was applied to these optimized ML models to improve the prediction performance. Our proposed method was applied to publicly available big COVID-19 data and yielded a prediction accuracy of 92.38% using the ensemble model. In contrast, HHOXGB provided the highest accuracy of 92.23% as a single optimized model. The performance of the proposed method was compared with the traditional algorithms and other ML-based methods. In both cases, our proposed method performed better. Furthermore, not only the classification improvement, but also the features are analyzed in terms of feature importance calculated by SHapely adaptive exPlanations (SHAP) values. A graphical user interface is also discussed as a potential tool for nonspecialist users such as clinical staff and nurses. The processed data, trained model, and codes related to this study are available at GitHub.


Introduction
The world is now facing a challenge to eradicate COVID-19 because of its continuous spreading [1]. People in Wuhan City, China, were the first to be affected by the virus in December 2019 [2,3]. Then it spread from person to person, creating the latest global

Related Works
Rajaraman and Antani [11] developed a deep-learning framework for COVID-19 detection using chest X-ray (CXR) data collected from COVID-affected people and then analyzed it using a convolution neural network (CNN) to evaluate the system. Yan et al. [12] proposed a model to detect severely affected COVID-19 patients using three clinical attributes and an ML algorithm. They used eXtreme gradient boosting (XGB) to predict their model. Another framework using ML and inpatient facility data for early detection of COVID-19 was presented by Awal et al. [13]. Here, Bayesian optimization was used to select optimal hyperparameters of the ML algorithms, and the adaptive synthetic (ADASYN) algorithm was applied to balance the dataset. The model showed 98.50% ACC using the XGB classifier. Here, the important features were also estimated using SHAP analysis, and a graphic interface was also presented for supporting medical staff. Kassania et al. [14] developed an ML-based COVID-19 detection system using CXR images where they applied a deep CNN to extract essential features. Two deep-learning techniques named DenseNet121 and ResNet50 were used for classification, and their accuracy was 99% and 98%, respectively. On the other hand, Saha et al. [15] proposed an automated diagnosis system for COVID-19 using CXR based on the ensemble method and CNN. Deep features extracted using CNN were utilized to combine four binary classifiers: support vector machine, random forest, AdaBoost and decision tree. Thus, they were able to build an ensemble of classifiers for better detection of COVID-19. The system showed better performance with 98.91% ACC. Rasheed et al. [16] presented the COVID-19 diagnosis system from a chest X-ray image using logistic regression (LR) and CNN. A principal component analysis (PCA) was used to reduce the dimension that increases the speed of the learning process. The CNN and LR showed 95.2-97.6% ACC without using PCA, whereas 97.6-100% ACC was obtained with PCA for COVID-19 detection.

Contributions
There are also many other methods for detecting COVID-19 where the ML algorithms are used, and CT (Chest) or CXR images are used as the input data. Generally, people need to go to the hospital to test for COVID-19, which is risky and time-consuming. Clinical data collected from home could be utilized in association with the ML-based systems for detecting COVID-19, where the inpatient facility data is used as the input dataset. The dataset contains fever, lack of smell and taste, cough, and chronic diseases such as diabetes, asthma, etc. Different boosting classifiers such as XGB, light gradient boosting (LGB), gradient boosting classifier (GBC), categorical boosting (CatBoost), random forest (RF), etc., could be used to train the model. Then these models could be combined to build ensemble classifiers that can increase performance. There are some hyperparameters in each classifier that can be optimized. Here we have used HHO to tune the hyperparameters, and it is expected that this model will be helpful and user-friendly for faster detection of COVID-19. The model's performance could be evaluated using different metrics and graphic representations.
The main contributing part of this paper is described as follows: • Since COVID-19 is a highly contagious disease, hospitals and diagnostic centers need extra precautions to test for COVID-19, ultimately increasing the costs and health hazards. The proposed model used the less expensive inpatient facility data which can be collected at home, instead of X-rays or CT scans to predict COVID-19 with the expectation of reducing patients' visits to the hospital and diagnostic center. • An ML framework is designed using HHO to detect COVID-19. • The hyperparameters of the boosting classifiers are optimized using our method, and then the ensemble classifier has increased the performance of our model. • The important features are estimated using SHapely adaptive exPlanations (SHAP) analysis. • The performance is compared with other existing models. • A decision support system and a clinically operable decision forest are created to support the medical staff.
The rest of the paper is organized as: Materials and Methods are described in Section 2. The experimental result is presented in Section 3. In Section 4, we present an organized discussion and comparison of our proposed system with other existing methods. The paper ends with a conclusion in Section 6.

Data Source
In this work, we used extensive data of COVID-19 patients gathered from a publicly available source (https://www.COVID19survivalcalculator.com/en/download, accessed on 23 February 2021). The master dataset contains data collected from 1,023,426 individuals with 98.80% COVID-19 negative and 1.20% COVID-19 positive patients. The dataset contains 59 columns about behavior, segmentation, geographic, health conditions, medications and risk values [17]. This dataset contains multiple columns such as age, Body Mass Index (BMI), sex, alcohol, cannabis, contacts_count, COVID19_symptoms, COVID19_result (positive and negative), smoking, different chronic diseases such as asthma, kidney_disease, lung_disease, diabetes, etc. Among the 59 columns in the master dataset, there are 40 categorical columns which are almost double the number of numeric columns. There are about 13.8% missing values and no duplicate value or row. It can be observed from the COVID-19 dataset that the variables, such as region, income, insurance, immigrant, prescription_medicaton, etc., have a higher rate of missing value than others. For this reason, the variables mentioned above have not been included in Table 1. Proposed system architecture please see in Figure 1.

Data Preprocessing
Firstly, metadata such as survey_date, region, country, ip_latitude, ip_longitude, ip_accuracy, house_count, prescription_medication, opinion_infection, opinion_mortality, risk_infection, risk_mortality, nursing_home, etc., have been removed as they are not relevant to our research objective. After that, the missing values of all attributes are imputed using iterative imputation techniques, where the BayesianRidge estimator has been used at each state of the iteration [18][19][20]. The sex and smoking attributes are encoded with a label encoder. Furthermore, COVID-yes is a minority class in the original dataset we used, whereas COVID-no is a majority class. Therefore, we have tried to make a nearly balanced dataset by taking a 1 : 3 ratio as COVID-yes and COVID-no, respectively, using the undersampling approach. The details of the dataset are shown in Table 1.

Classification Algorithms
Five classifiers, XGB, LGB, SVC and RF, have been applied in the proposed framework. These are the most recent classification algorithms used for COVID-19 classification [13]. We can evaluate the performance of these classification models with other existing works. Moreover, these classifiers have some hyperparameters that can be optimized using our proposed HHO-based framework. Finally, the SHAP analysis explains these classification algorithms to determine the most important clinical features for detecting COVID-19.

XGB
The XGB model [21], associated with gradient boosting, combines weak classifiers to build a strong classifier in an iterative manner [22]. The residual is used to correct the previous predictor, and the loss function can be minimized at each iteration of gradient boosting. XGB adds regularization into the objective function to measure model performance [13].
The key hyperparameters of XGB are learning_rate, max_depth, gamma, min_child_weight, colsample_by_tree, subsample and alpha, where learning_rate controls the effect of adding new trees in order to prevent overfitting, max_depth limits a tree to the maximum number of nodes, gamma denotes loss reduction required for the next partition on a tree's leaf node, min_child_weight means the minimum sum of weight needed, colsample_by_tree denotes the subsample ratio to build each tree, subsample builds a tree using the ratio of columns and alpha is the L1 regularization term on weight. By tuning these hyperparameters, the classification performance of the proposed system is increased.

CatBoost
CatBoost, or categorical boosting, is an open-source ML algorithm developed by Yandex [1]. The "CatBoost" name comes from two words, "Category" and "Boosting" because CatBoost works with categorical features by converting categorical values into numbers. It can be used in ranking, recommendation systems, forecasting and and even personal assistants along with regression and classification. These advantages motivated us to choose CatBoost as the classifier for the proposed COVID-19 detection system.
The dominating hyperparameters of CatBoost are depth, colsample_bylevel, subsample, n_estimator, learning_rate, l2_leaf_reg, etc. The learning_rate and subsample have already been discussed in XGB. In addition, Depth represents the depth of the tree, L2_leaf_reg coefficient calculates the leaf value, n_estimator is the number of boosting iterations and col-sample_bylevel is the ratio of columns for each level in tree building. These hyperparameters are adequately optimized to increase the performance of CatBoost.

RF
RF combines different decision tree classifiers, where each classifier is obtained using a random vector from the input dataset. Each tree provides a unit vote for the target class to classify data [23]. The algorithm that is used for classification is given below.

1.
Let N be the original dataset. A classification tree is constructed using Bootstrap. After random selection from the original dataset, the remaining samples create the out-of-bag data.

2.
At first, we have selected n variables randomly from each node of each tree. Then, a constant m (m < n) is set, and we select m variables from n variables. After splitting the tree, the variable having the most classification ability is chosen from m variables based on the Gini index of the node impurity measurement. During classification, the threshold value of the variable is determined by checking each classification point. For a given training set N, we randomly selected one case and said it belongs to some class C i .

3.
Every tree grows up to its maximum without any pruning.

4.
The classification outcome is obtained by the maximum voting result of the classifier.
The important hyperparameters of RF are min_sample_split, max_depth, min_sample_leaf, n_estimator and criterion. Here, max_depth is the maximum level of the tree, min_sample_leaf represents the minimum amount of samples that a node must keep after getting split, min_sample_split is the slightest amount of samples kept by an internal node required to split into further nodes, n_estimator controls the number of decision tree, and criterion measures the quality of splits. 'The optimization technique tunes the hyperparameters of RF and thus magnifies the performance of the classification task.

LGB
In general, LGB, introduced by Ke et al. [24], relies on decision tree algorithms, and thus it produces a tree leaf-wise, while other algorithms grow depth-wise or level-wise (https://www.analyticsvidhya.com/blog/2017/06/which-algorithm-takes-the-crown-lightgbm-vs-xgboost/, accessed on 12 June 2017). The main advantages of LGB are that it can handle a large amount of data, has lower memory usage because of replacing the continuous value with discrete bins, can show better ACC than other boosting algorithms, and uses a histogram-based algorithm that makes the training process faster, etc.
The important hyperparameters of LGB are learning_rate, number_of_leaves, feature_fraction, bagging_fraction, max_depth, max_bin, subsample, colsample_tree, min_child_samples and min_data _in_leaf. Here, learning_rate, max_depth, subsample and colsample_tree are as previously discussed . In addition, number_of_leaves represents the number of leaves in the full tree, and the default value is 31; feature_fraction is considered in each iteration to set a fraction of the features; bagging_fraction helps execute bagging for getting faster results, max_bin is the maximum number of the bin; min_child_samples represent minimum number of data in one leaf; min_data_in_leaf is the lowest number of records of a leaf (https://www.analyticsvidhya. com/blog/2017/06/which-algorithm-takes-the-crown-light-gbm-vs-xgboost/, accessed on 12 June 2017). By optimizing these hyperparameters, the performance of LGB can be magnified.

Ensemble Methods
An ensemble method is the sum of several optimized classification algorithms. A class label is obtained by majority voting (hard voting) of individual classifiers [25]. The enumerated class label is then stored in a new class Cl j . Here five optimized classifiers such as LGB, CatBoost, XGB, SVC and RF are ensembled to predict COVID-19 accurately, and the block diagram of this ensemble method has been displayed in Figure 2. By way of illustration, XGB, LGB and Catboost provide the "COVID-yes" class (1, 1, 1), and RF provides the "COVID-no" class. Insight comes from the majority voting system, since most classifiers predict the "COVID-yes" class; therefore, this instant is treated as the "COVID-yes" class. The concept of the ensemble technique is illustrated in the following equation mathematically.ŷ = mode{cl 1 , cl 2 , cl 3 , cl 4 , cl 5 } For instance, if the class labels for LGB, CatBoost, XGB, RF and SVC are calculated as 0, 1, 1, 1, 0, thenŷ = mode{0, 1, 1, 1, 0} = 1 because 1 occurs in most of the classes.
arg min where h ∈ H represents the hyperparameters of the classifiers, such as h 1 , h 2 , h 3 , · · · , h n ∈ H, and Cl f is the ML classification algorithms. For the purpose of the optimization task, we need to determine a proper objective function. In this paper, we have proposed an improved objective function, which is defined as an average loss of the K = 10-fold cross-validation using F1-score ( f obj ) applied to the training dataset. Because F1-score can be useful even in the case of an imbalanced dataset as it takes both precision and recall into account, it is better than error and accuracy. The mathematical generalization of the proposed objective function ( f obj ) has been illustrated in Equation (3).
where the F1Score k of a particular classifier, Cl f in k th -fold can be calculated as, The optimization of hyperparameters is needed to achieve maximum performance on the data in a reasonable amount of time. For our COVID-19 dataset, a metaheuristic algorithm called HHO technique [30] is applied to tune the hyperparameter for precisely detecting COVID-19. The algorithm is designed based on the Harris hawk's team's cooperative behaviors of hunting and chasing patterns for the capture of prey in nature called surprise pounce. The prey get confused by the hawks from diversified directions. Harris hawks can efficiently choose chase type in accordance with the distinct patterns of prey [3].
In the proposed HHO-based ML framework for COVID-19 prediction, there are mainly three phases, which have been delineated sequentially as follows:

Exploration Phase
In the exploration phase, according to the concept of the HHO algorithm [31], hyperparameters of the ML classifier are explored for the global optimal position H glo_opt . The search strategies for optimal hyperparameters of ML classifiers can be expressed as: where H i (iter) and H i (iter + 1) represent the locations of hyperparameters of the classifiers at the present and next iterations (i = 1, 2, 3, · · · , N); N denotes the size of population; H glo_opt symbolizes the position of the global optimal hyperparameter; H rand denotes the randomly selected position of the hyperparameter; iter signifies total iterations executed; LB and UB typify the least and greatest value of each hyperparameter; Th, r 1 , r 2 , r 3 and r 4 are randomly selected numeric values between 0 and 1; H avg indicates the mean location of the hyperparameter, that can be calculated as:

Shifting between Exploration and Exploitation
In the shifting phase, the hyperparameters of the ML classifiers are explored and exploited as flows: where E 0 and E represent the initial and escaping energy, respectively. The HHO will be in the exploration phase if |E| ≥ 1 and in the exploitation phase, if |E| < 1.

Exploitation Phase
The exploitation phase is stated by defining a random number r ∈ [0, 1]. When 0.5 ≤ |E| < 1 and r ≥ 0.5, a soft beiase strategy is used to update the position of the hyperparameter.
where K is the random number between [0, 2]; ∆H i (iter) stands for the deviation between the current global optimal position of the hyperparameter and the individual current position, which can be mathematically defined as: When |E| < 0.5 and r ≥ 0.5, a hard beiase strategy [30] is applied to update the position of the hyperparameter.
When 0.5 ≤ |E| < 1 and r < 0.5, a soft beiase strategy [30] with a progressive rapid dive is used to update the position of the hyperparameter. where where D symbolizes the diminution of problem [31]; S rv describes the random vector with size 1 × D, where the elements of S rv exist in [0, 1]; Levy is the mathematical levy flight function; f obj ( * ) is the objective function. When |E| < 0.5 and r < 0.5, a soft beiase strategy with a progressive rapid dive is used to update the position of the hyperparameter. where By iterating up to maximum iterations (max_iter), the position of the global hyperparameter H glo_opt is obtained, and using these global optimal hyperparameters, we will establish our proposed ML model for COVID-19 prediction [32].
Using the phases illustrated above, the algorithm of HHO used for the best hyperparameter optimization of classifiers can be constructed as in [Algorithm 1].

Performance Evaluation Metrics
The performance of any classification/prediction algorithm is enumerated using a variety of evaluation metrics. In our proposed approach, the ACC, error, F1-score, SE, SP, MCC, Kappa index, balanced accuracy score, cross-validation score (CV-score), precision and AUC, have been utilized to compute the performance of the proposed research from the confusion matrix. In our HHO-based ML approach, the 2 × 2 confusion matrix has been used to evaluate the model through the metrics mentioned above. The entries along the principal diagonal represent the correctly classified outcomes. (The higher values of the metrics except the error mentioned above specify the better version of the model.) In addition, the precision-recall curve (PRC) and recall vs. precision boundary are presented to evaluate the performance. Furthermore, a receiver operating characteristic (ROC) has also been used to measure the classification outcomes obtained from the classifiers. The decision boundary threshold value of 0.5 has been considered to provide equal importance to "COVID-yes" and "COVID-no" classes.  for each (H i (iter)) do 6 Modify the initial energy E 0 and jumping strength K

7
Modify the escaping energy E using the following equation if |E| ≥ 1 then 10 Modify the position of the classifier's hyperparameter, H i (iter) using else if (|E| < 0.5 and r < 0.5) then 25 Modify the position of the classifier's hyperparameter, H i (iter) using the equations as illustrated below where M and N denote the number of input features and set of input features, respectively; z typifies the training data; z ∈ {0, 1} M represents the coalition vector (simplified features), where "1" and "0" denote the presence and absence of the corresponding features; ψ j ∈ R is the feature attribution for a feature j. The contribution of each feature can be allocated by obtaining ψ j from the following equation.
where S f tr is the subset of features, n is the total input features.

Experimental Result
This paper uses HHO to optimize the hyperparameter of five classification algorithms. The optimized values of the hyperparameters using HHO and their domain range are shown in Table 2. For a fair justification, this study used the same dataset for the same time frame with the same set of training and testing samples in all classifiers. Different statistical measures were calculated to estimate how our proposed ML model performs. Table 3 shows different performance measures of the COVID-19 dataset. This section depicts the precision versus recall curve and the recall rate versus decision boundary. Furthermore, the ROC curve has been presented to show the performance. Moreover, the feature analysis using SHAP has been presented in Section 3.5. In Section 4.1, a comparison with the previously accomplished related works has been provided [as shown in Table 4]. Finally, a potential application of our HHO-based framework has been outlined through DSS in Section 5. Table 3 shows that the ensemble method picks the peak point in the case of all classification metrics such as ACC (92.29%), CV-score (92.68%), balanced ACC (92.68%) and AUC (97.80%) with a very negligible error (7.32%), whereas the HHOXGB occupies the second-highest position. On the other hand, the average performance of the BernoulliNB, LDA and QLDA was very low for the COVID-19 dataset in terms of all classification measures. The (69.63%) accuracy of QLDA was the lowest compared to other classifiers considered in this proposed research. To show the effectiveness of HHO, an experiment has also been performed without HHO using SVC, Catboost, RF, XGB and LGB. It has been evidenced that all the classifiers except Catboost provided a lower performance in all classification metrics when HHO was not been utilized. For instance, ACC (83.16%), Cohen Kappa (66.31%) and Precision (83.16%) values of XGB without HHO remain at a lower value than that of the HHO-optimized XGB. The code without HHO is also provided in the Github repository; https://github.com/MIrazul29/HHO-ML (accessed on 18 March 2022).

Precision vs. Recall Curve
The PRC is generally utilized to get a clear classification overview that helps evaluate and compare the test result. The more the curve adjacent to the right corner of the graph is, the more the test result is perfect. Figure 3 shows that the PRC value is very close to the right corner of the curve, implying that the ensemble method provides superior performance, while NB stays at the lowermost point.

Recall vs. Decision Boundary
The recall rate has been enumerated in our proposed classification task depending on a fixed threshold (T). To exemplify, in Figure 4, 0.5 threshold value for "COVID-yes" class has been considered to demystify the relationship between the recall rate (displayed along the y-axis) and the decision boundary (visualized along the x-axis), where the best recall rate has been obtained from the KNN provided, 0.9268, implying that approximately 92.68% of the time, KNN accurately classifies the "COVID-yes" class, whereas HHOXGB correctly classifies the recall rate for "COVID-no".

ROC Curve
The probability/performance of a binary outcome while classifying or predicting is defined as the ROC curve. The graphical representation of the ROC curve in Figure 5 shows that the ACC of the ensemble method grabs the apex in terms of ACC, whereas the LDA remains at the lowest point.

Feature Importance (SHAP Value Analysis)
Feature importance represents how the input features are convenient in predicting a target variable. To accomplish this task, the technique of feature importance assigns a score to the input features and provides insights into the predictive model (classification model). Most importantly, the enumeration of feature importance upgrades the effectiveness and efficiency of the predictive modeling project. We have used the SHAP summary plot in this study. The use of the SHAP summary plot ( Figure 6) has two-fold advantages as we can see: feature ranking and the effect of each feature. The position in the y-axis determines the feature ranking in descending order (higher importance to lower importance). The effect of each feature is determined by x-axis SHAP values; positive SAHP values indicate a positive correlation with the target and vice versa. In addition, the feature value indicated by the red color represents a higher feature value, and blue indicates a lower feature value. The jagged overlapping points make sense of distribution.
The significance of the features for any classification or prediction can be effortlessly analyzed by sorting the features in plunging mode, where the most influential feature occupies the peak point. For instance, as manifested in Figure 6, the bar plot visualizes the top 20 features, where "COVID19_contact" is in the topmost position. The next dominating features are "COVID19_symptoms", "cannabis", "bmi", "age", "contact_counts", etc. In contrast, "hiv_positive" remains undersurface compared to the other features exhibited in the given diagram. Moreover, from Figure 6, it can be seen that higher "COVID-19_contact", "COVID19_symptom"and "health workers" show a negative SHAP value, i.e., negative correlation. It intuitively reveals that the higher values of this feature indicate less chance to survive, i.e., COVID-no in this case and vice versa. Note that the summary plot is a bird's eye view of the data. The SHAP dependence plot and SHAP feature interaction plot can be used to discuss a particular feature and particular instance. This may determine the effect of one specific feature in model performance improvement, which is beyond the scope of this paper, and we will keep it in our near future works.

Discussion
In this research, we have used the HHO technique, a metaheuristic optimization algorithm, to tune the hyperparameters of some state-of-the-art classifiers, including HHORF, HHOXGB, HHOCAT, HHOLGB and ensemble methods, and thus amplify the classification ACC. To enumerate the performance of the COVID-19 prediction task, we have considered several classification measures, such as ACC (92.68%), AUC (97.80%), error (7.32%), etc. In addition, some pictorial delineations, such as the PRC, recall vs. decision boundary curve, and the ROC curve, have been generated to visualize and prove the superiority of the performance of the proposed method. It is elucidated that the ensemble method occupied the peak point in terms of all classification measures. In contrast, HHOXGB, another state-of-the-art classification algorithm, captured the second-highest position. In the SHAP analysis ( Figure 6), the mean SHAP values, representing the average impact on model output magnitude, have been enumerated to select the most dominating features, where "COVID19_contact" was the most influential over the rest of the features existing in the dataset used.
It can be further added that our proposed approach is likely to be implemented not only for predicting COVID-19 patients but also for detecting numerous diseases, such as asthma, vaccination [34], hypertension [35], diabetes mellitus [36], etc. The integration of the moderately large-scale dataset and utilization of an independent dataset to validate the proposed framework before further clinical trials will tremendously intensify the feasibility of this research.

Comparative Study
The performance of our proposed study is superior to the other existing methodologies, which has been clarified by a comparative study illustrated in Table 4. It is evidenced by Jim et al. [37] who considered the concept of CNN and received an ACC of 92.50%. In addition, He et al. [38], Ahamad et al. [39] and Li et al. [40] implemented XGB on clinical data and obtained average ACC of 87%, which outperforms the ACC obtained by Brinati et al. [41], who simultaneously used DT and RF. Moreover, Chimmula and Zhang [42] employed an LSTM network and attained an ACC of 92.67%. From the detailed delineation mentioned above, it is clear that our proposed framework provides the best result in terms of all classification measures. Note that in Table 4, we have relied on the results published as the datasets used in those studies are not publicly available. This is a common as well as a complex problem. Even the results will be different on the same publicly available dataset due to different training and testing samples. Therefore, we have used big data and provided processed training and testing datasets as well as code on GitHub for reproducibility and future contributions and collaboration. Moreover, comparative studies presented in Table 4 used the same state-of-the-art classifiers we used. Therefore, from the classifier's point of view, the HHO-based machine learning model ensures the optimal use of state-of-the-art machine learning algorithms and improves performance.

Potential Application
The potential application of the proposed model can be effortlessly obtained through designing a decision support system (DSS). A DSS is a pictorial interpretation to display the probable state of the labels of the target variable. In our proposed HHO-based COVID-19 prediction model, a clinically executable DSS has been designed to portray the probable state of "COVID-yes"; in other words, it represents whether a patient is carrying "COVIDyes" or not in a probabilistic sense.
Note that the threshold value of 0.5, i.e., probability = 0.5, has been considered according to the general rule of machine learning. If the threshold value is less than or greater than 0.5, then there may arise Type-1 and Type-2 errors. Moreover, probability = 0.5 is the middle point between 0 and 1, which provides a balanced and unbiased calculation of sensitivity, specificity and other performance metrics. If the probability calculated by the ensemble method is greater than 0.5, it represents that a patient would have a higher chance of "COVID-yes". This graphical representation will benefit nonspecialist users such as clinical staff and nurses. The previously calculated probability from the ensemble method has been exercised to visualize an expected output of "COVID-yes" as represented in Figure 7. It is worth noting that in the upper figure in our predicted label is highly accurate to the true label except for the specific patient (marked by the red rectangle in Figure 7). Moreover, the corresponding probability of "COVID-yes" has been shown in the lower figure. The test data are sorted in escalating mode to initially visualize the "COVID-no" class and then the "COVID-yes" class.

Conclusions
This paper designs an ML framework to predict COVID-19 using a publicly available clinical dataset. Five up-to-the-minute ML classification algorithms have been applied, where the hyperparameters of each classifier have been optimized using the proposed HHO algorithm with an improved objective function. The proposed approach aims to work in a real-time inpatient facility dataset so that the method can be user-friendly and cost-effective. The outcome of the proposed system has been evaluated by some metrics such as ACC, F1-score, MCC, Kappa index, etc. A DSS has been created to display the probability of a COVID-19 attack as a potential application of our proposed system. This can be very useful from the patient end as well as from a clinical point of view. The clinical staff and caregivers can easily screen COVID-19 patients. Furthermore, the proposed optimization technique, applied to optimize the hyperparameters of the ML classifiers, can easily be adapted for other disease predictions such as diabetes, asthma and hypertension. The feasibility of the clinical trials of the proposed research can be ensured by implementing the proposed framework on a larger dataset. Last but not least, our research would likely be integrated into mobile devices for the usefulness of end users.