Performance Improvement of Decision Tree: A Robust Classifier Using Tabu Search Algorithm

Classification and regression are the major applications of machine learning algorithms, which are widely used to solve problems in numerous domains of engineering and computer science. Different classifiers based on the optimization of the decision tree have been proposed; however, the field is still evolving over time. This paper presents a novel and robust classifier based on the decision tree and tabu search algorithms. With the aim of improving performance, our proposed algorithm constructs multiple decision trees while employing a tabu search algorithm to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm is responsible for balancing the entropy of the corresponding decision trees. For training the model, we used the clinical data of COVID-19 patients to predict whether a patient is suffering from the disease. The experimental results were obtained using our proposed classifier based on the built-in scikit-learn library in Python. An extensive performance comparison with conventional supervised machine learning algorithms is presented using Big O and statistical analysis. Moreover, a performance comparison with optimized state-of-the-art classifiers is also presented. The achieved accuracy of 98%, the required execution time of 55.6 ms and the area under the receiver operating characteristic curve (AUROC) of 0.95 reveal that the proposed classifier algorithm is convenient for large datasets.


Introduction
Classification plays a vital role in machine learning, pattern recognition and data analytics. Several classification and prediction algorithms have been proposed in recent solutions to provide intelligent decision making by extracting relevant information from historical/large data [1][2][3][4]. Moreover, the exploration of machine learning algorithms and techniques has grown enormously due to the considerable progress made in the information storage and processing capabilities of computers [3,5,6]. Machine learning algorithms can be classified into four categories: (1) supervised learning; (2) unsupervised learning; (3) semi-supervised learning; and (4) reinforcement learning [7,8]. Concerning supervised machine learning, there are several prediction algorithms (or approaches) in the literature, such as k-nearest neighbors, naïve Bayes, decision trees, support vector machines, logistic regression and random forest [9]. The main contributions of this work are as follows:
• We propose a performance-oriented classifier algorithm for training supervised ML models over large datasets using the decision tree and tabu search algorithms. We term the proposed classifier algorithm "tabu search oriented decision tree (TSODT)" (see Section 4).
• We provide a functional setup for the proposed TSODT classifier algorithm to perform several experiments (see Section 5).
• The linear and logarithmic Big O evaluation of our proposed classifier is presented (see Section 6.4.3).
• We provide a statistical analysis of the proposed TSODT classifier algorithm (see Section 6.4.3).
The experiments were performed using the TSODT classifier, which exploits the built-in scikit-learn library in Python. The achieved 98% accuracy and the required 55.6 ms execution time reveal that the proposed classifier algorithm is convenient for large datasets.
The remainder of this work is structured as follows: Existing classifiers and their limitations are discussed in Section 2. The required background pertaining to the decision tree and tabu search algorithms is presented in Section 3. The proposed classifier algorithm is described in Section 4. The functional setup used to perform this study is presented in Section 5. The results and the comparison with the performance of state-of-the-art methods are given in Section 6. Finally, the article is concluded in Section 7.

Existing Classifiers and Their Limitations
Classifiers based on the DT algorithm. Initially, the DT algorithm was proposed to achieve higher accuracy with a reduced computational cost [11,20,21]. The accuracy varies with the robustness of the computational algorithm over the targeted data. The earliest DT algorithm, termed ID3, is most convenient for building simple decision trees; its accuracy decreases as the computational complexity increases. The most recent solutions for improving the accuracy of the DT algorithm are described in [10,20,22-25].
In [10], an intelligent classifier based on a decision tree algorithm was constructed for diverse applications. Recently, the solution given in [20] proposed an optimized structure of the decision tree to improve the efficiency and to reduce the error rate by simplifying a 19-node structure to a five-node structure. The results reveal that the accuracy of ID3 was 6% to 7% higher compared to conventional supervised machine learning classifiers. The solution described in [22] proposed an optimized genetic algorithm by merging the DT algorithm with a parallel genetic decision tree.
Towards performance improvement, the solution given in [23] utilized a linear regression decision tree algorithm for data classification. To evaluate the effectiveness of their approach, the student evaluation and zoo datasets from the UCI Machine Learning Repository were considered. In [24], several classification models were trained to extract the relevant features from the targeted clinical or laboratory data. To achieve higher performance, relevant features from the different datasets were merged to construct a new dataset to train the learning model. A scalable decision tree algorithm was employed in [25] to classify large datasets with improved efficiency compared with other methods (such as SLIQ, SPRINT and RainForest). The efficiency was improved by reducing the sorting cost of the decision tree.
Classifiers based on combined DT and tabu search algorithms. Limited solutions are available in the literature where both DT and tabu search (TS) algorithms are utilized at the same time to improve the performance and accuracy of the learning classifiers [26-30].
In [26], a new algorithm was proposed to optimize the feature selection of hierarchical classification trees. To achieve this, a binary hierarchical classifier (BHC), also called a decision tree, was employed in this work. The result shows an improved accuracy when compared to traditional classifiers. Similarly, the solution described in [27] proposed a new classifier (which they termed "ad hoc") for logistic regression and discriminant analysis. This resulted in better accuracy compared to conventional regression methods (i.e., stepwise, backward and forward).
The authors in [28] proposed an algorithm for classifying large datasets. Their classifier outperformed nominal classifiers in terms of accuracy. Moreover, the achieved accuracy was validated by employing a well-known dataset containing information relevant to different chronic diseases. An interesting classifier was proposed in [29], where better accuracy was achieved when compared to multivariate decision trees; moreover, its execution time increased only linearly with the size of the dataset. Finally, the authors in [30] described a new algorithm (which they termed "cooperative-tabu-search") for classifying important information into different categories. It iteratively reduces the leaf nodes and transforms the data into a new space. The results of several experiments reveal that their classifier achieves better accuracy than conventional classifiers. In a nutshell, the works given in [26-28,30] report improved accuracy, while the solution proposed in [29] outperformed in terms of both accuracy and execution time.
Classifiers have also been employed for the detection and prediction of COVID-19. In December 2019, a number of people in Wuhan, China, started suffering from an idiopathic disease, displaying symptoms similar to pneumonia. Samples were collected from the affected pneumonia patients to determine the specific health issues. The published works reported in [31][32][33][34][35][36] discovered the occurrence of unbiased sequences among the affected pneumonia patients. These studies employed different classifiers to either detect or predict COVID-19.
In summary, machine learning and data mining techniques are becoming more important in analyzing large datasets using different classifiers. Moreover, extracting important information or making decisions from the model trained over the employed datasets are additional benefits. In the aforementioned solutions for medical-related applications (published in [10,20,22-30]), there are various implementations of supervised machine learning algorithms or approaches, either to detect or predict hazardous diseases, which has recently included COVID-19 [31,32,34,35,37,38]. These solutions are based on standard supervised machine learning algorithms and tend to be specifically tailored to a single algorithm. As a result, they suffer from decreased accuracy and require higher computational time (especially for the provided datasets on COVID-19 patients). Consequently, there should be an intuitive way to obtain an early prediction of health issues with less computational time and higher accuracy.

Preliminaries
This section provides the essential background related to the decision tree algorithm in Section 3.1.

Decision Tree Algorithm
The hierarchy of the DT algorithm is presented in Figure 1. It belongs to the family of supervised machine learning algorithms and is frequently employed for the classification of large datasets. Additionally, the DT algorithm can be used efficiently to solve regression problems. The computations required for the implementation of the DT algorithm are based on a training model. The training model is required to predict the class (or value) of a given variable based on learning rules, which are derived from the training data. As shown in Figure 1, the structure of the DT algorithm consists of several types of nodes, i.e., root, decision and leaf nodes. The root node initiates the tree, while the decision nodes are responsible for decision-making, i.e., switching from one node to another. The leaf nodes act as outputs of the decision nodes. In other words, to predict a class label, the DT algorithm starts from the root of the tree (root node, shown in green in the figure). It compares the root attribute with the data attributes. Based on this comparison, it follows the corresponding branch and moves to the next node (decision nodes, shown in blue in the figure). Following a certain set of rules, the decision nodes are responsible for moving to the corresponding leaf node (shown in pink in the figure). In short, the entire tree contains several sub-trees, as presented in Figure 1. Each sub-tree consists of decision and leaf nodes.
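As a concrete illustration of this training process, the following minimal scikit-learn sketch fits an entropy-based decision tree on hypothetical symptom data (the feature names and toy values are illustrative, not taken from the paper's dataset):

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: each row is [fever, dry_cough], label 1 = positive.
X = [[1, 1], [1, 0], [0, 1], [0, 0]]
y = [1, 1, 0, 0]

# Entropy-based splits mirror the information-gain criterion of ID3-style trees.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(X, y)

prediction = clf.predict([[1, 0]])  # a patient with fever but no cough
```

Here the labels depend only on the first feature, so the fitted tree simply splits on it; real trees derived from clinical data are, of course, much deeper.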

Proposed Classifier Algorithm
The general idea of the DT algorithm is presented in Section 3.1. The proposed TSODT algorithm recursively divides the given data using the greedy depth-first approach, which is also utilized in our work to optimize the traditional decision tree algorithm. The data flow and a global overview of our proposed algorithm are given in Figure 2. It constructs a single decision tree (T) from a given dataset, and then the tabu search selects the root node as the current node to start the optimization of the decision tree. It is noteworthy that our proposed algorithm is flexible while creating new decision trees, and this feature also aids TSODT in optimizing the performance, regardless of the size of the dataset. We assumed N = 3 to explain the working principle of TSODT; this means that TSODT will construct three decision trees (e.g., DT1, DT2, DT3) and the tabu search extracts the neighbor nodes (V*) from the standard decision tree (T). This generates the best solution, which is a leaf node (S*). Then, it creates the N decision trees (N = 3 in our case), extracts the best solution (S*) and appends it to the previously created decision trees. The place of the appended leaf node (S*) is decided using the entropy of the decision trees. The tabu search then checks the aspiration criterion and continues the same process until the aspiration criterion is achieved. Consequently, it builds an optimized decision tree that is robust compared to the traditional decision tree (T). The proposed algorithm for the TSODT classifier is illustrated in Algorithm 1. Initially, it builds a tree (leaf nodes and decision nodes) and then prunes it so that it forms the lists [S, TABU] from the training instances. It then checks the lists [S, TABU]; if they are not empty, it selects a node from the list and flags it as the current node J.
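For readers unfamiliar with tabu search, a generic skeleton of the technique may help; this is not the authors' exact Algorithm 1, and the function names and toy objective are purely illustrative:

```python
def tabu_search(start, neighbors, score, n_iter=50, tabu_size=5):
    """Generic tabu-search skeleton: move to the best admissible neighbour,
    keep a short-term memory (the tabu list), and allow a tabu move only if
    it beats the best score seen so far (the aspiration criterion)."""
    best = current = start
    tabu = [start]
    for _ in range(n_iter):
        candidates = [s for s in neighbors(current)
                      if s not in tabu or score(s) < score(best)]  # aspiration
        if not candidates:
            break
        current = min(candidates, key=score)   # best admissible neighbour
        tabu.append(current)
        if len(tabu) > tabu_size:
            tabu.pop(0)                        # forget the oldest move
        if score(current) < score(best):
            best = current
    return best

# Toy usage: minimise (x - 7)^2 over the integers, moving by +/- 1.
result = tabu_search(0, lambda x: [x - 1, x + 1], lambda x: (x - 7) ** 2)
```

In TSODT the "states" would be candidate leaf/decision-node placements and the score would be entropy-based, rather than this toy quadratic.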
It involves three steps (or decision trees): (step-1) the tree classifies the extracted features as normal or abnormal; (step-2) the identification of the abnormal features that contain signs of tuberculosis; and (step-3) the calculation of the likelihood of the disease from the results obtained in step-2. Considering the aforementioned steps (step-1 to step-3), it defines a splitting rule for the instances of the current node J. Moreover, it applies a select-split from J and forms child nodes. These child nodes are not part of the decision trees. Once there is only one resultant node, J is added to the leaf nodes. The position of the current node (J) among the leaf nodes is determined by the entropy of the decision trees. Otherwise, the resultant nodes are added to the defined decision trees (DT1, DT2, DT3). The probability of each decision tree is based on the fraction of entropy, calculated using Equation (1). It is important to note that the proposed TSODT classifier algorithm is capable of dealing with big datasets. Moreover, it is more flexible than other classification algorithms because the user can select the number of decision trees which will be used to build the final decision tree, as explained in Algorithm 1. In Equation (1), p1, p2, . . ., pj determine the percentage of each class present in the child node (the node for a specific value of the feature). The term DTi represents the decision trees (as mentioned earlier in step-1 to step-3), while k denotes the number of classes.
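The entropy term used in Equation (1) is the standard Shannon entropy over the class fractions of a child node; a minimal sketch of computing it from class counts:

```python
import math

def node_entropy(class_counts):
    """Entropy of a child node, in the spirit of Equation (1):
    H = -sum_j p_j * log2(p_j), where p_j is the fraction of class j."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total)
                for c in class_counts if c > 0)

# A pure node (a single class) has entropy 0; a 50/50 split has entropy 1 bit.
```

Lower entropy means a purer node, which is why the algorithm uses it to decide where an appended leaf should go.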

Algorithm 1: Proposed Tabu Search Oriented Decision Tree (TSODT) Algorithm
if splitting_rule is valid for J then
    child_nodes ← produce_child_nodes

Experimental Setup for the Evaluation of TSODT
The block diagram of the settings used to perform the different experiments is illustrated in Figure 3. The dataset was collected from clinical sources based on the symptoms of COVID-19. The preprocessing of the clinical data was accomplished by employing different filters (the corresponding details are given in Section 5.2). After preprocessing, the features were extracted based on their importance. Thereafter, the data were split into two parts, i.e., for training and testing. In the selected COVID-19 dataset, there is a class imbalance problem due to the negative class dominating the positive. In order to overcome this issue, we used the synthetic minority oversampling technique (SMOTE). Once all the issues related to the selected dataset were resolved, training and modeling over the data were performed using supervised ML algorithms, i.e., naïve Bayes, KNN, logistic regression, support vector machine, random forest, DT and our proposed TSODT. Subsequently, each model was tested using the test data to predict whether a patient suffered from COVID-19 or not. For each ML model, the performance accuracy was calculated using different accuracy measures (as shown in Figure 3). Finally, a performance comparison of the selected ML models was provided.
The relevant details for each block of Figure 3 are provided in the text that follows:

Clinical Data
The COVID-19 dataset was collected from the GitHub repository of Carbon Health [39]. They compiled a repository named "Coronavirus Disease 2019 (COVID-19) Clinical Data Repository", containing the clinical data characteristics of patients who had taken a COVID-19 test. The repository is composed of CSV files maintained in batches. Each batch contains the weekly test data of the patients from Carbon and Braid Health. The data include the clinical characteristics and symptoms as well as radiological and laboratory findings. For our evaluations, we considered 11 weeks of data, starting from the 7th of January to the 25th of April 2021. In short, all clinical data files together contain 46 features and 11,168 records of patients.

Preprocessing
For training, preprocessing is essential to make the data readable for our model. Therefore, we considered: (1) the transformation of the CSV files of the 11 weeks of data into a single CSV file; (2) the exclusion of irrelevant columns; (3) the transformation of categorical or text data into numerical data; (4) the handling of the missing data problem; and (5) the resolution of the class imbalance problem.

Categorical Data into Numeric Data
To make the data machine-readable, they must be converted into a numeric form, as numeric data are very easy for the training algorithms to understand. Figure 4 shows the conversion of categorical data into numerical data. The numbers used for the conversion are "0" for false and "1" for true, "0" for negative and "1" for positive, and "0" for mild, "1" for moderate and "2" for severe.
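This encoding step can be sketched with plain dictionary mappings; the column names other than covid19_test_results are hypothetical stand-ins for the dataset's actual columns:

```python
# Illustrative mappings following the encodings described above.
ENCODINGS = {
    "high_risk_exposure":   {"FALSE": 0, "TRUE": 1},       # hypothetical column
    "covid19_test_results": {"Negative": 0, "Positive": 1},
    "severity":             {"Mild": 0, "Moderate": 1, "Severe": 2},  # hypothetical
}

def encode_record(record):
    """Replace categorical values with their numeric codes; pass through
    any value that has no mapping."""
    return {col: ENCODINGS.get(col, {}).get(val, val)
            for col, val in record.items()}

row = {"high_risk_exposure": "TRUE",
       "covid19_test_results": "Positive",
       "severity": "Mild"}
encoded = encode_record(row)
```

In practice the same effect is usually achieved with pandas `map`/`replace` over each categorical column.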

Missing Data
Data that are not stored properly for their respective variables constitute missing data. A machine learning classifier cannot be trained on data with missing values; therefore, it is essential to handle these first. There are a few techniques to handle missing data, i.e., imputation, model-based techniques, or simply ignoring the missing data [40]. Initially, the filling rate of the data is visualized and then analyzed. Figure 5 represents the filling rate for the selected COVID-19 dataset. The x axis of Figure 5 shows the symptoms and vitals of the patients, and the y axis illustrates the corresponding filling rate. After filtration, the overall filling rate of the data versus symptoms is 52.49%. From Figure 5 and the previously obtained values, we can analyze how much data are filled and how much are missing for each symptom of COVID-19. Therefore, in this work, we used the following criteria to discard the missing data:

• The columns with ≥50% missing data are dropped from the dataset. Consequently, after removing the irrelevant columns and those with more than 50% missing data, 31 columns remain (shown in Figure 5).
• For the features with less than 50% missing data, the respective records/rows with missing values are dropped. After dropping these rows, 3616 records of patients were left for training and testing.
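The two criteria above can be sketched with pandas; this is a minimal illustration on a hypothetical mini-frame, not the actual dataset:

```python
import pandas as pd

def apply_missing_data_criteria(df, threshold=0.5):
    """Drop columns with >= 50% missing values, then drop the remaining
    rows that still contain any missing value."""
    kept = df.loc[:, df.isna().mean() < threshold]
    return kept.dropna()

# Hypothetical mini-frame: column 'b' is 75% missing and is dropped entirely;
# the single row where 'a' is missing is then dropped.
df = pd.DataFrame({"a": [1, 2, None, 4],
                   "b": [None, None, None, 1],
                   "c": [1, 2, 3, 4]})
clean = apply_missing_data_criteria(df)
```

Column-wise filtering is done first so that a sparse column does not cause nearly every row to be discarded.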

Feature Extraction
Feature extraction is of great importance for the classification model to achieve accuracy, avoid overfitting and reduce training time. Therefore, we extracted 15 features out of 31 for the classification algorithms, as shown in Figure 6. This was achieved by using the built-in ExtraTreeClassifier class.
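A minimal sketch of importance-based feature selection follows; note that it uses the ensemble variant ExtraTreesClassifier on synthetic stand-in data, since the actual dataset and feature names are not reproduced here:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic stand-in data: only the first two of eight features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)
ranked = np.argsort(model.feature_importances_)[::-1]  # most important first
selected = ranked[:2]  # keep only the top-ranked features
```

The importances sum to 1, so the ranking directly reflects each feature's relative contribution to the splits.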

Test-Train Split
Prior to training the classifier model for prediction, we need to split our dataset into two parts: (1) to train the classification model, and (2) to test the performance of the model after training. In fact, the training data should be larger than the testing data. Each model is trained with 5-fold cross validation to tune the hyper-parameters. Normally, a 75-25 train-test ratio is used for classification [45]. In this work, we used 70% of the data for training and 30% for testing.
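The 70/30 split can be sketched with scikit-learn's train_test_split on stand-in data; stratification keeps the class proportions, which matters for an imbalanced dataset:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 100 samples with two features and balanced binary labels.
X = np.arange(200).reshape(100, 2)
y = np.array([0, 1] * 50)

# 70/30 split, stratified so both classes keep their proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)
```

With 5-fold cross validation for hyper-parameter tuning, the folds are drawn from the training portion only; the 30% test set is held out until the final evaluation.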

Class Imbalance
Class imbalance is a common problem for classification algorithms. In binary classification, an imbalanced class problem means that one class dominates the other, i.e., one class is in the minority and the other is in the majority [46]. In the selected COVID-19 dataset (covid19_test_results), there are two categories/labels, i.e., "1" (positive) and "0" (negative). Moreover, it contains a total of 3616 records. Among these 3616 records, 315 are positive, while the remaining are negative. The negative class thus dominates the positive by 91.3% to 8.7%, with "positive" as the minority class. There are numerous methods to resolve the class imbalance problem, including adaptive synthetic sampling (ADASYN), random oversampling, the synthetic minority oversampling technique (SMOTE), borderline SMOTE and SMOTE for nominal and continuous features (SMOTE-NC) [47,48]. The most frequently utilized techniques to reduce this issue are upsampling and downsampling. Each method has advantages and disadvantages; for example, random oversampling reduces the variance of the dataset, while SMOTE can increase the overlapping of classes and can create noise in the data. SMOTE is useful for large datasets and alleviates the overfitting introduced by oversampling [49]. Consequently, we employed the synthetic minority oversampling technique (SMOTE) to reduce the aforementioned class imbalance problem.
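To illustrate the idea behind SMOTE, the following minimal sketch generates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours; in practice a library implementation such as imbalanced-learn's SMOTE would be used, and the data here are toy values:

```python
import numpy as np

def smote_like(X_min, n_new, k=3, seed=0):
    """Minimal SMOTE-style sketch: each synthetic sample lies on the line
    segment between a minority sample and one of its k nearest minority
    neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dist = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(dist)[1:k + 1]  # skip the sample itself
        j = rng.choice(neighbours)
        out.append(X_min[i] + rng.random() * (X_min[j] - X_min[i]))
    return np.array(out)

# Toy minority class in the unit square; generate five synthetic samples.
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new_samples = smote_like(X_minority, n_new=5)
```

Because every synthetic point is a convex combination of two real minority points, the new samples stay inside the minority region rather than being mere copies.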

Experimental Results and Performance Evaluations
This section provides the criteria used to investigate the achieved results, the calculated results and the performance comparisons in Sections 6.2-6.4, respectively.

Parameters for Classifiers
As explained in Section 5.6, 5-fold cross validation is used to obtain the results. During training, the F1-score is maximized to optimize each classifier. The best learning model is extracted by tuning the learning rate (α) and the regularization parameter (λ) for LR. A set of nearest-neighbor values (100, 200, . . ., 1000) is experimented with to obviate bias and overfitting. For the traditional decision tree and TSODT, pre-pruning is exploited to prevent overfitting with maximum splits. For SVM, Gaussian kernels were used for optimization.
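The described tuning loop can be sketched with scikit-learn's GridSearchCV, here shown for pre-pruning a decision tree via max_depth with the F1-score as the objective; the data and the parameter grid are illustrative:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Stand-in binary-classification data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)

# 5-fold CV, maximising the F1-score, with pre-pruning via max_depth.
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid={"max_depth": [2, 4, 6]},
                      scoring="f1", cv=5)
search.fit(X, y)
best_depth = search.best_params_["max_depth"]
```

The same pattern applies to the other classifiers by swapping the estimator and grid (e.g., C and gamma for a Gaussian-kernel SVM, or n_neighbors for KNN).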

Criterion to Calculate the Results
The most common performance measurement is based on the construction of a confusion matrix, as shown in Table 1. Column one provides the status of the COVID-19 patients in terms of being either true or false; the terms true and false denote COVID-19 sufferers and non-sufferers, respectively. Moreover, column two of Table 1 shows the classifier prediction based on certain conditions (either positive or negative). Consequently, there are four cases to construct the confusion matrix: (1) the classifier predicts a COVID-19 patient as positive (a true positive, denoted as TP); (2) the classifier predicts a COVID-19 patient as negative (a false negative, denoted as FN); (3) the classifier predicts a non-COVID-19 patient as positive (a false positive, denoted as FP); and (4) the classifier predicts a non-COVID-19 patient as negative (a true negative, denoted as TN).
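From these four cases, the standard measures follow directly; a minimal sketch using hypothetical example counts (not the paper's results):

```python
def confusion_metrics(tp, fn, fp, tn):
    """Standard measures derived from the four confusion-matrix cases
    (the quantities defined by Equations (A1)-(A5) in the appendix)."""
    total = tp + fn + fp + tn
    accuracy   = (tp + tn) / total
    error_rate = (fp + fn) / total
    precision  = tp / (tp + fp)
    recall     = tp / (tp + fn)
    f_score    = 2 * precision * recall / (precision + recall)
    return accuracy, error_rate, precision, recall, f_score

# e.g. 50 true positives, 5 false negatives, 5 false positives, 40 true negatives
acc, err, prec, rec, f1 = confusion_metrics(50, 5, 5, 40)
```

With these example counts, the accuracy is 0.9 and the error rate is the complementary 0.1.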

Calculated Results
The experiments were performed using TSODT and traditional classifiers that exploit the built-in scikit-learn library in Python. As stated in the explanation of TSODT, it is flexible in the selection of internal decision trees, so we performed a decision tree space exploration. We recorded the execution of TSODT for decision trees (1, 2, 3, . . ., 10), and this resulted in the selection of three decision trees (N = 3, the input of Algorithm 1), which provided efficient execution. Hence, we used three decision trees for further computation. The achieved results are given in Table 2. The first column presents the type of classifier, while the second column shows the execution time (in ms) required for training on the data. Columns three to six give the numerical values of the constructed confusion matrix for each classifier. The remaining columns provide different parameters in terms of accuracy, error rate, precision, recall and F-score. The corresponding values for these parameters (accuracy, error rate, precision, recall, F-score and AUROC (area under the receiver operating characteristic curve)) in Table 2 are calculated using Equations (A1)-(A5), respectively, as given in Appendix A. As shown in Table 2, the proposed TSODT algorithm (Algorithm 1) requires 55.6 ms to train the model. The combined utilization of the decision tree and tabu search algorithms results in 98.1% accuracy. Apart from the higher accuracy and the required execution time, the error rate is only 3.7% (see column eight of Table 2). The additional parameters, i.e., precision, recall and F-score, all exceed 97% (as shown in columns nine to eleven of Table 2).

Performance Comparison
In order to provide a realistic and reasonable performance evaluation, the proposed TSODT classifier (see Algorithm 1) is compared with conventional classifiers (i.e., NB, LR, DT, SVM, RF, and KNN) in Section 6.4.1. The comparison to most recent state-of-the-art classifier algorithms is provided in Section 6.4.2. Finally, the performance validation of the proposed TSODT classifier algorithm is given in Section 6.4.3.

Comparison to Conventional Classifiers
As shown in column two of Table 2, the most efficient classifier with respect to execution time (or performance) is NB, as it trains the model in 8.5 ms. This is because it only requires the computation of prior probabilities; once all the probabilities are calculated, they can be stored in memory for efficient use in subsequent computations. On the other hand, SVM is shown to be slow compared to the other classifiers in Table 2. This is because it requires loading the whole dataset into RAM and, thereafter, processing each data point sequentially. It is worth mentioning that the clinical dataset selected for training in this work is quite large. The proposed TSODT classifier algorithm (Algorithm 1) requires 1.79, 224.7 and 9.09 times lower execution/computational time to train the model compared to the LR, SVM and RF classifiers, respectively. Both the proposed TSODT and DT classifier algorithms take the same execution time (55.6 ms). On the other hand, the proposed classifier algorithm (Algorithm 1) requires a 6.54 and 3.97 times higher execution time compared to the conventional NB and KNN classifier algorithms, respectively.
The proposed TSODT classifier algorithm performs efficiently over a large number of features. For the same number of features (fifteen), our proposed TSODT classifier algorithm (Algorithm 1) results in higher accuracy (98%, as can be seen in column seven of Table 2) compared to the other classifiers. It is worth mentioning that the higher the accuracy, the higher the classifier performance will be. Compared to the other classifiers, i.e., NB, LR, DT, SVM, RF and KNN, the proposed TSODT classifier achieves 1.01, 1.02, 1.04, 1.09, 1.02 and 1.04 times higher accuracy, respectively. Moreover, the proposed TSODT algorithm aggregates a large number of decision trees, which limits overfitting and the error rate due to bias.
As shown in column eight of Table 2, the proposed TSODT algorithm results in an error rate of 3.7%. Concerning only the error rate, our proposed classifier provides a 1.45, 2.97, 2.60, 2.97, 1.18 and 2.78 times lower error rate compared to the NB, LR, DT, SVM, RF and KNN conventional classifier approaches, respectively. As shown in columns nine to eleven of Table 2, our proposed TSODT classifier algorithm also results in higher values for the other parameters, i.e., precision (97.5%), recall (99.3%) and F-score (98.4%).

Comparison to State-of-the-Art Classifier Algorithms
The performance comparison with the most recent state-of-the-art classifiers in terms of execution time and accuracy is presented in Table 3. In order to train the model, the classifiers reported in [34,36,38,50] employed different classification approaches, while the classifiers reported in [23,28,30] consider decision tree optimizations using the tabu search algorithm. Note that we used "-" in Table 3 where the relevant information is not provided.
Comparison to the classifiers given in [34,36,38,50]. In [50], the employed classifiers are NB, SVM, RF and KNN. For these classifiers, the accuracies reported in Table 3 are 0.98%, 69.79%, 77.98% and 75.68%. Our proposed TSODT classifier achieves 100 (for NB), 1.40 (for SVM), 1.25 (for RF) and 1.29 (for KNN) times higher accuracy compared to [50]. Apart from accuracy, the execution time in [50] is only reported for the NB classifier (0.96 ms). This is comparatively 1.02 times higher than in this work (0.94).
A recent and interesting work is described in [36], where the utilized classifiers are SVM, RF and DT. For these classifiers, the accuracies reported in Table 3 are 89%, 94.3% and 0.91%. The proposed TSODT classifier achieves 1.10 (the ratio of 98 over 89 for SVM), 1.03 (the ratio of 98 over 94.3 for RF) and 107.69 (the ratio of 98 over 0.91 for DT) times higher accuracy compared to [36]. Similar to [50], the execution time in [36] is only given for the RF classifier (0.93 ms). This is comparatively 1.02 times lower than in this work (0.95).
Comparison over the optimized decision tree using the tabu search algorithm [23,28,30]. As shown in Table 3, the authors in [23] achieved 92.5% accuracy for their proposed optimized algorithm. This is comparatively 1.05 times lower than in this work. Moreover, their execution time scales linearly with the number of data points. The classifier reported in [28] was tested on different datasets of chronic diseases; a maximum accuracy of 96% was achieved, which is comparatively 1.02 times lower than in our work (98%). The authors in [30] achieved 95.5% accuracy, which is comparatively 1.02 times lower than in this work. Similar to [23], their execution time increases linearly with the number of data points.

Performance Validation of the Proposed TSODT Classifier Algorithm
After providing the comparisons in Sections 6.4.1 and 6.4.2, we validated the performance of our proposed TSODT algorithm for different data points, as shown in Figure 7. The x axis in the figure shows the number of datasets with different data points (in value × 1000), while the y axis shows the execution time (in ms). The trend presented in Figure 7 explicitly shows that the execution time increases linearly with the increase in the number of data points.
Evaluation of Big O. We evaluated our classifier against the other classifiers in terms of linear and logarithmic Big O.
Statistical analysis. A statistical analysis was performed to compare the proposed classifier and the traditional classifiers. For this, the Friedman test and a post hoc analysis were performed [51]. The results for the classification algorithms using the dataset are presented in Table 4. This table compares the TSODT and the other classifiers with each other in terms of average ranks. The listed numbers illustrate the classification algorithms sorted by accuracy in ascending order. TSODT is ranked number 1, and the sum of average ranks is maximal for TSODT. Consequently, TSODT outperforms the other classifiers. On the other hand, the value of AUROC, as shown in the last column of Table 2, confirms that TSODT has the highest probability compared to the other classifiers. The results show that NB and LR also offer good performance, with high sensitivity and specificity.

Analysis of AUROC.
We also evaluated the performance of our classifier by comparing the AUROC of the traditional classifiers and TSODT. Figure 9 illustrates the ROC curves for the different classifiers. The ROC curve is generated by considering sensitivity and specificity. The x axis of this figure shows the false positive rate (FPR) and the y axis shows the true positive rate (TPR) of the corresponding classifier. The numeric values of AUROC are listed in the last column of Table 2. The AUROC of TSODT is higher than that of the other classifiers, while the AUROC of KNN is the lowest among all classifiers. The evaluation of the results is based on the COVID-19 dataset and shows that our classifier outperforms the others. The values of AUROC might differ for other datasets, but the comparative advantage is expected to hold. Generally speaking, the evaluation also shows that this classifier is well suited for medical applications.
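Computing an ROC curve and its AUROC can be sketched with scikit-learn; the labels and scores below are toy values, not the paper's results:

```python
from sklearn.metrics import roc_curve, auc

# Toy ground-truth labels and classifier scores (illustrative values only).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, scores)  # sweep the decision threshold
auroc = auc(fpr, tpr)                             # area under the ROC curve
```

Each point on the curve corresponds to one decision threshold, trading off TPR (sensitivity) against FPR (1 - specificity); the area summarises the whole trade-off in a single number.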

Conclusions
Machine learning algorithms are being used extensively for the prediction of numerous diseases, e.g., diabetic retinopathy and COVID-19. The prediction/diagnosis is achieved with the aid of a classifier. In this work, a new classifier based on DT and tabu search is presented to optimize the performance and accuracy at the same time. Several supervised machine learning (supervised ML) techniques (including TSODT) were exploited to diagnose COVID-19 using the clinical information of the patients. The experimental results reveal that the proposed classifier algorithm provides a higher performance for medical applications, with an accuracy of 98%. Additionally, the execution time required for training is 55.6 ms. Consequently, the proposed classifier will help the relevant community solve different medical, social and statistical problems with elevated accuracy and performance.