Enhanced Heart Disease Prediction Based on Machine Learning and χ² Statistical Optimal Feature Selection Model

Abstract: Heart disease is a major global health concern, and effective cardiac treatment requires an accurate prognosis. Therefore, this paper proposes a new heart disease classification model based on the support vector machine (SVM) algorithm for improved heart disease detection. To increase prediction accuracy, the χ² statistical optimal feature selection technique was used. The performance of the proposed model was then validated by comparing it to traditional models using several performance measures. The proposed model increased accuracy from 85.29% to 89.7% on the Statlog dataset. Additionally, the computational load was reduced by half. This result indicates that our system outperformed other state-of-the-art methods in predicting heart disease.


Introduction
At present, the workload on individuals has increased significantly. Under this largely unavoidable strain, there is a high probability that a person will suffer from heart disease [1][2][3]. According to the 2018 World Heart Federation study, heart disease causes millions of deaths annually. Heart disorders (HDs) are caused by a decrease in the amount of blood flowing to the brain, heart, lungs, and other important organs. Congestive heart failure is the most common type of cardiovascular disease and the least serious. In human anatomy, veins are responsible for transporting blood to the heart. Other factors that contribute to heart disease include defective heart valves, which can result in heart failure. A typical symptom of cardiac disease is muscle soreness in the upper abdominal area, which might be accompanied by anesthesia. Lowering blood pressure, reducing cholesterol, and engaging in regular physical activity are all recommended to lower the risk of heart disease. Heart disease is most often associated with angina pectoris, dilated cardiomyopathy, stroke, and congestive heart failure, among other conditions. As a result, it is necessary to monitor cardiovascular disease (CVD) biomarkers and consult with healthcare physicians [4][5][6].
Since ancient times, humans have made significant advances in machines and health care. In modern times, the introduction of machines and artificial intelligence (AI) into medicine and health care has brought significant developments and improvements [7][8][9][10]. When it comes to heart disease, determining a person's risk of heart failure is a major concern [11,12]. Applying auto-regression analyses to multiple longitudinal studies has been used to construct prediction methods [13]. Because of changes in technology, healthcare facilities must now store huge amounts of data in their databases, which makes it very hard to interpret what the data mean.
The study of how computers acquire knowledge via observation and experience is known as machine learning. Machine-learning (ML) algorithms have the potential to tackle a wide range of problems in the management of specific medical centers and the analysis
1. To develop a new heart disease (HD) classification model based on the ML (SVM) algorithm to improve the detection of heart disease;
2. To implement an optimal feature selection model using the χ² statistical method for the extraction of the most informative attributes to improve prediction accuracy;
3. To validate the proposed heart disease diagnosis model's accuracy by comparing it with traditional models through the analysis of several performance metrics.

Methodology
Two primary steps were carried out to meet the objectives of this work, and Figure 1 depicts the model's operation in detail. In the proposed support vector machine (SVM)-based heart disease prediction system, the most significant stages were data pre-processing, feature selection, and classification. Feature normalization was included in the pre-processing block. The data were then split into training and testing sets. A feature scoring and selection algorithm was employed to ensure that the training subset was free of any biases. The χ² statistical model selected the same feature set for the training and testing data. Next, the training data with the reduced feature set were fed into the SVM model for training. Finally, the trained SVM model was evaluated using the testing data. The proposed model utilized 14 features from the Statlog and Cleveland datasets of the University of California Irvine (UCI) Heart Disease Repository [17][18][19][20]. These features were examined using approaches that successfully predict heart disease. The heart disease prediction system was developed and evaluated using Python and the scikit-learn library [21].
Vladimir N. Vapnik and Alexey Ya. Chervonenkis were the first to propose the SVM method in their study of statistical learning theory [4]. The SVM algorithm, a supervised ML method, can be used for both classification and regression. Prediction of cardiovascular disease using the SVM method has been reported to be more accurate and less error-prone [22].

Support Vector Machine (SVM) Model
The SVM model learns to separate the classes by as wide a margin as possible. It finds the optimal hyperplane that divides the data with the largest margin, where the margin is the distance between the hyperplane and the closest data points on each side of it, as shown in Figure 2. The hyperplane is computed so that the points of one class fall, as far as possible, on one side of the plane [23]. The SVM model uses the kernel trick to transform the data and then find the optimal surface that separates the classes [21]. This paper used a radial basis function (RBF) kernel, given in Equation (1). SVM classification relies heavily on the RBF kernel, also known as the Gaussian kernel, to map input data into a feature space. The kernel function computes the inner product of pairs of data points in that feature space.
$$K(x, y) = \exp\left(-\gamma \lVert x - y \rVert^{2}\right) \quad (1)$$
Here, x and y are the two data points. The kernel parameters depend on the kernel function that is chosen. For example, the RBF kernel used in this experiment has a parameter gamma, which had to be tuned to find the optimal hyperplane. Gamma specifies how far the influence of a single training sample reaches. Higher gamma values mean that each sample influences only its close neighborhood [22,23].
Another parameter that needed to be tuned was the penalty factor C (cost). A smaller C lowered the model's accuracy on the training data but increased its generalizability, whereas a larger C improved accuracy but reduced generalizability. In our SVM (RBF) model, we set gamma to 1 divided by the number of features seen during model fitting, and C = 1.0, to maximize efficiency.
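As a minimal illustration of Equation (1) and the gamma setting described above, the sketch below evaluates the RBF kernel for two feature vectors in plain Python. The function name `rbf_kernel` and the sample vectors are hypothetical and are not taken from the original study; the default gamma = 1 / (number of features) mirrors the setting stated in the text, which appears to correspond to `gamma="auto"` in scikit-learn's `SVC`.

```python
import math


def rbf_kernel(x, y, gamma=None):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - y||^2)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of features")
    if gamma is None:
        # Assumption: gamma = 1 / n_features, as stated in the text.
        gamma = 1.0 / len(x)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical, already standardized feature vectors (illustration only).
a = [0.5, -1.0, 2.0]
b = [1.5, -1.0, 0.0]

same = rbf_kernel(a, a)               # identical points give the maximum value
near = rbf_kernel(a, b)               # gamma = 1/3
narrow = rbf_kernel(a, b, gamma=5.0)  # larger gamma: influence decays faster
```

With a larger gamma, the kernel value for the same pair of points drops sharply, which is the "close influence" behavior described for high gamma values.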

Heart Disease Dataset Features
In this study, we utilized two publicly available heart disease datasets, the Cleveland and Statlog (Heart) datasets, which were obtained from the University of California, Irvine (UCI) machine-learning repository [19,20]. These datasets were chosen because they are the datasets most widely used by researchers in heart disease prediction to evaluate the effectiveness of their models [24].
The Cleveland dataset contains 303 cases with 76 attributes. These include patients' identification numbers, ages, and social security numbers, as well as a variety of health data, including details on the location and type of chest pain they were experiencing, measurements of their blood pressure and cholesterol levels, and their fasting blood sugar and electrocardiogram readings [25]. Although the Cleveland heart disease database has 76 different features, most researchers used only 14 of them in their experiments [24]. The Statlog (Heart) dataset has 270 instances and 14 attributes. Table 1 describes the attributes of both datasets, which have the same type and number of features. In the prediction of heart disease, thirteen attributes are used as inputs, with the last attribute serving as the output that indicates whether a person has heart disease. The data distribution is critical for prediction [26]. Figure 3 depicts the target class distributions for the two datasets. In the Cleveland dataset, 138 people have no cardiac disease and 165 patients do. In the Statlog dataset, 120 people do not have heart disease, whereas 150 do. Consequently, both datasets are almost evenly distributed in terms of the target output, which helps to prevent the overfitting issue.
In Figure 4, a heatmap displays the correlation analysis of all attributes and the target. For each attribute, the heatmap's color indicates the degree to which that attribute is correlated with all the others and with the output target class. In general, the stronger the correlation, the warmer the color. For the Cleveland dataset, the target attribute was most closely associated with exercise-induced ST depression, the type of chest pain, exercise-induced angina, and the maximum heart rate reached. Meanwhile, for the Statlog dataset, the features most highly correlated with the target were thallium, number of major vessels, ST depression, exercise-induced angina, maximum heart rate, and chest pain.
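The correlation values behind such a heatmap can be sketched as follows, assuming that the Pearson correlation coefficient is the measure used (the text does not name the coefficient). The helper `pearson` and the toy values are hypothetical; the column names only echo two attributes of the datasets, and the numbers are not taken from the Cleveland or Statlog data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Made-up toy columns (NOT real patient data), for illustration only.
toy_features = {
    "thalach": [150, 172, 130, 160, 115, 142],  # maximum heart rate reached
    "exang": [0, 0, 1, 0, 1, 1],                # exercise-induced angina
}
target = [1, 1, 0, 1, 0, 0]

# One row of the heatmap: correlation of every feature with the target.
correlations = {name: pearson(column, target)
                for name, column in toy_features.items()}
```

Repeating the same computation for every pair of columns yields the full matrix that a heatmap colors.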

Data Pre-Processing
There were no missing values in any of the derived datasets. Additionally, the target variable contained a roughly equal proportion of individuals with and without cardiac illness, as indicated in Figure 3, so no target weighting was needed. However, feature scaling was applied so that each feature has zero mean and unit variance. Features were scaled with the standard scaler approach, which standardizes a feature by removing the mean and then scaling to unit variance; scaling to unit variance means dividing all the values by the standard deviation. As shown in Equation (2), to obtain the new data point (x′), the mean (µ) of a particular column was subtracted from the old data point (x) in that column, and the result was divided by the column's standard deviation (σ) [22].
$$x' = \frac{x - \mu}{\sigma} \quad (2)$$
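The standardization step can be illustrated with a short sketch in plain Python. The function name `standardize` and the sample values are hypothetical; the population standard deviation is used here on the assumption that the scaling follows scikit-learn's `StandardScaler`, which divides by the population (not the sample) standard deviation.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Return (x - mean) / std for every value x in one feature column."""
    mu = fmean(column)
    sigma = pstdev(column)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("a constant column cannot be scaled to unit variance")
    return [(x - mu) / sigma for x in column]


# Illustrative cholesterol-like values (not taken from the datasets).
cholesterol = [233.0, 286.0, 229.0, 250.0, 204.0]
scaled = standardize(cholesterol)
```

After scaling, the column has a mean of zero and a standard deviation of one, so features measured in different units contribute comparably to the RBF kernel distance.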
Two common issues in any prediction system are underfitting and overfitting of the training data. Underfitting happens when the model does not learn enough from the training data, resulting in poor performance on both training and testing data. Overfitting occurs when a model fits the training data too closely, including its minor details, and consequently achieves unsatisfactory results on unseen data [27,28]. Irrelevant features in the training data often result in model overfitting: even if the SVM model performs well on the training data, it may not generalize well. To overcome this issue, we propose eliminating irrelevant features by using the χ² statistical model.

Enhanced SVM Model with Feature Selection
Having too many features can cause overfitting. Thus, it is important to select the right features for both training and testing data to improve model performance [29]. Features that are relevant to the ML model are selected, while those that are irrelevant or noisy are discarded [28]. This paper used the chi-squared (χ²) statistical model [30,31] to select the essential features before applying the SVM model.
Designs 2022, 6, 87
A chi-squared (χ²) test is a correlation-based feature selection method that measures the dependence between each feature and the predicted class. For each non-negative feature (X_i), the chi-square statistic is computed to determine which features depend on the predicted attribute. The higher the chi-square score, the more dependent the feature is on the predicted class [24]. First, the commonly used 13 features are ranked according to their χ² test score. For a binary classification problem, the χ² test ranks features as follows. Let us assume there are (t) instances and two classes, positive and negative. To determine the χ² test score, we construct Table 2.
Table 2 [27].
                                      Positive Class   Negative Class   Total
Instances that include X_i            α                b                α + b = m
Instances that do not include X_i     λ                y                λ + y = t − m
Total                                 α + λ = p        b + y = t − p    t
Here, (m) represents the number of instances that include the feature (X_i), (t − m) represents the number of instances that do not include the feature (X_i), (p) represents the number of positive instances, and (t − p) represents the number of instances that are not positive.
The χ² test examines the difference between the expected count (E) and the observed count (O). The observed counts (O) are the observed data (α, b, λ, and y), and the expected counts (E) are calculated from the row total, column total, and overall total. If the feature and the class are independent, the observed and expected counts are close. The α, b, λ, and y represent the observed values, and E_α, E_b, E_λ, and E_y represent the expected values. Assuming that the two occurrences are unrelated, the expected value (E_α) is calculated using Equation (3). E_b, E_λ, and E_y are calculated similarly.
$$E_{\alpha} = \frac{(\alpha + \lambda)(\alpha + b)}{t} = \frac{p \, m}{t} \quad (3)$$
Finally, based on the general form of the χ² test shown in Equation (4), we calculate the χ² score as shown in Equation (5) [27].
$$\chi^{2} = \sum_{i} \frac{(O_{i} - E_{i})^{2}}{E_{i}} \quad (4)$$
$$\chi^{2} = \frac{(\alpha - E_{\alpha})^{2}}{E_{\alpha}} + \frac{(b - E_{b})^{2}}{E_{b}} + \frac{(\lambda - E_{\lambda})^{2}}{E_{\lambda}} + \frac{(y - E_{y})^{2}}{E_{y}} \quad (5)$$
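The χ² score of a single feature can be sketched in a few lines of plain Python. The sketch assumes the following cell layout, which is inferred from the definitions of m and p in the text: α counts positive instances that include the feature, b negative instances that include it, λ positive instances without it, and y negative instances without it. The function name `chi2_score` and the counts are hypothetical.

```python
def chi2_score(alpha, b, lam, y):
    """Chi-squared score of one feature from its 2x2 table of observed counts."""
    t = alpha + b + lam + y   # all instances
    m = alpha + b             # instances that include the feature
    p = alpha + lam           # positive instances
    observed = [alpha, b, lam, y]
    # Expected counts under independence: row total * column total / t.
    expected = [
        m * p / t,
        m * (t - p) / t,
        (t - m) * p / t,
        (t - m) * (t - p) / t,
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for illustration (not from the datasets).
dependent = chi2_score(30, 10, 20, 40)    # feature is associated with the class
independent = chi2_score(20, 20, 30, 30)  # observed counts equal expected counts
```

A feature whose observed counts match the expected counts scores zero, so ranking by this score pushes class-independent features to the bottom of the list.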
After ranking the features using Equation (5), we searched for the subset of the (n) highest-scoring features that gave the best performance. We began with a subset of size n = 1, i.e., the single feature with the highest χ² score. This subset was applied to the SVM model, and the performance results were recorded as we experimented with various hyperparameters. Next, we selected the two highest-scoring attributes (n = 2), applied this selection to the SVM model, and saved the results. We iterated this process until we obtained the optimal subset of ranked features (n = 6), which gave the best performance.
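The incremental search over the top-n ranked features could be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions, not the authors' code: the data are synthetic (generated with `make_classification`, not the UCI datasets), the helper `evaluate_top_n` is hypothetical, and a `MinMaxScaler` step is placed before `chi2` because scikit-learn's `chi2` scorer rejects negative inputs, so it cannot be applied directly to standardized features.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a table with 13 input features (NOT the UCI data).
X, y = make_classification(n_samples=300, n_features=13, n_informative=6,
                           random_state=0)

# 75:25 train/test split, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)


def evaluate_top_n(n):
    """Train an RBF SVM on the n highest chi-squared features; return test accuracy."""
    pipeline = make_pipeline(
        MinMaxScaler(),          # assumption: makes inputs non-negative for chi2
        SelectKBest(chi2, k=n),  # keep the n best-scoring features
        StandardScaler(),        # zero mean, unit variance
        SVC(kernel="rbf", gamma="auto", C=1.0),  # gamma = 1 / n_features
    )
    pipeline.fit(X_train, y_train)
    return pipeline.score(X_test, y_test)


# Try n = 1, 2, ..., 13 and keep the subset size with the best accuracy.
scores = {n: evaluate_top_n(n) for n in range(1, X.shape[1] + 1)}
best_n = max(scores, key=scores.get)
```

Because the feature selector sits inside the pipeline, it is fitted on the training split only and the same columns are then taken from the test split. Choosing n by test accuracy, as done here for brevity, gives an optimistic estimate; cross-validation on the training split would be a more cautious alternative.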
The proposed feature selection algorithm, based on the χ² statistical method, identified six notable features for model training. As shown in Figure 5, for the Cleveland dataset, the algorithm selected the following features: thalach, oldpeak, ca, cp, exang, and chol. As shown in Figure 6, for the Statlog dataset, the algorithm selected the following features: maximum heart rate, number of major vessels, thallium stress result, exercise-induced ST depression, cholesterol, and exercise-induced angina.
Figure 6. Selected features by highest χ² score (Statlog dataset).

Results and Discussion
In this paper, the SVM model was enhanced with the χ² statistical feature selection method, which was used to select the six most important features for the prediction of heart disease. The χ²-based SVM heart disease prediction model was developed and evaluated using Python and the scikit-learn library [21]. The two datasets (i.e., the Cleveland and Statlog (Heart) datasets) were each partitioned into a train set and a test set using a 75:25 split ratio. Training data were used to train the model, whereas testing data were used to evaluate its performance [32]. The following four primary parameters were assessed: true negative (TN), which means that a person with no heart disease is correctly predicted as healthy; true positive (TP), which means that a heart disease patient is correctly predicted as having heart disease; false positive (FP), which means that a person with no heart disease is incorrectly classified as having heart disease; and false negative (FN), which refers to patients who are actually suffering from a cardiac disease but are incorrectly categorized as healthy [24]. The proposed χ²-based SVM model was evaluated based on the following metrics:
• Accuracy (Acc): the proportion of correctly classified instances (true positives and true negatives) to the total number of instances, as shown in Equation (6).
$$Acc = \frac{TP + TN}{TP + TN + FP + FN} \quad (6)$$
• Specificity (Spe): the percentage of true negatives out of all healthy individuals, calculated by Equation (7). It measures the model's ability to correctly classify individuals without the disease.
$$Spe = \frac{TN}{TN + FP} \quad (7)$$
• Sensitivity (Sen): the percentage of true positives out of all individuals who have the disease, as shown in Equation (8). It measures the model's ability to correctly classify individuals who have the disease.
$$Sen = \frac{TP}{TP + FN} \quad (8)$$
• F1-score: the harmonic mean of precision and sensitivity (recall), computed as shown in Equation (9).
$$F1 = \frac{2 \, TP}{2 \, TP + FP + FN} \quad (9)$$
The chi-squared-SVM algorithm was applied to both datasets to observe how its behavior differs between them. Two approaches were tested. In the first approach, the dataset with all 14 features was normalized and then used directly for prediction with the SVM classifier. In the second approach, the χ²-based feature selection method was applied to the normalized dataset to choose the six features that are most significant for heart disease detection before applying the SVM classifier.
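The four evaluation metrics listed above can be computed directly from the TP, TN, FP, and FN counts, as in the following minimal sketch. The function name `classification_metrics` and the counts are hypothetical, and the F1-score is assumed to follow its standard definition as the harmonic mean of precision and sensitivity.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, specificity, sensitivity, and F1-score from confusion counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "specificity": tn / (tn + fp),      # true negative rate
        "sensitivity": tp / (tp + fn),      # true positive rate (recall)
        "f1": 2 * tp / (2 * tp + fp + fn),  # harmonic mean of precision and recall
    }


# Hypothetical counts for illustration (not results from the paper).
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

For these counts, accuracy is 0.85, specificity 0.90, and sensitivity 0.80, which shows how a single accuracy figure can hide a gap between the two class-wise rates.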
As shown in Table 3, with the first approach (feeding all 14 features of the datasets), the SVM classifier achieved the following results for the Cleveland dataset: accuracy of 84.21%; sensitivity of 67.45%; specificity of 84.13%; and an F1-score of 84.16%. For the Statlog dataset, the following results were achieved: accuracy of 85.29%; sensitivity of 68.29%; specificity of 85.36%; and an F1-score of 85.29%. With the second approach, which used the χ² feature selection method, the SVM model achieved the following results for the Cleveland dataset, as shown in Table 4: accuracy of 89.47%; sensitivity of 89.40%; specificity of 89.40%; and an F1-score of 89.40%. The experimental results of the two approaches are shown in Figure 7. From Figure 7, it can be concluded that the χ² feature selection method played a critical role in enhancing the accuracy of the SVM model, while also improving sensitivity and specificity, which reflect the model's ability to correctly identify people with and without heart disease. In relative terms, the proposed χ²-based SVM model improves classification accuracy by 6.25% for the Cleveland dataset and 5.17% for the Statlog dataset, which is important for providing a correct diagnosis and decreasing the rate of false predictions.
Furthermore, ROC (receiver operating characteristic) curves and the AUC (area under the curve) were used to assess the performance of the proposed χ²-based SVM diagnostic model and its ability to identify heart disease occurrence. The ROC chart is a 2D graph that relates sensitivity to specificity and evaluates the validity of a diagnostic model: the true positive rate (sensitivity, Y axis) is plotted against the false positive rate (1 − specificity, X axis). The optimal ROC curve lies in the plot's upper left corner. An ROC chart with a larger AUC is better, indicating that a diagnostic model can correctly identify people with heart issues [27].
ROC curves before and after the 14 features of the Cleveland dataset were reduced to 6 are given in Figure 8a. The AUC of the SVM model was 0.91 when using the full set of features and 0.90 after reducing the features with the χ² method. Figure 8b shows the same comparison for the Statlog dataset: before using the χ² feature selection method, the SVM model's AUC was 0.94; after using it, the AUC dropped to 0.91. This suggests that the influence of the χ² feature selection approach was minimal in terms of AUC.
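To make the ROC axes and the AUC concrete, the sketch below computes one ROC point and the AUC for a handful of classifier scores in plain Python. The function names `roc_point` and `auc` and the example scores are hypothetical; the AUC is computed through its rank interpretation, i.e., the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def roc_point(labels, scores, threshold):
    """Return (false positive rate, true positive rate) at one decision threshold."""
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    positives = sum(1 for l in labels if l == 1)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and decision scores (illustration only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr = roc_point(labels, scores, threshold=0.35)
area = auc(labels, scores)
```

Sweeping the threshold over all score values traces the full curve; a model whose positives always outrank its negatives reaches the upper left corner and an AUC of 1.0.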
To further evaluate the proposed χ²-based SVM model in detecting heart disease, a confusion matrix was used. In a confusion matrix, the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) are laid out in a table, summarizing the number of correct and incorrect predictions. Figure 9b illustrates the confusion matrix produced by the proposed χ²-based SVM model on the Cleveland dataset. It shows that the proposed model correctly detected 37 persons with heart disease (predicted 1 and actual 1) and identified 31 of 35 healthy subjects (predicted 0 and actual 0). Figure 10b presents the resulting confusion matrix for the Statlog dataset, which shows that the proposed methodology correctly identified 31 heart disease patients and 30 of 33 healthy subjects. Figures 9 and 10 together show that applying the χ² feature selection method reduced the number of incorrect predictions (actual 1 but predicted 0, and actual 0 but predicted 1) made by the SVM model from 12 to 8 for the Cleveland dataset, and from 10 to 7 for the Statlog dataset.
Furthermore, our proposed model was compared with several existing state-of-the-art techniques in terms of accuracy and number of selected features, as shown in Table 5. The proposed chi-squared-SVM methodology performed better than the other methods, with an accuracy of 89.47%.
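The confusion-matrix counts quoted above can be cross-checked against the reported accuracies with a short calculation. The sketch below is only an arithmetic check under two assumptions: the false negatives are derived as the total number of errors minus the false positives, and the same test split is used before and after feature selection. The function name `summarize` is hypothetical.

```python
def summarize(tp, tn, healthy, errors_after, errors_before):
    """Derive accuracies and the relative accuracy gain from reported counts."""
    fp = healthy - tn        # healthy subjects predicted as diseased
    fn = errors_after - fp   # assumption: remaining errors are false negatives
    total = tp + fn + healthy
    acc_after = (total - errors_after) / total
    acc_before = (total - errors_before) / total  # assumption: same test split
    return {
        "test_size": total,
        "acc_before": acc_before,
        "acc_after": acc_after,
        "relative_gain": (acc_after - acc_before) / acc_before,
    }


# Counts as stated in the text for the two datasets.
cleveland = summarize(tp=37, tn=31, healthy=35, errors_after=8, errors_before=12)
statlog = summarize(tp=31, tn=30, healthy=33, errors_after=7, errors_before=10)
```

Under these assumptions, the counts reproduce the reported accuracies (84.21% to 89.47% for Cleveland and 85.29% to 89.7% for Statlog), and the 6.25% and 5.17% improvements correspond to relative rather than absolute gains.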

Conclusions
In this work, an enhanced model was implemented to increase heart disease diagnosis and prediction accuracy, as well as to reduce computational load. The ML (SVM) algorithm was used as the classification model for enhanced heart disease diagnosis. The model was evaluated on two well-known heart disease datasets. The results showed that accuracy increased from 84.21% to 89.47% and from 85.29% to 89.7% on the Cleveland and Statlog datasets, respectively. Furthermore, the number of features used in the system was decreased from 14 to 6, which means that the computational load was reduced from 100% to approximately 42%. We anticipate that this work will contribute to the future development and implementation of heart disease prediction and diagnosis systems.

Data Availability Statement:
The machine-learning repository at UCI provides access to the datasets that were used in this study.

Conflicts of Interest:
The authors declare no conflict of interest.