This section presents the overall model and an AI-powered automatic classification task for heart failure disease.
3.2. The Proposed ML Model
The proposed ML model reduces the number of features using heuristics derived from data mining. Even with fewer features, it achieves almost the same accuracy as the other standard models. This is advantageous in real-time applications, as it can decrease laboratory analysis and device costs. Feature engineering and feature selection have many aspects. Our analysis rates 12 features and the relationship of each feature to the target values independently, replacing conventional methods such as logistic regression feature ranking and statistical tests. Equation (3) ranks the feature importance for the target values independently. First, the values of one feature from the dataset are sorted. Then, we include the target values according to the location of each selected value. For example,
Table 2 presents 20 values of patient ages derived from rows 70 to 90 of the dataset. If the target value equals 0, the patient survived during this period; otherwise, they died.
The ranking involves dividing the sorted feature values into two disjoint intervals and finding the border value by maximizing Equation (3). The maximum value of Equation (3) for a single feature is assigned to its weight ${\widehat{w}}_{i}$. Let ${u}_{1}^{1}$ and ${u}_{2}^{1}$ denote the numbers of surviving and deceased patients in the first interval, and ${u}_{1}^{2}$ and ${u}_{2}^{2}$ the numbers of surviving and deceased patients in the second interval of feature ${x}_{j}$, where the subscript and superscript indicate the class and the interval, respectively.
where ${c}_{1}$, ${c}_{2}$, ${c}_{3}$ are the minimum, border, and maximum values of the feature after dividing the data into the disjoint intervals $[{c}_{1},{c}_{2}]$, $({c}_{2},{c}_{3}]$, respectively. The value of Equation (3) lies in the interval $[0,1]$. If it equals 1, then the objects of the two classes (surviving and deceased) are located in different intervals. Moreover, the first and second brace expressions in Equation (3) express the inner-class similarity and difference, respectively. Algorithm 1 below illustrates the steps for calculating the rank of a given feature.
Algorithm 1 Feature Ranking Algorithm 
Input: $x$, $y$ ▹ values and labels of the input feature
Output: $w$, $b$ ▹ maximum value of Equation (3) and its border
$w\leftarrow 0$, $b\leftarrow 0$ ▹ initial values of the output
$l\leftarrow \mathrm{argsort}(x)$
$i\leftarrow 0$
${u}_{1}^{1}\leftarrow 0$, ${u}_{2}^{1}\leftarrow 0$, ${u}_{1}^{2}\leftarrow \mathrm{len}({K}_{1})$, ${u}_{2}^{2}\leftarrow \mathrm{len}({K}_{2})$ ▹ subscript and superscript indicate class and interval
while $i<\mathrm{len}(x)$ do
  $i\leftarrow i+1$
  ${u}_{y[l[i]]}^{1}\leftarrow {u}_{y[l[i]]}^{1}+1$ ▹ the border moves to the right; increase the first-interval count by 1
  ${u}_{y[l[i]]}^{2}\leftarrow {u}_{y[l[i]]}^{2}-1$ ▹ decrease the second-interval count by 1
  if $x[l[i]]\ne x[l[i+1]]$ then ▹ ensure the intervals stay disjoint
    $s\leftarrow \mathrm{criterion}(u,K)$ ▹ evaluate Equation (3) for the given $u$ and $K$
    if $w<s$ then ▹ if the new border ranks higher, reassign
      $w\leftarrow s$, $b\leftarrow i$
    end if
  end if
end while
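For concreteness, Algorithm 1 can be sketched in Python. Since Equation (3) itself is not reproduced in this excerpt, the criterion used here — the product of the inner-class similarity and difference terms — is an assumption inferred from the worked example that follows; the 0-based indexing is an adaptation of the algorithm's 1-based notation.

```python
def criterion(u, K):
    # u[i][c]: objects of class c in interval i; K[c]: total size of class c.
    # Assumed form of Equation (3): similarity term times difference term.
    sim = sum(n * (n - 1) for interval in u for n in interval) \
        / sum(k * (k - 1) for k in K)
    diff = sum(u[i][0] * (K[1] - u[i][1]) + u[i][1] * (K[0] - u[i][0])
               for i in range(2)) / (2 * K[0] * K[1])
    return sim * diff

def rank_feature(x, y):
    """Return (w, b): the best criterion value for feature x with binary
    labels y in {0, 1}, and the sorted-position index of the border."""
    K = [y.count(0), y.count(1)]
    order = sorted(range(len(x)), key=lambda i: x[i])  # argsort(x)
    u = [[0, 0], [K[0], K[1]]]   # start with every object in interval 2
    w, b = 0.0, 0
    for pos, i in enumerate(order):
        u[0][y[i]] += 1          # border moves right: object joins interval 1
        u[1][y[i]] -= 1          # ...and leaves interval 2
        # Only split between distinct values, keeping the intervals disjoint.
        if pos + 1 < len(order) and x[order[pos + 1]] == x[i]:
            continue
        if u[1] == [0, 0]:       # degenerate split: everything in interval 1
            break
        s = criterion(u, K)
        if w < s:
            w, b = s, pos
    return w, b

# A perfectly separable feature reaches the criterion's maximum of 1.
w, b = rank_feature([1, 2, 3, 10, 11], [0, 0, 0, 1, 1])
print(w, b)  # 1.0 2
```

With fully separated classes both factors equal 1, matching the statement that Equation (3) attains 1 when the two classes occupy different intervals.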

We assume a function criterion exists in Algorithm 1. Given below is an example for this function using the values in Table 2, based on choosing a probable best decision boundary. The numbers of objects in Table 2 are 15 and 5 for the two classes, respectively. In the current case, we have a class-imbalance problem. At first glance, the first and second intervals are those shown in the second part of the table. After the borders are defined, we can calculate the number of objects of each class in every interval. As denoted above, ${u}_{1}^{1}=8$, ${u}_{2}^{1}=0$, ${u}_{1}^{2}=7$, ${u}_{2}^{2}=5$ are the numbers of objects in the classes and intervals, where the subscripts and superscripts indicate the class and the interval indices, respectively. The similarity and the difference can be computed as $\frac{8(8-1)+0(0-1)+7(7-1)+5(5-1)}{15(15-1)+5(5-1)}=0.51$ and $\frac{8(5-0)+0(15-8)+7(5-5)+5(15-7)}{2\cdot 15\cdot 5}=0.53$, and the overall value of Equation (3) is $0.27$; the binary accuracy is 65%, which is somewhat high.
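These numbers can be verified in a few lines of Python. The counts come from the Table 2 example (interval 1: 8 surviving, 0 deceased; interval 2: 7 surviving, 5 deceased); the product form for the overall value is an assumption inferred from $0.51\cdot 0.53\approx 0.27$.

```python
def criterion(u, K):
    """u[i][c]: objects of class c in interval i; K[c]: size of class c.
    Returns the inner-class similarity, difference, and their product."""
    similarity = sum(n * (n - 1) for interval in u for n in interval) \
               / sum(k * (k - 1) for k in K)
    difference = sum(u[i][0] * (K[1] - u[i][1]) + u[i][1] * (K[0] - u[i][0])
                     for i in range(2)) / (2 * K[0] * K[1])
    return similarity, difference, similarity * difference

s, d, w = criterion([[8, 0], [7, 5]], [15, 5])
print(round(s, 2), round(d, 2), round(w, 2))  # 0.51 0.53 0.27
```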
Similar to logistic regression, the presented approach uses a feature border calculated by Algorithm 1 (for the feature with the highest rank) as the decision boundary, instead of determining a threshold, usually $0.5$. It is important to remember that a decision based on a single feature can have adverse effects, especially in medicine. In particular, if a patient has cardiovascular disease, there is a high probability of death after a certain period. Therefore, we include other features based on the ranking given by Equation (3) and on the difficulty of measuring them in real time; for instance, determining a patient's age does not call for any additional resources or effort, but platelets in the blood can be measured only in test laboratories. To this end, we leverage Equation (4) as a linear logistic regression instead of gradient-descent optimization, with a modification: Equation (2) is divided into two sums, for quantitative and categorical features separately. Equation (4) transforms all of an object's feature values into a single value, which can be used for further analysis by applying Equation (3).
where ${t}_{j}\in \{-1,1\}$ is the class direction, ${w}_{j}$ is the value of Equation (3), and ${c}_{1,j}$, ${c}_{2,j}$, ${c}_{3,j}$ are the minimum, border, and maximum values of feature ${x}_{j}$ after dividing the entire range into the disjoint intervals $[{c}_{1,j},{c}_{2,j}]$, $({c}_{2,j},{c}_{3,j}]$. The class direction is implicit. The easiest way, as introduced in the task of maximizing the value of Equation (3), is to determine ${t}_{j}\in \{-1,1\}$ case by case, which is NP-complete [19,20]; for example, if there are 12 features in the dataset, then the total number of cases is ${2}^{12}$. This paper proposes to overcome this limitation of the existing method. Let us assume that the first-class objects are located in the first interval for each feature. To make this assumption work when the class intervals are misplaced, we set the class direction ${t}_{j}=-1$; otherwise, we assign 1 to the class direction ${t}_{j}$. To apply Equation (4) to the categorical features, we preprocess them using a target encoder with an important modification: the value of each category is replaced with its posterior probability over the classes in Equation (5) [14].
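Equation (5) is not reproduced in this excerpt; as an illustration only, a plain target encoder that maps each category to the empirical posterior probability of the positive class can be sketched as follows. The `target_encode` helper and the absence of any smoothing are assumptions of this sketch, not the paper's exact formulation.

```python
from collections import defaultdict

def target_encode(categories, y):
    """Replace each category with P(y = 1 | category) estimated from the data."""
    counts = defaultdict(lambda: [0, 0])   # category -> [n_total, n_positive]
    for c, t in zip(categories, y):
        counts[c][0] += 1
        counts[c][1] += t
    posterior = {c: pos / total for c, (total, pos) in counts.items()}
    return [posterior[c] for c in categories]

# A binary category (e.g. smoking): 1 of 2 smokers died, 1 of 3 non-smokers.
encoded = target_encode([1, 1, 0, 0, 0], [1, 0, 1, 0, 0])
print(encoded)  # category 1 -> 0.5, category 0 -> 1/3
```

The encoded values are then quantitative and can flow into Equation (4) alongside the other features.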
We can group the features by using Equation (4) to obtain a single output from multiple inputs, as in the logistic regression model. After we have the output values of the objects in the training set, we apply Equation (3) to them to locate the border between the two classes. Based on the assumption about the object locations in the intervals, we can write Equation (6) to predict outcomes for new patients. We used the same approach for computing the metrics with a single feature in Table 3.
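Since Equations (4) and (6) are not reproduced in this excerpt, the prediction step can only be sketched under explicit assumptions: here Equation (4) is stood in for by a signed, weighted sum of the (already encoded) feature values, and Equation (6) by comparing that score against the border found by applying Equation (3) to the training-set scores. Every name and the exact functional form below are hypothetical.

```python
def aggregate(x, t, w):
    """Hypothetical stand-in for Equation (4): collapse an object's feature
    values into one score via a signed, weighted sum."""
    return sum(tj * wj * xj for tj, wj, xj in zip(t, w, x))

def predict(x, t, w, border):
    """Hypothetical stand-in for Equation (6): a score at or below the border
    falls in the first interval, predicted as class 0 (survived)."""
    return 0 if aggregate(x, t, w) <= border else 1

# Toy usage with made-up directions, weights, and border.
t, w, border = [1, -1], [0.27, 0.10], 0.5
print(predict([2.0, 1.0], t, w, border))  # score 0.44 -> 0
```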