Reducing the Heart Failure Burden in Romania by Predicting Congestive Heart Failure Using Artificial Intelligence: Proof of Concept

Abstract: Due to population aging, we are currently confronted with an increased number of chronic heart failure patients. The primary purpose of this study was to implement a noncontact system that can predict heart failure exacerbation through vocal analysis. We designed the system to evaluate the voice characteristics of every patient, and we used the identified variations as an input for a machine-learning-based approach. We collected data from a total of 16 patients, 9 men and 7 women, aged 65–91 years, who agreed to take part in the study and signed a detailed informed consent. We included hospitalized patients admitted with cardiogenic acute pulmonary edema, regardless of the precipitating cause or other known cardiovascular comorbidities. There were no specific exclusion criteria, except age (participants had to be over 18 years old) and speech impairment. We recorded each patient's voice twice a day, using the same smartphone (Lenovo P780), from day one of hospitalization, when their general status was critical, until the day of discharge, when they were clinically stable. We used the New York Heart Association (NYHA) Functional Classification for heart failure to assign patients to stages based on their clinical evolution. Each voice recording was labeled accordingly and then fed to the machine-learning algorithm. We used multiple machine-learning techniques for classification to determine which is most appropriate for the given dataset and can serve as the starting point for future developments: Artificial Neural Networks (ANN), Support Vector Machine (SVM) and K-Nearest Neighbors (KNN). After integrating the information from 15 patients, the algorithm correctly classified the 16th patient into the third NYHA stage at hospitalization and the second NYHA stage at discharge, based only on his voice recording. The KNN algorithm proved to have the best classification accuracy, with a value of 0.945. Voice is a cheap and easy way to monitor a patient's health status. The algorithm we used for analyzing the voice provides highly accurate preliminary results. We aim to obtain larger datasets and compute more complex voice analyzer algorithms to confirm the outcomes presented.


Introduction
Age-related morphological and physiological changes lead to cardiogeriatric syndrome, predisposing elderly individuals to develop Chronic Heart Failure (CHF) [1]. CHF is a significant public health issue, with a prevalence of over 37.7 million cases worldwide [2]. It is notable first for its substantial morbidity and mortality and second for its significant annual healthcare and economic burden [3,4].
CHF is the consequence of cardiac functional impairment secondary to many etiologies, commonly hypertension and coronary heart disease [5,6]. CHF symptoms, such as dyspnea, poor exercise tolerance, and fluid retention, strongly affect patients' quality of life [7]. The plurietiological substrate of heart failure varies critically with sex, ethnicity, age, comorbidities, and environment [8,9]. Globally, heart failure is one of the most important causes of hospitalization among adults over 65 years old, with medical costs ranging from USD 868 per patient in South Korea to USD 25,532 per patient in Germany, according to a study published in 2018 by Lesyuk et al. [10]. The estimated lifetime cost of a chronic heart failure patient is USD 126,819 [11]. The situation in Romania is far from good: 4.7% of the population above 35 years old is diagnosed with CHF, with an annual mortality rate of approximately 60% [3,12]. Once diagnosed with heart failure, a patient has an expected survival rate of 50% at five years and 10% at ten years [13,14].
However, the severity of the left ventricular dysfunction is associated with an even greater risk of sudden death [4]. Although the survival prognosis of heart failure is not good, the numbers have improved substantially over time [15].

Pathophysiology of Acute Heart Failure
Heart failure is a clinical syndrome characterized by acute exacerbations resulting from gradual or rapid changes in the heart, with signs (elevated jugular venous pressure, pulmonary congestion) and symptoms (dyspnea, orthopnea, lower limb swelling) needing urgent therapy [16,17]. The most frequent clinical presentations of acute heart failure are chronic heart failure decompensation, cardiogenic shock, and acute pulmonary edema [18].
Cardiogenic Acute Pulmonary Edema (CAPE) develops secondary to a sharp increase in left ventricular pressure, which is transmitted retrogradely to the left atrium. The resulting increase in pulmonary capillary pressure causes fluid exudation from the intravascular compartment [19,20]. This mechanism leads to a low diffusion capacity in the lungs, causing dyspnea and fluid retention, which can progress into anasarca, depending on the severity of the cardiac dysfunction [21,22].
Anasarca represents a generalized form of edema, with subcutaneous tissue swelling throughout the body, including the swelling of the larynx, also known as the voice box [23].
The link between the phonation process and generalized edema was underlined in 2002, when Verdolini et al. stated that systemic dehydration mediates the augmentation of phonation threshold pressure [24]. In 2017, Murton et al. conducted a speech analysis on patients with heart failure and obtained significant speech accuracy improvement after pulmonary decongestion and clinical stabilization [25].
In terms of clinical decisions, management of acute heart failure aims to decrease the number of readmissions and long-term mortality [26]. Despite medical efforts, acute heart failure remains a pathology with a poor prognosis, and there is no therapy proven to have long-term mortality benefits [27]. To avoid rehospitalization, better secondary prevention strategies are clearly needed [28].

Artificial Intelligence in Cardiology
Artificial intelligence is an engineering branch that uses novel concepts to resolve complex challenges [29]. As biology and medicine are rapidly becoming data-intensive, deep-learning algorithms have been used to assist physicians [30].
Twenty-first-century medicine now revolves around the patient's individuality, and big data algorithms are efficient assistance tools in the medical environment [31]. Artificial intelligence should not be regarded as a futuristic phenomenon but rather as a tool that saves medical staff time and minimizes human error [30].
In a study conducted in 2017, Dawes et al. managed to predict outcome in pulmonary hypertension patients with an algorithm based on three-dimensional patterns of systolic cardiac motion. The software used MRI data from 256 patients and learned which configurations were associated with early death or right heart failure. The algorithm used segmentation of the short-axis cine images to build the three-dimensional model. The prediction tool assessed survival using the median survival time and the area under the curve from time-dependent receiver operating characteristic analysis for 1-year survival. Alongside conventional imaging and biological markers, this algorithm increased the accuracy of survival prediction [32].
The Artificial Intelligence-Clinical Decision Support System (AI-CDSS) is a hybrid (expert-driven and machine-learning-driven) tool designed to assist physicians in heart failure diagnosis. Dong-Ju Choi, Jin Joo Park et al. evaluated in their published study the diagnostic accuracy of AI-CDSS on a group of 97 patients with dyspnea. They assessed the concordance rate between the algorithm results and those of heart failure specialists. Out of the 97 patients, 44% had heart failure, with a concordance rate between AI-CDSS and heart failure specialists of 98%. On the other hand, the concordance rate between AI-CDSS and non-heart failure specialists was 76%. Finally, they underlined the usefulness of AI-CDSS in heart failure diagnosis, especially when a heart failure specialist is unavailable [33].
A recent review by Aixia Guo, Michael Pasque et al. summarized the recent findings and approaches of machine-learning techniques in heart failure diagnosis and outcome prediction. The review evaluated studies which used electronic health records, varying from demographic characteristics, medical treatment history, laboratory and imaging results to genetic profiles. They assert high-accuracy results of these prediction tools, taking into consideration at the same time the challenges that novel machine-learning models still need to overcome. Among the most common shortcomings in this area is the impossibility of full integration of the electronic health record (medical reports, a wide variety of imaging results, etc.). On the other hand, given that these algorithms are based on machine learning, patients with rare diseases and atypical profiles cannot benefit from this technology. Thus, it is necessary in the future to further enrich management techniques in order to provide interpretable and actionable models [34,35].
Our study offers a new perspective on the applicability of artificial intelligence in medicine. We pursue this software development in order to integrate it as a smartphone application in the near future. This application will run in the smartphone background, performing vocal analysis on heart failure patients. If it finds signs of heart failure decompensation, it will refer them to medical services. In this way, it will be possible to avoid severe presentations of acute heart failure, which require hospitalization and emergency treatment.

Main Contributions
The primary purpose of this study was to implement a noncontact system that can predict heart failure exacerbation through vocal analysis. The system was designed to evaluate every patient's voice characteristics, and the identified variations were used as an input for a machine-learning-based approach. This new concept proposes installing a silent intelligent recorder in patients' homes, capable of predicting heart failure decompensation.
Our preliminary results managed to highlight an important link between the phonation process and heart failure status. Voice is a cheap parameter that would prove extremely useful in the secondary prevention management of heart failure. In order to have effective secondary prevention campaigns in the future, we need an easy-to-use, fast-to-implement, cheap tool.
To our knowledge, there is currently no other open-source algorithm capable of predicting heart failure decompensation using artificial intelligence. The aim of our research study is to highlight the heart failure burden around the globe and the beneficial impact that a secondary prevention algorithm could have on frequent hospitalizations of heart failure patients.

Study Population
The inclusion criterion was the cause of hospitalization. Patients presenting with cardiogenic acute pulmonary edema were selected regardless of the precipitating cause or other known cardiovascular comorbidities. Patients' enrollment in the study was voluntary, after a detailed presentation of the study design. The patients' data collection could not be completely anonymous, so pseudonymization was chosen instead. Thus, each participant was assigned an identifier, through which personal information is separated from the study data. All participants were informed about their right to privacy and about the private storage and use of their data. The study did not present any potential psychological, social, physical or legal harm to patients. A total of 16 patients, 9 men and 7 women, aged 65-91 years old, agreed to participate in the study and signed a comprehensive informed consent. There were no specific exclusion criteria, except age (participants had to be over 18 years old) and speech impairment.
Appl. Sci. 2021, 11, 11728

Intervention
We recorded the voices of all patients twice a day, using a Lenovo P780 smartphone, from day one of hospitalization, until the day of discharge. We asked the patients to repeatedly pronounce two specific keywords (number thirty-three and vowel E) while recording. We attempted to minimize environmental noise as much as possible. The mean hospitalization period was seven days, with two recordings per day. We built a small database of 240 audio recordings. We classified them according to the New York Heart Association Functional Classification (Table 1) [36].

Table 1. New York Heart Association (NYHA) functional classification [36].
Class IV: Unable to carry on any physical activity without discomfort. Symptoms of heart failure at rest. If any physical activity is undertaken, discomfort increases.

Feature Extraction
Voice is a continuous-time signal; however, for computation, it is represented as a discrete-time signal. Sampling refers to recording the speech signal at regular intervals: we measured the amplitude of the continuous signal at equally spaced points, a set number of times per second (Figure 1). We set the sampling rate at 48 kHz in this study.
The raw data were processed to compute the input for the proposed machine-learning algorithms. Calculating the Mel-Frequency Cepstral Coefficients (MFCCs) is an essential step to extract the relevant voice features and reduce each file's dimension. Figure 2 presents the steps for MFCC extraction: the Discrete Fourier Transform (DFT) is applied to the time signal to generate the frequency spectrum, the logarithm function is applied, and the inverse DFT is computed. The final step is to compute the Mel cepstrum, or Discrete Cosine Transform (DCT) [37].
In the described method, the output is represented by up to 40 feature vectors; 20 feature vectors were used, graphically represented in Figure 3. As the dataset comprises audio files of different lengths, the mean MFCC is computed for each feature. To minimize the noise impact, the values are normalized (Figure 3).
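The last two steps (averaging each coefficient over time, then normalizing) can be sketched as follows. This is a minimal illustration in plain Python, not the study's code: the function names `mean_mfcc` and `zscore_columns` are hypothetical, the toy frames use 3 coefficients instead of 20, and z-score standardization is an assumption, since the text does not say which normalization was applied.

```python
from statistics import mean, pstdev

def mean_mfcc(frames):
    """Average each coefficient over all frames of one recording."""
    n_coeffs = len(frames[0])
    return [mean(frame[i] for frame in frames) for i in range(n_coeffs)]

def zscore_columns(rows):
    """Standardize each feature (column) across recordings (assumed normalization)."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) or 1.0 for col in columns]  # constant column: avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]

# Two hypothetical recordings of different length, 3 coefficients per frame
rec_a = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
rec_b = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]

# Each recording collapses to one fixed-size vector, whatever its duration
features = [mean_mfcc(rec_a), mean_mfcc(rec_b)]
normalized = zscore_columns(features)
```

Averaging over frames is what gives every recording a vector of the same size, so recordings of different duration can share one classifier input.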

Machine-Learning Approaches
As a data analytic technique, machine-learning teaches computers to learn from experience, similar to human and animal nature. Machine-learning approaches use computational methods in order to absorb the data, without having to rely on predetermined equations as a model [38].

For classifying the audio files into the four heart-failure classes, we used multiple machine-learning techniques. Each method is generally explained below.

Support Vector Machine (SVM)
Support Vector Machine (SVM) is a supervised machine-learning technique used for classification and regression analysis. As a nonprobabilistic binary linear classifier, the SVM algorithm assigns each example of a given training set to one of two categories. It does so by placing a boundary between the two categories that maximizes the gap (margin) between them. New examples are then mapped into that space and assigned to a category based on the side of the gap on which they fall. SVM is a standard and suitable method for audio classification [39]. Figure 4 shows a schematic of the general SVM algorithm. There are multiple SVM variants, using different mathematical functions (or kernels): linear, nonlinear, radial basis function (RBF), polynomial, and sigmoid. They were tested and evaluated to determine which kernel is best suited to determining heart-failure severity [40,41].
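To make the kernel choices concrete, the sketch below writes the four named kernels as plain Python functions using their common textbook forms. The parameter names (`gamma`, `coef0`, `degree`) and their default values are assumptions for illustration; the study does not report the settings it used.

```python
import math

def linear_kernel(x, y):
    """Plain dot product: k(x, y) = <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """k(x, y) = (gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2); equals 1 when x == y."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, gamma=0.5, coef0=0.0):
    """k(x, y) = tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * linear_kernel(x, y) + coef0)

# Similarity of two hypothetical feature vectors under each kernel
u, v = [1.0, 2.0], [3.0, 4.0]
similarities = {
    "linear": linear_kernel(u, v),
    "polynomial": polynomial_kernel(u, v),
    "rbf": rbf_kernel(u, v),
    "sigmoid": sigmoid_kernel(u, v),
}
```

A kernel replaces the dot product inside the SVM, which is how a linear classifier can separate classes that are not linearly separable in the original feature space.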

Artificial Neural Networks (ANN)
Artificial Neural Networks (ANN) are computing systems inspired by the neural networks of animal brains. They are made up of nodes (artificial neurons) and connections that can transmit signals. These signals are represented by real numbers, and the output of each neuron is computed by applying an activation function to the weighted sum of its inputs. Usually, these artificial neurons are aggregated in layers, and each layer can apply a different transformation to its inputs. Typically, the signal travels from the first layer to the last, possibly traversing the layers multiple times [42]. Figure 5a,b show the models used, with the chosen layers and the activation function specified for each of them (Rectified Linear Activation Function, SoftMax, and Hyperbolic Tangent). A convenient way to build the model is Keras, a high-level API that gives the user the necessary tools to build and evaluate neural networks.

Figure 5. (a,b) The models used in the ANN method.
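As an illustration of how a signal passes through layers with the three named activation functions, here is a minimal forward pass in plain Python. It is not the Keras model of Figure 5: the layer sizes, weights, and biases are made-up values chosen only to keep the arithmetic easy to follow, and the helper `dense` is a hypothetical name.

```python
import math

def relu(values):
    """Rectified linear activation: negative values become 0."""
    return [max(0.0, v) for v in values]

def tanh(values):
    """Hyperbolic tangent activation, applied element-wise."""
    return [math.tanh(v) for v in values]

def softmax(values):
    """Turn scores into probabilities that sum to 1 (shifted by the max for stability)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights @ inputs + biases)."""
    sums = [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]
    return activation(sums)

# Made-up weights: 2 inputs -> 3 ReLU units -> 2 tanh units -> 4 softmax outputs
x = [0.5, -1.0]
h1 = dense(x, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0], relu)
h2 = dense(h1, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.0, 0.0], tanh)
probs = dense(h2, [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
              [0.0, 0.0, 0.0, 0.0], softmax)
predicted_class = probs.index(max(probs))
```

The softmax output layer has four units here to mirror the four heart-failure classes mentioned in the text; the index of the largest probability is taken as the predicted class.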

K-Nearest Neighbors (KNN)
K-Nearest Neighbors (KNN) is a nonparametric method used for classification and regression. The algorithm classifies based on distance; its function is therefore only approximated locally, and all computation is deferred until function evaluation. If the features are measured in different physical units or on different scales, normalizing the data is recommended in order to improve accuracy [43].

In our study, the classification is computed, and a new object is classified according to the votes of its neighbors. The 20 mean MFCCs extracted are used in this approach, and for this reason, it is employed as a 20-dimension classifier. This algorithm provides high accuracy for problems with unknown distributions [44].
Principal Component Analysis (PCA) is computed to evaluate whether noise is introduced and to determine whether it can be reduced. For model generation, a grid search is performed. To build the KNN model, scikit-learn, an easy-to-use and efficient Python library, was selected.
The dataset was split into two parts for all three methods: the train set (80%) and the test set (20%).
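The neighbor vote, the 80/20 split, and the search over candidate values of k can be sketched in plain Python as follows. This is an illustrative stand-in, not the scikit-learn pipeline used in the study: the helper names, the candidate values of k, the leave-one-out scoring inside the grid search, and the two-dimensional toy data are all assumptions.

```python
import math
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(zip((math.dist(x, query) for x in train_x), train_y))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def train_test_split(xs, ys, test_fraction=0.2, seed=0):
    """Shuffle indices reproducibly and hold out `test_fraction` of the samples."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    n_test = max(1, round(len(order) * test_fraction))
    test, train = order[:n_test], order[n_test:]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in test], [ys[i] for i in test])

def grid_search_k(train_x, train_y, candidates=(1, 3, 5)):
    """Pick k by leave-one-out accuracy on the training set (ties go to the first candidate)."""
    def loo_accuracy(k):
        hits = 0
        for i, (x, y) in enumerate(zip(train_x, train_y)):
            rest_x = train_x[:i] + train_x[i + 1:]
            rest_y = train_y[:i] + train_y[i + 1:]
            hits += knn_predict(rest_x, rest_y, x, k) == y
        return hits / len(train_x)
    return max(candidates, key=loo_accuracy)

# Hypothetical toy data: two well-separated groups labelled with NYHA classes
xs = ([(0.1 * i, 0.1 * i) for i in range(10)]
      + [(10 + 0.1 * i, 10 + 0.1 * i) for i in range(10)])
ys = ["II"] * 10 + ["III"] * 10

train_x, train_y, test_x, test_y = train_test_split(xs, ys)
best_k = grid_search_k(train_x, train_y)
predictions = [knn_predict(train_x, train_y, x, best_k) for x in test_x]
test_accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
```

In the study each point would be a 20-dimensional vector of mean MFCCs rather than a 2-D pair; `math.dist` works unchanged for any dimension.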

Results
The study group has 16 patients, 9 men and 7 women. The small sample size of our study group is due to the COVID-19 pandemic. Our local hospital was dedicated to COVID-19 patients and it was not possible to continue enrolling new patients in the study. Consequently, we decided to move forward with the analysis of this group in order to see if the algorithm works.
Despite the small patient study group, we believe that the results obtained are relevant, considering the used algorithm. KNN is able to deliver high-accuracy results for small databases. The obtained results show a link between patients' vocal changes and heart failure status. Given these favorable preliminary results, we expect them to strengthen as the number of patients increases.
In terms of cardiovascular risk factors, the enrolled patients have a minimum age of 65 years, a maximum age of 91 years, and a mean age of 72.68 years. An increased body mass index is found in 7 out of 16 patients, with a maximum value of 46.88 kg/m², corresponding to morbid obesity. High blood pressure and dyslipidemia are two very common risk factors in the study group: out of the 16 patients, 12 are known to have high blood pressure and/or dyslipidemia, with ambulatory treatment. Furthermore, 10 patients have type 2 diabetes, and 7 of these patients need insulin therapy (Table 2). The medical history of the 16 patients enrolled in the study revealed ischemic coronary heart disease in all 16 patients. Additionally, seven of them have a history of percutaneous angioplasty, four have a history of coronary artery bypass grafting and five were on ischemic visa drug therapy. All patients have heart failure, 10 of whom have severe left ventricular systolic dysfunction (Table 3). The New York Heart Association functional classification (NYHA) of heart failure is a widely used tool in cardiologists' daily practice. It evaluates heart failure patients' symptom severity and the exertion threshold needed to provoke symptoms. Each patient was given a daily assessment on the NYHA scale, the result being associated with the voice recordings performed [36].
In order to use the voice signals as input data for the proposed algorithm, the audio files were converted to vectors of numerical values. These values represent the audio signal's amplitude measured at equal intervals in time. Files of different lengths resulted in vectors of different sizes. Their processing led to the extraction of 22 representative values for 22 vocal characteristics (see Section 2.2 Feature Extraction). This step is required so that all entries in the final data collection have the same size.
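The relationship between recording length and vector size can be shown with a small sketch. It samples a synthetic tone rather than a real voice file, and the helper `sample_tone`, the 220 Hz frequency, and the durations are assumptions; only the 48 kHz sampling rate comes from the study.

```python
import math

SAMPLE_RATE = 48_000  # samples per second, the rate stated in the study

def sample_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Measure a sine wave's amplitude at equally spaced instants."""
    n_samples = int(round(duration_s * sample_rate))
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]

# Two hypothetical "recordings" of different duration
short_vector = sample_tone(220.0, 0.01)
long_vector = sample_tone(220.0, 0.02)

# Vector length grows with duration: duration times sampling rate
lengths = (len(short_vector), len(long_vector))
```

Because the vector length depends on duration, a fixed-size summary per recording is needed before the recordings can be fed to a classifier.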
With the dataset available, the algorithms described in Sections 2.4.1-2.4.3 were tested, and the KNN algorithm proved the most suitable, as it generated the highest classification accuracy. The result is supported by the confusion matrix (Table 4), which is used to validate the accuracy of the KNN classification method. The three columns of the matrix represent the three classes associated with the audio files. Each diagonal element is the number of correctly classified samples of a class, and the off-diagonal elements are the misclassified ones. The high accuracy obtained (0.945), together with the validation performed through the confusion matrix, indicates that this method classified the data with high precision and is reliable for further development.
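How accuracy follows from a confusion matrix can be sketched as below. The labels are invented for illustration and are not the values of Table 4; the row/column convention (rows for true classes, columns for predicted classes) is also an assumption.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count samples per (true class, predicted class) pair; rows are true classes."""
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix

def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical labels for six recordings (not the study's data)
classes = ["II", "III", "IV"]
truth = ["II", "II", "III", "III", "IV", "IV"]
guess = ["II", "III", "III", "III", "IV", "IV"]

matrix = confusion_matrix(truth, guess, classes)
score = accuracy(matrix)
```

In this toy case one class II recording is mistaken for class III, so five of the six samples lie on the diagonal.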
Alongside daily vocal analysis of the enrolled patients, their clinical and paraclinical monitoring was performed. Thus, weight at admission and discharge, daily diuresis, daily water intake and NTproBNP values were monitored. The clinical and paraclinical evolution of the patients enrolled in the study fits the results of the vocal analysis performed by the proposed algorithm (Table 5).

Discussion
The purpose of this study was to prove that the phonation process suffers during acute heart failure. Therefore, the voice can be used as a prognostic marker and to monitor patients' health status.
The data of the patients admitted to the hospital for acute heart failure were evaluated. The data were analyzed from the critical status (first day of admission) to the stable status (day of discharge). Out of 16 patients, severe left ventricular systolic dysfunction was noted in 10, with a high-sodium diet as the precipitating factor. In comparison, six patients had moderate left ventricular systolic dysfunction associated with bronchopneumonia, with or without moderate to severe valvulopathy (Table 4).
The machine-learning algorithm integrated the audio recordings from 15 patients. We used the last patient to test the algorithm, and after vocal analysis he was correctly classified into NYHA class III at hospitalization and NYHA class II at discharge.
Numerous factors are known to contribute to the development of heart failure (HF). The potential causes include coronary artery disease, hypertension, cardiomyopathies, valvular and congenital heart disease, arrhythmias, alcohol and drugs, high output failure (anemia, thyrotoxicosis, Paget's disease, etc.), pericardial disease, and primary right heart failure [45]. The meta-analyses conducted by Jones et al. found an improvement in the survival rates secondary to CHF over the past 70 years. The estimated 1-year survival rate was 85.5%; however, the 5-year and 10-year survival rates were 56.7% and 34.9%, and most patients died directly from heart failure or cardiovascular diseases [46]. Although the risk of HF decompensation among older patients has declined over time, it remains one of the leading causes of hospitalization [47].
Cook et al. evaluated the annual global heart failure burden from all published sources and estimated it at $108 billion per annum in 2012. The direct costs accounted for $65 billion, and the indirect cost was $43 billion per annum. The mean immediate HF burden value for the high-income countries was 1.42% versus 0.11% for low- or middle-income countries [48,49]. The hospitalization expenses are the most significant cost component following the expenditures for the medication [50]. In hospitalization costs, room and board were the most important contributors, accounting for 43% of inpatient costs, followed by procedures, imaging, and laboratory testing [51]. Dialysis required the highest part of procedural costs, but it was needed only by a small number of patients [10].
Notable AI models with a successful history include models that use echocardiogram images to identify patients with HF with preserved ejection fraction [7]. It is possible to predict 1-year mortality from normal ECGs [8]. Because an elevated potassium level is reflected in tall T-waves, AI models can quantify potassium without a blood test [9]. Noninvasive cardio-acoustic biomarkers were shown to offer reliable results in predicting the parameters of heart failure [10,11]. Misumi et al. used a machine-learning algorithm to examine the valuable predictors obtained from the left ventricular assist device to provide a model for identifying aortic regurgitation [12].
Therefore, creating software for heart failure decompensation could be timesaving for clinicians and could play a vital role in improving patients' morbidity and mortality. In addition, it could prove to be a money-saving mechanism for healthcare systems and a pioneer in disease management technology [52].

Conclusions
We believe that our study serves as the first brick in the future construction of software that will offer secondary prevention in chronic heart failure patients. Voice is an easy way to monitor a patient's health status as it is an easy-to-understand process, and it is not time- or money-consuming.

Limitations
The study sample is small, and consequently the obtained results are preliminary. Further research will be conducted in order to confirm the outcomes presented. Additionally, patients enrolled in the study had to be capable of understanding and signing the comprehensive informed consent, which limited the enrollment of patients in critical condition and patients with a low educational background.

Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to ethical restrictions.