Machine Learning Model-Based Simple Clinical Information to Predict Decreased Left Atrial Appendage Flow Velocity

Background: Transesophageal echocardiography (TEE) is the first technique of choice for evaluating the left atrial appendage flow velocity (LAAV) in clinical practice, but the procedure may cause complications. Clinicians therefore require a simple, applicable method to screen patients with decreased LAAV, and we investigated the feasibility and accuracy of a machine learning (ML) model to predict LAAV. Method: The analysis included patients with atrial fibrillation (AF) who visited the General Hospital of PLA and underwent TEE between January 2017 and December 2020. Three machine learning algorithms were used to predict LAAV. The area under the receiver operating characteristic curve (AUC) was measured to evaluate diagnostic accuracy. Results: Of the 1039 subjects, 125 patients (12%) had decreased LAAV (LAAV < 25 cm/s). Patients with decreased LAAV had greater body mass and a higher prevalence of persistent AF, heart failure, hypertension, diabetes and stroke, as well as a larger left atrium diameter and a higher serum level of NT-proBNP than the control group (p < 0.05). Three machine learning models (support vector machine (SVM), random forest (RF), and k-nearest neighbor (KNN)) were developed to predict LAAV. In the test data, the RF model performed best (R = 0.608, AUC = 0.89) among the three models. A fivefold cross-validation scheme further verified the predictive ability of the RF model. In the RF model, NT-proBNP was the factor with the strongest impact. Conclusions: A machine learning model (random forest) based on simple clinical information showed good performance in predicting LAAV. A tool for screening patients with decreased LAAV may be very helpful in the risk classification of patients with a high risk of left atrial appendage (LAA) thrombosis.


Background
Atrial fibrillation (AF) is the most common arrhythmia in clinical practice and is associated with stroke [1]. As shown in previous reports, the left atrial appendage (LAA) may harbor up to 90% of the thrombi occurring in patients with AF [2].
The left atrial appendage flow velocity (LAAV) reflects left atrial appendage function and has several clinical implications. First, a low LAAV indicates a high risk of LAA spontaneous echo contrast (SEC) and LAA thrombus [3], and SEC is an independent risk factor for subsequent thromboembolic events [4]. Second, LAAV has been shown to be an independent predictor of cardioversion success [5], so measuring LAAV could provide useful information for predicting cardioversion outcomes in AF patients. Third, previous studies have shown that a low LAAV is associated with AF recurrence after the initial catheter ablation of persistent AF [5]. Moreover, low flow velocity in the LAA is a predisposing factor in AF patients [6]. Currently, transesophageal echocardiography (TEE) is the first technique of choice to evaluate the LAAV [7]. Although TEE is relatively safe and noninvasive, the insertion and manipulation of the ultrasound probe may cause complications such as oropharyngeal, esophageal, or gastric trauma [8,9]. Therefore, clinicians require a simple, applicable method with high sensitivity to screen patients with decreased LAAV.
Machine learning (ML) has shown acceptable performance in predicting the risk of diseases [10][11][12]. Therefore, in the current study, we used an ML model based on simple clinical data to predict LAAV, and we examined the feature importance of the ML model to understand how it arrives at its predictions.

Participants
In the present study, we retrospectively recruited patients who visited the General Hospital of PLA and underwent transesophageal echocardiography between January 2017 and December 2020. Inclusion criteria: (1) age ≥ 18 years; (2) data documented in the hospital information system (HIS); (3) complete transesophageal echocardiography examination; (4) diagnosis of nonvalvular atrial fibrillation. Subjects were excluded if they did not meet the above criteria.

Transesophageal Echocardiography
Left atrial appendage flow velocity was measured by TEE, and all transesophageal echocardiography procedures were performed by experienced cardiologists. Evaluation of the LAA by TEE included the presence or absence of left atrial thrombus, left atrial appendage spontaneous echo contrast, and decreased left atrial appendage peak emptying flow velocity. Based on previous research, an LAAV of <25 cm/s was used as the cutoff value to discriminate patients with a high risk of systemic embolization [13,14].

Clinical Factors
All clinical data were retrieved from the hospital information system (HIS). Since the model we built is a primary screening model, covariates for the machine learning model were chosen based on prior literature review and clinical judgment, focusing on easily accessible variables that were expected to affect the LAAV. Predictors in our model included demographics (age and sex), previous medical history (history of atrial fibrillation, hypertension, heart failure, diabetes, stroke, and vascular diseases), anticoagulant drugs and left atrial (LA) diameter. In addition, many studies have shown a correlation between NT-proBNP and LAA flow velocity [15][16][17]. Therefore, we chose NT-proBNP as one of the predictors.
To ensure accuracy, all the data were collected twice. If there was any inconsistency, a third review was conducted.

Study Design
As shown in Figure 1, the data were randomly split into a training set (80%) and a test set (20%). The createDataPartition function (caret package) in R 15.6 (available online: http://www.R-project.org, accessed on 11 February 2022) was used to split the dataset, and a statistical test was carried out to confirm that the variables were balanced between the two datasets. The training set was used to train the ML models, and the models were then evaluated on the testing set and by fivefold cross-validation.
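The 80/20 split described above can be illustrated with a minimal Python sketch. It is not the study's R code: it assumes a split stratified on quantile bins of the outcome, which is how caret's createDataPartition is documented to handle numeric outcomes, and the function name `stratified_split`, the bin count, the seeds, and the synthetic LAAV values are all hypothetical.

```python
import random


def stratified_split(values, train_frac=0.8, n_bins=5, seed=0):
    """Split indices into train/test, sampling train_frac within each quantile bin."""
    rng = random.Random(seed)
    # Order the indices by outcome so that consecutive slices form quantile bins.
    order = sorted(range(len(values)), key=lambda i: values[i])
    train, test = [], []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        bin_idx = order[lo:hi]
        rng.shuffle(bin_idx)
        n_train = round(train_frac * len(bin_idx))
        train.extend(bin_idx[:n_train])
        test.extend(bin_idx[n_train:])
    return sorted(train), sorted(test)


# Synthetic outcome values (cm/s), for illustration only.
data_rng = random.Random(1)
laav = [data_rng.uniform(10, 90) for _ in range(1000)]
train_idx, test_idx = stratified_split(laav)
print(len(train_idx), len(test_idx))
```

Sampling within outcome bins keeps the LAAV distribution similar in the two subsets, which is the property the balance test in the text checks.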

Model Development
In the current study, we built regression models using three ML algorithms, namely the support vector machine (SVM), random forest (RF), and k-nearest neighbor (KNN) models, because they are widely used for regression [18][19][20][21]. All models were developed in R 15.6.

Model Evaluation
The optimal cut-off point of the predicted LAAV was determined from the receiver operating characteristic curve (based on the principle of the maximum Youden index), using decreased LAAV measured by TEE (<25 cm/s) as the gold standard. With the cut-off values, the sensitivity, specificity, and F1 score of the three models were calculated. Accuracy, sensitivity, specificity, mean squared error, area under the receiver operating characteristic curve (AUCROC), and F1 score were used to evaluate the performance of the models. Accuracy refers to the ratio of the number of correctly classified participants to the total number of participants. Among these metrics, AUCROC and F1 score were the main metrics used to reflect the performance of the models.
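To make the cut-off selection concrete, the following Python sketch scans candidate cut-offs on the predicted LAAV and keeps the one that maximizes the Youden index (sensitivity + specificity - 1). It is an illustration under stated assumptions rather than the study's code: a patient is flagged as "decreased" when the predicted LAAV falls below the cut-off, accuracy is computed as the fraction of all participants classified correctly, and the function names and the eight example values are hypothetical.

```python
def confusion_counts(pred_laav, true_laav, cutoff, gold=25.0):
    """Count TP, FP, TN, FN when 'predicted LAAV < cutoff' flags decreased LAAV."""
    tp = fp = tn = fn = 0
    for pred, true in zip(pred_laav, true_laav):
        flagged = pred < cutoff
        decreased = true < gold  # gold standard: measured LAAV < 25 cm/s
        if flagged and decreased:
            tp += 1
        elif flagged:
            fp += 1
        elif decreased:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def summary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and F1 score from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn) if tp else 0.0,
    }


def youden_cutoff(pred_laav, true_laav, gold=25.0):
    """Return the cut-off (midpoint between sorted predictions) with maximal Youden index."""
    values = sorted(set(pred_laav))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_cutoff, best_j = None, float("-inf")
    for c in candidates:
        m = summary_metrics(*confusion_counts(pred_laav, true_laav, c, gold))
        j = m["sensitivity"] + m["specificity"] - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Hypothetical predicted and measured LAAV values (cm/s).
pred = [18, 22, 30, 35, 28, 50, 60, 45]
true = [15, 20, 24, 40, 30, 55, 70, 48]
cutoff, j = youden_cutoff(pred, true)
metrics = summary_metrics(*confusion_counts(pred, true, cutoff))
print(cutoff, metrics)
```

Note that the selected cut-off on the predicted scale need not equal 25 cm/s, because regression predictions tend to be pulled toward the mean.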

Fivefold Cross-Validation
Considering that the model may suffer from overfitting, we performed a fivefold cross-validation scheme. First, the data were partitioned into 5 equal parts. The model was trained on 4 parts, leaving 1 part for testing. The process was repeated 5 times until testing was performed on all of the 5 parts.
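The partitioning step can be sketched as follows. This is a minimal Python illustration, not the study's implementation; the function name `kfold_indices` and the seed are hypothetical, and because 1039 is not divisible by 5 the folds are only approximately equal in size.

```python
import random


def kfold_indices(n, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Every k-th shuffled index goes to the same fold, so fold sizes differ by at most one.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(kfold_indices(1039, k=5))
for train, test in splits:
    print(len(train), len(test))
```

Each participant appears in exactly one test fold, so every observation is used once for testing and four times for training.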

Definition and Statistics of Variables
Continuous variables (age, body mass index (BMI), serum NT-proBNP and left atrial diameter) were presented as medians (interquartile ranges) and compared by Mann-Whitney U tests. Categorical variables (gender, anticoagulant use, history of persistent AF, heart failure, hypertension, diabetes, stroke, and vascular disease) were presented as numbers (proportions) and compared by Pearson's χ² test or Fisher's exact test. A p-value of less than 0.05 was considered significant.

Patient Characteristics
Of the 1152 patients undergoing TEE, 1039 patients were included in the final analysis (Figure 1). Among the 1039 patients, 125 patients (12%) were diagnosed with decreased LAAV (<25 cm/s).

Development of Machine Learning Model
The development of the ML models is shown in Figure 2. In the KNN model, the K value was tuned to obtain the best model, and the KNN model performed best with a K value of 49 (mean absolute error: 13.4 cm/s). In the RF model, the number of trees was 500. The support vector machine (SVM) model was also developed based on the training dataset.
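The tuning of K can be illustrated with a small Python sketch of KNN regression, in which each candidate K is scored by the mean absolute error on held-out data. This is a sketch under assumptions, not the study's R code: it uses unweighted Euclidean distance on a single unscaled feature, and the function names and toy data are hypothetical.

```python
def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the outcomes of the k nearest training points."""
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def tune_k(train_x, train_y, val_x, val_y, candidates):
    """Return the candidate K with the lowest validation MAE, plus all scores."""
    scores = {}
    for k in candidates:
        preds = [knn_predict(train_x, train_y, q, k) for q in val_x]
        scores[k] = mean_absolute_error(val_y, preds)
    return min(scores, key=scores.get), scores


# Toy one-feature data, for illustration only.
train_x = [[0.0], [1.0], [2.0], [3.0]]
train_y = [0.0, 10.0, 20.0, 30.0]
val_x = [[1.2], [2.6]]
val_y = [12.0, 26.0]
best_k, mae_by_k = tune_k(train_x, train_y, val_x, val_y, candidates=[1, 2, 3])
print(best_k, mae_by_k)
```

With clinical predictors measured on very different scales (for example NT-proBNP and age), features would normally be standardized before distances are computed.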

Model Comparison for Regression and Binary Classification Problem
The ability of the three ML models to predict the LAAV is shown in Table 3 (training set) and Table 4 (testing set). The RF model performed the best of the three models. In the testing set, the root-mean-square errors (RMSEs) of the three models were 17.51 cm/s (KNN model), 16.65 cm/s (RF model) and 17.66 cm/s (SVM model), and the mean absolute error of the RF model was also the smallest among the three models (KNN 13.4 cm/s, RF 13.04 cm/s and SVM 13.68 cm/s). The ability of the ML algorithms to discriminate between decreased LAAV and normal LAAV is also shown in Tables 3 and 4. With the cut-off values, the sensitivity, specificity, accuracy, and F1 score of the three models were calculated. In the training set (Table 3), the RF model had the best accuracy (92%) and the highest F1 score among the three models (KNN 0.44; RF 0.77; SVM 0.58).
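For reference, the two regression error measures reported above have the following standard definitions. The notation is introduced here for illustration and does not come from the original text: n is the number of patients in the evaluated set, y_i the LAAV measured by TEE, and \hat{y}_i the LAAV predicted by the model.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}},
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|
```

Because squaring weights large errors more heavily, the RMSE is never smaller than the MAE, which is consistent with the values reported for all three models.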
In the testing set, the KNN model showed the poorest performance, with an AUCROC of 0.81 (0.73-0.89), while no difference in AUCROC was found between the RF model and the SVM model (p = 0.373). As shown in Table 4, although the sensitivity of the RF model (81%) was lower than that of the KNN model (93.5%) and the SVM model (100%), the RF model had the highest specificity (RF 86%, KNN 58% and SVM 62%), the highest accuracy (85%) and the highest F1 score (RF 0.62, KNN 0.43, SVM 0.48).
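One way to read the AUCROC values above is as the probability that a randomly chosen patient with decreased LAAV receives a lower predicted LAAV than a randomly chosen patient with normal LAAV. The Python sketch below computes the AUC by this pairwise definition; it is an illustration only, with a hypothetical function name and hypothetical data, and it does not reproduce the confidence intervals or the model comparison reported in the study.

```python
def auc_decreased_laav(pred_laav, true_laav, gold=25.0):
    """AUC as the share of (decreased, normal) pairs ranked correctly; ties count as half."""
    decreased = [p for p, t in zip(pred_laav, true_laav) if t < gold]
    normal = [p for p, t in zip(pred_laav, true_laav) if t >= gold]
    wins = 0.0
    for d in decreased:
        for n in normal:
            if d < n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(decreased) * len(normal))


# Hypothetical predicted and measured LAAV values (cm/s).
pred = [18, 22, 30, 35, 28, 50, 60, 45]
true = [15, 20, 24, 40, 30, 55, 70, 48]
auc = auc_decreased_laav(pred, true)
print(round(auc, 3))
```

Unlike sensitivity and specificity, this quantity does not depend on any particular cut-off, which is why it is useful for comparing models before a threshold is fixed.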
The calibration plots of the models are shown in Figure 3. A Hosmer-Lemeshow test was used to evaluate the calibration of the three models, and the p-values are shown in Figure 3. All models indicated appropriate calibration (p > 0.05).
As reported in previous studies, a model with an AUC between 0.80 and 0.90 is considered excellent [22]. The RF model showed an acceptable AUC in both the training and testing sets (0.98 and 0.89, respectively) and performed better than other noninvasive methods based on CT or TTE [14,23].
The results of the fivefold cross-validation showed that the RF model had a better discriminative ability for decreased LAAV than the other two models (Table 5).
The percentage increase in mean squared error (%IncMSE) and the increase in node purity (IncNodePurity) were used to evaluate the importance of each variable in the model. The serum level of NT-proBNP contributed the most (%IncMSE: 34.5) to the LAAV prediction, followed by the diagnosis of persistent AF (%IncMSE: 28.0), LA size (%IncMSE: 25.6), BMI (%IncMSE: 14.7), and weight (%IncMSE: 13.3), among others. IncNodePurity relates to the loss function that is minimized when the best splits are chosen: MSE for regression and Gini impurity for classification. In the RF model, the serum level of NT-proBNP, LA size, BMI, and diagnosis of persistent AF had high IncNodePurity among all variables.
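The idea behind %IncMSE can be illustrated with a generic permutation-importance sketch in Python: a variable is important if shuffling its values increases the prediction error. This is a simplification and not the computation performed by the R randomForest package, which works on out-of-bag samples tree by tree and normalizes the result. The toy model, the two features, and all names below are hypothetical.

```python
import random


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    base = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only feature j shuffled.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict(row) for row in X_perm]) - base)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    # Hypothetical fitted model: uses feature 0 (an NT-proBNP-like value), ignores feature 1.
    return 60.0 - 0.01 * row[0]


data_rng = random.Random(42)
X = [[data_rng.uniform(100, 3000), data_rng.uniform(18, 35)] for _ in range(50)]
y = [toy_model(row) for row in X]
importance = permutation_importance(toy_model, X, y)
print(importance)
```

In this toy setting the ignored feature has an importance of exactly zero, while shuffling the feature the model relies on raises the error substantially.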

Factors Predicting Decreased LAAV in the RF Model
The overall attributions of variables in the Random Forest model are shown in Figure 4.

Discussion
We developed three machine learning models (KNN, random forest, and SVM) to predict LAAV in AF patients using simple clinical risk factors. The models were trained using data from 834 patients and tested using data from 205 patients. In this retrospective analysis, the RF model demonstrated the best performance of the three models (AUCROC: 0.89, MAE: 13.04 cm/s) when validated on the testing set. In the RF model, the serum level of NT-proBNP contributed the most to the LAAV prediction, followed by LA size, diagnosis of persistent AF, BMI, and weight, among others.
The left atrial appendage has been shown to be a major source of emboli responsible for cardioembolic stroke in previous studies [24], and decreased LAAV is a well-recognized risk factor for left atrial appendage thrombosis and stroke [6,24,25,26]. In clinical practice, TEE is regarded as the first technique of choice to measure the function of the left atrial appendage, but TEE may cause severe discomfort in patients and, moreover, serious complications (such as esophageal damage) [27]. Freitas-Ferraz et al. conducted a study including 1249 consecutive patients undergoing TEE and found that the overall incidence of TEE-related complications was 0.9% to 6.1% [28]. Therefore, clinicians require a noninvasive method to screen patients with decreased LAAV.
Coletta et al. [23] showed that transthoracic echocardiography (TTE) could be used to identify patients with low and high blood flow velocities, but LAAV could be measured by TTE in only 84% of the patients, and the study was conducted with a small number of patients (86 patients). Yasuoka et al. [14] proposed another noninvasive method to predict LAAV using enhanced computed tomography. They found that the LAAV could be estimated by the HU density ratio at distal and proximal sites within the LAA. However, the study was conducted with a small number of patients (60 patients) and, in clinical practice, the HU density ratio cannot be measured in many patients because of poor contrast filling of the left atrial appendage. In this study, using noninvasive clinical data, we developed an ML model to predict the LAAV of AF patients, which could be used for cheap and fast initial screening of patients with decreased LAAV.
Machine learning has shown acceptable results in various medical fields. In ML model development, to avoid overfitting of the model, it has been suggested that the number-to-feature ratio should be at least 5 [29]. In the present study, the ratio was about 70. The data were randomly divided into two subsets, and the features in the testing set were not significantly different from those in the training set. In validation on the testing set, the RF model had 86% specificity and 81% sensitivity. We further assessed the model by fivefold cross-validation, and, consistent with the testing-set results, the RF model performed best (AUCROC 0.85).
Goldman et al. found that age, systolic blood pressure, sustained AF, ischemic heart disease, and left atrial area were associated with LAAV [30]. Demircelik et al. found that left ventricular diastolic dysfunction was associated with left atrial appendage function [31]. Handke et al. found that left ventricular ejection fraction, LA size, paroxysmal AF, age, and sex are independent parameters influencing LAAV [32]. Although many clinical factors associated with decreased LAAV are increasingly available, risk estimation of decreased LAAV remains challenging. The indicators included in our model are easily accessible; therefore, our model may enable instantaneous risk estimation of decreased LAAV, which may facilitate rapid identification of individuals at elevated risk and guide further invasive examination.
Although machine learning models may help us to diagnose patients, they remain a "black box" that lacks acceptable interpretability. In the RF model, we further uncovered important predictors of decreased LAAV. Based on our results, the serum level of NT-proBNP and LA size were the most important variables (Figure 4), and both were also confirmed as risk factors in previous research [16,33]. The %IncMSE (percentage increase in mean squared error) is equivalent to the mean decrease in accuracy and can be used to measure the importance of each variable in the model. Other factors, such as diagnosis of persistent AF, weight, and BMI, also proved useful for predicting LAAV. Because machine learning models can quickly process large amounts of data, our study also suggests that machine learning models may help to extract risk factors.
The use of anticoagulants in AF patients is also worth noting. A previous study has shown that the underuse of oral anticoagulants (OAC) is common [34]. In 2008, fewer than half of patients were treated with anticoagulation (2.7%, warfarin; 39.7%, aspirin) [35]. In 2016, Xiong et al. found that only 44.5% of Chinese AF patients received OAC treatment in a research-based Chinese population [36]. In recent years, the use of anticoagulants has gradually increased but remains inadequate. In our results, about half of the AF patients (50.8%) were not taking anticoagulants, which indicates that standardized anticoagulant use remains inadequate.
In our model, the serum level of NT-proBNP was one of the most important variables. Several mechanisms may explain the association between NT-proBNP and decreased LAAV. BNP is secreted mainly from the left atrium, and atrial pressure overload leads to elevation in plasma BNP levels [37]. Previous studies [38,39] found that elevated BNP was associated with LAA dysfunction, and a high plasma brain natriuretic peptide level was a marker of risk for thromboembolism in patients with nonvalvular atrial fibrillation [40]. Harada et al. also found that higher plasma BNP was associated with a lower LAA flow velocity in patients with nonvalvular AF and normal LV systolic function [17]. Therefore, in addition to its role in diagnosing heart failure, the serum level of NT-proBNP could, to a certain degree, reflect the function of the left atrium and left atrial appendage, which may explain its large contribution to predicting the blood flow velocity of the left atrial appendage.

Limitations
This study has some limitations. First, it was a single-center, retrospective study, which limits its generalizability. Second, the model was tested only in the test set rather than in a prospective real-world cohort. Third, some clinical features that could improve the model may be missing from the candidate features.

Conclusions
A machine learning model (random forest) based on simple clinical information showed good performance in predicting LAAV. A tool for screening patients with decreased LAAV may be very helpful in the risk classification of patients with a high risk of LAA thrombosis.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of General Hospital of PLA (Project identification code: CardioDB2009-2019).