Factors Predicting the Utilization of Center-Based Cardiac Rehabilitation Program

Although cardiac rehabilitation (CR) clearly improves patients’ physical functioning and slows heart disease progression, a significant proportion of patients do not complete CR programs. To evaluate the prevalence and predictors of completion of a center-based CR program in eligible cardiac patients, we used existing data collected from electronic medical records. To identify the predictors of CR completion, we used principal components analysis (PCA) and an artificial neural network (ANN) module. Among 685 patients, 61.4% (n = 421) completed the program, 31.7% (n = 217) dropped out, and 6.9% (n = 47) were referred but failed to initiate the program. PCA consolidated the baseline data into three factors: (1) psychosocial factors (depression, anxiety, and quality of life), (2) age, and (3) BMI, which together explained 66.8% of the total variance. The ANN model produced results similar to those of the PCA. Patients who completed CR sessions had greater extremity strength and flexibility, longer six-minute walk distance, more CR knowledge, and a better quality of life. The present study demonstrated that patients who were older, obese, and who had depression, anxiety, or a low quality of life were less likely to complete the CR program.


Introduction
Cardiovascular disease (CVD) is the leading cause of mortality and morbidity, and has a rising clinical and economic burden worldwide [1,2]. In the management of CVD, cardiac rehabilitation (CR) is the standard treatment to promote lifestyle changes and modify risk factors for many CVD patients [3]. CR is a secondary prevention program comprising multifaceted interventions, focused on supervised exercise training, medication management, nutritional education, psychosocial counselling, and comprehensive support [4]. Participation in CR was proven to be a beneficial element of cardiac
The de-identified medical records of 685 patients who were eligible to attend an academic medical center-based CR program in the Southeastern US between April 2012 and April 2018 were included. The study included patients who (1) had an acute coronary syndrome; (2) underwent percutaneous coronary intervention or open-heart surgery (such as coronary artery bypass surgery, valve surgery, or heart transplant); (3) had stable chronic heart failure; or (4) had current stable angina (chest pain). Patients who did not have 3 months of data available for review were excluded from the study. The study was approved by the Augusta University Institutional Review Board (1295531 on 27 August 2018 and 1609080 on 27 July 2020).

Study Setting
The study participants were patients who were eligible for a 12-week CR program after discharge from hospital. The CR program included three 1-h sessions per week, for a total of 36 exercise sessions. The supervised exercise training consisted of warm-up, stretching, treadmill, bicycle ergometer, and arm ergometer, with each activity performed over 10-20 min. Psychological support and nutritional counseling were offered to all patients. Blood pressure, heart rate, and exercise intensity were monitored by physical therapists during the CR program. Based on completion status, we divided the participants into three groups: graduates, dropouts, and referrals. Graduates were patients who completed all 36 CR sessions, dropouts withdrew from the program prior to completion, and referrals were referred to the CR program but failed to initiate it.
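The grouping rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `completion_status` and the idea of classifying from a per-patient count of attended sessions are hypothetical, and only the 36-session threshold and the three group labels come from the text.

```python
TOTAL_SESSIONS = 36  # 12 weeks x 3 sessions per week, as described above


def completion_status(sessions_attended: int) -> str:
    """Map a (hypothetical) attended-session count to a study group."""
    if sessions_attended < 0:
        raise ValueError("sessions_attended must be non-negative")
    if sessions_attended == 0:
        return "Referral"  # referred but never initiated the program
    if sessions_attended >= TOTAL_SESSIONS:
        return "Graduate"  # completed all 36 sessions
    return "Dropout"       # started but withdrew before completion


# Illustrative inputs only, not study data
example_groups = [completion_status(n) for n in (0, 12, 36)]
```

Treating zero attended sessions as a Referral is our reading of "failed to initiate"; the source does not state how this was coded in the medical records.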

Study Variables and Measures
To identify potential factors associated with CR completion, participants' characteristics were captured at baseline (prior to the CR program) and 3 months after baseline. The data included sociodemographic variables (e.g., age, gender), clinical variables (e.g., primary cardiac diagnosis, resting heart rate, blood pressure, weight, height, body mass index (BMI), percentage of body fat), laboratory variables (e.g., total cholesterol, high-density lipoproteins (HDL), triglycerides, low-density lipoproteins (LDL), fasting blood glucose (FBG), hemoglobin A1C (HbA1c)), and physical performance variables (e.g., 6-min walk test (6MWT), extremity flexibility, and strength assessment). A CR knowledge test [18] was used to assess patients' knowledge regarding cardiac rehabilitation. Health-related quality of life (HRQoL) was measured by self-report, using the general health (SF36GH) and physical functioning (SF36PF) scales of the Short Form-36 questionnaire [19]. Both subscales are commonly used to assess HRQoL in cardiac patients. Higher scores in general health and physical functioning indicate better HRQoL. The Hospital Anxiety and Depression Scale (HADS) [20] was used to assess patients' psychological symptoms. The psychometric properties of the SF-36 and HADS were established in a cardiac population [21-23].

Statistical Methods
All statistical analyses were performed using IBM SPSS Statistics for Windows, version 26.0 (IBM Corp, Armonk, NY, USA). Data are presented as mean ± SD for numerical variables and as percentage (N) for categorical variables. All tests were two-tailed, with statistical significance set at p < 0.05. Student's paired t-test was used to examine differences between baseline and 3 months within each group. One-way analysis of variance (ANOVA) was employed to compare variables across the three CR completion groups.
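For readers who want to see what these two tests compute, the following is a minimal sketch of the test statistics using only the Python standard library. It is not the SPSS procedure used in the study; the function names are hypothetical, and p-values (which require the t and F distributions) are deliberately left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Student's paired t statistic for matched before/after measurements."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    # t = mean difference / standard error of the mean difference
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def one_way_anova_f(*groups):
    """F statistic of a one-way ANOVA across two or more groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

In the setting described here, `paired_t_statistic` would be applied within one group (baseline versus 3 months), and `one_way_anova_f` across the Graduate, Dropout, and Referral groups for a single baseline variable.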
Principal components analysis (PCA), a statistical procedure that reduces the dimensionality of multiple correlated measurements to a smaller number of linearly uncorrelated variables, was used to investigate the variation of the original data [24,25]. PCA can highlight the common variation between the original variables to condense the data, thus identifying the most relevant principal components to explain CR utilization status (Graduate, Dropout, and Referral). Each principal component (PC) is, by definition, uncorrelated with the others. The first PC (PC1) accounts for the highest amount of the total variation between the original explanatory variables, while the subsequent components (PC2, PC3, ...) account for successively less variation. The Kaiser-Meyer-Olkin (KMO) measure and the Bartlett test of sphericity were used to verify the appropriateness of the sample for PCA. A varimax rotation was used to improve the interpretability of the components, and the cut-off point for retaining a variable was a loading ≥ 0.6.
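In standard textbook notation (the symbols below are ours and do not appear in the original analysis), PCA on p standardized variables with correlation matrix R can be summarized as follows.

```latex
% R: p x p correlation matrix of the standardized variables
% \lambda_k, v_k: k-th eigenvalue and unit eigenvector, with
% \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p
\[
  R\, v_k = \lambda_k v_k, \qquad v_j^{\top} v_k = 0 \quad (j \neq k)
\]
% Proportion of total variance explained by component k
% (the eigenvalues of a correlation matrix sum to p)
\[
  \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j} = \frac{\lambda_k}{p}
\]
% Unrotated loading of variable i on component k
\[
  \ell_{ik} = \sqrt{\lambda_k}\, v_{ik}
\]
```

The orthogonality condition is what makes the components uncorrelated, and the ordering of the eigenvalues is why PC1 explains the most variance. The loading cut-off of 0.6 is applied to the loadings after varimax rotation, not to the unrotated values shown here.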
Artificial neural network (ANN) modules are non-linear mapping structures that imitate the learning process of the human brain. They are powerful tools for processing problems involving non-linear and complex data, especially when the underlying data relationship is imprecise and noisy. Variables that were identified by linear modeling as statistically associated with CR utilization status (Graduate, Dropout, and Referral) were used for neural network modeling. The data set was divided into three distinct sets: training, testing, and validation. The training set was used to learn data patterns, the testing set was used to evaluate the generalization ability of the trained network, and the validation set was used to check the trained network performance. The ANN procedure was carried out using a PC laptop computer, equipped with a 64-bit operating system, 2.90 GHz microprocessor, and 8.00 GB of RAM.
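The three-way partition described above can be illustrated with a short standard-library sketch. The 70/15/15 proportions, the random seed, and the function name `split_indices` are assumptions made for illustration; the source does not report how the records were allocated.

```python
import random


def split_indices(n, train_frac=0.70, test_frac=0.15, seed=0):
    """Shuffle 0..n-1 and partition into training, testing, and validation sets.

    The default proportions are illustrative assumptions, not the study's.
    """
    if not 0 < train_frac + test_frac < 1:
        raise ValueError("train_frac + test_frac must be between 0 and 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = int(n * train_frac)
    n_test = int(n * test_frac)
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    validation = indices[n_train + n_test:]  # remainder is held out
    return train, test, validation


# 685 is the number of patient records reported in the study
train_idx, test_idx, validation_idx = split_indices(685)
```

Fixing the seed keeps the partition reproducible, so the same records land in the same set on every run.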
In this study, we used clinical data to build a predictive model of CR utilization, which posed significant challenges for statistical analysis. The clinical data collected over time were voluminous, with a large number of variables, some correlated and some not. The observations were of many different types, with different levels of measurement and measurement scales, which made them difficult to standardize. A large amount of missing data reduced power and validity. The temporal relationships between variables were often nonlinear [26]. PCA is recommended as an alternative to classical regression analysis of multivariable clinical data [27,28]. However, the drawback of PCA is that it recognizes only linear relationships, which limits its ability to identify nonlinear relationships between clinical variables [28,29], whereas the main strength of ANN is its ability to handle nonlinear and multidimensional dependencies [29,30]. Nevertheless, identifying the best-fit predictive model with ANN can be labor-intensive and time-consuming when dealing with a large number of clinical variables embedded in multi-layer and multi-level data structures [30]. To improve efficiency, PCA was first performed to reduce the input dimensions before ANN. PCA also helped develop ANN training sets and enhance the specificity of the predictive model by reducing dimensionality and removing correlated variables. ANN then used the principal components obtained by PCA to develop a better-performing predictive model with higher accuracy [28-30]. Using both PCA and ANN could help address challenges in clinical data analysis and interpretation. The combination of PCA and ANN is a valid and effective approach to reduce residual errors, make proper classifications, recognize relationship patterns, and test many related or unrelated factors among a large number of clinical observations [28-30].

Baseline Characteristics
The baseline demographic and clinical characteristics of the study population are summarized in Table 1. Of the 685 patients included in the study, 421 (61.4%) completed 36 sessions, 217 (31.7%) dropped out, and 47 (6.9%) were referred by a physician but failed to initiate the program. The average age of the study population was 64 years, ranging from 24 to 89. Male patients accounted for 65% of the study population. A total of 42% underwent coronary artery bypass grafting (CABG) or percutaneous coronary intervention, 28% had stable angina, 20% had acute coronary syndrome, and 8% had stable heart failure.

Comparison between Baseline and 3 Months
Comparison analyses were performed in the three groups (Graduates, Dropouts, and Referrals) at baseline and at 3 months (the standard duration of our program). Graduates exhibited greater extremity strength and flexibility, longer six-minute walk distance (6MWT), superior CR knowledge score, and a higher SF-36 PF score (all p < 0.001 by paired t-test). Moreover, the Graduates had significantly reduced anxiety and depression compared to Dropouts and Referrals, after completing CR (p < 0.001, paired t-test). While Dropouts and Referrals both had increased 6MWT distance (p = 0.038 and p = 0.001, respectively) and 6MWT metabolic equivalents of task (METs) (p = 0.038 and p = 0.001, respectively) after 3 months, significant changes were not observed in the other variables (Figures 1 and 2, Table S1).

Predictors of CR Completion
One-way ANOVA was used to analyze the demographic factors contributing to CR completion status. According to the results presented in Table 2, there were statistically significant differences between the three groups in mean age, BMI, CR knowledge test, HADS Anxiety, HADS Depression, SF36GH, and SF36PF (p < 0.05). Principal component analysis (PCA) was conducted to consolidate baseline data into three factors, which explained 66.8% of the total variance. The Kaiser-Meyer-Olkin (KMO) statistic was 0.75, which exceeded the minimum recommendation of 0.60, and the Bartlett test of sphericity was statistically significant (χ² = 267.885, p < 0.001), indicating that the sampling adequacy and the correlation matrix were both satisfactory for PCA.
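For reference, the two adequacy checks reported above have the following standard definitions. The notation is ours: n is the number of observations, p the number of variables entered into the PCA, R their correlation matrix with entries r_ij, and a_ij the partial correlation between variables i and j given all the others.

```latex
% Kaiser-Meyer-Olkin measure of sampling adequacy (ranges from 0 to 1)
\[
  \mathrm{KMO} =
  \frac{\sum_{i \neq j} r_{ij}^{2}}
       {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}
\]
% Bartlett test of sphericity (H0: R is the identity matrix)
\[
  \chi^{2} = -\left( n - 1 - \frac{2p + 5}{6} \right) \ln \lvert R \rvert,
  \qquad
  \mathrm{df} = \frac{p(p - 1)}{2}
\]
```

A KMO value close to 1 means the partial correlations are small relative to the raw correlations, so the variables share enough common variance to be condensed. A significant Bartlett statistic rejects the hypothesis that the variables are mutually uncorrelated.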
The first principal component (PC1) captured psychosocial variables (SF36GH, HADS Depression, HADS Anxiety, and SF36PF) and accounted for 38.5% of the variance. The second principal component (PC2), which included only age, accounted for 14.5% of the variance. Finally, the third principal component (PC3), which included BMI, explained an additional 13.8% of the variance. PCA showed that patients who were older, obese, and who had depression, anxiety, or a low quality of life were less likely to complete the CR program. A summary of the loadings related to the variables in each component, after varimax rotation, is provided in Table 3. Figure S1 shows the scree plot of the three components. A detailed analysis of the score of each observation on the three PCs, as well as the contributing effect of each variable, is presented in Figure 3.
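As a quick arithmetic check, the three component percentages reported above add up to the 66.8% total. The snippet below is only a worked example; the helper name `cumulative_variance` is hypothetical.

```python
# Percent of variance explained by each retained component, as reported above
explained = {"PC1": 38.5, "PC2": 14.5, "PC3": 13.8}


def cumulative_variance(percentages):
    """Return the running total of explained variance, rounded to one decimal."""
    total = 0.0
    running = []
    for value in percentages:
        total += value
        running.append(round(total, 1))
    return running


cumulative = cumulative_variance(explained.values())
```

The last entry of `cumulative` is the total variance explained by the three retained components.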
Finally, ANN was used to validate the results from PCA. It used a feed-forward architecture with a multilayer perceptron. The same seven variables identified by linear modeling were confirmed in the input layer of the ANN module (Table 4). The diagram of the ANN model, generated by the software iterations, is presented in Figure 4. The input layer comprised seven neurons, which validated the findings generated from PCA. The output layer consisted of three neurons, representing the three groups based on CR utilization status. The hidden layer, located between the input and output layers, contained mathematical functions that performed nonlinear transformations of the input entered into the network. The proportions of incorrect predictions for the training and testing steps were 0.197 and 0.229 (19.7% and 22.9%), respectively (Table 5), which was acceptable. The algorithm for determination of predictor importance, also termed "specificity," is presented in Figure S2. After the nonlinear transformation in the hidden layer and two-step modeling (training and testing), ANN was able to classify all study subjects into three groups, based on the minimal number of predictors needed, as they were entered into the input layer of the network.
The prediction weights generated by the neural network for each interaction among the 7 pre-incision factors ("input layer") and the 4 nodes ("hidden layer"), and the output weights of each node to the prediction of CR status; bias weights were also contributed from the input layer and the hidden layer.
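To make the architecture concrete, here is a minimal standard-library sketch of one feed-forward pass through a network with 7 inputs, 4 hidden nodes, and 3 outputs, including bias weights. It is not the fitted SPSS model: the weights are random placeholders, and the tanh hidden activation and softmax output are assumptions, since the source does not state which functions were used.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 7, 4, 3
GROUPS = ("Graduate", "Dropout", "Referral")


def init_layer(n_in, n_out, rng):
    """Random placeholder weights; each row holds n_in weights plus one bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def softmax(values):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden_w, output_w):
    """One feed-forward pass: tanh hidden layer, softmax output layer."""
    if len(x) != N_INPUT:
        raise ValueError(f"expected {N_INPUT} inputs, got {len(x)}")
    # zip stops at the shorter sequence, so row[-1] (the bias) is added separately
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
              for row in hidden_w]
    scores = [sum(w * h for w, h in zip(row, hidden)) + row[-1]
              for row in output_w]
    return softmax(scores)


rng = random.Random(0)
hidden_weights = init_layer(N_INPUT, N_HIDDEN, rng)
output_weights = init_layer(N_HIDDEN, N_OUTPUT, rng)

# Hypothetical standardized values for the seven predictors (not study data)
patient = [0.2, -1.1, 0.5, 0.0, 1.3, -0.4, 0.8]
probabilities = forward(patient, hidden_weights, output_weights)
predicted_group = GROUPS[probabilities.index(max(probabilities))]
```

Training would adjust the placeholder weights to minimize the classification error on the training set; the sketch shows only how a trained network maps seven predictor values to one of the three groups.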

Discussion
This study aimed to identify the predictors of CR completion among patients with CVD. Among patients who were eligible for CR, approximately 32% dropped out, and 7% were referred but failed to initiate CR therapy. The identified predictors of completion fall into three categories (psychosocial, demographic, and clinical), which together account for 66.8% of the total variation among the studied variables.
The psychosocial domain included HRQoL (assessed by SF36PF and SF36GH), anxiety, and depression (assessed by HADS). Consistent with previous studies, HRQoL played a significant role in patients' decision to utilize CR. Compared to patients who completed CR, those who failed to initiate or complete the CR program had lower HRQoL at baseline. Most studies reported an improvement of HRQoL after CR completion, which was congruent with our findings, but few studies examined the impact of HRQoL on CR utilization in CVD patients [31]. Unlike graduates, whose quality of life improved significantly after completing CR, those who dropped out or who were referred but did not attend CR showed no significant improvement in QoL over time after hospital discharge, suggesting that this is a critical window of time for patients to attend CR. In addition, HRQoL is related to anxiety and depression [32], both of which improved significantly in those who completed CR. Interestingly, our study also showed that anxiety and depression predicted CR completion. Thus, improving CVD patients' quality of life and reducing their anxiety and depression might in turn lead to higher CR adherence. This finding was in accordance with recent clinical studies [33].
As reported in previous studies conducted by Lane et al. [8] and Petrie et al. [34], age was the only factor in the demographic domain that was significantly associated with underutilization of CR. The potential explanations included physical constraints, health status, lack of transportation, and insufficient family support. Older cardiac patients were also more likely to be single, widowed, or to live alone with less social support, leading to the underutilization of CR. Additionally, the average number of comorbidities increased monotonically with age. These comorbidities might hamper participation in exercise sessions and create barriers to completing the CR program.
Among all variables examined in the clinical domain, BMI was found to predict CR completion. Recent studies provided strong evidence that obesity is associated with physical inactivity [35]. Physically inactive patients are more likely to drop out or fail to initiate a CR program [35]. The majority of participants in our study were overweight or obese (the average BMI was 31). The average BMI was higher in the Dropout and Referral groups than the Graduate group, at baseline (34, 32, and 30, respectively). CR programs could help obese patients achieve better health outcomes by managing their blood pressure, diabetes, and lipid levels. The effect of CR was dose-dependent, which meant that the number of sessions attended was positively associated with weight loss and reductions in blood pressure. Therefore, additional studies are needed to identify factors influencing obese patients' decision making on CR utilization, thus, enabling development of a tailored intervention to promote CR utilization in obese cardiac patients.
There are several limitations in our study. First, this study was an observational investigation using a convenience sample of patient data from a single CR program in an academic, hospital-based setting. Therefore, the results might not generalize to national or international CR patients. Based on a recent study describing national CR participants' characteristics [36], however, our study included a heterogeneous and representative sample, comparable to national CR patients. In addition, the advanced statistical modeling helped reduce bias and improve generalizability to other CR programs across the nation. We used the combination of principal component analysis (PCA) and artificial neural network (ANN) analysis to examine predictors of both CR completion and attendance. PCA detects the common variation between the original variables and condenses the data. In the present study, the large number of explanatory variables made PCA a reliable choice for the purpose of the study. ANN was able to detect non-linear relationships that were not identified by PCA. The combination of PCA and ANN is a valid and effective approach to enhance model predictive accuracy when dealing with complex clinical data with many missing values.
Second, the list of explanatory variables that could contribute to CR utilization was limited to what was available in the EMR. Factors such as economic scarcity, financial scarcity, social support, health belief, health literacy, and personality traits were not available in the EMR [37]. These missing variables could lead to misleading conclusions regarding the relative importance of the characteristics examined here. On the other hand, the selected variables are captured by most CR programs nationwide and can be used to develop clinically feasible interventions to promote CR utilization. Third, the HRQoL and mental health assessments were obtained from patients' self-reports, which were not validated by objective measures and thus could have led to bias. To date, objective and clinically feasible measures of HRQoL, anxiety, and depression have not been identified or reported. As a result, future studies are needed to replicate and expand our findings.

Conclusions
The study identified important factors predicting CR utilization. Our statistical modeling revealed three domains that together explained 66.8% of the total variance: psychosocial factors (HRQoL, anxiety, and depression), age, and BMI. Pre-CR screening and tailored interventions are needed to promote CR utilization in high-risk populations.
Supplementary Materials: The following are available online at http://www.mdpi.com/2308-3417/5/4/66/s1. Figure S1: Scree plot indicating the amount of variance explained by each principal component. Figure S2: Receiver-operating characteristic (ROC) curves for the artificial neural network. Table S1: Difference between before and after cardiac rehabilitation according to three status groups.