Estimating Postural Stability Using Improved Permutation Entropy via TUG Accelerometer Data for Community-Dwelling Elderly People

To develop an effective fall prevention program, clinicians must first identify the elderly people at risk of falling and then apply the most appropriate interventions to reduce or eliminate preventable falls. Feature selection can support effective decision making by helping to identify a patient’s fall risk from limited data. This work therefore aims to supplement professional timed up and go (TUG) assessment methods using sensor technology, entropy analysis, and statistical analysis. Logistic regression analysis was applied in two different ways to place the inertial data on a fall-risk scale, allowing medical practitioners to screen for high-risk patients. Logistic regression was used both with automatic feature selection and with features chosen according to clinical judgment, to explore the differences in decision making. We also calculated the area under the receiver-operating characteristic curve (AUC). The results indicated that permutation entropy (PE) combined with statistical features provided the best AUC values (all above 0.9), and false positives were avoided. Additionally, the test combining weighted-permutation entropy (WPE) with statistical features showed relatively good agreement with the short-form Berg balance scale (SFBBS) when classifying patients as being at risk. Therefore, the proposed methodology can provide decision-makers with a more accurate way to classify fall risk in elderly people.


Introduction
In the information age, it is no longer difficult to obtain data; rather, the difficulty lies in extracting only the relevant, important, and non-redundant information [1]. When elderly people fall, they can incur serious health problems that cause physical and psychological trauma and increase stress on the health care system. About one-third of elderly people fall every year, and the chance of falling increases with age [2,3]. Falling can have serious long-term consequences for elderly people, including injuries requiring hospitalization, decreased mobility, fear of falling, and even death. Older people with gait or balance problems are at higher risk of falling in the future [4,5]. To develop an effective fall prevention program, elderly people at risk of falling must first be identified before appropriate interventions can be implemented to reduce or eliminate preventable falls. Researchers have indicated that more than 50% of potential falls among older people have been avoided owing to ongoing fall prevention interventions [6]. Researchers have also performed extensive motion evaluation in recent years, mostly using questionnaires [7][8][9][10] or traditional indicator methods for analysis, such as time and step count. In this work, we use permutation entropy (PE) and weighted-permutation entropy (WPE) as additional features to determine which features are most indicative of fall risk. Moreover, because falls have multiple contributing factors, postural stability has been validated as an important indicator for predicting fall risk [30,31]. Previous studies have applied PE to physiological signals such as the EEG but not to the timed up and go (TUG) test; here, we discuss not only the usefulness of the features but also their performance in decision making.
This work therefore aims to supplement professional assessments of fall risk by developing postural stability assessment methods using sensor technology, entropy features, and statistical analysis. Further, this work aims to assess the usefulness of the raw sensor data and of PE and WPE as measures of postural stability, and to discuss their use in practical medical practice. Additionally, the suitability of PE for measuring gait dynamics is investigated based on its repeatability and sensitivity to changes in walking conditions [32]. Overall, this work aims to provide a clinically meaningful and easy-to-use fall-risk diagnosis method based on sensing technology.

Materials and Methods
Participants were equipped with a waist-mounted triaxial accelerometer to perform the TUG walking test; the data were compared with the short-form Berg balance scale (SFBBS), used to assess a community-dwelling elderly person's postural stability. The SFBBS, psychometrically similar (including test reliability, validity, and responsiveness) to the original BBS [33], is widely used to estimate a person's fall risk by determining their static and dynamic balancing abilities and providing a sound measurement for balance impairment [16]. The sensor data were analyzed through PE and WPE methods.

Subjects
Ninety-one community-dwelling elderly people living in a community in central Taiwan were recruited between April 2014 and May 2015 to perform the SFBBS and TUG test to evaluate their postural stability. The medical professional team included rehabilitation physicians, physiotherapists, and functional therapists. The subjects were all over 65 years of age, had no history of musculoskeletal injuries or central nervous system problems in the prior three months, could walk independently without any help, and completed a written consent form prior to testing. Owing to failures in signal acquisition, data were collected for only 85 subjects, aged 76.12 ± 6.99 years. Among the subjects, there were 18 men, aged 78.89 ± 5.95 years, and 67 women, aged 75.37 ± 7.1 years. It is worth noting that recruiting elderly subjects was not easy, which made it difficult to achieve a balanced gender distribution.

Sensors Used
During the TUG test, a triaxial accelerometer (RD3152MMA7260Q; Freescale Semiconductor-NXP, Austin, TX, USA; sampling rate: 45 Hz) was placed on the subject's back at vertebrae L3-L5, as this location is the center of gravity of the human body and is commonly used in fall-risk assessment studies [11]. The X, Y, and Z axes were aligned with the vertical (V; up: +, down: −), mediolateral (ML; right: +, left: −) and anterior−posterior (AP; forward: +, backward: −) directions, respectively, as shown in Figure 1.

Clinical Testing
To reduce the screening work of front-line personnel and improve their objective decision-making skills, measurements were performed on community-dwelling elderly persons using both clinical tests and inertial sensors. The fall-risk analysis methods included statistical indicators of the accelerometer signal, PE and WPE analysis, multivariable logistic regression, and calculation of the area under the receiver-operating characteristic (ROC) curve (AUC).
On average, each subject completed all fall assessment tests within 15−20 min. The SFBBS test, which takes half as long as the BBS, contains seven activities [34]: (1) stretching arms forward; (2) standing with eyes closed; (3) standing with one foot in front; (4) turning and looking backward; (5) picking an object up from the floor; (6) standing on one foot; and (7) standing up from a seated position. During testing, clinicians scored each item for a maximum of four points (for a maximum total score of 28). Prior studies employing the SFBBS have classified subjects with scores < 23 as having impaired balance [16,35]. Each subject then performed the TUG test. The observer marked the start and end times, as well as the times at which the subject reached the standing position, reached the three-meter mark, turned around, reached the chair, and returned to the sitting position, while the raw data were collected from the triaxial acceleration sensor; an example of these raw data is shown in Figure 2. The fall-risk signals were filtered using a sixth-order Butterworth low-pass filter with a cutoff frequency of 3 Hz, as suggested by [25,36].
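The SFBBS scoring rule described above can be written as a short sketch. It is only an illustration: the function names are hypothetical, and the lower bound of 0 points per item is an assumption (the text states only the maximum of four points per item and the cutoff of 23).

```python
# Minimal sketch of the SFBBS scoring rule (names are illustrative).
SFBBS_ITEMS = 7               # seven activities
MAX_ITEM_SCORE = 4            # maximum of four points per item
IMPAIRED_BALANCE_CUTOFF = 23  # total scores < 23 indicate impaired balance


def sfbbs_total(item_scores):
    """Sum the seven item scores after basic validation."""
    if len(item_scores) != SFBBS_ITEMS:
        raise ValueError("expected exactly seven item scores")
    for score in item_scores:
        # Assumption: item scores range from 0 to 4.
        if not 0 <= score <= MAX_ITEM_SCORE:
            raise ValueError("item scores must lie between 0 and 4")
    return sum(item_scores)


def is_fall_risk(item_scores):
    """Return True when the total score falls below the cutoff."""
    return sfbbs_total(item_scores) < IMPAIRED_BALANCE_CUTOFF
```

For instance, a subject scoring 3 on every item totals 21 points and would be labeled as a fall risk under this rule.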

Data Analysis
Computation of the PE and WPE, t-testing of significant features, regression, and statistical analysis were carried out using SPSS and MATLAB®. The complexity of the TUG test for each subject was characterized by the PE, the WPE, and five statistical features (SFs): the mean, standard deviation, maximum, minimum, and zero-crossing rate (ZCR). Computing these seven features for each of the three axes gave a total of 21 features. The features were chosen following prior work on the impact of window size in human activity recognition, where they are among the most widely used [37][38][39][40][41]. Here, the zero-crossing rate is the number of times the signal crosses its mean value.
The statistical analysis comprised univariate screening, multivariable logistic regression, and ROC/AUC analysis, which were used to assess both the fall-risk prediction tool and the performance of the fall-risk regression model. We considered three cases, built from SFs, PE, and WPE, to compare the selected features, and calculated the ROC curve and AUC for each case to evaluate how well postural stability was estimated. This approach was selected to characterize the differences between fall-risk and non-fall-risk subjects, confirm the fall-risk prediction tool, and compare the selected features and their impact on decision making. A flow chart of the data analysis of SF, PE, and WPE in this study is provided in Figure 3.
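As a rough illustration of the five statistical features, the sketch below computes them for one axis using only the Python standard library. The function names are hypothetical, the use of the sample standard deviation is an assumption, and the ZCR follows the definition given above (a count of crossings of the mean); samples exactly equal to the mean are skipped, which is also an assumption.

```python
import statistics


def zero_crossing_rate(signal):
    """Count how many times the signal crosses its own mean value."""
    mean = statistics.fmean(signal)
    crossings = 0
    previous_sign = None
    for value in signal:
        centered = value - mean
        if centered == 0:
            continue  # assumption: samples exactly at the mean are ignored
        sign = centered > 0
        if previous_sign is not None and sign != previous_sign:
            crossings += 1
        previous_sign = sign
    return crossings


def statistical_features(signal):
    """Return the five SFs for a single accelerometer axis."""
    return {
        "mean": statistics.fmean(signal),
        "std": statistics.stdev(signal),  # sample standard deviation (assumed)
        "max": max(signal),
        "min": min(signal),
        "zcr": zero_crossing_rate(signal),
    }
```

Applying `statistical_features` to each of the V, ML, and AP axes gives 15 values; adding the PE and WPE of each axis gives the 21 features.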

Entropy Analysis
PE and WPE analyses were performed; the methodology is presented in the following subsections.

PE
The PE estimates the complexity of a time series based on the relative frequency of the sequence patterns for a time series x of data length N, $x = \{x_0, x_1, \ldots, x_{N-1}\}$. The PE is computed in three major steps, detailed below [26].
(1) Partitioning the state space: First, the one-dimensional time series is partitioned into a matrix of overlapping vectors. This partitioning uses two hyperparameters: the embedding dimension m and the time delay τ, where τ = 1 to avoid missing any patterns. These hyperparameters are used to extract the subsequences

$$X_j^m = \{x_j, x_{j+\tau}, \ldots, x_{j+(m-1)\tau}\}, \qquad j = 0, 1, \ldots, N - (m-1)\tau - 1.$$

(2) Finding the ordinal patterns: After partitioning the one-dimensional time series, the elements of each subsequence $X_j^m$ are sorted in ascending order, which maps the subsequence to a unique permutation that captures the ordinal ranking of the data and can then be used to distinguish the corresponding ordinal patterns. There are m! different possible ordinal patterns of length m, termed $\pi_i$, $i = 1, \ldots, m!$.

(3) Calculating the relative frequencies and entropy: For all m! possible permutations, each probability $p(\pi_i)$ is estimated as the relative frequency of the ordinal pattern $\pi_i$ among the subsequences, and the PE is then computed from these probabilities. Here, a lower PE corresponds to a more regular time series [27], as shown in Equation (1):

$$\mathrm{PE} = -\sum_{i=1}^{m!} p(\pi_i)\,\log_2 p(\pi_i), \qquad \text{normalized PE} = \frac{\mathrm{PE}}{\log_2(m!)}. \tag{1}$$
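The three steps above can be sketched in plain Python. This is a minimal illustration, not the MATLAB implementation used in the study; the function names are hypothetical, and ties between equal samples are broken by order of appearance, which is an assumption.

```python
import math
from collections import Counter


def ordinal_pattern(window):
    """Indices that sort the window in ascending order (ties keep their order)."""
    return tuple(sorted(range(len(window)), key=lambda k: window[k]))


def permutation_entropy(x, m=3, tau=1, normalize=True):
    """Permutation entropy of sequence x with embedding dimension m and delay tau."""
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("time series too short for the chosen m and tau")
    # Steps 1 and 2: extract each subsequence and map it to its ordinal pattern.
    counts = Counter(
        ordinal_pattern([x[j + k * tau] for k in range(m)])
        for j in range(n_vectors)
    )
    # Step 3: relative frequencies and Shannon entropy in bits.
    pe = -sum((c / n_vectors) * math.log2(c / n_vectors) for c in counts.values())
    return pe / math.log2(math.factorial(m)) if normalize else pe
```

A strictly increasing series contains a single ordinal pattern, so its PE is zero, matching the statement that a lower PE corresponds to a more regular signal.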
In previous studies, the embedding dimension (m) was selected between 3 and 6; the appropriate value depends on the signal and its sampling frequency, and it has been shown that short permutations cannot capture the entire dynamics [27,42]. Therefore, we selected m = 6 to obtain richer information. Moreover, we used adaptive resampling with interpolation to overcome the data length issue. In signal processing, oversampling is the process of sampling a signal at a sampling frequency that is significantly higher than the Nyquist rate. In general, the Nyquist frequency is half of the sampling rate of a discrete signal processing system. Here, the sensor sampling rate was 45 Hz, so the Nyquist frequency was 22.5 Hz, well above the 3 Hz cutoff retained for the motion signal after filtering. Furthermore, to check the sensitivity of the results to the choice of m, we also calculated the AUC using m = 3 in the "univariate screening and stepwise logistic regression analysis" method (Section 3.1). With m = 3, the AUC obtained in case (ii) was lower than with m = 6, whereas case (iii) had almost the same AUC at both values of m; therefore, we selected m = 6. The PE was normalized to a value between 0 and 1 (0 ≤ PE ≤ 1), which facilitated the comparison between the permutation entropies in the present study [28].

WPE
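The WPE, proposed by Fadlallah et al. [29], accounts for the amplitude information by applying a correcting factor, or weight, to the relative frequencies, so that both the variability and the order of the samples are taken into account. The weight $w_j$ is the variance of the subsequence $X_j^m = \{x_j, x_{j+\tau}, \ldots, x_{j+(m-1)\tau}\}$ of vector x for the embedding dimension m, as shown in Equation (2):

$$w_j = \frac{1}{m}\sum_{k=1}^{m}\left(x_{j+(k-1)\tau} - \bar{X}_j^m\right)^2, \tag{2}$$

where $\bar{X}_j^m$ denotes the arithmetic mean of $X_j^m$ beginning at index j, where $0 \le j \le N - (m-1)\tau - 1$ (i.e., $0 \le j \le N - m$ for τ = 1, giving N − m + 1 subsequences). Thus, the modified $p(\pi_i)$ can be considered the proportion of the variance accounted for by each ordinal pattern, denoted $p_w(\pi_i)$ [29], as expressed in Equation (3):

$$p_w(\pi_i) = \frac{\sum_{j:\,X_j^m \text{ has pattern } \pi_i} w_j}{\sum_{j} w_j}, \qquad \mathrm{WPE} = -\sum_{i=1}^{m!} p_w(\pi_i)\,\log_2 p_w(\pi_i), \qquad \text{normalized WPE} = \frac{\mathrm{WPE}}{\log_2(m!)}. \tag{3}$$

The parameter m and the data length N of the WPE are the same as those of the PE.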
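The weighting scheme can be illustrated with a short sketch in which each subsequence contributes its variance, rather than a unit count, to its ordinal pattern. The function name is hypothetical, ties are broken by order of appearance, and returning 0 for a constant signal (where all weights vanish) is an assumption made only to keep the sketch well defined.

```python
import math
from collections import defaultdict


def weighted_permutation_entropy(x, m=3, tau=1, normalize=True):
    """Weighted permutation entropy with variance weights per subsequence."""
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("time series too short for the chosen m and tau")
    weight_by_pattern = defaultdict(float)
    total_weight = 0.0
    for j in range(n_vectors):
        window = [x[j + k * tau] for k in range(m)]
        mean = sum(window) / m
        weight = sum((v - mean) ** 2 for v in window) / m  # variance of the window
        pattern = tuple(sorted(range(m), key=lambda k: window[k]))
        weight_by_pattern[pattern] += weight
        total_weight += weight
    if total_weight == 0.0:
        return 0.0  # assumption: a constant signal is treated as fully regular
    wpe = 0.0
    for weight in weight_by_pattern.values():
        if weight > 0.0:
            p = weight / total_weight  # share of variance carried by this pattern
            wpe -= p * math.log2(p)
    return wpe / math.log2(math.factorial(m)) if normalize else wpe
```

For the toy series `[0, 1, 0, 10]` with m = 2, the single large upward jump dominates the weights, so the WPE is close to zero even though the unweighted PE of the same series is about 0.92.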

Adaptive Resampling Procedure in PE/WPE
The data length must be considered when performing the PE/WPE analysis; typically, the length of the time series should satisfy N > 5m!. For example, when m = 6 is selected, the data length should exceed 5 × 6! = 3600 points. It is reasonable to assume that with a high sampling rate, a detailed variation of PE/WPE can be achieved. Figure 4 shows an example of the variation of WPE with the number of data points. The WPE decreases as the number of data points increases; however, beyond 5 × 6! data points, the WPE is almost constant. Because the TUG test produces short sequences, the resampling process allowed us to capture the detailed variation in the PE/WPE analysis and extract the local microstructure feature [43], which would help evaluate the fall risk of the elderly in the community.
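The length requirement and one possible resampling step can be sketched as follows. The text does not specify the interpolation scheme, so linear interpolation is used here purely as an assumption, and all function names are hypothetical.

```python
import math


def min_length(m, factor=5):
    """Lower bound 5 * m! that the data length N must exceed."""
    return factor * math.factorial(m)


def resample_linear(signal, target_len):
    """Resample a sequence to target_len points by linear interpolation."""
    n = len(signal)
    if n < 2 or target_len < 2:
        raise ValueError("need at least two input and two output points")
    step = (n - 1) / (target_len - 1)
    resampled = []
    for i in range(target_len):
        position = i * step
        lower = min(int(position), n - 2)  # index of the left neighbour
        fraction = position - lower
        resampled.append(signal[lower] * (1 - fraction) + signal[lower + 1] * fraction)
    return resampled


def ensure_length(signal, m):
    """Upsample only when the signal does not satisfy N > 5 * m!."""
    required = min_length(m) + 1
    if len(signal) >= required:
        return signal
    return resample_linear(signal, required)
```

As an illustration, a hypothetical 10 s recording at 45 Hz has 450 samples, far fewer than the 3600 required for m = 6, so `ensure_length` would upsample it to 3601 points.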

Statistical Analysis
In the present study, candidate predictors (variables) that were carefully selected based on prior knowledge (p < 0.05) were used as inputs to multivariable logistic regression models to determine which features could be used to classify subjects as a fall risk; the results were compared with the SFBBS criterion. The dependent variable in the multivariable logistic regression analysis was the fall-risk classification. We adopted stepwise logistic regression with backward elimination using a p-value criterion of 0.157, which is suitable for prognostic models [44]. The ROC curve, well developed in the field of medicine [45], was also created to further explore the ability of clinical measures and complexity index values to predict fall risk. Here, AUC = 0.5 indicates no discrimination, AUC = 1.0 indicates perfect discrimination, and 0.7 ≤ AUC ≤ 0.9 indicates an acceptable level of discrimination. The functional outcomes of the clinical test were compared using Student's t-test, where the result was considered significant if p ≤ 0.05.
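For readers unfamiliar with the AUC, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen fall-risk subject receives a higher model score than a randomly chosen non-fall-risk subject, with ties counted as one half. The function name and the toy data are hypothetical; the study itself used SPSS and MATLAB.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))
```

With labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75.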

Results
The results are presented in three main parts, covering the SFs, PE, WPE, and stepwise logistic regression analyses. First, after each subject was classified as a fall risk or a non-fall risk using the SFBBS criterion, univariate screening with t-tests was used to select the significant features, mirroring the way medical experts would decide, and stepwise logistic regression was then performed on these features; this is detailed in Section 3.1. Next, stepwise logistic regression was used for automatic variable selection with p < 0.05; this is discussed in Section 3.2. Finally, these two methodologies are compared in Section 3.3. Additionally, the AUC of the logistic regression results was calculated to identify the decisive features whose predictions most closely match the results of the clinical tests for decision making.

Univariate Screening and Stepwise Logistic Regression Analysis
In accordance with the criterion used by medical experts, subjects were considered at risk of falling if their total SFBBS score was < 23. Of the 85 subjects, 19 were classified as a fall risk according to the SFBBS, with an age of 78.37 ± 7.54 years; 66 were classified as non-fall risk, with an age of 75.47 ± 6.74 years.
The results and discussion are divided into three cases: (i) SFs, (ii) SFs and PE, and (iii) SFs and WPE. Stepwise logistic regression was performed for each case, and the AUC was calculated for comparison. The results of the t-tests for each of the 21 features are detailed in Table 1. The feature values that passed the t-test (p ≤ 0.05) were included in the stepwise regression, summarized in Table 2 and detailed in Table 3. Table 3 shows each case's selected significant features together with the omnibus test of the logistic regression model coefficients, which is equivalent to the ANOVA F-test in linear regression. The omnibus test rejected the null hypothesis that all of the β coefficients are zero, indicating that the model is significant and has predictive ability. When checking the odds ratio (∆ odds, expressed as EXP(B) in SPSS), variables with p < 0.157 were retained, as suggested in a previous work that implemented stepwise logistic regression with backward elimination for prognostic models [44]; these variables indicate a direct relationship with the risk of falling. In case (i), F1 and F9 were both significant; in case (ii), F9 and F18 were significant; and in case (iii), F1, F20, and F21 were significant. The ROC curve and AUC were then calculated to judge the overall predictive ability of the model; they are shown in Figure 5, where the calculated AUCs of cases (i), (ii), and (iii) are 0.8573, 0.9091, and 0.9274, respectively. Because a larger AUC indicates a more accurate classifier, case (iii) (i.e., SFs and WPE) offered the best performance. Here, AUC = 0.5 indicates no discrimination, 0.7 ≤ AUC ≤ 0.8 indicates acceptable discrimination, 0.8 ≤ AUC ≤ 0.9 indicates excellent discrimination, and 0.9 ≤ AUC ≤ 1.0 indicates outstanding discrimination.

Table 3. Stepwise logistic regression results for each case (columns: omnibus test; ∆ odds (EXP(B)) and significance; rows: case i, case ii, case iii).

Direct Use of Stepwise Logistic Regression Analysis
Next, stepwise logistic regression was performed directly (i.e., the features were not pre-selected) to determine the predictive factors. The features selected by the regression are shown in Table 4. As before, the ROC curve and AUC were calculated to evaluate the logistic regression models; the resulting AUCs for cases (i)-(iii) were 0.924, 0.963, and 0.948, respectively, as shown in Figure 6. Because a larger AUC indicates a more accurate classifier, case (ii) (i.e., SFs and PE) offered the best performance.

Comparison between the Subsequent and Direct Logistic Regression Methods
In this section, the results in Sections 3.1 and 3.2 (i.e., using univariate screening and automatic selection) are compared. At first glance, the results of the cases differ between the two methods; however, in both, the entropy features improve prediction: the models that included PE or WPE features are more accurate than the model that included only standard SFs. When using univariate screening and subsequent regression, case (iii) exhibited the highest accuracy; the selected features were F1, F9, F15, F20, and F21. When using direct regression, case (ii) demonstrated the best predictive abilities; the selected features were F1, F2, F3, F4, F6, F12, F14, and F18. Only F1 was selected in both methods, and features of the anterior−posterior (AP; forward: +, backward: −) axis were the most significant among those selected. The results from the confusion matrix (Table 5) indicated that the inclusion of PE or WPE improved the specificity; however, 100% specificity was not achieved. This more specific model decreases the rate of false positives, but at the cost of lower sensitivity; consequently, some patients at risk of falling could not be identified.
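The trade-off between specificity and sensitivity can be made concrete with a small sketch. The function names and the toy labels are hypothetical and do not reproduce Table 5; label 1 denotes fall risk and label 0 denotes non-fall risk.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 means fall risk."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 0 and predicted == 1:
            fp += 1
        elif truth == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp)
```

In a toy example with four fall-risk and six non-fall-risk subjects, a model that flags only two of the fall-risk subjects and none of the others has a specificity of 1.0 but a sensitivity of only 0.5, which is the pattern described above: few false positives, but some at-risk subjects are missed.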

Discussion
Fall risk has commonly been determined using inertial sensors as assessment tools to calculate physiological values, such as gait step and gait speed [9,12,[19][20][21]; however, these values are affected by the accelerations and displacements along different axes and by the measuring devices used, such as gauges and gyroscopes. Additionally, they are rarely calculated from the sensor signal itself. This work therefore aimed to clarify the sensor values using PE and WPE during feature selection to predict fall risk, rather than focusing on stopwatch feedback.
Because regression analysis was used in different ways to explore the prediction results, variable selection is a significant area of interest [46]. Predictive regression models have been used to explore the independent variables (sensor features), and different ways of exploring the importance of these variables lead to different statistical models and lines of reasoning. In this research, all candidate extracted features were included, and feature selection based on the t-test was one way to determine the feature sets. Thus, we discuss the model content and seek the best model performance and results. Howcroft et al. [11] reviewed the three main methods used to establish subjects' fall risk and how often each is used: retrospective fall history (30%), prospective fall occurrence (15%), and scores on clinical assessments (32.5%). Fall history and clinical assessments are thus the most commonly used; in this study, however, only clinical assessments could be analyzed. The sequential use of univariate screening and logistic regression follows the model-building practice of running bivariate analyses and entering all variables with p < 0.05 into the multivariate analysis, which is very common (and even widely accepted) in current journals. A study of model-building methods [24] showed that adding WPE can achieve better performance. Prior researchers have indicated that "univariate screening and multivariate analyses" is prone to false positives (i.e., a false indication of correlation); here, false positives were avoided by applying the SFBBS criterion first. The results discussed in Section 3.1, "Univariate Screening and Stepwise Logistic Regression Analysis", seem to be in line with prior researchers' [26] findings that WPE can overcome the shortcomings of PE and is more effective.
However, in the results of "Direct Use of Stepwise Logistic Regression Analysis" (Section 3.2), feature values that are not significant (p > 0.05) are also retained in the model. In terms of statistical interpretation, the feature values lie in the hyperplane formed by multi-dimensional projections, which gives the best performance. Therefore, from a data science perspective, the results discussed in Section 3.2 constitute a reasonable mathematical model for prediction. However, in the field of biomedicine, such as the study of fall risk, it is difficult to explain why a non-significant feature is important, because medical experts are generally not concerned with non-significant features, although some studies state that such a feature "may be" a potential factor. Regardless, a criterion that can be used as the first step of bivariate analysis may help prevent false positives.
The two prediction approaches started from different models and led to inconsistent results; however, in both, adding entropy as a feature value gave better overall predictive ability than using SFs alone. PE and WPE can thus be computed from the inertial sensor signal in real time to support practical services, such as community services. In addition, F1 (mean_V) was selected in both prediction approaches, and the features of the AP axis offered the most predictive ability in the prediction model. Thus, the sensor can detect shaking along the AP axis that is difficult to recognize with the naked eye and that appears to be related to maintaining proprioception.
Researchers have demonstrated that multifactorial analysis is most effective at predicting older adults' fall risk [47]. This work therefore aimed to discuss which sensor feature readings can most accurately be used to predict fall risk. The use of inertial sensors to explore posture control has been widely studied [48], and we hope to extend the discussion on how to use significant features to effectively predict the risk of falls. Because collecting data from the elderly is very difficult, a balanced gender distribution was not achieved, and the discussion of gender differences is therefore limited. Overall, in terms of decision making, the purpose must be understood first, because different goals lead to different forecasting methods and, in turn, to different results. This research attempts to explain the results obtained from these different standpoints.

Conclusions
In this work, WPE and PE were applied to inertial sensor signals to predict the fall risk of elderly people. Stepwise logistic regression analysis was applied in two different ways to place the inertial data on a fall-risk scale, allowing medical practitioners to screen for high-risk patients. Logistic regression was used both with automatic feature selection and with features chosen according to clinical judgment, to explore the differences in decision making. The results indicated that directly applying stepwise logistic regression to PE and SFs provided the best AUC value and avoided false positives. Additionally, the model combining WPE and SFs, obtained using the clinical test (SFBBS) as the gold standard, gave predictions closer to the decision-making reference of clinical experts. Therefore, the proposed methodology can provide decision-makers with a more accurate way to classify fall risk in elderly people.