Current Cut Points of Three Falls Risk Assessment Tools Are Inferior to Calculated Cut Points in Geriatric Evaluation and Management Units

: Background: Falls risk assessment tools are used in hospital inpatient settings to identify patients at increased risk of falls to guide and target interventions for fall prevention. In 2022, Western Health, Melbourne, Australia, introduced a new falls risk assessment tool, the Western Health St. Thomas’ Risk Assessment Tool (WH-STRATIFY), which adapted The Northern Hospital’s risk tool (TNH-STRATIFY) by adding non-English speaking background and falls-risk medication domains to reﬂect patient demographics. WH-STRATIFY replaced Peninsula Health Risk Screening Tool (PH-FRAT) previously in use at Western Health. This study compared the predictive accuracy of the three falls risk assessment tools in an older inpatient high-risk population. Aims: To determine the predictive accuracy of three falls risk assessment tools (PH-FRAT, TNH-STRATIFY, and WH-STRATIFY) on admission to Geriatric Evaluation Management (GEM) units (subacute inpatient wards where the most frail and older patients rehabilitate under a multi-disciplinary team). Method: A retrospective observational study was conducted on four GEM units. Data was collected on 54 consecutive patients who fell during admission and 62 randomly sampled patients who did not fall between December 2020 and June 2021. Participants were scored against three falls risk assessment tools. The event rate Youden (Youden Index ER ) indices were calculated and compared using default and optimal cut points to determine which tool was most accurate for predicting falls. Results: Overall, all tools had low predictive accuracy for falls. Using default cut points to compare falls assessment tools, TNH-STRATIFY had the highest predictive accuracy (Youden Index ER = 0.20, 95% confidence interval CI = 0.07, 0.34). The PH-FRAT (Youden Index ER = 0.01 and 95% CI = − 0.04, 0.05) and WH-STRATIFY (Youden Index ER = 0.00 and 95% CI = − 0.04, 0.03) were statistically equivalent and not predictive of falls compared to TNH-STRATIFY. When calculated optimal cut points were applied, predictive accuracy improved for PH-FRAT (Cut point 17, Youden Index ER = 0.14 and 95% CI = 0.01, 0.29) and WH-STRATIFY (Cut point 7, Youden Index ER = 0.18 and 95% CI = 0.00, 0.35). Conclusions: TNH-STRATIFY had the highest predictive accuracy for falls. The predictive accuracy of WH-STRATIFY improved and was signiﬁcant when the calculated optimal cut point was applied. The optimal cut points of falls risk assessment tools should be determined and validated in different clinical settings to optimise local predictive accuracy, enabling targeted fall risk mitigation strategies and resource allocation.


Introduction
Inpatient falls are the most commonly reported incidents in many hospitals with higher risk in older adult sub-acute patients [1][2][3][4][5].A fall is defined as "an event which results in a person coming to rest inadvertently on the ground or floor or other lower level," the definition widely accepted and used by the World Health Organisation [6][7][8][9].One Australian study reported over 40% of patients have experienced at least one fall during their admission [10].As falls lead to health complications for patients (both physical and psychological) [11], and greater utilisation of hospital resources [12], falls risk assessment tools have been used as part of a broader plan to reduce the risk of falls for patients in sub-acute wards [13].The major risk factors for inpatient falls include delirium, cognitive impairment, previous falls, neurological disorders, and sensory impairments [14][15][16][17].The main reason for an older adult's admission to Geriatric Evaluation and Management (GEM) units is to improve mobility and function prior to discharge [17].GEM patients are typically frail with multiple co-morbidities and a high risk of falls [17].
The term falls risk assessment tool has been used to describe a class of diagnostic processes to manage falls risk [3].These include risk-factor checklists that prompt healthcare workers to identify common modifiable fall risk factors to reduce harm through targeted plan development.Numerical risk prediction tools have cut points on a scale designed to predict the risk of future falls by calculating a score from a set of risk factors.
In 2022, a global multidisciplinary group presented consensus recommendations promoting the use of multifactorial falls risk assessments in preference to numerical falls risk screening [18].This included recommendations that assessments, interventions, and strategies should consider local context and resources [18].Numerical risk assessment tools, however, can form part of a multifactorial risk assessment, and due to the retrospective nature of our study, we sought to assess the accuracy of numerical falls risk screening tools.
Western Health previously used the Peninsula Health Falls Risk Screening Tool (PH-FRAT) (Appendix A) first developed in 1999 by the Peninsula Health Falls Prevention Service.The PH-FRAT is used by approximately 400 agencies worldwide [6] in a variety of settings including sub-acute care [6][7][8].However, its predictive performance for falls has been found to be poor [6].The St. Thomas Risk Assessment Tool in Falling elderly inpatients (STRATIFY) is the most widely studied falls risk assessment tool and has the best diagnostic validity [6].The Northern Hospital Modified St Thomas's Risk Assessment Tool (TNH-STRATIFY) was modified from STRATIFY based on local data and included additional risk factors: age, impaired balance, drug and alcohol-related problems, and broadening of the agitation item to include confusion, intellectually challenged or impulsivity.These modifications resulted in statistically significant improvements to the predictive accuracy of TNH-STRATIFY compared to the original STRATIFY tool [19].
In 2021, the Western Health Falls Working Group modified the TNH-STRATIFY to form a new tool (WH-STRATIFY, Appendix A) by adding two additional risk factors: Non-English-Speaking Background (NESB) and medications affecting mobility (sedatives, antidepressants, anti-Parkinson's, diuretics, anti-hypertensives, hypnotics, and opioids), based on evidence of these medications increasing falls risk [6,7].The NESB category was added to the WH-STRATIFY [6] based on the expectation that NESB leads to greater disadvantage in communication and education on falls risk management [6,7].A difference between WH-STRATIFY and its predecessors is the inclusion of suggested management strategies directly linked to each identified fall risk factor.In addition to numerical scoring, WH-STRATIFY promotes interventions tailored to the patient's individual risk, aligned with current guidelines [18].In 2022, WH-STRATIFY was launched at Western Health.We sought to undertake the first study to validate and assess the predictive accuracy of the WH-STRATIFY tool and assess whether adding local demographic risk factors such as non-English speaking background which is prevalent in Western Health, improves the tool's accuracy.
The falls risk assessment tools should be tested in clinical practice for validity and feasibility prior to use [6].This includes comparing predictive accuracy to other falls risk assessment tools [6][7][8][9] and establishing the optimal cut points (i.e., the threshold at which a falls risk assessment tool predicts a fall [6,8,9,20].In this study, a cut point is the minimum score required on a falls risk assessment tool to achieve the classification of someone predicted to have a fall during their GEM admission.Predictive accuracy varies with the cut point for different populations suggesting the cut point should be validated in the setting where the tool is applied [6].TNH-STRATIFY and PH-FRAT have been validated in other settings but their current cut-points could be optimised to improve predictive accuracy.We hypothesised that local validation will optimise the predictive accuracy of falls risk assessment tools.

Aims
This study of GEM unit inpatients compares the predictive accuracy for falls of the PH-FRAT, TNH-STRATIFY, and WH-STRATIFY risk assessment tools using default and calculated optimal cut points [17].We chose to assess participants admitted to GEM because this patient population has an established high risk for falls.

Demographics
A total of 116 participants, comprised of 54 fallers and 62 non-fallers with mean age 81.0 years (fallers) and 79.3 years (non-fallers), p-value = 0.284.Fallers had a significantly higher average length of stay than non-fallers (28.0 days compared to 13.7 days, p-value < 0.0001 (Table 1).

Predictive Accuracy Using Optimal Cut Points
The default cut points for PH-FRAT and WH-STRATIFY were not optimal.The Youden Index ER can be maximised to 0.143 (at optimal cut point 17 for PH-FRAT) and The PH-FRAT was poor at differentiating the fall risks of participants.The majority (82%) of participants were assigned a high score of 16 which overcalls fallers, noting a cut point of 12 (Appendix B).In comparison, there is a larger spread of scores for TNH-STRATIFY and WH-STRATIFY (close to 90% of participants scored between 2 and 6 for TNH-STRATIFY and between 3 and 7 for WH-STRATIFY (Appendix B).
Of the three falls risk assessment tools using default cut points, TNH-STRATIFY has the highest predictive accuracy with a Youden Index ER of 0.20 and 95% Confidence Interval (CI) 0.07, 0.34).This difference was statistically significant (Table 2).PH-FRAT and WH-STRATIFY had similar predictive accuracy.PH-FRAT has a Youden Index ER 0.01 and 95% CI −0.04, 0.05.WH-STRATIFY has a Youden Index ER of 0.00 and 95% CI −0.04, 0.03 (Table 2).Both PH-FRAT and WH-STRATIFY had a sensitivity ER of 0.98 and a specificity ER of 0.02, respectively.

Predictive Accuracy Using Optimal Cut Points
The default cut points for PH-FRAT and WH-STRATIFY were not optimal.The Youden Index ER can be maximised to 0.143 (at optimal cut point 17 for PH-FRAT) and 0.183 (at optimal cut point score 7 for WH-STRATIFY) (Table 3).The optimal cut point for TNH-STRATIFY is the default cut point (3).Using optimal cut points, TNH-STRATIFY has the highest Youden Index ER (0.20) followed by WH-STRATIFY (0.18) and PH-FRAT (0.14) (Table 4).The Youden Index ER confidence intervals for PH-FRAT (0.01 to 0.29), WH-STRATIFY (0.00 to 0.35), and TNH-STRATIFY (0.07 to 0.34) overlap.Therefore, TNH-STRATIFY no longer has predictive accuracy superiority using optimal cut points, and the predictive accuracy for the three falls risk assessment tools is comparable.

Discussion
This study showed that current tools have poor predictive accuracy for falls at their default cut points however, calculating optimal cut points for the three falls risk assessment tools shows promise for better predictive accuracy in high-risk inpatient populations.Using original cut points, PH-FRAT and WH-STRATIFY had no predictive accuracy for falls (Youden Index ER of 0.01 and 0.00 respectively).This study indicated that using WH-STRATIFY instead of PH-FRAT did not improve the predictive accuracy of falls on admission to GEM.Of the three falls risk assessment tools, PH-FRAT had the lowest falls prediction accuracy.These results are consistent with a previous large sample study on PH-FRAT [22].This may be because the PH-FRAT assigns an automatic score of 16 when there is a change in function.As a change in function is a predominant reason for patients admitted to GEM, there is a selection bias that automatically classifies patients as high-risk for falls.In clinical practice, the low predictive accuracy of falls risk reduces the identification of patients at risk of falls thus misdirecting appropriate fall prevention strategies [6].
This study included the impact of adding two new risk factors (NESB and medications associated with falls) to TNH-STRATIFY to create a new falls risk assessment tool, WH-STRATIFY.These additions did not improve the predictive accuracy of WH-STRATIFY, in this study's population.Although the two additional risk factors were incorporated in WH-STRATIFY, the default cut-point score remained at 3. We identified that adjusting this to 7 maximised predictive accuracy (Youden Index ER 0.00 at the default cut point 3, improved to Youden Index ER of 0.18 at optimal cut point 7).This remained lower than the Youden Index ER for TNH-STRATIFY at its optimal cut point (0.20), suggesting the addition of the two domains in WH-STRATIFY did not result in better predictive accuracy than TNH-STRATIFY.While there is evidence showing certain medications increase falls risk, there is limited evidence regarding whether patients with NESB have an increased risk of falls in hospital [23].
It is important to note there are risk factors for falls, such as sarcopenia and frailty, which are not captured in these risk assessment tools and should be considered in future.
The poor predictive accuracy overall suggests limitations with the use of numerical risk prediction tools in isolation for falls prediction and supports recommendations that clinical judgement and multi-disciplinary team-based assessment may be more effective than numerical risk prediction tools alone, in this setting.Falls risk assessment tools and management strategies should be locally designed according to the population and local resources available to improve efficacy.Calculating optimal cut points optimises the predictive accuracy of falls risk assessment tools to improve the identification of fall risk in clinical settings and allows for improved allocation of hospital resources targeting fall mitigation [23].
The optimal cut point for WH-STRATIFYreduces sensitivity and may not always be clinically desirable in a GEM population as this group is already at increased risk of falls, given reduced muscle strength and decline in mobility [1].A carefully balanced consideration of competing factors is required to achieve effective fall prevention strategies by finding acceptable sensitivity and specificity ranges to reduce the misclassification of fallers or non-fallers.We have shown that falls risk assessment tools should undergo clinical validation and calculation of an optimal cut point before being incorporated into a fall prevention program [20].
When comparing the three falls risk assessment tools, TNH-STRATIFY demonstrated the best predictive accuracy with statistically significant and highest Youden Index ER of 0.20 (see Table 4), however, notably this tool's accuracy is still low to moderate at best.The tool's predictive accuracy at the optimal cut-point (Youden Index of 0.20), however, is also lower compared to the initial study where the calculated Youden Index for TNH-STRATIFY was 0.44 which suggests that TNH-STRATFY had a better efficacy when they are targeted to their local patient population [19].It is common in falls risk assessment tool studies to show poor results replication [6,7,21,22].This suggests that falls risk assessment tools' predictive accuracy may be affected by the patient sample and clinical setting.In other words, risk factors more predictive of fallers or non-fallers in the initial study setting are represented in the risk score, rather than in replicated studies [21,24].Further studies are required to explore the underlying reasons for driving different predictive accuracies of falls risk assessment tools across different studies and whether this is due to certain changes in study population characteristics.

Limitations of the Study
This study had some strengths and limitations.This was an adaptation and validation study of a well-recognised risk assessment tool (TNH-STRATIFY) forming the WH-STRATIFY in a novel setting, as recommended in World Falls Guidelines [6].In this retrospective study, only the numerical component of WH-STRATIFY could be assessed.The impact of WH-STRATIFY in its entirety requires future prospective evaluation of fallsrisk management post-implementation to define how this tool performs in alignment with recent recommendations in favour of the use of multi-factorial risk assessment tools over numerical risk prediction tools [18].
Data depended on the quality of EMR documentation which was affected by incomplete data entry at the point of care.Due to the retrospective nature of this study, there was no opportunity to clinically assess patients in real-time during their admission or gather any collateral history that would be helpful with calculating patients' falls risk beyond that which can be included in a numerical tool.When completing the falls risk assessment tools in clinical practice, staff may use alternate data sources rather than sole reliance on EMR data.However, the limitations of any missing data affected the collection of data for all three falls risk assessment tools similarly.
This study captures patients' falls risk on admission to GEM, however, in clinical practice, this risk may change during their admission which could alter the predictive ability of the tools.Regular re-scoring throughout a patient's admission using the WH-STRATIFY may lead to a more accurate score contemporaneous with the fall.Furthermore, the random selection of the non-faller comparator group showed non-significant differences in gender and age between fallers and non-fallers, and significant differences in length of stay.Prospective studies and feedback surveys from users of these tools at different time periods, including survival analysis for time-to-fall, could explore these limitations.
This study focused on a sample of GEM patients from different hospitals of a single health network which may reduce generalisability.This study minimised selection bias by randomisation and use of consistent criteria (Appendix A) when calculating the falls risk scores of each patient.Future studies across multiple hospital networks and patient groups in larger numbers would better determine generalisability of the findings.

Participants and Data Collection
We conducted this retrospective observational study at Western Health, a metropolitan health network servicing the Western Suburbs of Melbourne, Australia.This study involved patients admitted to four Geriatric Evaluation and Management (GEM) units across the network [6].Participants were older adults, aged 65 or above, with acute deterioration in functional abilities due to illness, injury, or cognitive decline and were at risk of falls.This study focused on assessing numerical risk falls prediction tools so targeted interventions and prevention strategies could be implemented to mitigate fall risk during hospitalisation [6].
Adults aged 65 or above with inpatient falls admitted to GEM at Western Health between December 2020 and June 2021, identified through mandatory reporting, met inclusion criteria.This specific time window was selected to be outside the COVID-19 surge [25] to minimise changes to the typical patient profile and medical practices in GEM.A retrospective file review with data collection occurred in fifty-four consecutive patients who fell at least once during admission between December 2020 and June 2021.Sixty-two patients, aged 65 or above, admitted to GEM within the same time period who did not fall during their GEM admission were randomly selected for comparison in our study.We excluded any patients who were outside of the age of 65 or above and excluded any patients who were outside of the GEM admission period between December 2020 and June 2021.As this study was performed before the clinical launch of WH-STRATIFY only a retrospective assessment was possible.
Demographic and health data were manually retrieved from the Electronic Medical Record (EMR) between January 2022 and April 2022.The EMR was reviewed to score each participant with PH-FRAT, TNH-STRATIFY, and WH-STRATIFY (Appendix A).Data was stored in REDCap TM .Only data available on EMR on the day of and day preceding GEM admission were used.To reduce observation bias, blind scoring was conducted by randomising the total sample so knowledge of patients' fall status was not known by the assessor.As there are no previous studies of WH-STRATIFY, sample sizes could not be statistically determined a priori.The sample size of 116 was based on the sample size used in a comparable study [21].

Classification of Predicted Fallers and Non-Fallers
For prediction purposes, participants were classified as predicted fallers if they scored at or above the cut points of the falls risk assessment tools.Similarly, participants were classified as predicted non-fallers if they scored below the falls risk assessment tools' cut points.

Scoring of Falls Risk Assessment Tools
Using information available on the EMR on each patient's admission to GEM, the falls risk factors for these risk assessment tools (as listed in Table 5 below) were retrospectively identified and a score was calculated according to the three falls risk assessment tools' criteria and their cut-points.The higher the participants scored according to the falls risk assessment tools, the higher likelihood of their fall risk as they are more likely to reach or exceed the tools' cut point.Participants with a score at or above the cut point were assessed by the falls risk assessment tool to be a predicted faller.The predictions were compared with the actual outcome during admission to assess each tool's predictive accuracy.

Ethics
Ethics approval was obtained from Western Health Office for Research (ERM ID 81444).

Statistical Analysis
Due to the variability of participants' falls' frequency and admission length, the fall event rate (ER) (defined as the frequency of falls during the patient's admission) was used [24].Sensitivity ER is the number of falls during the patient's GEM admission correctly predicted by the falls risk assessment tool divided by the total number of falls.Specificity ER is the length of hospital stay for non-fallers accurately predicted to have low fall risk by the falls risk assessment tool divided by the total length of hospital stay for all non-fallers.The event rate Youden Index (Youden Index ER ) [6,24] measured the predictive accuracy of the falls risk assessment tools.The Youden Index ER is the sum of the event rate sensitivity (sensitivity ER ) and event rate specificity (specificity ER ) less 1 and produces a value between −1 and 1, with a higher value indicating greater predictive accuracy.
The Youden Index ER is preferred [24,26] as it provides a measure of accuracy by equally weighing sensitivity (i.e., the tool correctly predicts the patient is at high risk of fall) and specificity (i.e., the tool correctly predicts the patient is at low risk of fall).It adjusts for patients who had multiple falls and GEM length of stay.Statistical significance was assessed using bootstrapped 95% confidence intervals.Bootstrapping is a statistical method that resamples a single data set of the current study to create many simulated samples.This process allows the construction of the confidence intervals [6] and other studies have also used this method to derive 95% confidence intervals for the Youden Index ER [26].If the 95% confidence intervals for two falls risk assessment tools overlap, this implies that the two tools do not differ significantly at a 5% level.
The optimal cut point is the score that maximises the predictive accuracy of the three falls risk assessment tools (WH-STRATIFY, PH-FRAT, and TNH-STRATIFY).Using optimal cut points reduces misclassification of those who are likely or not to fall during GEM admission.Microsoft Excel was used in the statistical implementation.

Deriving Optimal Cut Points
Cut points were modified to demonstrate impact on the sensitivity ER , specificity ER , and Youden Index ER .We selected the optimal cut point as the score where the Youden Index ER is at its highest calculated value.

Conclusions
Of the three falls risk assessment tools (TNH-STRATIFY, WH-STRATIFY, and PH-FRAT) using the default cut points, TNH-STRATIFY offered the highest predictive accuracy.PH-FRAT and WH-STRATIFY at default cut points had negligible predictive accuracy, and showed modest improvement when using calculated optimal cut points.However, this study showed the predictive accuracy for all three falls risk assessment tools remained low at their default and optimal cut-points.
This study supports the recommendation that numerical risk prediction tools in isolation are insufficient for the purpose of identifying patients at high risk of falls and guiding falls risk interventions Clinical judgment and multi-disciplinary assessments are important to help predict the falls risk for inpatients and enable the use of person-centered falls risk reduction strategies [16].Future prospective research is required to assess the utility of WH-STRATIFY and associated individualised fall risk interventions as part of a falls risk management program [16].The optimal cut points of falls risk assessment tools should be determined and validated in different clinical settings to optimise predictive accuracy, support targeted falls risk mitigation, and improve resource allocation.Psychological condition has not impacted on daily ADLs c.
No change to antipsychotic or antidepressant on the GEM admission medication list d.
Moderately affected e.
Psychological condition is an active issue as per the medical team on GEM admission f.
As required antipsychotic or benzodiazepine on the GEM admission medication list g.
Severely affected h.
Psychological condition is the reason for the patient's acute hospitalisation prior to transfer to GEM i.
Code grey or requiring IM antipsychotic Psychological condition led to the patient needing a 1:1 special on admission to GEM  Table A3 and Figure A1 show the impact on the predictive accuracy diagnostics by varying the falls cut-off score.This table compared the three falls risk assessment tools (PH-FRAT, WH-STRATIFY, and TNH-STRATIFY) with regards to their cut-off scores demonstrated their respective Youden Index ER

Figures 1 - 4 Figure 1 . 3 Figure 1 .
Figures 1-3 summarise the number of fallers and non-fallers for the three falls risk assessment tools using default cut points.Default cut points are the original scores for each tool that denote a patient with a high or low fall risk (e.g., PH-FRAT = 12; TNH-STRATIFY = 3; WH-STRATIFY = 3).Muscles 2023, 2, FOR PEER REVIEW 4

Figure 1 .
Figure 1.Number of fallers and non-fallers by PH-FRAT score.

Figure A1 .Figure A1 .
Figure A1.Graph of event rate diagnostics by varying the fall cut-off score for PH-FRAT, TNH-STRATIFY, and WH-STRATIFY.

Table 2 .
[21]nostic predictive accuracy metrics for PH-FRAT, TNH-STRATIFY, and WH-STRATIFY using cut points for predicting a faller.Values in parenthesis are bootstrapped 95% confidence intervals based on 1000 repetitions of the original sample size.The event rate metrics also factors for patients who may have had multiple falls in the same admission and the patient's length of stay[21].

Table 3 .
Event rate diagnostics by varying the fall cut-off score for PH-FRAT, TNH-STRATIFY, and WH-STRATIFY.

Table 3 .
Cont.Peninsula Health Falls Risk Screening Tool, TNH-STRATIFY-The Northern Hospital Modified St Thomas's Risk Assessment Tool, WH-STRATIFY-Western Health St. Thomas' Risk Assessment Tool (See Appendix C for graphs which illustrate these results).

Table 4 .
Diagnostic predictive accuracy metrics for PH-FRAT, TNH-STRATIFY, and WH-STRATIFY using optimal cut points for predicting a faller.
Values in parenthesis are bootstrapped 95% confidence intervals based on 1000 repetitions of the original sample size.

Table A1 .
Cont.EMR Initial patient assessment completed by nursing on admission à look at Speech and Sensory deficits and check for Visual impairment, glasses/contact lenses • EMR Interactive View → Adult Systems Assessment → Mobility status Definition:

Table A3 .
Event rate diagnostics by varying the fall cut-off score for PH-FRAT, TNH-STRATIFY, and WH-STRATIFY.