Comparative Analysis of Frailty Scores for Predicting Adverse Outcomes in Hip Fracture Patients: Insights from the United States National Inpatient Sample

The aim of the current investigation was to compare the ability of several frailty scores to predict adverse outcomes in hip fracture patients. All adult patients (18 years or older) who suffered a hip fracture due to a fall and underwent surgical fixation were extracted from the 2019 National Inpatient Sample (NIS) Database. A combination of logistic regression and bootstrapping was used to compare the predictive ability of the Orthopedic Frailty Score (OFS), the Nottingham Hip Fracture Score (NHFS), the 11-factor (11-mFI) and 5-factor (5-mFI) modified Frailty Indices, and the Johns Hopkins Frailty Indicator. A total of 227,850 patients were extracted from the NIS. In the prediction of in-hospital mortality and failure-to-rescue (FTR), the OFS surpassed all other frailty measures, approaching an acceptable predictive ability for mortality [AUC (95% CI): 0.69 (0.67–0.72)] and achieving an acceptable predictive ability for FTR [AUC (95% CI): 0.70 (0.67–0.72)]. The NHFS demonstrated the highest predictive ability for any complication [AUC (95% CI): 0.62 (0.62–0.63)]. The 11-mFI exhibited the highest predictive ability for cardiovascular complications [AUC (95% CI): 0.66 (0.64–0.67)] and the NHFS achieved the highest predictive ability for delirium [AUC (95% CI): 0.69 (0.68–0.70)]. No score succeeded in effectively predicting venous thromboembolism or infections. In summary, the investigated frailty scores were most effective in predicting in-hospital mortality and failure-to-rescue; however, they struggled to predict complications.


Introduction
Hip fractures represent a significant and growing public health concern, which predominantly afflicts the elderly population [1]. Projections indicate a substantial rise in the annual number of hip fractures, which are expected to reach 4.5 million globally by the year 2050 [2]. This is of particular concern given that the hip fracture rates in the United States are among the highest in the world [3]. One-year postoperative mortality rates reach 27% [4] and even exceed 50% in certain demographic cohorts [5], which underscores the critical importance of addressing hip fractures comprehensively and urgently to mitigate their profound impact on individuals and healthcare systems alike.
This high mortality rate in hip fracture patients stems largely from the high prevalence of frailty within this population. Studies reveal stark contrasts, with crude 30-day and 90-day mortality rates among frail individuals up to ten times higher than those among non-frail hip fracture patients [6]. Furthermore, frailty has been found to be the single most important predictor of in-hospital mortality in hip fracture patients [7]. Complicating matters, however, is the proliferation of published frailty scores, leading to uncertainty regarding their use and effectiveness. This equipoise underscores the need to identify which frailty scores perform optimally. Consequently, the current study aims to compare the ability of several frailty scores to predict adverse outcomes in hip fracture patients.

Materials and Methods
The data for the present study were sourced from the 2019 United States National Inpatient Sample (NIS). Managed by the Agency for Healthcare Research and Quality, the NIS constitutes the largest all-payer inpatient database in the United States, drawing from a 20% sample of the nation's hospitalizations. The NIS is sampled from the State Inpatient Databases, including all inpatient data that are currently contributed to the Healthcare Cost and Utilization Project. Utilizing validated sampling algorithms integrated with discharge and survey weights, the NIS ensures accurate national estimates for a substantial 97% of all inpatient hospitalizations in the country [8]. The current investigation adhered to the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) guidelines.
All individuals aged 18 years or older who had undergone emergency hip fracture surgery following a traumatic fall were included in this study. Surgical procedures involved internal fixation methods, including open reduction internal fixation or intramedullary nailing, as well as arthroplasty, both total hip arthroplasty and hemiarthroplasty. To reduce heterogeneity in the dataset, patients with head, vascular, or truncal injuries were excluded. Only 0.4% of the patients (N = 925) were missing data required for calculating the frailty scores; consequently, these patients were excluded. The identification of the previously listed variables relied on the International Classification of Diseases 10th Revision (ICD-10) codes recorded in the NIS [9]. The data retrieved included patient demographics (age, sex, and race/ethnicity), comorbidities, fracture morphology, surgical management, origin of admission, and functional ability, as well as discharge disposition and in-hospital complications [10].

Orthopedic Frailty Score
The Orthopedic Frailty Score (OFS) was developed for the explicit purpose of evaluating frailty and predicting short-term mortality in individuals with hip fractures [11,12]. The OFS was calculated based on the presence of five dichotomous variables: congestive heart failure, a history of malignancy (excluding non-invasive skin cancer), institutionalization, dependence on assistance for activities of daily living (indicating a non-independent functional status), and an age of 85 years or older. Each of these variables was ascribed one point, resulting in a maximum possible score of 5 [11].
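As a minimal sketch of the scoring rule described above, the OFS can be expressed as a count of the five criteria. The container class and its field names are hypothetical and chosen only for illustration; how each criterion is derived from ICD-10 codes in the NIS is not shown.

```python
from dataclasses import dataclass


@dataclass
class OFSInputs:
    """Hypothetical container for the five OFS variables (field names are illustrative)."""
    age: int
    congestive_heart_failure: bool
    history_of_malignancy: bool  # excluding non-invasive skin cancer
    institutionalized: bool
    non_independent_functional_status: bool


def orthopedic_frailty_score(p: OFSInputs) -> int:
    """Return the OFS: one point per criterion present, range 0 to 5."""
    criteria = [
        p.congestive_heart_failure,
        p.history_of_malignancy,
        p.institutionalized,
        p.non_independent_functional_status,
        p.age >= 85,  # age criterion: 85 years or older
    ]
    return sum(criteria)


example = OFSInputs(
    age=87,
    congestive_heart_failure=True,
    history_of_malignancy=False,
    institutionalized=False,
    non_independent_functional_status=True,
)
print(orthopedic_frailty_score(example))  # 3
```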

Nottingham Hip Fracture Score
The Nottingham Hip Fracture Score (NHFS) was predominantly calculated according to the original study by Maxwell et al. [13]. Patients received 3 points if their age fell within the range of 66 to 85 years and 4 points if they were 86 years or older. Additionally, patients received 1 point each if they were male, were institutionalized, had a history of malignancy, or presented with two or more comorbidities. In the absence of a mini-mental test score, patients received 1 point if they had dementia, aligning with the methodology employed by previous investigations [11,14–17]. Furthermore, as admission hemoglobin is not recorded in the NIS, patients also received 1 point if they were diagnosed with any deficiency anemia or chronic blood loss anemia at the time of admission [18]. Consequently, the maximum possible score was 10.
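The adapted NHFS described above can be sketched as follows. This illustrates only the point allocation stated in the text (with dementia substituting for the mini-mental test score and an anemia diagnosis substituting for admission hemoglobin); the function and parameter names are hypothetical.

```python
def nhfs_adapted(
    age: int,
    male: bool,
    institutionalized: bool,
    history_of_malignancy: bool,
    n_comorbidities: int,
    dementia: bool,
    anemia_on_admission: bool,
) -> int:
    """Return the adapted NHFS (range 0 to 10) using the point rules in the text."""
    if age >= 86:
        points = 4
    elif age >= 66:
        points = 3
    else:
        points = 0
    points += int(male)
    points += int(institutionalized)
    points += int(history_of_malignancy)
    points += int(n_comorbidities >= 2)  # two or more comorbidities
    points += int(dementia)              # proxy for the mini-mental test score
    points += int(anemia_on_admission)   # proxy for admission hemoglobin
    return points


print(nhfs_adapted(90, True, True, True, 3, True, True))        # 10
print(nhfs_adapted(70, False, False, False, 1, False, False))   # 3
```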

11-Factor Modified Frailty Index
The 11-factor modified Frailty Index (11-mFI) was determined based on the presence of non-independent functional status, hypertension requiring medication, chronic obstructive pulmonary disease (COPD), impaired sensorium, diabetes mellitus, previous myocardial infarction, congestive heart failure, stroke with neurologic deficit, transient ischemic attack or stroke without neurologic deficit, angina pectoris, and peripheral vascular disease [19,20]. However, pneumonia was not included in the calculation as it constituted one of the outcomes in the current analysis. Each variable present increased the score by 1, resulting in a maximum potential score of 11.

5-Factor Modified Frailty Index
The 5-factor modified Frailty Index (5-mFI) was determined based on the presence of COPD, diabetes mellitus, hypertension requiring medication, congestive heart failure, and non-independent functional status [21]. Each variable present increased the score by 1, resulting in a maximum potential score of 5.
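Both modified Frailty Indices are simple counts over a fixed item list, and the 5-mFI items form a subset of the 11-mFI items. The sketch below illustrates this; the item labels are hypothetical identifiers, not NIS variable names.

```python
# Items of the 5-mFI as listed in the text (labels are illustrative).
MFI5_ITEMS = frozenset({
    "copd",
    "diabetes_mellitus",
    "hypertension_requiring_medication",
    "congestive_heart_failure",
    "non_independent_functional_status",
})

# The 11-mFI adds six further items to the five above.
MFI11_ITEMS = MFI5_ITEMS | frozenset({
    "impaired_sensorium",
    "previous_myocardial_infarction",
    "stroke_with_neurologic_deficit",
    "tia_or_stroke_without_neurologic_deficit",
    "angina_pectoris",
    "peripheral_vascular_disease",
})


def modified_frailty_index(present: set, items: frozenset) -> int:
    """Count how many index items are present; each contributes one point."""
    return len(set(present) & items)


patient = {"copd", "diabetes_mellitus", "angina_pectoris"}
print(modified_frailty_index(patient, MFI5_ITEMS))   # 2
print(modified_frailty_index(patient, MFI11_ITEMS))  # 3
```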

Johns Hopkins Frailty Indicator
As per the Johns Hopkins Frailty Indicator, patients were categorized as either nonfrail or frail, contingent upon the identification of at least one diagnosis that defines frailty. These conditions included fall from wheelchair, fall on stairs or steps, abnormality of gait, difficulty in walking, inadequate material resources, inadequate housing, lack of housing, fecal incontinence, feeding difficulties and mismanagement, abnormal loss of weight and underweight, continuous urinary leakage, incontinence without sensory awareness, decubitus ulcer, profound visual impairment, senile dementia with delirium, senile dementia with delusional or depressive features, nutritional marasmus, or other severe protein-calorie malnutrition [22,23].
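Unlike the additive scores, this indicator is binary: a single qualifying diagnosis is sufficient. A minimal sketch, assuming diagnoses have already been mapped from ICD-10 codes to the descriptive labels below (the labels and function name are hypothetical; the code mapping itself is not reproduced here):

```python
# Frailty-defining diagnoses as listed in the text (descriptive labels only).
FRAILTY_DEFINING_DIAGNOSES = frozenset({
    "fall from wheelchair",
    "fall on stairs or steps",
    "abnormality of gait",
    "difficulty in walking",
    "inadequate material resources",
    "inadequate housing",
    "lack of housing",
    "fecal incontinence",
    "feeding difficulties and mismanagement",
    "abnormal loss of weight and underweight",
    "continuous urinary leakage",
    "incontinence without sensory awareness",
    "decubitus ulcer",
    "profound visual impairment",
    "senile dementia with delirium",
    "senile dementia with delusional or depressive features",
    "nutritional marasmus",
    "other severe protein-calorie malnutrition",
})


def is_frail(diagnoses) -> bool:
    """Return True if at least one frailty-defining diagnosis is present."""
    return not FRAILTY_DEFINING_DIAGNOSES.isdisjoint(diagnoses)


print(is_frail({"hypertension", "decubitus ulcer"}))  # True
print(is_frail({"hypertension"}))                     # False
```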

Statistical Analysis
Summary statistics were employed in order to describe the dataset. Age was summarized as a median and interquartile range, while the remaining categorical variables were presented as counts and percentages. The adverse outcomes investigated included in-hospital mortality, any complication, cardiovascular complications, delirium, venous thromboembolism, infection, and failure-to-rescue (FTR).
Any complication encompassed heart failure, cardiac arrest, cardiac tamponade, ventricular tachycardia, ventricular fibrillation, stroke, acute kidney injury, delirium, deep vein thrombosis, pulmonary embolism, embolism due to orthopedic device, urinary tract infection, pneumonia, wound infection, implant infection, sepsis, and decubitus ulcers. Cardiovascular complications were defined as heart failure, cardiac arrest, cardiac tamponade, ventricular tachycardia, ventricular fibrillation, or stroke. Venous thromboembolism comprised deep vein thrombosis, pulmonary embolism, or embolism due to orthopedic device. Infections included urinary tract infections, pneumonia, wound infections, implant infection, and sepsis. FTR was defined as in-hospital mortality subsequent to a complication.
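The composite outcomes defined above can be sketched as set memberships, with FTR requiring both in-hospital death and at least one complication. The labels and the function name are hypothetical and serve only to make the definitions concrete.

```python
CARDIOVASCULAR = frozenset({
    "heart failure", "cardiac arrest", "cardiac tamponade",
    "ventricular tachycardia", "ventricular fibrillation", "stroke",
})
VENOUS_THROMBOEMBOLISM = frozenset({
    "deep vein thrombosis", "pulmonary embolism",
    "embolism due to orthopedic device",
})
INFECTION = frozenset({
    "urinary tract infection", "pneumonia", "wound infection",
    "implant infection", "sepsis",
})
# "Any complication" is the union of the groups plus three further conditions.
ANY_COMPLICATION = CARDIOVASCULAR | VENOUS_THROMBOEMBOLISM | INFECTION | frozenset({
    "acute kidney injury", "delirium", "decubitus ulcer",
})


def outcome_flags(complications, died_in_hospital: bool) -> dict:
    """Derive the binary outcomes for one patient from recorded complications."""
    recorded = set(complications)
    any_complication = bool(recorded & ANY_COMPLICATION)
    return {
        "in_hospital_mortality": died_in_hospital,
        "any_complication": any_complication,
        "cardiovascular": bool(recorded & CARDIOVASCULAR),
        "delirium": "delirium" in recorded,
        "venous_thromboembolism": bool(recorded & VENOUS_THROMBOEMBOLISM),
        "infection": bool(recorded & INFECTION),
        # FTR: death in hospital in a patient who had a complication.
        "failure_to_rescue": died_in_hospital and any_complication,
    }


flags = outcome_flags({"pneumonia", "delirium"}, died_in_hospital=True)
print(flags["infection"], flags["failure_to_rescue"])  # True True
```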
To evaluate the predictive ability of all the included frailty scores for each outcome, logistic regression (LR) models were fitted to the dataset with the outcome as the response variable and the frailty score as the predictor. The area under the receiver operating characteristic curve (AUC), as well as the sensitivity and specificity that maximized Youden's index (sensitivity + specificity − 1), was calculated for each model, along with their respective 95% confidence intervals (CI). The CIs were estimated using 1000 bootstrap replicates; p-values for the differences in the AUCs were also estimated using 1000 bootstrap replicates as follows:
1. The AUCs for two frailty scores were calculated: $\mathrm{AUC}_1$ and $\mathrm{AUC}_2$.
2. The observed difference in the AUCs was calculated: $D_{\mathrm{observed}} = \mathrm{AUC}_1 - \mathrm{AUC}_2$.
3. 1000 bootstrap replicates of the dataset were generated.
4. The difference in the AUCs for each replicate was calculated: $D_1, \ldots, D_{1000}$.
5. The standard error of the differences in the AUCs was calculated as the standard deviation across all replicates: $\sigma_D = \mathrm{SD}(D_1, \ldots, D_{1000})$.
6. The z-value was determined by dividing the difference in the AUCs by the standard error: $Z = D_{\mathrm{observed}} / \sigma_D$.
7. A two-sided p-value was determined based on the proportion of a normally distributed population with a z-value > Z or < −Z (where Z is positive).
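The bootstrap comparison of two AUCs described above can be sketched in plain Python. This is an illustrative, unweighted version under stated assumptions: it ignores the NIS discharge weights, and it ranks patients by the raw frailty score rather than by fitted logistic regression probabilities (for a single predictor with a positive coefficient the ranking, and hence the AUC, is the same). All function names are hypothetical.

```python
import random
from bisect import bisect_left, bisect_right
from statistics import NormalDist, stdev


def auc(scores, outcomes):
    """AUC as P(score of a case > score of a non-case); ties count as 1/2."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = sorted(s for s, y in zip(scores, outcomes) if y == 0)
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    total = 0.0
    for s in pos:
        lo, hi = bisect_left(neg, s), bisect_right(neg, s)
        total += lo + 0.5 * (hi - lo)  # non-cases below s, plus half of the ties
    return total / (len(pos) * len(neg))


def bootstrap_auc_test(score1, score2, outcomes, n_boot=1000, seed=0):
    """Return (D_observed, sigma_D, Z, two-sided p) for AUC_1 - AUC_2."""
    rng = random.Random(seed)
    n = len(outcomes)
    d_observed = auc(score1, outcomes) - auc(score2, outcomes)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        y = [outcomes[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUC is undefined when a replicate lacks one class
        d = auc([score1[i] for i in idx], y) - auc([score2[i] for i in idx], y)
        diffs.append(d)
    sigma_d = stdev(diffs)  # standard error of the difference
    if sigma_d == 0:
        raise ValueError("no variation across bootstrap replicates")
    z = d_observed / sigma_d
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided normal p-value
    return d_observed, sigma_d, z, p


# Toy data: score_a separates the outcome perfectly, score_b is uninformative.
y = [0] * 20 + [1] * 20
score_a = list(range(40))
score_b = [i % 5 for i in range(40)]
d, se, z, p = bootstrap_auc_test(score_a, score_b, y, n_boot=200, seed=1)
print(d, p < 0.05)
```

A fixed seed makes the replicates reproducible; percentile confidence intervals for a single AUC could be obtained from the same resampling loop by collecting the per-replicate AUCs instead of their differences.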
These analyses were also repeated with a subgroup of patients consisting of those who were 50 years or older. Statistical significance was defined as a two-sided p-value < 0.05. All p-values were adjusted for multiple comparisons using Bonferroni correction. All analyses were performed in the software R 4.2.2 using the tidyverse, survey, parallel, and WeightedROC packages [24].
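For illustration, the Bonferroni adjustment multiplies each p-value by the number of comparisons and caps the result at 1. The sketch below uses made-up p-values; it is not the R code used in the study, and the function name is hypothetical.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * number of comparisons)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


raw = [0.004, 0.02, 0.6]  # illustrative values only
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
print(significant)  # [True, False, False]
```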

Discussion
By leveraging the NIS, which samples 20% of the nation's hospitalizations and accurately estimates 97% of all inpatient hospitalizations in the United States, a nationwide analysis could be conducted comparing the predictive ability of a range of frailty scores. Among the adverse outcomes investigated, these scores functioned best when predicting mortality and failure-to-rescue. Regarding these particular outcomes, the OFS demonstrated better performance compared to all other scores, with the NHFS following in second place. However, all scores comparatively struggled to accurately predict in-hospital complications, with some exceptions for specific types of complications. Notably, the NHFS stood out in its ability to predict delirium, while the 11-mFI outperformed all other scores when predicting cardiovascular complications. Of all the scores, only the OFS and NHFS were able to demonstrate an acceptable or close-to-acceptable predictive ability for any of the adverse outcomes [25].
These findings are by and large in line with prior studies. In the original study that developed and validated the OFS [AUC (95% CI): 0.77 (0.74–0.80)], it was found to surpass the 5-mFI [AUC (95% CI): 0.69 (0.66–0.73)] in predicting mortality, which is consistent with the present investigation. Furthermore, the OFS in the current study displays essentially the same predictive ability for mortality as it did in the national dataset used for developing the OFS. However, in the original study, the NHFS [AUC (95% CI): 0.76 (0.73–0.79)] and the OFS [AUC (95% CI): 0.77 (0.74–0.80)] demonstrated comparable predictive performance. This difference could be due to the lack of an admission hemoglobin or the fact that the current investigation focused on in-hospital mortality rather than 30-day postoperative mortality [11]. An analysis utilizing the NIS to explore the relationship between dementia and frailty also found that the OFS was the most important predictor of in-hospital mortality among the 52 variables analyzed, providing further support for its robust predictive capability in comparison to other scoring systems [7]. In a study conducted by Thorne et al., the AUC for in-hospital mortality was reported as 0.69 (95% CI: 0.64–0.74) for the NHFS, slightly higher than the NHFS's performance and on the same level as the OFS's performance in the current investigation [26]. A separate analysis conducted by Lisk et al. reported an AUC for in-hospital mortality of 0.674 (95% CI: 0.584–0.764) for the NHFS; however, the wide confidence interval makes it difficult to judge if this indicates a true difference compared to the NHFS and the OFS in the current study [27]. Finally, when investigating the predictive performance of the 5-mFI, Ondeck et al. calculated an AUC for in-hospital mortality of 0.641 (95% CI: 0.624–0.659), which was higher than in the current analysis. Nevertheless, similar to the current findings, the 5-mFI struggled to predict complications, with an AUC of 0.586 (95% CI: 0.581–0.592) for all adverse events [28].
As the current analyses were limited to the variables available in the NIS, there are many frailty scores that could not be included in the current analysis but warrant acknowledgment. In the same investigation by Thorne et al. that examined the NHFS, they also found that the Clinical Frailty Scale (CFS) yielded an AUC of 0.63 (95% CI: 0.57–0.69) for in-patient mortality, indicating a lower predictive ability compared to both the NHFS and the OFS [26]. In another study by Choi et al., the hip-multidimensional frailty score (Hip-MFS) exhibited a significantly better ability to predict postoperative complications with an AUC of 0.679 (95% CI: 0.613–0.745), surpassing all scores encompassed in the current study [29]. Additionally, the Hospital Frailty Risk Score (HFRS), composed of over 100 weighted variables, has demonstrated associations with heightened risks of in-hospital mortality, complications, prolonged length of stay, and increased total hospital costs [30,31]. Similarly, the FRAIL scale has been linked to elevated risks of postoperative complications and extended length of stay in geriatric fracture patients [32]. Employing a modified version of the Fried Frailty Index, Kistler et al. found that it was associated with an increased risk of complications [33]. Meanwhile, the Frailty Index, comprising over 50 variables, was associated with an elevated mortality rate and longer hospital stays [34].
Despite the wide range of frailty scores available, there remains a clinical equipoise when selecting which to apply in practice. Many feature an extensive variable list, which may hinder their use at the bedside [30,31,34,35]. Additionally, some also require patient cooperation [32,33,35], which may be challenging in a patient population where approximately 20% suffer from dementia [36]. Moreover, the need for blood tests post-admission further complicates the use of others [13,35]. Finally, some scores are inherently more subjective, relying on clinicians' judgment and observation of functional abilities in conjunction with overall health [37]. This in turn results in greater interobserver variation, especially among inexperienced users [38–41]. Despite the intuitive advantages of complexity, the current findings suggest that more intricate indices are not inherently superior. Notably, the OFS, among the simplest of the included scores, emerges as particularly adept at predicting mortality-related outcomes. Furthermore, the 5-mFI, which requires fewer than half as many variables as the 11-mFI, generally exhibits only marginal differences compared to the 11-mFI, with the prediction of cardiovascular complications being the only exception. Additionally, research indicates that orthopedic surgeons tend to favor simpler and more straightforward scoring systems for clinical application [42]; this is an important point to consider when selecting a tool to transition from research settings to clinical practice.
This study was strengthened by the use of the NIS, which constitutes the largest all-payer inpatient database in the United States [8]. Leveraging this extensive dataset enabled the investigation of a wide array of outcomes through the inclusion of over 227,000 estimated hip fracture patients. However, a notable limitation was the absence of admission hemoglobin data, necessitating reliance on the diagnosis of anemia, which could potentially have reduced the predictive ability of the NHFS. Nonetheless, despite this alteration, the NHFS still outperformed all other scores in predicting delirium by a significant margin. Furthermore, our analyses were constrained by the variables available in the NIS, preventing the inclusion of other pertinent outcomes such as readmission, quality of life, and long-term mortality. Additionally, the risk of non-differential misclassification remains a concern when using a retrospective database, as this adds statistical noise to the data that reduces the overall predictive ability of all scores.

Conclusions
The investigated frailty scores were most effective in predicting in-hospital mortality and failure-to-rescue; however, they struggled to predict complications. The findings suggest that increasing complexity in frailty scores may result in diminishing returns concerning predictive ability, as evidenced by the Orthopedic Frailty Score, which, despite its simplicity in calculation, either outperforms or performs comparably to the other scores across most outcomes. Future studies should investigate how these scores can be incorporated into routine preoperative evaluations to guide care through preoperative optimization and resource allocation.

Supplementary Materials:
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/jpm14060621/s1: Table S1: Distribution of outcomes for each frailty score; Table S2: Predictive ability of frailty scores for adverse outcomes in hip fracture patients who are ≥50 years old.

Table 1. Demographics of hip fracture patients.

Table 2. Comorbidities in hip fracture patients.

Table 3. Crude outcomes in hip fracture patients.

Table 4. Predictive ability of frailty scores for adverse outcomes in hip fracture patients.