Trajectories of Short Physical Performance Battery Are Strongly Associated with Future Major Mobility Disability: Results from the LIFE Study

The Short Physical Performance Battery (SPPB) is a widely used measure of lower extremity function, strength, and balance. In the Lifestyle Interventions and Independence for Elders (LIFE) Study, baseline SPPB and changes throughout the trial were strongly associated with major mobility disability (MMD). This study further investigated this association by identifying trajectories of SPPB and evaluating the predictive validity of SPPB trajectories for future MMD. Participants (n = 1635) aged 70–89 years were randomized to a physical activity or health education intervention and assessed every 6 months for MMD. We used group-based trajectory models (GBTMs) to identify trajectories of a binary outcome for a decrease from baseline SPPB of ≥1. Multinomial logistic regression explored baseline factors associated with group membership. Survival analyses evaluated the association between trajectories and MMD. The GBTM identified a 3-group model which included a “No Decline” group (46.0%), a “Late Decline” group (27.7%), and an “Early Decline” group (26.3%). Adjusting for all other baseline characteristics, group assignment during the previous follow-up visit was strongly associated with MMD at the subsequent period. Comparisons between groups showed a 2-to-3-fold increase in MMD comparing the “Late” to “No” decline group and a 4-to-5-fold increase in MMD comparing the “Early” to “No” decline group. Group membership and its impact on MMD did not differ between intervention arms. Group-based trajectories of SPPB scores identified distinct subgroups in LIFE Study participants. These group assignments, when used in outcome models, were highly associated with MMD. GBTMs have potential to identify and improve prediction of aging-related decline to better design and identify patients for interventions.


Introduction
The Short Physical Performance Battery (SPPB) is a widely used assessment of lower extremity function in older adults [1]. The SPPB measures balance, gait, strength, and endurance, particularly of the lower extremities, by combining an eight-foot walk test, repeated chair stands, and balance tests [1]. The SPPB has been strongly associated with all-cause mortality, disability, rehospitalization, and healthcare utilization in older adults [2][3][4][5][6]. Because it can be administered quickly and requires minimal equipment and space, it is an important assessment tool in clinical practice. Since its development,

LIFE Study Overview
The LIFE Study was a multi-center, single-blind, parallel randomized trial conducted across eight centers in the United States between February 2010 and December 2013, with participant recruitment occurring between February 2010 and December 2012 [9]. Among the n = 1635 participants randomized during the study period, there was an average of 2.6 years of follow-up time with loss to follow-up of 4% annually [9]. The study protocol was approved by the institutional review boards of each institution. Written informed consent was obtained from all study participants. The trial was monitored by a data and safety monitoring board appointed by the National Institute on Aging. The LIFE Study was registered prior to participant enrollment in the trial (NCT01072500). Details of the study design, rationale, and characteristics of the full study population are described elsewhere [9,16]. Participants were eligible for the trial if they were 70–89 years of age, scored ≤9 on the SPPB, were sedentary with ≤125 min of activity per week, and were able to complete the 400-m walk test within 15 min without sitting, leaning, or assistance. Reuse of the LIFE Study data for this analysis was approved under expedited review by the University of Florida Institutional Review Board (IRB201701581).
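The eligibility criteria above can be written as a single screening rule. The following is a minimal sketch for illustration only; the `Screening` record and its field names are hypothetical and do not come from the LIFE Study data dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Screening:
    """Hypothetical screening record; field names are illustrative."""
    age_years: int
    sppb_score: int
    activity_min_per_week: float
    walk_400m_minutes: Optional[float]  # None if the walk was not completed
    walk_unassisted: bool               # no sitting, leaning, or assistance


def is_eligible(s: Screening) -> bool:
    """Apply the four eligibility criteria stated in the text."""
    return (
        70 <= s.age_years <= 89
        and s.sppb_score <= 9
        and s.activity_min_per_week <= 125
        and s.walk_400m_minutes is not None
        and s.walk_400m_minutes <= 15
        and s.walk_unassisted
    )
```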

Intervention
Details of the study interventions were published previously [9,17]. The physical activity (PA) intervention involved walking, with a goal of 150 min per week, strength, flexibility, and balance training. The intervention included attendance at two center-based visits per week and home-based activity three to four times per week for the duration of the study. The PA sessions were individualized and progressed toward a goal of 30 min of walking daily at moderate intensity, 10 min of primarily lower-extremity strength training by means of ankle weights (2 sets of 10 repetitions), 10 min of balance training, and large muscle group flexibility exercises.
The health education (HE) control group included weekly educational workshops during the first 26 weeks, and then monthly sessions thereafter. Workshops included topics relevant to older adults, such as how to effectively negotiate the health care system, how to travel safely, preventive services and screenings recommended at different ages, where to go for reliable health information, and nutrition. The workshops did not include any PA topics. The program also included a 5- to 10-min instructor-led program of gentle upper extremity stretching or flexibility exercises.

Follow-Up Visits and Outcome Assessment
Participants were assessed for the primary outcome (MMD) every 6 months at clinic visits. Home, telephone, and proxy assessments were attempted if participants could not return to the clinic. The assessment staff were masked to the intervention assignment and remained separate from the intervention team. Participants were asked not to disclose their assigned intervention arm or talk about their interventions during the assessment.
Details of MMD ascertainment were reported previously [8]. Briefly, participants were asked to walk 400 m at their usual pace, and MMD was defined as the inability to complete the walk within 15 min without sitting and without the help of another person or walker. When MMD could not be objectively measured because the participant was unable to come to the clinic and no suitable walking course was available at the participant's home, institution, or hospital, an alternative adjudication of the outcome was based on objective inability to walk 4 m in less than 10 s, or self-, proxy-, or medical record-reported inability to walk across a room. Participants who met these alternative criteria were considered unable to complete the 400-m walk within 15 min. SPPB was measured at baseline and at the 6-, 12-, 24-, and 36-month clinical follow-up visits. SPPB included a 4-m walking velocity test, a timed repeated chair stand, and three standing balance tests. Each of the three components (walking, chair stand, and balance) is assigned a score ranging from 0 (unable to complete) to 4 (best performance), and the three component scores are summed to a summary score ranging from 0 (worst performers) to 12 (best performers).
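The scoring and outcome definitions above reduce to two small rules. The sketch below is illustrative and assumes the three component scores have already been assigned; the function and argument names are hypothetical, and the alternative adjudication criteria for missed clinic visits are not modeled.

```python
def sppb_total(gait: int, chair_stand: int, balance: int) -> int:
    """Sum the three SPPB component scores (each 0-4) into a 0-12 summary score."""
    for name, score in (("gait", gait), ("chair_stand", chair_stand), ("balance", balance)):
        if not 0 <= score <= 4:
            raise ValueError(f"{name} score must be between 0 and 4, got {score}")
    return gait + chair_stand + balance


def has_mmd(completed_walk: bool, minutes: float, sat_down: bool, used_help: bool) -> bool:
    """MMD: inability to complete the 400-m walk within 15 min
    without sitting and without help from another person or a walker."""
    passed = completed_walk and minutes <= 15 and not sat_down and not used_help
    return not passed
```

For example, component scores of 3, 2, and 4 give a summary score of 9, and a participant who completes the walk in 16 min is classified as having MMD.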

Group-Based Trajectory Modeling
A longitudinal file of SPPB assessment scores was organized for each individual which included the baseline, 6-, 12-, 24-, and 36-month SPPB measurements. Given the dependence of overall group membership on baseline SPPB scores (Figure 1A), we standardized the SPPB models by using a binary indicator for whether an individual's SPPB score had decreased by ≥1 point compared to baseline in each time period; a 1-point change is the minimum clinically relevant difference for SPPB [7]. GBTMs with a logistic function were estimated for 2–8 groups using PROC TRAJ [18] in SAS Enterprise Guide v7.1 (SAS Institute, Cary, NC, USA), which predicted the probability of having a decline in SPPB in each follow-up period. GBTMs use trajectories of the dependent variable (SPPB) over time to identify latent (hidden) groupings of individuals. No other characteristics are considered in the trajectory models. Best practices for model selection included a comparison of the Bayesian Information Criterion (BIC) between models and Nagin's Criteria [11]. Missing observations for SPPB were treated as censored values.
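The binary decline indicator described above can be derived from the longitudinal scores as follows. This is a minimal sketch of the data preparation step only, not of the PROC TRAJ model fit; the function name and the use of `None` to represent a missing (censored) visit are assumptions made for illustration.

```python
from typing import Dict, Optional

FOLLOW_UP_MONTHS = (6, 12, 24, 36)


def decline_indicators(
    baseline: int, follow_up: Dict[int, Optional[int]]
) -> Dict[int, Optional[int]]:
    """Return 1 if SPPB fell by >= 1 point from baseline at a visit, else 0.

    A missing visit stays None so it can be treated as censored
    rather than counted as "no decline".
    """
    indicators: Dict[int, Optional[int]] = {}
    for month in FOLLOW_UP_MONTHS:
        score = follow_up.get(month)
        indicators[month] = None if score is None else int(baseline - score >= 1)
    return indicators


# Hypothetical participant: baseline 8, missing the 24-month visit.
example = decline_indicators(8, {6: 8, 12: 7, 24: None, 36: 5})
# example == {6: 0, 12: 1, 24: None, 36: 1}
```

Because the indicator only records decreases, a participant whose score improves is coded the same as one whose score is unchanged.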
Based on the overall model fit, we identified a final three-group model and categorized the groups as "No Decline," "Early Decline," and "Late Decline" based on the visible pattern in SPPB (Figure 1D). We described baseline characteristics including intervention group (PA or HE), physical characteristics (age, sex, race (white, black, or other), and body mass index (BMI, kg/m²)), functional assessments, sleep quality (Pittsburgh Sleep Quality Index), grip strength (kg), gait speed (m/s), past medical history, and education (≥high school). The functional assessments were previously validated measures summarized as raw scores within their individual ranges of values. Cognitive function was measured with the Memory and Aging Telephone Screen (MATS) and the Modified Mini Mental State Examination (3MSE), and activity levels were measured via the Community Healthy Activities Model Program for Seniors (CHAMPS). Physical functioning was captured using baseline gait speed, grip strength, SPPB (score of ≤7), and the Pepper Assessment Tool for Disability (PAT-D). Overall current self-rated health (at baseline) was grouped based on "good, very good, or excellent" health status. Details of these assessments and their respective methodologies and score ranges can be found in the original trial design and methods publications [8,9].
We estimated a multinomial logistic regression of these characteristics to identify characteristics associated with group membership. Adjusted odds ratios (OR) with 95% confidence intervals (CIs) are reported.
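An adjusted odds ratio and its 95% CI are obtained by exponentiating a regression coefficient and its Wald interval on the log-odds scale. The sketch below shows that conversion only; the function name is hypothetical, and the coefficient and standard error would come from whatever software fitted the multinomial model.

```python
import math
from typing import Tuple


def odds_ratio_ci(beta: float, se: float, z: float = 1.959964) -> Tuple[float, float, float]:
    """Convert a log-odds coefficient and its standard error into
    (OR, lower, upper) using a Wald interval; z defaults to the 95% level."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )
```

The interval is symmetric around the coefficient on the log scale, so it is asymmetric around the odds ratio itself.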
Lastly, we created a time-varying group membership variable and used it as the primary variable of interest in a proportional hazards regression for the MMD outcome. Group membership was identified during the prior follow-up period using the largest posterior probability of group membership. The subsequent follow-up period was assessed for the outcome (MMD). We estimated the proportional hazards model with all other baseline characteristics and reported hazard ratios (HR) and 95% CIs for the overall cohort as well as stratified by intervention group assignment.
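Assignment by largest posterior probability can be sketched as follows. This is an illustration under stated assumptions, not the PROC TRAJ computation used in the study: the mixing proportions are the reported group sizes, but the per-visit decline probabilities in `DECLINE_PROB` are invented for the example, and truncating the history at the prior visit is one plausible reading of how the time-varying variable was built.

```python
from typing import List, Optional, Sequence

GROUPS = ("No Decline", "Late Decline", "Early Decline")
MIXING = (0.460, 0.277, 0.263)   # reported group proportions
DECLINE_PROB = (                 # HYPOTHETICAL P(decline) at 6, 12, 24, 36 months
    (0.10, 0.10, 0.10, 0.10),    # No Decline
    (0.10, 0.20, 0.70, 0.90),    # Late Decline
    (0.80, 0.90, 0.90, 0.90),    # Early Decline
)


def posterior_membership(history: Sequence[Optional[int]]) -> List[float]:
    """Posterior P(group | observed decline indicators) via Bayes' rule.

    Missing visits (None) contribute nothing to the likelihood.
    """
    weights = []
    for pi_j, probs in zip(MIXING, DECLINE_PROB):
        likelihood = pi_j
        for observed, p in zip(history, probs):
            if observed is None:
                continue
            likelihood *= p if observed == 1 else 1.0 - p
        weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]


def assign_group(history: Sequence[Optional[int]]) -> str:
    """Pick the group with the largest posterior probability."""
    post = posterior_membership(history)
    return GROUPS[max(range(len(post)), key=post.__getitem__)]
```

Under these illustrative probabilities, a participant with declines at both the 6- and 12-month visits (`[1, 1]`) is assigned to "Early Decline", while one with no decline at any of the four visits is assigned to "No Decline". Passing only the visits up to the prior period yields the assignment that would be paired with the outcome in the following period.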

Results
Among the 1635 LIFE participants, GBTMs of the full cohort on the raw continuous SPPB score showed the best fit with a 6-group model. However, upon visual inspection, membership in this model (and all other models estimated) was strongly dependent on the baseline SPPB score (Figure 1A). Thus, we stratified the cohort by SPPB ≤ 7 and > 7 and re-estimated separate models (Figure 1B,C). In each stratum, a 5-group model provided the best fit and more clearly distinguished group trajectories. Notably, the groups in the SPPB > 7 stratum all had the same average baseline SPPB (~8) but diverged to final average SPPB values ranging from 3 to 11, showing clear trajectories of both increasing and decreasing SPPB values (Figure 1C).
Using a standardized measure of SPPB decline of ≥1 point, the binary models produced a 3-group model that was easily interpretable (Figure 1D). Group memberships included 46.0% of the cohort in a "No Decline" group, 27.7% in a "Late Decline" group, and 26.3% in an "Early Decline" group. The groups were overall very similar on baseline characteristics (Table 1). There were also few independent predictors of group membership in fully adjusted models (Table 2). Compared to the "No Decline" group, "Late Decline" and "Early Decline" group membership was associated with increases in age, "other" races compared to Black race, and lower SPPB scores at baseline. Self-rated health as good, very good, or excellent was associated with higher membership in the "Late Decline" group (OR = 1.94 (1.01–3.74)). Group membership did not differ by intervention arm assignment. Higher MATS score, grip strength, and 3MSE score, as well as female sex, were positively associated with membership in the "No Decline" group, while higher self-rated health, other race, and lower SPPB scores were associated with membership in the declining SPPB groups. In the comparison between "Early" versus "Late" Decline, female sex (OR = 2.13 (1.13–4.02)) and current smoking status (OR = 3.19 (1.14–8.96)) were the only significant predictors.
Table 1. Bivariate association of baseline characteristics with group-based trajectories for functional decline measured by a decrease of SPPB ≥ 1 in a 3-group model (Figure 1D).

Adjusting for all other baseline characteristics, prior period group assignment was strongly associated with subsequent period rates of MMD (Figure 2). Compared to the "No Decline" group, the "Late Decline" group was associated with a greater than two-fold increase in MMD (HR = 2.50 (1.97–3.10)) in the overall cohort, and the association was similar when stratified by intervention group. Similarly, the "Early Decline" group was associated with a 4- to 5-fold increase in the rate of MMD in the overall cohort (HR = 4.76 (3.76–5.85)), with a similar magnitude in both intervention groups. This effect appeared to be greater in the PA intervention group compared to the HE controls, but the interaction term did not reach statistical significance.

Discussion
In this secondary analysis of the LIFE Study clinical trial, we used GBTMs to identify distinct trajectory groups that classify individuals based on SPPB scores. Few baseline factors predicted membership in these groups, but the groups were highly associated with the primary MMD outcome.
While the PA intervention successfully prevented MMD in the original trial, there were indicators of heterogeneity in that treatment effect [8,10]. Capturing predictors of this heterogeneity could assist in developing better interventions or identifying higher-risk subgroups for interventions. For instance, SPPB ≤ 7 at baseline was associated with a higher risk of MMD, but this group also showed a stronger response to the intervention in prior analyses [10]. Here, we utilized GBTMs as another means to explore this unique association between SPPB and MMD. While that association remained strong, we found few baseline factors related to group membership; thus, it appeared that baseline data alone could not be used to identify individuals at high risk of MMD. Additional work will evaluate simultaneous GBTMs of other physical and cognitive assessments as well as explore the impact of intervening health events (e.g., hospitalizations, falls, and fractures) on GBTMs.
The results of the GBTMs, as well as the regression results exploring associations with group membership, confirm the observed paradoxical relationship between SPPB and the PA intervention in the LIFE Study [9]. Among those with SPPB > 7, who had an average baseline SPPB of approximately 8, five distinct groups were identified, of which two showed improvement (Figure 1C; 48.5% of those with SPPB > 7), while another 22.2% could be categorized as declining in SPPB score over the trial period. Conversely, among those with SPPB ≤ 7 (Figure 1B), a group specifically recruited for the LIFE Study, average baseline scores were around 6 for the five identified groups; 31.8% were in groups that declined and only 32.5% were in groups characterized as improving over time.
These results are notable because the PA intervention was previously observed to be effective in those with SPPB ≤ 7 but not in those with higher scores [8,19]. Paired with the observation in this study that higher self-rated health was associated with belonging to the late or early decline group, a few conclusions can be drawn. Importantly, it is becoming apparent that in some older adults, there may be a trade-off between increased physical activity and exercise and potential harms such as injuries and fall risk [20]. These results suggest that these risks may correlate with baseline functioning, potentially because a higher perceived ability leads to greater exertion and an increased risk of injury. In contrast, among those with existing limitations, approaches to exercise may be more conservative, which aids in maintaining or improving baseline functioning with less risk of injury and future limitation [19]. In addition, the concept of "regression to the mean," wherein those with lower results will tend to improve and those with higher results will tend to decline, cannot be ruled out. However, we believe this is less likely to explain the results given the correlation of decline groups with self-rated health.
SPPB is widely used in geriatric assessments both for research and clinical purposes [1]. SPPB is an attractive physical assessment as it measures multiple components of physical strength and balance and is simple and quick to perform in-person or virtually with little equipment or space required. By showing that trajectories of SPPB are predictive of future MMD, we have shown that tracking an individual's SPPB scores may be a scalable way of predicting and intervening on future issues related to mobility. We have shown, however, that aside from baseline SPPB scores, few variables predicted how these assessments vary over time. These results further confirm that interventions may be most effective among those with lower baseline physical functioning and also suggest that those at higher functioning, especially those with high assessments of their own health, may be a more "risky" group for physical activity. Care should be taken when recruiting individuals into exercise programs, and education on easing into such programs may be necessary [21]. Preventing declines in physical functioning is an important component of caring for older adults, as physical functioning is important to healthy aging and is associated with healthcare outcomes, utilization, and costs [22,23].

Study Limitations
This study is strengthened by a clinically relevant sample, long follow-up, and objective outcome measurement. Limitations include assumptions made for GBTMs as well as the selection of the final trajectory models. While SPPB was measured at several intervals, differential follow-up and censoring were allowed. The model assumed censoring was random and non-informative; however, this might not always be the case. While the selection of final models was based on model fit, definitions of each group are investigator-driven, which may introduce some level of subjectivity in interpretation. However, the final model represented easily interpreted group memberships. While the binary SPPB decline model standardized the measurement, it ignored improvements in SPPB, which may be more predictive of future events and deserve further research. Lastly, while the analysis tracked SPPB and its association with MMD while controlling for other baseline factors, the regression does not account for other intervening health events (e.g., hospitalizations) or other time-varying characteristics. Such events would likely be on the causal pathway, follow reductions in physical functioning, or be further modified by the intervention. Future research should establish the temporality between such events to understand cause and effect between incident disability and healthcare events.

Conclusions
Group-based trajectories of SPPB scores identified distinct subgroups in LIFE Study participants. The use of these group assignments in outcome models was highly predictive of the primary MMD outcome. Tracking SPPB longitudinally for an individual may provide insights into future mobility limitations and may allow identification of patients needing intervention. GBTMs have the potential to identify and improve prediction of aging-related decline to better design and identify patients for interventions.