Psychometric Properties of the CASP-12 Scale in Portugal: An Analysis Using SHARE Data

The purpose of this study is to assess the psychometric properties of the Portuguese version of the Control, Autonomy, Self-realization, and Pleasure (CASP)-12 scale used in the Survey of Health, Aging and Retirement in Europe (SHARE) project. Data were obtained from a representative sample of 1666 people aged ≥50 years living in Portugal and participating in SHARE wave 6. In addition to the CASP-12 scale, sociodemographic data and health status, activity limitation (GALI), depression (Euro-D) and satisfaction with life scores were collected. Data quality and acceptability, construct and structural validity and internal consistency of the CASP-12 scale were analyzed. A Rasch analysis was also performed. CASP-12 total score (mean: 33.3; standard deviation: 5.8, range: 12–48) correlated with Euro-D (−0.57) and with life satisfaction (0.52). Mean scores were significantly lower for women, people aged ≥75 years and those with activity limitations and worse health status (p < 0.001). The confirmatory factor analysis showed good fit to the 4-factor model (root mean squared error of approximation (RMSEA): 0.07; comparative fit index (CFI): 0.90, χ2 (48) = 444.59, p < 0.001), which was confirmed by Rasch analysis (χ2 (36) = 10.089, p = 0.745, person separation index (PSI) = 0.722 for the 4-factor model). For domains, person separation index ranged 0.31–0.71 and Cronbach’s alpha, 0.37–0.73. In conclusion, the Portuguese version of the CASP-12 scale presents some inadequacies in acceptability, internal consistency and structural validity.


Introduction
By 2050, Portugal is projected to be one of the oldest countries in Europe, with persons aged 55 years or more representing almost half (47.1%) of the total population [1]. This increasing population aging raises many questions at socioeconomic, health, scientific and ethical levels. It is fundamental to consider under what conditions and with what quality of life (QoL) older adults are and will be living. This question has led to growing international research and policy interest in the enhancement and measurement of quality of life in old age [2]. Specifically, in Portugal, the advancement in aging research is noteworthy, with some studies approaching the conceptualization and assessment of QoL and related concepts, such as well-being and successful and active aging.

Study Design and Setting
This was a cross-sectional, national study of a sample of people aged 50 years or older living in Portugal and participating in wave 6 (W6, 2015) of the SHARE project.

Participants
The sample was composed of 1666 people aged 50 or older at the time of sampling, living in Portugal and recruited for W6 of the SHARE project [18]. Participants were excluded if they were incarcerated, hospitalized or out of the country during the entire survey period, unable to speak the country's language or had moved to an unknown address. Detailed information on the sample and data collection can be found in Malter and Börsch-Supan [19].
The SHARE project was reviewed and approved by the Ethics Council of the Max Planck Society. The Ethics Committee of Carlos III Institute of Health approved the present study (reference: CEI PI 62-2019).

Measures
The main outcome variable was QoL, measured with the CASP-12 scale [9,13], a 12-item self-assessed questionnaire. Four domains compose the scale: control, autonomy, self-realization, and pleasure. Items are rated on a four-point Likert scale from 1 = often to 4 = never, and item 4 and items 7 to 12 are reverse-scored so that, for every item, lower scores mean worse QoL. The total score ranges from 12 to 48, with higher scores meaning better QoL.
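The scoring rule described above can be illustrated with a minimal sketch. It assumes that raw responses are already coded 1 to 4 and that the items are laid out as three consecutive items per domain; the names `REVERSED_ITEMS`, `DOMAINS`, `recode` and `score_casp12` are hypothetical and do not come from the SHARE release.

```python
# Items that the text says are reverse-scored (item 4 and items 7 to 12).
REVERSED_ITEMS = {4, 7, 8, 9, 10, 11, 12}

# Hypothetical item-to-domain layout (three consecutive items per domain).
# This is an assumption for illustration, not a documented mapping.
DOMAINS = {
    "control": (1, 2, 3),
    "autonomy": (4, 5, 6),
    "pleasure": (7, 8, 9),
    "self_realization": (10, 11, 12),
}


def recode(item, raw):
    """Return the scored value (1-4) for one raw response (1-4)."""
    if raw not in (1, 2, 3, 4):
        raise ValueError(f"item {item}: response must be 1-4, got {raw!r}")
    return 5 - raw if item in REVERSED_ITEMS else raw


def score_casp12(responses):
    """Score one respondent.

    `responses` maps item number (1-12) to the raw response (1-4).
    Returns None when any item is missing, so that the total is only
    computed for complete records.
    """
    if any(responses.get(i) is None for i in range(1, 13)):
        return None
    scored = {i: recode(i, responses[i]) for i in range(1, 13)}
    result = {name: sum(scored[i] for i in items) for name, items in DOMAINS.items()}
    result["total"] = sum(scored.values())
    return result
```

With this rule each domain score falls between 3 and 12 and the total between 12 and 48, matching the ranges reported for the scale.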
For depression, the Euro-D was applied [20]. It is a self-completed scale with items assessing depression, pessimism, suicidality, guilt, sleep, interest, irritability, appetite, fatigue, concentration (on reading or entertainment), enjoyment, and tearfulness. The Self-perceived Health-United States version (SPHUS) [21] was applied for health status (rated from 1, excellent, to 5, poor). Self-perceived activity limitation was assessed with the Global Activity Limitation Indicator (GALI) [22], a single item ("For the past 6 months at least, to what extent have you been limited because of a health problem in activities people usually do?") with limited and non-limited categories. An item on life satisfaction ("On a scale from 0 to 10 where 0 means completely dissatisfied and 10 means completely satisfied, how satisfied are you with your life?") and a checklist of chronic diseases were also used. Socio-demographic variables (age, sex, marital status, job situation, years of education and living in urban or rural setting) were also compiled and used to characterize the sample. Education level was described according to the International Standard Classification of Education (ISCED) [23]. A description of all variables used in SHARE W6 can be found in Börsch-Supan [18]. Items included in the SHARE survey were translated into European Portuguese following a common procedure [24].

Statistical Analysis
The outcome variable, the CASP-12 total score, did not fit a normal distribution (Kolmogorov-Smirnov test with Lilliefors correction, p < 0.001), so non-parametric statistics were applied. Descriptive statistics were used for characterizing the sample in terms of socio-demographic variables and rating scale scores.
The psychometric properties of CASP-12 were explored using classical test theory (CTT) and Rasch analysis, a variation of item response theory (IRT). According to the CTT principles, the following psychometric attributes were calculated: data quality and acceptability, construct (structural and hypotheses testing) validity, and reliability (internal consistency) [25,26].
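As an illustration of two of the CTT attributes listed above, the following minimal sketch computes Cronbach's alpha (internal consistency) and the percentage of respondents at the floor and ceiling of a score (acceptability). It is not the SPSS procedure used in the study; the function names are hypothetical, and complete response rows are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of complete response rows.

    Each row holds one respondent's scored items; all rows have the
    same length. Sample variances (n - 1 denominator) are used.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def floor_ceiling(scores, minimum, maximum):
    """Percentage of respondents at the lowest and highest possible score."""
    n = len(scores)
    floor = 100 * sum(s == minimum for s in scores) / n
    ceiling = 100 * sum(s == maximum for s in scores) / n
    return floor, ceiling
```

For a three-item domain, `floor_ceiling(domain_scores, 3, 12)` would return the two percentages for that domain.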
For structural validity, a confirmatory factor analysis (CFA) using maximum likelihood estimation was performed. For CFA, a root mean squared error of approximation (RMSEA) ≤ 0.06 and a comparative fit index (CFI) > 0.9 indicated good fit to the model [28]. Additionally, for structural validity, the corrected item-total correlation (criterion: ≥0.20) and the inter-correlation of CASP-12 domains (internal validity) using Spearman's rank correlation coefficients were calculated.
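For reference, the two fit indices are commonly defined as follows, where the subscript M denotes the fitted model, B the baseline (independence) model and N the sample size. These are the usual textbook definitions, given here as an aid to interpretation; the exact variant implemented by the software (for example, N instead of N − 1 in the denominator) is not stated in this study and may differ.

```latex
\mathrm{RMSEA} = \sqrt{\frac{\max\left(\chi^2_M - df_M,\ 0\right)}{df_M\,(N-1)}},
\qquad
\mathrm{CFI} = 1 - \frac{\max\left(\chi^2_M - df_M,\ 0\right)}{\max\left(\chi^2_M - df_M,\ \chi^2_B - df_B,\ 0\right)}
```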
Testing of hypotheses comprises convergent and discriminative validity. Convergent validity was calculated using the Spearman's rank correlation coefficients of CASP-12 total and domain scores with the remaining measures. Based on previous studies, a moderate correlation was hypothesized between CASP-12 and depression (Euro-D), activity limitation (GALI) and life satisfaction (r_S = 0.30-0.50) [29]. Discriminative or known-groups validity was explored by calculating the differences in CASP-12 scores in the sample grouped by variables of interest: sex, age group (50 to 64, 65 to 74, and 75 years and older), and with or without activity limitation (GALI). Mann-Whitney or Kruskal-Wallis tests were used to ascertain differences between groups.
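The convergent validity coefficients are Spearman's rank correlations, that is, Pearson correlations computed on ranks, with tied values sharing the mean of their ranks. The sketch below is a plain-Python illustration of that definition, not the SPSS routine used in the study; the function names are hypothetical, and pairs with a missing value are assumed to be dropped.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation, dropping pairs with a missing value."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))
```

A negative coefficient is expected when a higher score on the comparison measure indicates a worse state (for example, more depressive symptoms), and a positive one when it indicates a better state.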
For Rasch analysis, the following attributes were calculated: fit to the Rasch model [31], unidimensionality, internal consistency (person separation index, PSI), item local independency, absence of differential item functioning (DIF) by age (3 groups) and sex, and threshold ordering. There are excellent publications where detailed information is provided to the lay reader about how to conduct a Rasch analysis and interpret its results [32-34].
Fit to the Rasch model was considered adequate when the chi-square value was non-significant after Bonferroni adjustment for the number of items. Furthermore, item and person fit residuals should follow a distribution with mean of 0 and standard deviation of 1, with individual item fit residuals being expected to fall within the −2.5 to 2.5 range. For unidimensionality, the person estimates of two sets of items defined in a principal component analysis of the residuals are compared through t-tests. For a scale to be unidimensional, the lower bound of the binomial confidence interval should overlap 5% [35,36].
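The decision rules in this paragraph can be written down as a short sketch. The helper names are hypothetical, and the confidence interval uses a normal (Wald) approximation, which is an assumption: the interval computed by the Rasch software may be obtained differently.

```python
from math import sqrt


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni adjustment."""
    return alpha / n_tests


def misfitting_items(fit_residuals, limit=2.5):
    """Names of items whose fit residual falls outside [-limit, +limit]."""
    return [name for name, r in fit_residuals.items() if abs(r) > limit]


def unidimensional(n_significant, n_tests, cutoff=0.05, z=1.96):
    """Check the lower CI bound for the share of significant t-tests.

    Returns (decision, lower_bound). The scale is treated as
    unidimensional when the lower bound is at or below the cutoff.
    The Wald interval used here is an assumption for illustration.
    """
    p = n_significant / n_tests
    lower = p - z * sqrt(p * (1 - p) / n_tests)
    return lower <= cutoff, lower
```

For example, with 12 items and a nominal level of 0.05, `bonferroni_alpha(0.05, 12)` gives a per-item level of about 0.0042.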
Reliability was measured with the PSI, which is interpreted similarly to Cronbach's alpha. Item local independency is ascertained when the item correlations are low (<0.30 of the mean correlations) following removal of the variance due to the first Rasch factor. Locally dependent items may be combined into a single super item [37]. DIF was measured with an analysis of variance [38]. In the case of uniform DIF, the item may be split, and item locations are calculated separately for each group. Thresholds are points between two response categories at which either category is equally likely to be chosen. Threshold ordering means that the participants use the response categories in an expected way, consistent with the construct continuum. In the case of threshold disordering, two adjacent response categories are collapsed.
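The definition of a threshold can be made concrete with the category probabilities of a polytomous Rasch model. The sketch below uses the partial credit parameterization as an assumption (the study does not state which parameterization was chosen), and the function names and example threshold values are hypothetical.

```python
from math import exp


def category_probabilities(theta, thresholds):
    """Category probabilities for one polytomous Rasch item.

    `theta` is the person location and `thresholds` the item's threshold
    locations, both in logits. An item with m thresholds has m + 1
    response categories.
    """
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - tau))
    weights = [exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(thresholds):
    """True when each threshold lies above the previous one."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))


# Hypothetical thresholds for a four-category item.
example = [-1.0, 0.2, 1.5]
probs = category_probabilities(0.2, example)
# At theta equal to the second threshold, the second and third
# categories are equally probable.
```

When `thresholds_ordered` returns False for an item, the text's remedy is to collapse two adjacent response categories and re-estimate.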
We followed an iterative process, where model modifications were made and repeatedly tested until model specifications were met [33,39]. Large sample sizes provide high statistical power, so that even small deviations from the Rasch model become statistically significant. Therefore, a random sample of 300 cases was taken and analyzed [40].
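Drawing such a subsample reproducibly can be sketched as follows. The seed value and the function name are hypothetical; the study does not report how the random draw was made.

```python
import random


def draw_subsample(case_ids, size=300, seed=2015):
    """Draw `size` distinct case identifiers without replacement.

    A fixed seed (an arbitrary, hypothetical value here) makes the draw
    repeatable, so that each iteration of model testing uses the same cases.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(list(case_ids), size))
```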
Statistical significance was set at p < 0.05. CTT calculations were performed using IBM SPSS Statistics 22.0 (IBM, Armonk, NY, USA), except CFA, for which Stata 14.0 was used. Rasch analyses were performed using RUMM2030 [41].

Results
The sample was composed of 55% women, had an overall mean age of 67.81 (SD: 9.01; range: 50-94), and presented an average of 6.28 years of education (SD: 4.16; range: 0-25), with most of the sample (61.8%) having primary level education. Most of the participants were married or lived with a spouse (75.9%), were retired (62.4%), and lived in urban settings (72.5%). A description of the sample is presented in Table 1.
The CASP-12 total score was computable for 1468 (88.1%) participants; thus, missing data represented 11.9% of the sample. The mean CASP-12 score was 33.3 (SD: 5.80, range: 12-48). Skewness for the total score was 0.311, and floor and ceiling effects were less than 0.5% (Table 2). For domains, pleasure presented the highest percentage of missing data (11.3%) and floor effect (12.1%). All domain scores covered the full score range (3 to 12 points). No domain showed a ceiling effect. Some items presented marked floor or ceiling effects, particularly in the autonomy and pleasure domains.
Figure 1 shows the path diagram of the CASP-12 scale obtained through CFA. The 4-factor model obtained an RMSEA of 0.07 and a CFI of 0.90 (χ2 (48) = 444.59, p < 0.001). Details are provided in the Supplementary Material, including models for one and three factors.
CASP-12 domains correlated from 0.22 (autonomy and pleasure) to 0.46 (pleasure and self-realization). Regarding convergent validity (Table 3), CASP-12 total score correlated −0.57 with Euro-D, 0.52 with life satisfaction, −0.47 with self-perceived health and 0.41 with GALI. For domains, control (r_S = −0.49) and self-realization (r_S = −0.46) reached the highest correlation coefficients with Euro-D. Self-realization also showed the highest correlation coefficients with life satisfaction (r_S = 0.46) and self-perceived health (r_S = −0.45). The pleasure domain displayed the lowest correlation coefficients with the other applied measures.
CASP-12 total and domains showed significant differences in scores by sex (women showed lower mean scores than men, p < 0.01), by age groups (lower mean scores in older participants in all domains except in autonomy, p < 0.01), by activity limitation assessed with GALI (higher mean scores for those participants without limitations, p < 0.001), by self-perceived health (lower mean scores for participants with poor health, p < 0.001), and by presence of multimorbidity (lower mean scores for participants with two or more chronic conditions, p < 0.001) (Table 4).
Regarding internal consistency (Table 5), Cronbach's alpha for the total score of CASP-12 was 0.78, with a range from 0.37 (autonomy) to 0.73 (self-realization) for domains. Inter-item correlation and item homogeneity indexes were lower for autonomy and higher for self-realization.
Finally, a Rasch analysis was performed with the 12 items, showing a lack of fit to the Rasch model. Therefore, each domain was analyzed separately, and all showed unidimensionality, ordered thresholds, item local independency and lack of DIF by gender. Table 6 presents the person and item fit parameters for the CASP-12 domains. The control domain presented a good fit to the Rasch model, PSI = 0.617. Item 1 ("My age prevents me from doing the things I would like to do") displayed DIF by age, with adults aged 75 or more underestimating scores.
Similarly to control, the autonomy domain showed a good fit to the Rasch model, with a low PSI of 0.312 and DIF by age for item 4 ("I can do the things I want to do"). The pleasure domain displayed an adequate fit to the Rasch model after splitting items 8 ("I feel that my life has meaning") and 9 ("On balance, I look back on my life with a sense of happiness") due to DIF by age (underestimation of scores by older adults), with low PSI (0.372). Finally, the self-realization domain had a good fit to the Rasch model, PSI = 0.71, and no DIF by age groups. When super items were created for each domain, a good fit to the Rasch model was observed, with χ2 (36) = 10.089, p = 0.745, PSI = 0.722.

Discussion
This is the first complete validation study with the European Portuguese version of the CASP-12 scale used in the SHARE study. This version is slightly different to that originally proposed by Wiggins et al. [9].
Regarding data quality and acceptability, missing data and skewness of domains and total score were within the standard limits. Most items showed a marked ceiling effect, particularly in the control and pleasure domains, and two items showed a floor effect. This is in line with the other studies reporting the distribution of items scores in CASP-12 [15]. However, domains and total score did not show floor or ceiling effects.
Internal consistency of CASP-12 was satisfactory in the self-realization domain and when combining all items. The autonomy domain presented the lowest values of internal consistency, as reported in previous studies [14,15]. The autonomy and pleasure domains also showed low reliability indices in Rasch analysis. This should be considered when using these two domains individually.
The autonomy domain also showed the weakest results for internal, construct and structural validity. This domain showed low correlation coefficients with other domains (0.22 with pleasure and 0.36 with self-realization). In Rasch analysis, the autonomy domain presented a low reliability and one item with bias by age. Some authors have proposed dropping the items on "family responsibilities" and on "shortage of money", as they could be measuring something different from QoL [9], or forming a domain with control in a bi- or tri-factorial model [15,16]. However, despite the problems with the item on "family responsibilities", family life is an important dimension of Portuguese older adults' QoL, as commented in the Introduction, and this prevents us from deleting this item [5].
Moreover, the CFA supported the 4-factor model proposed by the original CASP-12 developers [42]. However, the factor intercorrelations found in the CFA were much higher than the scale intercorrelations for internal validity. In addition, items 5 and 6 loaded weakly on the "autonomy" scale, and the same was observed for item 7 on the "pleasure" scale. This suggests the need for further research to elucidate the advantages and disadvantages of using different factor structures of the CASP-12 scale in Portugal. Even though the initial 12-item model did not provide a good fit to the Rasch model, an adequate fit was found for each of the individual domains, as well as for the model with one super item per domain. Thus, results from Rasch analysis point to a hierarchical structure of the CASP-12 scale with a higher-order construct, QoL, formed by four lower-order unidimensional domains. A previous Rasch analysis found evidence for a unidimensional 15-item version of the CASP scale [17].
Regarding convergent validity, the total score of CASP-12 showed the highest correlation coefficients with the item on satisfaction with life and with the Euro-D, as hypothesized. Depression is a main determinant of QoL deterioration, and this relationship has been previously reported in other studies applying CASP-12 in older samples [43].
Women, participants aged 75 years or older, and those with activity limitations, poor self-perceived health or two or more chronic diseases scored significantly lower in CASP-12, as in other studies [43,44]. Portugal has a high old age-dependency ratio, reaching 65.8% [1]; thus, these results suggest the need for intervention in the most vulnerable population to improve their QoL and achieve healthy and active aging. Gender differences are not due to an item bias by gender, as indicated in our DIF analyses and in previous studies [17]. However, a bias by age was found in four items, and further research is needed to confirm our findings. If confirmed, differences by age in the autonomy and pleasure dimensions should be interpreted cautiously, as they might be due to item bias.
Several limitations to this study must be acknowledged. A heterogeneous, diverse sample is usually advised for validation studies. In this case, the SHARE project does not include individuals living in nursing homes. Furthermore, because of the cross-sectional design, it is not possible to evaluate the temporal stability of the structure presented. Therefore, further studies should assess CASP-12 validity with other Portuguese samples (e.g., institutionalized older adults) and analyze responses for test-retest reliability.
In explicitly resisting a conflation of QoL with health status that often happens in old age, the CASP scale was developed to focus on favorable and advantageous features of aging and on older people's positive characteristics [42]. Despite some limitations, it is an instrument with adequate psychometric features, and its use is encouraged as it may contribute to furthering the study of older adults' needs and strengths and therefore improve the well-being of the older population in Portugal. The CASP-12 scale is a QoL instrument that might be useful for clinical practice, as well as to assess public health interventions and aging policies in Portugal. In addition, this study underscores the past and future research studies performed with data from the SHARE project and that use the CASP-12 scale as a QoL measure.

Conclusions
In conclusion, the European Portuguese version of the CASP-12 scale, when applied to people aged 50 years or older, presented some inadequacies in terms of acceptability, internal consistency and structural and construct validity of two of the four domains that compose it. Therefore, the total score could be preferred over the use of individual domain scores. The European Portuguese version of the CASP-12 scale used in the SHARE project presents some strengths, such as good acceptability, unbiased scores by gender, fit to the Rasch model, and adequate reliability of the control and self-realization domains. Nevertheless, future research should present more evidence on the scale's psychometric properties, including its factor structure in different samples, namely, old-old and institutionalized individuals.