Reliability and Validity of the Geriatric Depression Scale in a Sample of Portuguese Older Adults with Mild-to-Moderate Cognitive Impairment

Although the Geriatric Depression Scale (GDS) is a well-established instrument for the assessment of depressive symptoms in older adults, it has not been validated specifically for Portuguese older adults with cognitive impairment. The objective of this study was to analyze the psychometric properties of two Portuguese versions of the GDS (GDS-27 and GDS-15) in a sample of Portuguese older adults with mild-to-moderate cognitive impairment. Clinicians assessed major depressive disorder and cognitive functioning in 117 participants with mild-to-moderate cognitive decline (76.9% female, mean age = 83.66 years). The internal consistency coefficients of the GDS-27 and GDS-15 were 0.874 and 0.812, respectively. Both the GDS-27 and the GDS-15 correlated significantly with the Beck Depression Inventory-II (GDS-27: rho = 0.738, p < 0.001; GDS-15: rho = 0.760, p < 0.001), suggesting good convergent validity. Cutoff points of 15/16 on the GDS-27 and 8/9 on the GDS-15 best identified persons with depression (GDS-27: sensitivity 100%, specificity 63%; GDS-15: sensitivity 90%, specificity 62%). Overall, the GDS-27 and GDS-15 are reliable and valid instruments for the assessment of depression in Portuguese-speaking older adults with cognitive impairment.


Introduction
Population ageing is occurring globally at an unprecedented rate and will accelerate in the coming decades, particularly in developing countries [1]. The World Health Organization estimates that the number of adults aged 60 and over will reach up to 2.1 billion worldwide by 2050, with nearly 80% living in low- and middle-income countries [2]. According to data from the National Statistics Institute, older adults in Portugal comprise approximately 21.5% of the population, with an expected increase to 37.2% by 2080 [3].
Older adults who do not age normally are particularly vulnerable to mental and neurological health conditions such as neurocognitive disorders and depression, owing to additional stress factors such as loss of capacity [4]. The effects of depression can be chronic or recurrent and can dramatically affect an individual's daily functioning and quality of life [1]. Additionally, when depression is associated with chronic illness, it increases morbidity and mortality, leading to a psychological and financial burden on the individual, family, and health system [5]. The health status of older adults thus has significant personal, community, and national impacts. Depression often accompanies neurocognitive disorders, is both a risk factor and a prodrome, and may indicate a worse prognosis [6]. Early and timely identification of mood changes may therefore be crucial in treating decline in cognitive functioning, which makes it essential to have screening tools for depression that are suitable for use in older adults with cognitive impairment.
The clinical manifestations of depression in older adults are complex, as they involve biological, psychological, and social aspects, often related to lifestyle changes and cognitive decline [7]. Identifying depression in the context of primary care, particularly in patients with multiple comorbidities, can also be difficult. As a consequence, depression is often under-diagnosed and under-treated in older adults [8]. Therefore, self-response questionnaires have emerged as an approach to help primary care providers identify patients who may have depression but do not yet have a diagnosis [4].
Two existing gold-standard instruments for assessing depression are the Beck Depression Inventory-II (BDI-II), originally developed by Beck et al. [9] for use throughout the lifespan, and, for older adults, the Geriatric Depression Scale (GDS) created by Yesavage et al. [10,11]. The GDS is a widely used tool for depression screening, specifically designed for the elderly. Unlike other depression screening tools, this instrument does not contain items assessing somatic symptomatology, on the grounds that these may lack discriminatory capacity in older adults because they can be attributed to comorbid physical conditions and the ageing process [10]. Additionally, the GDS uses dichotomous yes/no items rather than the multi-level items of the BDI-II and has a shorter retrospective recall period (1 week for the GDS versus 2 weeks for the BDI-II), making it a simpler tool for older adults. Therefore, the GDS can be administered to older adults regardless of physical illness or cognitive impairment [12] and is a reliable screening tool for depressive symptoms in mild cognitive impairment [13] and dementia [13-15]. However, as shown by a recent systematic review [16], research on the accuracy of this measure for depression screening in older adults with cognitive impairment is still sparse, and further studies are needed to enable the selection of optimal cutoff values.
In Portugal, the GDS-30 was adapted and validated by Pocinho et al. [20], while a 15-item version (GDS-15) was adapted and validated by Apóstolo et al. [21,22]. Both versions demonstrated good psychometric properties; hence, they show potential as sound instruments for screening for depressive symptoms in older adults. However, they have not been validated specifically in samples of Portuguese individuals with cognitive impairment. This is important to explore because comprehension of items and response styles may differ in those with cognitive difficulties, thus affecting the outcomes and their interpretation.
The aim of this study was to explore the psychometric properties (namely, internal consistency, reliability, and construct validity) of two Portuguese versions of the GDS (GDS-30 and GDS-15) in a Portuguese sample with mild-to-moderate cognitive impairment and to compare their performance against the DSM-5 diagnostic criteria for major depressive episode (thus exploring criterion validity). Establishing reliability and validity, two core psychometric properties, is essential for accurate and interpretable assessment [30]. These properties therefore need to be examined across populations and instruments to instill confidence in the measured outcome, which motivates the current study. The influence of sociodemographic variables (namely, age, education level, and gender) on the performance of the two GDS versions was also analyzed. Notably, we used the GDS-27 version in the current study, as modified in Pocinho et al. [20], where item-level analysis found three items of the original GDS-30 (items 27, 29, and 30) to be weak in the Portuguese translation; thus, our version reflected the psychometrically stronger (and more commonly administered) GDS-27.

Participants
This cross-sectional study was conducted on a convenience sample of 161 older adults recruited through 12 institutions that provide social care and support services for older adults (including people living in long-term care centres, people attending day and social centres, and people receiving home support services), located in urban (one in the northern region and four in the central region) and rural (five in the central region and two in the southern region) areas of Portugal. Inclusion criteria were the following: (a) aged 65 years or over; (b) diagnosed with neurocognitive disorder by a clinical psychologist as per DSM-5 criteria [31]; (c) able to engage with and understand the assessment questions; and (d) native Portuguese speaker.
Exclusion criteria covered older adults who had severe sensory and/or physical limitations, were not oriented to the environment, or had severe neuropsychiatric symptoms that prevented the completion of the assessment instruments. A further exclusion criterion was based on Mini-Mental State Examination (MMSE; see details in the Instruments section) scores. As our primary focus was on mild-to-moderate cognitive impairment, participants with severe cognitive impairment (MMSE score of less than 9) were not included, nor were those without evidence of cognitive impairment (MMSE score of 22 or better for those with 0 to 2 years of formal education; 24 or better for those with 3 to 6 years of formal education; 27 or better for those with 7 or more years of formal education). Of the 161 persons contacted to participate in the study, 44 were excluded based on MMSE scores, resulting in a final sample of 117 participants. The study was conducted in accordance with the latest version of the Declaration of Helsinki and was approved by the Ethics Committee of the Health Sciences Research Unit: Nursing, part of the Nursing School of Coimbra (code number P629/11-2019). Prior to the inclusion of the subjects in the study, signed informed consent was obtained from the participants or their legal representatives. The consent information covered the processing of data in accordance with current legislation, the voluntary nature of participation in the study, the absence of financial compensation or any other incentive, and the right to withdraw consent at any time without affecting the services received at the institution. Additionally, throughout the study, the evaluators monitored the participants for indications that they did not wish to participate in the evaluations.

Instruments
A self-reported sociodemographic and clinical questionnaire was administered, collecting data on gender, age, marital status, educational level, type of institution attended, presence of medical comorbidities, and cognitive symptoms. In addition, the following self-report assessment tools were used:
Geriatric Depression Scale-30 (GDS-30): The full version of the GDS, with reported good reliability (Cronbach's alpha = 0.91). Items are presented in a dichotomous format (yes/no). Score range 0-30. Scores of 11 or more out of a maximum of 30 points are suggestive of clinical depression. In the present study, the 27-item version (GDS-27) was used. It is easy to administer and is indicated for administration to people suffering from cognitive decline [10,20].
Geriatric Depression Scale-15 (GDS-15): A 15-item version of the GDS with good reliability (Cronbach's alpha = 0.83). It consists of 15 questions in a dichotomous format (yes/no). Score range 0-15. Scores equal to or greater than 6 out of a maximum of 15 points are considered to be indicators of depression [21,32]. The authors suggest that this tool is effective for screening for depression in older adults [22].
Beck Depression Inventory-II (BDI-II): One of the most widely used tools for assessing self-reported depression, with good reliability (Cronbach's alpha = 0.90) [33,34]. It consists of 21 items that assess symptoms characteristic of depression during the last two weeks. It comprises a cognitive-affective scale (items 1-10, 12, 14, and 21) and a somatic scale (items 11, 13, and 15-20). Each item is scored from 0 to 3, with 0 reflecting the absence of the symptom and a higher value reflecting greater symptom severity. The overall score ranges from 0 to 63 points. There is evidence that the BDI-II is a reliable and valid tool for depression screening in older adults, including those with cognitive impairment [35], and that it has practical utility in different healthcare contexts [36-38].
Mini-Mental State Examination (MMSE): The MMSE (Folstein et al. [39]; Portuguese version by Guerreiro et al. [40]) is a widely known and used screening tool for cognitive function. Score range 0-30. The cutoff points followed the normative data established by Morgado et al. [41] for three literacy groups: 22 for those with 0 to 2 years of formal education, 24 for those with 3 to 6 years of formal education, and 27 for those with 7 or more years of formal education. The Portuguese version by Guerreiro et al. [40] shows good reliability (Cronbach's alpha = 0.89).
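The MMSE-based selection rule described in the Participants section (scores below 9 treated as severe impairment; scores at or above the education-adjusted cutoff treated as no evidence of impairment) can be illustrated with a minimal Python sketch. The function names and band labels are hypothetical and introduced only for illustration; this is not the authors' procedure or software.

```python
def mmse_cutoff(years_of_education):
    """Education-adjusted MMSE cutoff, following the three literacy groups in the text."""
    if years_of_education <= 2:
        return 22
    if years_of_education <= 6:
        return 24
    return 27


def mmse_band(score, years_of_education):
    """Classify an MMSE score into the bands used for sample selection.

    Labels are hypothetical: only 'mild-to-moderate' would be retained.
    """
    if not 0 <= score <= 30:
        raise ValueError("MMSE scores range from 0 to 30")
    if score < 9:
        return "severe"
    if score >= mmse_cutoff(years_of_education):
        return "no impairment"
    return "mild-to-moderate"


# Example: a person with 4 years of schooling and an MMSE score of 20
print(mmse_band(20, 4))
```

Under this reading, a participant is retained only when the score is at least 9 and below the cutoff for their education group.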
We also administered a module of the Structured Clinical Interview for DSM-5 Disorders, Clinician Version (SCID-5-CV) [42], which is based on the DSM-5 criteria [31] for depressive disorders, with inter-observer reliability (κ) ranging from 0.70 to 1.00.

Procedures
The sample was recruited via non-probabilistic sampling among 12 institutions that provide social care and support services for older adults (including people living in long-term care centres, people attending day and social centres, and people receiving home support services), located in urban and rural areas of Portugal. Older adults with cognitive impairment and their legal representatives were contacted, the details of the study were explained to them, and they were invited to participate.
A psychologist with two or more years of experience and familiarity with the measures used in the study asked participants who met the inclusion criteria to complete a sociodemographic questionnaire, the Portuguese version of the GDS-27, the Portuguese version of the GDS-15, and the BDI-II (which was used as the gold standard to assess symptoms of depression). Where participants had limited reading experience, the professional read the questions aloud and recorded the answers. A clinical psychologist with two or more years of experience, who had previously undergone at least 4 h of training, administered the major depression module of the SCID-5-CV. The clinician was aware of the GDS and BDI-II results.
The instruments were given in a single session and the order of administration was as follows: (1) MMSE, to determine whether they met inclusion criteria of diagnosis of neurocognitive disorder; (2) depression screening instruments; and (3) the SCID-5-CV.

Statistical Analysis
To compare the distribution of categorical variables in independent groups, the Chi-Squared test was used, with effect size estimated using the Phi (ϕ) statistic for 2 × 2 contingency tables or Cramér's V statistic for larger contingency tables. To compare the distribution of ordinal variables in independent groups, the Mann-Whitney test (for two groups) and the Kruskal-Wallis test (for more than two groups) were used, owing to the non-normal distribution of the results obtained. For the Mann-Whitney test, the effect size was calculated using the formula r = Z/√N and interpreted following Cohen [43]. For the Kruskal-Wallis test, if differences were statistically significant, a multiple comparison of mean ranks was performed. The effect size was calculated using the partial eta squared measure (ηp²) and interpreted following the suggestions of Cohen [43] and Marôco [44].
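The effect size r = Z/√N for the Mann-Whitney test can be illustrated with a short Python sketch. It assumes that Z comes from the normal approximation with a tie correction and no continuity correction, which may not match the SPSS output exactly; the function names are hypothetical and the example values are invented, not study data.

```python
import math
from collections import Counter


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_effect_size(group_a, group_b):
    """Return (U for group_a, Z, r) where r = |Z| / sqrt(N)."""
    pooled = list(group_a) + list(group_b)
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: each set of t tied values reduces the variance of U
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        raise ValueError("all values are tied; Z is undefined")
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, z, abs(z) / math.sqrt(n)


# Invented scores for two small groups
u, z, r = mann_whitney_effect_size([4, 6, 7, 9], [8, 11, 12, 14, 15])
print(u, round(z, 3), round(r, 3))
```

Taking the absolute value of Z yields a non-negative r, which is how the magnitude is usually compared against Cohen's benchmarks.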
Internal consistency of the GDS questionnaires was measured using the Kuder-Richardson coefficient (KR-20), which is indicated for dichotomous variables [45]. Additionally, McDonald's omega (Ω) was calculated, as this coefficient has proved to be a robust alternative for cases in which the assumptions for the use of Cronbach's alpha (such as unidimensionality and absence of normality violations) are not met [46], although the use of McDonald's omega is mainly recommended for large samples. Convergent validity was determined through Spearman correlation coefficients calculated between the GDS questionnaires and the BDI-II.
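For readers unfamiliar with KR-20, the following Python sketch shows one common form of the coefficient, k/(k − 1) × (1 − Σ p·q / variance of total scores), for yes/no items coded 1/0. It uses the population variance of the total scores, an assumption that keeps it consistent with the p·q item variances; statistical packages may use a slightly different convention. The function name and the toy responses are invented for illustration.

```python
def kr20(responses):
    """Kuder-Richardson 20 for dichotomous (0/1) items.

    responses: list of rows, one per respondent, each a list of 0/1 item scores.
    """
    n = len(responses)
    k = len(responses[0])
    if k < 2:
        raise ValueError("KR-20 needs at least two items")
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of total scores (divides by n, matching p * q below)
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n  # proportion endorsing item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Invented answers from five respondents to four yes/no items
toy = [
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
print(round(kr20(toy), 3))
```

Values closer to 1 indicate that the items vary together, which is the sense in which the reported coefficients are read as strong internal consistency.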
To analyze the possible effect of covariates on dependent variables, a non-parametric ANCOVA was performed using the F statistic calculated according to the Quade Method. The Quade Method involves testing the equality of the residuals between groups obtained through linear regression of the ranked dependent variable on the ranked covariate.
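The steps of the Quade Method described above (rank both variables, regress ranked outcome on ranked covariate, compare residuals between groups) can be sketched in plain Python. This is a simplified illustration with a single covariate and hypothetical function names; it returns the F statistic with degrees of freedom (groups − 1, N − groups) and does not compute a p-value. The example data are invented.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def quade_f(outcome, covariate, groups):
    """Rank-based ANCOVA sketch: returns (F, df_between, df_within)."""
    ry, rx = midranks(outcome), midranks(covariate)
    n = len(ry)
    mean_y, mean_x = sum(ry) / n, sum(rx) / n
    sxx = sum((x - mean_x) ** 2 for x in rx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rx, ry))
    slope = sxy / sxx
    # Residuals of the ranked outcome after removing the ranked covariate
    residuals = [(y - mean_y) - slope * (x - mean_x) for x, y in zip(rx, ry)]
    by_group = {}
    for g, r in zip(groups, residuals):
        by_group.setdefault(g, []).append(r)
    grand_mean = sum(residuals) / n
    ss_between = sum(
        len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in by_group.values()
    )
    ss_within = sum(
        (r - sum(v) / len(v)) ** 2 for v in by_group.values() for r in v
    )
    df_between = len(by_group) - 1
    df_within = n - len(by_group)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Invented data: screening score, covariate (e.g., age), and diagnostic group
scores = [2, 4, 5, 7, 9, 11, 12, 15]
ages = [80, 85, 82, 78, 84, 79, 87, 81]
group = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]
f_value, df1, df2 = quade_f(scores, ages, group)
print(round(f_value, 2), df1, df2)
```

A large F indicates that the groups still differ on the ranked outcome once the ranked covariate has been accounted for.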
The two-factor interaction effect on the dependent variables was also analyzed. For this purpose, a non-parametric two-way ANOVA was performed, using the H statistic calculated based on the formula in which the sum of the squared ranks of a given factor is divided by the total mean square for those ranks. The effect size was indicated by the ηp² coefficient.
The receiver operating characteristic (ROC) curve for GDS scores and DSM-5 diagnosis (present/absent) was also plotted to establish the sensitivity and specificity of different cutoff points for depression screening. The optimal cutoff point was selected as the one with the maximum Youden index, calculated according to the formula "sensitivity + specificity − 1" [47]. Other standard summary measures of test accuracy, such as positive and negative predictive values, were also calculated.
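The cutoff selection described above can be illustrated with a minimal Python sketch that computes sensitivity, specificity, the Youden index, and predictive values at each candidate threshold and keeps the threshold with the highest Youden index. The function names and the toy data are invented for illustration. A threshold here is the lowest score counted as a positive screen, so a cutoff written as 8/9 corresponds to a threshold of 9.

```python
def accuracy_at(scores, has_depression, threshold):
    """Accuracy measures when a score >= threshold counts as a positive screen."""
    tp = fp = tn = fn = 0
    for score, depressed in zip(scores, has_depression):
        positive = score >= threshold
        if positive and depressed:
            tp += 1
        elif positive:
            fp += 1
        elif depressed:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both diagnostic groups must be present")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1,
        "ppv": tp / (tp + fp) if tp + fp else None,
        "npv": tn / (tn + fn) if tn + fn else None,
    }


def best_threshold(scores, has_depression):
    """Threshold with the maximum Youden index (lowest one if there is a tie)."""
    candidates = sorted(set(scores))
    return max(
        candidates,
        key=lambda t: accuracy_at(scores, has_depression, t)["youden"],
    )


# Invented screening scores and reference diagnoses
toy_scores = [3, 5, 8, 9, 10, 12, 14]
toy_diagnosis = [False, False, False, True, False, True, True]
threshold = best_threshold(toy_scores, toy_diagnosis)
print(threshold, accuracy_at(toy_scores, toy_diagnosis, threshold))
```

Because the Youden index weights sensitivity and specificity equally, a threshold chosen this way can favour high sensitivity at the cost of modest specificity when few participants have the diagnosis.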
In all analyses, a statistical significance level of 0.05 was considered. For data treatment, the Statistical Package for Social Sciences (IBM SPSS Statistics, New York, NY, USA), version 25.0, was used.

Sample Profile
Of the 117 older adults presented in Table 1, applying the DSM-5 diagnostic criteria for a major depressive episode, 20 participants were classified as having depression and 97 as not having depression; among the depressed participants, 10 met the criteria for a mild depressive episode, seven for a moderate depressive episode, and three for a severe depressive episode. There were no significant differences between depressed and non-depressed participants in terms of age (U(97, 20) = 958.50, p = 0.934), male/female ratio (χ²(1) = 0.887, p = 0.346), or formal education level (χ²(2) = 5.106, p = 0.078). The groups were also equivalent in terms of MMSE score (U(97, 20) = 961.00, p = 0.948).

Depressive Symptomatology and Sociodemographic Characteristics
The mean GDS-27 and GDS-15 scores obtained in this sample were 14.11 (SD = 6.68) and 7.68 (SD = 3.75), respectively. Analyses of the potential effect of age, gender, and formal education level on the GDS-27 and GDS-15 scores in depressed and non-depressed participants were also performed. To analyze the influence of age, a non-parametric ANCOVA was performed using mean ranks of both the dependent variable and the covariate. The F statistic was calculated using the univariate ANOVA of non-standardized residuals obtained through linear regression of the rank of the dependent variable on the rank of the covariate. The F-test showed that the difference in scores between depressed and non-depressed older adults on both questionnaires remained significant after adjusting for age and therefore cannot be explained by the age distribution (GDS-27: F = 29.657, p < 0.001, ηp² = 0.205; GDS-15: F = 24.817, p < 0.001, ηp² = 0.177).

Reliability
The reliability of the GDS-27 and GDS-15 was calculated based on the responses of 94 participants (18 with depression and 76 without depression). With regard to the GDS-27, the item means ranged from 0.27 (item 15) to 0.80 (item 17). The item-total correlations were strong (r ≥ 0.70) for items 4 and 16 and weak (r ≤ 0.40) for items 12, 14, 15, 18, 22, and 28. For the remaining 19 items, the item-total correlations were moderate (0.40 < r < 0.70). The corrected item-total correlations ranged between 0.13 (item 12) and 0.74 (item 4).

Depressive Symptomatology and MMSE Score
To analyze the potential effect of MMSE score as a covariate on the performance of the GDS-27 and GDS-15, a non-parametric ANCOVA was performed using mean ranks of both the dependent variable and the covariate. The F statistic was calculated using the univariate ANOVA of non-standardized residuals obtained through linear regression of the rank of the dependent variable on the rank of the covariate. The F-test showed that the difference in scores between depressed and non-depressed older adults on both questionnaires remained significant after adjusting for MMSE score and therefore cannot be explained by the MMSE score distribution (GDS-27: F = 30.258, p < 0.001, ηp² = 0.208; GDS-15: F = 25.278, p < 0.001, ηp² = 0.180).

Discussion
This study aimed to analyze the psychometric properties and screening performance of two Portuguese-translated versions of the commonly used Geriatric Depression Scale (GDS-27 and GDS-15) in older adults with mild-to-moderate cognitive impairment. The mean GDS-27 and GDS-15 scores obtained in this sample were just over 14 and 7, respectively. As expected, depressive symptoms were significantly worse for those participants who met the DSM-5 criteria for depression.
Notably, reliability analyses based on the Kuder-Richardson coefficient and McDonald's omega revealed strong internal consistency reliability in both versions of the scale (GDS-27: > 0.86; GDS-15: ≥ 0.81). These results are consistent with the internal consistency found for the general Portuguese population (GDS-27: α = 0.91; GDS-15: α = 0.83) in previous studies by Pocinho et al. [20] and Apóstolo et al. [21]. While overall internal consistency was strong, some items were notably weaker than others, which raises questions about how those items function within our sample. For example, the item with the lowest correlation on the GDS-15 asks about worrying that bad things will happen (item 6), and the item with the second lowest correlation asks about preferring to stay at home rather than going out and doing things (item 9). Responses to these two items may reflect life circumstances and may be sample specific: most of our participants (76%) reside in a supervised institutional setting, where autonomy and independence are inherently reduced, and this may be reflected in the item (and thus in its reduced variability) rather than the effects of depressed mood specifically. Similar findings were observed on the GDS-27, where the items with the lowest correlations concerned preferring to stay at home rather than doing things (item 12) and being preoccupied with the past (item 18).
Furthermore, the scale had good validity. We explored convergent validity by examining associations between both the GDS-15 and GDS-27 and the gold standard measure for self-reported depression, the BDI-II, and found strong associations. These results, obtained in our cognitively impaired sample, are consistent with previous explorations of the validity of the Portuguese GDS-30 [20] and GDS-15 [21,22]. Furthermore, when the BDI-II items were grouped into the cognitive-affective and somatic subscales, the two GDS versions continued to demonstrate strong associations.
Since both the GDS-27 and the GDS-15 obtained good reliability and validity, clinicians can opt for using the GDS-15 to avoid fatigue in the population of people with cognitive impairment.
Another important indicator of a strong measure is sensitivity in identifying the underlying construct. The two GDS versions had strong sensitivity in identifying those with diagnosed clinical depression, showing their benefit as screening tools. In particular, while this relatively brief self-report form was not strongly able to identify nuances of depression, we were able to identify those with and without diagnosed depression with high accuracy on both the GDS-15 and GDS-27. Along these lines, we were able to identify the cutoff scores on the GDS scales that provide the best balance of sensitivity and specificity. Specifically, we found scores of 8/9 on the GDS-15 and 15/16 on the GDS-27 to be the optimal cutoff points for the screening identification of depression. Thus, we feel confident in the ability of this tool to serve as a good screening measure that can help identify when patients need a follow-up for mood evaluation and treatment.
Notably, our optimal cutoff scores are higher than typical cutoff points in other Portuguese-speaking populations, which fall at 4/5 on the GDS-15 [22,25]. It is possible that our cognitively impaired sample endorsed more of the cognitively oriented symptoms (e.g., more memory problems than most; worry about bad things happening) than a cognitively intact older adult population would, so that these items captured the effects of cognitive deficits rather than mood-related symptoms. Nonetheless, these cutoff values are broadly in line with the ranges found in other studies. Cutoffs have been reported as high as 7/8 in other translated versions of the GDS-15 [48] and 10/11 on the GDS-30 [49]. Further, on the GDS-30, the cutoff has ranged up to 16 in other clinical samples and translations [50]. While our analyses revealed high sensitivity rates (90-100%), our specificity rates were lower than statistically ideal (62-63%); however, this also is within the specificity range found in past reviews of the GDS-15 and GDS-30 [51] and, thus, consistent with the general performance of the scale. Our overall scores revealed sensitivity and specificity consistent with GDS performance in other studies; in our sample, GDS scores remained consistent across demographic characteristics, including gender and education, which importantly argues for good consistency of the measure. Overall, we were able to demonstrate sound discriminability of the GDS between those in our sample with and without depression.
Naturally, this study has some limitations. First, our use of a convenience sample has implications for the generalizability of the study findings to a wider population. To reduce the impact of this limitation, the recruitment process was carried out in different settings and different geographical locations, involving both institutionalized and community-dwelling older adults. Second, we used a relatively small sample size for a psychometric exploration. Namely, although a large number of older adults were contacted, the study sample included only 117 participants, which made it impossible, among other reasons, to carry out a factor analysis, which would have allowed a better understanding of the structure of the questionnaire. The reduced adherence reflects the difficulties that health professionals often face in recruiting older adults with neurocognitive disorders to research projects. It should also be noted that, of all participants included in the study, only 20 met the DSM-5 criteria for major depressive episode. Another limitation of the study was the use of a screening tool, rather than a diagnostic one, to complement the diagnosis of a neurocognitive disorder according to the DSM-5. Finally, because we did not control or restrict access to antidepressant medications, the present study did not evaluate whether the GDS results were affected by antidepressant medication taken by participants. In further studies, this potential effect of medication intake should be examined in a controlled manner.

Conclusions
A better understanding of the psychometrics of any measure helps improve clinicians' and researchers' confidence in its validity and reliability when administering it in their practice. This study provides support for the psychometric properties of the GDS-15 and GDS-27 in a sample of persons with mild-to-moderate cognitive impairment. Our results support that both the GDS-27 and the GDS-15 are reliable and valid instruments for assessing depressive symptoms and screening for depression in Portuguese persons with cognitive impairment. This knowledge adds to a growing body of literature exploring the psychometrics of the GDS in various settings globally and, specifically, contributes to the growing knowledge of the Portuguese versions of the GDS-15 and GDS-27.

Table 1.
Sociodemographic, clinical, and neuropsychological characteristics of study participants.

Table 2.
Results on the GDS questionnaires for the subsample of depressed and non-depressed participants.

Table 3.
Sensitivity, specificity, and Youden Index of the Geriatric Depression Scale with 27 and 15 items.
