1. Introduction
Depression is a common condition among adults and can lead to harmful consequences. Worldwide, it is considered the third leading cause of years lost to disability [
1], with a prevalence that increased by 17.8% between 2005 and 2015 [
2]. In Europe, studies carried out in primary care (PC) settings have reported an incidence from 9.6% [
3] to 20.2% [
4]. The prevalence of depression in Spain is higher than the European mean and is associated with a negative perception of physical health, the presence of two or more difficulties in daily living activities, female gender [
5], and some physical comorbidities [
6].
Several instruments have been designed to screen mental disorders. As a collaborative international project, the Family Practice Depression and Multimorbidity (FPDM) group from the European General Practice Research Network (EGPRN) aimed to select a questionnaire to detect depression symptoms in PC patients [
7]. Firstly, a systematic review of validated scales for screening and diagnosis of depression in adults was performed. Scales that had been compared to a psychiatric interview based on the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria with quantitative results and with participation of PC professionals were analyzed [
8]. As a result of this systematic review, seven scales were identified: Geriatric Depression Scale of five items (GDS-5), Geriatric Depression Scale of 15 items (GDS-15), Geriatric Depression Scale of 30 items (GDS-30), Hospital Anxiety Depression Scale (HADS), Center for Epidemiologic Studies Depression Scale-Revised (CESD-R), Physical Symptom Checklist of 51 items (PSC-51), and Hopkins Symptom Checklist of 25 items (HSCL-25).
Secondly, the HSCL-25 was selected by consensus [
9]. Validity, efficacy, and reproducibility were analyzed as quantitative criteria. To assess ergonomics, characteristics such as being self-administered, ease of completion for patients, and simplicity of interpretation were taken into account. The HSCL-25 is suitable for use in PC because of its high validity and reliability; moreover, its ergonomics make it easy for patients to use [
9]. It is a self-report questionnaire designed to measure psychological distress based on the SCL-90 [
10], a longer checklist designed by Derogatis et al. The full version of the SCL-90 covers nine symptom dimensions, with 25 items belonging to the anxiety and depression ones.
Thirdly, the questionnaire was translated into 13 European languages [
11], including the Spanish version [
12]. The translation and adaptation process consisted of an initial forward translation, a pilot study based on the Delphi methodology with the participation of family doctors, and a back translation. Comprehension analysis was carried out through cognitive debriefing in a sample of PC patients. At the last step, transcultural harmonization was performed simultaneously with other versions of the scale in different European languages [
11].
Finally, validation of the different versions is in progress: the French version has already been validated [
13] and the Croatian one is under way.
Instruments should be tested and validated in different languages, cultures, settings, and populations in order to make comparisons and establish efficacy. The Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) [
14] initiative has developed criteria to evaluate the measurement properties of outcome measurement instruments. In addition, a considerable number of studies have assessed the HSCL-25 psychometric properties in various populations [
15,
16,
17,
18,
19], including PC patients [
20,
21]. PC is the gateway to the healthcare system for most of the Spanish population. It is the ideal setting to study the prevalence of the most common diseases.
The purpose of this study was to assess the HSCL-25 psychometric properties and validate the scale’s Spanish version in a PC population.
2. Methods
2.1. Study Design
A cross-sectional multicenter design was used. The participants were patients attending primary healthcare centers (PHC) in Aragon (1), the Balearic Islands (1), and Galicia (4) taking part in the EIRA study [
22]. Ethical approval was given by the Ethics Committee of Institut Universitari d’Investigació en Atenció Primària (IDIAP) Jordi Gol (reference number P16/025).
2.2. Participants
The selection criteria were those employed in the EIRA study. Eligible participants were patients aged between 45 and 75 years who had two or more of the following unhealthy behaviors: tobacco use, low adherence to the Mediterranean dietary pattern, and insufficient physical activity. Exclusion criteria were advanced serious illness, cognitive impairment, dependence in basic everyday activities, severe mental illness, inclusion in a long-term home healthcare program, treatment for cancer, end-of-life care, or no plan to reside in the area during the intervention period.
2.3. Recruitment and Sample Size
Recruitment was carried out by consecutive sampling of patients meeting the selection criteria and attending the PHC for any reason. The recruitment period lasted 6 months in 2017.
The COSMIN guide [
23] was followed to calculate the sample size. It states that seven completed questionnaires are needed per scale item and that at least 100 completed questionnaires are required to assess psychometric properties. As the HSCL-25 has 25 items, and allowing for a 10% possibility of missing values, 193 patients were needed to complete the questionnaire.
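The arithmetic behind the figure of 193 can be sketched as follows. This is an illustrative reconstruction, not the authors' calculation: the function name and the assumption that the 10% allowance is applied multiplicatively (rather than by dividing by 0.9) are ours, chosen because they reproduce the reported value.

```python
import math


def cosmin_sample_size(n_items, per_item=7, minimum=100, missing_pct=10):
    """Hypothetical helper: questionnaires needed under the rule described above.

    Assumes the missing-data allowance inflates the base size multiplicatively.
    """
    # Base requirement: seven questionnaires per item, but never fewer than 100.
    base = max(n_items * per_item, minimum)
    # Inflate by the expected percentage of missing values and round up.
    return math.ceil(base * (100 + missing_pct) / 100)


print(cosmin_sample_size(25))  # 25 items -> 175 base -> 192.5 -> 193
```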
In order to estimate the sample size required to compare the HSCL-25 with the Composite International Diagnostic Interview (CIDI), the receiver operating characteristic (ROC) curve and the corresponding area under the curve (AUC) were calculated with the BIOSOFT application (
http://www.biosoft.hacettepe.edu.tr/easyROC/, accessed on 31 January 2021) employing the following parameters:
Taking into account that the estimated prevalence in PC is 16.3% [
4], 533 patients were required to complete the scale to obtain 87 cases.
To evaluate test–retest reliability, the same considerations and a 20% possibility of missing values were taken into account; 26 patients were needed to reach an acceptable correlation coefficient of 0.7 [
24]. All the included patients were invited to participate in the telephone retest.
2.4. Variables
Sociodemographic data (sex, age, nationality, marital status, current employment, and education level) were gathered from the participants. They were asked to complete the self-administered HSCL-25 questionnaire and other forms related to the EIRA study. Afterwards, trained professionals blinded to the HSCL-25 results conducted the CIDI with all the participants. Training consisted of a general presentation of the interview procedure, a question-by-question reading, role-playing with the interviewers, and resolution of doubtful situations. The HSCL-25 retest was conducted by telephone to improve feasibility and was carried out between 1 and 3 months later.
2.5. Hopkins Symptom Checklist-25 (HSCL-25)
The HSCL-25 is a self-administered questionnaire that takes from five to ten minutes to complete [
13]. It consists of 25 items on a four-point Likert scale: 1 = “Not at all,” 2 = “A little,” 3 = “Quite a bit,” 4 = “Extremely”. The tool has two well-known dimensions: items 1 to 10 belong to the anxiety dimension, whereas items 11 to 25 constitute the depression one. The HSCL-25 score is calculated by dividing the total score of items by the number of items answered, so the final score can range from 1 to 4. A cutoff value of 1.75 is generally used for diagnosis of major depression, defined as “a case in need of treatment”. This cutoff point is recommended as a valid predictor of mental disorder [
15,
17,
25]. Our study was carried out using the Spanish version of the HSCL-25 obtained by means of translation and cultural adaptation of the original English version [
12].
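The scoring rule described above can be illustrated with a minimal sketch. The function names are hypothetical, unanswered items are represented as `None`, and treating a score equal to the cutoff as positive (`>=`) is an assumption, since the text gives only the value 1.75.

```python
def hscl25_score(responses):
    """Mean of the answered items; each answered item must be 1, 2, 3, or 4."""
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no items answered")
    if any(r not in (1, 2, 3, 4) for r in answered):
        raise ValueError("responses must be on the 1-4 Likert scale")
    # Total score divided by the number of items answered, so the range is 1 to 4.
    return sum(answered) / len(answered)


def is_probable_case(score, cutoff=1.75):
    """Assumption: a score at or above the cutoff counts as a probable case."""
    return score >= cutoff


example = [2] * 20 + [None] * 5  # 20 items answered "A little", 5 left blank
print(hscl25_score(example), is_probable_case(hscl25_score(example)))
```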
2.6. Composite International Diagnostic Interview (CIDI)
The CIDI is a standardized structured diagnostic interview created by the World Health Organization (WHO) according to the DSM-IV and International Classification of Diseases (ICD-10) definitions and criteria. Used by trained interviewers for mental disorder assessment in the general population [
26], it has demonstrated high validity and reliability [
27]. Whilst the original CIDI was in English, it has been adapted into and validated for many languages using a common procedure overseen by the WHO [
28]. Questions related to depression symptoms can be found in section E of the CIDI. In this study, it was considered the gold standard to assess the HSCL-25.
2.7. Patient Health Questionnaire (PHQ)
The PHQ is a well-known self-administered questionnaire used for common mental disorders. The PHQ-9 is the depression module in which each of the nine items is rated with a Likert scale that ranges from 0 to 3 [
29]. The total score can vary from 0 to 27. Scores of 15 or more indicate major depression. For this study, the validated Spanish version [
30] was employed.
2.8. Statistical Analysis
Analysis was conducted using Stata version 15 (StataCorp LLC, Texas, USA).
2.8.1. Missing Data
Missing values for scale item responses were imputed with the mean of each individual’s responses to the remaining scale items (the participant’s most representative value). Subjects who answered fewer than 50% of the items were excluded. The same imputation was carried out for the retest values.
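A minimal sketch of this person-mean imputation is given below. It is illustrative only (the analysis itself was run in Stata); the function name is hypothetical, missing answers are represented as `None`, and a participant with exactly 50% of items answered is assumed to be retained.

```python
def impute_person_mean(responses, min_answered_fraction=0.5):
    """Replace missing items with the participant's own mean response.

    Returns None when fewer than the required fraction of items was answered,
    which signals that the participant should be excluded.
    """
    answered = [r for r in responses if r is not None]
    if len(answered) < min_answered_fraction * len(responses):
        return None
    person_mean = sum(answered) / len(answered)
    return [person_mean if r is None else r for r in responses]


print(impute_person_mean([1, 2, None, 3]))        # missing item filled with 2.0
print(impute_person_mean([1, None, None, None]))  # excluded: only 25% answered
```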
2.8.2. Responding Process and Item Analysis
An analysis of the responding process was performed, looking for patterns of non-response and the frequency distribution of item responses by category and sex. The discriminatory capacity of the items was assessed by comparing the two extreme groups. The discrimination index (DI) of each item was calculated as the difference between the mean item scores of the two groups. Given that the response options have four possible categories, the DI could vary from −3 to +3.
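The extreme-groups comparison can be sketched as follows. The text does not state how the two extreme groups were defined, so the 27% upper and lower fractions of the total score used here are purely an assumption, as is the function name.

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Mean item score in the high-scoring group minus that in the low-scoring group.

    Assumption: groups are the top and bottom `fraction` of participants,
    ranked by total scale score.
    """
    n = len(total_scores)
    group_size = max(1, int(n * fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:group_size], order[-group_size:]

    def mean_item(indices):
        return sum(item_scores[i] for i in indices) / len(indices)

    # With items scored 1-4, the result lies between -3 and +3.
    return mean_item(high) - mean_item(low)


item = [1, 1, 2, 3, 4, 4]
total = [1.0, 1.1, 1.5, 2.0, 3.0, 3.5]
print(discrimination_index(item, total))
```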
2.8.3. Internal Structure
Confirmatory factor analysis (CFA) was carried out based on the structure of the original English version. Factor loadings were calculated for two models: one with a single factor and one with two correlated factors (anxiety and depression). The robust, mean-adjusted maximum likelihood method was employed to carry out the factor analysis of the standardized values. To evaluate the fit of the estimated model, the chi-squared test was used as the absolute fit index. Given that this value may be affected by the sample size, complementary indices were employed, including the root mean square error of approximation (RMSEA), the standardized root mean square residual (SRMR), and the coefficient of determination (CD). In addition, comparative indices such as the comparative fit index (CFI) and the Tucker–Lewis index (TLI) were employed.
2.8.4. Criterion Validity
Criterion validity was measured by calculating the ROC curve for the HSCL-25 scale in comparison with the gold-standard CIDI. The AUC was estimated with 95% CI. Sensitivity, specificity, positive and negative predictive values, Youden index, and the best cutoff point were also assessed.
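How these accuracy measures relate to one another can be shown with a short sketch. It is not the authors' Stata procedure: the function names, the toy data, and the convention that a score at or above the cutoff is a positive screen are all assumptions made for illustration.

```python
def _ratio(numerator, denominator):
    # Undefined ratios (empty denominator) are reported as NaN.
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(scores, is_case, cutoff):
    """Accuracy of 'score >= cutoff' against a gold-standard case label."""
    tp = sum(s >= cutoff and c for s, c in zip(scores, is_case))
    fp = sum(s >= cutoff and not c for s, c in zip(scores, is_case))
    fn = sum(s < cutoff and c for s, c in zip(scores, is_case))
    tn = sum(s < cutoff and not c for s, c in zip(scores, is_case))
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "youden": sensitivity + specificity - 1,  # Youden index J
    }


def best_cutoff(scores, is_case):
    """Observed score that maximizes the Youden index."""
    return max(sorted(set(scores)),
               key=lambda c: diagnostic_metrics(scores, is_case, c)["youden"])


scores = [1.2, 1.5, 1.7, 1.8, 2.0, 2.5]              # toy scale scores
is_case = [False, False, False, True, False, True]   # toy gold-standard labels
print(diagnostic_metrics(scores, is_case, 1.75))
print(best_cutoff(scores, is_case))
```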
Concordance with the PHQ-9 was measured with the Pearson correlation coefficient and the prevalence- and bias-adjusted kappa, taking into account cutoff points of 1.75 and 15 for the HSCL-25 and PHQ-9, respectively.
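For two binary classifications, the prevalence- and bias-adjusted kappa (PABAK) reduces to twice the observed agreement minus one. The sketch below illustrates that formula; the function name and the toy classifications are hypothetical.

```python
def pabak(classification_a, classification_b):
    """PABAK for two binary classifications of the same participants: 2 * Po - 1."""
    if len(classification_a) != len(classification_b) or not classification_a:
        raise ValueError("inputs must be non-empty and of equal length")
    agreements = sum(a == b for a, b in zip(classification_a, classification_b))
    observed_agreement = agreements / len(classification_a)
    return 2 * observed_agreement - 1


hscl_positive = [1, 1, 0, 0]  # toy data: HSCL-25 score at or above 1.75
phq_positive = [1, 0, 0, 0]   # toy data: PHQ-9 score at or above 15
print(pabak(hscl_positive, phq_positive))  # agreement 3/4 -> 0.5
```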
2.8.5. Internal Consistency
The contribution of the items to the internal consistency was analyzed with indicators based on correlation (homogeneity), covariance (Cronbach’s alpha coefficient), and regression (R²). The total Cronbach’s alpha and one for each of the two subscales were calculated. A value ≥0.7 was considered adequate [
24].
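Cronbach’s alpha can be computed from the item variances and the variance of the total score. The following sketch uses the standard formula with sample variances; the function name and the toy data are ours, and it assumes complete responses with at least two items.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


toy = [[1, 2], [2, 1], [3, 3]]  # three participants, two items
print(round(cronbach_alpha(toy), 3))
```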
2.8.6. Test–Retest Reliability
Test–retest reliability was assessed by calculating the intraclass correlation coefficient (ICC) using the mean of the two evaluations (test and retest), absolute agreement, and a two-way mixed-effects model.
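As an illustration, an average-measures, absolute-agreement ICC can be obtained from the two-way ANOVA mean squares as (MSR − MSE) / (MSR + (MSC − MSE) / n), where MSR, MSC, and MSE are the mean squares for participants, occasions, and error. The sketch below assumes this form matches the coefficient used; the function name and data are hypothetical, and complete data are required.

```python
def icc_absolute_average(ratings):
    """Average-measures, absolute-agreement ICC from a two-way layout.

    `ratings` is a list of participants, each a list of k scores
    (here k = 2: test and retest).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between participants
    ms_cols = ss_cols / (k - 1)                 # between occasions
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# A constant shift between test and retest lowers absolute agreement.
print(icc_absolute_average([[1, 2], [2, 3], [3, 4]]))
```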
4. Discussion
A major finding of our study is that the Spanish version of the HSCL-25 is an instrument with good acceptability and high response rate for PC patients. Its reliability in measuring depression is robust and presents considerable sensitivity and specificity when compared to the CIDI interview. The CFA demonstrated that the Spanish version is similar to the original English one.
For most of the Spanish population, PC consultations are the gateway to the healthcare system. Due to the high prevalence of depression [
5], it is crucial that easy-to-use, viable tools are available for the PC environment. As the HSCL-25 meets such characteristics [
9], awareness of its psychometric properties is relevant, in particular, of those items that most contribute to detecting symptoms and thus permit discrimination between the healthy populations and the potentially depressed ones. In addition, PC professionals should be informed of the reliability of the scale and its sensitivity and specificity values which are key in order to establish its diagnostic utility.
The study participants were PC patients aged 45–75 years who had taken part in the more extensive EIRA study [
22]. Whilst this implied a restricted age range, which might signify a limitation, the sample was considered sufficiently representative of such individuals. The sample size was greater than the minimum required for the analysis according to the COSMIN guidelines [
23], which are taken as reference in the field of psychometry. The statistical analysis was carried out based on the same recommendations. The content validity of the Spanish version of the HSCL-25 had been previously evaluated when it had been translated and transculturally adapted to Spanish and other official languages of the country [
12].
With respect to item analysis, a considerable percentage of responses was available, and no definite pattern was observed. As a consequence, the questionnaire appears to be widely accepted by PC patients without any items which may cause discomfort or difficulty in understanding. As the study was carried out with patients attending the PHC for any reason, a high percentage of low-rating responses for the categories was expected. In addition, a floor effect was foreseen for item 18 “Thinking of ending one’s life” which concerned suicidal ideation. Taking into account the definition of depression according to the DSM-5 [
31], it is not surprising that the item that best discriminated between the healthy population and the one with depressive symptoms referred to sadness. Item 17 “Feeling blue” was shown to be the most homogeneous in all the analyses, with the highest correlation with the other scale items. It presented the highest coefficient of determination (that is to say, it could be predicted from the rest of the items) and contributed most to increasing the overall internal consistency.
Regarding analysis of the scale’s factorial structure, this was performed with the CFA as the HSCL-25 has been widely studied with one single factor or two correlated ones even though other models have been proposed [
15,
32,
33]. The fit indices for both models were acceptable, and the results indicated moderately high factor loadings. In the two-factor model, the correlation between factors was 0.84, which indicated that the depression and anxiety dimensions were strongly and positively correlated. This figure is higher than that detected in previous studies [
15]. The correlation is understandable as symptoms of anxiety are often observed in patients diagnosed with depression; moreover, anxiety and depression are frequently found to be associated comorbidities [
34,
35].
In other studies that compared the HSCL-25 scale with structured psychiatric interviews, only a subsample of participants was selected for the latter to improve feasibility [
13,
16,
17,
20]. A strength of our work is that all the participants who responded to the HSCL-25 scale also took part in the structured CIDI interview administered by trained professionals. In total, 736 patients fully answered both the scale and the gold-standard CIDI. Criterion validity was considerable: the global AUC was 0.890 (higher in men than in women). Sensitivity and specificity, both global and by gender, were high. The former was greater than that found in previous studies [
13,
16,
17,
20,
25], whilst the latter was similar to the 73% reported by Nettelbladt et al. [
20] and the 78% observed by Lundin et al. [
16], both in Swedish populations. Other authors have described higher values [
13,
17]. Despite the increased number of false positives obtained, as in other studies [
36], the negative predictive value was greater than 97% for both genders. Such a finding indicates that the scale is a good tool for depression screening. With respect to the optimum cutoff point, both the global figure and the one for women were very similar to the 1.75 proposed in the original version and employed in other studies [
21,
37]. Nevertheless, the cutoff of 1.84 for men was higher, in contrast with the findings of other authors, where the cutoff point was greater for women [
25]. When the total HSCL-25 score was compared with that of the PHQ-9 depression module [
29,
38], a high correlation was obtained, and the PABAK was acceptable. This analysis reinforces the high criterion validity found.
For the one-factor HSCL-25 scale, Cronbach’s alpha coefficient was 0.92, similar to the 0.93 obtained in the French version [
13]. Nunnally et al. established the critical level of reliability at 0.70; they stated, however, that for key individual decisions, such as the diagnosis of depression, reliability should be raised to 0.90 [
39]. Cronbach’s alpha for the subscales of depression and anxiety taken separately was greater than 0.80. Such findings demonstrate the strong reliability of the scale to measure depression, especially when employed as a single dimension instrument. The test–retest reliability was greater than 0.90, higher than that observed in other studies [
18], which indicates that the ratings are stable over time. The time interval between the baseline interview and the retest was considered adequate and the test conditions acceptable, even though the retest was carried out by telephone to avoid overburdening the participants.
Other shortened versions of the scale with five and 10 items [
40,
41] presenting acceptable reliability have been proposed. They could be of use, taking into account the characteristics of the PC environment. These studies have been performed in other languages, and it might be of interest to translate the shortened versions into Spanish.
Our findings indicate that, in the future, the Spanish version of the HSCL-25 scale could be employed as a diagnostic tool for depression in PC consultations. Our study has taken place within the framework of a European project [
8,
9], in which a common methodology has been used for the translation and adaptation of different languages. We believe, therefore, that the HSCL-25 is a good tool to carry out research concerning the prevalence of depression at the European level once the various language versions have been validated.