Measurement Invariance in the Center for Epidemiologic Studies-Depression (CES-D) Scale among English-Speaking Whites and Asians

The Center for Epidemiologic Studies Depression (CES-D) Scale has been widely used to measure depressive symptoms. This study compared the measurement invariance of one-, two-, three-, and four-factor models of the CES-D across English-speaking Whites and Asians: White Americans, White Australians, Indians, Filipinos, and Singaporeans. English-speaking White Americans, White Australians, Indians, Filipinos, and Singaporeans (782 men and 824 women), whose ages ranged from 20 to 79 years, completed the CES-D. They were recruited from the data pools of the 2013 and 2014 Coping and Health Survey. Confirmatory factor analyses indicated that the original four-factor model showed the best fit compared to the other models. Mean and covariance structure analyses showed that the factor means of the CES-D subscales among Whites were significantly lower than those among Asians; the score gap was particularly large between Whites and Indians. Additionally, Indians scored the highest on all subscales of the CES-D compared to all other samples. Overall, CES-D scores among Whites were lower than those among Asians.


Introduction
According to a face-to-face household survey of community adults [1], the 12-month prevalence of mood disorders in Asia and America ranges from 1.7% to 3.1% and 4.8% to 9.6%, respectively; depression is a leading cause of disability worldwide. The Center for Epidemiologic Studies Depression (CES-D) Scale [2] is a 20-item self-report questionnaire designed to measure depressive symptoms in general populations. It has been widely used in many countries and with many racial/ethnic groups [3]; indeed, in a list of the 100 most-cited papers of all time published by Nature in 2014 [4], the article [2] reporting on the development of the CES-D ranked 51st (N = 17,055 citations).
Researchers must address measurement invariance when examining cross-cultural differences in CES-D scores because, without measurement equivalence across ethnic or cultural groups, it is difficult to interpret differences in observed mean scores meaningfully [5]. Radloff [2] originally proposed a four-factor model for the CES-D, comprising depressed affect, somatic complaints, interpersonal problems, and positive affect. This four-factor structure has been extensively replicated, particularly among Whites [3,6], including Australians [7][8][9]. For example, the National Longitudinal Study of Adolescent Health in the U.S. showed that the original four-factor structure fit better than the one- and three-factor models [10]. Moreover, the original four-factor structure has been identified in other racial/ethnic groups [11,12], including Asians [13][14][15]. A study [13] using a sample of 1200 for each of five Asian countries (Indonesia, Korea, Myanmar, Sri Lanka, and Thailand) showed that the original four-factor structure was replicated in all countries.
However, many studies have revealed other factor structures for the CES-D in multiple cultural groups, specifically in Asian populations [16][17][18][19][20]. For example, a meta-analysis [3] using confirmatory factor analysis (CFA) across African Americans, American Indians, Asians, Whites, and Hispanics demonstrated that the original four-factor structure was acceptable for all ethnic groups except the Asian group (N = 65,554, k = 16). Two- and three-factor structures have also been proposed as possible factor structures for the CES-D; the two-factor structure comprises negative affect (depressed affect, somatic complaints, and interpersonal problems) and positive affect, whereas the three-factor structure comprises depressed affect and somatic complaints, interpersonal problems, and positive affect.
In addition to measurement invariance, researchers must address complications regarding translation equivalence when examining cross-cultural differences [21]. Indeed, CES-D scores and factor structures were found to be strongly influenced by participants' language [22]. Therefore, to facilitate accurate comparison, we selected India, the Philippines, and Singapore as English-speaking Asian countries.
People in Asian cultures have reported higher scores on the CES-D than those in Western cultures [13,14], particularly for somatic complaints, positive affect, and interpersonal problems. Although somatic symptoms are a common feature of depression in many countries [23], non-Westerners, particularly Asians, tend to show stronger somatic symptoms than Westerners [24,25]. Several explanations for this tendency among Asians have been proposed, one of which is that cultural differences in symptom presentation derive from variations in processing or expressing affect [25]. For example, although Asians tend to pay more attention to somatic symptoms than Westerners, their somatization tends to be accompanied by lower levels of interoceptive accuracy [25], that is, the ability to accurately estimate the magnitude of one's bodily changes. Regarding positive affect, individuals in Asian cultures tend to avoid or inhibit expressing positive emotions because they are less likely to desire maximization of such emotions and minimization of negative emotions. Instead, they tend to promote a more balanced perspective on positive emotions, in contrast to Westerners, who tend to consider positive emotions to be functional and desirable [24]. Concerning interpersonal problems, research on cross-cultural differences in interpersonal relationships has shown that Asians, compared to Westerners, tend to emphasize respecting and living in harmony with others (collectivism) [26]. Sharing emotions in collectivist cultures, including Asian cultures, involves ensuring that others share the concern and behave accordingly, whereas in individualist cultures, including American and Australian cultures, it involves sharing information. As such, Asians are more likely than Westerners to regard interpersonal problems as important and tend to report more interpersonal problems.
Lastly, concerning depressed affect, some researchers [27] suggest that Asians tend to exhibit somatic symptoms rather than psychological symptoms, including depressed affect. However, the emphasis on somatic symptoms is not a minimization or denial of depressed affect, and cultural differences in depressed affect between Westerners and non-Westerners are attributable to variations in constituent symptoms rather than the actual level of depressed affect [28]. Specifically, among psychological symptoms assessed in a clinical interview, Chinese psychiatric outpatients reported less hopelessness than Euro-Canadian psychiatric outpatients but greater suppressed emotions and depressed mood [28]. Indeed, Asians report higher levels of depressed affect than do Westerners [13,14].
The present study compared the measurement invariance of one-, two-, three-, and four-factor models of the CES-D, as well as CES-D scores, between Asians and Westerners. Few studies have examined cross-cultural differences in CES-D scores between Whites and Indians, Filipinos, or Singaporeans, particularly using the same language. Nevertheless, given the previous theoretical work on cross-cultural differences in self-reported depressive symptoms, we hypothesized that Whites would report lower CES-D scores than the Asian groups.

Participants and Procedures
Two surveys, the 2013 and 2014 Coping and Health Survey [29][30][31], were conducted using web-based panels of the polling organization Rakuten Research. The panels comprised more than 4,012,000, 826,000, 944,000, 160,000, and 392,000 members in the U.S., Australia, India, the Philippines, and Singapore, respectively, who had registered and received one ID per person. Participants in the U.S. and Australia and those in India, the Philippines, and Singapore were selected from the data pools of the 2013 [29,30] and 2014 Coping and Health Survey [31], respectively. The data pool of the former comprised 500 Americans and 500 Australians, and that of the latter comprised 300 Indians, 300 Filipinos, and 300 Singaporeans. The details of the survey were sent by e-mail to potential participants, who were English speakers aged 20 to 79 years. The data in these surveys were collected so that the sample was almost evenly divided by gender and age in each country in each survey.
Participants (782 men and 824 women) in the present study included White Americans, White Australians, Indians, Filipinos, and Singaporeans. The demographic characteristics of each sample are shown in Table 1. Although Singapore has three main ethnic groups (Chinese, Malay, and Indian), only Chinese and Malay participants were selected for this study to avoid overlap with the Indian sample. There were three Indian participants.

Measures
In addition to the American population [2], the reliability and validity of CES-D scores have been established in Australian [7][8][9], Indian [17], Filipino [18,19,32], and Singaporean [33,34] populations. Additionally, the reliability and validity of CES-D scores obtained through Internet surveys have been supported [33,35,36]. Participants in this study rated each of the 20 CES-D items according to their experiences within the past week on a 4-point Likert scale ranging from 0 (rarely or none of the time, less than 1 day) to 3 (most or all the time, 5-7 days).
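The response format described above can be illustrated with a minimal scoring sketch. It assumes the standard CES-D scoring convention, in which the four positively worded items (items 4, 8, 12, and 16) are reverse-scored before summing; this convention and the function name `cesd_total` are assumptions for illustration and are not described in this section.

```python
# Positively worded items; the item numbers follow the standard CES-D
# scoring convention and are an assumption, not taken from this section.
POSITIVE_ITEMS = {4, 8, 12, 16}


def cesd_total(responses):
    """Return the CES-D total score (0-60) from 20 responses coded 0-3.

    responses[0] is item 1 and responses[19] is item 20.
    """
    if len(responses) != 20:
        raise ValueError("expected 20 item responses")
    total = 0
    for item_number, value in enumerate(responses, start=1):
        if value not in (0, 1, 2, 3):
            raise ValueError(f"item {item_number}: response must be 0, 1, 2, or 3")
        # Reverse-score positive items so that higher always means more symptoms.
        total += (3 - value) if item_number in POSITIVE_ITEMS else value
    return total
```

Under this convention, a respondent who answers 0 to every item still receives 3 points on each of the four reverse-scored items.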

Data Analyses
First, we performed separate CFAs using the maximum likelihood method to test the fit of the one-, two-, three-, and four-factor models. Second, the measurement invariance of the four models across the five samples was tested using CFA, which comprised the assessment of configural invariance (equivalence of factor structure across groups), metric invariance (equivalence of factor loadings across groups), and scalar invariance (equivalence of intercepts across groups). The configural invariance model served as the baseline model. The standardized root-mean-square residual (SRMR; the most sensitive to mis-specified factor covariances), the comparative fit index (CFI; the most sensitive to mis-specified factor loadings), and the root-mean-square error of approximation (RMSEA; a measure of lack of fit per degree of freedom) were used as fit indices. According to the criterion proposed by Tanaka [37], CFI values of 0.95 or greater are considered good fits, whereas CFI values greater than 0.90 are acceptable. RMSEA values of 0.06 or lower are optimal, while an SRMR value of 0.08 or lower is acceptable. The expected cross-validation index is used to compare the fit of different models; the model with the smallest value is preferred. As chi-square statistics are known to be sensitive to sample size, the ratio of the change in chi-square to the change in degrees of freedom (Δχ²/Δdf) and changes in CFI (ΔCFI), SRMR (ΔSRMR), and RMSEA (ΔRMSEA) were used to compare the models. Models with a ΔCFI of 0.01 or lower, supplemented by a ΔSRMR of 0.030 or lower or a ΔRMSEA of 0.015 or lower [38], and a Δχ²/Δdf of 5 or lower [39] were considered to have a good fit to the data when CFA was conducted with a sample size of 1606. Additionally, cross-cultural differences in CES-D scores were compared using factor means between countries. Data were analyzed using SPSS version 22 (IBM, Armonk, NY, USA) and AMOS 22.0 (IBM, Armonk, NY, USA).
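The model-comparison criteria above can be summarized as a simple decision rule. The following is a minimal sketch, not the procedure implemented in AMOS: the `Fit` container, the function name `invariance_supported`, the sign conventions for the change scores, and the small floating-point tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass

EPS = 1e-9  # tolerance for floating-point comparisons at the cutoff values


@dataclass
class Fit:
    """Fit statistics of one multi-group CFA model (hypothetical container)."""
    chi2: float
    df: int
    cfi: float
    srmr: float
    rmsea: float


def invariance_supported(less_constrained, more_constrained):
    """Apply the cutoffs: ratio <= 5, dCFI <= 0.01, and dSRMR <= 0.030 or dRMSEA <= 0.015."""
    d_chi2 = more_constrained.chi2 - less_constrained.chi2
    d_df = more_constrained.df - less_constrained.df
    if d_df <= 0:
        raise ValueError("the more constrained model must have more degrees of freedom")
    # Assumed sign conventions: a drop in CFI and a rise in SRMR/RMSEA mean worse fit.
    d_cfi = less_constrained.cfi - more_constrained.cfi
    d_srmr = more_constrained.srmr - less_constrained.srmr
    d_rmsea = more_constrained.rmsea - less_constrained.rmsea

    ratio_ok = d_chi2 / d_df <= 5.0 + EPS
    cfi_ok = d_cfi <= 0.01 + EPS
    supplement_ok = d_srmr <= 0.030 + EPS or d_rmsea <= 0.015 + EPS
    return ratio_ok and cfi_ok and supplement_ok
```

In use, the configural model would be passed as the less constrained model when evaluating metric invariance, and the fit statistics would be read from the software output rather than typed in by hand.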

Results
The mean item and total scores of the CES-D by sample are shown in Table 2. Total CES-D scores were ranked in descending order as follows: Indians, Singaporeans, Filipinos, and White Americans and Australians. The fit indices in the CFAs of all models by sample are shown in Table 3. In the White American and Australian samples, all models except the one-factor model met the acceptable cutoff criteria (CFI ≥ 0.90 with SRMR ≤ 0.080 and RMSEA ≤ 0.080). Among Indians, Filipinos, and Singaporeans, only the four-factor model met the acceptable cutoff criteria. The one-factor model showed a poor fit to the data for all samples. The CFA results indicated that only the four-factor model was acceptable across all samples. CFAs were used to test the measurement invariance of the four-factor model across all samples (Table 4). The results of all models excluding the four-factor model are shown in S1 and S2 (Supplementary materials). The respective fit values for metric invariance were acceptable in the four-factor model based on the cutoff criteria (Δχ²/Δdf ≤ 5.0 and ΔCFI ≤ 0.01 with ΔSRMR ≤ 0.030 or ΔRMSEA ≤ 0.015). However, the ΔCFI criterion for scalar invariance was not met, although the other fit values for scalar invariance met the criteria in the four-factor model. Therefore, to establish partial scalar invariance in the four-factor model, the constraints on the loadings and intercepts of items 4, 7, 10, 11, 16, 17, 18, 19, and 20 were released based on Steenkamp and Baumgartner's [40] recommendation that, ideally, more than half of the items on a factor should be invariant. The results showed that the fit indices for partial scalar invariance in the four-factor model were adequate. Finally, factor means, which were corrected for noninvariant items, in the four-factor model were compared between samples. Factor means for somatic complaints showed that White Australians (M = 0. 35

Discussion
The four-factor structure of the CES-D has been well validated [5,6,11,12], and our findings replicated it. Previous studies validating the four-factor structure of the CES-D in Asian countries largely used Chinese samples [14,15] with a non-English-language version of the CES-D. In contrast, this study used Indian, Filipino, and Singaporean samples with an English-language version of the CES-D. Therefore, our finding that the four-factor structure of the CES-D was established in Asian populations may contribute to future research on the CES-D with Asians. However, our findings cannot exclude the possibility of other factor structures of the CES-D, which have been proposed in Asian cultures [16][17][18][19][20].
Although our findings replicated the four-factor structure of the CES-D, we did not examine its clinical validity or usefulness. In Asian cultures, positive items in self-report questionnaires on depression may not be useful as markers of depression [24]. For example, among European and Asian Americans born in the U.S., the intensity of positive emotions was negatively associated with depressive symptoms; however, this was not the case among Asians who had immigrated to the U.S. [41]. Moreover, some studies of the CES-D among Asians [15] have also questioned the clinical usefulness of the positive affect subscale. For example, in a study of Chinese patients [15], excluding the positive affect items provided a better screening tool for depression than the original CES-D.
Before we compared CES-D scores among the samples, CFAs conducted to test the measurement invariance across the five samples revealed that the goodness-of-fit statistics of the four-factor model were acceptable. This result indicated that the mean CES-D subscale scores of the four-factor model were comparable across the five samples. We hypothesized that CES-D scores among Whites would be lower than those among Asians. Total CES-D scores were ranked in descending order as follows: Indians, Singaporeans, Filipinos, and White Americans and Australians. Factor means for somatic complaints were lower among White Americans and White Australians than among Indians. Furthermore, factor means for depressed affect were lower among White Americans than among Indians, whereas White Australians showed lower scores than all Asian samples. Factor means for low positive affect among White Americans and Australians were significantly lower than those among Singaporeans. Factor means for interpersonal problems were lower among White Americans than among Indians, whereas White Australians showed lower scores than all the Asian samples. In sum, the CES-D subscale scores among Whites were lower than those among Asians, with the gap being greatest between Whites and Indians. Additionally, factor means of all subscales among Indians were the highest among all Asian samples, although we did not hypothesize about differences in CES-D scores among the Asian samples in this study. These findings are addressed in the following paragraphs.
First, this study showed that overall, Indians reported the highest CES-D scores when compared to other samples, particularly concerning somatic complaints. Somatic symptoms are core diagnostic symptoms of depression for Indians and are the presenting complaints for 97% of Indian primary care patients [42]. One possible reason for this is that yoga plays a central and definitive role in many Indians' lives. Yoga is a spiritual and physical discipline in Hinduism, which is the main religion of India; although a more spiritual meditation than physical exercise, it helps Indians focus on bodily changes in the moment. Therefore, Indian people may have greater somatic awareness and show stronger somatic symptoms than people in other countries.
Second, although India, the Philippines, and Singapore had all been colonized by countries in Europe and America, the colonial periods in the Philippines and Singapore were longer than that in India; thus, Filipino and Singaporean cultures have been more heavily influenced by Western culture than Indian culture has. Such cultural differences might influence the differences in CES-D scores in our findings. Some studies found no significant differences between Filipinos and Whites concerning CES-D scores, although, to our knowledge, no study had specifically examined differences in CES-D scores between them. A previous study [32] conducted on college students in Hawaii showed no significant differences in CES-D total scores between European Americans and Filipinos, although Filipinos' scores (M = 16.51, SD = 10.96) were somewhat higher than European Americans' scores (M = 15.09, SD = 9.70). Additionally, a previous study [34] suggested that the CES-D total score among Indians was higher than that among Singaporeans, although, to our knowledge, the current study was the first to examine differences in CES-D scores between Filipino, Singaporean, and Indian populations.

Limitations
Some limitations must be considered when interpreting our findings. First, the generalizability of the current findings to other samples may be limited. Of the many Asian countries, we selected only three in order to address the translation equivalence of the CES-D. Furthermore, we did not obtain data from White Europeans or Canadians. In other words, our samples are not entirely representative of Asian or White populations. Additionally, the Asian samples in this study may not have been representative of Indians, Filipinos, or Singaporeans, as we included only English-speaking participants. Moreover, the CES-D scores were obtained through a web-based survey; the data may differ if collected via other methods, such as telephone, interviews, or paper-based self-report. A previous study [35] conducted in the Netherlands found that the mean CES-D total score among adolescents recruited through the Internet (M = 15.4, SD = 11.8) was significantly higher than that of adolescents recruited through schools (M = 9.7, SD = 8.4). Further, scores of depressive symptoms obtained through the Internet were greater than those obtained by paper-based self-report [43].
Second, although the results of this study suggested that the four-factor model was an acceptable structure for the CES-D, this does not imply that the four-factor structure is more useful for clinical or diagnostic research on depression than other factor structures. Additionally, in other ethnic groups, the one-, two-, three-, or other factor models may provide a better structure for the CES-D.
Third, the findings on the differences in CES-D scores need to be interpreted with caution. Partial scalar invariance in the four-factor model was acceptable in our samples, but full scalar invariance was not. Standards for partial invariance are inconsistent [44]; however, partial invariance has been reported for approximately one-third of tests [44]. Nevertheless, the findings on the differences in CES-D scores contribute to research on the CES-D.
Finally, although our results showed that Asians reported higher CES-D scores than Whites, the prevalence of depression among Asians is not necessarily higher than that among Westerners when diagnosed via an interview with a psychiatrist. The World Health Organization [45] reported that the prevalence of current depression, according to Composite International Diagnostic Interviews, was 9.1% in India, 4.0% in China, 6.3% in the U.S., and 4.7-16.9% among Western Europeans. This discrepancy may suggest that cross-cultural differences in CES-D scores do not necessarily correspond to those in the prevalence of depression as diagnosed by a psychiatrist.

Conclusions
The CES-D has been widely and frequently used to measure depressive symptoms in the general population. Although the appropriate structure of the CES-D differed across cultures, the four-factor model was the most adequate across all cultures. Despite some limitations, our findings among English-speaking Whites and Asians indicate that the mean CES-D scores in Whites were lower than those in Asians; the gap was particularly large between Whites and Indians.

Declarations
All participants were selected from the data pools of the 2013 [29,30] and 2014 Coping and Health Survey [31], as described in the Methods section. The Coping and Health Survey was administered for several purposes. The present study used the CES-D data, which were part of the data obtained by the 2013 and 2014 Coping and Health Survey. The mean CES-D scores in the U.S., Australian, and Chinese samples were reported by Kato [29] using the data from the 2013 Coping and Health Survey. The mean CES-D scores in the Indian, Filipino, and Singaporean samples were reported by Kato [31] using the data from the 2014 Coping and Health Survey. Based on PLoS ONE editorial policies regarding the sharing of materials and data, the raw data obtained from the 2013 Coping and Health Survey were made freely available (Supporting Information: Dataset S1).

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/ijerph18105298/s1, Table S1: Mean and covariance structure analysis for other models excluding four-factor model of the CES-D, Table S2: Differences in factor means between samples for the four-factor model.