1. Introduction
Published literature on the influence of religion and spirituality (R/S) on human health has grown substantially in recent years (
Demir 2019;
Koenig 2012), yet R/S measures are still not prioritized in large prospective studies investigating the etiology of disease (
Shields and Balboni 2020). Exceptions occur among those cohorts with an explicit focus on R/S and health (
Smith and Faris 2002) or in cohorts that focus on a specific religious group (e.g., the Adventist Health Study) (
Butler et al. 2008;
Beeson et al. 1989). On the whole, however, many gaps remain. An analysis of all surveys fielded by 20 large U.S.-based prospective cohorts reveals, for example, that only 7 of the 20 cohorts had ever collected at least three different R/S measures in their history (
CGVH 2019).
This oversight is important since the potential influence of R/S on health likely differs across racial/ethnic communities. Enough evidence exists demonstrating the importance of R/S in health research (
Koenig 2012;
VanderWeele 2017) that examination of understudied groups must become a priority. We know, for example, that more than 75% of African Americans and 59% of Latinos say their religion or spirituality is very important in their lives, compared to only 49% of white Americans (
Pew Research Center 2014). Despite this, R/S research has received almost no attention in other minority groups, such as the U.S. South Asian and American Indian communities. Minorities bear a disproportionate burden of chronic disease (
Cockerham et al. 2017), and understanding how R/S functions in different minority communities may inform novel interventions to reduce health disparities.
It is within this context that the Study on Stress, Spirituality, and Health (SSSH) and the National Consortium on Psychosocial Stress, Spirituality, and Health were launched. The SSSH is a “cohort of cohorts”, collecting R/S and other psychosocial and health data needed to better understand how stressors “get under the skin” to increase risk of disease and disparities in the burden of illness among socially disadvantaged communities. Five prospective cohort studies currently participate in the SSSH, including American Indian, Black, Hispanic/Latino, South Asian, and white cohorts. The eventual goal of the SSSH is to evaluate which R/S measures are the most strongly associated with chronic disease, and to identify biological pathways or mechanisms through which R/S operate to affect health. However, initial validation of the survey is a necessary first step. The Baseline Spirituality Survey (SS-1) combines existing measures on religious coping (RCOPE) (
Pargament et al. 2011), daily spiritual experiences (DSES) (
Underwood and Teresi 2002;
Underwood 2011), and other R/S beliefs and experiences with de novo R/S items developed through qualitative research in multi-ethnic focus groups from participating cohorts.
The objective of this report is to describe an initial psychometric evaluation of the SS-1 using SSSH data on 4563 participants from five SSSH cohorts: the Black Women’s Health Study (BWHS) (
Russell et al. 2001), Hispanic Community Health Study/Study of Latinos (HCHS/SOL) (
Lavange et al. 2010), Mediators of Atherosclerosis among South Asians Living in America (MASALA) Study (
Kanaya et al. 2013), Nurses’ Health Study II (NHSII) (
Bao et al. 2016), and the Strong Heart Study (SHS) (
Lee et al. 1990). In addition to psychometric validation, we looked at several possibilities for item reduction among our existing scales and examined associations with the physical and mental health components of the SF-12, a widely used measure of functional health (
Ware et al. 1996). Establishing psychometric validity in our multi-ethnic sample provides a foundation for ongoing and future work in the SSSH.
2. Methods
2.1. Spirituality Survey (SS-1) Development
As a first step, our team conducted a systematic literature review to identify validated R/S measures and scales shown to be associated with a broad range of health outcomes. For several well-established scales (e.g., RCOPE, DSES), we worked directly with the scale authors to identify a parsimonious set of sub-items to include in our survey. Focus groups and/or key informant interviews were conducted among African American, Hispanic/Latino, American Indian, and South Asian cohort participants, community members, and religious/spiritual leaders to understand the meanings of religion and spirituality in these different cultural contexts and to help identify the most salient R/S concepts. Once a list of R/S measures and scales was identified, we worked with SSSH investigators, cohort PIs, and R/S experts to identify constructs important to one or more of our ethnic populations, but not captured in existing validated scales. Several de novo measures were developed to address these gaps. The full questionnaire was then field-tested in each cohort and language iteratively modified to address cultural expectations while maintaining item validity across all cohorts. The final SS-1 was approved by each participating cohort.
The SS-1 consists of 82 R/S items assessing the following areas: (a) Religious activities; (b) Closeness to God; (c) Religious coping (positive religious coping and negative religious coping/spiritual struggles); (d) Gratitude;
and (e) Non-theistic daily spiritual experiences. The survey items were prefaced with the following statement: “These questions are being asked of people from different religious backgrounds, and although we use the term ‘God’ in some of the questions below, please substitute your own word for ‘God’ (e.g., Bhagwan, Allah, The Divine, etc.).” Questions regarding one’s relationship with God were asked only of those survey participants who said they believed in [God], and questions addressing one’s experience with their religious congregation or community were asked only of those respondents who indicated that they belonged to a religious congregation or community.
2.2. SS-1 Dimensions and Scales
Individual Items. The SS-1 contains a number of individual items that measure respondents’ attitudes, beliefs, and practices, including respondents’ views of organized religion, the extent to which they identify as a religious or spiritual person, their membership in a religious congregation or community, their relationship to that congregation or community, and beliefs about God and the afterlife (see
Supplementary Table S1). The majority of these individual items were not assessed in the current study, as the focus here is on scale validation. SS-1 scales are detailed in the following material.
Religious Activities (RAS). This SS-1 section contains seven items designed to capture the frequency of religious activities, such as praying in groups or alone, reading scriptures, meditating, and practicing yoga or Tai Chi. Respondents’ engagement in each religious activity was rated on a 7-point scale, from “never” to “several times a day.”
Closeness to God (CtoG). The CtoG scale contains ten items assessing how people relate to God (e.g., “I feel God’s love”, “God’s spirit dwells in my body”, “God is the center of my life,” etc.), including four selected items from the Duke University Religion Index (
Koenig and Büssing 2010) and the theistic items from the Daily Spiritual Experience Scale (
Underwood and Teresi 2002), as well as six de novo items that address perception of relationship to God. Response categories reflect a 5-point scale ranging from “definitely not true of me” to “definitely true of me.”
Religious Coping. Multiple items assess the use of R/S in coping with stress, including positive (e.g., using R/S to cope with stressful situations) and negative (e.g., doubting God’s love, or feeling that God is punishing me) coping. We ask a single item question from the Brief Multidimensional Measure of Religiousness/Spirituality (BMMRS) (“To what extent is your religion or spirituality involved in understanding or dealing with stressful situations in any way?”), with four response categories ranging from “not involved at all” to “very involved” (
Fetzer Institute 1999).
The SS-1 also includes a 10-item scale capturing the construct of positive religious coping in dealing with stressful events. Eight of these items were selected from Kenneth Pargament’s well-validated RCOPE (
Pargament et al. 2000), with two sub-items selected from each of four RCOPE sub-domains (items RC 1–2 and 7–12,
Supplementary Table S1). These sub-domains were selected by the study PI and Pargament based on the salience of specific sub-domains in the diverse racial/ethnic communities represented in the SSSH. A de novo sub-domain reflects R/S as critical to maintaining hope in the face of adversity. This sub-domain emerged as a central means of coping with difficult life circumstances in focus groups with African American and Hispanic/Latino participants, but was not included in the RCOPE. Positive Religious Coping items are rated on a 4-point scale (“not at all”, “somewhat”, “quite a bit”, and “a great deal”).
The SS-1 also includes 8 items capturing the construct of negative religious coping in dealing with stressful events, six from the RCOPE (
Pargament et al. 2000), reflecting three RCOPE sub-domains. Two of the original five items under each sub-domain were selected (RC 3–6 and 13–14,
Supplementary Table S1). Two items from the Doubt subscale of the validated Religious and Spiritual Struggles Scale (i.e., RC 17–18: “I felt confused about my religious or spiritual beliefs” and “I felt troubled by doubts or questions about my religion or spirituality”) were also included (
Exline et al. 2014). All items are rated on a 4-point scale (“not at all,” “somewhat,” “quite a bit,” and “a great deal”).
Gratitude (GQ). The SS-1 includes two of the original six items measuring dimensions of gratitude from the Gratitude Questionnaire-6: (“I have so much in life to be grateful for” and “If I listed everything that I felt grateful for, it would be a very long list”). The items are rated on a 5-point scale (“strongly disagree”, “somewhat disagree”, “neutral”, “somewhat agree”, and “strongly agree”) (
McCullough et al. 2002).
Non-Theistic Daily Spiritual Experiences (NT-DSES). This section has four items measuring non-theistic daily spiritual experiences from Underwood’s Daily Spiritual Experiences Scale (i.e., “I experience a connection to all of life”; “I feel deep inner peace or harmony”; “I am touched by the beauty of creation”; and “I feel a selfless caring for others”) rated on a 5-point scale (“never” to “many times a day”) (
Underwood and Teresi 2002).
2.3. Study Population and Survey Procedures
The psychometric properties of the SS-1 were evaluated using all 4563 SSSH participants drawn from the five initial U.S. cohorts in the SSSH: BWHS, HCHS/SOL, MASALA, NHSII, and SHS. Brief cohort descriptions follow.
BWHS began in 1995 to investigate breast cancer and other diseases that disproportionately affect Black women. In 2015, approximately 4000 participants who had completed the most recent wave of data collection were invited to complete the SS-1; more than 2400 women responded within the first two weeks of recruitment, and enrollment was stopped. A random sample of 1000 of these participants was included in SSSH. Comparisons to the full BWHS cohort indicate a high degree of comparability on available religious measures (e.g., religious attendance and the degree to which respondents consider themselves a religious/spiritual person) (
Cozier et al. 2018). The sample represents a full range of socioeconomic levels and all geographic regions of the U.S.
HCHS/SOL targets both immigrant and U.S.-born Hispanic/Latinos in four U.S. cities (total cohort N = 16,415), with the aim of assessing the role of acculturation in the etiology of cardiovascular disease and related conditions. To be eligible for the SS-1 at the time of collection in 2018–2019, participants had to be from the Chicago site, to have completed the most recent round of data collection, and to have participated in HCHS/SOL’s Sociocultural Ancillary Study (N = 900, response rate 754/900 = 83.8%). An additional 244 participants were recruited through letters sent to the broader sample of Chicago site participants to reach the desired study population of 1000. The SSSH sample is generally comparable to the full HCHS/SOL cohort, though variations occur on the handful of comparison items available (i.e., the SSSH sample has a slightly higher proportion of religious affiliates but attends religious services slightly less often; see Lerman et al. 2018).
MASALA examines risk factors for atherosclerosis among South Asians, with participants drawn from the Chicago and San Francisco Bay areas. To be eligible for MASALA, respondents must have had at least three grandparents born in India, Pakistan, Bangladesh, Nepal, or Sri Lanka. All participants (total cohort N = 990) were invited to complete the SS-1 in 2016–18, and only one declined.
NHSII was established in 1989 among 116,429 women who responded to the baseline and subsequent biennial follow-up questionnaires to investigate risk factors for major chronic diseases in women, and is comprised of nurses from 14 states who are predominantly white. R/S data collection occurred from 2015–16, and eligibility included provision of at least two blood samples, being age 45–75 at the time of the most recent blood draw (2010–13), completion of four questionnaires (2001 violence, 2008 trauma, and 2013 and 2015 main questionnaires), and no active participation in an ongoing ancillary study. Approximately the first 1100 women who completed the survey were enrolled. Comparisons to the larger cohort indicate almost identical levels of religious service attendance (
Spence et al. 2020).
SHS is one of the largest prospective cohort studies of American Indians, and is focused on cardiovascular disease. In 2017–18, participants for the SS-1 were drawn from the Dakotas region; they had to be part of phase IV or V and to have completed the previous two rounds of data collection. Community workers held community events and reached out to SHS participants, most often conducting home visits to assist with completion of the SS-1. Religious comparison measures were not available for the larger cohort.
SSSH Survey data were collected using the established procedures for data collection within each cohort. BWHS and NHSII participants completed a web-based version of the survey accessed through an emailed link. Participants from MASALA completed the survey during an in-person clinical visit, or by mail, if they had already completed their most recent clinical visit. Participants from HCHS/SOL and SHS completed the survey either via mail, over the telephone, or in person. Each cohort also provided historic survey data for all cohort participants who completed the SS-1 and were thus enrolled in the SSSH, including demographic, psychosocial, lifestyle, behavioral, and clinical data. All data were sent to the Harvard/MGH Center on Genomics, Vulnerable Populations, and Health Disparities, where data elements were harmonized and incorporated into the SSSH analytic file. Procedures were approved by each cohort’s Ancillary Study Committee and Institutional Review Board (IRB), as well as the Partners Human Research Committee.
2.4. Psychometric Analyses
The primary goal of this paper was to determine the psychometric properties of the SS-1, including reliability (internal consistency [α]), item adequacy (adjusted item-to-scale correlations), and the item-level factor structure, where appropriate, using the five-cohort pooled sample.
We also sought to identify opportunities where the same construct captured by a scale could be reliably measured with fewer items. Scale-level factor structure was assessed using Principal Axis Exploratory Factor Analysis (
Thompson 2004). The SS-1 also contains several categorical and nominal items (e.g., “To what extent do you view organized religion as positive or negative?”) that were answered on a 5-point scale (e.g., “very positive” to “very negative”). For select categorical and nominal items, we reported the response distribution, as well as floor and ceiling effects.
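The two item-level statistics named above, internal consistency (α) and the adjusted item-to-scale correlation, can be illustrated with a minimal sketch. It assumes the standard formula for Cronbach's alpha and takes the adjusted correlation to be the correlation of each item with the sum of the remaining items; the function names and the synthetic data are hypothetical and are not drawn from the SSSH analytic file.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the scale total
    return float(k / (k - 1) * (1 - sum_item_var / total_var))

def adjusted_item_scale_corr(items: np.ndarray) -> np.ndarray:
    """Correlate each item with the total of the *other* items."""
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Hypothetical data: five items driven by one latent trait plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
scores = latent[:, None] + rng.normal(scale=0.8, size=(500, 5))

alpha = cronbach_alpha(scores)
r_adj = adjusted_item_scale_corr(scores)
print(f"alpha = {alpha:.2f}")
print("adjusted item-to-scale r:", np.round(r_adj, 2))
```

Removing the item from its own total before correlating avoids inflating the coefficient, which is why an item such as one with r = 0.08 can be flagged against a 0.30 lower boundary.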
2.5. Initial Validity Analysis
We sought to obtain initial evidence of validity for the SS-1 scales as predictors of respondents’ functional health status using the Short-Form 12 Health Survey (SF-12), a validated scale comprised of physical health (SF-12 PCS) and mental health (SF-12 MCS) components (
Ware et al. 1996). First, we calculated partial correlations (controlling for age, sex, and race/ethnicity) among the SS-1 sections identified as having adequate reliability (five SS-1 sections or scales met this criterion) and relevant nominal variables contained within the SS-1 (e.g., the extent of being a religious or spiritual person; view of organized religion; being a member of a religious congregation or community) and the SF-12 PCS and SF-12 MCS. Given the size of our sample, even trivial correlations would achieve statistical significance. We therefore treated only correlations with an absolute value ≥ 0.15 as meaningful (the 95% Confidence Interval [CI] for r = 0.15 in a sample of 4000 is 0.12 to 0.18). This threshold seemed appropriate because the lower bound of its 95% CI exceeds 0.10, which is Cohen’s lower boundary for a small effect size (
Cohen 1988).
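The confidence interval quoted for r = 0.15 can be reproduced with a short calculation. The sketch below assumes the usual Fisher z-transformation with a standard error of 1/√(n − 3); the text does not state which method was used, so this is one plausible route to the same numbers, and the function name is hypothetical.

```python
import math

def fisher_ci(r: float, n: int, z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate CI for a Pearson correlation via Fisher's z-transformation."""
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

low, high = fisher_ci(0.15, 4000)
print(f"{low:.2f} to {high:.2f}")  # 0.12 to 0.18
```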
To test group differentiation, the sample was divided into groups based on whether or not they reported being part of a religious congregation or community. This grouping variable produced an almost 50/50 split in our sample. Between-groups t-tests were conducted for the five SS-1 domains/scales and the SF-12 component scales. The t-test results and effect size measure (Cohen’s d) are presented. To assess predictive validity, we conducted two stepwise multiple regression analyses exploring the extent to which the SS-1 domains/scales could predict scores on the SF-12 PCS and SF-12 MCS. Analyses utilized sampling weights provided by HCHS/SOL; participants in the other cohorts were assigned a weight of 1.
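For readers unfamiliar with the effect size used in the group comparison, a minimal sketch of Cohen's d follows. It assumes the pooled-standard-deviation form of d and ignores sampling weights; the text does not specify the variant used, and the function name and toy scores are hypothetical.

```python
import numpy as np

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Cohen's d for two independent groups, using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * np.var(a, ddof=1) + (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return float((np.mean(a) - np.mean(b)) / np.sqrt(pooled_var))

# Toy scores (hypothetical): group means 2 and 3, both with SD 1.
non_members = np.array([1.0, 2.0, 3.0])
members = np.array([2.0, 3.0, 4.0])
d = cohens_d(non_members, members)
print(d)  # -1.0
```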
3. Results
3.1. Participant Characteristics
As shown in
Table 1, the SSSH participants included in this analysis represent individuals from five racial/ethnic groups: non-Hispanic White (24.4%, n = 1115), Black (22.0%, n = 1005), South Asian (21.8%, n = 996), Hispanic/Latino (17.9%, n = 818), and American Indian (13.8%, n = 629). Religious affiliations included Evangelical (11.2%, n = 510), Mainline Protestant (10.8%, n = 490), Black Protestant (15.0%, n = 682), Catholic (23.2%, n = 1054), Jewish (0.6%, n = 25), Hindu (13.3%, n = 605), Muslim (1.8%, n = 83), Jain (1.2%, n = 56), Sikh (1.1%, n = 50), Traditional Native American Practice (2.4%, n = 107), Other (7.9%, n = 361), No affiliation (6.12%, n = 278), Agnostic (2.6%, n = 120), and Atheist (1.9%, n = 87). The average age was 57.0 years, 74.3% were female, and 59.3% had an annual household income greater than
$50,000.
3.2. Response Patterns of Individual R/S Measurement Items
The SS-1 includes several individual (single item) R/S questions (both de novo and validated) in addition to scales. As shown in
Table 1, most subjects viewed organized religion as either very positive (22.2%, n = 788) or positive (38.2%, n = 1357), while a few viewed religion as negative (6.9%, n = 245) or very negative (2.5%, n = 90). Most subjects classified themselves as both spiritual and religious (58.9%, n = 2655) and only 6.1% (n = 274) indicated they were neither spiritual nor religious. More than one-third (n = 1505) of subjects considered themselves to be very religious or spiritual. Half of the sample (49.9%, n = 2241) reported being part of a religious congregation or community. Of the subjects who were members of a community or congregation, 79.4% (n = 1776) felt they received love or care from their congregation/community “very often” or “fairly often” and 78.4% (n = 1755) felt they showed love or care to congregation members “very often” or “fairly often”. On the other hand, 9.7% of subjects (n = 217) felt their community or congregation was critical of them “very often” or “fairly often” and 5.2% (n = 115) felt ignored or neglected by other members of their religious community or congregation (“very often” or “fairly often”).
With respect to religious beliefs, nearly three quarters (74.1%, n = 3326) responded “definitely true of me” to the statement, “I believe that God exists.” Another 11.5% (n = 513) answered “tends to be true of me” and only 4.7% (n = 209) answered “definitely not true of me” to this question. The majority (56.0%, n = 2510) of subjects answered “definitely true” to the question, “I believe in life after death,” while only 7.0% (n = 315) endorsed “definitely not true.” The second largest group was “unsure” (19.4%, n = 868). Finally, one question asked: “When you think about God in relationship to your health, which of the following is closest to your view?” 42.2% (n = 1790) selected “My health is determined by my own actions;” 51.7% (n = 2191) selected “When it comes to my health, God and I both have a role to play;” and 6.2% (n = 262) selected “God determines my health, regardless of my own actions and behaviors”.
3.3. Religious Coping Scale Development
Exploratory Factor Analysis (EFA; Principal Axis Factoring with orthogonal rotation) was used to explore the underlying dimensional structure of the 18 positive and negative religious coping items (RC 1–18). The EFA produced 4 factors with Eigenvalues ≥ 1.00. Parallel analysis (PA) (
Horn 1965;
O’Connor 2019) and scree plot examination suggested two factors (true Eigenvalues: 7.54 [random PA Eigenvalues 1.17]; 3.52 [1.13]; 1.17 [1.11]; and 1.01 [1.09]). Examining the 2-factor solution revealed that the religious coping items separated clearly into two factors representing positive and negative R/S coping styles, with all items having strong primary loadings and no secondary loadings (>0.29) (
Table 2). The factor loadings for the first factor ranged from 0.74 (“I saw my situation as part of God’s plan,” RC-1) to 0.90 (“I trusted God would be on my side,” RC-12). Factor 1 also contained all four positive coping items from the RCOPE. Loadings on the second factor ranged from 0.46 (“I felt as though the devil or evil spirits were trying to turn me away from God,” RC-6) to 0.72 (“I wondered whether God had abandoned me,” RC-13). Factor 2 contained all the negative religious coping items from the RCOPE and the two Religious and Spiritual Struggles items. Based on these factor loadings, two scales were composed: the positive religious/spiritual coping scale (PRC), comprising 10 items, and the negative religious/spiritual coping scale (NRC), comprising eight items.
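The logic of parallel analysis, in which observed Eigenvalues are compared against those obtained from random data of the same dimensions, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses Eigenvalues of the unreduced correlation matrix, normally distributed random data, and a 95th-percentile threshold, none of which is confirmed by the text as the configuration actually used. It applies only the strict numerical rule, with no scree plot judgment. The function name and the synthetic two-factor data are hypothetical.

```python
import numpy as np

def parallel_analysis(data: np.ndarray, n_sims: int = 200,
                      percentile: float = 95, seed: int = 0):
    """Return observed eigenvalues, random-data thresholds, and factors to retain."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        sim = rng.normal(size=(n, p))  # uncorrelated data, same shape as the sample
        random_eigs[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False)))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Retain leading factors until the first one that fails to beat random data.
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return observed, threshold, n_factors

# Hypothetical data: 18 items, 10 driven by one latent factor and 8 by another.
rng = np.random.default_rng(1)
factors = rng.normal(size=(1000, 2))
membership = np.array([0] * 10 + [1] * 8)
demo = factors[:, membership] + rng.normal(size=(1000, 18))

observed, threshold, n_factors = parallel_analysis(demo)
print("factors retained:", n_factors)
```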
3.4. Internal Consistency of the SS-1 Scales
Religious Activities (RAS).
Table 3 displays the descriptive statistics, number of items, and alphas for each of the five SS-1 scales beginning with RAS. A review of response distribution revealed RAS-7 (Tai Chi) to have a non-normal distribution with excessive skewness (6.10) and kurtosis (39.9). As such, it was dropped from any additional analyses. A 6-item scale on Religious Activities (RAS) was evaluated and was found to have an acceptable internal consistency (α = 0.75), but RAS item 6 (yoga) had an adjusted item-to-scale correlation of 0.08, well below the 0.30 lower boundary, so it was also removed. The remaining five items produced a scale with good internal consistency (α = 0.80) and adequate adjusted item-to-scale correlations (0.43 [meditation, RAS-5] to 0.72 [individual prayer, RAS-2]). Exploratory Factor Analysis (EFA; Principal Axis Factoring) showed the five-item RAS to be uni-factorial, having one factor with an Eigenvalue of 2.84 accounting for 56.8% of the variance. The factor loadings ranged from 0.60 (meditation, RAS-5) to 0.87 (individual prayer, RAS-2).
Closeness to God (CtoG). Psychometric analysis revealed that these items suffer from several limitations. The 10 items have ceiling effects that are unacceptably high; they range from a low of 45.3% (CtoG-10) to a high of 65.9% (CtoG-5). Thus, the 10-item CtoG scale had limited score range (median score = 4.6; the modal score is the scale maximum of 5.0), excessive inter-item correlation (mean inter-item correlation = 0.73), and a coefficient alpha of 0.96 (
Table 3). Exploratory Factor Analysis (EFA; Principal Axis Factoring) showed the Closeness to God scale was highly uni-factorial, having a single factor with an Eigenvalue of 7.60, accounting for 76.1% of the variance. The factor loadings ranged from 0.78 (“I feel God’s love or care for me through others,” CtoG-1) to 0.90 (“My relationship with God lies behind my whole approach to life,” CtoG-7). These results suggest that psychometrically the 10 items are essentially identical and the construct could be well-measured with fewer items. To test this possibility, we composed three independent 3-item Closeness to God scales (three items are generally accepted as the minimum number necessary to accurately calculate internal consistency) (
Cohen 1988). CtoG-A (items 1, 3, and 5) had an alpha of 0.87 and item-to-scale correlations between 0.72 and 0.79; CtoG-B (items 2, 4, and 6) had an alpha of 0.92 and item-to-scale correlations between 0.82 and 0.84; and CtoG-C (items 8, 9, and 10) had an alpha of 0.91 and item-to-scale correlations between 0.80 and 0.84. All three 3-item CtoG scales demonstrated adequate psychometric functioning, supporting the use of fewer items to measure this dimension of R/S.
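As a rough consistency check on the 10-item CtoG figures, the reported mean inter-item correlation and coefficient alpha can be related through the standardized alpha formula. This is an assumption made for illustration: the text does not say whether raw or standardized alpha was computed, and the two generally differ slightly. Here k is the number of items and r̄ the mean inter-item correlation.

```latex
\alpha_{\text{std}}
  = \frac{k\,\bar{r}}{1 + (k - 1)\,\bar{r}}
  = \frac{10 \times 0.73}{1 + 9 \times 0.73}
  = \frac{7.30}{7.57}
  \approx 0.96
```

Under this assumption the result agrees with the reported alpha of 0.96, and it shows why a mean inter-item correlation this high leaves little room for additional items to add information.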
Religious Coping (single item). Among our sample, 87.0% (n = 3690) felt religion or spirituality was somewhat (30.1%) or very much (56.9%) involved in how they coped with stressful life situations. This single item showed a good response distribution.
Positive and Negative Religious Coping. The 10-item Positive Religious Coping (PRC) scale had an alpha of 0.95, with adjusted item-to-scale correlations ranging from 0.70 to 0.86, while the 8-item Negative Religious Coping (NRC) scale had an alpha of 0.84, with adjusted item-to-scale correlations ranging from 0.53 to 0.65. The findings for PRC (α > 0.90, factor loadings of 0.75 or better, and no item-to-scale correlations below 0.70) suggest that the scale could be reduced in length, perhaps by half, without suffering a significant loss of measurement power. Similarly, the NRC scale could be reduced by 1 or 2 items and maintain acceptable measurement properties. Two potential candidates for removal, both from the original RCOPE, are RC-6 (“I felt as though the devil, or an evil spirit was trying to turn me away from God”) and RC-5 (“I believed the devil or evil spirits were responsible for my situation”), as these were the two items with the weakest factor loadings.
Two 5-item scales, PRC5 and NRC5, were composed using the 5 highest loading items from each scale. PRC5 (RC items 8, 9, 11, 12, and 18) had an alpha of 0.94, with adjusted item-to-scale correlations ranging from 0.81 to 0.87, while NRC5 (RC items 3, 13, 14, 15, and 16) had an alpha of 0.81, with adjusted item-to-scale correlations ranging from 0.47 to 0.66 (data not shown). These 10 items also cleanly reproduced the two-factor solution and offer an option for assessing positive and negative religious coping with a more parsimonious measure.
Gratitude (GQ). We found these two items suffered from significant psychometric deficiencies, including ceiling effects (subjects scoring at the top of the scale) of 89.9%, along with severe kurtosis (24.3). As such, the gratitude items were dropped from further analysis.
Non-Theistic Daily Spiritual Experiences (NT-DSES). The four items showed excellent psychometric properties, with a coefficient alpha of 0.76 and adjusted item-to-scale correlations ranging from 0.46 (NTDSES-4) to 0.61 (NTDSES-2). The four items all loaded on a single factor, with an Eigenvalue of 2.34 accounting for 58.6% of the variance. The factor loadings ranged from 0.67 (NTDSES-4) to 0.81 (NTDSES-2).
3.5. Correlation Between SS-1 Scales
Partial correlation analyses (controlling for age, sex, and race/ethnicity) showed the five SS-1 scales to be mildly inter-correlated, with a mean inter-scale correlation of 0.36 (data not shown). However, some scale pairs showed high inter-correlation: Closeness to God and Positive Religious Coping (r = 0.84); Closeness to God and Religious Activities (r = 0.65); and Religious Activities and Positive Religious Coping (r = 0.61). No other scale pairs correlated at ≥0.50. This raises the possibility that these three scales, and especially Closeness to God and Positive Religious Coping, measure similar constructs.
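A partial correlation of this kind can be computed by regressing each scale on the covariates and correlating the residuals. The sketch below is a minimal illustration under that assumption; it is unweighted, the function name and simulated variables are hypothetical, and a categorical covariate such as race/ethnicity would first need to be dummy-coded into numeric columns.

```python
import numpy as np

def partial_corr(x: np.ndarray, y: np.ndarray, covariates: np.ndarray) -> float:
    """Correlation between x and y after removing linear effects of the covariates."""
    Z = np.column_stack([np.ones(len(x)), covariates])   # add an intercept column
    res_x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]  # residualize x
    res_y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]  # residualize y
    return float(np.corrcoef(res_x, res_y)[0, 1])

# Hypothetical data: two scale scores related only through a shared covariate.
rng = np.random.default_rng(2)
age = rng.normal(size=2000)
scale_a = age + rng.normal(size=2000)
scale_b = age + rng.normal(size=2000)

raw_r = float(np.corrcoef(scale_a, scale_b)[0, 1])
partial_r = partial_corr(scale_a, scale_b, age[:, None])
print(f"raw r = {raw_r:.2f}, partial r = {partial_r:.2f}")
```

In this toy case the raw correlation is substantial while the partial correlation is close to zero, which is the kind of confounding the adjustment for age, sex, and race/ethnicity is meant to remove.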
7. Discussion
This paper describes the initial psychometric evaluation (reliability and validity) of the baseline Spirituality Survey (SS-1) assessed among 4563 adult respondents from five prospective cohorts participating in the Study on Stress, Spirituality, and Health (SSSH). The SS-1 is designed to assess diverse R/S experiences, beliefs, and practices across ethnically diverse communities and diverse religious traditions. Responses demonstrated that religion and spirituality (R/S) are significant components of most participants’ lives. Our analyses revealed that the SS-1 contains five scales with acceptable psychometric properties; they also provided initial evidence supporting construct and predictive validity of measures and scales in the SS-1, offering solid evidence that SS-1 scales are meaningfully associated with a wide range of relevant religious/spiritual beliefs, attitudes and practices. Results further demonstrated that the SS-1 scales produced meaningful group differentiation according to key R/S constructs (e.g., participants who were or were not part of a religious or spiritual community) and showed predictive validity when evaluated against a well-validated measure of functional health status, the SF-12. The SS-1 scales were further able to predict both the physical health and mental health components of the SF-12. Together these findings provide initial support for the psychometric adequacy of the SS-1 scales and categorical attitudes, beliefs, and practices assessed.
Regarding reliability, the five final scales demonstrated adequate reliability (internal consistency) and acceptable adjusted item-to-scale correlations. Some SS-1 scales combine items from previously validated scales, with de novo items generated through qualitative research in minority communities. The best example is the 18-item religious coping scale which is composed of 14 items from the 105-item full RCOPE, two from the 26-item Religious and Spiritual Struggles Scale, and two de novo items addressing R/S as a resource for maintaining hope in difficult times. The resulting composite scales exhibited reliability values consistent with those reported for the RCOPE and Religious and Spiritual Struggles scales, from which these sub-items were drawn. An existing shortened form, the Brief RCOPE (14 total items) has a median alpha of 0.92 for the positive religious coping subscale and a median alpha of 0.81 for negative religious coping (
Pargament et al. 2011), similar to what we obtained for the PRC (0.95) and NRC (0.84). The fact that our reliability findings for the PRC and NRC scales were consistent with existing literature increases confidence in our findings, and suggests that our more concise 5-item subscales, which combine sub-items from larger source scales and novel items resulting from qualitative research, work well across racial/ethnic groups and achieve predictive power similar to that of the full scales. Achieving this while also accounting for hope in the divine, which was absent from the RCOPE scale, helps the measure to more robustly capture coping attitudes among minority populations.
The factor analytic findings for SS-1 scales (with more than three items) showed all scales to be uni-factorial. The uni-factorial nature of the SS-1 scales provides strong evidence that they measure the specific constructs intended. By demonstrating that the SS-1 scales have acceptable reliability and clear factor structures, these data establish a solid foundation for future SSSH research in relation to other health outcomes.
We also reported a range of initial validity data. Four of the five SS-1 scales were meaningfully associated with six other measures of religious/spiritual beliefs, attitudes and practices. Three scales (RAS, CtoG, and PRC) had meaningful correlations with all six measures, and NT-DSES was associated with three measures. Only NRC failed to demonstrate a meaningful association with any of the six other religious or spiritual measures, suggesting that this scale may capture a unique dimension of R/S experience.
Our results also showed that four of the five SS-1 scales were able to meaningfully differentiate participants who identified as being a member of a religious congregation or community from those who did not. Using partial η² as a guide, three scales (Religious Activities, Closeness to God, and Positive Religious Coping) yielded group differences falling within the large effect range (>0.13). We also observed a small effect size for Non-theistic Daily Spiritual Experiences, as well as a significant (but trivial) difference between groups with respect to Negative Religious Coping. These differences were in the expected direction, with congregation members scoring higher on Non-theistic Daily Spiritual Experiences and lower on Negative Religious Coping. Beyond demonstrating the magnitude of group differences, the effect size information we report can help researchers select the most appropriate scales for use in their studies. For example, if power analysis (Cohen 1988) indicates a sample is only sufficient for identifying medium to large effects, our results suggest selecting from among the Positive Religious Coping, Religious Activities, and Closeness to God scales.
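The effect-size reasoning above can be illustrated with a short Python sketch. For a single two-group factor with no covariates, partial η² reduces to SS_between / (SS_between + SS_within); any covariate adjustment used in the reported analyses is not reproduced here. The 0.13 cutoff for a large effect is the one used in the text, whereas the 0.01 and 0.06 cutoffs are the commonly cited conventions and are assumed for this example, as are the function names and the toy scores.

```python
from statistics import mean


def eta_squared(group_a, group_b):
    """Eta squared for a two-group comparison (no covariates)."""
    grand_mean = mean(group_a + group_b)
    groups = (group_a, group_b)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_between / (ss_between + ss_within)


def effect_label(eta2):
    """Map eta squared to a verbal label (cutoffs partly assumed, see text)."""
    if eta2 >= 0.13:
        return "large"
    if eta2 >= 0.06:
        return "medium"
    if eta2 >= 0.01:
        return "small"
    return "trivial"


# Hypothetical scale scores for congregation members and non-members.
members = [4.0, 5.0, 6.0]
non_members = [1.0, 2.0, 3.0]
e = eta_squared(members, non_members)
print(e, effect_label(e))
```

A researcher whose sample can only detect medium to large effects could apply such a label to pilot data before choosing among candidate scales.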
Additionally, the multiple regression analyses revealed how SS-1 scales were related to two common measures of health: the SF-12 PCS and MCS. Two SS-1 scales were significant independent correlates of physical health (SF-12 PCS) and three were significant independent correlates of mental health (SF-12 MCS). Two SS-1 scales, Negative Religious Coping and Non-theistic Daily Spiritual Experiences, were relatively strong correlates of mental health, with ΔR² indicating a small effect size. The Closeness to God scale was statistically, though not substantively, significant. These results were in the expected direction, with Closeness to God and Spiritual Experiences associated with a higher mental health score and Negative Coping associated with a lower mental health score. Although the ability of SS-1 scales to predict physical health was fairly modest (R² of 0.017 absent the control variables, not shown), this value exceeds Cohen’s small effect size boundary (R² > 0.01), suggesting that the predictive utility is meaningful (Cohen 1988). We would argue that this is a context in which even a small effect may aggregate over time to have a significant life impact. For example, if Negative Religious Coping has a small but meaningful negative effect on physical health, individuals who rely on this coping style may accumulate substantially more health concerns over time, as the need to cope with health challenges occurs frequently across the life span (Funder and Ozer 2019). Together, the findings from the group differentiation analysis and the multiple regression (MRA) predictive validity analyses provide reasonable initial evidence as to the utility and validity of the SS-1 scales in a multi-ethnic sample.
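As an illustration of the ΔR² statistic discussed above, the following Python sketch computes the increase in explained variance when a predictor is added to a model that already contains control variables. It is a generic ordinary least squares example under stated assumptions: the function names, the order of entry, and the toy data are hypothetical and do not reproduce the SSSH models.

```python
def ols_r2(X, y):
    """R^2 from an OLS fit of y on the columns of X (intercept added)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    # Augmented normal equations: [X'X | X'y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(p):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def delta_r2(controls, predictors, y):
    """R^2 gained by adding `predictors` to a controls-only model."""
    base = ols_r2(controls, y)
    full = ols_r2([list(c) + list(x) for c, x in zip(controls, predictors)], y)
    return full - base


# Hypothetical data: one control variable, one added scale score.
controls = [[1], [2], [3], [4], [5], [6]]
scale = [[1], [0], [1], [0], [1], [0]]
outcome = [3, 2, 5, 4, 7, 6]
print(delta_r2(controls, scale, outcome))
```

The returned difference is the share of outcome variance attributable to the added predictor over and above the controls, which is the quantity compared against effect-size benchmarks.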
It should be noted that our analyses did identify approximately 18 redundant or underperforming items. For example, the three longest SS-1 scales, Closeness to God (10 items), Positive Religious Coping (10 items), and Negative Religious Coping (8 items), could possibly be reduced to approximately five items each without degrading their psychometric functioning. Additionally, a single item on use of R/S to cope with stress showed a good response distribution and seemed to hold promise as a parsimonious measure of positive R/S coping. In future work, we anticipate assessing more concise versions of these scales to identify the optimal combination of items that associate significantly with multiple health outcomes.
Our psychometric analyses also demonstrated the distinction between accounting for variance and capturing significant influences on health outcomes, the latter being our aim. For example, although positive religious coping items accounted for nearly twice as much variance in factor analysis as negative religious coping items (second factor), our selected dimensions of negative religious coping were the strongest predictors of (poor) mental health and, to a lesser degree, physical health. These results highlight the importance of validating survey items in relation to important health outcomes and suggest that negative religious coping should receive higher priority than positive religious coping in selecting R/S measures for use in large population studies. This is borne out in several extant studies, which have found the negative effect of negative religious coping to be greater than the positive effect of positive religious coping (Park et al. 2018; Ng et al. 2017; Ahles et al. 2016).
8. Limitations and Conclusions
Our study has several important limitations. First, while it employed a large and diverse sample, some sample characteristics may have impacted the findings. The sample was predominantly female (74%), somewhat older (mean age of 57.0 years), and fairly wealthy (59.3% made more than $50,000/year). Despite controlling for several demographic characteristics, how well these findings generalize to a more representative sample of the U.S. population is unclear. Although dividing the sample into participants who identified as congregation members and those who did not created two nearly equal-sized groups, this division may have enhanced differences in a manner similar to that seen with extreme group comparisons. As such, it is possible that our group differentiation results overestimate the ability of the SS-1 scales to identify group differences. Second, the predictive utility (validity) of the SS-1 scales was evaluated against two variables, the physical and mental health components of the well-validated SF-12 instrument, aimed at measuring functional health. While these functional health measures represent meaningful constructs against which to evaluate various SS-1 scales, assessment of the SS-1 items in relation to a broader array of health outcomes is needed to assess predictive power. Future research should explore the validity of the SS-1 measures in the context of diverse clinical endpoints, including incident hypertension, cardiovascular disease, depression, and other chronic health conditions that are marked by persistent disparities in the burden of illness across racial/ethnic communities. Finally, in this initial evaluation of the psychometric properties of R/S measures included in the baseline Spirituality Survey of the SSSH, we did not assess the psychometric properties of R/S measures within specific self-defined racial/ethnic groups.
Our data collection is ongoing in some racial/ethnic communities, and thus we elected to postpone that assessment until all ethnic groups are equally represented in our data. The fact that the SS-1 as currently evaluated does indeed include five ethnic groups, four of which bear a disparate burden of illness, is a strength of this psychometric analysis and the SSSH more generally.
Despite these limitations, the results of the current study provide strong initial support for the psychometric functioning (reliability and validity) of the five SS-1 scales included in the first wave of data collection within the SSSH, and suggest some areas where items could be further trimmed without compromising the reliability and validity of the construct. The scales assessed in this analysis tap a wide range of R/S attitudes, beliefs and practices, and this is the first paper to publish standard psychometric analyses on multiple R/S scales evaluated in a large, racially/ethnically and religiously diverse sample of adults in the U.S. This is important, given that the initial validity of most scales has been assessed in predominantly white, Christian populations (Pargament et al. 2011; Koenig and Büssing 2010; Exline et al. 2014).
Further research is needed to assess how different R/S measures perform across different religious traditions. This is particularly important given that several items in the SS-1 were tailored to meet specific communities’ cultural preferences, and may not perform equally well across all religious communities. Future work should also assess test-retest reliability or stability of the SS-1 scales. This type of reliability has rarely been assessed in surveys of religion and spirituality. Finally, SS-1 items should be examined against a wide array of mental health and chronic disease outcomes to identify the most robust R/S items in relation to high-priority chronic conditions. All of these limitations are being addressed in planned future work with the SSSH.
In conclusion, our initial assessment of the SSSH baseline Spirituality Survey in a large population of more than 4500 respondents from five different racial/ethnic communities in the U.S. revealed five scales with acceptable psychometric properties and predictive validity in relation to the physical health and mental health components of the SF-12 functional health assessment. These findings support the use of these scales in empirical research assessing the influence of religion and spirituality on important health outcomes in the SSSH.