“The Study on Stress, Spirituality, and Health (SSSH): Psychometric Evaluation and Initial Validation of the SSSH Baseline Spirituality Survey”

This paper describes the development and initial psychometric testing of the baseline Spirituality Survey (SS-1) from the Study on Stress, Spirituality, and Health (SSSH), which contained a mixture of items selected from validated existing scales and new items generated to measure important constructs not captured by existing instruments. The purpose was to establish the validity of new and existing measures in our racially/ethnically diverse sample. Psychometric properties of the SS-1 were evaluated using standard psychometric analyses in 4,634 SSSH participants. Predictive validity of SS-1 scales was assessed in relation to the physical and mental health component scores from the Short-Form 12 Health Survey (SF-12). Scales exhibited adequate to strong psychometric properties and demonstrated construct and predictive validity. Overall, the correlational findings provide solid evidence that the SS-1 scales are associated with a wide range of relevant religious/spiritual (R/S) attitudes, with mental health, and, to a lesser degree, with physical health.


Introduction
Published literature on the influence of religion and spirituality (R/S) on human health has grown substantially in recent years (Demir 2019, Koenig 2012), yet R/S measures are still not prioritized in large prospective studies investigating the etiology of disease (Shields and Balboni 2020). Exceptions occur among cohorts with an explicit focus on R/S and health (Smith and Faris 2002) or cohorts that focus on a specific religious group (e.g., the Adventist Health Study) (Beeson, Mills, Phillips et al. 1989, Butler, Fraser, Beeson et al. 2008). On the whole, however, many gaps remain. An analysis of all surveys fielded by 20 large U.S.-based prospective cohorts reveals, for example, that only 7 of the 20 cohorts had ever collected at least three different R/S measures in the history of the cohort (CGVH 2019).
This oversight is important since the potential influence of R/S on health likely differs across racial/ethnic communities. Enough evidence exists demonstrating the importance of R/S in health research (Koenig 2012, VanderWeele 2017) that examination of understudied groups must become a priority. We know, for example, that more than 75% of African Americans and 59% of Latinos say their religion or spirituality is very important in their lives, compared to only 49% of white Americans (CGVH 2014), but R/S research has received almost no attention in other groups, such as the U.S. South Asian and American Indian communities. Minority communities bear a disproportionate burden of chronic disease iteratively modified to address cultural expectations while maintaining item validity across all cohorts. The final SS-1 was approved by each participating cohort.
The SS-1 consists of 82 R/S items assessing the following areas: (a) Religious activities; (b) Closeness to God; (c) Religious coping (positive religious coping and negative religious coping/spiritual struggles); (d) Gratitude [footnote i]; and (e) Non-theistic daily spiritual experiences. The survey items were prefaced with the following statement: "These questions are being asked of people from different religious backgrounds, and although we use the term 'God' in some of the questions below, please substitute your own word for 'God' (e.g., Bhagwan, Allah, The Divine, etc.)." Questions regarding one's relationship with God were asked only of those survey participants who said they believed in [God], and questions addressing one's experience with their religious congregation or community were asked only of those respondents who indicated that they belonged to a religious congregation or community.

SS-1 Dimensions and Scales
Individual Items.-The SS-1 contains a number of individual items that measure respondents' attitudes, beliefs, and practices, including respondents' views of organized religion, the extent to which they identify as a religious or spiritual person, their membership in a religious congregation or community, their relationship to that congregation or community, and beliefs about God and the afterlife (see Supplementary Table 1). The majority of these individual items were not assessed in the current study, as the focus here is on scale validation. SS-1 scales are detailed in the following material.
Religious Activities (RAS).-This SS-1 section contained seven items designed to capture the frequency of religious activities, such as praying in groups or alone, reading scriptures, meditating, and practicing yoga or Tai Chi. Respondents' engagement in each religious activity was rated on a 7-point scale, from "never" to "several times a day."
Closeness to God (CtoG).-Asked of respondents who said they believed in God, the CtoG scale contains ten items assessing how people relate to God (e.g., "I feel God's love," "God's spirit dwells in my body," "God is the center of my life"), including four selected items from the Duke University Religion Index (Koenig and Büssing 2010) and the theistic items from the Daily Spiritual Experience Scale (Theistic) (Underwood and Teresi 2002), as well as six de novo items that addressed perception of relationship to God. Response categories reflected a 5-point scale ranging from "definitely not true of me" to "definitely true of me."

Religious Coping.-Multiple items were used to assess the use of R/S in coping with stress, including positive (e.g., using R/S to cope with stressful situations) and negative (e.g., doubting God's love, or feeling that God is punishing me) coping. We asked a single-item question from the Brief Multidimensional Measure of Religiousness/Spirituality (BMMRS) (Fetzer Institute 1999) ("To what extent is your religion or spirituality involved in understanding or dealing with stressful situations in any way?"), with four response categories ranging from "not involved at all" to "very involved." The SS-1 also included a 10-item scale capturing the construct of positive religious coping in dealing with stressful events. Eight of these items were selected from Kenneth Pargament's well-validated RCOPE (Pargament, Koenig and Perez 2000), with two sub-items selected from each of four different RCOPE sub-domains (items RC 1-2 and 7-12, Supplementary Table 1). These sub-domains were selected by the study PI and Pargament based on the salience of specific sub-domains in the diverse racial/ethnic communities represented in the SSSH. A de novo sub-domain reflects R/S as critical to maintaining hope in the face of adversity. This sub-domain emerged as a central means of coping with difficult life circumstances in focus groups with African American and Hispanic/Latino participants, but was not included in the RCOPE. Positive Religious Coping items were rated on a 4-point scale ("not at all," "somewhat," "quite a bit," and "a great deal").
[Footnote i] Gratitude was conceived of as a relevant "virtue" within a broadly defined spirituality.
The SS-1 also included 8 items capturing the construct of negative religious coping in dealing with stressful events, six from the RCOPE (Pargament et al. 2000), reflecting three RCOPE sub-domains. Two of the original five items under each sub-domain were selected (RC 3-6 and 13-14, Supplementary Table 1). Two items from the Doubt subscale of the validated Religious and Spiritual Struggles Scale (i.e., RC 17-18: "I felt confused about my religious or spiritual beliefs" and "I felt troubled by doubts or questions about my religion or spirituality") were also included (Exline, Pargament, B. et al. 2014). All items were rated on a 4-point scale ("not at all," "somewhat," "quite a bit," and "a great deal").
Gratitude (GQ).-The SS-1 included two of the original six items measuring dimensions of gratitude from the Gratitude Questionnaire-6 (McCullough, Emmons and Tsang 2002): "I have so much in life to be grateful for" and "If I listed everything that I felt grateful for, it would be a very long list." The items were rated on a 5-point scale ("strongly disagree," "somewhat disagree," "neutral," "somewhat agree," and "strongly agree").

Non-Theistic Daily Spiritual Experiences (NT-DSES).-This SS-1 section had four items measuring non-theistic daily spiritual experiences from Underwood's Daily Spiritual Experiences Scale (Underwood and Teresi 2002) (i.e., "I experience a connection to all of life;" "I feel deep inner peace or harmony;" "I am touched by the beauty of creation;" and "I feel a selfless caring for others") rated on a 5-point scale ("never" to "many times a day") [footnote ii].

Study Population and Survey Procedures
The psychometric properties of the SS-1 were evaluated using all 4,563 SSSH participants drawn from the five initial U.S. cohorts participating in the SSSH: BWHS, HCHS/SOL, MASALA, NHSII, and SHS. Brief cohort descriptions follow.
BWHS began in 1995 to investigate breast cancer and other diseases that disproportionately affect Black women. In 2015, approximately 4,000 participants who had completed the most recent wave of data collection were invited to complete the SS-1; more than 2,400 women responded within the first two weeks of recruitment, and enrollment was stopped. A random sample of 1,000 of these participants was included in SSSH. Comparisons to the full BWHS cohort indicate a high degree of comparability on available religious measures (e.g., religious attendance, degree of being a religious/spiritual person) (Cozier, Yu, Wise et al. 2018). The sample represents a full range of socioeconomic levels and all geographic regions of the U.S. (bu.edu/bwhs).
[Footnote ii] These items may still be interpreted by religious adherents through a theistic lens despite "God" not being named explicitly.
HCHS/SOL targets both immigrant and U.S.-born Hispanic/Latinos in four U.S. cities (total cohort N=16,415), with the aim of assessing the role of acculturation in the etiology of cardiovascular disease and related conditions. To be eligible for the SS-1 at the time of collection in 2018-19, participants had to be from the Chicago site, to have completed the most recent round of data collection, and to have participated in HCHS/SOL's Sociocultural Ancillary Study (N=900, response rate 754/900=83.8%). An additional 244 participants were recruited through letters sent to the broader sample of Chicago site participants to reach the desired study population of 1,000 (sites.cscc.unc.edu/hchs). The SSSH sample is generally comparable to the full HCHS/SOL cohort, though variations occur on the handful of comparison items available (i.e., the SSSH sample has a slightly higher proportion of religious affiliates but attends religious services slightly less often; see Lerman et al. 2018).
MASALA examines risk factors for atherosclerosis among South Asians, with participants drawn from the Chicago and the San Francisco Bay areas. To be eligible for MASALA, respondents must have had at least 3 grandparents born in India, Pakistan, Bangladesh, Nepal, or Sri Lanka. All participants (total cohort N=990) were invited to complete the SS-1 in 2016-18, and only one declined (masalastudy.org).
NHSII was established in 1989 among 116,429 women who responded to the baseline and subsequent biennial follow-up questionnaires to investigate risk factors for major chronic diseases in women, and is comprised of nurses from 14 states who are predominantly white. R/S data collection occurred from 2015-16, and eligibility included provision of at least two blood samples, being age 45-75 at the time of the most recent blood draw (2010-13), completion of four questionnaires (2001 violence, 2008 trauma, and 2013 and 2015 main questionnaires), and no active participation in an ongoing ancillary study. Approximately the first 1,100 women who completed the survey were enrolled. Comparisons to the larger cohort indicate almost identical levels of religious service attendance (nurseshealthstudy.org) (Spence, Farvid, Warner et al. 2020).
SHS is one of the largest prospective cohort studies of American Indians, and is focused on cardiovascular disease. In 2017-18, participants for the SS-1 were drawn from the Dakotas region; they had to be part of phase IV or V and to have completed the previous two rounds of data collection. Community workers held community events and reached out to SHS participants, most often conducting home visits to assist with completion of the SS-1. Religious comparison measures were not available for the larger cohort (strongheartstudy.org).
SSSH Survey data were collected using the established procedures for data collection within each cohort. BWHS and NHSII participants completed a web-based version of the survey accessed through an emailed link. Participants from MASALA completed the survey during an in-person clinical visit, or by mail if they had already completed their most recent clinical visit. Participants from HCHS/SOL and SHS completed the survey via mail, over the telephone, or in person. Each cohort also provided historic survey data for all cohort participants who completed the SS-1 and were thus enrolled in the SSSH, including demographic, psychosocial, lifestyle, behavioral, and clinical data. All data were sent to the Harvard/MGH Center on Genomics, Vulnerable Populations, and Health Disparities, where data elements were harmonized and incorporated into the SSSH analytic file. Procedures were approved by each cohort's Ancillary Study Committee and Institutional Review Board (IRB), as well as the Partners Human Research Committee.

Psychometric Analyses
The primary goal of this paper was to determine the psychometric properties of the SS-1, including reliability (internal consistency [α]), item adequacy (adjusted item-to-scale correlations), and the item-level factor structure, where appropriate, using the five-cohort pooled sample [footnote iii]. We also sought to identify opportunities where the same construct captured by a scale could be reliably measured with fewer items. Scale-level factor structure was assessed using Principal Axis Exploratory Factor Analysis (Thompson 2019). The SS-1 also contained several categorical and nominal items (e.g., "To what extent do you view organized religion as positive or negative?") that were answered on a 5-point scale (e.g., "very positive" to "very negative"). For select categorical and nominal items, we report the response distribution, as well as floor and ceiling effects.

Initial Validity Analysis
We sought to obtain initial evidence of validity for the SS-1 scales as predictors of respondents' functional health status using the Short-Form 12 Health Survey (SF-12), a validated scale comprising physical health (SF-12 PCS) and mental health (SF-12 MCS) components (Ware et al. 1996). First, we calculated partial correlations (controlling for age, sex, and race/ethnicity) among the SS-1 sections identified as having adequate reliability (five SS-1 sections or scales met this criterion), relevant nominal variables contained within the SS-1 (e.g., extent of being a religious or spiritual person; view of organized religion; being a member of a religious congregation or community), and the SF-12 PCS and SF-12 MCS. Given the size of our sample, even trivial correlations would achieve statistical significance. To identify meaningful relationships, only correlations with an absolute value ≥ 0.15 were considered meaningful (the 95% confidence interval [CI] for r=0.15 in a sample of 4,000 is 0.12 to 0.18). An absolute value of r ≥ 0.15 seemed an appropriate threshold because the lower bound of this 95% CI exceeds 0.10, the lower boundary of Cohen's small effect size (Cohen 1988). To test group differentiation, the sample was divided into groups based on whether or not they reported being part of a religious congregation or community. This grouping variable produced an almost 50/50 split in our sample. Between-group t-tests were conducted for the five SS-1 domains/scales and the SF-12 component scales. The t-test results and effect size measure (Cohen's d) are presented. To assess predictive validity, we conducted two stepwise multiple regression analyses exploring the extent to which the SS-1 domains/scales could predict scores on the SF-12-PCS and SF-12-MCS.
[Footnote iii] This analysis was limited to the pooled sample only. A planned future analysis will examine psychometric properties across each of the racial/ethnic groups.
Analyses utilized sampling weights provided by HCHS/SOL; other cohorts were set to a weight value of 1.
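The confidence interval quoted for the r = 0.15 threshold can be reproduced with a short calculation. The sketch below is illustrative only and assumes the interval was obtained with the standard Fisher z transformation for a simple (zero-order) correlation; the function name `fisher_ci` is hypothetical, and a partial correlation would strictly lose one further degree of freedom per covariate, which is negligible at this sample size.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via Fisher's z transform.

    Assumption: simple correlation with standard error 1 / sqrt(n - 3).
    """
    z = math.atanh(r)                 # Fisher z transform of r
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


low, high = fisher_ci(0.15, 4000)
print(f"95% CI for r=0.15, n=4000: {low:.2f} to {high:.2f}")
```

Under these assumptions the lower bound sits above 0.10, which is the property the threshold argument relies on.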

Response Patterns of Individual R/S Measurement Items
The SS-1 includes several individual (single item) R/S questions (both de novo and validated) in addition to scales. As shown in Table 1, most subjects viewed organized religion as either very positive (22.2%, n=788) or positive (38.2%, n=1357), while few viewed religion as negative (6.9%, n=245) or very negative (2.5%, n=90). Most subjects classified themselves as both spiritual and religious (58.9%, n=2655) and only 6.1% (n=274) indicated they were neither spiritual nor religious. More than one-third (n=1505) of subjects considered themselves to be very religious or spiritual. Half of the sample (49.9%, n=2241) reported being part of a religious congregation or community. Of the subjects who were members of a community or congregation, 79.4% (n=1776) felt they received love or care from their congregation/community "very often" or "fairly often" and 78.4% (n=1755) felt they showed love or care to congregation members "very often" or "fairly often". On the other hand, 9.7% of subjects (n=217) felt their community or congregation was critical of them "very often" or "fairly often" and 5.2% (n=115) felt ignored or neglected by other members of their religious community or congregation ("very often" or "fairly often").
With respect to religious beliefs, nearly three quarters (74.1%, n=3326) responded "definitely true of me" to the statement, "I believe that God exists." Another 11.5% (n=513) answered "tends to be true of me" and only 4.7% (n=209) answered "definitely not true of me" to this question. The majority (56.0%, n=2510) of subjects answered "definitely true" to the question, "I believe in life after death," while only 7.0% (n=315) endorsed "definitely not true." The second largest group was "unsure" (19.4%, n=868). Finally, one question asked: "When you think about God in relationship to your health, which of the following is closest to your view?" 42.2% (n=1790) selected "My health is determined by my own actions;" 51.7% (n=2191) selected "When it comes to my health, God and I both have a role to play;" and 6.2% (n=262) selected "God determines my health, regardless of my own actions and behaviors."

Religious Coping Scale Development
Exploratory Factor Analysis (EFA; Principal Axis Factoring with orthogonal rotation) was used to explore the underlying dimensional structure of the 18 positive and negative religious coping items. The 2-factor solution showed that the religious coping items separated clearly into two factors representing positive and negative R/S coping styles, with all items having strong primary loadings and no secondary loadings above 0.29 (Table 2). The factor loadings for the first factor ranged from 0.74 ("I saw my situation as part of God's plan," RC-1) to 0.90 ("I trusted God would be on my side," RC-12). Factor 1 also contained all four positive coping items from the RCOPE. Loadings on the second factor ranged from 0.46 ("I felt as though the devil or evil spirits were trying to turn me away from God," RC-6) to 0.72 ("I wondered whether God had abandoned me," RC-13). Factor 2 contained all the negative religious coping items from the RCOPE and the two Religious and Spiritual Struggles items. Based on these factor loadings, two scales were composed: the positive religious/spiritual coping scale (PRC), comprising 10 items, and the negative religious/spiritual coping scale (NRC), comprising eight items.

Internal Consistency of the SS-1 Scales
Religious Activities (RAS).-Table 3 displays the descriptive statistics, number of items, and alphas for each of the five SS-1 scales, beginning with RAS. A review of the response distribution revealed RAS-7 (Tai Chi) to have a non-normal distribution with excessive skewness (6.10) and kurtosis (39.9). As such, it was dropped from any additional analyses. A 6-item scale on Religious Activities (RAS) was evaluated and was found to have acceptable internal consistency (α=0.75), but RAS item 6 (yoga) had an adjusted item-to-scale correlation of 0.08, well below the 0.30 lower boundary, so it was also removed. The remaining five items produced a scale with good internal consistency (α=0.80) and adequate adjusted item-to-scale correlations (0.43 [meditation, RAS-5] to 0.72 [individual prayer, RAS-2]). Exploratory Factor Analysis (EFA; Principal Axis Factoring) showed the five-item RAS to be uni-factorial, having one factor with an Eigenvalue of 2.84 accounting for 56.8% of the variance. The factor loadings ranged from 0.60 (meditation, RAS-5) to 0.87 (individual prayer, RAS-2).
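The item-screening logic described for the RAS (compute coefficient alpha, inspect adjusted item-to-scale correlations, drop an item that falls below 0.30) can be sketched in a few lines. This is a minimal illustration, not the software or procedure used in the study: the function names are hypothetical, the responses are made-up toy values rather than SSSH data, and the "adjusted" correlation is taken to mean the correlation of each item with the sum of the remaining items.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / (ss_x * ss_y) ** 0.5


def cronbach_alpha(items):
    """items: one list of scores per item, all for the same respondents."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def adjusted_item_total(items):
    """Correlate each item with the sum of the remaining items."""
    result = []
    for i, col in enumerate(items):
        rest = [sum(row) - row[i] for row in zip(*items)]
        result.append(pearson(col, rest))
    return result


# Hypothetical toy responses (six respondents, four items); not SSSH data.
toy_items = [
    [1, 2, 3, 4, 5, 6],
    [2, 2, 3, 5, 5, 7],
    [1, 3, 3, 4, 6, 6],
    [4, 1, 5, 2, 6, 3],  # deliberately weak item
]
item_total = adjusted_item_total(toy_items)
weakest = item_total.index(min(item_total))
alpha_all = cronbach_alpha(toy_items)
alpha_trimmed = cronbach_alpha(toy_items[:weakest] + toy_items[weakest + 1:])
print(weakest, round(alpha_all, 2), round(alpha_trimmed, 2))
```

In the toy data, removing the weakest item raises alpha, which mirrors the direction of the change reported when the yoga item was dropped.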
Closeness to God (CtoG).-Psychometric analysis revealed that these items suffer from several limitations. The 10 items have ceiling effects that are unacceptably high; they range from a low of 45.3% (CtoG-10) to a high of 65.9% (CtoG-5). Thus, the 10-item CtoG scale had a limited score range (median score=4.6; the modal score is the scale maximum, 5.0), excessive inter-item correlation (mean inter-item correlation = 0.73), and a coefficient alpha of 0.96 (Table 3). Exploratory Factor Analysis (EFA; Principal Axis Factoring) showed the Closeness to God scale was highly uni-factorial, having a single factor with an Eigenvalue of 7.60 accounting for 76.1% of the variance. The factor loadings ranged from 0.78 ("I feel God's love or care for me through others," CtoG-1) to 0.90 ("My relationship with God lies behind my whole approach to life," CtoG-7). These results suggest that psychometrically the 10 items are essentially identical and the construct could be well-measured with fewer items. To test this possibility, we composed three independent 3-item Closeness to God scales (three items are generally accepted as the minimum number necessary to accurately calculate internal consistency) (Cohen 1988). CtoG-A (items 1, 3, and 5) had an alpha of 0.87 and item-to-scale correlations between 0.72 and 0.79; CtoG-B (items 2, 4, and 6) had an alpha of 0.92 and item-to-scale correlations between 0.82 and 0.84; and CtoG-C (items 8, 9, and 10) had an alpha of 0.91 and item-to-scale correlations between 0.80 and 0.84. All three 3-item CtoG scales demonstrated adequate psychometric functioning, supporting the use of fewer items to measure this dimension of R/S.
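The link between the mean inter-item correlation and alpha, and the reason a 3-item version can remain reliable, can be illustrated with the standardized-alpha (Spearman-Brown) formula. The sketch below is a rough consistency check under stated assumptions: it treats all items as equally correlated at the reported mean of 0.73, and it uses standardized alpha, which may differ slightly from the raw coefficient alpha reported in Table 3. The function name is hypothetical.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Reported mean inter-item correlation for the 10-item CtoG scale.
alpha_10 = standardized_alpha(10, 0.73)
# Assumption: a 3-item subset keeps the same mean inter-item correlation.
alpha_3 = standardized_alpha(3, 0.73)
print(round(alpha_10, 2), round(alpha_3, 2))
```

With inter-item correlations this high, going from ten items to three costs little reliability under this approximation, which is in line with the alphas reported for CtoG-A, CtoG-B, and CtoG-C.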

Religious Coping (single item).-Among our sample, 87.0% (n=3690) felt religion or spirituality was somewhat (30.1%) or very much (56.9%) involved in how they coped with stressful life situations. This single item showed a good response distribution.

Positive and Negative Religious Coping.-The 10-item Positive Religious Coping (PRC) scale had an alpha of 0.95, with adjusted item-to-scale correlations ranging from 0.70 to 0.86, while the 8-item Negative Religious Coping (NRC) scale had an alpha of 0.84, with adjusted item-to-scale correlations ranging from 0.53 to 0.65. The findings for PRC (α >0.90, factor loadings of 0.75 or better, and no item-to-scale correlations below 0.70) suggest that the scale could be reduced in length, perhaps by half, without suffering a significant loss of measurement power. Similarly, the NRC scale could be reduced by 1 or 2 items and maintain acceptable measurement properties. Two potential candidates for removal, both from the original RCOPE, are RC-6 ("I felt as though the devil, or an evil spirit was trying to turn me away from God") and RC-5 ("I believed the devil or evil spirits were responsible for my situation"), as these were the two items with the weakest factor loadings.
Two 5-item scales, PRC5 and NRC5, were composed using the 5 highest-loading items from each scale. PRC5 (RC items 8, 9, 11, 12, and 18) had an alpha of 0.94, with adjusted item-to-scale correlations ranging from 0.81 to 0.87, while NRC5 (RC items 3, 13, 14, 15, and 16) had an alpha of 0.81, with adjusted item-to-scale correlations ranging from 0.47 to 0.66 (data not shown). These 10 items also cleanly reproduced the two-factor solution, and offer an option for assessing positive and negative religious coping with a more parsimonious measure.
Gratitude (GQ).-We found these two items suffered from significant psychometric deficiencies, including ceiling effects (subjects scoring at the top of the scale) of 89.9%, along with severe kurtosis (24.3). As such, the gratitude items were dropped from further analysis.
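The two screening statistics cited for the gratitude items, the ceiling effect and kurtosis, are simple to compute. The sketch below is illustrative: the function names and the toy score vector are hypothetical (not SSSH data), and it uses the plain moment-based excess kurtosis, whereas statistical packages often report a bias-corrected estimate that will differ somewhat.

```python
def ceiling_share(scores, scale_max):
    """Proportion of respondents scoring at the top of the scale."""
    return sum(1 for s in scores if s == scale_max) / len(scores)


def excess_kurtosis(scores):
    """Moment-based excess kurtosis (0 for a normal distribution)."""
    n = len(scores)
    m = sum(scores) / n
    m2 = sum((s - m) ** 2 for s in scores) / n   # second central moment
    m4 = sum((s - m) ** 4 for s in scores) / n   # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Hypothetical 5-point responses with 90% of respondents at the maximum.
toy_scores = [5] * 90 + [4] * 6 + [3] * 2 + [2] + [1]
print(ceiling_share(toy_scores, 5), excess_kurtosis(toy_scores) > 10)
```

A distribution piled up at the scale maximum produces both a high ceiling share and a large positive kurtosis, so the two symptoms reported for the gratitude items tend to appear together.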

Correlation Between SS-1 Scales
Partial correlation analyses (controlling for age, sex, and race/ethnicity) showed the five SS-1 scales to be mildly inter-correlated, with a mean inter-scale correlation of 0.36 (data not shown). However, some scale pairs showed high inter-correlation: Closeness to God & Positive Religious Coping (r=0.84); Closeness to God and Religious Activities (r=0.65); and Religious Activities & Positive Religious Coping (r=0.61). No other scale pairs correlated at ≥ 0.50. This raises the possibility that these three scales, and especially Closeness to God and Positive Religious Coping, measure similar constructs. Table 4 provides partial correlations (controlling for age, sex, and race/ethnicity) for the five SS-1 scales in relation to (a) five conceptually relevant attitudes from the SS-1, (b) the SF-12-PCS, and (c) the SF-12-MCS. Four of the SS-1 scales were strongly or moderately associated with relevant religious attitudes, with 23 out of 40 (58%) observed correlations at or above the 0.15 threshold. Three SS-1 scales (Religious Activities, Closeness to God, and Positive Religious Coping) were meaningfully correlated with six of the eight variables examined, including the extent of self-described religiosity or spirituality, views towards organized religion, belonging to a religious or spiritual community, religious service attendance, belief in life after death, and a single item on R/S coping. Non-theistic Daily Spiritual Experiences were associated with extent of religiosity/spirituality, belief in life after death, and R/S coping. Consistent with the scale inter-correlations presented above, the pattern and magnitude of correlations observed for these scales were highly similar. The exception was Negative Religious Coping, which was not meaningfully associated with other R/S variables, though it was meaningfully correlated (in the negative direction) with mental health. 
Table 5 shows that significant group differences were observed for all five SS-1 scales and the SF-12 mental health component (SF-12-MCS) as defined by membership in a religious or spiritual community (no difference was observed on the SF-12 physical health component). However, given the sample size, statistically significant differences may in fact be trivial. Effect size measures provide a more meaningful method for assessing the importance of the observed differences. In keeping with Cohen's recommendations, partial eta-squared (η²) effect sizes of 0.01 or higher were considered meaningful. Large effect sizes (i.e., η² > 0.13) were seen for the Religious Activities Scale (η² = 0.20), Closeness to God (η² = 0.18), and Positive Religious Coping (η² = 0.14). A small effect size (η² = 0.02) was observed for Non-theistic Daily Spiritual Experiences, and trivial effect sizes were observed for Negative Religious Coping and SF-12-MCS, despite achieving statistical significance.
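Because the group comparison is described in terms of t-tests while the effect sizes are reported as η², it can help to see how the two relate. The sketch below is an illustration under explicit assumptions, not the authors' computation: it treats the comparison as a single two-level factor (so partial η² equals η²), assumes two groups of roughly equal size (consistent with the near 50/50 split), and uses hypothetical function names.

```python
import math


def eta_sq_from_t(t, df):
    """Eta-squared for a two-group comparison from a t statistic and its df."""
    return t * t / (t * t + df)


def d_from_eta_sq(eta_sq):
    """Approximate Cohen's d implied by eta-squared, assuming equal group sizes."""
    return 2.0 * math.sqrt(eta_sq / (1.0 - eta_sq))


# Reported effect sizes: RAS, CtoG, PRC, NT-DSES.
for eta_sq in (0.20, 0.18, 0.14, 0.02):
    print(eta_sq, round(d_from_eta_sq(eta_sq), 2))
```

Under these assumptions, an η² of 0.20 corresponds to a standardized mean difference of about one standard deviation between congregation members and non-members.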

Predictive Validity
Predictive validity of the five SS-1 scales was explored using two multiple regression analyses with stepwise selection, controlling for age, sex, and race/ethnicity (Table 6). Negative Religious Coping and Religious Activities were associated with poorer physical health as measured by the SF-12-PCS, but neither produced a change in R² large enough to indicate a small effect size (Cohen's small effect boundary, R² > 0.01). In the second multiple regression analysis, three scales were significantly associated with the SF-12-MCS. Two were positively associated with mental health (Non-theistic Daily Spiritual Experiences and Closeness to God), while Negative Religious Coping was negatively associated with mental health. Negative Religious Coping (ΔR² = 0.029) and Non-theistic Daily Spiritual Experiences (ΔR² = 0.018) yielded a change in R² indicating a small effect size.

Discussion
This paper describes the initial psychometric evaluation (reliability and validity) of the baseline Spirituality Survey (SS-1) assessed among 4,563 adult respondents from five prospective cohorts participating in the Study on Stress, Spirituality, and Health (SSSH). The SS-1 is designed to assess diverse R/S experiences, beliefs, and practices across ethnically diverse communities and diverse religious traditions. Responses demonstrated that religion and spirituality (R/S) are significant components of most participants' lives. Our analyses revealed that the SS-1 contains five scales that have acceptable psychometric properties; they also provided initial evidence supporting construct and predictive validity of measures and scales in the SS-1, offering solid evidence that SS-1 scales are meaningfully associated with a wide range of relevant religious/spiritual beliefs, attitudes and practices. Results further demonstrated that the SS-1 scales produced meaningful group differentiation according to key R/S constructs (e.g., participants who were or were not part of a religious or spiritual community) and showed predictive validity when evaluated against a well-validated measure of functional health status, the SF-12. The SS-1 scales were further able to predict both the physical health and mental health components of the SF-12. Together these findings provide initial support for the psychometric adequacy of the SS-1 scales and categorical attitudes, beliefs, and practices assessed.
Regarding reliability, the five final scales demonstrated adequate reliability (internal consistency) and acceptable adjusted item-to-scale correlations. Some SS-1 scales combine items from previously validated scales with de novo items generated through qualitative research in minority communities. The best example is the 18-item religious coping scale, which is composed of 14 items from the 105-item full RCOPE, two from the 26-item Religious and Spiritual Struggles Scale, and two de novo items addressing R/S as a resource for maintaining hope in difficult times. The resulting composite scales exhibited reliability values consistent with those reported for the RCOPE and Religious and Spiritual Struggles scales, from which these sub-items were drawn. An existing shortened form, the Brief RCOPE (14 total items), has a median alpha of 0.92 for the positive religious coping subscale and a median alpha of 0.81 for negative religious coping (Pargament et al. 2011), similar to what we obtained for the PRC (0.95) and NRC (0.84). The fact that our reliability findings for the PRC and NRC scales were consistent with existing literature increases confidence in our findings, and suggests that our more concise 5-item subscales, which combine sub-items from larger source scales and novel items resulting from qualitative research, work well across racial/ethnic groups and achieve predictive power similar to that of the full scales. Achieving this while also accounting for hope in the divine, which was absent from the RCOPE scale, helps the measure to more robustly capture coping attitudes among minority populations.
The factor analytic findings for SS-1 scales (those with more than three items) showed all scales to be uni-factorial. The uni-factorial nature of the SS-1 scales provides strong evidence that they measure the specific constructs intended. By demonstrating that the SS-1 scales have acceptable reliability and clear factor structures, these data establish a solid foundation for future SSSH research in relation to other health outcomes.
We also reported a range of initial validity data. Four of the five SS-1 scales were meaningfully associated with six other measures of religious/spiritual beliefs, attitudes and practices. Three scales (RAS, CtoG, and PRC) had meaningful correlations with all six measures, and NT-DSES was associated with three measures. Only NRC failed to demonstrate a meaningful association with any of the six other religious or spiritual measures, suggesting this scale may capture a unique dimension of R/S experience.
Our results also showed that four of the five SS-1 scales were able to meaningfully differentiate participants who identified as being a member of a religious congregation or community from those who did not. Using partial η² as a guide, three scales (Religious Activities, Closeness to God, and Positive Religious Coping) yielded group differences falling within the large effect range (> 0.13). We also observed a small effect size for Non-theistic Daily Spiritual Experiences, as well as a significant (but trivial) difference between groups with respect to Negative Religious Coping. These differences were in the expected direction, with congregation members scoring higher on Non-theistic Daily Spiritual Experiences and lower on Negative Religious Coping. Beyond demonstrating the magnitude of group differences, the effect size information we report can help researchers select the most appropriate scales for use in their studies. For example, if power analysis (Cohen 1988) indicates a sample is only sufficient for identifying medium to large effects, our results suggest selecting from among the Positive Religious Coping, Religious Activities, and Closeness to God scales.
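As an illustration of the effect size logic described above, the sketch below computes η² for a simple two-group comparison and labels it against benchmark cutoffs. Several things here are assumptions rather than the study's procedure: the function names and toy scores are hypothetical, the design has no covariates (in which case partial η² equals η²), and only the large cutoff (> 0.13) comes from the text; the small (0.01) and medium (0.06) cutoffs are the conventional benchmarks and are assumed.

```python
from statistics import mean


def eta_squared(groups):
    """Eta squared for a one-way design: SS_between / (SS_between + SS_within).

    With a single factor and no covariates this equals partial eta squared.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_between / (ss_between + ss_within)


def label_effect(eta2, small=0.01, medium=0.06, large=0.13):
    """Label an effect size; only the `large` cutoff is taken from the text."""
    if eta2 > large:
        return "large"
    if eta2 >= medium:
        return "medium"
    if eta2 >= small:
        return "small"
    return "trivial"


# Hypothetical scale scores for congregation members vs. non-members.
members = [4, 5, 6]
non_members = [1, 2, 3]
effect = eta_squared([members, non_members])
effect_label = label_effect(effect)
```

A difference can be statistically significant in a large sample and still fall in the trivial range, which is the situation described for Negative Religious Coping.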
Additionally, the multiple regression analyses revealed how SS-1 scales were related to two common measures of health: the SF-12 PCS and MCS. Two SS-1 scales were significant independent correlates of physical health (SF-12 PCS) and three were significant independent correlates of mental health (SF-12 MCS). Two SS-1 scales were relatively strong correlates of mental health, with ΔR² indicating a small effect size: Negative Religious Coping and Non-theistic Daily Spiritual Experiences. The Closeness to God scale was statistically, though not substantively, significant. These results were in the expected direction, with Closeness to God and Spiritual Experiences associated with a higher mental health score and Negative Coping associated with a lower mental health score. Although the ability of SS-1 scales to predict physical health was fairly modest (R² of 0.017 absent the control variables, not shown), this value exceeds Cohen's small effect size boundary (R² > 0.01), suggesting that the predictive utility is meaningful (Cohen 1988). We would argue that this is a context in which even a small effect may aggregate over time to have a significant life impact. For example, if Negative Religious Coping has a small but meaningful negative effect on physical health, individuals who rely on this coping style may accumulate substantially more health concerns over time, as the need to cope with health challenges occurs frequently across the life span (Funder and Ozer 2019). Together, the findings from the group differentiation analysis and the MRA predictive validity analyses provide reasonable initial evidence as to the utility and validity of the SS-1 scales in a multi-ethnic sample.
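The ΔR² reasoning above can be sketched as follows. This is a hedged illustration, not the study's analysis code: the function names and the toy outcome values are hypothetical, the fitted values are assumed to come from models estimated elsewhere, and the 0.01 boundary is the small-effect threshold quoted in the text.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Proportion of outcome variance explained: 1 - SS_residual / SS_total."""
    y_bar = mean(observed)
    ss_total = sum((y - y_bar) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


def delta_r_squared(observed, predicted_reduced, predicted_full):
    """Gain in R-squared when predictors are added to a reduced model."""
    return r_squared(observed, predicted_full) - r_squared(observed, predicted_reduced)


def exceeds_small_effect(r2, boundary=0.01):
    """True if R-squared exceeds the small-effect boundary cited in the text."""
    return r2 > boundary


# Hypothetical outcome scores with fitted values from two nested models.
outcome = [1.0, 2.0, 3.0, 4.0]
fitted_controls_only = [2.5, 2.5, 2.5, 2.5]  # predicts the mean: R-squared of 0
fitted_with_scales = [1.5, 2.0, 3.0, 3.5]    # hypothetical improved fit
gain = delta_r_squared(outcome, fitted_controls_only, fitted_with_scales)

# The R-squared of 0.017 reported for physical health clears the 0.01 boundary.
physical_health_is_small_or_larger = exceeds_small_effect(0.017)
```

Comparing nested models in this way isolates the variance attributable to the added scales over and above the control variables.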
It should be noted that our analyses did identify approximately 18 redundant or underperforming items. For example, the three longest SS-1 scales, Closeness to God (10 items), Positive Religious Coping (10 items) and Negative Religious Coping (8 items), could possibly be reduced to approximately five items each without degrading their psychometric functioning. Additionally, a single item on use of R/S to cope with stress showed a good response distribution and seemed to hold promise as a parsimonious measure of positive R/S coping. In future work, we anticipate assessing more concise versions of these scales to identify the optimal combination of items that associate significantly with multiple health outcomes.
Our psychometric analyses also demonstrated the distinction between accounting for variance and capturing significant influences on health outcomes, the latter being our aim. For example, although positive religious coping items accounted for nearly twice as much variance in factor analysis as negative religious coping items (second factor), our selected dimensions of negative religious coping were the strongest predictors of (poor) mental health, and to a lesser degree physical health. These results highlight the importance of validating survey items in relation to important health outcomes and suggest that negative religious coping should receive higher priority than positive religious coping in selecting R/S measures for use in large population studies. This is borne out in several extant studies, which have found the negative effect of negative religious coping to be greater than the positive effect of positive religious coping (Ahles, Mezulis and Hudson 2016, Ng, Mohamed, Sulaiman et al. 2017, Park, Holt, Le et al. 2018).

Limitations and Conclusion
Our study has several important limitations. First, while it employed a large and diverse sample, some sample characteristics may have impacted the findings. The sample was predominantly female (74%), somewhat older (mean age of 57.0 years), and fairly wealthy (59.3% made more than $50,000/year). Despite controlling for several demographic characteristics, how well these findings generalize to a more representative sample of the U.S. population is unclear. Although dividing the sample into participants who identified as congregation members or not created two nearly equal-sized groups, this division may have enhanced differences in a manner similar to that seen with extreme group comparisons. As such, it is possible that our group differentiation results overestimate the ability of the SS-1 scales to identify group differences. Second, the predictive utility (validity) of the SS-1 scales was evaluated against two variables, the physical and mental health components of the well-validated SF-12 instrument, aimed at measuring functional health. While these functional health measures represent meaningful constructs against which to evaluate various SS-1 scales, assessment of the SS-1 items in relation to a broader array of health outcomes is needed to assess predictive power. Future research should explore the validity of the SS-1 measures in the context of diverse clinical endpoints, including incident hypertension, cardiovascular disease, depression, and other chronic health conditions that are marked by persistent disparities in the burden of illness across racial/ethnic communities. Finally, in this initial evaluation of the psychometric properties of R/S measures included in the baseline Spirituality Survey of the SSSH, we did not assess the psychometric properties of R/S measures within specific self-defined racial/ethnic groups.
Our data collection is ongoing in some racial/ethnic communities, and thus we elected to postpone such assessment until such time as we have all ethnic groups equally represented in our data. The fact that the SS-1 as currently evaluated does indeed include five ethnic groups, four of which bear a disparate burden of illness, is a strength of this psychometric analysis and the SSSH more generally.
Despite these limitations, the results of the current study provide strong initial support for the psychometric functioning (reliability and validity) of the five SS-1 scales included in the first wave of data collection within the SSSH, and suggest some areas where items could be further trimmed without compromising the reliability and validity of the construct. The scales assessed in this analysis tap a wide range of R/S attitudes, beliefs and practices, and the study represents the first paper to publish standard psychometric analyses on multiple R/S scales evaluated in a large, racially/ethnically and religiously diverse sample of adults in the U.S. This is important, given that the initial validity of most scales has been assessed in predominantly white, Christian populations (Exline et al. 2014, Koenig and Büssing 2010, Pargament et al. 2011).
Further research is needed to assess how different R/S measures perform across different religious traditions. This is particularly important given that several items in the SS-1 were tailored to meet specific communities' cultural preferences, and may not perform equally well across all religious communities. Future work should also assess test-retest reliability or stability of the SS-1 scales. This type of reliability has rarely been assessed in surveys of religion and spirituality. Finally, SS-1 items should be examined against a wide array of mental health and chronic disease outcomes to identify the most robust R/S items in relation to high-priority chronic conditions. All of these limitations are being addressed in planned future work with the SSSH.
In conclusion, our initial assessment of the SSSH baseline Spirituality Survey (SS-1) in a large population of more than 4,500 respondents from five different racial/ethnic communities in the U.S. revealed five scales with acceptable psychometric properties and predictive validity in relation to the physical health and mental health components of the SF-12 functional health assessment. These findings support their use in empirical research assessing the influence of religion and spirituality on important health outcomes in the SSSH.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.

Funding:
This analysis was supported by a grant from the John Templeton Foundation and the Study on Stress, Spirituality, and Health (grant #59607). The Black Women's Health Study was supported by NIH grants UM1CA164974, U01CA164974, and R01CA058420. The MASALA Study was supported by NIH grants 1R01HL093009, 2R01HL093009, R01HL120725, UL1RR024131, UL1TR001872, and P30DK098722. The Nurses' Health Study II was supported by NIH grants U01 CA176726 and R01 CA163451. The Strong Heart Study was