Article
Peer-Review Record

Open-Ended and Closed-Ended Measures of Religious/Spiritual Struggles: A Mixed-Methods Study

Religions 2020, 11(10), 505; https://doi.org/10.3390/rel11100505
by Joshua A. Wilt 1,*, Joyce T. Takahashi 1, Peter Jeong 1, Julie J. Exline 1 and Kenneth I. Pargament 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 22 July 2020 / Revised: 23 September 2020 / Accepted: 27 September 2020 / Published: 1 October 2020

Round 1

Reviewer 1 Report

  1. Introduction requires extension; it is recommended to make references to J.J. Exline’s works.
  2. There is a need to add information concerning sex and age of the study participants in the group description.
  3. Descriptive statistics should be completed (Min., Max.). It is vital to conduct supplementary statistical analyses, more advanced than correlations.
  4. References are not sufficient and need to be supplemented. Descriptions of some sources (e.g., Hall, Todd W. 2014) are incomplete.

Author Response

Thank you for the review. Please find below a description of how we addressed each of your comments.

  1. Introduction requires extension; it is recommended to make references to J.J. Exline’s works.

Response: We have added several more citations to Exline’s work. (Note that Exline is a co-author on this paper.)

  2. There is a need to add information concerning sex and age of the study participants in the group description.

Response: We include information about self-identified gender and age on p. 6.

  3. Descriptive statistics should be completed (Min., Max.). It is vital to conduct supplementary statistical analyses, more advanced than correlations.

Response: We include more complete descriptive information (min, max, skew, kurtosis) in Tables 2 (p. 11) and 3 (p. 12).
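For readers unfamiliar with these indices, a minimal sketch of computing them with SciPy (the scores below are made up for illustration, not the study data):

```python
import numpy as np
from scipy import stats

# Hypothetical item scores on a 1-5 scale (not the study data)
scores = np.array([1, 1, 2, 2, 2, 3, 4, 5, 5, 5])

desc = stats.describe(scores)
summary = {
    "min": int(desc.minmax[0]),
    "max": int(desc.minmax[1]),
    "mean": desc.mean,
    "skew": desc.skewness,      # asymmetry of the distribution
    "kurtosis": desc.kurtosis,  # excess kurtosis (0 for a normal curve)
}
```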

In the previous version of the manuscript, we presented zero-order Spearman correlations because they are well-suited to providing evidence for or against the validity of our open-ended measure (convergent/discriminant validity with the RSS, and criterion validity for religious belief salience). Without more detail regarding which supplementary analyses would be recommended, we considered different options for more advanced statistical techniques that may be suited to our purposes. We decided to conduct partial Spearman correlations for within-domain associations between the RSS subscales and our open-ended measure (e.g., the correlation between the divine subscale of the RSS and the divine code, statistically controlling for all other RSS subscales). These results index unique associations within domains, which provide a more stringent test of convergent validity. Results (p. 12) showed positive associations within all domains. We also considered whether it would be useful to conduct analyses examining unique associations between r/s struggles measures and religiousness. However, because the open-ended codes are forced to be negatively correlated (i.e., participant descriptions mostly involved one type of struggle, making the data structure quasi-ipsative), partialling techniques are problematic (Meade 2004).

Meade, A. W. (2004). Psychometric problems and issues involved with creating and using ipsative measures for selection. Journal of Occupational and Organizational Psychology, 77(4), 531–551.
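As a sketch of the partial Spearman technique described above (the variable names and data here are hypothetical, not the study's): rank-transform all variables, residualize the two focal variables on the ranked control set, and correlate the residuals.

```python
import numpy as np
from scipy import stats

def partial_spearman(x, y, controls):
    """Partial Spearman correlation between x and y, controlling for
    the columns of `controls` (an n-by-k array).

    Implemented by rank-transforming every variable, regressing the
    ranked controls out of the ranked x and y, and taking the Pearson
    correlation of the residuals."""
    xr = stats.rankdata(x)
    yr = stats.rankdata(y)
    Z = np.column_stack([stats.rankdata(c) for c in controls.T])
    Z = np.column_stack([np.ones(len(xr)), Z])  # add intercept column
    bx, *_ = np.linalg.lstsq(Z, xr, rcond=None)
    by, *_ = np.linalg.lstsq(Z, yr, rcond=None)
    r, _ = stats.pearsonr(xr - Z @ bx, yr - Z @ by)
    return r

# Synthetic example: a code score related to a subscale, with
# unrelated "other subscales" partialled out
rng = np.random.default_rng(0)
divine = rng.normal(size=200)
code = divine + rng.normal(size=200)          # related to `divine`
other_subscales = rng.normal(size=(200, 4))   # unrelated controls
r = partial_spearman(divine, code, other_subscales)
```

Because the controls here are unrelated noise, the partial correlation stays close to the zero-order Spearman correlation; with correlated subscales it would shrink.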

  4. References are not sufficient and need to be supplemented. Descriptions of some sources (e.g., Hall, Todd W. 2014) are incomplete.

Response: Thanks for pointing this out. We have done our best to correct all omissions and errors in the reference list.

Reviewer 2 Report

This manuscript has many strengths:

  • The topic of r/s struggles is important, and appropriate for this journal.
  • The method of adding the qualitative approach to the pervasive and well-established quantitative approach is laudable. Most social scientists favor this multi-method combination, but few try to do it.
  • The results in Table 1 are rewarding: they elaborate on the factors of the RSS.
  • The overall approach is systematic.
  • The writing is clear and succinct.

Based on the observations above, this study should be published, after addressing the points below.

At the same time, I get the feeling that the authors are not familiar with qualitative methods, so they may have missed some simple practices.

METHODOLOGICAL ISSUES

Lack of cross-validation. It appears that all 976 participants were rated at the same time. If so, the findings in Table 1 are post hoc. A more seasoned qualitative researcher would divide them into three piles. In the first pile, of about 10 to 30 subjects, the raters train themselves and make all the rules of proceeding explicit. The second pile, about half of the remainder, follows that method and generates Table 1. Then, the rest, often called a “sealed sample,” is used to cross-validate the findings from the second pile. At the end, you combine the wisdom from all three piles. Without this, all you get is a post-hoc finding, more like a hypothesis, waiting for cross-validation.

Lack of clarity in inter-rater reliability. There are quite a few issues here. Let me just list some of them.

  • “The first author and the RAs developed a coding protocol collaboratively.” Ideally, the people who develop the coding protocol should not be the people who use the coding protocol, to avoid the issue of “tacit knowledge”—something that the first group knows and uses but does not spell out.
  • “RAs met with the first author (in one-on-one and joint meetings) over the course of the coding process to discuss questions regarding the application of codes.” How often did the RAs meet with the first author? Only at the end? Somewhere in the middle, to calibrate? All of these details, and more, need to be spelled out. It’s best to calibrate in small batches at the beginning so as not to waste data, and to keep clear written notes for each step of this calibration, because those are the crux of the method.
  • “Discrepancies across coders were discussed, but codes were not changed once assigned.” It is good to keep a record of the original coding, but why not change after discussion? If the discussion brings out something better, why no change? Of course, it is necessary to keep a record of the change—which rater changes on which item, in which direction, how often, etc., to see the dynamic of the rating (who has more influence, in which direction, and whether the direction is justified, etc.).

CONCEPTUAL ISSUES

Lack of depth in asking one question, without probes. You ask a good question: “please describe a religious/spiritual struggle…” A key tool of the qualitative method is the probe. Piaget is famous for his clinical interview, which Larry Kohlberg and others have followed. Closer to your method of writing on a topic is Jane Loevinger’s Sentence Completion Test, but she probes by using 36 stems, not one. What would you want to know about this r/s struggle? For example: How long has it been going on? Has it become more severe, or led to some growth, or both? And so on. You know better what to probe. Sorry about my poor suggestions here. In the qualitative studies that you have reviewed well, there was much probing, over a long time. You don't need to spend 2 hours per subject, but without probing, you get a thin slice of the experience. One great example of depth is the recent work of Christopher Kerr on dreams before death: he discovered that distressing dreams can liberate people and bring them to peace. In your review, you mentioned that there can be opposite outcomes for r/s struggles. Can you design a qualitative probe to reveal the different expressions of struggles that may end up in different outcomes (in addition to the factors you have pointed out, like supporting communities and so on)? And maybe different expressions and types of r/s struggles at different stages of development?

You use the RSS as blinders, instead of allowing the full range of discoveries possible with qualitative methods: “When applicable, individual themes were grouped according to one of the six RSS struggle domains.” It would be best to employ raters who do not know the six RSS factors to sort the responses, according to an established rule that is not based on the RSS factors. There can be 30 to 50 categories (my wild guess, based on your Table 1). Then you run inter-rater reliability on these categories first, not on the factors. Afterward, you can group these 30 to 50 categories into super-categories, and run the inter-rater reliability again. Also, employ people who are ignorant of your RSS factors. If the super-categories match the RSS factors, then you have convincing convergent validity of the factors. Instead, what I think you have done was to start pretty much with the RSS factors early on, depriving yourself of the chance to confirm these factors based on qualitative data. I guess that you do know this problem, because you acknowledge in section 4.4, Limitations, that “One salient bias of our team may be the preference to couch our themes within the RSS…” You can fix that, though it takes work: start over, with raw data, using research assistants who do not know the RSS factors.

You already acknowledge the issue with external validity: “The generalizability of qualitative and quantitative findings may be limited by our reliance on a sample of predominantly Christian undergraduates…” Would it be possible to collect a small sample, say, of 60 students, from St. John’s College in Chennai or another college in Tamil Nadu, South India, where the level of religiosity might differ from that in the US? Plus another sample of 60 retired people? Then mix all responses together, to enjoy the task of sorting (without knowing what population the responses come from), leading to a strong conclusion, instead of remaining with this limitation, which you acknowledge as huge, while the fix is relatively easy. Wouldn’t it be great to see the demonic factor popping out in some of these populations?

I have some other smaller questions:

In Table 5, would it be clearer to put the Code and the RSS in two columns, instead of stacking them?

In Table 4, I might be missing something really elementary here: shouldn’t bivariate correlations mirror across the diagonal?

Please allow me to discuss the first sentence of your abstract: “Religious and spiritual struggles are typically assessed by self-report scales using closed-ended items, yet nascent research suggests that using open-ended items may complement and advance assessment.” As I mention above, I admire your adding the qualitative method to the existing quantitative method. On the other hand, do you aim only to “advance assessment”? In Thomas Kuhn’s Structure, there are assessment (or methods), theories, data, and assumptions. You can use the qualitative method of open-ended items to advance either assessment or theories (or constructs). Larry Kohlberg’s Moral Judgment Interview (MJI) aims to advance the assessment of a clearly stated theory on stages of moral development. His main finding was that the theory was supported, with minor adjustments for a possible regression at stage 4, and he modified the theory to account for it. In short, the MJI mainly advances the assessment. My hunch is that you unintentionally went this way. In contrast, Jane Loevinger’s Washington University Sentence Completion Test (SCT) leaves much more room for discovery, deriving the categories mainly from data before grouping them together, either horizontally into similar themes, or vertically into expressions of the same category at a different level of complexity and development. You probably can discover much if you follow her approach. To her credit, she spells out every step of her method clearly, guarding against the tacit knowledge that I warned about.

I apologize for writing so long, and for probably misunderstanding you in various places. Please know that I deeply admire your work of adding the wonderful qualitative method. Plus, your writing is also much clearer than mine. Thanks for putting up with my clumsy English. I look forward to seeing this manuscript published, possibly after some improvements. If you can’t make the changes, gently pointing out what needs to be done next time is already an important service to the research community. Hopefully, we don’t scare people away from adding the qualitative method.

Author Response

Thank you for the review. Please see below how we addressed each of your comments.


Reviewer 2

This manuscript has many strengths:

  • The topic of r/s struggles is important, and appropriate for this journal.
  • The method of adding the qualitative approach to the pervasive and well-established quantitative approach is laudable. Most social scientists favor this multi-method combination, but few try to do it.
  • The results in Table 1 are rewarding: they elaborate on the factors of the RSS.
  • The overall approach is systematic.
  • The writing is clear and succinct.

Based on the observations above, this study should be published, after addressing the points below.

Response: Thank you for these positive and affirming comments. They are much appreciated.

At the same time, I get the feeling that the authors are not familiar with qualitative methods, so they may have missed some simple practices.

METHODOLOGICAL ISSUES

Lack of cross-validation. It appears that all 976 participants were rated at the same time. If so, the findings in Table 1 are post hoc. A more seasoned qualitative researcher would divide them into three piles. In the first pile, of about 10 to 30 subjects, the raters train themselves and make all the rules of proceeding explicit. The second pile, about half of the remainder, follows that method and generates Table 1. Then, the rest, often called a “sealed sample,” is used to cross-validate the findings from the second pile. At the end, you combine the wisdom from all three piles. Without this, all you get is a post-hoc finding, more like a hypothesis, waiting for cross-validation.

Response: Thank you for providing the information about a more formal procedure employed in qualitative research. We now acknowledge the lack of cross-validation as a limitation on p. 16:

“A limitation of our qualitative coding protocol is that it did not adhere to formal cross-validation procedures, such as using a small calibration sample, coding one half of the remainder of the sample, and then cross-validating results using the other half of the remaining sample. Our procedures leave open the possibility that some themes were not well-represented throughout the sample. Though we acknowledge the importance of cross-validation for hypothesis testing and theory generation, in this initial work we primarily aimed to explore and describe themes, and to generate a coding manual including a highly comprehensive database of themes.”

Lack of clarity in inter-rater reliability. There are quite a few issues here. Let me just list some of them.

  • “The first author and the RAs developed a coding protocol collaboratively.” Ideally, the people who develop the coding protocol should not be the people who use the coding protocol, to avoid the issue of “tacit knowledge”—something that the first group knows and uses but does not spell out.

Response: We acknowledge that the description of the collaborative coding protocol was somewhat vague and have provided more detail about the procedures on p. 6:

“In the first phase of our coding, two research assistants (RAs; the second and third authors, JTT and PJ) assigned a descriptive code to each response independently; RAs were instructed to simply describe the most salient r/s struggle themes, which fits well within the framework of qualitative description. Over the course of this initial coding, RAs discussed codes with the first author (JAW) in one-on-one and joint meetings, which occurred approximately once per week until this phase of coding was complete. In the initial meetings, relatively few participants (e.g., 30-50) were coded each week so that ample time was given for training coders and developing strategies and heuristics for coding. When coders became comfortable with the procedure, they coded relatively more participants (e.g., 100-200) per week.” 

“Following the completion of coding, the first author then reviewed descriptive codes and developed the thematic coding manual independently (for the full coding guidelines, see https://osf.io/a49gk/?view_only=5545d5beaba54e63b6c9253ccf84eb8a). The first author assigned a label to each theme and wrote a short description of the coding criteria for the theme.”

We believe that this protocol does not have an issue with tacit knowledge, as the first author wrote the coding manual independently. We hope that this description removes the ambiguity regarding the collaborative aspect of the coding protocol.

  • “RAs met with the first author (in one-on-one and joint meetings) over the course of the coding process to discuss questions regarding the application of codes.” How often did the RAs meet with the first author? Only at the end? Somewhere in the middle, to calibrate? All of these details, and more, need to be spelled out. It’s best to calibrate in small batches at the beginning so as not to waste data, and to keep clear written notes for each step of this calibration, because those are the crux of the method.

Response: Thank you for the suggestion to include information regarding the timing of meetings. We have added more detail to the descriptions of the qualitative and quantitative coding process to indicate the timing of meetings more precisely.

With regard to the qualitative coding, on p. 6, we state, “Over the course of this initial coding, RAs discussed codes with the first author (JAW) in one-on-one and joint meetings, which occurred approximately once per week until this phase of coding was complete. In the initial meetings, relatively few participants (e.g., 30-50) were coded each week so that ample time was given for training coders and developing strategies and heuristics for coding. When coders became comfortable with the procedure, they coded relatively more participants (e.g., 100-200) per week.” 

For the quantitative coding, we note on p. 7 that “RAs met with the first author (in one-on-one and joint meetings, occurring approximately once per week) over the course of this coding process to discuss questions regarding application of codes. Again, initial weeks involved coding relatively few participants (to train coders) as compared to latter weeks of the process.”

  • “Discrepancies across coders were discussed, but codes were not changed once assigned.” It is good to keep a record of the original coding, but why not change after discussion? If the discussion brings out something better, why no change? Of course, it is necessary to keep a record of the change—which rater changes on which item, in which direction, how often, etc., to see the dynamic of the rating (who has more influence, in which direction, and whether the direction is justified, etc.).

Response: We are aware that there are different stances about changing original codes depending on the method of coding and purpose of the research. We give our rationale for not changing the codes on p. 7:

“Discrepancies across coders were discussed, but codes were not changed once assigned due to concerns about artificially inflating intercoder reliability coefficients (O’Connor and Joffe 2020).”
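For context on why post-discussion changes can inflate agreement statistics, here is an illustrative sketch of one common intercoder agreement coefficient, Cohen's kappa (the coefficient and data used in the manuscript may differ; the code labels below are hypothetical):

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders assigning nominal codes.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is chance agreement derived from each coder's marginal
    code frequencies. Undefined if chance agreement is exactly 1.
    """
    assert len(codes_a) == len(codes_b) and codes_a
    n = len(codes_a)
    p_obs = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_exp = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Two coders, four responses, agreeing on 3 of 4
kappa = cohens_kappa(["divine", "moral", "doubt", "divine"],
                     ["divine", "moral", "doubt", "moral"])
```

If coders revise assignments after discussing disagreements, p_observed rises by construction, which is the inflation concern cited above.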

CONCEPTUAL ISSUES

Lack of depth in asking one question, without probes. You ask a good question: “please describe a religious/spiritual struggle…” A key tool of the qualitative method is the probe. Piaget is famous for his clinical interview, which Larry Kohlberg and others have followed. Closer to your method of writing on a topic is Jane Loevinger’s Sentence Completion Test, but she probes by using 36 stems, not one. What would you want to know about this r/s struggle? For example: How long has it been going on? Has it become more severe, or led to some growth, or both? And so on. You know better what to probe. Sorry about my poor suggestions here. In the qualitative studies that you have reviewed well, there was much probing, over a long time. You don't need to spend 2 hours per subject, but without probing, you get a thin slice of the experience. One great example of depth is the recent work of Christopher Kerr on dreams before death: he discovered that distressing dreams can liberate people and bring them to peace. In your review, you mentioned that there can be opposite outcomes for r/s struggles. Can you design a qualitative probe to reveal the different expressions of struggles that may end up in different outcomes (in addition to the factors you have pointed out, like supporting communities and so on)? And maybe different expressions and types of r/s struggles at different stages of development?

Response: Thank you for this information, the examples of probing, and your suggestions about what to probe. We agree that it would be extremely useful and interesting to study open-ended responses about r/s struggles in more depth, and these are extremely useful ideas for future work. However, it is unrealistic for us to use these types of strategies in the current manuscript: this kind of research goes well beyond the scope of the current study and would necessitate a great deal of time and effort in data collection, coding, and analyses. We discuss some ways to extend our work by prompting for more details about the r/s struggle on pp. 15-16:

“One potentially fruitful way to extend this work may be to enrich the descriptions of the different types of r/s struggles by subsequently prompting for details such as perceived precursors and consequences, as well as how people make meaning from the r/s struggle. Doing so may generate descriptions of r/s struggle phenomenology that would be amenable to more sophisticated qualitative coding techniques (Creswell and Poth 2016). Additionally, adding more to the open-ended probes may be needed to help participants zero in on spiritual struggles in greater detail in their responses, as opposed to phenomena that may be indirectly related to r/s struggles (e.g., bereavement, isolation, mental and emotional suffering).”

You use the RSS as blinders, instead of allowing the full range of discoveries possible with qualitative methods: “When applicable, individual themes were grouped according to one of the six RSS struggle domains.” It would be best to employ raters who do not know the six RSS factors to sort the responses, according to an established rule that is not based on the RSS factors. There can be 30 to 50 categories (my wild guess, based on your Table 1). Then you run inter-rater reliability on these categories first, not on the factors. Afterward, you can group these 30 to 50 categories into super-categories, and run the inter-rater reliability again. Also, employ people who are ignorant of your RSS factors. If the super-categories match the RSS factors, then you have convincing convergent validity of the factors. Instead, what I think you have done was to start pretty much with the RSS factors early on, depriving yourself of the chance to confirm these factors based on qualitative data. I guess that you do know this problem, because you acknowledge in section 4.4, Limitations, that “One salient bias of our team may be the preference to couch our themes within the RSS…” You can fix that, though it takes work: start over, with raw data, using research assistants who do not know the RSS factors.

Response: We appreciate the suggestion to start with a completely inductive approach to coding and see this as a fruitful strategy if this were a purely qualitative study. However, from the standpoint of conducting mixed methods research, grouping by RSS categories facilitated the aim of comparing quantitative indices derived from the qualitative codes with the RSS scales. We have added additional justification of this decision on pp. 6-7:

“When applicable, individual themes were grouped according to one of the six RSS struggle domains. For example, the theme of “concern about angering God” was grouped into divine struggle, whereas the theme of “meaninglessness of life” was grouped into ultimate meaning struggle. Though this step technically departed from pure qualitative description, it facilitated the mixed methods aim of deriving quantitative indices used for validation purposes. Indeed, one advantage of qualitative description is its ability to tie in with mixed methods research because it yields information useful for scale development (Neergaard et al. 2009). Furthermore, grouping individual codes according to RSS domains was informed by the quantitative and qualitative research described in the introduction indicating that the RSS domains provide relatively comprehensive coverage of r/s struggle domains.”

You already acknowledge the issue with external validity: “The generalizability of qualitative and quantitative findings may be limited by our reliance on a sample of predominantly Christian undergraduates…” Would it be possible to collect a small sample, say, of 60 students, from St. John’s College in Chennai or another college in Tamil Nadu, South India, where the level of religiosity might differ from that in the US? Plus another sample of 60 retired people? Then mix all responses together, to enjoy the task of sorting (without knowing what population the responses come from), leading to a strong conclusion, instead of remaining with this limitation, which you acknowledge as huge, while the fix is relatively easy. Wouldn’t it be great to see the demonic factor popping out in some of these populations?

Response: Yes, it would be interesting to see the result of a demonic factor in different populations, and thank you for the suggestions about different samples in which to examine generalizability. For this manuscript, additional cross-cultural data collection is not feasible for various reasons (e.g., time constraints), but we will keep these ideas in mind for future studies.

I have some other smaller questions:

In Table 5, would it be clearer to put the Code and the RSS in two columns, instead of stacking them?

Response: Thank you for this suggestion. We believe the table is clearer after this revision.

In Table 4, I might be missing something really elementary here: shouldn’t bivariate correlations mirror across the diagonal?

Response: Table 4 shows correlations between the different measures of r/s struggles (the 5 RSS subscales and 5 codes) rather than within either measure (only in the latter case would correlations mirror across the diagonal). We have edited the labels in Table 4 to increase the clarity of presentation.
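The point can be illustrated with synthetic data (the variables below are hypothetical, not the study's): a cross-correlation block between two different sets of measures pairs entry (i, j) with variables different from entry (j, i), so the block need not be symmetric.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 100
rss = rng.normal(size=(n, 2))  # hypothetical closed-ended subscale scores
# Hypothetical open-ended code scores, each partly driven by the subscales
codes = np.column_stack([
    0.6 * rss[:, 0] + rng.normal(size=n),
    0.6 * rss[:, 1] + 0.6 * rss[:, 0] + rng.normal(size=n),
])

# Cross-correlation block: rows index RSS subscales, columns index codes
cross = np.array([[stats.spearmanr(rss[:, i], codes[:, j])[0]
                   for j in range(2)]
                  for i in range(2)])
# cross[0, 1] = corr(subscale 1, code 2); cross[1, 0] = corr(subscale 2, code 1).
# Different variable pairs, so the block does not mirror across the diagonal.
```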

Please allow me to discuss the first sentence of your abstract: “Religious and spiritual struggles are typically assessed by self-report scales using closed-ended items, yet nascent research suggests that using open-ended items may complement and advance assessment.” As I mention above, I admire your adding the qualitative method to the existing quantitative method. On the other hand, do you aim only to “advance assessment”? In Thomas Kuhn’s Structure, there are assessment (or methods), theories, data, and assumptions. You can use the qualitative method of open-ended items to advance either assessment or theories (or constructs). Larry Kohlberg’s Moral Judgment Interview (MJI) aims to advance the assessment of a clearly stated theory on stages of moral development. His main finding was that the theory was supported, with minor adjustments for a possible regression at stage 4, and he modified the theory to account for it. In short, the MJI mainly advances the assessment. My hunch is that you unintentionally went this way. In contrast, Jane Loevinger’s Washington University Sentence Completion Test (SCT) leaves much more room for discovery, deriving the categories mainly from data before grouping them together, either horizontally into similar themes, or vertically into expressions of the same category at a different level of complexity and development. You probably can discover much if you follow her approach. To her credit, she spells out every step of her method clearly, guarding against the tacit knowledge that I warned about.

Response: Thank you for this explanation of different approaches. Though our approach was aimed at advancing assessment, we believe that our findings have the potential to advance theory and constructs as well, and we discuss these potential advances in several places (e.g., pp. 13-14 regarding struggle content outside of the RSS; p. 15 regarding the conceptual interpretation of divergent associations between religiousness and the different measures of r/s struggles). Therefore, we have modified the first sentence of the abstract to more accurately reflect that qualitative research may advance assessment and theories.

I apologize for writing so long, and for probably misunderstanding you in various places. Please know that I deeply admire your work of adding the wonderful qualitative method. Plus, your writing is also much clearer than mine. Thanks for putting up with my clumsy English. I look forward to seeing this manuscript published, possibly after some improvements. If you can’t make the changes, gently pointing out what needs to be done next time is already an important service to the research community. Hopefully, we don’t scare people away from adding the qualitative method.

Response: We appreciate your thoughtful and instructive review. We have tried to address your critiques as completely as possible given the constraints of the data and coding output that we have available, and we really appreciate the suggestions to help us advance our qualitative research methods in future projects.

Reviewer 3 Report

Designing and implementing rigorous mixed method research is challenging. Unfortunately, the article has significant methodological flaws and does not adhere to the APA Style Journal Article Reporting Standards for mixed methods research. See https://apastyle.apa.org/jars/mixed-methods and https://apastyle.apa.org/jars/mixed-table-1.pdf

For example, I appreciate that you reported the study aims in Lines 41-45, but the aims are not in alignment with JARS for mixed method research.

In addition, you did not explain why mixed method research was the appropriate methodology given the study’s goals.

Perhaps most concerning is the presentation of the qualitative design as it is undeveloped. The authors stated that they coded the open-ended responses using “a qualitative exploratory, descriptive design” (Line 227), yet research from multiple disciplines highlights the problems with this approach if the design is not explained in a transparent, detailed, and rigorous manner.

Furthermore, participants were asked to “please describe a religious/spiritual struggle that you have experienced over the past few months. If possible, try to choose the struggle that you see as most important or serious. But even if you focus on a smaller struggle, that is OK.” How does this question reflect the exploratory, descriptive design? Did participants complete the measures online or on paper surveys? How many words was the average response to the above prompt? The authors implied, beginning in Line 229, that they sought fit between the coding of the open-ended responses and the RSS struggle domains. By seeking fit, the RSS served as a hermeneutic for interpreting the open-ended responses, which defeats the purpose of uncovering r/s struggles otherwise not captured by the RSS. Moreover, with seemingly brief qualitative responses, the authors had little contextual information with which to understand participants’ experiences that did not seem to fit within the RSS. For example, one participant stated, “I have a bad relationship with one of my friends.” That was coded as “no religious/spiritual connotation,” yet the participant clearly identified it as a r/s problem in response to the prompt. Lack of context or understanding does not negate this as a r/s problem. This is the same dilemma Breuninger et al. (2019) faced in failing to accept veterans’ mental health struggles as r/s struggles.

Finally, there are ethical concerns regarding the fact that participants received partial course credit for participating. Were students offered a non-research alternative to completing the survey that involved a similar commitment of time and effort or were they penalized if they did not wish to participate? Was the study approved by the IRB at each institution? These questions highlight only some of the many details that are needed to evidence the rigor of a mixed method study.

Author Response

Thank you for the review. Please see below for how we addressed each of your comments.

######

Reviewer 3

Designing and implementing rigorous mixed method research is challenging. Unfortunately, the article has significant methodological flaws and does not adhere to the APA Style Journal Article Reporting Standards for mixed methods research (Mixed Methods Article Reporting Standards). See https://apastyle.apa.org/jars/mixed-methods and https://apastyle.apa.org/jars/mixed-table-1.pdf

For example, I appreciate that you reported the study aims in Lines 41-45, but the aims are not in alignment with JARS for mixed method research.

In addition, you did not explain why mixed method research was the appropriate methodology given the study’s goals.

Response: Thank you for this guidance about reporting standards. We have modified the manuscript in several ways to increase adherence to JARS for mixed methods research.

  1. We modified the title to reflect the mixed methods approach.
  2. We modified the abstract to reference the qualitative, quantitative, and mixed methods approaches in the current study.
  3. We clarified how the mixed methods design facilitated our qualitative, quantitative, and mixed methods research aims on p. 2

“The present project had three aims relevant to advancing the measurement and theory of r/s struggles within a mixed-methods design. First, we developed a qualitative coding system for open-ended descriptions of r/s struggles. Second, we derived quantitative scores from the qualitative codes and examined the convergent and discriminant validity of the emergent codes against the Religious and Spiritual Struggles Scale (RSS), which measures multiple domains of r/s struggles with closed-ended items (Exline et al. 2014). Third, we compared quantitative associations between open- and closed-ended assessments of r/s struggles to a measure of religiousness (Blaine and Crocker 1995).”

  4. We elaborated on the qualitative, quantitative, and mixed methods aims on p. 5.
  5. We included a Research Design Overview section on p. 5 that defines our mixed methods design and gives an overview of qualitative, quantitative, and mixed methods.
  6. In both the Methods and Results sections, we added subheadings to clearly indicate when methods or analyses were qualitative, quantitative, or mixed.

Perhaps most concerning is that the presentation of the qualitative design is undeveloped. The authors stated that they coded the open-ended responses using “a qualitative exploratory, descriptive design” (Line 227), yet research from multiple disciplines highlights the problems with this approach if the design is not explained in a transparent, detailed, and rigorous manner.

Response: Thank you for encouraging a more rigorous description of the qualitative methods, which we have included on pp. 6-7 (section 2.3.1 – section 2.3.1.2). In these sections, we (a) noted how our methods were highly similar to qualitative description methodology, (b) defined qualitative description methodology in more detail, (c) gave more details about our qualitative coding procedure, (d) distinguished qualitative coding from transforming qualitative data to a quantitative coding scheme, and (e) gave more details about our quantitative coding.

Furthermore, participants were asked to “please describe a religious/spiritual struggle that you have experienced over the past few months. If possible, try to choose the struggle that you see as most important or serious. But even if you focus on a smaller struggle, that is OK.” How does this question reflect the exploratory, descriptive design? Did participants complete the measures online or on paper surveys? How many words was the average response to the above statement? Beginning in Line 229, the authors implied that they sought fit between the coding of the open-ended responses and the RSS struggle domains. By seeking fit, the RSS served as a hermeneutic for interpreting the open-ended responses, which defeats the purpose of uncovering r/s struggles otherwise not captured by the RSS. Moreover, with seemingly brief qualitative responses, the authors had little contextual information with which to understand participants’ experiences that did not seem to fit within the RSS. For example, one participant stated, “I have a bad relationship with one of my friends.” That was coded as “no religious/spiritual connotation,” yet the participant clearly identified it as an r/s problem in response to the prompt. Lack of context or understanding does not negate this as an r/s problem. This is the same dilemma Breuninger et al. (2019) faced in failing to accept veterans’ mental health struggles as r/s struggles.

Response: Thank you for raising these important points.

We note how the question and coding reflect the qualitative description method on p. 6:

“Qualitative description stays close to the data and involves low inference relative to other qualitative methods. Data are simply described without attachment to any particular theory, though some interpretation may be required to translate text to meaning units (i.e., codes).

In the first phase of our coding, two research assistants (RAs; the second and third authors, JTT and PJ) assigned a descriptive code to each response independently; RAs were instructed to simply describe the most salient r/s struggle themes, which fits well within the framework of qualitative description.”

We provide more detail about how participants reported struggles and about their responses on p. 6:

“Participants typed responses into a text box. The mean response was 15.3 words (SD = 13.9); most participants wrote one or two sentences, whereas some participants wrote as little as a few words or as much as a few paragraphs.”

Though we acknowledge that our decision to use the RSS categories may introduce bias (see limitations on p. 16), we do not agree that doing so prevented us from uncovering content outside the RSS: for example, we did find that the “commitment” code fell outside of the RSS categories. Furthermore, many of the individual themes that could be grouped within domains covered by the RSS (e.g., divine, doubt) fell outside of the specific content included in the RSS (see pp. 13-14). We also think it is useful to reiterate our response to Reviewer 2, which justified the use of the RSS for our research purposes: from the standpoint of conducting mixed methods research, grouping by RSS categories facilitated the aim of comparing quantitative indices derived from the qualitative codes with the RSS scales (the text below appears on pp. 6-7).

“When applicable, individual themes were grouped according to one of the six RSS struggle domains. For example, the theme of “concern about angering God” was grouped into divine struggle, whereas the theme of “meaninglessness of life” was grouped into ultimate meaning struggle. Though this step technically departed from pure qualitative description, it facilitated the mixed methods aim of deriving quantitative indices used for validation purposes. Indeed, one advantage of qualitative description is its ability to tie in with mixed methods research because it yields information useful for scale development (Neergaard et al. 2009). Furthermore, grouping individual codes according to RSS domains was informed by the quantitative and qualitative research described in the introduction indicating that the RSS domains provide relatively comprehensive coverage of r/s struggle domains.”

Regarding the decision to categorize some descriptions as having no r/s connotation, on p. 16 we acknowledge that…

“Another potential bias in our protocol was the decision to code responses that did not explicitly include r/s content into non-r/s categories (e.g., suffering, loss). Because our prompt asked explicitly for r/s struggles, such responses may still have been perceived by participants as including r/s themes, which may have become evident upon further prompting.”

We stand by the decision to exclude these kinds of responses from r/s struggles categories because it is ambiguous as to whether they have an r/s connotation for the participant.

Finally, there are ethical concerns regarding the fact that participants received partial course credit for participating. Were students offered a non-research alternative to completing the survey that involved a similar commitment of time and effort or were they penalized if they did not wish to participate? Was the study approved by the IRB at each institution? These questions highlight only some of the many details that are needed to evidence the rigor of a mixed method study.

Response: We include these important details on p. 6:

“All methods for the study were approved by each university’s IRB. Participants received partial credit toward the class research participation requirement for participating; participants had the option to choose other research studies or complete a non-research option (e.g., writing a paper) to fulfill this requirement.”

Thank you for your helpful comments. We hope that the additional details provided have addressed your concerns.

 

Round 2

Reviewer 1 Report

I accept revised version of the article.

Author Response

Thank you for your earlier comments and for your endorsement of this article for publication.
