Rethinking the Subjective Units of Distress Scale: Validity and Clinical Utility of the SUDS

Mattera, Elizabeth; Zaboski, Brian

doi:10.3390/clinpract15070123

Open Access Review

by Elizabeth Mattera and Brian Zaboski *

Yale School of Medicine, Department of Psychiatry, Yale University, New Haven, CT 06510, USA

* Author to whom correspondence should be addressed.

Clin. Pract. 2025, 15(7), 123; https://doi.org/10.3390/clinpract15070123

Submission received: 21 April 2025 / Revised: 23 June 2025 / Accepted: 26 June 2025 / Published: 29 June 2025

Abstract

The Subjective Units of Distress Scale (SUDS) is a widely used self-report measure clinicians rely on during exposure and response prevention (ERP) to monitor progress, guide exposure pacing, and assess intervention efficacy. However, despite its ubiquity in clinical and research settings, foundational investigations of its psychometrics are often atheoretical, fail to evaluate its longitudinal properties, and lack a rigorous construct validation framework. This paper addresses these shortcomings by evaluating the SUDS as a measure of state negative affective intensity using the Strong Program of Construct Validation. Our evaluation demonstrates that the SUDS suffers from significant psychometric weaknesses, including construct underrepresentation, construct irrelevance, poorly defined measurement occasions, and structural limitations, challenging its validity as a precise measure of subjective distress. These limitations have crucial implications for clinical practice, potentially leading to misinterpretations of patient distress and compromising treatment decisions. We discuss these clinical implications, highlight them with a brief clinical vignette, outline a research roadmap for potential improvement using modern psychometric methods, and provide practical recommendations for clinicians currently using the SUDS. Given these validity concerns, caution is warranted when interpreting SUDS scores in both clinical and research contexts until its psychometric properties are more robustly established and understood.

Keywords: subjective units of distress scale; SUDS; validity; construct validation; psychometric properties; clinical assessment; affect measurement; exposure therapy

1. Introduction

Wolpe and Lazarus introduced the Subjective Units of Distress Scale (SUDS) in 1966 [1]. They illustrated its administration with the following instructions to a hypothetical patient:

Think of the worst anxiety you have ever experienced, or can imagine experiencing, and assign to this the number 100. Now think of the state of being absolutely calm and call this zero. Now you have a scale. On this scale how do you rate yourself at this moment? (p. 73).

Wolpe referred to each unit of measurement as a “sud” (p. 73), or a subjective unit of disturbance, with which patients communicated the severity of their distress throughout behavior therapy. In 1981, Wolpe and Wolpe [2] anchored the SUDS with the descriptive terms displayed in Table 1.

Consequently, the SUDS afforded practical advantages in clinical and research settings. First, it collected within- and between-session ratings in a dynamic therapeutic environment across disorders. In recent studies, the SUDS has been used at the start and end of sessions [3], at multiple timepoints throughout sessions, and immediately before and after exposures [4]. Second, it allowed clinicians to rank-order fears on a hierarchy, a graduated list of anxiety-provoking situations that frequently guided exposure-based treatment [5]. Third, it helped prevent exposure-based treatment from becoming too overwhelming. For instance, if a patient reported a SUD for one exposure as a 20 but the next as an 80, a clinician could develop intermediate challenges that progressively intensified the experience [6]. Thus, SUDS scores directly influenced critical clinical decisions, such as determining the pace of exposure therapy, assessing treatment readiness, judging session effectiveness, and even defining treatment success (e.g., the common 50% reduction heuristic for habituation) [7].

With a convenient measure applicable to clinical and research settings, the SUDS was swiftly integrated into anxiety research throughout the 1960s and 1970s. Pioneers in anxiety research found that within- and between-session decreases in the SUDS related to habituation during the treatment of OCD and agoraphobia [8], while dissertations using the SUDS showed higher distress for teachers exposed to unruly classroom behaviors than for teachers who were not [9]. Wolpe himself argued for the SUDS as both a clinical tool [6] and a research tool, which he used widely in his own seminal investigations in exposure work [10].

The SUDS continues to be used in modern psychology, primarily for measuring state negative affective intensity within and between therapy sessions [11,12] and as a measure of subjective fear, anxiety, and discomfort [13,14] across various settings [15,16]. While other scales are often interminable, costly, and measure trait characteristics (vs. states), the SUDS provides a method of measuring unstable constructs like distress and anxiety in real time [17]. In several training and practitioner-friendly resources, the SUDS continues to be the de facto method of generating modern fear hierarchies [7,18,19]. Indeed, the SUDS, along with a 50% reduction cutoff, is often used to signify that habituation to a feared stimulus has occurred [20,21]. Clinicians rely on these SUDS ratings moment-to-moment to gauge patient tolerance, decide whether to continue, intensify, or cease an exposure task, and to collaboratively build fear hierarchies that form the backbone of treatment plans [20]. The SUDS has also played an important role in basic and applied research clarifying exposure therapy’s mechanistic underpinnings [22]. For example, in a study concluding that habituation is a separate phenomenon from extinction learning, SUDS ratings of conditioned stimuli were one of the measures [23]. Because it is such a popular and ubiquitous measure, a careful examination of its construct validity is warranted.

In clinical practice, accurate measurement of patient distress is fundamental to treatment planning, progress monitoring, and outcome evaluation. The SUDS has been widely adopted in exposure-based treatments across anxiety disorders, PTSD, and OCD, where clinicians rely on it to calibrate exposure intensity, determine habituation, and make moment-to-moment treatment decisions. However, as Messick (1995) [24] and Cronbach and Meehl (1955) [25] established, meaningful clinical application requires construct clarity before empirical utility can be established. While the SUDS has been a subject of various investigations over the decades, a comprehensive assessment through a rigorous, modern construct validation lens has been less common, which is the gap our paper seeks to address. Thus, our narrative and theoretical critique through the rigorous Strong Program of Construct Validation [26] addresses a critical gap affecting everyday clinical practice—namely, that clinicians using the SUDS to guide treatment decisions may be measuring qualitatively different phenomena across patients, sessions, or even within the same exposure exercise. This theoretical ambiguity undermines treatment standardization, complicates clinical training, and potentially compromises patient outcomes. Therefore, the primary objective of this paper is to critically reevaluate the construct validity of the SUDS using the rigorous framework of the Strong Program of Construct Validation. We will argue that this examination reveals significant foundational weaknesses that have profound implications for its use and for future research directions. Our goal is to provide the theoretical groundwork necessary for improving this widely used measure or developing sound alternatives. For convenience, a glossary of terms is provided.

2. Validity Studies

Construct validation, the “integrative and evaluative judgement” that “supports the adequacy and appropriateness of inferences and actions based on test scores,” is a necessary step for establishing the interpretation of scores [27] (p. 13). As it relates to the SUDS, three important studies examined construct validity directly. Table 2 summarizes the key findings, interpretations, and our critiques of the seminal studies discussed.

The earliest attempt explored the relationship between the SUDS, digit temperature, and heart rate in twenty college students [28]. Participants watched a 3-minute video of a venous cutdown procedure, during which SUDS scores were obtained at 30-second intervals. Average heart rate and left-hand and right-hand temperature correlated significantly with SUDS scores. The authors concluded that peripheral vasoconstriction is more sensitive to subjective anxiety than heart rate, and that the results support the continued use of the SUDS in clinical and research settings.

Nevertheless, this conclusion overlooks the possibility that the SUDS was measuring other constructs: participants may have reported anxiety, disgust, or general autonomic arousal after watching the video of the venous cutdown procedure. The authors reported that the correlations between left-hand and right-hand temperature and the SUDS were in the predicted direction. But in the absence of theory, their prediction could still have been confirmed if they had found a statistically significant inverse correlation between digit temperature and anxiety (especially since at the time of publication the direction of the relationship was not yet established). Their belief in the SUDS and its predictive ability is plainly stated: “Self-report data such as the subjective anxiety scale are frequently the primary outcome measure of interest in clinical behavior therapy and its usefulness is not dependent on concurrent response parameters” (p. 6).

In another attempt to study the SUDS, Kim et al. [29] analyzed SUDS scores from 61 patients treated with eye movement desensitization and reprocessing (EMDR) at a trauma clinic. At baseline, the authors stated that the SUDS correlated with the Beck Depression Inventory (r = 0.28, p < 0.05; [31]) and the State Anxiety Scale (r = 0.31, p < 0.05), which they argued was evidence for convergent validity. They interpreted a non-significant correlation with the Trait Anxiety Scale (r = 0.21, p > 0.05; [32]) as evidence for discriminant validity. They further claimed there were no correlations between the SUDS and age (r = −0.23, p > 0.05), education (r = −0.016, p > 0.05), or income (r = 0.12, p > 0.05), which they also interpreted as discriminant validity. In terms of predictive validity, they added that the SUDS at the end of the first session was significantly and positively correlated with the Clinical Global Impression of Change (r = 0.32; CGI-C; [33]). The SUDS also correlated with CGI-C at the end of the second (r = 0.51) and third sessions (r = 0.61). The authors found significant, positive correlations between the SUDS and the Symptom Checklist-90-Revised’s Positive Symptom Distress Index (SCL-90 PSDI [34]) and the Impact of Event Scale-Revised (IES-R [35]), both interpreted as evidence of concurrent validity (rs = 0.50 and 0.46, respectively).

However, many validity interpretations were based on whether a parameter estimate was statistically significant. Unfortunately, this use of p-values is problematic, as it relies on dichotomous decisions that overlook the strength and theoretical importance of the correlations themselves [36]. For instance, the authors concluded that the SUDS did not correlate with the Trait Anxiety Scale because the p-value was not significant, which they interpreted as evidence of discriminant validity. In fact, the size of the correlation with the Trait Anxiety Scale (r = 0.21) was akin to that with the Beck Depression Inventory (r = 0.28), which they did interpret as evidence for convergent validity. Because p-values offer limited information about the actual magnitude of an effect, determining convergent or discriminant validity primarily from p-value thresholds, without adequate consideration of these comparable correlation strengths, is inconsistent. While laudable, this investigation does not give clinicians a strong foundation for trusting SUDS scores as specific indicators of state distress rather than of related constructs like depression or stable anxiety traits. It highlights the risk of relying on assessment approaches not grounded in robust, theoretically coherent validation evidence, a necessary step for distinguishing evidence-based practice from potentially unsubstantiated claims [37].
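To make the magnitude argument concrete, the following minimal Python sketch computes approximate 95% confidence intervals and two-sided p-values for the two correlations using the Fisher z-transformation. It rests on assumptions not stated in the source: that all 61 patients contributed to both correlations, and that a normal approximation is adequate. The function name `fisher_summary` is illustrative and does not come from Kim et al. [29].

```python
import math
from statistics import NormalDist


def fisher_summary(r, n, level=0.95):
    """Approximate CI and two-sided p-value for a Pearson r via Fisher's z."""
    z = math.atanh(r)                 # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.tanh(z - crit * se)  # back-transform to the r scale
    upper = math.tanh(z + crit * se)
    p = 2 * (1 - NormalDist().cdf(abs(z) / se))
    return lower, upper, p


N = 61  # assumption: every patient contributed to both correlations

trait_lo, trait_hi, trait_p = fisher_summary(0.21, N)  # Trait Anxiety Scale
bdi_lo, bdi_hi, bdi_p = fisher_summary(0.28, N)        # Beck Depression Inventory

print(f"Trait anxiety: r = 0.21, 95% CI [{trait_lo:.2f}, {trait_hi:.2f}], p = {trait_p:.3f}")
print(f"BDI:           r = 0.28, 95% CI [{bdi_lo:.2f}, {bdi_hi:.2f}], p = {bdi_p:.3f}")
```

Under these assumptions the two intervals overlap substantially even though only one p-value falls below 0.05, so the data give little basis for labeling one correlation “convergent” and the other “discriminant.” Because both correlations come from the same sample, a formal test of their difference would also require the correlation between the two criterion measures, which is not reported here.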

In the absence of a theoretical and empirical understanding of the SUDS, its measurement limitations, and the construct it purports to measure, one cannot interpret positive correlations between the SUDS and measures of anxiety and depression as convergent validity. Again relying on p-values, Kim et al. [29] claimed that because the correlations with age, education, and income were not significant, they were evidence of discriminant validity. Considering predictive validity, the authors noted that SUDS scores following the first session predicted SUDS scores during all subsequent sessions. Nevertheless, these results may all have been artifacts of their last observation carried forward (LOCF) method, which typically involves copying a participant’s most recent observed value forward in a dataset to fill in missing data. This approach is problematic enough that researchers and statisticians have been skeptical of it for some time [38]; hence, several recommendations have been put forth for more sophisticated analyses and designs [39,40].
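The LOCF concern can be illustrated with a short Python sketch. The ratings below are invented for illustration and are not data from Kim et al. [29]; the helper names `locf` and `pearson_r` are likewise hypothetical. The sketch shows how carrying a dropout’s first rating forward can, by construction, create a positive correlation between early and late sessions.

```python
import math


def locf(scores):
    """Fill missing values (None) with the most recent observed value."""
    filled, last = [], None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


def pearson_r(x, y):
    """Pearson correlation for two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical SUDS ratings at sessions 1-3; None marks a missed session.
patients = [
    [70, 45, 20], [60, 50, 40], [80, 55, 30], [50, 40, 35],  # completers
    [90, None, None], [30, None, None],                      # dropouts
    [85, None, None], [20, None, None],                      # dropouts
]

completers = [p for p in patients if None not in p]
r_completers = pearson_r([p[0] for p in completers], [p[2] for p in completers])

imputed = [locf(p) for p in patients]
r_locf = pearson_r([p[0] for p in imputed], [p[2] for p in imputed])

print(f"Session 1 vs. session 3, completers only: r = {r_completers:.2f}")
print(f"Session 1 vs. session 3, after LOCF:      r = {r_locf:.2f}")
```

In this invented dataset the correlation is negative among completers but clearly positive after imputation, purely because each dropout’s first rating is copied into the later sessions. Whether anything similar occurred in the original study cannot be determined from the information summarized here.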

To expand the scope of the SUDS’s clinical utility, Tanner (2012) [30] investigated emotional and physical discomfort. They used the Minnesota Multiphasic Personality Inventory-2 (MMPI-2 [41]) and Global Assessment of Functioning (GAF [42]) with the SUDS to track 182 patients in a hospital setting. They found that emotional SUDS scores were moderately related to clinician GAF ratings (r = −0.44), Scale A of the MMPI-2 (r = 0.35), and the sum of scales 1–3 (r = 0.37), and decreased significantly after 3 months of treatment. Physical SUDS scores did not decrease significantly after treatment. The researchers concluded that “the data provide several pieces of evidence regarding the validity and sensitivity of global SUDS ratings” (p. 33) and that global SUDS ratings are a useful extension of the traditional SUDS scale.

Nevertheless, the foundation for Tanner’s (2012) [30] argument was the work done by Thyer et al. [28] and Kim et al. [29], and they supplied several papers that have found correlations between the SUDS and other constructs. What do we make of the correlations between the SUDS and so many constructs, which have continued to be demonstrated in numerous papers since then? First, we acknowledge that the SUDS is associated with measurements of stress and anxiety, including heart rate and heart rate variability [43]. It correlates with several constructs and instruments, including willingness to engage in treatment (r = −0.12; [44]), Screen for Anxiety and Related Disorders—Youth and Parent reports [45], the externalizing score on the parent report for the Strength and Difficulties Questionnaire [46], and the Global Distress Tolerance Scale [47,48]. Moreover, some research shows that the SUDS predicts between-session exposure therapy treatment outcomes [49], and that changes in SUDS scores can predict changes in outcome variables such as OCD severity, functional impairment, and clinician-rated improvement [50].

Validity evidence requires more than just a collection of correlations—it also requires theoretical justification [25,51]. Thus, a deeper examination of its construct validity is overdue. To systematically evaluate these psychometric concerns and provide a clearer path forward, we use the Strong Program of Construct Validation [26,52]. This is a rigorous, theory-driven framework that helps researchers assess whether a measure, like the SUDS, truly captures the concept it intends to measure by examining different types of evidence in a structured way. We will explore this program and then apply its components—Substantive, Structural, and External—to the SUDS. This framework of construct validation has been used extensively for several decades and is recommended by researchers and modern factor analytic texts [52,53]. Its primary purpose is to help researchers determine if a measure belongs within a nomological network, or a related network of constructs and variables [25]. So it is within this framework that one can evaluate whether enough research and theoretical work suggest that the SUDS is a suitable measure of distress. Our assessment is followed by recommendations for research and clinical practice.

3. Strong and Weak Construct Validation

The Weak Program of construct validation is an exploratory endeavor that captures the correlational relationship between the focal construct and other constructs with little regard for theory [51]. Research falls into the Weak Program when it attempts to establish external validity (i.e., relationships with other variables) without first adequately addressing the substantive aspects of construct validation, such as clearly defining the theoretical construct, ensuring the measure comprehensively represents it, and minimizing construct-irrelevant variance. Because the Weak Program relies on exploratory (rather than confirmatory) empirical research, often bypassing these crucial initial theoretical steps, its atheoretical and unsystematic approach provides less convincing support for a construct. This can lead to a collection of correlations that, while perhaps statistically significant, offer little clarity on what the measure truly assesses or how it functions within a coherent theoretical system.

To address the limitations of the Weak Program of construct validation, the Strong Program is a theory-driven framework derived from Loevinger (1957) [54] and Nunnally (1978) [55]. It integrates the six categories of construct validation from Messick’s (1995) [24] unified concept of validity and includes three components: Substantive, Structural, and External. A variety of evidence is required at all stages, which build upon each other. If one stage lacks evidence, it must be reevaluated. Figure 1 displays these components as building blocks.

4. Substantive Component

Within the Substantive Component, the theoretical and empirical domains of a construct are identified. A construct’s theoretical domain consists of all that is known about the construct in the literature and is supplemented by the researcher’s observations [26]. Ideally, the theoretical domain corresponds to a construct’s empirical domain, or all ways that the construct can be operationalized (e.g., brain scans, electrophysiological recordings, behavioral measures, self-reports, etc.). Two threats to construct validity in this stage include construct underrepresentation and construct irrelevance [24]. Construct underrepresentation occurs when measures do not sufficiently account for and represent the theoretical domain of the focal construct [56]. Construct irrelevance occurs when an assessment captures variability unrelated to the construct’s theoretical domain.

A clinician observing a high SUDS score cannot be certain what specific aspect of the patient’s experience it reflects (e.g., fear of the stimulus, general overwhelm, frustration), potentially leading to misinterpretations of the patient’s state and misguided clinical interventions. This is highlighted by applying the Strong Program to “distress.” Without explicitly defining the domain (distress vs. anxiety), it is unclear what the SUDS measures. For instance, patients are supposed to report the “worst anxiety” [1] (p. 66) they ever experienced. But the measure is called a subjective unit of disturbance scale, and its use was also encouraged for domains such as “rejection,” “guilt,” “or others like them” (p. 67). This introduced many constructs (anxiety, rejection, guilt, or others) into the theoretical domain, ostensibly under the umbrella term “disturbance.” Yet, without an operational definition of disturbance and how that relates to anxiety, the scale risks measuring multiple domains simultaneously and introducing construct-irrelevant variance. Even if “distress” is considered broadly as a form of state negative affect, the SUDS fails to adequately define its precise scope within this larger dimension. It does not clarify which specific facets of negative affect are being targeted (e.g., fear, sadness, anger, general unease), nor how it differentiates a global sense of negative feeling from more specific emotional experiences. This lack of specificity makes it difficult to ascertain if the SUDS is truly capturing a coherent aspect of negative affect or simply an amalgamation of various undifferentiated negative feelings, further contributing to construct irrelevance. For instance, someone with both anxiety and depression may have a high SUDS score because of depression, anxiety, or other reasons. By folding multiple constructs into a single rating, the SUDS risks conflating the intensity of different affective responses.

The SUDS also fails to define its measurement period, obscuring the construct it purports to measure. Inconsistent prompting (rate your distress now vs. rate your peak distress during that exposure) across or within sessions can yield incomparable data, undermining the clinician’s ability to accurately track change or therapeutic processes like habituation. For instance, in an investigation of public speaking anxiety, the SUDS was a measure of distress throughout the course of a speech [57]. In another study measuring eating disorder treatment, the SUDS evaluated distress prior to and after eating a feared food, during within-session and between-session habituation [58]. Different domains measured on different time scales are subject to their own theoretical and empirical examinations. Each of these, as a measure of state negative affective intensity, requires a theoretical and empirical understanding.

5. Structural Component

When enough evidence is collected in the Substantive Stage, a construct should be evaluated in the Structural Stage. Here the relationship between the observed variables and the construct of interest is assessed [26]. Constructs assessed by single-item measures are best assessed by their sensitivity to within-person, longitudinal variability [59,60]. In other words, fluctuations in one person’s scores may be a better measure of their affect than comparing their scores to others’ scores. Further, measures of affect must be considered in the context of an individual’s traits [61]. For instance, while significant fluctuations in affect scores may be unusual for one person, they may be expected in an individual with greater emotional lability. More concretely, someone with OCD may report consistently high anxiety (low within-person variability), but someone with borderline personality disorder may experience a wider range of emotional fluctuation (greater within-person variability).
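As a rough illustration of within-person variability, the Python sketch below summarizes two invented rating series, one consistently high and one labile. The data are hypothetical, and the mean squared successive difference (MSSD) is only one of several instability indices that could be used; it is offered here as an assumption, not as a method prescribed by the sources cited above.

```python
from statistics import mean, stdev


def mssd(series):
    """Mean squared successive difference, an index of moment-to-moment instability."""
    diffs = [(b - a) ** 2 for a, b in zip(series, series[1:])]
    return mean(diffs)


# Hypothetical 0-100 ratings collected at eight consecutive prompts.
stable_high = [82, 85, 80, 84, 83, 81, 85, 82]  # consistently high
labile = [30, 90, 45, 95, 20, 85, 40, 75]       # large swings

for label, series in [("stable high", stable_high), ("labile", labile)]:
    print(f"{label:>11}: mean = {mean(series):.1f}, "
          f"SD = {stdev(series):.1f}, MSSD = {mssd(series):.1f}")
```

A single rating of 85 appears in both series and would look identical in isolation, whereas the person-level SD and MSSD separate the two patterns. This is the sense in which a time series of ratings may carry information that one number cannot.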

The SUDS falls short in the Structural Component because it lacks clear theoretical and empirical operational definitions. For instance, the within-person variability of distress as measured by the SUDS may differ depending on whether someone experiences anxiety, “not just right” sensations, or hopelessness. Relying on a single SUDS number may cause clinicians to miss crucial patterns or sources of distress variability (e.g., lability vs. stable high anxiety), hindering a nuanced understanding of the patient’s affective state and response to treatment. Also, the construct-irrelevant variance identified in the Substantive Component carries through to the Structural Component. For example, a score after an exposure therapy session may be contaminated by the many different constructs the SUDS is measuring. Consequently, the extent and the source of within-person variability remain unexplained.

6. External Component

In the External Component, an individual’s time series of momentary affective scores would be compared to trait measures [61]. For instance, we may hypothesize that average longitudinal scores of anxiety and the degree of variation in reported anxiety over time would correlate with Beck Anxiety Inventory scores (BAI [62]). We might also hypothesize that a focal construct should change in the context of group differences [63]. For instance, a researcher might randomize children to either a treatment or control condition and provide cognitive–behavioral therapy to the treatment group, hypothesizing that therapy will reduce scores on the SUDS in that group. Similarly, a researcher could use knowledge of the construct to predict how different populations of individuals (OCD vs. non-OCD) might rate the SUDS in the presence of certain stimuli.

However, because the SUDS’ focal construct is not theoretically or empirically defined in the Substantive Stage—leaving its evidence from the Structural Stage undetermined—evidence from the External Stage is difficult to interpret. Without this theoretical and empirical knowledge, one cannot know, for example, how to make sense of the relationships described by Kim et al. [29], who found discriminant validity on the grounds of a non-significant correlation between the SUDS and Trait Anxiety Scale, and then found convergent validity when the SUDS correlated with the BDI. More broadly, without a clear understanding of what the SUDS measures, clinicians cannot confidently use it to predict treatment response, differentiate patient groups, or meaningfully relate subjective distress to other clinical outcomes. For this reason, Figure 1 suggests that conceptual work be completed in earlier stages in the Strong Program.

7. Discussion

Since its development, the SUDS has been a cornerstone of research and clinical practice, providing a quick and convenient method of measuring subjective anxiety and distress [64]. To evaluate the utility of the SUDS as a measure of distress, we investigated its development and applied the Strong Program of Construct Validation, a framework that outlines three cumulative and recursive steps for establishing evidence of construct validity: Substantive, Structural, and External.

A brief clinical vignette can illustrate how a failure at the Substantive Stage—specifically, a lack of clear construct definition for “distress”—can manifest in practice: A patient with OCD undergoing ERP for contamination fears consistently reports a SUDS score of 80/100 during exposures involving touching a contaminated object. The clinician, following standard practice, waits for the SUDS score to decrease by 50% before ending the exposure, but the patient’s rating remains high. Based solely on the SUDS, the clinician might conclude the exposure is ineffective or too difficult. However, upon further questioning prompted by the lack of observable avoidance reduction, the patient reveals the high rating reflects intense frustration (“I should be over this!”), hopelessness (“This is never going to work”), and even their anxiety about their progress and engagement in ERP (“Am I doing this right? What if I’m untreatable?”), rather than acute anxiety about contamination itself. The SUDS score, lacking clear construct definition (Substantive failure), provided misleading information. Relying on it obscured other clinically relevant affective states and potentially stalled effective treatment progression, underscoring the critical importance of precise differential assessment when interpreting subjective reports in clinical practice [65].

With an understanding of where the SUDS stands in terms of the Strong Program, its place in modern assessment, as well as how to improve it, becomes clearer. The SUDS struggled to recover from the limitations in its development. For example, Wolpe and Lazarus [1] approached depression and anxiety from a combined theoretical framework in their formulation of the SUDS. They asserted that “most neurotic depression is the product of severe anxiety arousal” (p. 28), potentially illuminating why the SUDS measures so many constructs. At the same time, little was known about how to measure constructs adequately over longitudinal periods [66]. Nevertheless, the process of validation is ongoing [56], and the SUDS should still meet modern measurement standards for clinical and research decisions.

8. Clinical Implications

Although the practical appeal of the SUDS in demanding clinical settings—offering a seemingly rapid, quantifiable snapshot of patient distress—is undeniable, the significant psychometric limitations regarding its construct validity necessitate considerable caution from clinicians relying on it. Instead of accepting the SUDS score at face value, it is advisable to supplement it with other data sources. Clinicians should integrate direct behavioral observations, such as latency to engage in exposure, task duration, avoidance attempts, or non-verbal signs of distress, alongside the SUDS rating. Crucially, employing qualitative inquiry can contextualize the number; asking clarifying questions like “What specific feelings or thoughts are contributing to that rating right now?” may reveal whether the score reflects the targeted anxiety, or perhaps frustration, hopelessness, or physical discomfort, thus preventing potential misinterpretations.

Furthermore, clinicians could attempt to improve specificity by explicitly defining the construct during administration (e.g., “Rate your fear of contamination from 0–100 right now”) rather than using ambiguous terms like “distress” or “disturbance.” Adopting more structured anchoring procedures, as exemplified in some treatment protocols like Prolonged Exposure for PTSD, where therapists help clients generate specific, personalized examples for various SUDS levels (e.g., 0, 25, 50, 75, 100) that remain consistent throughout therapy [67], might also enhance clarity and consistency. While validated single-item measures for specific momentary affective states are being developed [52,53], their psychometric robustness for guiding real-time clinical decisions within therapy sessions requires further dedicated research. For evaluating change across sessions or treatment phases, clinicians should continue to prioritize validated multi-item questionnaires designed to assess specific, relevant constructs beyond momentary distress, such as quality of life or symptom severity [68]. The overarching goal must be to base clinical judgments and interventions on measures with demonstrated reliability and validity for the specific inferences being made, a standard the SUDS, in its current form, struggles to meet.

Given the highlighted psychometric limitations of the SUDS, particularly concerning its construct clarity and potential for misinterpretation, clinicians may seek alternatives for assessing and guiding treatment. Better-validated, often multi-item, questionnaires can be used to track between-session, within-session, and moment-to-moment improvement. For instance, administration of the PTSD Checklist for DSM-5 [69], prior to exposure therapy for trauma and after treatment, could offer a more interpretable measure of change between sessions. Likewise, the Beck Anxiety Inventory [62] can be used to measure improvements in anxiety, and the Yale–Brown Obsessive–Compulsive Scale [46] can be used to measure change in OCD severity during treatment. These can be used with shorter scales to investigate within-session fluctuations in anxiety, fearfulness, or unease, like the Patient-Reported Outcomes Measurement Information System [70], which now includes computer-adaptive administration. Lastly, these can be combined with moment-to-moment measures that have stronger theoretical support based on the clinical need. For instance, because expectation violation is crucial during modern exposure-based practice [71,72], a 100-point expectancy rating scale (from 0, “will not occur,” to 100, “definitely will occur”) can be used with clients to measure the discrepancy in a client’s prediction between exposures [73]. Utilizing such measures aligns with the foundational principle of construct validation: ensuring that clinical decisions are based on assessments that reliably and validly measure the intended psychological attribute of interest.

9. Improving the SUDS

While this evaluation highlights significant concerns regarding the SUDS’ psychometric soundness, it also underscores the genuine need clinicians have for brief, real-time measures of subjective distress during therapeutic interventions like exposure therapy. To bridge this gap and provide clinicians with more trustworthy tools, a dedicated research roadmap is necessary. First, foundational work must revisit the Substantive Component by rigorously defining the specific construct(s) intended for measurement. Rather than relying on the ambiguous umbrella terms “distress” or “disturbance,” research employing qualitative methods, such as interviewing patients and clinicians about their moment-to-moment experiences during therapy, could identify and operationalize the most salient affective states (e.g., fear intensity, anxiety, disgust, hopelessness) relevant to specific therapeutic contexts. This clarification is a prerequisite for developing or refining any measurement tool [74].

After building a stronger theoretical base, the measurement properties of the SUDS warrant empirical scrutiny. How do patients interpret the 0–100 range and the provided anchors (Table 1)? Research could investigate whether these generalized anchors are as effective as the more personalized, multi-point anchoring systems used in specific protocols [67]. Exploring the impact of such detailed anchoring on the scale’s consistency and perceived meaning for patients is a worthwhile avenue. Cognitive interviewing studies exploring patients’ thought processes as they generate a SUDS rating, or potentially advanced psychometric analyses like item response theory, could shed light on whether the scale yields meaningful quantitative information [75,76,77]. Furthermore, future validation efforts must move beyond simple atheoretical correlations. Employing intensive longitudinal designs, such as ecological momentary assessment (EMA) within or across sessions, would allow researchers to examine the SUDS’ sensitivity to change and its relationship with specific therapeutic events and validated measures of distinct emotional states over time. Utilizing a multitrait–multimethod approach within these studies would be crucial for systematically establishing the convergent and discriminant validity that has thus far been lacking.
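One simple first look at repeated ratings of this kind is to ask how much of their variability lies within persons (change over occasions) rather than between persons (stable differences in level). The sketch below is a plain sum-of-squares decomposition in Python, not a multilevel model or a formal reliability estimate; the function name and the ratings are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean
from typing import Dict, List


def decompose_variance(ratings_by_person: Dict[str, List[float]]) -> Dict[str, float]:
    """Split total variability in repeated ratings into between- and within-person shares."""
    all_ratings = [r for ratings in ratings_by_person.values() for r in ratings]
    grand_mean = mean(all_ratings)

    # Between-person: how far each person's own mean sits from the grand mean.
    between_ss = sum(
        len(ratings) * (mean(ratings) - grand_mean) ** 2
        for ratings in ratings_by_person.values()
    )
    # Within-person: how far each rating sits from that person's own mean.
    within_ss = sum(
        (r - mean(ratings)) ** 2
        for ratings in ratings_by_person.values()
        for r in ratings
    )
    total_ss = between_ss + within_ss
    if total_ss == 0:
        raise ValueError("all ratings are identical; nothing to decompose")
    return {
        "between_share": between_ss / total_ss,
        "within_share": within_ss / total_ss,
    }


# Hypothetical 0-100 ratings from two clients across four occasions (not real data).
example = {
    "client_a": [80, 70, 55, 40],
    "client_b": [60, 60, 50, 45],
}
print(decompose_variance(example))
```

A large within-person share would suggest that ratings move across occasions, which is a precondition for sensitivity to change; it does not by itself show that the movement reflects the intended construct.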

EMA research using the Strong Program of Construct Validation is already underway for negative affect that falls under the umbrella of distress [52]. For example, using specific reports of affective experience, Cloos and colleagues [52] treated momentary affect as a set of individual affective experiences on a dimensional spectrum ranging from positive to negative affect. This underscores that distress is broad and that measures of negative affect cannot be created on the spot. Thus, we recommend that clinicians and researchers instead operationalize the construct in which they are interested (fear? hopelessness? expectation violation?) and select a validated measure for research and clinical care. Another recommendation is to quantify construct validity through meta-analysis [78]. In attempting to validate a construct, one attempts to place a measure within a nomological network in which it relates to other variables [25]. A meta-analysis can put one’s hypotheses about the SUDS to the test.
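To make the idea of quantifying construct validity more concrete, the following Python sketch computes one index of the kind described in [78], as we understand it: the correlation between the pattern of correlations a theory predicts and the pattern actually observed (after a Fisher z transformation). The observed values are the SUDS correlations reported for Kim et al. in Table 2; the predicted values are hypothetical placeholders chosen only for illustration, not theory-derived predictions.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def r_alerting_cv(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Correlate predicted correlations with Fisher-z-transformed observed ones."""
    observed_z = [math.atanh(r) for r in observed]  # Fisher z transformation
    return pearson(predicted, observed_z)


# Criterion variables: BDI, state anxiety, trait anxiety, age, income.
observed = [0.28, 0.31, 0.21, -0.23, 0.12]  # as reported in Table 2 (Kim et al.)
predicted = [0.30, 0.60, 0.30, 0.00, 0.00]  # HYPOTHETICAL predictions, illustration only

print(round(r_alerting_cv(predicted, observed), 2))
```

The value of such an index depends entirely on specifying the predictions in advance and on theoretical grounds; filling them in after seeing the data would defeat its purpose.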

10. Conclusions

This manuscript has critical implications for researchers and clinicians about a distress scale that has been used for decades. Distress is a uniquely subjective negative affective experience, constantly fluctuating, and influenced by internal and external variables [79]. As the subjective nature of affect remains pertinent [80], it requires varied and modern measurement techniques (e.g., questionnaires, interviews) with an overall sensitivity to discriminant and incremental validity [81]. Modern psychometric methods can supplement such approaches, providing new ways for researchers and clinicians to measure distress that may be applied to the SUDS. By incorporating longitudinal measures of negative affect or using meta-analyses with careful consideration of theory, we can more accurately measure and interpret a rating of distress.

Given the significant unanswered questions about what the SUDS measures and the potential for misinterpretation, we urge clinicians and researchers to be highly cautious and critical when employing it. While its simplicity is appealing, relying on SUDS scores without acknowledging their profound psychometric limitations may compromise clinical decision-making. However, developing and validating brief, reliable, and theoretically grounded measures of subjective state distress suitable for the dynamic context of therapy sessions remains an important and attainable goal to better support evidence-based clinical practice.

Author Contributions

Conceptualization, B.Z.; methodology, B.Z. and E.M.; writing—original draft preparation, E.M.; writing—review and editing, B.Z.; visualization, E.M.; supervision, B.Z.; project administration, B.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

For the past three years, B.Z. has consulted with Biohaven Pharmaceuticals and received royalties from Oxford University Press; these relationships are not related to the work described here.

Glossary

Anxiety	A specific emotional state often characterized by apprehension, worry, and physiological arousal. In the context of the SUDS, it is one of the primary constructs the scale was intended to measure (e.g., Wolpe’s “worst anxiety”), though the manuscript argues that its measurement by the SUDS is often confounded with other states.
Construct Irrelevance	A threat to construct validity that occurs when a test or measure includes aspects or captures variance that is not part of the intended construct. In the manuscript’s analysis of the SUDS, this refers to the scale potentially reflecting factors like frustration, hopelessness, or other emotions rather than solely the intended construct of distress or anxiety.
Construct Underrepresentation	A threat to construct validity where a test or measure fails to capture important aspects of the construct it is intended to assess. For the SUDS, this could mean that the scale does not fully encompass the multifaceted nature of subjective distress or anxiety, as it attempts to distill complex experiences into a single number.
Construct Validation	The overarching, integrative, and evaluative process of gathering evidence to determine whether a measure accurately assesses the specific psychological construct it claims to measure, and whether inferences made from its scores are appropriate and meaningful. The manuscript employs this process, particularly the Strong Program, to investigate the SUDS.
Distress	A general term for a state of negative affective intensity, suffering, or unease. Within the manuscript, it is the focal construct that the SUDS (Subjective Units of Distress Scale) purports to measure, but its lack of a clear, consistent operational definition is identified as a key psychometric weakness, leading to ambiguity in what SUDS scores actually represent.
Disturbance	A term originally used by Wolpe in relation to the SUDS (e.g., “subjective unit of disturbance”), often used interchangeably with “distress” or “anxiety.” The manuscript highlights that this broad term contributed to the scale’s early ambiguity regarding the specific construct being measured.
External Component (of Strong Program)	The third component of the Strong Program of Construct Validation, which examines how a measure relates to other established measures and variables (convergent and discriminant validity), and how it performs across different groups or conditions, to build a nomological network. The manuscript suggests that robust evidence in this stage for the SUDS is difficult to interpret due to weaknesses in earlier validation stages.
Negative Affect	A broad dimension of emotional experience encompassing various unpleasant feelings and states, such as anxiety, distress, fear, and sadness. The manuscript considers the SUDS as an attempt to measure “state negative affective intensity.”
Strong Program of Construct Validation	A theory-driven, multi-stage framework for evaluating the construct validity of a measure, integrating various forms of validity evidence. It includes three core components: Substantive, Structural, and External. The manuscript utilizes this program as its primary theoretical lens for critiquing the psychometric properties of the SUDS.
Structural Component (of Strong Program)	The second component of the Strong Program of Construct Validation, focusing on the internal structure of a measure and how well it reflects the structure of the intended construct. For single-item measures like the SUDS, this includes assessing its sensitivity to within-person, longitudinal variability and the consistency of its measurement properties.
Subjective Units of Distress Scale (SUDS)	A self-report rating scale, typically ranging from 0 to 100, intended to quantify an individual’s current level of subjective distress, anxiety, or disturbance. The manuscript critically examines its validity and clinical utility.
Substantive Component (of Strong Program)	The foundational first component of the Strong Program of Construct Validation, concerned with defining the theoretical basis of the construct being measured and ensuring the measure (including its items, format, and scoring) adequately represents this theoretical domain, while avoiding construct underrepresentation and irrelevance. The manuscript argues that the SUDS has significant weaknesses in this component.

References

1. Wolpe, J.; Lazarus, A.A. Behavior Therapy Techniques: A Guide to the Treatment of Neuroses; Pergamon Press: Oxford, UK, 1966.
2. Wolpe, J.; Wolpe, D. Our Useless Fears; Houghton Mifflin: Boston, MA, USA, 1981.
3. Shapira, S.; Yeshua-Katz, D.; Sarid, O. Effect of Distinct Psychological Interventions on Changes in Self-Reported Distress, Depression and Loneliness among Older Adults during Covid-19. World J. Psychiatry 2022, 12, 970.
4. Miegel, F.; Bücker, L.; Kühn, S.; Mostajeran, F.; Moritz, S.; Baumeister, A.; Lohse, L.; Blömer, J.; Grzella, K.; Jelinek, L. Exposure and Response Prevention in Virtual Reality for Patients with Contamination-Related Obsessive–Compulsive Disorder: A Case Series. Psychiatr. Q. 2022, 93, 861–882.
5. Wolpe, J. The Systematic Desensitization Treatment of Neuroses. J. Nerv. Ment. Dis. 1961, 132, 189–203.
6. Wolpe, J. (Ed.) The Practice of Behavior Therapy, 1st ed.; Pergamon Press: Elmsford, NY, USA, 1969.
7. Abramowitz, J.S.; Deacon, B.J.; Whiteside, S.P. Exposure Therapy for Anxiety: Principles and Practice; Guilford Publications: New York, NY, USA, 2019.
8. Foa, E.B.; Chambless, D.L. Habituation of Subjective Anxiety during Flooding in Imagery. Behav. Res. Ther. 1978, 16, 391–399.
9. Sherman, T.M.; Cormier, W.H. An Investigation of the Influence of Student Behavior on Teacher Behavior. J. Appl. Behav. Anal. 1974, 7, 11–21.
10. Wolpe, J.; Flood, J. The Effect of Relaxation on the Galvanic Skin Response to Repeated Phobic Stimuli in Ascending Order. J. Behav. Ther. Exp. Psychiatry 1970, 1, 195–200.
11. Benjamin, C.L.; O’Neil, K.A.; Crawley, S.A.; Beidas, R.S.; Coles, M.; Kendall, P.C. Patterns and Predictors of Subjective Units of Distress in Anxious Youth. Behav. Cogn. Psychother. 2010, 38, 497–504.
12. Elsner, B.; Jacobi, T.; Kischkel, E.; Schulze, D.; Reuter, B. Mechanisms of Exposure and Response Prevention in Obsessive-Compulsive Disorder: Effects of Habituation and Expectancy Violation on Short-Term Outcome in Cognitive Behavioral Therapy. BMC Psychiatry 2022, 22, 66.
13. Levy, H.C.; Radomsky, A.S. Safety Behaviour Enhances the Acceptability of Exposure. Cogn. Behav. Ther. 2014, 43, 83–92.
14. Parrish, C.L.; Radomsky, A.S. An Experimental Investigation of Responsibility and Reassurance: Relationships with Compulsive Checking. Int. J. Behav. Consult. Ther. 2006, 2, 174–191.
15. Zaboski, B.A.; Romaker, E.K. Using Cognitive-Behavioral Therapy with Exposure for Anxious Students with Classroom Accommodations. J. Coll. Stud. Psychother. 2023, 37, 209–226.
16. Zaboski, B.A. Exposure Therapy for Anxiety Disorders in Schools: Getting Started. Contemp. Sch. Psychol. 2020, 26, 235–240.
17. Milgram, L.; Sheehan, K.; Cain, G.; Carper, M.M.; O’Connor, E.E.; Freeman, J.B.; Garcia, A.; Case, B.; Benito, K. Comparison of Patient-Reported Distress during Harm Avoidance and Incompleteness Exposure Tasks for Youth with OCD. J. Obs. Compuls. Relat. Disord. 2022, 35, 100760.
18. Reid, J.E.; Laws, K.R.; Drummond, L.; Vismara, M.; Grancini, B.; Mpavaenda, D.; Fineberg, N.A. Cognitive Behavioural Therapy with Exposure and Response Prevention in the Treatment of Obsessive-Compulsive Disorder: A Systematic Review and Meta-Analysis of Randomised Controlled Trials. Compr. Psychiatry 2021, 106, 152223.
19. Van Noppen, B.; Sassano-Higgins, S.; Appasani, R.; Sapp, F. Cognitive-Behavioral Therapy for Obsessive-Compulsive Disorder: 2021 Update. Focus 2020, 19, 430–443.
20. Jacoby, R.J.; Abramowitz, J.S.; Blakey, S.M.; Reuman, L. Is the Hierarchy Necessary? Gradual versus Variable Exposure Intensity in the Treatment of Unacceptable Obsessional Thoughts. J. Behav. Ther. Exp. Psychiatry 2019, 64, 54–63.
21. Kendall, P.C.; Robin, J.A.; Hedtke, K.A.; Suveg, C.; Flannery-Schroeder, E.; Gosch, E. Considering CBT with Anxious Youth? Think Exposures. Cogn. Behav. Pract. 2005, 12, 136–148.
22. Rupp, C.; Doebler, P.; Ehring, T.; Vossbeck-Elsebusch, A.N. Emotional Processing Theory Put to Test: A Meta-Analysis on the Association between Process and Outcome Measures in Exposure Therapy. Clin. Psychol. Psychother. 2017, 24, 697–711.
23. Prenoveau, J.M.; Craske, M.G.; Liao, B.; Ornitz, E.M. Human Fear Conditioning and Extinction: Timing Is Everything…or Is It? Biol. Psychol. 2013, 92, 59–68.
24. Messick, S. Validity of Psychological Assessment: Validation of Inferences from Persons’ Responses and Performances as Scientific Inquiry into Score Meaning. Am. Psychol. 1995, 50, 741.
25. Cronbach, L.J.; Meehl, P.E. Construct Validity in Psychological Tests. Psychol. Bull. 1955, 52, 281–302.
26. Benson, J. Developing a Strong Program of Construct Validation: A Test Anxiety Example. Educ. Meas. Issues Pract. 2005, 17, 10–17.
27. Messick, S. Validity. In Educational Measurement; Linn, R.L., Ed.; Macmillan: New York, NY, USA, 1989; pp. 13–103.
28. Thyer, B.A.; Papsdorf, J.D.; Davis, R.; Vallecorsa, S. Autonomic Correlates of the Subjective Anxiety Scale. J. Behav. Ther. Exp. Psychiatry 1984, 15, 3–7.
29. Kim, D.; Bae, H.; Park, Y.C. Validity of the Subjective Units of Disturbance Scale in EMDR. J. EMDR Pract. Res. 2008, 2, 57–62.
30. Tanner, B.A. Validity of Global Physical and Emotional SUDS. Appl. Psychophysiol. Biofeedback 2012, 37, 31–34.
31. Beck, A.T.; Ward, C.H.; Mendelson, M.; Mock, J.; Erbaugh, J. An Inventory for Measuring Depression. Arch. Gen. Psychiatry 1961, 4, 561–571.
32. Spielberger, C.D.; Gorsuch, R.L.; Lushene, R.E. Manual for the State-Trait Anxiety Inventory; Consulting Psychologists Press: Palo Alto, CA, USA, 1983.
33. Guy, W. ECDEU Assessment Manual for Psychopharmacology; US Department of Health, Education and Welfare: Washington, DC, USA, 1976.
34. Derogatis, L.R.; Rickels, K.; Rock, A.F. The SCL-90 and the MMPI: A Step in the Validation of a New Self-Report Scale. Br. J. Psychiatry 1976, 128, 280–289.
35. Weiss, D.S. The Impact of Event Scale: Revised. In Cross-Cultural Assessment of Psychological Trauma and PTSD; Springer: Berlin/Heidelberg, Germany, 2007; pp. 219–238.
36. Wasserstein, R.L.; Lazar, N.A. The ASA Statement on p-Values: Context, Process, and Purpose. Am. Stat. 2016, 70, 129–133.
37. Kranzler, J.H.; Floyd, R.G.; Benson, N.; Zaboski, B.; Thibodaux, L. Cross-Battery Assessment Pattern of Strengths and Weaknesses Approach to the Identification of Specific Learning Disorders: Evidence-Based Practice or Pseudoscience? Int. J. Sch. Educ. Psychol. 2016, 4, 146–157.
38. Lachin, J.M. Fallacies of Last Observation Carried Forward Analyses. Clin. Trials 2016, 13, 161–168.
39. Hamer, R.M.; Simpson, P.M. Last Observation Carried Forward versus Mixed Models in the Analysis of Psychiatric Clinical Trials. Am. J. Psychiatry 2009, 166, 639–641.
40. Moore, R.A.; Derry, S.; Wiffen, P.J. Challenges in Design and Interpretation of Chronic Pain Trials. Br. J. Anaesth. 2013, 111, 38–45.
41. Greene, R.L. The MMPI-2: An Interpretive Manual; Allyn & Bacon: Boston, MA, USA, 2000.
42. Endicott, J.; Spitzer, R.L.; Fleiss, J.L.; Cohen, J. The Global Assessment Scale: A Procedure for Measuring Overall Severity of Psychiatric Disturbance. Arch. Gen. Psychiatry 1976, 33, 766–771.
43. Pittig, A.; Arch, J.J.; Lam, C.W.; Craske, M.G. Heart Rate and Heart Rate Variability in Panic, Social Anxiety, Obsessive–Compulsive, and Generalized Anxiety Disorders at Baseline and in Response to Relaxation and Hyperventilation. Int. J. Psychophysiol. 2013, 87, 19–27.
44. Reid, A.M.; Garner, L.E.; Van Kirk, N.; Gironda, C.; Krompinger, J.W.; Brennan, B.P.; Mathes, B.M.; Monaghan, S.C.; Tifft, E.D.; André, M.-C.; et al. How Willing Are You? Willingness as a Predictor of Change during Treatment of Adults with Obsessive–Compulsive Disorder. Depress. Anxiety 2017, 34, 1057–1064.
45. Birmaher, B.; Brent, D.A.; Chiappetta, L.; Bridge, J.; Monga, S.; Baugher, M. Psychometric Properties of the Screen for Child Anxiety Related Emotional Disorders (SCARED): A Replication Study. J. Am. Acad. Child Adolesc. Psychiatry 1999, 38, 1230–1236.
46. Goodman, R. Psychometric Properties of the Strengths and Difficulties Questionnaire. J. Am. Acad. Child Adolesc. Psychiatry 2001, 40, 1337–1345.
47. Simons, J.S.; Gaher, R.M. The Distress Tolerance Scale: Development and Validation of a Self-Report Measure. Motiv. Emot. 2005, 29, 83–102.
48. Tonarely, N.A.; Hirlemann, A.; Shaw, A.M.; LoCurto, J.; Souer, H.; Ginsburg, G.S.; Jensen-Doss, A.; Ehrenreich-May, J. Validation and Clinical Correlates of the Behavioral Indicator of Resiliency to Distress Task (BIRD) in a University- and Community-Based Sample of Youth with Emotional Disorders. J. Psychopathol. Behav. Assess. 2020, 42, 787–798.
49. Sripada, R.K.; Rauch, S.A. Between-Session and within-Session Habituation in Prolonged Exposure Therapy for Posttraumatic Stress Disorder: A Hierarchical Linear Modeling Approach. J. Anxiety Disord. 2015, 30, 81–87.
50. Kircanski, K.; Wu, M.; Piacentini, J. Reduction of Subjective Distress in CBT for Childhood OCD: Nature of Change, Predictors, and Relation to Treatment Outcome. J. Anxiety Disord. 2014, 28, 125–132.
51. Cronbach, L.J. Construct Validation after Thirty Years. In Intelligence: Measurement, Theory, and Public Policy: Proceedings of a Symposium in Honor of Lloyd G. Humphreys; Linn, R., Ed.; University of Chicago Press: Chicago, IL, USA, 1989; pp. 147–167.
52. Cloos, L.; Ceulemans, E.; Kuppens, P. Development, Validation, and Comparison of Self-Report Measures for Positive and Negative Affect in Intensive Longitudinal Research. Psychol. Assess. 2022, 35, 189.
53. Watkins, M.W. A Step-by-Step Guide to Exploratory Factor Analysis with R and RStudio; Routledge: London, UK, 2020.
54. Loevinger, J. Objective Tests as Instruments of Psychological Theory. Psychol. Rep. 1957, 3, 635–694.
55. Nunnally, J. Psychometric Theory, 2nd ed.; McGraw-Hill: New York, NY, USA, 1978.
56. American Educational Research Association; American Psychological Association; National Council on Measurement in Education. Standards for Educational and Psychological Testing; American Educational Research Association: Washington, DC, USA, 2014; ISBN 978-0-935302-35-6.
57. Takac, M.; Collett, J.; Blom, K.J.; Conduit, R.; Rehm, I.; Foe, A.D. Public Speaking Anxiety Decreases within Repeated Virtual Reality Training Sessions. PLoS ONE 2019, 14, e0216288.
58. Essayli, J.H.; Forrest, L.N.; Zickgraf, H.F.; Stefano, E.C.; Keller, K.L.; Lane-Loney, S.E. The Impact of Between-session Habituation, Within-session Habituation, and Weight Gain on Response to Food Exposure for Adolescents with Eating Disorders. Int. J. Eat. Disord. 2023, 56, 637–645.
59. Brose, A.; Schmiedek, F.; Gerstorf, D.; Voelkle, M.C. The Measurement of Within-Person Affect Variation. Emotion 2020, 20, 677.
60. Salzborn, S.; Davidov, E.; Reinecke, J. (Eds.) Methods, Theories, and Empirical Applications in the Social Sciences; VS Verlag für Sozialwissenschaften: Wiesbaden, Germany, 2012; ISBN 978-3-531-17130-2.
61. Trull, T.J.; Ebner-Priemer, U. Ambulatory Assessment. Annu. Rev. Clin. Psychol. 2013, 9, 151–176.
62. Beck, A.T.; Epstein, N.; Brown, G.; Steer, R.A. An Inventory for Measuring Clinical Anxiety: Psychometric Properties. J. Consult. Clin. Psychol. 1988, 56, 893.
63. Furr, R.M. Psychometrics: An Introduction; SAGE Publications: New York, NY, USA, 2021.
64. Milosevic, I.; McCabe, R.E. (Eds.) Phobias: The Psychology of Irrational Fear; Bloomsbury Publishing: New York, NY, USA, 2015.
65. Mattera, E.F.; Ching, T.H.; Zaboski, B.A.; Kichuk, S.A. Suicidal Obsessions or Suicidal Ideation? A Case Report and Practical Guide for Differential Assessment. Cogn. Behav. Pract. 2022, 31, 259–271.
66. Flake, J.K.; Fried, E.I. Measurement Schmeasurement: Questionable Measurement Practices and How to Avoid Them. Adv. Methods Pract. Psychol. Sci. 2020, 10, 456–465.
67. Bluett, E.J.; Zoellner, L.A.; Feeny, N.C. Does Change in Distress Matter? Mechanisms of Change in Prolonged Exposure for PTSD. J. Behav. Ther. Exp. Psychiatry 2014, 45, 97–104.
68. Molinari, A.D.; Andrews, J.L.; Zaboski, B.A.; Kay, B.; Hamblin, R.; Gilbert, A.; Ramos, A.; Riemann, B.C.; Eken, S.; Nadeau, J.M. Quality of Life and Anxiety in Children and Adolescents in Residential Treatment Facilities. Resid. Treat. Child. Youth 2019, 36, 220–234.
69. Weathers, F.W.; Litz, B.T.; Herman, D.S.; Huska, J.A.; Keane, T.M. The PTSD Checklist (PCL): Reliability, Validity, and Diagnostic Utility. In Proceedings of the Annual Convention of the International Society for Traumatic Stress Studies, San Antonio, TX, USA, 24–27 October 1993; Volume 462.
70. HealthMeasures. PROMIS® Anxiety Scoring Manual. 2019. Available online: https://www.healthmeasures.net/images/PROMIS/manuals/PROMIS_Anxiety_Scoring_Manual.pdf (accessed on 25 June 2025).
71. Craske, M.G.; Treanor, M.; Conway, C.C.; Zbozinek, T.; Vervliet, B. Maximizing Exposure Therapy: An Inhibitory Learning Approach. Behav. Res. Ther. 2014, 58, 10–23.
72. Craske, M.G.; Kircanski, K.; Zelikowsky, M.; Mystkowski, J.; Chowdhury, N.; Baker, A. Optimizing Inhibitory Learning during Exposure Therapy. Behav. Res. Ther. 2008, 46, 5–27.
73. Craske, M.G.; Treanor, M.; Zbozinek, T.D.; Vervliet, B. Optimizing Exposure Therapy with an Inhibitory Retrieval Approach and the OptEx Nexus. Behav. Res. Ther. 2022, 152, 104069.
74. Boateng, G.O.; Neilands, T.B.; Frongillo, E.A.; Melgar-Quiñonez, H.R.; Young, S.L. Best Practices for Developing and Validating Scales for Health, Social, and Behavioral Research: A Primer. Front. Public Health 2018, 6, 149.
75. Willis, G.B. Cognitive Interviewing: A Tool for Improving Questionnaire Design; Sage Publications: New York, NY, USA, 2004.
76. Liu, W.; Dindo, L.; Hadlandsmyth, K.; Unick, G.J.; Zimmerman, M.B.; St. Marie, B.; Embree, J.; Tripp-Reimer, T.; Rakel, B.; Marie, B.; et al. Item Response Theory Analysis: PROMIS® Anxiety Form and Generalized Anxiety Disorder Scale. West. J. Nurs. Res. 2022, 44, 765–772.
77. Hays, R.D.; Spritzer, K.L.; Reise, S.P. Using Item Response Theory to Identify Responders to Treatment: Examples with the Patient-Reported Outcomes Measurement Information System (PROMIS®) Physical Function Scale and Emotional Distress Composite. Psychometrika 2021, 86, 781–792.
78. Westen, D.; Rosenthal, R. Quantifying Construct Validity: Two Simple Measures. J. Personal. Soc. Psychol. 2003, 84, 608–618.
79. Kuppens, P.; Verduyn, P. Emotion Dynamics. Curr. Opin. Psychol. 2017, 17, 22–26.
80. Gray, E.K.; Watson, D. Assessing Positive and Negative Affect via Self-Report. In Handbook of Emotion Elicitation and Assessment; Coan, J.A., Allen, J.J.B., Eds.; Oxford University Press: New York, NY, USA, 2007; pp. 171–183.
81. Clark, L.A.; Watson, D. Constructing Validity: New Developments in Creating Objective Measuring Instruments. Psychol. Assess. 2019, 31, 1412.

Figure 1. The Strong Program of Construct Validation. Note: Building blocks for the Strong Program. Each block requires a range of theoretical and empirical support. Arrows on the right show that, as support fails in one component, reevaluation is needed in a previous stage. Text boxes highlight useful concepts in each component.

Table 1. Anchors for the subjective units of disturbance scale.

Value	Description
0	No anxiety at all; complete calmness
1–10	Very slight anxiety
10–20	Slight anxiety
20–40	Moderate anxiety; definitely unpleasant feeling
40–60	Severe anxiety; considerable distress
60–80	Severe anxiety; becoming intolerable
80–100	Very severe anxiety; approaching panic

Table 2. Summary of key findings and critiques from validation studies.

Feature	Thyer et al. (1984) [28]	Kim et al. (2008) [29]	Tanner (2012) [30]
Study Focus/Sample	Relationship between SUDS, digit temperature, and heart rate in 20 college students watching a venous cutdown video.	SUDS scores from 61 patients undergoing EMDR at a trauma clinic.	Emotional and physical SUDS in 182 hospital patients, correlated with MMPI-2 and GAF.
Key Correlations	- SUDS and Digit Temperature: Significant, predicted direction. - SUDS and Heart Rate: Significant.	- SUDS and BDI: r = 0.28 (p < 0.05). - SUDS and State Anxiety: r = 0.31 (p < 0.05). - SUDS and Trait Anxiety: r = 0.21 (p > 0.05). - SUDS and Age: r = −0.23 (p < 0.05). - SUDS and Income: r = 0.12 (p < 0.05). - SUDS and SCL-90 PSDI: rs = 0.50. - SUDS and IES-R: rs = 0.46.	- Emotional SUDS and GAF: r = −0.44. - Emotional SUDS and MMPI-2 Scale A: r = 0.35. - Emotional SUDS and MMPI-2 Scales 1–3: r = 0.37. - Emotional SUDS decreased significantly over 3 months.
Authors’ Interpretation	Supported continued use of SUDS in clinical and research settings.	- BDI, State Anxiety: Convergent validity. - Trait Anxiety: Discriminant validity. - Age, Education, Income: Claimed “no correlations,” interpreted as discriminant validity. - SCL-90, IES-R: Concurrent validity. - CGI-C Correlations: Predictive validity.	Data provided evidence for validity and sensitivity of global SUDS ratings; a useful extension of traditional SUDS.
Key Limitations	- Overlooked that SUDS might measure constructs other than anxiety (e.g., distress, disgust, general arousal). - Lack of theoretical grounding for predictions. - Asserted SUDS usefulness independently of concurrent physiological measures.	- Validity interpretations relied on statistical significance (p-values) rather than effect sizes. - Inconsistently claimed “no correlations” for age, education, and income. - Findings may be artifacts of last observation carried forward (LOCF). - Lacked a strong theoretical basis for interpreting correlations as convergent (e.g., SUDS with depression).	- Argument for validity relies on atheoretical work of Thyer et al. and Kim et al. - Provides no theoretical rationale for SUDS use.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Mattera, E.; Zaboski, B. Rethinking the Subjective Units of Distress Scale: Validity and Clinical Utility of the SUDS. Clin. Pract. 2025, 15, 123. https://doi.org/10.3390/clinpract15070123
