Validation and Analysis of the Psychometric Properties of Two Irritability-Measuring Tools: The Affective Reactivity Index (ARI) and the Born-Steiner Irritability Scale (BSIS) in the Italian Adult and Adolescent Populations

Irritability is a transdiagnostic symptom that affects quality of life across the lifespan. The objective of the present research was to validate two assessment tools, namely the Affective Reactivity Index (ARI) and the Born-Steiner Irritability Scale (BSIS). We investigated internal consistency as measured with Cronbach’s alpha, test–retest reliability with the intraclass correlation coefficient (ICC), and convergent validity by comparing ARI and BSIS scores with those of the Strengths and Difficulties Questionnaire (SDQ). Our results revealed good internal consistency for the ARI, with a Cronbach’s α of 0.79 for adolescents and 0.78 for adults. The BSIS also demonstrated good internal consistency for both samples with Cronbach’s α = 0.87. Test–retest analysis showed excellent values for both tools. Convergent validity analysis showed positive and significant correlations with the SDQ, albeit weak for some subscales. In conclusion, we found the ARI and BSIS to be good tools for measuring irritability in adolescents and adults, and Italian healthcare professionals can now use them with more confidence.


Introduction
Irritability is defined as a low threshold for experiencing anger in response to frustration, often associated with verbal and/or physical aggression [1]. During a recent seminar on childhood irritability, experts identified two components of irritability: tonic and phasic [2,3]. The former was defined as a persistently angry, short-tempered, or grumpy mood, whereas the latter was defined as waves of intense anger and behavioral outbursts in response to frustration [4]. The term "irritability" more frequently refers to the irritable mood of tonic irritability, whereas terms such as "reactive aggression" and "impulsive aggression" were used to describe the intense manifestations of anger in phasic irritability. These two dimensions of irritability were defined as distinct constructs in terms of prediction of long-term clinical outcomes, response to treatment, and family history. Distinguishing between these two components of chronic irritability and untangling their heterogeneity is fundamental to clarifying the subtypes of the disorder and to conducting future investigations on the response to treatment [5].
Longitudinal and cross-sectional studies showed that chronic irritability, which includes both tonic and phasic components, is associated with lower income and educational attainment and with an increased risk of anxiety, depression, and suicidal ideation [5]. It is important to highlight that tonic and phasic irritability are the major components of disruptive mood dysregulation disorder (DMDD), recently introduced in the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5) [4,5]. These two components make up the two main symptoms of DMDD: severe, recurrent outbursts of anger (i.e., phasic irritability) and a persistently irritable or angry mood between outbursts (i.e., tonic irritability) [2].
Irritable children and adolescents show an orientation towards threatening environmental stimuli and tend to interpret ambiguous stimuli as threatening, with an increased risk of outbursts of anger and aggressive behavior (reactive aggression) [4]. The frequency of tantrums is very high during preschool and follows a well-defined trajectory; after preschool, irritability levels tend to be relatively stable over time, and this is similar to documented levels for anxiety and depression [1].
The DSM-5 mentions irritability in numerous disorders with greater prevalence in childhood, for instance, DMDD and oppositional defiant disorder. Irritability is also a symptom of multiple disorders in adulthood (generalized anxiety disorder, bipolar disorder, cyclothymic disorder, major depression, and post-traumatic stress disorder) [3]. In the DSM-5, the absence of a specific diagnosis of irritability for adults stands out; in fact, those manifesting symptoms of irritability in adulthood often receive alternative and incorrect diagnoses that do not exactly correspond to their specific clinical condition. This leads patients to feel misunderstood and mislabeled; consequently, the diagnoses they receive do not authentically represent their lived experiences [3]. Furthermore, adult irritability is often comorbid with other disorders with which it shares genetic and environmental antecedents and risk factors.
Irritability has been inconsistently measured in the scientific literature, mainly due to the lack of clarity in the conceptualization of this construct [5]. Recently, the development of irritability scales increased significantly, demonstrating the renewed scientific interest in this field. However, current measurement tools have several limitations, the first one concerning the definition of the concept of irritability. Second, although some outcome measures are available in Italy, such as the clinician-affective reactivity index (CL-ARI) [6] and the irritability in adult patients with epilepsy (I-Epi) [7], there is still a need for validated instruments for the general population. The CL-ARI is a clinician-rated instrument to assess irritability in pediatric research and clinical settings, while the I-Epi is an outcome measure specific to epilepsy. Two measures were found to be valid and reliable for the general population: the affective reactivity index (ARI) and the Born-Steiner irritability scale (BSIS) [8]. Considering the urgent need for assessment tools suited to the general population, for both adults and adolescents, the primary objective of this study was to validate and analyze the psychometric properties of both the ARI and the BSIS in the Italian context.

Materials and Methods
This study was conducted following the consensus-based standards for the selection of health measurement instruments (COSMIN) [9] by a research group from Sapienza University of Rome and the association "Rehabilitation and Outcome measure Assessment" (R.O.M.A.) [10-15]. While the Italian version of the ARI was available, the research group, upon receiving consent from the developers, produced a translation and cross-cultural adaptation of the BSIS following the Rome Foundation guidelines.

Population
The sample comprised a group of adolescents and a group of adults. For sample adequacy, different studies recommended an item-to-respondent ratio varying from 1:3 to 1:20. Considering that the BSIS had 14 items, and after analyzing the sample sizes of other validation studies (133-218), we considered recruiting a minimum of 280 people to be adequate. Adolescents were recruited from several Italian secondary schools, and their parents or guardians gave consent. For the adult population, we recruited some parents of the adolescent sample, as well as a convenience sample from general practitioners and the university population. To participate, people had to meet the following inclusion criteria: a good understanding of the Italian language, no cognitive disabilities, and no diagnosed psychiatric disorders or drug use. The inclusion criteria were assessed during specific meetings organized within the school or other facilities, where research purposes, timing, and administration procedures were explained. Those who showed interest in participating in the study and met the inclusion criteria were assessed using the ARI, BSIS, and SDQ. Data were collected and stored using a web-app. Subsequently, after a period ranging from 24 to 48 h, participants received a link to perform the re-test.
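The minimum sample size stated above follows from simple arithmetic on the recommended ratios. The snippet below is only an illustrative sketch, assuming the ratio is read as respondents per item; the function name `respondents_needed` is hypothetical and was not part of the study procedures.

```python
# Illustrative sketch: sample size implied by an item-to-respondent ratio.
# Assumption: a ratio of 1:r means r respondents for every scale item.

def respondents_needed(n_items: int, respondents_per_item: int) -> int:
    """Return the number of respondents implied by the chosen ratio."""
    if n_items <= 0 or respondents_per_item <= 0:
        raise ValueError("both arguments must be positive")
    return n_items * respondents_per_item

BSIS_ITEMS = 14  # number of items in the BSIS self-report scale

lower_bound = respondents_needed(BSIS_ITEMS, 3)   # 1:3 ratio
upper_bound = respondents_needed(BSIS_ITEMS, 20)  # 1:20 ratio
print(lower_bound, upper_bound)
```

Under this reading, the most conservative ratio (1:20) applied to the 14-item BSIS gives the 280 participants taken as the minimum.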

Assessment Tools
The affective reactivity index (ARI) [16] is a concise tool developed to measure irritability in childhood and adolescence. The scale is available in two versions: one for self-assessment and the other for parents. The ARI investigates three dimensions of irritability: threshold for an irritable reaction, frequency of irritable feelings/behaviors, and duration of such feelings/behaviors. Each item had three possible answers, "Not true", "Quite true", and "Certainly true", scored as 0, 1, and 2, respectively. Only the first six items were added together to obtain the total score, while the seventh item, referring to the impairment caused by irritability, was examined separately. Total ARI scores showed a Cronbach's alpha of 0.92 and 0.88 in the USA and 0.89 and 0.90 in the UK, for the parent- and self-report scales, respectively [16]. For the present investigation, we used the Italian version available from King's College London (https://www.kcl.ac.uk/academic-psychiatry/about/departments/child-adolescent-psychiatry, accessed on 28 February 2023).
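The ARI scoring rule described above can be sketched in a few lines. This is a minimal illustration, assuming responses arrive as a list of seven integers already coded 0, 1, or 2; the function name `score_ari` and the dictionary keys are hypothetical, not part of the instrument.

```python
# Minimal sketch of the ARI scoring rule: sum of items 1-6, item 7 kept apart.
# Assumption: responses are already coded 0 ("Not true"), 1 ("Quite true"),
# 2 ("Certainly true").

def score_ari(responses: list[int]) -> dict[str, int]:
    """Return the ARI total (items 1-6) and the impairment item (item 7)."""
    if len(responses) != 7:
        raise ValueError("the ARI has seven items")
    if any(r not in (0, 1, 2) for r in responses):
        raise ValueError("each response must be 0, 1, or 2")
    return {
        "total": sum(responses[:6]),   # ranges from 0 to 12
        "impairment": responses[6],    # examined separately
    }

example = score_ari([1, 2, 0, 1, 1, 2, 2])
print(example)
```

Keeping the impairment item out of the total mirrors the description above, where it is examined separately from the six irritability items.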
The Born-Steiner irritability scale (BSIS) [17] was created for the assessment of irritability in women with mood disorders related to the menstrual cycle, pregnancy, or menopause. The BSIS was subsequently also used to assess irritability in the general population and in other health conditions. Both an observer-rated scale and a self-report scale were developed. For the present investigation, we used the 14-item self-report scale, scored on a 4-point Likert scale. The BSIS demonstrated good internal consistency (Cronbach's α = 0.92) and acceptable test-retest reliability (r = 0.70).
The Strengths and Difficulties Questionnaire (SDQ) [18] is a brief emotional and behavioral questionnaire for the clinical assessment of children and young people. The tool can capture the perspective of children and young people, their parents, and teachers. There are currently three versions of the SDQ: a short form, a longer form with an impact supplement (which assesses the impact of difficulties on the child's life), and a follow-up form. The 25 items in the SDQ comprise 5 scales of 5 items each: (1) emotional symptoms, (2) conduct problems, (3) hyperactivity/inattention, (4) peer relationship problems, and (5) prosocial behavior. The SDQ can be completed by children and young people aged 11-17 years old, and a separate version can be completed by those aged 18 and over. The SDQ is widely used in different countries and has shown good psychometric properties [19]. For the present investigation, we used the 25-item SDQ Italian version [20].

Psychometric Properties
All statistical analyses were performed with the Statistical Package for the Social Sciences (SPSS) 27.0. First, for both the ARI and the BSIS, we performed an exploratory factor analysis (EFA) with maximum likelihood factoring. Sampling adequacy was evaluated using the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity. Subsequently, the reliability and validity of the BSIS and ARI scales were assessed considering the consensus-based standards for the selection of health measurement instruments (COSMIN) [9]. Internal consistency was examined with Cronbach's α to evaluate the interrelation of the items and the homogeneity of the scale. A value of α ≥ 0.7 is commonly considered to be acceptable [21]. To evaluate test-retest reliability, a random subsample of the whole population performed a second administration, two days apart [22]. This interval was considered adequate for measuring irritability without any clinical change occurring. The intraclass correlation coefficient (ICC) was calculated to determine the degree to which repeated measurements were free from measurement error. ICC values equal to or greater than 0.70 are commonly considered acceptable [21]. To assess convergent validity, the Italian version of the Strengths and Difficulties Questionnaire (SDQ-Ita) was administered to the entire population. The correlation between both ARI and BSIS scores and those of the SDQ-Ita was evaluated with Pearson's correlation coefficient, where a positive value indicates a positive linear correlation and a negative value indicates a negative linear correlation; an absolute value greater than 0.5 indicates an acceptable level of correlation [21,23]. p values < 0.05 were considered statistically significant. Finally, a Student's t-test was used to explore differences in ARI and BSIS scores across gender.
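As an illustration of the internal consistency index described above, the following is a minimal sketch of Cronbach's α in plain Python. It is not the SPSS procedure used in the study; the function name `cronbach_alpha` and the toy responses are assumptions made only for the example.

```python
# Minimal sketch of Cronbach's alpha:
#   alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
# Assumption: `responses` holds one row per participant and one column per item,
# with no missing values. The data below are invented toy values.
from statistics import variance

def cronbach_alpha(responses: list[list[float]]) -> float:
    """Return Cronbach's alpha for a participants-by-items score matrix."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each participant's total score
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

toy_responses = [
    [2, 2, 1],
    [1, 1, 1],
    [0, 0, 1],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2), "acceptable" if alpha >= 0.7 else "below threshold")
```

The comparison against 0.7 reflects the acceptability threshold cited in the text [21].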

Translation and Cultural Adaptation of the BSIS
Once the consent of the BSIS authors was received, the scale, originally in English, was translated into Italian following international guidelines [24]. In order to adapt the word-for-word translated version to the Italian culture, a focus group reviewed the final translation, corrected any remaining inconsistencies, and reformulated some items to minimize differences from the original English version.

Participants
For the validation of the BSIS and ARI scales, a sample of 435 adolescents and 174 adults was recruited; their demographic characteristics are summarized in Table 1.

Factor Analysis
For the ARI, the KMO measure of sampling adequacy was 0.847, and Bartlett's test of sphericity was significant (p < 0.01). The factor analysis extracted one factor that explained 57.12% of the variance. For the BSIS, the KMO measure of sampling adequacy was 0.869, and Bartlett's test of sphericity was significant (p < 0.01). The factor analysis extracted one factor that explained 40.07% of the variance.

Reliability and Validity of the BSIS and the ARI
The statistical analysis was carried out on the 435 adolescents and 174 adults participating in the study, who completed the ARI, BSIS, and SDQ-Ita in the same session. For test-retest, we sent a web link to each assessment tool; however, the number of people participating in this second phase decreased from 435 to 224 for the adolescent group, and from 174 to 59 for the adult population. The BSIS demonstrated good internal consistency for both samples, with Cronbach's α = 0.79 (p < 0.05) for adolescents and Cronbach's α = 0.78 (p < 0.05) for adults. Internal consistency was calculated for the full scale for both samples; the item-total correlation showed positive results, as reported in Table 2. The ARI also demonstrated good internal consistency for both samples, with Cronbach's α = 0.87 (p < 0.05) for adolescents and Cronbach's α = 0.87 (p < 0.05) for adults. Internal consistency was calculated for the full scale for both samples; the item-total correlation showed positive results, as reported in Table 3. Statistical analysis showed that both scales were comparable to the original versions for both samples. For convergent validity, the Italian version of the SDQ was administered to the entire population. Pearson's correlation coefficient showed statistically significant positive correlations between the ARI scale and the SDQ-Ita subscales in both the adolescent and adult samples, thus indicating good convergent validity. Pearson's correlation coefficient also showed statistically significant positive correlations between the BSIS scale and the SDQ-Ita in both the adolescent and adult samples, indicating good convergent validity. Pearson's correlation coefficients are shown in Table 4. The test-retest reliability, for both scales, was evaluated on the group of participants for whom it was possible to record data at t1 (224 adolescents and 59 adults).
The ARI obtained an ICC of 0.81 (p < 0.05) for adolescents and 0.93 (p < 0.05) for adults. The BSIS achieved an ICC of 0.90 (p < 0.05) for adolescents and 0.95 (p < 0.05) for adults. Both the BSIS and the ARI, therefore, showed good test-retest reliability, as shown in Table 5. Finally, we also investigated differences in both BSIS and ARI scores across gender. Tables 6 and 7 report mean (SD) values for each item of the subscales with significant differences.
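For readers who wish to see how a test-retest ICC can be computed, the following is a minimal sketch in plain Python. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed here purely for illustration; the function name `icc_2_1` and the toy scores are hypothetical.

```python
# Minimal sketch of ICC(2,1) (two-way random effects, absolute agreement,
# single measurement), computed from ANOVA mean squares.
# Assumption: the ICC form is not specified in the text; this is one common choice.
from statistics import mean

def icc_2_1(scores: list[list[float]]) -> float:
    """scores: one row per participant, one column per administration."""
    n = len(scores)      # participants
    k = len(scores[0])   # administrations (2 for test and re-test)
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Invented toy data: every participant scores one point higher at re-test.
toy_scores = [[1, 2], [2, 3], [3, 4]]
print(round(icc_2_1(toy_scores), 3))
```

In the toy data the ranking of participants is perfectly preserved, yet the coefficient falls below 1 because the absolute-agreement form penalizes the systematic one-point shift between administrations.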

Discussion
Irritability has recently become a widely discussed topic in the scientific literature due to the heterogeneity of its manifestations and its relationship with mental health and specific clinical conditions [24]. In addition, irritability can be exacerbated in specific environmental conditions. For instance, a recent study stressed the importance of focusing on irritability exhibited by children and adolescents with a diagnosis of autism spectrum disorder during COVID-19 home confinement [25]. However, despite this growing interest, knowledge on how to measure and treat irritability has not yet reached a universal and well-recognized standard [26].
The present study aimed to contribute towards a better understanding of irritability. Our findings suggested that the Italian versions of both the BSIS and the ARI are reliable and valid tools for assessing irritability in healthy Italian adult and adolescent populations. Internal consistency of the ARI revealed a Cronbach's α = 0.87 for both the adolescent and adult populations, in line with the Australian [27], the Brazilian [28], and the original [16] versions. For the BSIS, we found a Cronbach's α = 0.79 for adolescents and a Cronbach's α = 0.78 for the adult population, slightly lower than the original version [17]. Nevertheless, our findings indicated that the BSIS and the ARI are consistent tools to measure irritability in the target population. Furthermore, both tools also revealed good test-retest reliability: the ARI obtained an ICC of 0.81 and 0.93 for the adolescent and adult populations, respectively. Our findings were in line with the Australian validation of the ARI [27], while other versions did not report test-retest reliability. The BSIS obtained an ICC of 0.90 for adolescents and 0.95 for the adult population, slightly higher than the original version [17]. Our findings indicated that both the ARI and the BSIS are reliable measures, confirming their stability over time. Pearson's correlations indicated that there were moderate to strong relationships between ARI and BSIS total scores and the SDQ-Ita. Our analysis also showed moderate correlations between ARI scores and SDQ-Ita subscales except for prosocial behavior, as reported in the Australian validation study [27].
Regarding differences in ARI and BSIS scores between genders, we found specific characteristics in both the adolescent and adult samples. We explored the effect of gender on irritability and found that women reported greater irritability when measured with both the ARI and the BSIS. Our findings were in line with previous studies reporting higher irritability rates among females than among males [8]. The same differences in irritability rates were found in various populations, such as adolescents from the general population [29], adults with epilepsy [7], students with depressive symptoms [30], and adults with major depressive disorder [31,32]. These differences were more evident in adolescent populations; in fact, small but significant gender differences in emotion expression were reported for adults [33], with women showing greater emotional expressivity, especially for positive emotions and internalizing negative emotions such as sadness. Gender differences in irritability and fear in response to a social stress challenge were consistent with previous studies that demonstrated that women were more likely than men to endorse negative affect in response to interpersonal stressors [34].
Finally, it is important to highlight that irritability is one of the most transdiagnostic constructs in the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5), ranging across 15 disorders including mood disorders; trauma- and stressor-related disorders; disruptive, impulse-control, and conduct disorders; substance-related and addictive disorders; and personality disorders (American Psychiatric Association, 2013). However, the conceptualizations of irritability, anger, and aggression overlap, and the failure to distinguish these symptoms presents several problems. First, it leads to the contamination of measures of irritability with items that represent other constructs. Second, it reduces the validity of research on these three constructs [5]. Therefore, our study, with the validation of two measurement tools, aimed to contribute to a better definition for measuring irritability in both adults and adolescents. However, this study had some limitations. First, the sample size available to investigate test-retest reliability was relatively small. Second, we did not conduct a power analysis for sample size, and this can limit the generalizability of the results. Future studies with larger samples should be conducted to examine how irritability can predict, for example, the clinical outcome in specific health conditions.

Conclusions
The BSIS and the ARI were found to be valid, reliable, easy-to-use tools for measuring irritability. In particular, the ARI evaluated episodes of irritability in the last six months, and the BSIS in the last week. These reasons make both tools suitable for epidemiological studies. The flexibility and specificity of the scales identified in this study allow the use of these tools in a wide range of clinical and research settings for both adults and adolescents.

Institutional Review Board Statement:
All procedures were performed in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Declaration of Helsinki of 1975, as revised in 2008. The ethics committee approval for this study was Rif. 4816 Prot.03/01/2018. This research involved data that were provided without any identifier or group of identifiers that would allow the attribution of private information to an individual. Informed consent was obtained from all participants included in the study.

Informed Consent Statement:
Informed consent was obtained from all participants involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.