Validation of the Glasgow Antipsychotic Side-Effect Scale (GASS) in an Italian Sample of Patients with Stable Schizophrenia and Bipolar Spectrum Disorders

Antipsychotics are a class of psychotropic drugs that improve psychotic symptoms and reduce relapse risk. However, they may cause side effects (SE) that impact patients’ quality of life and psychosocial functioning. Therefore, there is a need for practical tools to identify them and possibly intervene. The objective of the present study was to translate into Italian the Glasgow Antipsychotic Side Effect Scale (GASS), which is suggested as the questionnaire of choice to collect SE reported by patients treated with antipsychotics. We administered the GASS and the Udvalg for Kliniske Undersøgelser (UKU) SE scale—which is considered the gold standard—to 100 stable patients with schizophrenia and bipolar spectrum disorders. We measured the structural validity, internal consistency, concurrent criterion validity, construct validity, and clinical feasibility. GASS was characterized by modest structural validity and good internal consistency. The binary correlations concerning the presence of specific symptoms investigated with the GASS and the UKU were strong or relatively strong for only half of them. The GASS total scale score was inversely related to patients’ quality of life and psychosocial functioning. The GASS is useful to briefly assess the burden of antipsychotic SE (~5 min) but is not optimal in identifying them.


Introduction
Antipsychotic (AP) drugs are widely used for the treatment of several psychiatric conditions, including schizophrenia and bipolar disorder [1][2][3], which require long-term specific treatment to manage symptom severity, improve outcomes, and reduce relapses [4].
Both first- and second-generation antipsychotics may cause, to a variable extent, a wide range of side effects (SE), including weight gain, sedation, prolactin-related sexual dysfunction, cardiovascular problems, and extrapyramidal symptoms [5], which may impact patients' quality of life and psychosocial functioning [6].
Guidelines suggest constant monitoring of SE to ensure good treatment efficacy without losing sight of tolerability [7]. Since the 1970s, several scales have been developed to evaluate the SE induced by AP treatment [8][9][10]. Some of them evaluate specific SE

Participants
The participants were inpatients and outpatients recruited from the Psychiatry Unit of the University of Catania, Catania, Italy. Inclusion criteria were: (a) age ≥ 18 years; (b) being an inpatient or outpatient; (c) diagnosis of schizophrenia spectrum disorder or bipolar spectrum disorder based on DSM-5 criteria; (d) having been on treatment with at least one AP for at least 6 months (persistent use of the same AP was not required); (e) absence of positive symptoms at the time of recruitment (defined as a score ≤ 3 on the Positive and Negative Syndrome Scale [PANSS] items p1-delusion, p3-hallucinatory behavior, and g9-unusual thought content, inspired by Andreasen's remission criteria for positive symptoms); (f) absence of depressive or manic symptoms at the time of recruitment (defined as a score < 10 on the Montgomery-Asberg Depression Rating Scale [MADRS] and < 7 on the Young Mania Rating Scale [YMRS]); (g) presence of good insight (PANSS g12-lack of judgment & insight ≤ 3); (h) absence of delusions and hallucinations for bipolar patients; (i) sufficient understanding of the proposed questionnaires; and (j) ability to read and understand the informed consent documentation.
Patients were excluded if: (a) they were treated in the context of a compulsory intervention; (b) they presented concomitant organic diseases; (c) they declared they were currently using psychoactive substances; (d) they presented other neurological conditions (e.g., epilepsy, movement disorders, intellectual disability, or dementia); or (e) they presented any condition that would prevent the completion of the assessment. The following demographic and clinical data were collected: age, sex, education, marital status, employment status, smoking status, concomitant pathologies, illness-related data (illness duration, hospitalizations, actual recruitment setting), and drug-related data (antipsychotic used, olanzapine oral-equivalents, administration route, concomitant psychotropic medications).
Brain Sci. 2022, 12, 891

Instruments
The GASS is a self-administered scale initially developed in English [17] and translated into other languages [18][19][20]. It is a 22-item self-rated questionnaire used to assess AP-induced weight gain; sedation; central nervous system (CNS), cardiovascular, gastrointestinal, and genitourinary functioning; extrapyramidal and anticholinergic activity; diabetes; and prolactin-related SE. For each item, patients can indicate the frequency of the reported SE (Never, Once, A few times, and Every day, scored as 0, 1, 2, and 3, respectively) and then the level of distress that the SE causes (scored from 1 to 10). Twenty questions refer to the prior week, while the last two questions (on changes in menstrual periods and weight gain) refer to the previous 3 months. The total scale score is given by the sum of the frequency scores of the items. The Italian translation of the scale is reported in Appendix A.
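The scoring rule described above can be sketched in a few lines. This is only an illustration, assuming that responses are stored as the four frequency labels; the names `FREQUENCY_SCORES` and `gass_total` are hypothetical and are not part of the published scale materials.

```python
# Frequency labels and their scores, as described in the text.
FREQUENCY_SCORES = {"Never": 0, "Once": 1, "A few times": 2, "Every day": 3}


def gass_total(frequencies):
    """Return the GASS total score: the sum of the frequency scores.

    `frequencies` is a list of frequency labels, one per answered item.
    Distress ratings (1 to 10) are recorded separately and do not enter the total.
    """
    return sum(FREQUENCY_SCORES[label] for label in frequencies)


# Hypothetical responses for illustration only (not study data).
example = ["Never", "A few times", "Every day", "Once", "Never"]
print(gass_total(example))  # 0 + 2 + 3 + 1 + 0 = 6
```

Keeping the distress ratings out of the sum mirrors the description above, in which only the frequency column contributes to the total score.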
The clinician-rated UKU SE rating scale [12] is considered to be the gold standard for recording psychotropic drug-induced SE [18]. The original scale includes 48 items. The scale questions investigate the severity of the SE by defining specific discriminant parameters. The severity of the symptoms is defined as "no side effects" equal to 0, "mild side effects that do not interfere with the patient's performance" equal to 1, "moderately" and "markedly" equal to 2 and 3, respectively. For the present study, we matched the GASS items with the UKU ones, adding items on nocturnal enuresis and breast pain that were not present in the original version of the UKU, as suggested by the manual. We replicated the procedure that Bock et al. applied in the Danish validation of the scale [18]. We adopted the same time frame for the GASS questions as the UKU ones. The total scale score was calculated by summing individual items matched to the GASS.
The WHO Disability Assessment Schedule (WHO-DAS) 2.0 is a generic self-rated tool used to measure health and disability levels in clinical practice [21]. There are two different versions of the instrument. The complete version contains 36 items, while the brief version, which we used for this study, includes 12 items and has also been validated for patients with psychosis [22]. All the questions refer to the prior 30 days, asking for the level of difficulty in doing daily activities, ranging from "No difficulty," equal to 1, to "Extreme or cannot do," equal to 5. The sum of the items is proportional to the functional impairment.
The EuroQoL-5 dimensions-5 levels (EQ-5D-5L) is a quality-of-life screening tool [23]. It consists of two sections. The first part contains five Likert five-level questions regarding movement capacity, self-care, common activities, pain, and anxiety/depression. The second part consists of a visual analog scale (VAS) in which the patient should indicate his or her perceived health ranging from 0 to 100, where higher is better. We considered only the VAS for the present work, considering its more straightforward interpretation and correlation with other EQ-5D indices for patients with schizophrenia [24].

Translation and Validation Procedure
The translation procedure followed the prescriptions in the available literature [25]. First, we asked Prof. M. Taylor, the creator of the scale, for permission to translate it into Italian. Subsequently, the scale was translated from English into Italian by an Italian clinician proficient in the English language and an English, mother-tongue translator. With their consent and with the contribution of five patients, the two versions were then merged. Then, a clinician who was a native English speaker and proficient in the Italian language and an Italian mother-tongue translator back-translated the scale into English. The final version was then merged by consensus. Finally, all the documents were sent back to the GASS creator, who checked if the final version of the back-translated scale was in line with the original scale. After approval, the scale was administered to 10 patients, who confirmed the usability of the tool.

Raters
For the present study, three senior psychiatrists and three psychiatrists in training at the Psychiatry Unit administered the self-rated questionnaires. For the UKU SE rating scale administration, the senior and in-training psychiatrists performed a preliminary assessment on 10 patients to improve inter-rater reliability.

Structural Validity and Internal Consistency
The GASS is a multi-dimensional scale covering diverse SE. It was designed to measure the SE burden associated with AP drugs. Nevertheless, originally there were no subscales, and a total score could be obtained by summing the scores of all items. Therefore, we conducted a confirmatory factor analysis (CFA) to test the original one-factor construct of the scale. We used diagonally weighted least squares (DWLS) to estimate model parameters since the items are rated on an ordinal scale. The fit of the CFA was examined with the chi-squared test, the comparative fit index (CFI; good fit when ≥0.95), the Tucker-Lewis index (TLI; good fit when ≥0.95), and the root-mean-square error of approximation (RMSEA; good fit when <0.06) with its 90% confidence interval (CI) [26]. We considered modifying the model by adding error covariances that could substantially improve the model's fit when identified by modification indices. In addition, we examined internal consistency by calculating Cronbach's α and its 95% CI (good internal consistency when ≥0.7) [27]. We also examined inter-item Spearman's ρ correlations; an average inter-item correlation between 0.2 and 0.4 was taken to indicate good internal consistency [28].
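As an illustration of the internal consistency criterion, the sketch below computes Cronbach's α from item-level scores using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). It is a minimal example on hypothetical data, not the analysis code used in the study; the function name `cronbach_alpha` is an assumption, and the 95% CI reported in the paper is not reproduced here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                      # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of the total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical frequency scores (0-3) for 5 respondents on 4 items.
toy = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 2, 3, 3],
    [1, 0, 1, 0],
]
alpha = cronbach_alpha(toy)
print(f"alpha = {alpha:.2f}; meets the 0.7 threshold: {alpha >= 0.7}")
```

The sample variance (n − 1 denominator) is used for both the items and the total, so the ratio is unaffected by that choice.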

Concurrent Criterion Validity
We examined the agreement between the GASS and the UKU, which is regarded as the gold standard [18]. First, we paired the GASS items with the UKU clinician-administered items (Table A1) after dichotomizing both of them between present and absent. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the GASS items were calculated using the UKU items as the reference. Moreover, the phi coefficients of association between dichotomized GASS and UKU items were calculated to obtain a good proxy of concurrent criterion validity between the scales [29]. Subsequently, the phi value was interpreted according to Rea and Parker's anchor points [30]. We also investigated the relationship between the total scores of GASS and UKU with Spearman's ρ (good agreement when ρ ≥ 0.7) [27].
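The item-level indices described above all derive from the same 2 × 2 table of dichotomized GASS answers against dichotomized UKU ratings. The following sketch shows the calculation on hypothetical data, assuming that any non-zero score counts as "present" on both scales; the function name `agreement_stats` and the toy counts are illustrative assumptions, not study values.

```python
from math import sqrt


def agreement_stats(gass, uku):
    """Compare dichotomized GASS answers against dichotomized UKU ratings.

    Both arguments are equal-length lists of booleans (True = side effect
    present); the UKU rating is treated as the reference standard.
    Raises ZeroDivisionError if a row or column of the 2 x 2 table is empty.
    """
    tp = sum(g and u for g, u in zip(gass, uku))
    fp = sum(g and not u for g, u in zip(gass, uku))
    fn = sum(not g and u for g, u in zip(gass, uku))
    tn = sum(not g and not u for g, u in zip(gass, uku))
    denom = sqrt((tp + fp) * (fn + tn) * (tp + fn) * (fp + tn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "phi": (tp * tn - fp * fn) / denom,
    }


# Hypothetical item: 6 true positives, 2 false positives,
# 1 false negative, 11 true negatives.
gass = [True] * 6 + [True] * 2 + [False] * 1 + [False] * 11
uku = [True] * 6 + [False] * 2 + [True] * 1 + [False] * 11
for name, value in agreement_stats(gass, uku).items():
    print(f"{name}: {value:.2f}")
```

For rare side effects, one margin of the table can be empty, in which case some of these indices are undefined; the sketch leaves that case as an error rather than guessing a value.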

Hypothesis Testing for Construct Validity
We examined construct validity by investigating the relationship between the GASS total score and functional impairment (measured with the WHO-DAS 2.0) or perceived quality of life (measured with the EQ-5D-5L VAS) using Spearman's ρ (good construct validity when |ρ| ≥ 0.5) [31]. We also investigated the relationship between frequency and distress scores of the individual GASS items using Spearman's ρ. Finally, we examined differences in the GASS total score between patient subgroups (e.g., by sex, diagnosis, or employment status) using the Mann-Whitney U test, and we checked whether the GASS total score correlated with demographic and illness-related variables.
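Spearman's ρ, used throughout these analyses, is the Pearson correlation computed on ranks, with tied values sharing the mean of their ranks. The sketch below implements it with the standard library only, on hypothetical scores; the function names and the toy data are assumptions made for illustration (in practice, a statistical package such as SciPy's `scipy.stats.spearmanr` would normally be used).

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical GASS totals and EQ-5D-5L VAS scores (not study data).
gass_totals = [4, 10, 15, 7, 21, 12]
eq_vas = [80, 70, 60, 75, 30, 50]
rho = spearman_rho(gass_totals, eq_vas)
print(f"rho = {rho:.2f}; |rho| >= 0.5: {abs(rho) >= 0.5}")
```

A negative ρ here corresponds to the expected direction for the VAS (higher SE burden, lower perceived health), which is why the threshold is stated on the absolute value.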

Clinical Feasibility
We recorded the timing of administering the instrument and the questions that the participants asked the clinicians when answering the GASS questionnaire.

Sample Characteristics
For the present study, we recruited 111 participants between September 2021 and April 2022. Due to missing frequency data needed for clinical validation, 11 participants were excluded from the analyses. In total, 100 participants provided all the data for the GASS frequency items, while only 81 completed the distress section. The sample comprised 69 patients with schizophrenia and 31 patients with bipolar spectrum disorders. All patients except four, who were taking typical AP only, were taking second-generation AP. Twelve patients were treated with a combination of typical and atypical AP. Paliperidone, olanzapine, aripiprazole, and risperidone were the most prescribed (in 31%, 23%, 20%, and 12% of participants, respectively). Other AP were prescribed to fewer than 10 participants. A detailed description of the patients' demographics and illness-related data is reported in Table 1.
Table 1 (excerpt; column labels not recoverable): UKU total, 6 (IQR: 3-10); 6 (IQR: 3-10); 6 (IQR: 4-10).
The CFA did not find a good fit for the primary model of the one-factor construct of GASS (chi-squared = 309.81, degrees of freedom (df) = 189, p-value < 0.001; CFI = 0.81; TLI = 0.79; and RMSEA = 0.080, 90% CI [0.064, 0.096]). The five most influential modification indices were unique residual correlations concerning items 17 or 18 (items concerning "Gynecomastia/Breast pain" and "Galactorrhea", which were both rare in our sample; Table 2) (Table A2). Therefore, we re-specified the model by adding these correlations, and the fit was better, although CFI and TLI remained below the 0.95 threshold (chi-squared = 247.14, df = 184, p-value = 0.001; CFI = 0.90; TLI = 0.89; and RMSEA = 0.059, 90% CI [0.038, 0.077]). Standardized loadings of the items on the total score are presented in Table A3.
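As a consistency check on the reported fit statistics, the RMSEA point estimate can be recomputed from the chi-squared value, its degrees of freedom, and the sample size with the common formula RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). The sketch below assumes this unscaled formula and N = 100; the function name `rmsea` is hypothetical, and software fitting DWLS models may apply scaled or robust corrections, so exact agreement with the published values is not guaranteed in general.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Point estimate of RMSEA from the model chi-squared, its df, and sample size n."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Chi-squared and df values reported in the text; n = 100 analysed participants.
print(round(rmsea(309.81, 189, 100), 3))  # primary one-factor model
print(round(rmsea(247.14, 184, 100), 3))  # re-specified model
```

Under these assumptions, the two values round to 0.080 and 0.059, matching the point estimates reported above.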
There was a fair agreement between the GASS total and UKU total (ρ = 0.67, p-value < 0.001, Figure 1A).

Correlation between Functioning and Perceived Health
We found that a higher GASS total score was correlated with more functional impairment, as indicated by higher scores on the WHO-DAS 2.0 (ρ = 0.45, p < 0.001, Figure 1B), and with worse perceived health, as indicated by lower scores on the VAS of the EQ-5D-5L (ρ = −0.4, p < 0.001, Figure 1C).

Correlation between the Frequency of a Side-Effect and the Distress Caused to the Patient
The relationship between the frequency of a side-effect and the distress caused to the patient is presented in Table 3 (in 81 patients that rated distress). There were indications that some SE were more distressing when they were more frequent, e.g., sleepiness (item 1), parkinsonism, and dyskinesia (items 6, 9, and 10), and (anti)cholinergic side-effects (items 8, 12, and 13). In contrast, other SE may be equally distressing irrespective of their frequency, e.g., confusion or dizziness (items 2 and 3), akathisia (item 7), and sexual dysfunction (item 19).

Differences between Subgroups
The GASS total score did not differ in any dichotomous comparison, with one exception: patients who provided distress data had lower scores than those who did not complete the GASS distress column. The GASS total score did not correlate with any continuous variable (e.g., age or olanzapine-equivalent dose).

Clinical Feasibility
As previously reported, 11 participants did not completely understand how to fill in the distress column. The median completion time for the GASS scale was 4:42 min. In Table 4, we report the most common questions asked by our participants.

Table 4. Most common questions asked by participants.
What does "Men Only" mean?
Questions related to the GASS distress column (n = 9):
Is it a tick box? What does the number mean?
If the main answer is "Never", which "distress" might be chosen?
Questions related to the general GASS form (n = 2):
What is the difference between "once" and "almost never"?
What if it is neither "once" nor "a few times"?

Discussion
The present study aimed to translate the GASS into Italian and to validate it as a measure of the degree of the AP-SE burden.
In terms of structural validity, the one-factor definition of the scale might be inappropriate, requiring further investigation on the scale structure. However, it should be considered that the CFA statistics improved when considering the correlations of items 17 and 18 with some other ones. Those items could have impacted the one-factor analysis because of their rarity in our sample. Nevertheless, their paucity does not differ much from what was found in the CATIE study, which aimed to compare the effectiveness of conventional and atypical AP medications used for the treatment of schizophrenia [36].
For most of the individual SE investigated by the GASS against the UKU, the sensitivity and specificity in their identification were higher than 70%, resulting in a fair tool for AP-SE screening. On the other hand, even if the PPV of the GASS items was relatively high in most cases, the same cannot be said for the NPV. This means that patients may have been judged to have a specific SE during the clinical administration of the UKU but they did not recognize the same SE with the GASS. However, given that the primary aim of the GASS is to measure SE distress in patients, we consider our results acceptable for that aim.
The correlation between the GASS and UKU items ranged from negligible to strong [29,30]. In particular, it was strong or relatively strong for 11 items (orthostatic hypotension, palpitations, tremor, dry mouth, dysuria, nausea and vomiting, galactorrhea, sexual dysfunction, erectile dysfunction, menstruation changes, and weight gain). The correlation of the other items was lower. It is likely that the mismatch between the symptoms identified by the patient and those identified by the clinician results from the greater tendency of patients to report as present symptoms that cause distress [37] but that the clinician recognizes as mild or absent [38].
In line with a previous similar study [18], the items "asthenia" and "sedation" did not perform well, but as opposed to previous findings, our results showed high specificity for the former. Unlike the other study, the GASS items investigating neurological SE showed fair sensitivity. It should be noted that the items exploring hypokinesis and hyperkinesis showed the former to be specific but not sensitive and the latter sensitive but not specific. This was probably conditioned by the fact that during the examination, the clinician tended to notice the global slowness of the patient more, which tends to be more evident as the disorder progresses [39], rather than hyperkinetic movements that show a fluctuating trend.
Regarding construct validity, the present work confirmed that a higher GASS total score is associated with greater functional impairment and lower perceived quality of life. The total score also correlates with the UKU total score computed on the items corresponding to the GASS scale. Moreover, we found that, for several SE, more frequent occurrence was associated with greater distress. This leads to the consideration that the GASS might be preferentially used to estimate the SE burden rather than to identify the SE.
The translation and validation study of the GASS scale we performed adds to the studies already available [18][19][20]. As previously highlighted in the Greek and Arabic translations [19,20], the scale maintains good internal consistency, an element not investigated by the Danish validation [18]. However, consistency with the gold standard (measured by criterion validity) was not as satisfactory as in the Danish validation, although relatively robust for more than half of the items. Finally, the CFA, which used a different analytical approach from the one applied in the Arabic validation [20], showed that the one-factor model did not fit well. Therefore, further exploratory factor analysis could be carried out to elucidate the scale structure.
The present study has some limitations. Firstly, it should be considered that the staff was trained in the use of the UKU scale before the beginning of the study. We believe that this affected the maintenance of inter-rater reliability, considering that recruitment occurred within 7 months, which may in turn have had an impact on criterion validity. Secondly, it should be considered that, in order to be in line with Bock's work [18], we used an instrument that was slightly modified from the one originally conceived by the creators of the GASS [17], treating the distress parameter as continuous and not dichotomous. On the one hand, this allowed us to conduct more in-depth analyses; on the other hand, the distress column was considered ancillary in the original GASS and is not used to calculate the total score of the scale. We did not find good reasons to maintain the continuous distress column, and we suggest using the original dichotomized form. Thirdly, given the small number of events found according to the GASS in terms of galactorrhea, menstrual cycle alterations, gynecomastia, and nocturnal enuresis, our sample size is unlikely to clarify sufficiently whether the scale is capable of adequately identifying them in clinical practice. Furthermore, in the present study, we measured neither the discriminative ability of the tool nor its test-retest reliability. Only one of the other translations provided data on test-retest results, suggesting a good agreement between the administrations [19], similar to the original validation [17]. Additionally, it should be noted that the patients were all stable, most of them were treated with only second-generation antipsychotics, and they reported intermediate scores on the GASS scale on average. These sample characteristics may impact the generalizability of the present study. Finally, this study aimed to validate the Italian translation of the original scale. A sample size of at least 100 participants is suggested for this purpose. However, models with multiple parameters (such as when using the DWLS estimator) may need much larger samples (e.g., 20 participants for each parameter) [40].
The strengths of this research include the detailed analysis of the GASS, which integrates and expands the work of other researchers mentioned regarding the tool. This is relevant for future development of similar instruments. Moreover, the study involved some inpatients, not considered in previous studies, which extends the generalizability of the results. The validation followed the current standards for translating scales, maintaining the original face validity of the instrument. We provided a pragmatic instrument to measure AP-SE distress, requiring only 5 minutes to be completed, previously unavailable in a validated form for Italian patients with psychotic disorders.

Conclusions
The Italian translation and validation of the GASS adds a valuable tool to the patient-reported outcome measures (PROMs) that patients could benefit from. In addition, increased attention by clinicians to AP-SE may ensure an improvement in patients' quality of life and psychosocial functioning. However, clinicians should integrate it with a clinical interview, as GASS appears more suitable as a screening tool than a diagnostic tool.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study. Written informed consent was obtained from the patient(s) to publish this paper.

Data Availability Statement:
The data presented in this study are available on reasonable request from the corresponding author.