Validation of an Online Version of the Alcohol Use Disorders Identification Test (AUDIT) for Alcohol Screening in Spanish University Students

Online alcohol screening may be helpful in preventing alcohol use disorders. We assessed psychometric properties of an online version of the Alcohol Use Disorders Identification Test (AUDIT) among Spanish university students. We used a longitudinal online survey (the UNIVERSAL project) of first-year students (18–24 years old) in five universities, including the AUDIT, as part of the WHO World Mental Health International College Student (WMH-ICS) initiative. A reappraisal interview was carried out with the Timeline Followback (TLFB) for alcohol consumption categories and the Mini International Neuropsychiatric Interview (MINI) for alcohol use disorder. Reliability, construct validity and diagnostic accuracy were assessed. Results: 287 students (75% women) completed the MINI, of whom 242 also completed the TLFB. AUDIT’s Cronbach’s alpha was 0.82. The confirmatory factor analysis for the one-factor solution of the AUDIT showed a good fit to the data. Significant AUDIT score differences were observed by TLFB categories and by MINI disorders. Areas under the curve (AUC) were very large for dependence (AUC = 0.96) and adequate for consumption categories (AUC > 0.7). AUDIT cut-off points of 6/8 (women/men) for moderate-risk drinking and 13 for alcohol dependence showed sensitivity/specificity of 76.2%/78.9% and 56%/97.5%, respectively. The online version of the AUDIT is useful for detecting alcohol consumption categories and alcohol dependence in Spanish university students.


Introduction
Alcohol consumption is one of the leading risk factors for disability and premature death [1,2]. Particularly, among young people, some alcohol consumption patterns are well recognized as a public health problem [3]. The last report of the European School Survey Project on Alcohol and Other Drugs (ESPAD) showed that 34% of young people reported binge drinking [4]. In Spain, according to the last report from the Survey on Alcohol and Other Drugs in Spain (EDADES) [5], the prevalence of any drinking in the last 30 days was 59.7% in young people.
Among young people, university students report a higher frequency of some risk patterns of alcohol consumption than non-students [6]. In the university population, binge drinking is common and strongly related to other risky health behaviors [7][8][9] and high levels of alcohol consumption have been reported [10][11][12][13]. In this sense, it has been proposed to carry out screening programs to identify patterns of alcohol consumption and to implement web-based interventions that have proven to be effective [14][15][16].
The Alcohol Use Disorders Identification Test (AUDIT) [17] is a self-administered instrument developed by the WHO, providing classifications of alcohol consumption and dependence. The Spanish version of the AUDIT has been validated in health care settings [18][19][20][21], among young people (12-18 years) [22,23] and in university student populations [24,25]. To our knowledge, however, this is the first study to evaluate an online version administered as part of a larger mental health survey, as in the UNIVERSAL project [26]. This matters because the results obtained may differ according to the mode of administration [27,28]. Obtaining such evidence would be important since online instruments allow for easier and less costly administration than other modes.
The UNIVERSAL project is a multicenter, observational and prospective cohort study that aims to assess the prevalence and incidence of mental disorders among Spanish university students [26]. This project is part of the World Mental Health International College Student (WMH-ICS) initiative. The online survey administered in the study was composed of multiple screening instruments for the assessment of mental disorders and suicidal thoughts and behaviors, where the screening for alcohol use disorders (AUD) was based on the AUDIT [17,29].
The aims of the present study were to assess the reliability, construct validity and diagnostic accuracy of the online version of the AUDIT as used in the UNIVERSAL project for measuring alcohol consumption and AUD according to standard reference measures, 7-day Timeline Followback (TLFB) [30] and Mini International Neuropsychiatric Interview (MINI) [31], respectively.

Participants
Participants for this validation study were recruited from the UNIVERSAL project. Further information on the UNIVERSAL project has been published elsewhere [26]. Briefly, in the 2014/15 academic year, all first-year students enrolled in a university degree for the first time and aged 18-24 from five Spanish universities in five Spanish regions, Andalusia (UCA), Basque Country (UPV-EHU), Balearic Islands (UIB), Catalonia (UPF) and Valencia (UMH), were eligible and were invited to participate in the UNIVERSAL project (n = 16,332). These universities represented around 8% of the total number of students in public universities of Spain in 2014/15, and their distribution in demographic characteristics (i.e., gender, nationality and academic field) was similar to that of the overall population of students in public universities of Spain (results available upon request). The students participating in the study were re-contacted every year, from the 2015/16 to the 2017/18 academic years, for follow-up online assessments. Ethical approval was provided by the Parc de Salut Mar-Clinical Research Ethics Committee (Reference: 2013/5252/I, 3 December 2013).
A total of 2343 students answered the online baseline survey of the UNIVERSAL project. In the clinical reappraisal sub-study, students were invited to participate after responding to the online surveys at different time periods (i.e., at baseline and at 1st and 2nd follow-up). Inclusion criteria in the clinical reappraisal sub-study were: (i) acceptance of informed consent; (ii) provision of a telephone number; and (iii) completion of the diagnostic sections of the online survey. A sub-sample of 575 individuals fulfilling the inclusion criteria was selected for participation through sampling strategies (see flow diagram in Figure 1). Further information on the clinical reappraisal sub-study has been published elsewhere [32]. Consecutive sampling of cases was applied at baseline (2014/2015) and 1st year follow-up (2015/16).
In order to increase the number of individuals with specific mental disorders for the clinical reappraisal study, a stratified random selection with different probabilities of selection was conducted during the 2nd year follow-up (2016/17) by selecting: (a) 100% of individuals who screened positive in the online survey for the following mental health problems: alcohol use disorder, generalized anxiety disorder, panic disorder, bipolar disorder, substance use disorder, suicide plan and suicide attempt; (b) a random 20% of individuals with positive screen of a major depressive episode or suicidal ideation but without any of the above mental health problems; and (c) 10% of the remaining respondents. This probabilistic sample allowed us to restore the original distribution of disorders in the UNIVERSAL study through the use of sampling weights in the analysis. The final sample of the clinical reappraisal study was 287 students.
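To illustrate how sampling weights can restore the original distribution after stratified selection, the following is a minimal Python sketch, not the study's SAS code. It uses the three selection probabilities stated above (100%, 20%, 10%) and rescales the inverse-probability weights to the sample size, as the Analysis section describes. The stratum labels, the function name `normalized_ipw` and the toy sample are hypothetical, and the post-stratification step is omitted.

```python
# Selection probabilities per stratum in the 2nd-year follow-up, as stated
# in the text: (a) 100%, (b) 20%, (c) 10%. The labels are illustrative.
SELECTION_PROB = {"a": 1.0, "b": 0.2, "c": 0.1}


def normalized_ipw(strata):
    """Return one weight per respondent: the inverse of the selection
    probability, rescaled so that the weights sum to the sample size."""
    raw = [1.0 / SELECTION_PROB[s] for s in strata]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical sample: 4 respondents from stratum (a), 2 from (b), 1 from (c).
sample = ["a", "a", "a", "a", "b", "b", "c"]
weights = normalized_ipw(sample)
```

In this toy sample, a respondent from stratum (c) carries ten times the weight of a respondent from stratum (a), reflecting the 10% versus 100% selection probabilities.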

Online Survey
The online survey included an adaptation of the AUDIT questionnaire [17] for estimating the current usual prevalence of alcohol consumption and AUD (abuse or dependence) among first-year university students, without specifying a time period. The AUDIT is a self-administered questionnaire composed of 10 items with a scoring range of 0-40 points. The questionnaire covers the quantification of alcohol consumption, behavior towards drinking, adverse reactions and problems related to alcohol consumption.
Three variables were defined based on the AUDIT: (a) binge drinking (BD), a dichotomous variable obtained from the third AUDIT question, which in our study was lowered to five or more drinks for both genders ("How often do you have five or more alcoholic drinks at a single sitting?"), as recommended in previous validation studies among university students [24,34]. Responses were coded as follows: never/less than once a month = 0 and 1-2 days a month/1-2 days a week/3-4 days a week/every day or nearly every day = 1 [35,36]; (b) risk drinking, based on the total score, with gender-specific cut-off values of 8 in men and 6 in women, as recommended in previous Spanish validation studies [18,21,24]; and (c) probable dependence, a dichotomous variable based on the total score with a cut-off of 13 for both genders [21].
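The three definitions above can be sketched in Python as follows. This is an illustrative sketch only: the function name `audit_screen` and the dictionary keys are hypothetical, and it assumes that a total score at or above a cut-off counts as a positive screen, which the text does not state explicitly.

```python
# Item-3 response labels coded as binge drinking (BD = 1), per the text.
BD_POSITIVE = {
    "1-2 days a month",
    "1-2 days a week",
    "3-4 days a week",
    "every day or nearly every day",
}

RISK_CUTOFF = {"men": 8, "women": 6}  # gender-specific cut-offs from the text
DEPENDENCE_CUTOFF = 13                # same cut-off for both genders


def audit_screen(total_score, item3_response, gender):
    """Derive the three screening variables for one respondent.

    Assumption: a score at or above the cut-off is a positive screen.
    """
    if not 0 <= total_score <= 40:
        raise ValueError("AUDIT total score must be in the range 0-40")
    return {
        "binge_drinking": item3_response in BD_POSITIVE,
        "risk_drinking": total_score >= RISK_CUTOFF[gender],
        "probable_dependence": total_score >= DEPENDENCE_CUTOFF,
    }


# Hypothetical respondent: a woman with a total score of 7.
example = audit_screen(7, "less than once a month", "women")
```

Under these assumptions, the hypothetical respondent screens positive for risk drinking only, because 7 reaches the cut-off of 6 for women but not the dependence cut-off of 13.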

Reappraisal Instruments
Interviews for the clinical reappraisal study were conducted using reference standard instruments to obtain the diagnostics with which to validate the online version of the AUDIT. Telephone interviews were performed by clinical psychologists specially trained for structured interviews who were blind to the online survey responses. Interviewers had no personal information (only the telephone number) of the participants in order to preserve confidentiality. Two standardized measures were selected as gold standard instruments for this validation study: 7-day Timeline Followback (TLFB) and Mini International Neuropsychiatric Interview (MINI).
The 7-day TLFB is a drinking assessment method that obtains estimates of daily drinking in the past 7 days, using a record of standard drink units (SDUs) consumed at different times or occasions throughout the day [30]. Participants completed a diary, with the help of the clinical interviewer, in which they were asked about the amount of alcohol consumed at different times of the day during the previous seven days. The following four categories were considered: (a) non-drinkers (SDUs = 0), (b) low-risk drinkers (SDUs ≤ 21 and ≤ 14 for men and women, respectively) [37], (c) moderate-risk drinkers (22-27 SDUs for men and 15-16 SDUs for women) and (d) high-risk drinkers (≥ 28 SDUs for men and ≥ 17 SDUs for women) [37,38]. Additionally, BD was defined as the consumption of 5 or more SDUs in a single sitting [38][39][40].
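As a rough illustration of these category definitions, the sketch below classifies a 7-day diary in Python. The function names and the input layout (one SDU total per day, and a separate list of per-sitting SDUs for BD) are assumptions made for the example, and integer SDU counts are assumed so that the bands 21/22 and 27/28 are contiguous.

```python
# Weekly SDU upper bounds for low- and moderate-risk drinking, from the text.
LOW_RISK_MAX = {"men": 21, "women": 14}
MODERATE_RISK_MAX = {"men": 27, "women": 16}
BINGE_SDUS = 5  # SDUs consumed in a single sitting


def tlfb_category(daily_sdus, gender):
    """Classify a 7-day diary given as a list of 7 daily SDU totals."""
    if len(daily_sdus) != 7:
        raise ValueError("expected one SDU total per day for 7 days")
    weekly = sum(daily_sdus)
    if weekly == 0:
        return "non-drinker"
    if weekly <= LOW_RISK_MAX[gender]:
        return "low-risk"
    if weekly <= MODERATE_RISK_MAX[gender]:
        return "moderate-risk"
    return "high-risk"


def tlfb_binge(sitting_sdus):
    """Return True if any single sitting reached 5 or more SDUs."""
    return any(s >= BINGE_SDUS for s in sitting_sdus)
```

For example, a hypothetical diary of `[0, 0, 0, 0, 2, 6, 7]` totals 15 SDUs, which falls in the moderate-risk band for women but the low-risk band for men.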
The adapted version of the Spanish structured interview MINI 5.0.0 [41] was administered for AUD diagnostics according to the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) [42] criteria and referred to the previous 12 months. Four categories were considered: (a) non-case; (b) alcohol abuse; (c) alcohol dependence; and (d) any AUD (abuse/dependence).

Analysis
We compared characteristics of the clinical reappraisal sub-sample and prevalence estimates of alcohol consumption categories and AUD according to the reference standards by gender using a chi-squared test and Fisher's exact test. Reliability and confirmatory factor analyses were performed in the overall sample of participants in the UNIVERSAL project (n = 2343). The analysis with the MINI as a reference standard was performed with the whole sample of reappraised university students (n = 287), and the analysis with the TLFB as a reference standard was restricted to those students from the reappraisal sample who provided all TLFB data (n = 242).
The reliability of the AUDIT was analyzed using Cronbach's alpha and Guttman's lambda-2 as measures of internal consistency. The AUDIT total score was used in this study and its unidimensionality was evaluated through confirmatory factor analysis (CFA) with the one-factor solution, using unweighted least squares estimation. We assessed the model chi-square statistic with its degrees of freedom and corresponding p-value. Because this test is sensitive to large sample sizes [43,44], we additionally examined the following goodness-of-fit indices: comparative fit index (CFI), Tucker-Lewis index (TLI) and root mean square error of approximation (RMSEA), considering CFI and TLI values of at least 0.95 and RMSEA < 0.06 as indicating good fit [45,46].
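For readers less familiar with the two internal consistency coefficients, here is a small Python sketch of their standard textbook formulas, computed from the item covariance matrix. It is an illustration only and not the SAS procedure used in the study; the function names are hypothetical and the input is assumed to be a list of respondents, each a list of item scores with no missing values.

```python
from math import sqrt


def covariance_matrix(rows):
    """Sample covariance matrix of item scores (one row per respondent)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]


def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of sum score)."""
    cov = covariance_matrix(rows)
    k = len(cov)
    item_var = sum(cov[i][i] for i in range(k))
    total_var = sum(sum(row) for row in cov)  # variance of the sum score
    return k / (k - 1) * (1 - item_var / total_var)


def guttman_lambda2(rows):
    """lambda-2 adds a term based on the squared off-diagonal covariances."""
    cov = covariance_matrix(rows)
    k = len(cov)
    item_var = sum(cov[i][i] for i in range(k))
    total_var = sum(sum(row) for row in cov)
    off_diag_sq = sum(
        cov[i][j] ** 2 for i in range(k) for j in range(k) if i != j
    )
    return (1 - item_var / total_var) + sqrt(k / (k - 1) * off_diag_sq) / total_var


# Hypothetical scores of 5 respondents on 3 items (not study data).
toy = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 3, 4], [4, 4, 3]]
alpha = cronbach_alpha(toy)
lambda2 = guttman_lambda2(toy)
```

Lambda-2 is never smaller than alpha on the same data, which is consistent with the two coefficients usually being reported side by side.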
Known-groups validity was assessed by computing weighted average scores (weighted standard deviation) of the AUDIT across TLFB groups: those who do not drink, low-risk drinkers and moderate-risk to high-risk drinkers. Similarly, we computed weighted average scores (weighted standard deviation) of the AUDIT across MINI diagnoses: no AUD, alcohol abuse and alcohol dependence disorder. A Jonckheere-Terpstra test was calculated with the ex ante hypothesis that there would be a gradient from lower to higher values in AUDIT scores across these groups. Statistical significance was set at the 5% level based on two-sided tests. Cohen's effect sizes were computed for each category as compared to the lowest category ("non-drinkers" for TLFB; "non-case" for MINI) [47], considering small (0.2), moderate (0.5) and large (0.8) effect sizes [48]. Criterion validity of the AUDIT scores was assessed with the receiver operating characteristic (ROC) curve and its corresponding area under the curve (AUC), considering the TLFB definitions and MINI diagnoses as the reference standards. According to Landis and Koch (1977), different ranges of AUC were assigned labels of discrimination ability: slight (0.50-0.59), fair (0.6-0.69), moderate (0.7-0.79), substantial (0.8-0.89) and almost perfect (≥0.9) [49]. Finally, we studied test characteristics for pre-specified cut-off points of the AUDIT with respect to the TLFB and MINI definitions described previously: sensitivity (SN), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), likelihood ratio positive (LR+) and likelihood ratio negative (LR-). The AUCs for the dichotomous categories of the AUDIT are also presented. In the case of a dichotomous predictor and a dichotomous outcome, the AUC equals (SN + SP)/2 [50]. For assessing the differences in prevalence between the online version of the AUDIT and the reference standard, a McNemar χ² test was calculated.
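The test characteristics listed above follow directly from a 2×2 table of screen result against reference standard. The Python sketch below shows the standard definitions on unweighted counts; the study itself applied sampling weights, so this is a simplified illustration, and the function name and the counts are hypothetical.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Test characteristics of a dichotomous screen against a reference
    standard, from unweighted 2x2 counts.

    Note: LR+ is undefined when SP = 1 and LR- when SP = 0.
    """
    sn = tp / (tp + fn)  # sensitivity
    sp = tn / (tn + fp)  # specificity
    return {
        "SN": sn,
        "SP": sp,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "LR+": sn / (1 - sp),
        "LR-": (1 - sn) / sp,
        "AUC": (sn + sp) / 2,  # dichotomous predictor and outcome
    }


# Hypothetical counts chosen for round numbers, not taken from the study.
m = diagnostic_accuracy(tp=40, fp=20, fn=10, tn=180)
```

As a check on the dichotomous AUC formula, the SN of 56% and SP of 97.5% reported in the abstract for the cut-off of 13 give (0.56 + 0.975)/2 ≈ 0.77.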
Inverse probability weighting was computed to adjust for the sampling method applied in the reappraisal selection carried out during the 2nd year follow-up (2016/17). Weights were obtained as the inverse of the probability of selection within each stratum in the 2nd year follow-up and normalized to the total sample size of the clinical reappraisal study. Post-stratification weights were applied to correct for differences in gender, academic field and nationality between the clinical reappraisal sample and the respective UNIVERSAL sample, as the reference population. Analyses were performed using SAS v9.4 [51] and MPLUS v8.5 [52].
Table 1 shows the sociodemographic characteristics and prevalence of the reference standard measures in the clinical reappraisal sample. The majority of the sample was women (75.3%), 69.6% were 18 years old and 2.9% had non-Spanish nationality. The largest share of students (47.6%) was enrolled in social sciences. According to the MINI, men were significantly more likely than women to meet the criteria of alcohol abuse (14.2% vs. 6.4%, p = 0.028), but no gender differences for MINI alcohol dependence were found (1.6% vs. 1.3%). TLFB alcohol consumption categories (i.e., binge drinking, moderate-risk drinking and high-risk drinking) also did not show statistically significant differences between genders.
The internal consistency of the AUDIT evaluated by the Cronbach's alpha coefficient was 0.817. The lambda-2 coefficient was 0.829. Corrected item-total correlations ranged from 0.332 to 0.663 (Table 2).
Table 3 shows standardized factor loadings of the one-factor CFA model of the online version of the AUDIT, which ranged from 0.545 to 0.797. The model chi-square statistic was 219.073 (35), p-value < 0.001, and the CFA indices had optimal values according to the cut-off criteria, indicating a good fit to the data, with a CFI of 0.973, a TLI of 0.966 and an RMSEA of 0.049 (95% CI 0.043-0.056).
Table 3. Standardized factor loadings from a confirmatory factor analysis with one factor of the online version of the AUDIT administered in the UNIVERSAL sample (n = 2343).
Figure 2A shows weighted mean AUDIT scores and their weighted standard deviation (SD), as well as corresponding effect sizes, across the TLFB alcohol consumption categories. A clear upward gradient was observed for the AUDIT scores, rising from the "non-drinkers" group (mean = 2.86, SD = 2.89), through the "low-risk drinkers" (mean = 4.51, SD = 3.42) and finally to the "moderate-risk drinkers or more" group (mean = 13.5, SD = 8.03) (J = 10,025.5; p-value < 0.001). Similar results were obtained for women. Results for men could not be calculated due to insufficient data in the "moderate-risk drinkers or more" category (n < 5), but differences between non-drinkers and low-risk drinkers were small and not statistically significant. Effect sizes associated with "moderate-risk drinkers or more" were the highest for the total sample (ES = 3.34) and for women (ES = 2.72).
Figure 2B shows weighted mean AUDIT scores and weighted SD for alcohol abuse and dependence as assessed by the MINI. Again, a consistent upward gradient was observed for the AUDIT scores, rising from "non-cases" to respondents with "dependence" criteria. Results for men in the "dependence" category (n < 5) could not be calculated due to insufficient data. Effect sizes associated with "dependence" were the highest for the total sample (ES = 2.72) and for women (ES = 2.98).

AUDIT
The ability of the AUDIT scores to detect alcohol risk-drinking and AUD, using the TLFB and the MINI as the respective gold standards, is presented by ROC curves and AUCs in Figure 3. AUCs were substantial for the TLFB, with values of 0.84 to detect moderate-risk drinking and 0.85 for high-risk drinking. For the MINI, AUCs ranged from moderate (0.78) for alcohol abuse/dependence to almost perfect (0.96) for alcohol dependence.
Accuracy analyses were first performed for alcohol consumption categories (Table 4), comparing different AUDIT cut-off points with the Timeline Followback categories as the gold standard. The AUDIT binge drinking item had a sensitivity (SN) of 41.4% and specificity (SP) of 83.6% for detecting binge drinking. Results for pre-specified cut-off points for detecting at least moderate-risk drinkers were SN = 76.2% and SP = 78.9% for cut-off point 8 for men and cut-off point 6 for women. Using the cut-off point 13 for both genders provided an SN of 74.4% and SP of 98.3% for high-risk drinkers based on the TLFB. Prevalence estimates are also presented in Table 4, which show statistically significant differences between the index test and gold standard according to the McNemar test for binge drinking and for moderate-risk drinkers. The AUCs were moderate to substantial for moderate-risk and high-risk drinking (ranging from 0.7 to 0.9), and fair for binge drinking (0.6).
In Table 5, AUDIT cut-off points were compared to the MINI as the gold standard for detecting AUD. The AUDIT cut-off point used for detecting alcohol abuse or dependence was 8, as a generally accepted cut-off [29,53]. SN for men was 26.6% and for women 46%, while SP was higher for both genders (81.1% and 90.2%, respectively). The alternative cut-off point of 13 for alcohol dependence, recommended by García-Carretero et al. (2016) [24], showed more accurate results: SN of 56% for the overall sample and 54.3% for women, and SP of 97.5% and 97.6%, respectively. According to the McNemar test (Table 5), no statistically significant differences were found in prevalence estimates. The dichotomous AUCs for alcohol dependence were higher than those for alcohol abuse/dependence in the overall sample (0.77 vs. 0.60) and in women (0.76 vs. 0.68).

Discussion
This study has assessed the validity of the online AUDIT to identify diagnostic criteria for AUD as well as risk-drinking according to the MINI and the TLFB. The results show that the online version of the AUDIT is adequate to detect alcohol dependence among Spanish university students and to discriminate different alcohol consumption categories.
Reliability is a prerequisite for validity [54]. Internal consistency for the online version of the AUDIT was good for the instrument as a whole, reflecting the consistency of responses across its items. Our results are comparable to those found in previous Spanish studies among university students [24] and the general population [18,19], which found a value of around 0.8. However, corrected item-total correlations were low for some of the items, particularly for items 6 and 10. This was also found in previous studies [19,24], which concluded that it could be because the 4th to 10th items assess dependence and harmful alcohol use [29], and these were less frequent in this population.
The unidimensionality evaluation, consistent with the use of the total score of the AUDIT, showed a good fit to the data, as found in previous studies [55,56]. The results obtained for the known-groups comparisons also provide support for the construct validity of the online version of the AUDIT. An upward gradient was observed in both cases, for alcohol consumption and for AUD categories. Increasing scores were obtained across different types of consumption (similar to those reported by García-Carretero et al., 2016).
The results reported also offer evidence of good diagnostic accuracy of the AUDIT for identifying risk-drinking categories (assessed by the TLFB) and alcohol dependence (assessed by the MINI). The AUCs to assess discrimination ability for risk-drinking were substantial (0.84-0.86) and similar to those found in previous validation studies among university students (AUCs 0.87-0.93) [34,57]. However, these AUCs to identify risk-drinking were slightly lower than those obtained in previous Spanish validation studies (AUC = 0.95-0.98) [20,24], which differed in the mode of administration of the AUDIT. The AUC for detecting alcohol dependence was almost perfect (AUC = 0.96), similar to AUC values obtained in other studies [24,57]. Finally, the AUC for detecting BD with the AUDIT was fair (0.6) and lower than the AUC reported by Cortés et al. (2017), who support the recommendation to change the third item of the AUDIT to four or more drinks in women [34] or to use the full instrument, such as the AUDIT or AUDIT-C, to identify BD [23,25].
In this study, previously suggested cut-off points in the AUDIT for alcohol risk consumption among men and women (8 and 6, respectively) resulted in sensitivities and specificities lower than those of the Spanish validations among university students [24] and in primary care [18,21], but similar to those of the validation carried out in the United States among university students by Kokotailo et al. (2004) [34]. The online AUDIT cut-off score of 13 for detecting alcohol dependence also showed lower psychometric properties than in a previous Spanish validation study [24]. Sensitivity analyses conducted in this study showed better psychometric properties with a cut-off point of 12 for detecting alcohol dependence (full results of the additional analyses are available upon request). Additionally, our study showed low PPVs and NPVs, which might be due to the low prevalence of alcohol disorders in our population [58].
The results of this study must be interpreted taking into account the following limitations. First, we used the MINI as the gold standard diagnostic instrument, which is not used as widely as other structured interviews (such as the Structured Clinical Interview for DSM-IV (SCID) [59] or the Composite International Diagnostic Interview (CIDI) [60]), but we applied it for feasibility and because it has shown acceptable SN/SP values (0.8/0.8) for structured interviews. Second, also in relation to the MINI, this study used the validated Spanish version of the MINI, which is based on the DSM-IV criteria. Due to the subsequent publication of the DSM-5 and the changes in the diagnosis of alcohol use disorder, the results obtained in this study may differ from those that would be obtained under the new criteria. Recent validation studies of the AUDIT according to the DSM-5 have found few differences in their results [61][62][63][64], but further research on the validation of the AUDIT among university students is needed. Third, although the validity of the online instrument was established in a sample of 287 university students, the low prevalence of alcohol-related problems limited the statistical power of our study. Importantly, our results on the overall sample are consistent with previous findings [24]. However, studies with larger samples are needed for the online version of the AUDIT. Fourth, the recall periods of the instruments differed. The MINI interview administered in this study uses a 12-month recall period and the TLFB that was administered in this study uses a recall period of 1 week. On the other hand, we used an adapted version of the AUDIT without the original 12-month recall period, thus assessing current usual alcohol consumption without a specific reference period.
We decided to use a short recall period of the TLFB to reduce the possible bias in the information collected regarding long periods of time and to reduce the burden on the interviewer and the respondent of a longer diary [65][66][67][68]. However, the use of such a short period of time could bias the estimate of the usual consumption pattern of university students [69]. If this was the case, the association among the measures would be attenuated and would underestimate the validity of the online AUDIT.

Conclusions
We have tested the metric performance of the Spanish online version of the AUDIT among Spanish university students. The results indicate good reliability of this version, as well as good construct validity and diagnostic accuracy. If applied in epidemiological research settings, the online version of the AUDIT might be useful to improve the detection of risky alcohol consumption patterns and probable cases of AUD. The ease of administering the online AUDIT will facilitate its inclusion in more complete mental health profile evaluations. However, there is a shorter validated version of the AUDIT (i.e., the AUDIT-C) that could be useful, so a next step would be to validate the online version of the AUDIT-C among university students. It is known that online screening and interventions could reduce drinking in university students [70]. Such programs could be implemented more widely, for instance, across university campuses.