Validity and Psychometric Properties of the Internet Gaming Disorder Scale in Three Large Independent Samples of Children and Adolescents

Background: Problematic gaming has become a major health issue in children and adolescents, resulting in the need for targeted, valid, and reliable screening instruments. This study aimed to explore the psychometric properties and criterion validity of the widely used 9-item Internet Gaming Disorder Scale (IGDS) in young gamers. Methods: Three independent samples were drawn from socio-demographically representative cross-sectional telephone surveys collected in the years 2016 (N = 762), 2017 (N = 777), and 2018 (N = 784) and analyzed separately. Results: The IGDS revealed psychometric properties suitable for screening in large samples. Cronbach’s alpha was 0.778 (2016), 0.724 (2017), and 0.563 (2018). The unidimensionality assumption was challenged. Compared to normal gamers, at-risk and pathological gamers reported longer digital media use and more emotional symptoms and hyperactivity/inattention, with clinically relevant (small) to medium effect sizes. The comparison of at-risk and pathological gamers indicated a partial distinction between the two problematic gaming groups. Conclusions: The IGDS could be shown to be an overall suitable and valid tool to identify pathological gamers in childhood and adolescence according to the DSM-5 criteria on a population level. However, the polythetic structure limits comparability with the recent ICD-11 criteria. At-risk gamers appeared as a heterogeneous group warranting more research.


Introduction
The use of digital games has significantly increased during recent years and especially the last months [1]. Among smartphone apps, digital games are the most popular, with increasing revenue [2,3]. Children and adolescents are particularly prone to these media since many games are designed to address their special interests and to foster strong attachment through graphics, stories, and the application of intermittent reward techniques [4,5]. The frequency and daily time they spent with games on electronic devices significantly increased under the first lockdown measures of the COVID-19 pandemic [6]. While the neural reward system of children and adolescents is fully developed, their cognitive control mechanisms are still immature [7]. Together with the higher exposure to digital games, this makes them especially vulnerable to developing at-risk or pathological gaming behavior [8,9]. Prevalence estimates of pathological gaming range between 1.2% and 5.9% in European, Asian, and Australian adolescents, and the condition affects sleep and academic achievement as well as family relationships [10].
Hence, pathological gaming has become a major concern for health care professionals in recent years. This has been acknowledged by including Internet gaming disorder (IGD) as a "condition warranting more clinical research and experience" in the Diagnostic and Statistical Manual of Mental Disorders 5 (DSM-5) of the American Psychiatric Association (APA) [11]. [...] A further validation of these aspects in different samples from a developmental science perspective is urgently required. To date, research on IGD assessment neglects possible specifics of different age and sex groups. It is common to include children, adolescents, and adults in large psychometric studies without age- or sex-related analyses and with males usually being oversampled. Even in the aforementioned informative and comprehensive systematic review on IGD assessment tools by King et al. [23], age and sex were not evaluation criteria. Furthermore, there is sparse knowledge concerning at-risk gamers. Can they be separated from normal and pathological gamers, as suggested, e.g., by Milani et al. [15]? Hence, the present study aimed to investigate the psychometric properties of the IGDS in children and adolescents considering different age and sex groups. The convergent validity was explored by comparing normal with at-risk and pathological gamers in terms of time spent with digital media as well as co-occurring emotional and behavioral problems.

Procedure
The data were collected in three independent cross-sectional surveys conducted by the established German marketing and social research institute forsa via computer-assisted telephone interviews in sociodemographically representative samples between 2016 and 2018 (for details, see [30][31][32]). All study participants, including the parents/caregivers of participating children, gave informed consent prior to inclusion. They could withdraw from the study at any time without reason. All procedures performed in this study were in accordance with the ethical standards of the institutional and national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Participants
From the original three surveys, only data (a) of 12- to 17-year-old participants who (b) completed the IGDS without omissions and (c) reported playing any digital games at least once a week were included in the present study. This resulted in 8.5%, 22 11.1% in 2018. Since failure to meet only one of the three criteria mentioned above led to exclusion from the analysis, the cases affected can be understood as missing at random, and their exclusion introduced only a negligible bias into the samples.
Data of the 2016, 2017, and 2018 samples were analyzed separately and not combined to take advantage of three independent replication trials. This makes it more likely to identify psychometric weaknesses that would go undetected in one joint large sample. The sample size was comparable in all surveys: 2016 (N = 762), 2017 (N = 777), and 2018 (N = 784). Age groups were based on the German social code ("Sozialgesetzbuch") which defines children by law as younger than 14 years and adolescents as younger than 18 years.

Measures
Time spent with digital media was measured by asking for the "average time of playing any digital game on weekends (in minutes)" in the 2016 sample, and by asking for "the daily time of using any social media" in the 2017 sample in a multiple-choice format (<60 min/60-119 min/120-179 min/180-239 min/>240 min). This format is viewed as more reliable by marketing research practitioners than asking for exact times in minutes. Psychopathology was screened in the 2018 sample with the established Strengths and Difficulties Questionnaire (SDQ) by Goodman (https://www.sdqinfo.com/) with German norms [33]. Psychometric studies on German-speaking samples of children and adolescents reported mixed findings on internal scale consistencies [33][34][35][36]. These will be considered in the analyses and in the discussion section.

Statistical Analyses
For assessing psychometric properties, item analyses were performed and internal consistencies computed based on classical test theory. Dimensionality of the IGDS was examined by confirmatory factor analyses (CFAs), item analyses, and principal component analyses (PCAs). CFA fit indices were reported in comparison to the following recommended criteria: χ²/df ratio < 5, root mean square error of approximation (RMSEA) < 0.08, standardized root mean squared residual (SRMR) < 0.08, Tucker-Lewis Index (TLI) ≥ 0.95, comparative fit index (CFI) ≥ 0.95 [37]. Generalized least squares (GLS) was chosen as the parameter estimation method since no assumptions of multivariate normality among the observed variables were made. Following the "known group"-paradigm of convergent validation, participants scoring 0-2 on the IGDS (normal gamers) were compared with those scoring 3-4 (at-risk gamers) and 5+ (pathological gamers) by analyses of variance (ANOVA) for metric data and by non-parametric U tests for ordinal data. Because of sample size restrictions and the resulting unbalanced cell frequencies in group comparisons, ANOVAs were preceded by non-parametric H tests, and significance was only assumed when both tests were significant.
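The grouping rule and the conservative decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the nine items are assumed to be coded 0 ("no") and 1 ("yes"), and the significance level of 0.05 is an assumption that is not stated in the text.

```python
def igds_sum(responses):
    """Sum score of the nine dichotomous IGDS items (assumed coding: 0 = no, 1 = yes)."""
    if len(responses) != 9 or any(r not in (0, 1) for r in responses):
        raise ValueError("expected nine responses coded 0 or 1")
    return sum(responses)


def classify_igds(sum_score):
    """Map a sum score to the three gaming groups used in the known-group comparisons."""
    if not 0 <= sum_score <= 9:
        raise ValueError("IGDS sum score must lie between 0 and 9")
    if sum_score >= 5:
        return "pathological"
    if sum_score >= 3:
        return "at-risk"
    return "normal"


def both_significant(p_anova, p_h_test, alpha=0.05):
    """Assume significance only if the ANOVA and the H test both fall below alpha.

    alpha = 0.05 is an illustrative assumption, not a value taken from the study.
    """
    return p_anova < alpha and p_h_test < alpha


# Hypothetical respondent endorsing four of the nine items
print(classify_igds(igds_sum([1, 1, 0, 0, 1, 0, 0, 1, 0])))  # at-risk
```

Requiring both tests to be significant trades some statistical power for protection against spurious findings in small, unevenly sized groups.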
All effect sizes (ES) are reported as Cohen's d and η². Considered conventions for d were: d > 0.20 = clinically relevant (small ES); 0.50 ≤ d < 0.80 = medium ES; d ≥ 0.80 = large ES [38]. Considered conventions for η² were: η² > 0.01 = clinically relevant (small ES); η² ≥ 0.10 = medium ES; η² ≥ 0.25 = large ES. ES were computed via Psychometrica software (https://www.psychometrica.de/) and CFAs via JASP 0.14.1.0, which is based on procedures written in the statistical programming language R (jasp-stats.org). All other computations were performed with IBM SPSS Statistics version 22 (https://www.ibm.com/de-de/products/spss-statistics).

Table 1 reports the sociodemographic data of the three samples, divided by age groups and sex. About 95% of the sample attended school at the time of the interview. All samples of frequent gamers (at least once a week) consisted of a higher proportion of boys (64%) compared to girls (36%).

Notes (Table 1). a Other is study, military/social service, job-seeking, or job.

Sample Description
The distribution of IGDS scores is given in Table 2 together with the assessment outcomes. In the 2016 sample, the prevalence of pathological vs. at-risk gaming was 12 vs. 19%; in 2017 it was 4 vs. 12%, and in 2018 it was 3 vs. 11%. Prevalence of at-risk and pathological gaming was higher in boys than in girls. The total Cronbach's alpha coefficients were 0.778 in 2016, 0.724 in 2017, and 0.563 in 2018. These are shown in Table 2 together with group-specific values. Across all samples, girls showed a mean alpha of 0.682, boys of 0.638, children (12-13 years) of 0.652, and adolescents (14-17 years) of 0.668.
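For readers less familiar with the coefficient, Cronbach's alpha can be computed from a respondents-by-items matrix as k/(k − 1) · (1 − Σ item variances / variance of the sum score). The following Python sketch uses only the standard library; the function name and the toy data are hypothetical and unrelated to the survey data. For dichotomous items such as those of the IGDS, alpha coincides with the Kuder-Richardson formula 20.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equally long rows")
    columns = list(zip(*rows))                      # one tuple per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: four hypothetical respondents, three yes/no items
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
print(round(cronbach_alpha(toy), 3))  # 0.6
```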
Notes. a Based on N = 762 for 2016, N = 777 for 2017, N = 784 for 2018; b group mean (standard deviation); c percentages (frequencies).

Table A2 in Appendix A shows the distribution of validation measures for time spent with digital media across the three samples (with means of typical weekend gaming time in minutes and medians of any daily social media use) and the mean SDQ total difficulties score. Moreover, the distributions of the SDQ subscale scores (Table A3) and alpha values (Table A4) are presented in Appendix A. Due to non-satisfactory consistency values, results on the SDQ subscales 'conduct problems' and 'peer relationship problems' are reported but not interpreted. The same applies to the total difficulties scale, as it is a linear combination of all four subscales and its good alpha values could be attributed merely to the scale length (20 items instead of 5), cp. [39].

Table 3 summarizes the results of the confirmatory factor analyses (CFAs) that were conducted to test the assumed IGDS unidimensionality. Moreover, classical item analyses were additionally conducted as PCAs, whereby the number of extractable components was restricted to k = 1. In none of the three samples did the IGDS items clearly form one dimension. As depicted in Table 3, only RMSEA and χ²/df were clearly in favor of unidimensionality, and items 4, 5, and 8 showed weak loadings, indicating heterogeneity in item contents. According to Table 4, item 5 (escape, avoid thinking about disturbing things) yielded unacceptable values for item-total correlation (r_it) and factor loadings (a) in all three samples. Item 8 (displacement, loss of interest) showed unsatisfactory r_it or low loadings (a) in the 2018 sample. However, the consistency of the scale was not substantially improved by omitting these items: the highest increase would have been from α = 0.563 to α = 0.586 in the 2018 sample.

On inspection, overall component loadings were similar, but item difficulties and item-total correlations varied across samples, again indicating some heterogeneity within components.

Table 5 gives information about the average time of internet gaming and social media use. Differentiation between age groups, although desirable, was not feasible due to the low cell frequencies in the gaming groups. On inspection, the gaming groups could be separated by the time spent with digital media. This was confirmed by significant overall test results with small ES (see Table 6). Planned comparisons (Helmert contrasts) of "normal vs. at-risk/pathological" revealed almost medium ES for internet gaming, and U tests showed small but clinically relevant ES for social media use. "At-risk vs. pathological" comparisons were non-significant and, with the exception of girls' social media use, had small ES, indicating a major overlap between these gaming groups in their temporal digital media use.
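The classical item statistics referred to above (item difficulty as the proportion of "yes" responses, corrected item-total correlation r_it, and alpha if the item is omitted) can be illustrated as follows. This is a minimal sketch with hypothetical function names and toy data; it is not the SPSS procedure used in the study, and it assumes that no item and no rest score is constant (otherwise the correlation is undefined).

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation; raises ZeroDivisionError if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def item_analysis(rows):
    """Per item: difficulty, corrected item-total correlation, alpha if item deleted."""
    results = []
    for i in range(len(rows[0])):
        item = [r[i] for r in rows]
        rest_rows = [list(r[:i]) + list(r[i + 1:]) for r in rows]  # scale without item i
        rest_total = [sum(r) for r in rest_rows]
        results.append({
            "item": i + 1,
            "difficulty": mean(item),                  # share of "yes" (1) responses
            "r_it": pearson_r(item, rest_total),       # item vs. sum of remaining items
            "alpha_if_deleted": cronbach_alpha(rest_rows),
        })
    return results


# Toy data: four hypothetical respondents, three yes/no items
for row in item_analysis([[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]):
    print(row)
```

In the toy data, the third item is uncorrelated with the rest of the scale, and removing it raises alpha, which mirrors the kind of check reported for items 5 and 8.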

Emotional and Behavioral Difficulties
The distributions of the SDQ subscale scores of the 2018 sample, depending on IGDS gaming groups, are presented in Table 7. Again, differentiation by age groups was not reasonable due to the small cell sizes. Gaming groups could be separated according to the severity of emotional symptoms and hyperactivity/inattention, as indicated by the group means. Overall, ANOVA results were significant with small ES (Table 8). The comparisons of "normal vs. at-risk/pathological" revealed almost medium ES for emotional symptoms and hyperactivity/inattention. The outcome for girls' hyperactivity/inattention was not significant between the gaming groups but had an almost medium ES. In the "at-risk vs. pathological" comparisons regarding emotional problems and hyperactivity, 3 out of 6 were significant, two of them in boys, indicating a possible distinction but also an overlap between the two gaming groups.
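As an illustration of the two effect size measures reported here, the sketch below computes Cohen's d with a pooled standard deviation and η² as the ratio of between-group to total sum of squares, and labels d according to the conventions given in the statistical analyses section. The study itself used Psychometrica for these computations; the function names and toy values below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def eta_squared(groups):
    """Eta squared of a one-way design: SS_between / SS_total."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total


def label_d(d):
    """Label |d| following the conventions stated in the text."""
    d = abs(d)
    if d >= 0.80:
        return "large"
    if d >= 0.50:
        return "medium"
    if d > 0.20:
        return "small (clinically relevant)"
    return "below clinical relevance"


# Toy scores for two hypothetical groups
d = cohens_d([2, 4, 6], [1, 3, 5])
print(round(d, 2), label_d(d))  # 0.5 medium
```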
Notes. a Group mean (standard deviation); b percentages (frequencies). Results of statistical significance tests are given in Table 6.

Notes. a Group mean (standard deviation). Alphas b not acceptable or c not satisfactory.

Table A5 in Appendix A shows the distribution of emotional and behavioral problems (assessed by SDQ) depending on the IGDS categories. Approximately one-third of at-risk boys simultaneously reported borderline or clinically severe emotional problems and hyperactivity/inattention. The same is true for two-fifths of the girls classified as at-risk according to the German standards of Becker et al. [33]. The results of Table A5 in Appendix A should be considered with the caveat of small sample sizes.

Discussion
The current study aimed to increase the knowledge on psychometric properties and screening capabilities of a suitable IGD instrument for children and adolescents. Young gamers are especially vulnerable to developing at-risk or pathological gaming patterns, leading to the need for a valid and easily applicable screening tool. The IGDS by Lemmens et al. [19] is one of the few screening instruments to have been initially validated in adolescents and based on the DSM-5 criteria. It showed good psychometric properties in prior research.

Psychometric Properties
The presented results did not indicate scale consistencies in favor of either sex or age groups. Lemmens et al. [19] reported a high Cronbach's alpha of 0.93 and unidimensionality of the IGDS. These values, however, were not replicated in our samples, where alphas varied from 0.56 (sample 3) via 0.72 (sample 2) to 0.78 (sample 1), with the lowest value being regarded as insufficient. It has to be noted that the age range of the original study was 13 to 40 years and, thus, included children, adolescents, and adults. Values comparable to the ones of our samples 1 and 2 were described by Koning [...]. IGDS items 8 (displacement) and especially 5 (escape) did not add to scale consistencies, and omitting these items did not improve them. From a test theoretical point of view, Schmitt [43] argues that a low Cronbach's alpha does not seriously attenuate validity and that such a scale can still deliver useful information. Internal consistency is necessary, but not sufficient for validity. Moreover, alpha reflects not only scale properties but also sample attributes [39]. Looking at item difficulties in Table 4, most IGDS items in the 2018 sample show less agreement (fewer "yes" responses) than in the other samples; the exception is item 5 (escape). This may reflect a tendency of the 2018 sample to answer reluctantly. In 2018, there was an increased interest in and a higher concern about potential risks of digital gaming in German media and school-based prevention programs. This might have fostered a tendency to downplay gaming behavior in order to give socially desirable answers and to appear "normal" to the interviewer. Thus, this contextual influence may have contributed to the weaker psychometric results in the 2018 sample compared to the other two. In the same year, Jeong et al. [44] published a study showing a significant discrepancy between self-reports and clinically verified IGD diagnoses among adolescents, with a false-negative rate of 44%.
The assumption of unidimensionality was further challenged by weak CFA and PCA loadings of items 5 (escape) and 8 (displacement). As reported in the introduction, the criterion escape was considered by experts to have little diagnostic value [26]. The finding of Besser et al. [29] that the 'escape' item formed a factor of its own in adolescent samples, as opposed to adult samples, was supported in all three samples of the current study. The authors commented: "Considering the young age of the participants (...), the use of the internet for mood modification might be an age-sensitive effect that should be considered in further studies" (p. 292) [29]. For reasons of comparability, the original IGDS items should be retained for research purposes, but we suggest rewording the item for the DSM-5 criterion escape in future IGD measures.
Interestingly, earlier work by the authors of this study provided initial evidence that to describe gaming disorder in children and adolescents according to the new ICD-11 criteria, the use of two factors is superior to the use of a single factor. Here, one factor reflects the cognitive-behavioral gaming symptoms and the other factor reflects impending or manifest consequences due to gaming behavior [45,46]. In contrast, an ICD-11 screening tool for adults covering four items found one underlying factor only [47]. Yet, a two-factorial solution is in line with the biaxial model of addiction where an addictive behavior is defined as pathological only when both specific symptoms and adverse outcomes occur [48]. The IGDS includes functional impairments by two out of the nine symptoms (according to DSM-5 criteria 'problems' and 'conflict'), weakly loading on the main IGDS factor in all three samples. Since the scale evaluation follows a polythetic principle with every item being weighted equally, an IGD can be assumed without any impairment symptoms. Esposito et al. [49] argue for a careful consideration of monothetic or different polythetic questionnaire evaluations.
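The consequence of the polythetic scoring can be made concrete by enumerating all possible response patterns. The sketch below counts how many of the yes/no patterns that reach the 5+ cut-off contain neither of the two impairment items. The positions assigned to the impairment items are hypothetical placeholders; the count does not depend on which two positions are chosen.

```python
from itertools import product

N_ITEMS = 9                # nine dichotomous IGDS items
CUTOFF = 5                 # 5+ endorsed items = pathological gaming
IMPAIRMENT_ITEMS = (7, 8)  # hypothetical 0-based positions of 'problems' and 'conflict'


def count_patterns():
    """Return (patterns at or above cut-off, those without any impairment item)."""
    above_cutoff = 0
    without_impairment = 0
    for pattern in product((0, 1), repeat=N_ITEMS):
        if sum(pattern) >= CUTOFF:
            above_cutoff += 1
            if not any(pattern[i] for i in IMPAIRMENT_ITEMS):
                without_impairment += 1
    return above_cutoff, without_impairment


print(count_patterns())  # (256, 29)
```

Under these assumptions, 29 of the 256 response patterns at or above the cut-off would be classified as IGD without any endorsed impairment item.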

Convergent Validity
According to the "known group"-paradigm of convergent validation, the groups of normal, at-risk and pathological gamers were separable by ANOVAs, as well as by nonparametric H tests and U tests. In at-risk and pathological gamers compared to normal gamers, "typical weekend gaming time" was longer with clinically relevant effect sizes (2016 sample), and percentage of "4+ hours daily social media use" was larger (2017 sample).
Emotional and behavioral assessment via SDQ German norms [33] revealed emotional symptoms to be much more pronounced in at-risk/pathological gamers than in normal gamers (2018 sample) with mainly medium effect sizes. The computed ES correspond to correlation coefficients reported by Lemmens et al. [19] and Wartberg et al. [26] varying between r = 0.227 (d = 0.466) and r = 0.311 (d = 0.654). By structural equation modeling in a cross-lagged panel design, Wartberg et al. [26] found the following variables to predict IGD in N = 955 children aged 12-14 years after one year: male sex, IGD at initial assessment, as well as hyperactivity/inattention and self-esteem problems. Accordingly, it can be assumed that "at-risk" adolescents could develop a manifest pathological gaming disorder if clinically relevant hyperactivity/inattention and/or emotional symptoms are present, and no clinical intervention takes place.
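The correspondence between the correlation coefficients and Cohen's d quoted above follows the common conversion d = 2r / √(1 − r²), which assumes two groups of roughly equal size. A short Python check (the function name is hypothetical) reproduces the reported values:

```python
from math import sqrt


def r_to_d(r):
    """Convert a correlation r into Cohen's d, assuming equally sized groups."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2 * r / sqrt(1 - r ** 2)


print(round(r_to_d(0.227), 3))  # 0.466
print(round(r_to_d(0.311), 3))  # 0.654
```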
Exploratory testing for group differences between at-risk gamers and pathological gamers yielded mixed results. This is plausible because "at-risk" gamers form a heterogeneous group. However, an increased psychopathological burden was found in this group, which is worth further clinical evaluation. Thus, for clinical purposes, investigating frequent gamers by applying a cut-off score of 5+ only does not seem to be satisfactory. Even an IGDS score of 3+ seems to indicate an elevated risk of co-occurring emotional and behavioral problems. However, it has to be kept in mind that reducing the IGDS cut-off increases the chance of detecting at-risk gamers but also elevates the risk of misclassifying regular gamers as problematic. Accordingly, Colder Carras and Kardefelt-Winther [48] argue for a careful consideration of both addiction-related symptoms and gaming-related impairments. In their large-scale investigation of 7865 adolescent European gamers, 2.2% of the sample reported symptoms and impairments, while 30.9% of the adolescents would have been misclassified based on their personal problems only. Whereas 23.6% of the sample were "concerned" gamers who indicated only a few addiction symptoms but a high level of impairments, 7.3% were "engaged" gamers who stated many addiction symptoms but only a few impairments. While the first group might resemble adolescents with higher emotional and behavioral problems accompanied by non-addictive gaming, the latter group might correspond most closely to the group of ICD-11 hazardous gamers. A distinction between these two groups is not possible with the IGDS.

Limitations
In the present study, comparisons between the gaming groups were limited due to low disorder prevalence, although gender-, sex-, and age-sensitive research on internet gaming is desirable. Future studies should address this issue by an oversampling of pathological and at-risk gamers. The present findings are based on cross-sectional survey data. Thus, no conclusions can be drawn about symptom stability and retest reliability. A further shortcoming of the current study is the missing criterion validity with objective markers such as logged usage times. Furthermore, to the best of our knowledge, the IGDS has not yet been validated against clinical evaluation in adolescents which is the gold standard when pursuing diagnostic purposes. Therefore, the IGDS is suitable for epidemiological research but not adequate for individual assessment. Finally, IGDS results cannot be easily transferred to evaluate a gaming disorder according to the ICD-11 criteria and there is no clear equivalent to ICD-11 hazardous gaming. Thus, studies comparing both diagnostic approaches would be of great interest.

Conclusions
The current study is the first to investigate the psychometric properties and the criterion validity of an established IGD screener in three large independent representative samples of children and adolescents. Since IGD in this young target group is of high clinical relevance, a suitable and easy-to-administer screening tool is urgently needed. Based on the present findings, the use of the IGDS by Lemmens et al. [19] in children and adolescents as a valid screening tool on a population level could be supported. However, the assumption of unidimensionality was challenged. Normal gamers could be reliably differentiated from at-risk and pathological gamers. However, the heterogeneous group of at-risk gamers could not be further evaluated due to scale limitations. Since the IGDS is a polythetic tool with equal weight of each item, the presence of gaming-associated impairments is not necessary for IGD classification. Along with the missing equivalent of hazardous gaming criteria, this limits a direct comparison of the gaming groups with the new ICD-11 definitions.

Funding: This work was financially supported by the German statutory health insurance DAK Gesundheit.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki 1975, revised in 2013. After consultation with the local ethics committee, ethical review and approval were waived for this study. The study was non-interventional and without impact on the physical and psychological integrity of participants, as participants had no patient status in relation to the authors, and as authors received participants' data in anonymized form (please refer to DFG guidelines for German researchers, German Research Foundation -FAQ: Humanities and Social Sciences). Participants were enrolled based on anonymous panel data.
Neither the interviewers nor the researchers had any access to personal data that would have allowed identification. Thus, data collection and data transfer were completely anonymous, and identification cannot be reconstructed from any source.
Informed Consent Statement: All study participants including the parents/caregivers of participating children gave informed consent prior to inclusion. They could withdraw from the interview at any time without reason.

Data Availability Statement:
The data presented in this study are available from the corresponding author upon reasonable request.

Acknowledgments:
The authors would like to thank all study participants and the German market and opinion research company forsa for the excellent data collection.

Conflicts of Interest:
The authors declare no conflict of interest. DAK Gesundheit had no role in the design of the study, collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Table A1. Brief overview of diagnostic criteria for the internet gaming disorder (IGD).

Label   Criterion
A.      Mental preoccupation with gaming
B.      Withdrawal symptoms
C.      Tolerance/increase of dosage (gaming time)
D.      Failures to gain gaming control
E.      Loss of previous interests or prior hobbies
F.      Continuation of gaming despite insight into adverse consequences
G.      Lying to significant others in respect of factual gaming
H.      Gaming in order to regulate negative moods ('escape')
I.      Elevated risk of losing an important social relationship (job/education)

Notes. For assuming an IGD, a person's gaming behavior must have matched at least 5 out of these 9 criteria in the past 12 months. See details in APA [11].