Classification Performance of the Ages and Stages Questionnaire: Influence of Maternal Education Level

(1) Background: The Ages and Stages Questionnaire-Third Edition (ASQ-3) is a parental screening questionnaire increasingly being used to evaluate the development of preterm children. We aimed to assess the classification performance of the ASQ-3 in preterm infant follow-up. (2) Methods: In this cross-sectional study, we included 185 children from the SEVE longitudinal cohort born before 33 weeks of gestational age between November 2011 and January 2018, who had both an ASQ-3 score at 24 months of corrected age (CA) and a revised Brunet–Lézine (RBL) scale score at 30 months of CA. The ASQ-3 overall score and sub-scores were compared to the RBL developmental quotient (DQ) scores domain by domain. The diagnostic performance of the ASQ-3 was evaluated with the RBL as the reference method by calculating sensitivity, specificity, and positive and negative likelihood ratios. A multivariate analysis assessed the association between low maternal education level and incorrect evaluation with the ASQ-3. (3) Results: The ASQ-3 overall score had a specificity of 91%, a sensitivity of 34%, a positive likelihood ratio of 3.82, and a negative likelihood ratio of 0.72. Low maternal education level was a major risk factor for incorrectly evaluating children with the ASQ-3 (odds ratio 4.16, 95% confidence interval 1.47–12.03; p < 0.01). (4) Conclusions: Given the low sensitivity of the ASQ-3 and the impact of a low maternal education level on its classification performance, this parental questionnaire should not be used alone to follow the development of preterm children.


Introduction
Preterm children are at high risk of neurodevelopmental impairment [1,2]. According to the EPIPAGE-2 cohort study, the rates of severe or moderate neurodevelopmental impairment at five years of age are estimated at 28% for children born between 24 and 26 weeks of gestational age and 12% for children born between 32 and 34 weeks of gestational age [1]. Early evaluation of the developmental trajectory of each preterm child is essential to identify those who should be offered an early developmental intervention that promotes their future activity and participation [3,4]. A meta-analysis of eight studies, including 1436 children born before 37 weeks of gestational age and evaluated between three and five years of age, showed that such early developmental intervention improved intelligence quotient at preschool age (standardised mean difference 0.43 standard deviations, 95% confidence interval (CI) from 0.32 to 0.54) [4].
The two hetero-assessment scales used by professionals for assessing neurodevelopment during the first years of life and validated in France are the revised Brunet–Lézine (RBL) scale and the Bayley Scales of Infant and Toddler Development-Third Edition (Bayley-III) [5,6]. The RBL is the predominant validated hetero-assessment scale used to evaluate the neurodevelopment of children at preschool age in France. It calculates a global developmental quotient (DQ) by computing four DQs for sociability, gross motor function, visuospatial coordination, and language [5]. The Bayley-III evaluates the same neurodevelopmental domains, has the same age groups, and uses a similar scoring technique as the RBL [6].
The Ages and Stages Questionnaire-Third Edition (ASQ-3) is a validated parental screening questionnaire increasingly being used to evaluate the development of preterm children [7,8]. It evaluates the same neurodevelopmental domains as RBL, in addition to the problem-solving domain, and consists of six questions for each domain of child development [8]. A single study compared the ASQ-3 to the RBL scale in this population and found a good correlation between the overall ASQ-3 score and the global DQ of the RBL in children with neurodevelopmental impairment at 24 months of corrected age (CA) [9].
The aim of our study was to assess the classification performance of the ASQ-3 in preterm children and the impact of maternal education level on this classification performance, using the RBL scale as a reference. As the ASQ-3 is a parental questionnaire, we hypothesised that the ASQ-3 would have low sensitivity and that a low maternal education level would be associated with incorrect classification with the ASQ-3.

Materials and Methods
This cross-sectional study was performed and reported according to the STROBE guidelines for observational studies [10].

Population Sample
The Suivi des Enfants Vulnérables sur le Réseau Elena (SEVE) Network is a regional cohort ensuring a free-of-charge and standardised follow-up of vulnerable children, as previously described [11]. As part of the follow-up, parents were encouraged to complete the ASQ-3 at 24 months of corrected age (CA) and a systematic encounter with an experienced neuropsychologist was offered at 30 months of CA to perform an RBL assessment. All children born before 33 weeks of gestational age (GA) between November 2011 and January 2018 with an ASQ-3 evaluation and an RBL evaluation were included in our study. There were no exclusion criteria. Data were longitudinally collected by using standardised questionnaires throughout the predefined follow-up protocol and included maternal age, maternal education level, antenatal steroids administration, multiple pregnancy, vaginal delivery versus caesarean section, GA at birth, birth measurements, Apgar score at 1 min and 5 min, the use of surfactant, cerebral haemorrhage, periventricular leukomalacia, necrotising enterocolitis, ventilatory support duration, hospitalisation duration, and feeding at discharge. Information on the maternal education level was recorded during the neuropsychologist consultation. A low maternal education level was defined as the highest attained education below secondary education (i.e., baccalaureate degree or less in the French school system). A high education level was defined by any high school degree. Birth measurement z-scores were calculated by GA and sex using the INTERGROWTH-21st International Newborn Size at Birth Standards application v1.3.5 (University of Oxford, UK).

Neurodevelopmental Assessment
Neurodevelopmental evaluation at 30 months of CA was performed by an experienced neuropsychologist blinded to the ASQ-3 results, using the RBL scale. The RBL is the predominant scale utilised to evaluate the neurodevelopment of children from 2 to 30 months of CA in France [5,12]. It was validated on a sample of 1032 children born at term [5]. A global DQ was established by computing four DQs for sociability, gross motor function, visuospatial coordination, and language. Each DQ is normalised to the CA and has a mean of 100 and a standard deviation (SD) of 15 [5]. A DQ < 85 is defined as abnormal [5].
Parents completed the French ASQ-3 adapted for children at 24 months of CA. The ASQ-3 consists of six questions for five domains of child development: communication, gross motor, fine motor, problem-solving, and personal-social [8]. A score <−2 SDs is defined as abnormal [8]. The overall ASQ-3 score is considered abnormal if at least one domain is abnormal [8].

Ethics
Written informed parental consent was mandatory to be included in the SEVE Network. The French National Technologies and Civil Liberties Commission (CNIL) approved data collection (authorisation no. 1530737). This study was authorised by the Ethical Committee of the Saint-Étienne University Hospital (April 2020, IRBN392020/CHUSTE).

Statistical Analysis
Categorical variables are expressed as absolute numbers (percentage). Continuous variables are expressed as mean (SD) or median (interquartile range) according to their distribution.
The ASQ-3 was compared to the RBL domain by domain: communication, gross motor, fine motor, personal-social domains, and overall ASQ scores were compared to language, gross motor function, visuospatial coordination, sociability, and global DQs, respectively. The classification performance of the ASQ-3 was evaluated by generating contingency tables for each score. The RBL was defined as the reference method. DQs < 85 were considered as positive, i.e., abnormal development [5]. ASQ-3 domain scores and overall ASQ score with one or more domains <−2 SDs were considered as positive, i.e., abnormal development [8]. The children considered correctly classified with the ASQ-3 were true positives and true negatives. The children considered incorrectly classified were false positives and false negatives.
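The classification rule described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analyses were run in R); the function names and the five example children are hypothetical, and only the two cut-offs (DQ < 85, at least one ASQ-3 domain below −2 SDs) come from the text.

```python
from collections import Counter

DQ_CUTOFF = 85  # RBL reference: a DQ < 85 counts as abnormal (positive)


def asq_overall_positive(domain_flags):
    """Overall ASQ-3 is positive if at least one domain is below -2 SDs."""
    return any(domain_flags)


def contingency(records):
    """Tally TP, FP, FN, TN from (asq_positive, global_dq) pairs."""
    counts = Counter()
    for asq_positive, dq in records:
        reference_positive = dq < DQ_CUTOFF
        if asq_positive and reference_positive:
            counts["TP"] += 1
        elif asq_positive:
            counts["FP"] += 1
        elif reference_positive:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


# Hypothetical children: (ASQ-3 domain flags, global RBL DQ)
children = [
    ([False, True, False, False, False], 78),     # true positive
    ([False, False, False, False, False], 80),    # false negative
    ([True, False, False, False, False], 95),     # false positive
    ([False, False, False, False, False], 102),   # true negative
    ([False, False, False, False, False], 85),    # true negative (85 is not < 85)
]
table = contingency((asq_overall_positive(flags), dq) for flags, dq in children)
print(table)
```

In this sketch, correctly classified children are the TP and TN cells, and incorrectly classified children are the FP and FN cells.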
Using that classification contingency table, sensitivity (true positive rate) and specificity (true negative rate) with their 95% CIs were computed. To summarise the information contained in both sensitivity and specificity, positive and negative likelihood ratios were also computed for each score using Formulas (1) and (2) [13].
$$\text{Positive likelihood ratio} = \frac{P(T^{+} \mid D^{+})}{P(T^{+} \mid D^{-})} = \frac{\text{sensitivity}}{1 - \text{specificity}} \tag{1}$$
$$\text{Negative likelihood ratio} = \frac{P(T^{-} \mid D^{+})}{P(T^{-} \mid D^{-})} = \frac{1 - \text{sensitivity}}{\text{specificity}} \tag{2}$$
"T+" and "T−" represent the positivity or negativity of the test, respectively. "D+" and "D−" represent the presence or absence of abnormal development, respectively. A positive likelihood ratio > 10 or a negative likelihood ratio < 0.1 would be considered satisfactory, whereas a positive or negative likelihood ratio of 1 would mean that the test does not discriminate between individuals with and without abnormal development.
Diagnostic odds ratios with their 95% CI were also calculated to describe the odds of having a positive ASQ-3 test in children with abnormal development relative to the odds of a positive ASQ-3 test in children with normal development. The calculation of the diagnostic odds ratio and its relations to positive and negative likelihood ratios is described in Formula (3) [13].
$$\text{Diagnostic odds ratio} = \frac{\text{positive likelihood ratio}}{\text{negative likelihood ratio}} \tag{3}$$
Then, the diagnostic performance of the ASQ-3 was assessed using the area under the curve (AUC) of receiver operating characteristic (ROC) curves and their 95% CIs, obtained from 10,000 bootstrap iterations. A secondary analysis stratified by maternal education level was also performed using the same methodology.
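The accuracy measures described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code: the 2×2 counts are hypothetical, the AUC is estimated with the Mann–Whitney statistic, and the bootstrap uses a percentile interval with resampling within the D+ and D− groups, none of which is specified in the text. The example runs 1000 bootstrap iterations to stay fast; the study reports 10,000.

```python
import math
import random


def accuracy_summary(tp, fp, fn, tn):
    """Sensitivity, specificity, likelihood ratios and diagnostic odds ratio."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec) if spec < 1 else math.inf   # Formula (1)
    lr_neg = (1 - sens) / spec if spec > 0 else math.inf   # Formula (2)
    dor = lr_pos / lr_neg if lr_neg > 0 else math.inf      # Formula (3)
    return {"sens": sens, "spec": spec,
            "lr_pos": lr_pos, "lr_neg": lr_neg, "dor": dor}


def auc(scores_pos, scores_neg):
    """Mann-Whitney AUC: P(score in D+ > score in D-) plus half the ties."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bootstrap_auc_ci(scores_pos, scores_neg, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method), resampling within each group."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        boot_pos = rng.choices(scores_pos, k=len(scores_pos))
        boot_neg = rng.choices(scores_neg, k=len(scores_neg))
        stats.append(auc(boot_pos, boot_neg))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical counts (not the study data): TP=20, FP=10, FN=30, TN=90
summary = accuracy_summary(tp=20, fp=10, fn=30, tn=90)

# Binary test result coded 1 (positive) / 0 (negative) in each reference group
pos_scores = [1] * 20 + [0] * 30   # children with abnormal development (D+)
neg_scores = [1] * 10 + [0] * 90   # children with normal development (D-)
point = auc(pos_scores, neg_scores)
lo, hi = bootstrap_auc_ci(pos_scores, neg_scores, n_boot=1000)
print(summary, point, (lo, hi))
```

For a dichotomised test such as this one, the Mann–Whitney AUC reduces to (sensitivity + specificity) / 2, which is why a highly specific but insensitive test yields only a modest AUC.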
Univariate and multivariate logistic regression models were used to estimate the independent association between the classification performance of the ASQ-3 and a low maternal education level. The dependent variable ASQ-3 was dichotomised as described above and treated as a binary variable, with an overall ASQ-3 score < −2 SDs considered positive and an overall ASQ-3 score ≥ −2 SDs considered negative. The multivariate model was built a priori, using known potential confounders according to literature data and experts' opinions. Sex, GA, hospitalisation duration, and ventilatory support duration were included in the multivariate model as adjustment covariates. Quantitative covariates were coded as continuous covariates.
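As a rough illustration of the univariate step only, the sketch below computes an unadjusted odds ratio with a Wald confidence interval on the log scale from a 2×2 table, which matches what a logistic regression with a single binary predictor would estimate. The counts are hypothetical, the outcome is taken to be incorrect classification (false positive or false negative), and the adjustment for sex, GA, hospitalisation duration, and ventilatory support duration is not reproduced here.

```python
import math


def odds_ratio_wald(exposed_events, exposed_non_events,
                    unexposed_events, unexposed_non_events, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI computed on the log scale."""
    a, b = exposed_events, exposed_non_events
    c, d = unexposed_events, unexposed_non_events
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not the study data):
# exposure = low maternal education, event = incorrect ASQ-3 classification
or_value, ci_low, ci_high = odds_ratio_wald(
    exposed_events=12, exposed_non_events=18,
    unexposed_events=15, unexposed_non_events=60,
)
print(f"OR = {or_value:.2f}, 95% CI from {ci_low:.2f} to {ci_high:.2f}")
```

A confidence interval that excludes 1 corresponds to a two-sided p-value below 0.05 for this Wald test; a cell count of zero would need a continuity correction, which this sketch does not handle.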
The results are reported as odds ratios (ORs) and their 95% CIs. All tests were two-sided, with a p-value <0.05 considered as statistically significant. Missing data represented 2.5% of the data and were not considered in the analysis. Analyses were performed using R software version 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria).

Results
One hundred and eighty-five children had both an RBL and an ASQ-3 evaluation and were included in the current study. The features of the population sample are described in Table 1. The mean CA at RBL evaluation was 29.0 months (SD 2.7). The results of the neurodevelopmental assessments by ASQ-3 and RBL are summarised in Table 2.
The classification performance of the ASQ-3 is detailed in Table 3. The overall ASQ-3 score had a sensitivity of 0.34 (95% CI from 0.27 to 0.41) and a specificity of 0.91 (95% CI from 0.87 to 0.95). The cross-tabulation of the overall ASQ-3 score and the global DQ score found a positive likelihood ratio of 3.82, a negative likelihood ratio of 0.72, a diagnostic odds ratio of 5.29 (95% CI from 4.43 to 6.17), and an area under the curve of 0.67 (95% CI from 0.56 to 0.77). Regarding domain-by-domain evaluation, the ASQ-3 had a sensitivity of 0.11 (95% CI from 0.07 to 0.16) for the gross motor domain, 0.08 (95% CI from 0.04 to 0.12) for the personal-social domain, 0.13 (95% CI from 0.08 to 0.18) [...]
The subgroup analysis of the classification performance of the ASQ-3 in children with a low maternal education level is detailed in Table 4. The overall ASQ-3 score had a sensitivity of 0.35 (95% CI from 0.21 to 0.49) and a specificity of 0.88 (95% CI from 0.78 to 0.97) in children with a low maternal education level. In these children, the cross-tabulation of the overall ASQ-3 score and the global DQ score found a positive likelihood ratio of 2.80, a negative likelihood ratio of 0.74, a diagnostic odds ratio of 3.77 (95% CI from 2.38 to 5.16), and an area under the curve of 0.66 (95% CI from 0.49 to 0.83) (Table 4).
The subgroup analysis of the classification performance of the ASQ-3 in children with a high maternal education level is detailed in Table 5. In these children, the overall ASQ-3 score had a sensitivity of 0.36 (95% CI from 0.27 to 0.44) and a specificity of 0.92 (95% CI from 0.87 to 0.97). In children with a high maternal education level, the ASQ-3 also had a positive likelihood ratio of 4.37, a negative likelihood ratio of 0.70, a diagnostic odds ratio of 6.23 (95% CI from 4.98 to 7.49), and an area under the curve of 0.64 (95% CI from 0.50 to 0.77) (Table 5).
The results of the analyses calculating the association between the classification performance of the ASQ-3 and a low maternal education level are displayed in Table 6. A low maternal education level was a major risk factor for incorrectly evaluating children with the ASQ-3 on multivariate analysis (OR = 4.16, 95% CI from 1.47 to 12.03; p < 0.01) (Table 6).
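As a quick consistency check (not part of the original analysis), the overall likelihood ratios and diagnostic odds ratio can be approximately recovered from the rounded overall sensitivity (0.34) and specificity (0.91), using the standard identities that the positive likelihood ratio equals sensitivity / (1 − specificity) and the negative likelihood ratio equals (1 − sensitivity) / specificity. The small gaps from the reported values of 3.82, 0.72, and 5.29 are presumably due to rounding of the inputs; that explanation is an assumption.

```latex
\begin{align*}
LR^{+} &= \frac{0.34}{1 - 0.91} \approx 3.78 \\
LR^{-} &= \frac{1 - 0.34}{0.91} \approx 0.73 \\
DOR &= \frac{LR^{+}}{LR^{-}} \approx \frac{3.78}{0.73} \approx 5.2
\end{align*}
```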

Discussion
Compared to the RBL, the ASQ-3 had excellent specificity and low sensitivity to assess the neurodevelopment of preterm children. A low maternal education level was a major risk factor for incorrectly evaluating children with the ASQ-3.
This association between the maternal education level and the classification performance of the ASQ-3 is a significant result to consider because a low maternal education level is also a risk factor for impaired neurodevelopmental outcomes in preterm children [14,15]. Altogether, a low maternal education level can be considered as a double-risk factor in preterm children: a risk factor for an impaired neurodevelopmental outcome, and a risk factor for having such neurodevelopmental outcome incorrectly evaluated if the ASQ-3 is used alone to assess the neurodevelopment [14,15].
The Bayley-III and RBL scales evaluate the same neurodevelopmental domains, have the same age groups, and use a similar scoring technique. The accuracy of the RBL scale compared to the Bayley-III scale is 86.4% for motor skills, 90.9% for cognitive skills, 94.3% for communication skills, and 88.6% for sociability skills [16]. Our findings for the specificity and sensitivity of the ASQ-3 agree with most studies comparing the ASQ-3 to the Bayley-III at this age [17–21]. The ASQ-3 is a screening tool rather than a diagnostic assessment [8]. Some studies considered that the ASQ-3 is not reliable for screening neurodevelopment in children [22,23]. One prospective study considered the ASQ-3 equivalent to the Bayley-III at 2 years of CA to detect moderate to severe neurodevelopmental impairment [24]. A recent meta-analysis including 36 studies concluded that the ASQ-3 performance to identify developmental delay in children aged between 12 and 60 months was moderate, with a pooled sensitivity of 0.77 (95% CI, 0.64–0.86), a pooled specificity of 0.81 (95% CI, 0.75–0.86), a pooled positive likelihood ratio of 4.10 (95% CI, 3.17–5.30), and a pooled negative likelihood ratio of 0.28 (95% CI, 0.18–0.44) [25].
In France, the ASQ-3 has been used as an evaluation method based on a study that compared the ASQ-3 scores and RBL DQ scores at 24 months of CA [9]. Compared to our study, the previous study included more patients and found a high sensitivity (88%) and a specificity of 57% for the ASQ-3 [9]. The high sensitivity might be explained by the RBL being administered earlier, at 24 months of CA, because the neurodevelopmental trajectory becomes more complex with age [26]. Additionally, the authors compared the overall ASQ-3 score to the global RBL DQ and did not focus on each domain of the ASQ-3 [9]. Domain-by-domain evaluation allows for more accurate identification of impairment. Besides language skills, which are reported to more strongly predict impairment than other cognitive domains, early developmental milestones before age 24 months poorly predict neurodevelopmental impairment at school age [27]. This is particularly important for fine motor skills, which involve complex neurodevelopmental functions [28].
Although the parental point of view is essential, a professional assessment of the neurodevelopmental trajectory of preterm children is critical. Parents and professionals view development differently, and the threshold at which they perceive impairment can differ: professionals tend to prioritise functional goals, whereas parents tend to prioritise activity and participation [29]. Additionally, a professional assessment is limited to a short time window of evaluation, whereas parental evaluation reflects average function over time. Alongside the neurodevelopmental trajectory evaluation, a follow-up of the quality of life is also essential in this population [30,31].
Our study is the first to compare the ASQ-3 and RBL scales domain by domain and the second to compare those two neurodevelopmental assessment scales. A limitation is the population sample size of 185, but the features of this sample are comparable to those of the whole population included in the SEVE Network [11].

Conclusions
The ASQ-3 has a low sensitivity for assessing the neurodevelopment of preterm children. A low maternal education level is a major risk factor for incorrectly evaluating children with the ASQ-3 at 24 months of CA. The ASQ-3 should not be used alone to follow the development of preterm children. Although the parental point of view is essential, a hetero-assessment by a professional using a validated scale is critical to evaluate the neurodevelopment of preterm children.

Funding: This work was funded by the Agence Régionale de Santé Auvergne-Rhône-Alpes. The funding source was not involved in the study design; in the collection, analysis, and interpretation of the data; in the writing of the report; or in the decision to submit the paper for publication.

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the Saint-Étienne University Hospital (April 2020, IRBN392020/CHUSTE).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Anonymised data are available from the corresponding authors upon reasonable request.