1. Introduction
Autism spectrum disorder (ASD), a topic of major clinical interest worldwide, is a pervasive developmental disorder that impairs an individual’s social skills, is associated with repetitive behaviours, and affects expressive or verbal communication, with disruptions ranging from moderate to severe [
1]. The symptoms of autism become more visible and easier to identify in children two to three years of age. According to [
2], one out of every 68 children has autism. Consequently, leading medical experts and psychiatrists across the world have developed various screening methods that seek to identify autistic traits at an early stage so that the necessary medication and treatment can be provided promptly [
3].
Diagnosing ASD is a challenging task since there are currently multiple clinical techniques available, with most typically involving long-term observation and in-depth evaluation by licensed healthcare professionals [
4,
5,
6]. Conventional diagnostic procedures for ASD require medical professionals to conduct a clinical assessment of the patient’s developmental age based on a variety of domains (e.g., behaviour excesses, communication, self-care, social skills). This widely accepted approach is referred to as clinical judgment (CJ) [
7]. Until recently, most clinicians have used the
Diagnostic and Statistical Manual fourth edition (
DSM-IV) as the underlying criteria for diagnosing autistic behaviours [
8]. The
DSM-IV classifies autism under the category of common pervasive development disorders (PDDs). This class includes the symptoms and areas of development that need to be observed in order to identify any PDDs. The
DSM-IV introduced four diagnostic subcategories of PDDs: Asperger disorder, pervasive development disorder—not otherwise specified (PDD-NOS), Rett’s disorder, and childhood disintegrative disorder (CDD). In 2013, the
DSM-IV was revised and published as the DSM-5, which posed a significant challenge to the way individuals are assessed as being on the autism spectrum and necessitated refinements to existing diagnostic and screening techniques.
There are many clinical and self-screening methods available to assess individuals with ASD. The most popular clinical methods include Autism Diagnostic Interview-Revised (ADI-R), Autism Diagnostic Observation Schedule (ADOS), Childhood Autism Rating Scale (CARS), Joseph Picture self-concept scale, and the social responsiveness scale [
9,
10,
11,
12]. These are clinical methods used for formal ASD diagnosis and treatment planning [
13]. Techniques such as ADI-R and ADOS have been clinically shown to be effective instruments for differentiating autism from other related developmental disorders, and to have adequate validity and sensitivity [
14]. However, they have been criticised for being time consuming, having long questionnaires and scoring methods, and requiring licensed clinicians and observers to administer them [
15,
16,
17,
18].
Apart from clinical diagnostic methods, there are self-administered screening instruments developed by different neuroscientists and psychologists in the autism and healthcare arena. The tools, such as Autism Spectrum Quotient (AQ), Childhood Asperger Syndrome Test (CAST), and the Modified Checklist for Autism in Toddlers (M-CHAT), which are discussed in later sections, often consist of large sets of items for discriminating the autistic behaviours from all other types of PDDs [
19,
20,
21]. Most of these tools have been developed based on CJ methods, and have been able to present more accessible ways for users to undergo an ASD screening. Nevertheless, screening tools are not considered diagnosis methods for ASD since many of them lack the presence of a licensed clinician as well as the necessary clinical environment. In addition, the majority of these screening tools do not fully align with the new criteria for ASD developed under the
DSM-5. Therefore, the need for revised methods that adhere to the standards of the
DSM-5 has arisen.
There have been many studies in applied behavioural sciences that have investigated the efficiency and effectiveness in clinical environments of ASD diagnosis techniques [
22,
23,
24,
25]. However, few studies have examined the performance of ASD screening methods or evaluated their merits and shortcomings [
2,
26,
27,
28]. For instance, [
26] reviewed common screening methods related to autism and compared their performance only with regard to specificity and sensitivity. Few details about the screening methods were provided, and important aspects such as DSM-5 fulfilment, the methods’ popularity, and their target audience were omitted. Ref. [
27] reviewed early screening methods for toddlers without covering other important aspects relating to adolescents, children, and adults. They indicated that early identification of ASD traits in toddlers, 18–24 months of age, is consistent with the recommendations of the American Academy of Paediatrics. Another similar review of ASD tools for infants was conducted by [
2], and showed that a two-level screening can help improve the reliability of the process. Ref. [
28] conducted a systematic review of common diagnosis methods of ASD in low and middle-income countries. They revealed that because of the limited clinical resources in low-income countries, screening methods are more effective in discovering autistic traits. However, clinical diagnosis methods seem more widely utilised in middle and high-income countries. The review by [
28] noted extensive variation in the design and screening mechanisms and limited population features; therefore, the study’s recommendations may not generalise.
This article critically evaluates the available ASD screening methods in order to identify the merits, performance issues, and shortcomings of each method (not only in terms of sensitivity and specificity, but also with respect to administration, efficiency, target audience, complexity, digital existence, and accessibility, among others). Furthermore, screening methods that cover all target ages (toddlers, children, adolescents, and adults) are critically analysed, making this review comprehensive and applicable to the entire population of cases. For convenience, all identified methods are categorised according to their target audience, i.e., toddlers and children, adolescents and adults, and hybrids. The screening methods included in the hybrid category facilitate the screening of toddlers, children, adolescents, and adults all together.
The article consists of four main Sections.
Section 2 reviews and critically analyses the ASD screening tools considered.
Section 3 is devoted to a comprehensive discussion that contrasts and evaluates the identified screening tools in terms of their method of administration, accessibility, popularity, performance, and comprehensibility. Lastly,
Section 4 contains suggestions for the different stakeholders involved in ASD research and presents conclusions that summarise the findings of the study.
3. Discussion
The section below is focused on evaluating the screening tools presented above in terms of their administration methods, accessibility, popularity, performance, and comprehensibility in order to shed light on the possible innovations required in new screening tools to be developed for screening autism.
3.1. DSM-IV versus DSM-5 Criteria
Many scholars continue to argue over the shortcomings of both CJ and self-reported screening tools, especially regarding efficiency and administration requirements [
15,
16,
77,
78,
79]. The validity and reliability of most of the screening tools are still under investigation, as most of them follow the earlier version of the DSM (
DSM-IV) rather than the procedures and guidelines of the current
DSM-5 manual. Since most of the screening methods utilise different behavioural characteristics in determining a patient’s developmental age, they have been jointly presented as the triangle of impairments under the definition of the
DSM-IV and still need to consider amendments presented in the
DSM-5 [
80]. The most recent version of the DSM (
DSM-5) groups the five PDDs, consisting of Asperger syndrome (AS), pervasive development disorder–not otherwise specified (PDD-NOS), Rett syndrome (RS), and childhood disintegrative disorder (CDD), into ASD [
80]. The guidelines set by the
DSM-IV are followed mostly by clinicians and healthcare professionals all around the world when diagnosing autistic behaviours. In the USA, the 10th revision of the International Classification of Diseases (ICD-10) is also used in diagnosis and clinical evaluations of autism and other developmental disorders [
81]. The ICD-10 lists seven syndromes under PDD, and includes atypical autism and unspecified PDDs beyond the five PDDs listed by
DSM-5.
Conventional methods used in clinical judgements (CJ) of ASD, such as ADI-R and ADOS, diagnose individuals based purely on behavioural criteria through a questionnaire or interview that contains items related to the
DSM-IV [
10,
11]. After publication of the
DSM-5, researchers pointed out that some cases who were diagnosed with autism using
DSM-IV criteria may not be classified as having ASD under the revised
DSM-5 criteria [
4,
82,
83,
84]. This has created a debate among scholars in behavioural science, psychiatry, and psychology due to the inconsistent sensitivity and specificity results published in the last few years. For instance, [
82] showed a reduction in sensitivity for adults and toddlers while [
85] revealed a consistent sensitivity for cases tested under both the
DSM-IV and
DSM-5 despite a decrement in specificity.
Since most of the ASD screening methods available today were developed prior to 2013, they did not consider the guidelines established in the DSM-5. The existing ASD screening methods are based on clinical diagnosis methods; therefore, the changes in ASD diagnostic criteria introduced by the DSM-5 demand a change in the way the diagnostic algorithms within the screening methods classify cases. For instance, items in the existing screening methods should cover social interaction and social communication (Category A) in the DSM-5 manual and at least two criteria from Category B (Restricted and Repetitive Behaviour). Unfortunately, although items in the majority of current screening methods fulfil multiple criteria in Category A, they still fail to fully cover the conditions in Category B. Nevertheless, screening for ASD does not necessarily require the diagnostic conditions of ASD to be fully met: its aim is merely to reveal potential autistic traits rather than to diagnose individuals, since diagnosis necessitates the involvement of expert clinicians and a clinical setup.
Therefore, there is a need to re-examine questions and features within the ASD diagnostic and screening tools in order to comprehensively satisfy the new criteria of the DSM-5. This necessitates mapping the new ASD criteria to the items used in each screening tool, as well as evaluating the way the diagnostic process works. The outcome may be an updated version of the current screening tool whose items are mapped to the new criteria of ASD in the DSM-5. In addition, comprehensive experimental studies using controls and cases as data are expected to be conducted in order to direct researchers, clinicians, psychiatrists, and psychologists to the screening tool that maintains its performance under the newly proposed changes.
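The mapping exercise described above can be illustrated with a minimal sketch. It assumes that each questionnaire item has already been annotated by a clinician with the DSM-5 criteria it addresses; the labels (A1 to A3, B1 to B4), the function name `dsm5_coverage`, and the toy item map are hypothetical and are not taken from any published screening tool. The check follows the rule stated in the text: Category A must be covered and at least two Category B criteria must be covered.

```python
from typing import Dict, Set

# Assumed labels for the DSM-5 criteria; illustrative only.
CATEGORY_A: Set[str] = {"A1", "A2", "A3"}        # social communication and interaction
CATEGORY_B: Set[str] = {"B1", "B2", "B3", "B4"}  # restricted and repetitive behaviour


def dsm5_coverage(item_map: Dict[str, Set[str]]) -> dict:
    """Summarise which criteria are covered by the items of one screening tool.

    item_map maps an item identifier to the set of criterion labels that
    the item is judged to address (a hypothetical, clinician-made annotation).
    """
    covered: Set[str] = set().union(*item_map.values())
    a_missing = CATEGORY_A - covered
    b_covered = CATEGORY_B & covered
    return {
        "category_a_complete": not a_missing,
        "category_a_missing": sorted(a_missing),
        "category_b_count": len(b_covered),
        # Rule from the text: at least two Category B criteria.
        "category_b_sufficient": len(b_covered) >= 2,
    }


# Toy annotation of a three-item tool (hypothetical data).
toy_tool = {"item1": {"A1"}, "item2": {"A2", "A3"}, "item3": {"B1"}}
report = dsm5_coverage(toy_tool)
print(report)
```

In this toy case the tool covers Category A but only one Category B criterion, which mirrors the gap the text attributes to many current screening methods.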
3.2. Digital Presence and Accessibility
Some of the discussed ASD screening tools are available on the internet, either as web-based online tests or smartphone applications. Other instruments require a payment and are available only in handwritten formats. These tools are intended to enhance the development of disease control and prevention measures through early detection of ASD and associated communicational and behavioural disorders. ASD screening tools such as Q-CHAT, ASDS, ASSQ, CARS, AQ, CAST, PEDS, and CSBS-DP are freely available on their web pages for access by parents, teachers, professionals, and clinicians, who can administer the tests online and then receive an automatically generated score at the end of each completed test. Most web-based screening tools provide a guide to interpreting the final scores. Some screening tools like STAT, however, consist of a tool kit that includes a user manual, checklist, questionnaire, score calculation manual, and sometimes toys such as dolls, balls, and trucks, in order to physically observe the child or rate an individual’s behaviour in detail. These types of screening tools are difficult to integrate into a mobile platform and require special training to administer.
There are only a few screening tools available as mobile applications, and most of these combine two or more of the screening methods above to derive their results. Therefore, it is quite difficult to evaluate the methods used in smartphone apps in terms of sensitivity and specificity. The M-CHAT, AQ, and CBCL are the screening methods most commonly used on mobile platforms (Android, iTunes). However, a new mobile screening application based on all AQ short versions (AQ-10-Adult, AQ-10-Adolescent, AQ-Child) and Q-CHAT (toddlers) was recently developed to cover all age categories [
86]. This is the only screening application available for all audiences. The Autism Fingerprint is another example of a smartphone application that uses M-CHAT as its screening method. In Oman, 14 out of every 100,000 children have been found to have autistic traits, but there has been a lack of awareness and of properly standardised tools to diagnose the condition in its early stages. For this reason, Autism Fingerprint was developed by Arab neuroscience specialists in collaboration with technical experts [
87]. Culturally and traditionally appropriate images and items were used in order to make the application a more user-friendly and easy-to-use tool for screening children for autistic traits. The AQ Asperger Test, AQ Test, and the Asperger Test are some of the mobile applications that were developed using AQ. These applications can be found on both the Android and iTunes mobile platforms. The Canvas Child Behaviour Check List is another such mobile application, using CBCL as the screening method to investigate behavioural impairments in children and adolescents aged between 4 and 18 years.
There are many other applications that use games, drawing tools, and advanced online activities to observe the behavioural conditions of individuals, covering various segments of the population. The Apple iPad is considered one such advanced tool that helps children with special needs in their communication and social development programmes. Healthcare specialists have reported that, since the iPad came onto the market in 2010, it has helped many children with autistic behaviours to develop their skills [
88]. Similarly, the contribution of numerous and innovative applications introduced by Android for promoting awareness and identifying individuals with autism at an early stage has been immense.
A presence on the internet and mobile platforms is essential in today’s society to ensure the accessibility of any product or service. Therefore, the availability of free ASD screening tools via the internet defines their accessibility. The accessibility and comprehensiveness of some screening tools are questionable, however, as most are not freely available and only target specific demographic groups. Even though some of the ASD screening tools are available on web platforms, they are not free for users. Most of the tools available on the internet require a payment before access to the screening process is granted. ASD screening instruments such as the AQ versions, Q-CHAT, and CAST are freely available for anyone to use, while tools such as CARS-2 and CSBS-DP are available on the internet only on a paid basis. In a world where the demand for smartphone applications is growing rapidly and even the most basic facilities are available in mobile application format, it is crucial for ASD screening tools to be present on mobile application platforms. Currently, only a few screening tools, such as AQ and CSBS-DP, are available on smartphone platforms. This unavailability on the internet and mobile platforms hinders users’ access to many of the screening tools.
3.3. Administration and Time Efficiency
Administration refers to undertaking the questionnaires or interviews provided by the ASD screening tools in order to identify autistic behaviours and an individual’s likelihood of being a case of autism. There are three types of ASD screening tools: Self-administered, parent or caregiver administered, and administered by clinicians or well-trained professionals. Most of the screening methods discussed fall into either the self-administered or the parent/caregiver-administered category. Some methods require professionals to administer the questionnaires and/or to score and interpret the generated results. This additional requirement makes the screening tools hard for ordinary people to use. The Q-CHAT and STAT are examples of screening instruments that require administration by professionals in addition to a report submitted by the parents on the behavioural complexities of their child. Some screening tools, such as CSBS-DP, allow the parents, teachers, and other caregivers who are familiar with the individual to administer the questionnaire, but the scoring and interpretation of the scores must be done by a professional. Thus, these screening tools allow only limited involvement by potential users. On the other hand, the screening tools that are self-administered, such as AQ and its versions, have fewer requirements and can be taken by adults with average IQ, parents, family members, caregivers, and teachers, among others. These self-administered methods often utilise simple scoring functions that offer a numeric score for the likelihood of having autistic traits.
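A simple scoring function of the kind mentioned above can be sketched as follows. This is a minimal illustration, assuming each item has already been keyed to 1 (trait-consistent response) or 0; the function names and the cutoff value used in the example are hypothetical, and real instruments define their own item keying and thresholds.

```python
from typing import Sequence


def screening_score(responses: Sequence[int]) -> int:
    """Sum binary item responses (1 = trait-consistent, 0 = not)."""
    if any(r not in (0, 1) for r in responses):
        raise ValueError("each response must be keyed to 0 or 1")
    return sum(responses)


def flag_for_referral(responses: Sequence[int], cutoff: int) -> bool:
    """Return True when the total score reaches the (assumed) cutoff."""
    return screening_score(responses) >= cutoff


# Hypothetical responses to a 10-item questionnaire.
answers = [1, 0, 1, 1, 0, 1, 1, 0, 1, 0]
print(screening_score(answers))              # total of the keyed responses
print(flag_for_referral(answers, cutoff=6))  # cutoff of 6 is an assumption
```

Because the score is a plain sum compared with a fixed threshold, no professional is needed to compute it, which is what makes such tools easy to self-administer.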
One of the key performance indicators of the ASD screening tools is the time taken to complete one screening process. Since conventional methods are usually lengthy questionnaires that take time to complete, and many of the screening tools have originated from these questionnaires, it is advantageous to reduce the time necessary for the test. For example, the AQ-adolescent version originally had a questionnaire with 50 items that took approximately 15–20 min to complete. Ref. [
3] dealt with this shortcoming by reducing the original AQ questionnaire to 10 items, which take only 5–8 min to complete. Many scholars have not yet recognised this need for a short and effective screening tool, and their tools therefore attract less involvement from users. RAADS-R, CSBS-DP, and KADI are some of the tools that take 30–45 min or more to complete, even though they are widely accepted in terms of their reliability and validity.
In the current digital era, most users prefer to have shorter screening tests such as Q-CHAT-10, AQ-10 versions, and PEDS since the tests are typically taken in an online environment and within an acceptable timeframe. In fact, recent developments in hardware, computer networks, and mobile applications have provided rapid accessibility to the tools for the healthcare community. New technologies, such as mobile platforms, may render some of these time-consuming tools obsolete.
3.4. Performance and Comprehensibility
The validity and reliability of the ASD screening tools are expressed in terms of sensitivity and specificity metrics when applied against a certain dataset of cases and controls. It is imperative, therefore, to acknowledge that these two metrics (besides accuracy) are measured with respect to a specific dataset. Thus, the measured performance of a screening method is tied to the characteristics and quality of that dataset. According to [
89], sensitivity refers to the ability of the screening tool to identify a person who is a case of autism, while specificity refers to the ability of the screening tool to identify a person who is a control (i.e., not a case). Based on the results reported in the literature (and included in the tables constructed in
Section 2.4), most of the existing screening methods have acceptable sensitivity and specificity rates. Nevertheless, some screening tools have little research validating their results with respect to sensitivity and specificity metrics. For instance, the MABC-2 method, which has the lowest reported sensitivity (41%), and the ESAT method, which has the lowest reported specificity (14%), are examples of screening tools that could potentially be improved to obtain acceptable levels of sensitivity and specificity on their datasets. Tools such as M-CHAT, FYI, GADS, and ASAS still require experimental studies to establish their actual performance (both sensitivity and specificity).
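For readers less familiar with the two metrics, the sketch below shows how they are computed from the outcome of applying a screening tool to a dataset of known cases and controls. The function name and the counts in the example are hypothetical and do not correspond to any tool reviewed here.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Compute sensitivity and specificity from confusion-matrix counts.

    tp: cases flagged by the tool      fn: cases missed by the tool
    tn: controls correctly not flagged fp: controls wrongly flagged
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("the dataset must contain both cases and controls")
    sensitivity = tp / (tp + fn)  # share of cases correctly identified
    specificity = tn / (tn + fp)  # share of controls correctly identified
    return sensitivity, specificity


# Hypothetical dataset: 50 cases and 100 controls.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=90, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Both values depend entirely on the counts obtained from one dataset, which is why figures reported for the same tool can differ between studies.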
The comprehensibility of the screening tools depends on the size of the audience that they cover. Most of the screening tools cover only one segment of the population, with some being specialised for infants and children while others are designed specifically for adults. Since the recognition of autism at an early stage is critical for medication and treatment planning, many tools cover infants and toddlers aged between 12 and 36 months. A lack of importance placed on teenagers and adults is another issue associated with existing ASD screening instruments. Even though the instruments, such as M-CHAT and Q-CHAT, are present on both web-based and mobile phone platforms, with acceptable levels of sensitivity and specificity, the comprehensibility of these tools is in question as they only cover infants aged from 16–30 months. This represents less than 5% of the entire population.
3.5. Popularity
There is no exact metric for measuring the popularity of an ASD screening tool, as no tracking is available to measure the number of individuals who use a screening tool at any given time, nor how frequently it is used. An estimate of popularity can be derived from the clinical usage of each tool. Unfortunately, most available ASD screening tools are designed for research and developmental purposes rather than clinical diagnosis purposes, with only a few tools such as CARS-2HF, ADOS, and ADI-R being used by clinicians in their medical diagnosis process. None of these screening tools can be used alone to provide a proper medical diagnosis; they are used in conjunction with many other medical tests and professional investigations in order to reveal autistic behaviours and to differentiate them from related developmental impairments.
One way to measure the popularity of a screening tool is to use application features such as functionality, user review ratings, and coverage. Of the testing methods considered, the AQ short versions and Q-CHAT have obtained positive ratings across more than 100 user reviews. For example, the ASDTests app, which is based on the AQ short versions, has 111 reviews and numerous downloads. Screening tools that are based on questionnaires appear to be more favoured by end users, as observation and video screening methods have limited or no ratings in both the Android and Apple stores. It is believed that this is because these screenings are more time consuming than questionnaire-based methods. In questionnaire tests such as AQ-10, there are just 10 questions, so users are less likely to lose interest while using this method. The hybrid screening methods seem to cover a larger group of users, as they target various combinations of toddlers, children, adolescents, and adults. More importantly, only one screening method application, ASDTests, covers all audiences collectively, thus making it more popular. This shows that hybrid screening methods seem to be more comprehensive than specific screening methods, at least in terms of usage. Nevertheless, methods such as Q-CHAT, which cover toddlers, are still popular within their user segment (toddlers).
3.6. Intelligent Classification Methods
The current CJ and ASD screening tools generally employ human-developed rules to classify cases and controls. Psychiatric and behavioural science specialists have designed these rules, and the quality of outcomes and decisions depends substantially on the subjective contributions of these professionals and the interpretations of the specialised clinical staff conducting the assessments. Instead, the diagnosis of ASD might be empowered by automated decisions generated by intelligent algorithms such as machine learning. To date, there are no self-administered ASD diagnostic methods that have integrated machine learning models into the process, despite a few research attempts to do so [
15,
16,
17,
78,
79,
90,
91,
92,
93,
94]. The lack of integration between technology and existing methods may contribute to current limitations. For example, the ASD classification amendments from 2013 were disseminated in the updated version of the DSM, the
DSM-5, but ASD diagnostic procedures were not changed in accordance with these amendments. The machine learning innovations to be examined and developed for self-assessment tools are intended to make the classification process of ASD automated, rather than static. These changes may effectively replace pre-existing human-generated rules and procedures, resulting in three distinct and impactful advantages: Increased efficiency of ASD classification (less time required for screening); a reduction in the number of questions and components of ASD assessments to minimal levels while maintaining assessment integrity and validity (identification of key components that produce accurate diagnoses); and enhanced classification accuracy for borderline and complex cases, owing to predictive models derived by machine learning algorithms that facilitate ASD classification decisions.
The self-administered components of the assessment, with respect to screening tools, are expected to be automated using machine learning and facilitated by caregivers or professionals. This necessitates the following: (1) The minimisation of the total number of scale items through computational intelligence techniques; (2) the creation of a machine learning classification algorithm to be embedded within the classification process; (3) examination of the case or control; (4) periodic amendments to machine learning algorithm outcomes (i.e., predictive models), based on the classified test cases.
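The four steps above can be sketched in a few lines of plain Python. This is only an illustration under strong assumptions: items are binary, item reduction is done by ranking items on the gap in endorsement rates between cases and controls, and the "classifier" is a learned score cutoff, which is a deliberately simple stand-in for a real machine learning algorithm. All function names and the toy data are hypothetical.

```python
from typing import List, Tuple

# (binary item responses, label) where label 1 = case and 0 = control
Record = Tuple[List[int], int]


def select_items(data: List[Record], k: int) -> List[int]:
    """Step 1: keep the k items that best separate cases from controls."""
    cases = [x for x, y in data if y == 1]
    controls = [x for x, y in data if y == 0]

    def rate(rows: List[List[int]], j: int) -> float:
        return sum(r[j] for r in rows) / len(rows)

    n_items = len(data[0][0])
    gaps = [abs(rate(cases, j) - rate(controls, j)) for j in range(n_items)]
    return sorted(range(n_items), key=lambda j: gaps[j], reverse=True)[:k]


def fit_cutoff(data: List[Record], items: List[int]) -> int:
    """Step 2: learn the score cutoff with the highest training accuracy."""
    def correct(cutoff: int) -> int:
        return sum((sum(x[j] for j in items) >= cutoff) == bool(y) for x, y in data)

    return max(range(len(items) + 2), key=correct)


def classify(x: List[int], items: List[int], cutoff: int) -> int:
    """Step 3: examine one respondent (1 = flagged as a likely case)."""
    return int(sum(x[j] for j in items) >= cutoff)


def update(data: List[Record], new_record: Record, k: int):
    """Step 4: refit the model after a newly classified record is added."""
    data = data + [new_record]
    items = select_items(data, k)
    return data, items, fit_cutoff(data, items)


# Toy training data (hypothetical): items 0 and 1 are informative.
train: List[Record] = [
    ([1, 1, 0, 1], 1), ([1, 1, 1, 0], 1), ([1, 1, 0, 0], 1),
    ([0, 0, 0, 1], 0), ([0, 0, 1, 0], 0), ([0, 1, 1, 1], 0),
]
items = select_items(train, k=2)
cutoff = fit_cutoff(train, items)
print(items, cutoff, classify([1, 1, 0, 0], items, cutoff))
```

In practice each step would use a validated technique and held-out data, and, as the text notes for clinical use, a licensed specialist would still verify the outcome.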
It is also possible that the development of a new self-administered ASD assessment tool, based on machine learning, will encourage a transition from antiquated CJ tools and contribute to increased efficiency in professional diagnostic processes. Future directions for CJ tools may involve a semi-automated process, due to the need for licensed clinical specialists to verify outcomes (i.e., specific classifications). The assessment conclusions will be solely in the hands of the specialists, while machine learning will continue to improve predictive models and provide applicable alternatives to professionals. Furthermore, machine learning may provide assessors with potential rationales for classification decisions, improving the diagnostic process with respect to both efficiency and accuracy.
4. Conclusions
Autism is no longer a dilemma for ordinary people owing to the rapidly increasing awareness of ASD and the availability of screening and diagnosis measures. Identifying and distinguishing autistic traits from other developmental disorders was not simple in the past. Consequently, many scholars in the behavioural science, psychology, psychiatry, and neuroscience fields have developed diagnosis and screening methods to identify cases of autism and assist the medical diagnosis. There have been a few reviews on screening methods that have addressed common criteria, such as the number of items included in each screening test, the time taken to complete the test, the age categories involved, and performance (sensitivity, predictive accuracy, and specificity). However, existing reviews have failed to critically analyse vital aspects related to ASD screening, including the tool’s accessibility, comprehensibility, popularity, and efficiency, among others. More importantly, none of the reviews emphasise the importance of the DSM-5 criteria for evaluating the reliability of ASD screening. Therefore, this study investigates ASD screening methods to identify their performance in terms of different advanced parameters in order to discover possible concerns that need to be addressed through an innovative ASD screening process. A total of 37 different screening methods have been identified and categorised into three subcategories depending on their target audience in order to make the evaluation process more convenient. The three subcategories are: screening tools for infants and children, screening tools for adolescents and adults, and hybrid screening tools. Hybrid screening refers to the ASD screening tools that consider the target audience as a combination of three or more of the following categories: infants, toddlers, adolescents, and adults. Out of the 37 screening methods considered, 12 fall into the category of infants and children.
All screening tools have been critically analysed individually in terms of their evaluation, administration, target audience, scoring methods, other available versions, and performance. None of the prevailing screening tools has been found to perform well across all the considered parameters. Some tools that apparently perform well in terms of sensitivity and specificity have been found to be unsuccessful on other parameters. For instance, ASIEP-3 is a highly accepted tool with 100% sensitivity and an acceptable level of specificity (81%). However, it consists of a series of activities and a questionnaire with 47 items, making it more time consuming than the other available tools. Similarly, CHAT is a tool that is efficient in terms of time, but is not freely available and has an unacceptable level of sensitivity (40%).
Many of the available screening tools, especially the short versions, comply only partly with the ASD criteria of the DSM-5 introduced in 2013. Most of the available tools were developed before that time and follow the guidelines established by the older version, DSM-IV. Apart from that, each screening tool has been discussed in depth in terms of accessibility, comprehensibility, administration, popularity, and performance. It has been revealed that many available screening tools, such as Q-CHAT and STAT, require administration by well-trained professionals (at least during one stage of the evaluation). Moreover, the M-CHAT and Q-CHAT are not comprehensive in terms of the size of the audience they cover, but for an individual who is looking for a screening tool to identify autistic traits in an infant aged from 16 to 36 months these are appropriate methods as they perform well in terms of sensitivity and specificity. Similarly, the AQ-10 Adolescent and Adult versions can be recommended for individuals aged 12–16 years and 18+, respectively, as they are time efficient, easy to use, and can be self-administered with an acceptable level of performance.
With the exception of AQ, AQ-10, Q-CHAT, CARS, CAST and their variations, many of the tools are not freely available for users and only three tools are available on a mobile platform, thus limiting accessibility for users. Some screening tools also have issues with their performance and comprehensibility, especially those with lengthy questionnaires (some with more than 50 items). This makes the entire screening process tedious, unpopular, and not very usable by individuals. To summarise, the findings of this study emphasise the need for a more efficient, intelligent, and innovative ASD screening tool that can cover a wider audience while maintaining high levels of performance.
It is believed that in the near future a highly interactive platform, utilising an intelligent machine learning diagnosis algorithm, will offer more accurate and robust performance that engages individuals with ASD (both children and adolescents), parents, caregivers, GPs, other medical staff, researchers, and the broader population. This is because such intelligent screening methods can efficiently offer families and healthcare providers both the diagnosis and the individual development plan needed for each case.