Summarizing Recent Developments on Autism Spectrum Disorder Detection and Classification Through Machine Learning and Deep Learning Techniques

Ahmed, Masroor; Hussain, Sadam; Ali, Farman; Gárate-Escamilla, Anna Karen; Amaya, Ivan; Ochoa-Ruiz, Gilberto; Ortiz-Bayliss, José Carlos

doi:10.3390/app15148056

Open Access Systematic Review

by Masroor Ahmed *, Sadam Hussain, Farman Ali, Anna Karen Gárate-Escamilla, Ivan Amaya, Gilberto Ochoa-Ruiz and José Carlos Ortiz-Bayliss *

Tecnologico de Monterrey, School of Engineering and Sciences, Monterrey 64700, Mexico

* Authors to whom correspondence should be addressed.

Appl. Sci. 2025, 15(14), 8056; https://doi.org/10.3390/app15148056

Submission received: 22 April 2025 / Revised: 2 June 2025 / Accepted: 4 June 2025 / Published: 19 July 2025

Abstract

Autism Spectrum Disorder (ASD) encompasses various neurological disorders with symptoms varying by age, development, genetics, and other factors. Core symptoms include decreased pain sensitivity, difficulty sustaining eye contact, incorrect auditory responses, and social engagement issues. Diagnosing ASD poses challenges as signs can appear at early stages of life, leading to delayed diagnoses. Traditional diagnosis relies mainly on clinical observation, which is a subjective and time-consuming approach. However, AI-driven techniques, primarily those within machine learning and deep learning, are becoming increasingly prevalent for the efficient and objective detection and classification of ASD. In this work, we review and discuss the most relevant related literature between January 2016 and May 2024 by focusing on ASD detection or classification using diverse technologies, including magnetic resonance imaging, facial images, questionnaires, electroencephalogram, and eye tracking data. Our analysis encompasses works from major research repositories, including WoS, PubMed, Scopus, and IEEE. We discuss rehabilitation techniques, the structure of public and private datasets, and the challenges of automated ASD detection, classification, and therapy by highlighting emerging trends, gaps, and future research directions. Among the most interesting findings of this review are the relevance of questionnaires and genetics in the early detection of ASD, as well as the prevalence of datasets that are biased toward specific genders, ethnicities, or geographic locations, restricting their applicability. This document serves as a comprehensive resource for researchers, clinicians, and stakeholders, promoting a deeper understanding and advancement of AI applications in the evaluation and management of ASD.

Keywords: autism spectrum disorder (ASD); artificial intelligence (AI); bioinformatics; deep learning (DL); machine learning (ML)

1. Introduction

Autism Spectrum Disorder (ASD) is a broad term that includes several neuropsychiatric disorders. Impaired social communication, interpersonal relationships, and academic achievement, together with restricted and repetitive activities, are all characteristics of this disorder. Compared with others, people with ASD frequently display differences in habits, interaction, and learning [1]. The term ‘spectrum’ in ASD refers to a wide range of characteristics, aptitudes, and capacities particular to each person. Individuals with ASD experience this condition differently from one another, which results in a range of support needs. Although the fundamental traits of ASD present a range of difficulties, they may also lead to unique strengths and skills. Even though this is a lifelong condition, both adults and children with ASD may make significant progress and have fulfilling lives with the right kind of support [2,3]. Although the specific causes of ASD are still unclear, biological factors such as gene mutations, inflammatory conditions of the brain, and adverse perinatal circumstances are likely to be involved. ASD can be confusing and time-consuming to diagnose since its symptoms resemble those of several other mental conditions. Although there is no definitive treatment for ASD, early detection of its symptoms can lessen its impact over time [4]. Core symptoms of ASD among children include decreased pain sensitivity, trouble sustaining eye contact, incorrect auditory responses, reluctance to cuddle, difficulties using gestures, difficulty engaging in social interaction, abnormal attachment to objects, and a desire for isolation. In addition to behavioral examinations specific to ASD, standard clinical and medical records can provide vital information for assessing young children’s risk for ASD. Research has shown that additional symptoms and health concerns, such as digestive disorders, infections, and feeding difficulties, are common in children with ASD [5].

Attention deficit hyperactivity disorder (ADHD), anxiety, depressive disorders, and epileptic seizures are among the co-occurring problems that people with ASD frequently face. Moreover, individuals often have difficulty controlling challenging behaviors, such as self-harm, and experience sleep disruptions. ASD’s intellectual landscape is broad, ranging from those with severe cognitive deficits to those who perform at exceptionally high levels [6,7]. Although 168 million people worldwide might be affected by ASD, the actual number is probably larger due to underdiagnosis brought on by a lack of resources and awareness [8]. Notably, one-third of people with ASD live in low- and middle-income nations, where it is often difficult to get a diagnosis and receive proper treatment.

ASD has become more common, especially in the last several decades. In the United States, the prevalence of ASD increased from 1 in 150 children in 2000 to 1 in 44 children in 2018 [9]. This increase might be influenced by greater awareness, improvements in diagnosis, and changes in diagnostic criteria, although the exact causes remain unknown. Up to 2023, the countries with the largest populations of children with ASD were the United States, with 466,665; Japan, with 127,590; and the United Kingdom, with 95,801 children [10].

The treatment of ASD accounted for over USD 1.88 billion of the global market in 2021. According to projections, it will reach over USD 3.5 billion by 2030, representing a compound annual growth rate (CAGR) of 7.15% between 2022 and 2030. The increasing incidence of ASD is driving the growth of the market for treatments. Additionally, the World Health Organization’s revised study from March 2022 showed that one in 100 children globally has ASD, highlighting the critical need to address this expanding health issue. Anticonvulsants, antipsychotics, antidepressants, and other drugs that target specific symptoms are commonly used in the treatment of ASD. Nevertheless, despite continuous attempts, there are still no reliable ASD therapy alternatives on the market. As such, research and development aimed at improving treatment alternatives offers profitable prospects for market growth (see Figure 1 for more details).
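As a quick consistency check of the growth figures above, the following sketch applies the standard compound-growth formula. It assumes nine compounding periods (2021 to 2030), which is our reading of the projection rather than something the source states, and the function names are illustrative.

```python
def project_value(start_value, annual_rate, periods):
    """Compound a starting value at a fixed annual rate."""
    return start_value * (1 + annual_rate) ** periods


def implied_cagr(start_value, end_value, periods):
    """Annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / periods) - 1


# Figures quoted in the text: USD 1.88 billion (2021), CAGR of 7.15%.
# Assumption: nine compounding periods between 2021 and 2030.
projected_2030 = project_value(1.88, 0.0715, 9)
rate = implied_cagr(1.88, 3.5, 9)

print(f"Projected 2030 market: USD {projected_2030:.2f} billion")
print(f"Implied CAGR for 1.88 -> 3.5: {rate:.2%}")
```

Under that assumption, the quoted starting value and growth rate are consistent with the projected figure of roughly USD 3.5 billion.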

Offering children with ASD symptoms the proper therapy and treatment requires early detection and diagnosis. The Centers for Disease Control and Prevention (CDC) report that, over the previous fifteen years, the average age at which ASD is diagnosed has been four and a half years. However, parents and other caretakers frequently become aware of issues at around two years old. As such, it is critical to apply efficient diagnostic and rehabilitation methods. There are two primary approaches for identifying and monitoring children with ASD: manual and automatic methods [12]. For the identification and diagnosis of ASD, automated systems using computer vision and image-based techniques in conjunction with conventional machine learning (ML) and deep learning (DL) are becoming increasingly common [13].

Additionally, manual techniques like observation- and interview-based methodologies are standard nowadays. The Childhood Autism Rating Scale (CARS), for example, uses 15 items to diagnose ASD and uses scores to classify severity [14]. Interview-based systems such as the Developmental, Dimensional and Diagnostic Interview (3DI) [15], Autism Diagnostic Interview-Revised (ADI-R) [16], and Asperger Syndrome Diagnostic Interview (ASDI) require in-depth interviews with parents or other caregivers to diagnose ASD [17]. Similarly, the Gilliam Autism Rating Scale (GARS) uses 56 items divided into four categories to evaluate the severity of ASD [18]. However, this manual approach is not practicable for collecting data during daily activities since it primarily depends on expert opinion and behavioral observations. It is also time-consuming and expensive [19].

To overcome the shortcomings of conventional ASD diagnostic techniques, researchers have created automated solutions that improve the precision and effectiveness of the diagnosis process [20,21]. For example, conventional ML methods and computer vision have demonstrated potential for producing quick and efficient screening tools [22,23,24]. Meanwhile, DL approaches have outperformed traditional ML methods by automatically extracting features and reducing errors in identifying and diagnosing medical conditions, including ASD [21]. Recent DL developments have dramatically increased the capacity to analyze photos and videos for detecting, categorizing, diagnosing, and tracking ASD, significantly improving on manual techniques [25,26].

ML is a fast-growing Artificial Intelligence (AI) subfield that uses information to create highly accurate predictive models. ML encompasses diverse algorithms designed for classification, regression, and clustering [27]. These algorithms, such as Naïve Bayes, support vector machines (SVM), logistic regression (LR), k-nearest neighbors (kNN), linear and polynomial regression (LPR), k-means, neural networks (NNs), and convolutional neural networks (CNNs), are adept at learning and interpreting intricate data features [27,28]. They are instrumental in predicting and classifying ASD across various age groups. Given the proper requirements, these techniques can predict survival rates, analyze behavior, study gaze patterns, and more, thereby facilitating early diagnosis of ASD [29].

Standardized behavioral evaluations, which can be lengthy and time-consuming, are commonly used to diagnose ASD. The goal of research on psychiatric neuroimaging is to find objective biomarkers that can help diagnose and treat brain-based diseases more effectively. The literature has explored DL approaches to automate the diagnosis of ASD by extracting discriminative features from functional MRI (fMRI) data and passing them into classifiers. Cutting-edge DL methods have greatly improved the classification of ASD by distinguishing ASD from typical development [30]. These methods improve the effectiveness of feature transformation and reduction, which reduces analysis time and improves classification performance. Accuracy, specificity, error rate, sensitivity, Positive Predictive Value (PPV), Negative Predictive Value (NPV), and Area Under the Curve (AUC) are common performance indicators used to evaluate diagnostic tools for ASD [31,32]. In this area, CNNs, deep neural networks (DNNs), graph convolutional networks (GCNs), and hybrid models have shown encouraging outcomes [33].
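To make the performance indicators listed above concrete, the sketch below derives most of them from the four cells of a binary confusion matrix. The counts are hypothetical and chosen only for illustration; AUC is omitted because it requires ranked classifier scores rather than counts.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value (precision)
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (not taken from any reviewed study).
metrics = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters in screening settings, where the ASD and non-ASD classes are often imbalanced.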

AI-powered computer-aided systems combine AI with related technologies such as ML and DL techniques. DL is increasingly used for extracting deep features [34]. New developments in ASD diagnosis use DL models that combine ML and neuroimaging techniques to identify early biological indicators [35,36]. Simplified CNN models exhibit remarkable F1-scores, accuracy, and precision. Still, difficulties remain, including problems with data accuracy, interpretability, and ethical dilemmas [37]. ASD detection algorithms increasingly combine DL approaches with data from several sources to increase accuracy. DL techniques have become common in the early phases of ASD identification and in analyzing speech data, behavioral observations, and neuroimaging [38]. This combination improves diagnostic precision and speeds up the procedure, which might result in better outcomes. Accurate diagnosis depends critically on structural MRI (sMRI) and functional MRI (fMRI) [39].

It is essential to critically evaluate the aims and constraints of previous studies to ascertain the current state of knowledge on AI-based methodologies for ASD. Song et al. [40] conducted a systematic review of 13 studies published from 2009 to 2019 that used AI to diagnose ASD. They employed supervised ML techniques to distinguish between people with and without ASD. Their study investigates AI’s capacity to objectively record behavioral traits that might act as diagnostic markers. Kohli et al. [16] reviewed potential techniques, such as ML and DL, to enhance the early identification of ASD. Their scoping review covered 35 studies published from 2011 to 2021 and focused on multiple modalities, such as stereotypical behaviors, eye gaze, and facial expression. It also addressed limitations and future work. Minissi et al. [41] focused on evaluating the earliest stages of ASD using ML approaches to analyze eye movement (EM) biomarkers regarding social cues. They looked at 11 research articles from 2015 to 2020 that used ML to study children’s social visual attention (SVA) and found differences between children with ASD and typically developing (TD) children. Jeyarani and Senthilkumar [42] examined 30 selected recent studies published between 2017 and 2020 that focused on eye tracking data analyzed with ML and DL techniques. They also described the diagnostic tools, performance criteria, and datasets in the review. Furthermore, this study provided insights into the detection, behavioral assessment, and differentiation between autistic and TD children. Moreover, Joudar et al. [43] systematically reviewed 18 research articles published from 2017 to 2022. They assessed AI approaches using different datasets and examined AI’s involvement in triage, priority setting, and genetic factors. The study explored ML models as prediction tools for diagnosing ASD by addressing research limitations and gaps. The study conducted by Parlett-Pelleriti et al. [44] reviewed 43 articles based on unsupervised ML and focused on the diagnosis and treatment of ASD based on genetic and behavioral data. More recently, Uddin et al. [45] presented a detailed review of ML and DL techniques for detection, classification, and rehabilitation of ASD. In their review, they reported 130 articles published from 2017 to June 2023, concentrating on the use of DL models to analyze images or videos representing ASD.

Various authors have presented their solutions using AI-based approaches. The goal of the reviews above was to extract potential findings and research gaps within this area so that future investigators could work on them. However, even though they were published in recent years, some systematic literature reviews (SLRs) considered only a few modalities and missed others that could provide deeper insights and information. Our primary motivation is to explore various modalities, including MRI images, genomics, facial images, eye gaze patterns, EEG signals, and questionnaires, by extensively reviewing most of the related work within this domain for early detection of ASD. Our aim is to conduct a novel review covering all the essential aspects: reviewing the literature, extracting prospective limitations, identifying AI-based challenges and research gaps, reviewing potential datasets and deriving their limitations, and assessing the overall contribution of AI to detecting and classifying ASD.

The remainder of this document is organized as follows. Section 2 describes the objectives and research questions of our work. Additionally, we describe the search and data extraction process, which includes the search query and the inclusion and exclusion criteria. Section 3 describes the most relevant contributions and various modalities of AI to ASD detection and classification. Section 4 describes the related work regarding ML and DL. In Section 5, we provide a list of popular datasets available for ASD detection and classification. Section 6 describes the limitations and research gaps we identified in the literature. Finally, Section 7 provides a conclusion and some ideas for future work.

2. Research Methods

This work compares various research methodologies and techniques presented by different authors. The study rigorously evaluates the research process and outcomes, focusing on their effectiveness, innovation, and performance. We meticulously defined the search strategy, the inclusion and exclusion criteria, and the quality assessment standards for the selected articles.

This systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, which provide a standardized framework for reporting systematic reviews and meta-analyses [46,47]. PRISMA offers multiple benefits, including facilitating comprehensive searches of research repositories, clarifying the description of research questions, and providing a transparent and rigorous approach to defining inclusion and exclusion criteria for relevant studies. Supplementary Materials File S1 presents the completed PRISMA 2020 27-item checklist [48] corresponding to the present review. In addition, this investigation was registered on the Open Science Framework and is publicly available at https://osf.io/ua6b4/, accessed on 10 March 2025.

The following sections provide detailed explanations of each component in this work.

2.1. Objectives and Research Questions

The primary objective of this study is to explore the significant roles of AI and its associated techniques, such as ML and DL, in detecting or classifying ASD. Additionally, the research reviews related works by various authors utilizing ML and DL, highlighting the difficulties and challenges encountered in ASD detection or classification. Another key goal is to identify shortcomings and gaps in current ASD intervention strategies. This proposed study offers valuable insights into different areas for improvement, future directions, and potential outcomes in the field of ASD treatment, as well as support.

This work aims to answer the following research questions:

(i): What are the key contributions of AI and its subfields, such as ML and DL, in detecting or classifying ASD?
(ii): Which datasets are available for ASD detection or classification, and what are their representative characteristics?
(iii): What significant advancements and studies have been conducted in the domain of ML and DL for ASD detection or classification?
(iv): What are the limitations and gaps in the current research on ASD detection or classification using AI and its subfields such as ML and DL, and how can these be addressed in future studies?

2.2. Search and Data Extraction

To retrieve the documents considered for this work, we used the following query:

("autism spectrum disorder" OR "autism" OR "asd") AND
("machine learning" OR "deep learning") AND
("classification" OR "detection" OR "identification")

Such a query allowed us to retrieve relevant literature on the research topic. We executed this query across reputable databases, including PubMed, WoS, IEEE, and Scopus. As described before, our review incorporates all English-language articles published in peer-reviewed conferences or journals. However, we excluded some works from this analysis for various reasons, such as presenting an unclear methodology, tools, strategy, or approach. Thus, from all the documents retrieved by our search, we only considered:

(i): Documents written in English.
(ii): Documents published between 1 January 2016 and 31 May 2024.
(iii): Documents related to AI, ML- and DL-based detection or classification of ASD and aligned with our research questions and objectives.
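The first two criteria above are mechanical and can be sketched as a simple filter. The record fields (`language`, `published`) and the sample records are hypothetical, since the source does not describe how records were stored; criterion (iii) requires human judgment of topical relevance and is not modeled here.

```python
from datetime import date

# Inclusion window stated in the text.
START, END = date(2016, 1, 1), date(2024, 5, 31)


def meets_basic_criteria(record):
    """Apply the language and publication-date criteria to one record."""
    return record["language"] == "English" and START <= record["published"] <= END


# Hypothetical records for illustration only.
records = [
    {"title": "A", "language": "English", "published": date(2019, 3, 1)},
    {"title": "B", "language": "Spanish", "published": date(2020, 6, 15)},
    {"title": "C", "language": "English", "published": date(2015, 12, 31)},
    {"title": "D", "language": "English", "published": date(2024, 5, 31)},
]

kept = [r["title"] for r in records if meets_basic_criteria(r)]
print(kept)
```

Both ends of the date window are treated as inclusive, matching the wording of criterion (ii).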

Please note that, according to the inclusion criteria, we excluded documents written in languages other than English from our study. Additionally, we also excluded grey literature, abstracts, or preprints to ensure the inclusion of only peer-reviewed, methodologically sound, and fully reported studies. These sources often lack rigorous quality control, comprehensive data, or final results, which may compromise the reliability and reproducibility of our findings.

Search Results

To select the documents for our analysis, we followed a PRISMA-inspired four-step methodology, as described in Figure 2. Initially, we obtained 4719 documents using the query described above. During the screening phase, we removed 2042 duplicates, since some documents appear in multiple sources, which reduced the number of documents to 2677. During the same phase, we removed 2611 documents that we considered out of this work’s scope based on their title and abstract, which left 66 documents. In the eligibility phase, we evaluated the full text of these documents, focusing on studies involving the detection, classification, and identification of ASD using AI methodologies, mainly ML and DL techniques. These studies covered a variety of diagnostic modalities, such as MRI/fMRI, eye gaze, facial images, questionnaires, genetics, and EEG. From the 66 documents resulting from the screening phase, we removed 18 that failed to meet the requisite criteria: five because they contained insufficient information, seven due to their irrelevance to ASD diagnosis, four because of an absence of discernible results, and two because they were retracted. By the end of the selection process, 48 documents met the inclusion criteria and were retained for full review.
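The selection counts reported above can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Counts reported in the text for each selection step.
identified = 4719
duplicates_removed = 2042
excluded_on_title_abstract = 2611
excluded_at_eligibility = {
    "insufficient information": 5,
    "not relevant to ASD diagnosis": 7,
    "no discernible results": 4,
    "retracted": 2,
}

after_deduplication = identified - duplicates_removed
after_screening = after_deduplication - excluded_on_title_abstract
included = after_screening - sum(excluded_at_eligibility.values())

print(after_deduplication, after_screening, included)  # 2677 66 48
```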

3. Contributions of Artificial Intelligence to ASD Detection

The identification of many brain abnormalities, including ASD, has been greatly facilitated by the rapid development of neuroimaging methods. For example, MRI is a crucial noninvasive technique for assessing brain structure, white matter (WM) integrity, and functional activity. Structural MRI (sMRI) has been employed to delineate the morphological alterations in the brain associated with ASD, focusing on the shape and volume of various brain areas. Diffusion tensor imaging (DTI) evaluates anatomical connections and has revealed impaired microstructural WM integrity in individuals with autism. Functional MRI (fMRI) depends on identifying dynamic physiological data from active cerebral areas. Assessing alterations in blood-oxygenation-level-dependent (BOLD) signals in different brain states (resting state or task-induced) can uncover functional architectural anomalies in the ASD population. Various MRI modalities have thus demonstrated potential in differentiating people with ASD from healthy controls (HCs) [49]. For instance, Abraham et al. [50] used features derived from region-of-interest (ROI)-based resting-state functional connectivity (FC) metrics to distinguish ASD from HCs. They used an SVM algorithm and showed that predictive accuracy improved as the number of participants increased. Later, Saad and Islam [51] used graph-theoretic features derived from DTI-based connectivity to classify ASD versus HCs, applying PCA to reduce noisy features and using SVM and linear discriminant analysis (LDA) as classifiers. The best performance was obtained using two PCA features and the SVM algorithm.

Furthermore, EEG captures the brain’s electrical activity via electrodes affixed to the scalp, measuring the electrical impulses of various frequencies that neurons use for communication. EEG-based diagnosis can offer better-tailored interventions by identifying diverse neurophysiological traits in autistic patients. EEG data analysis enables the detection of aberrant synchronous neural activity in children with ASD. Researchers have investigated the combination of EEG with AI methodologies to improve the identification and categorization of ASD [52]. For instance, Rogala et al. [53] combined traditional statistical methods with classical ML techniques based on EEG data to classify distinct features or attributes linked to ASD. Researchers have also combined statistical and ML approaches to enhance the classification of ASD.

Moreover, eye tracking (ET) is a noninvasive technique for recording a person’s gaze positions in real time, which allows us to examine an individual’s eye movements or point of focus. It assesses the point of gaze or the location of the eyes and gathers eye features from an individual. The recorded data, which contain fixation counts, first fixations, and fixation durations, can be examined using visual analytic methods to review and obtain the eye features. ET relies on visual analytics to improve the visualization of general visual problem-solving [54]. For example, Meng et al. [55] applied ML techniques for early detection of ASD to eye movement data gathered from individuals as they viewed different types of faces (real and artificial). Notably, the study examined how individuals’ gaze patterns can be assessed when they inspect both real human faces and artificial faces to identify key markers of ASD.
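As an illustration of the fixation features mentioned above, the following sketch summarizes fixation count, first-fixation onset, and total fixation duration per area of interest. The record layout and the sample values are hypothetical and do not come from any of the reviewed studies.

```python
from collections import defaultdict

# Hypothetical fixation records: (onset in ms, duration in ms, area of interest).
fixations = [
    (0, 180, "eyes"),
    (200, 240, "mouth"),
    (460, 300, "eyes"),
    (780, 120, "background"),
]


def summarize_fixations(records):
    """Per area of interest: fixation count, first-fixation onset, total duration."""
    summary = defaultdict(lambda: {"count": 0, "first_onset_ms": None, "total_ms": 0})
    for onset, duration, aoi in sorted(records):  # sorted by onset time
        entry = summary[aoi]
        entry["count"] += 1
        if entry["first_onset_ms"] is None:
            entry["first_onset_ms"] = onset
        entry["total_ms"] += duration
    return dict(summary)


fixation_summary = summarize_fixations(fixations)
for aoi, stats in fixation_summary.items():
    print(aoi, stats)
```

Summaries of this kind are one plausible form for the inputs to a classifier, although each study defines its own feature set.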

Researchers have relied on AI techniques for the early identification of ASD based on facial photos. Facial images have been proposed as a reliable basis for diagnosis because children with ASD may differ from typically developing children in their facial features. Researchers from the University of Missouri discovered that children with autism exhibit specific facial characteristics, like wide-set eyes and a broad upper face. Compared to youngsters without ASD, their faces are frequently characterized by a shorter central area, including the nose and cheeks. Because of the social impact on emerging nations, research on the facial feature-based diagnosis of ASD is expanding quickly [56]. For example, Awaji et al. [57] explored hybrid techniques for early detection of ASD based on facial images using CNN-based feature extraction. Their method integrated several CNN models to improve detection accuracy and used DL to capture intricate facial patterns related to ASD.

AI algorithms analyze genetic data to find variations linked to the severity and susceptibility to ASD [58]. By identifying specific gene mutations and pathways related to ASD, whole-genome sequencing and genome-wide association studies (GWAS) shed light on the ASD genetic makeup. Explainable Artificial Intelligence (XAI) has excellent potential for deciphering intricate representations and patterns in various data sources. To build trust and enable more informed and actionable insights in biomedical applications, XAI makes the inner workings of these models more transparent, enabling researchers and clinicians to understand better how decisions are made [59]. AI-driven methods improve our comprehension of the intricate genetic foundation of ASD and guide tailored treatments that focus on the underlying molecular process [60]. For more details, each modality is described individually below.

4. Related Work in the ML and DL Domain

4.1. Questionnaire-Based Diagnosis of ASD

Early screening is crucial for timely intervention in children’s development. Regular screening using tools like the Autism Spectrum Quotient (AQ), the Social Communication Questionnaire (SCQ), and the Modified Checklist for Autism in Toddlers (M-CHAT) helps gather information about a child’s social skills and behavior. This information aids healthcare professionals in diagnosing ASD, a process that only they can carry out. Diagnosis involves the use of standardized diagnostic tools such as the Autism Diagnostic Observation Schedule (ADOS), Autism Diagnostic Interview-Revised (ADI-R), and the Diagnostic and Statistical Manual of Mental Disorders 5 (DSM-5). These methods rely primarily on interviews, which are easily adaptable to participants of all ages [61].

Early identification helps to ensure prompt intervention, which has a significant impact on the management of ASD. Screening tools function similarly to checklists that physicians use to spot trends that may indicate ASD and to order early screens and therapies. The diagnosis of ASD is undergoing a revolution due to recent developments in AI. For example, ML and DL algorithms process enormous volumes of data from many sources, including medical records, questionnaires, and behavioral assessments. By examining these sources, AI can find trends and indicators related to ASD [62,63]. The following paragraphs describe, chronologically, relevant works where AI has been applied to data gathered from questionnaires to improve ASD diagnosis.

In 2019, Erkan and Thanh [64] proposed a classification system for ASD data to facilitate early ASD diagnosis using practical algorithms. They used three ASD datasets covering children, adolescents, and adults (AQ-10-Child, AQ-10-Adolescence, and AQ-10-Adult). Their research used Random Forests (RF), SVM, and k-nearest neighbors (kNN) for classification on datasets with 20 variables that include screening questions and personal information. Their study suggests that RF and SVM are valuable techniques for classifying ASD. Also in 2019, Akter et al. [65] used datasets that included people with ASD of all ages, from toddlers to adults. They applied a variety of feature transformations, such as sine function, logarithmic, and Z-score normalization, and used the transformed datasets to evaluate a variety of ML-based classifiers. Feature selection techniques were then applied to the Z-score-normalized datasets to uncover significant ASD risk factors across age groups. The study showed how well optimized ML techniques can predict ASD, even in the face of obstacles like missing values and data noise. The datasets, taken from the UCI ML repository and Kaggle, included records for 1054 toddlers, 248 children, 98 adolescents, and 609 adults, with varying gender proportions for each age group.
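Two of the feature transformations named above can be sketched as follows. This is a minimal illustration on hypothetical scores; the exact formulas used in the cited study (including its sine-based transformation) are not given in the source, so `log1p` here is an assumption rather than the authors' definition.

```python
import math
import statistics


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumes the values are not all identical
    return [(v - mean) / std for v in values]


def log_transform(values):
    """Compress large values with log(1 + v); assumes non-negative inputs."""
    return [math.log1p(v) for v in values]


# Hypothetical screening scores for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
normalized = z_score(scores)
print(normalized)
print(log_transform(scores))
```

Z-score normalization puts features with different ranges on a common scale, which matters for distance-based classifiers such as kNN and SVM.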

A year later, Raj and Masood [66] used three publicly accessible non-clinical ASD datasets, each with 21 characteristics, to predict and analyze ASD across various age groups: one for children (292 instances), one for adults (704 instances), and one for adolescents (104 instances). Their technique included a train-test split of 80:20 and tested NB, SVM, LR, kNN, neural network (NN), and CNN models. CNN models performed better across all datasets. Later, the study conducted by Vakadkar et al. [67] focused on increasing the accuracy and speed of ASD diagnosis by integrating ML approaches with conventional methodologies. The dataset initially had 1054 instances with 18 features. However, unimportant and categorical attributes had to be removed during preprocessing. The dataset was then divided into 80% for training (843 samples) and 20% for testing (211 samples).
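The 80:20 split described above can be reproduced with a short sketch. The shuffling seed and the use of integer IDs as placeholder records are assumptions made for illustration; the cited studies do not state how they shuffled or seeded their splits.

```python
import random


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)  # floor, so 1054 * 0.8 -> 843
    return shuffled[:cut], shuffled[cut:]


# 1054 placeholder record IDs, matching the dataset size quoted in the text.
train, test = train_test_split(range(1054))
print(len(train), len(test))  # 843 211
```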

Another important work is that of Hossain et al. [22], who used various classification algorithms on datasets with different age groups (toddlers, children, adolescents, and adults) to explore how to automate the diagnosis of ASD. Their study carefully cleaned the data and removed unnecessary features to preprocess the ASD datasets. It then applied 27 classification methods, evaluating each one using 10-fold cross-validation and comparing the results according to accuracy and F1-score. The focus was on feature engineering, which identified the most important features for the best classifier performance by ranking them with five benchmark methods. The results highlight the multilayer perceptron (MLP) as the best classifier and the ReliefF approach as the most successful feature selection technique for ASD identification across all age groups.
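The evaluation protocol described above rests on two generic building blocks, k-fold partitioning and the F1-score, sketched below. The dataset size of 104 and the confusion counts are hypothetical, and the folds are taken in order without shuffling or stratification, which a real study would likely add.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples) into k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


folds = list(k_fold_indices(104, k=10))  # hypothetical dataset of 104 samples
print([len(test_idx) for _, test_idx in folds])
print(round(f1_score(tp=8, fp=2, fn=2), 3))
```

Each sample appears in exactly one test fold, so averaging the per-fold scores uses every sample for testing once.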

Bala et al. [68] developed an ML model that enabled more accurate identification of ASD in toddlers, children, adolescents, and adults. Their approach selected features using various methods and evaluated many classifiers using metrics including AUROC, the kappa statistic, F1-measure, and overall prediction accuracy. SVM consistently outperformed the other classifiers in every age group. Significant ASD-related features were identified by analyzing several feature subsets with the Shapley Additive Explanations (SHAP) approach. The data were gathered through the autism screening app ASDTests, which assesses the risk of ASD in various age groups using the Q-CHAT-10 and AQ-10 questionnaires. Each age group was represented by a dataset with 18 to 23 features. That same year, Kumar and Das [69] examined the automated construction of an ML-based autism diagnosis tool from AQ-10-Adult data and individual characteristics. The adult ASD screening dataset includes 701 samples from 67 nations, with 512 negative and 189 positive cases and a balanced gender distribution. Missing values for attributes such as age, gender, jaundice, ASD, and app usage were filled in. Recursive Feature Elimination (RFE) was used for feature selection in the training and testing of ML classifiers, including ANN, SVM, RF, decision tree (DT), and LR, on the preprocessed dataset. A total of 12 classification models were created.

More recently, Farooq et al. [70] presented a federated learning (FL) method that uses locally trained SVM and LR classifiers to diagnose ASD in children and adults. Features were extracted from four datasets, reported as containing more than 600 records each. The proposed strategy is described as having five essential parts, including gathering datasets, preprocessing data, locally training ML models, and testing to identify the best diagnostic model for ASD. In practice, publicly accessible datasets were obtained, preprocessed, and normalized, and SVM and LR classifiers were then trained separately on the data. These classifiers send their results to a central server, where a meta-classifier is built to determine the best approach. Moreover, Khudhur and Khudhur [71] employed four non-clinical ASD screening datasets from Kaggle and the UCI ML repository. These datasets cover the following age groups: adults (704 instances, 21 features), adolescents (104 instances, 21 features), toddlers (1054 instances, 19 features), and children (292 instances, 21 features). Several supervised ML models were used in the study, including RF, DT, SVM, kNN, NB, and LR. The study found that the best classifiers were DT, LR, and RF.

Finally, we can mention the works by Mukherjee et al. [72] and Rasul et al. [73]. Mukherjee et al. [72] relied on a Long Short-Term Memory (LSTM) model to analyze parent-child conversations and identify indications of ASD in children. The data, which focused on improving speech, conduct, and communication, were gathered from various social media sites and groups that support children with special needs. The gathered conversations were preprocessed, and the LSTM model was then trained to identify sentiment patterns within the parent chats. After training and testing, the model could accurately predict the sentiment of new input data as a binary label (0 or 1). In turn, Rasul et al. [73] used a dataset for adults with ASD and a children’s dataset based on the AQ-10 questionnaire with 292 samples of children aged four to eleven. The study assessed the effectiveness of eight state-of-the-art classification methods in identifying ASD, training and evaluating them with five-fold cross-validation to ensure robust results. It also evaluated five clustering algorithms for the case where labeled data are unavailable, measuring their performance with the Silhouette Coefficient (SC), Adjusted Rand Index (ARI), and Normalized Mutual Information (NMI). Based on ARI and NMI, spectral clustering is the best performer, while k-means shows the greatest SC.
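Of the clustering measures mentioned above, the Adjusted Rand Index is compact enough to sketch directly. The function below follows the standard pair-counting definition; it is an illustration rather than the implementation used by Rasul et al. [73], it assumes at least two samples, and the convention of returning 1.0 in the degenerate case is an assumption of this sketch.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two labelings (needs >= 2 samples)."""
    n = len(labels_true)
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of sample pairs grouped together in both, and in each, labeling.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        # Degenerate case, e.g., both labelings put everything in one cluster.
        return 1.0
    return (sum_pairs - expected) / (maximum - expected)
```

The index equals 1 when two labelings agree up to a renaming of the clusters and is close to 0 for random assignments, which is why it is used to compare cluster assignments against known diagnoses.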

We provide a summary of the studies related to the questionnaire data in Table 1.

4.2. Facial Image-Based Diagnosis of ASD

The diagnosis of ASD has also been explored by examining facial characteristics [74]. According to research, many facial distances are altered in individuals with ASD. The distances between the glabella and nasion and the inner canthi, as well as the distances between the nasion and landmarks on the nose and philtrum, are reduced. In contrast, the distances between the mouth and the nasal region, as well as the distances between facial landmarks and the eyes and the opposing side that contains the mouth, are increased. Boys with ASD usually exhibit significant facial asymmetry, particularly around the supra- and periorbital areas [75,76]. Additionally, studies reveal that children with high-functioning autism (HFA) produce neutral expressions more frequently and show less richness and flexibility in their facial expressions than children without ASD, who more often display expressions indicating engagement and interest.

Facial recognition technology is beneficial for real-time assessment of mental and emotional states and for ASD diagnosis. By analyzing human traits, this technique extracts features that differentiate between typical and atypical expressions, and it is applied to identify behavioral patterns in large datasets. The study of facial expressions in individuals with ASD has recently received more attention. Facial expressions have also been widely studied in individuals with neurological illnesses such as Alzheimer’s disease, neurodegenerative disorders, and frontotemporal dementia. Various techniques train models to detect features like lips, eyes, and eyebrows to help children with ASD comprehend and use facial emotions. Combined with iconized images, these models help in differential diagnosis [74,77]. The following is a list of the most relevant works related to AI techniques and the use of facial image data to detect ASD.

In 2016, Liu et al. [78] identified children with ASD by applying ML techniques to eye movement data recorded during a face recognition task. They used a dataset including three cohorts: 29 Chinese children between four and eleven years old diagnosed with ASD, 29 age-matched typically developing (TD) Chinese children, and 29 additional IQ-matched TD children. The main objective was to assess the sensitivity, specificity, and accuracy of ASD classification using face scanning patterns.

Later, in 2021, Lu and Perkowski [79] used the VGG16 DL model with transfer learning based on facial images for identifying ASD in children. This study employed two datasets: the Kaggle Autism Facial Dataset, which included 2936 photos from diverse racial origins, and the East Asia ASD Children Facial Image Dataset, which included 1122 photographs of East Asian children. The two datasets are split equally between kids with ASD and TD.

A year later, Mujeeb Rahman and Subashini [80] examined whether facial features extracted from photos of autistic children could serve as a biomarker to distinguish them from typically developing children. The work uses five pre-trained CNN models (MobileNet, Xception, EfficientNetB0, EfficientNetB1, and EfficientNetB2) for feature extraction and a DNN model for ASD classification. The publicly accessible dataset contains 2936 color 2D face photos of children between the ages of two and 14, with the majority between two and eight. It is split into training, validation, and test sets, each divided into autistic and non-autistic subgroups. The CNN-based feature extractor captures complex facial features important for classification, demonstrating the model’s ability to extract sophisticated features beyond human visual evaluation using digital filters.

The year 2023 was critical for this type of application. Gaddala et al. [81] sought to improve the diagnosis of ASD by integrating DL with conventional techniques, applying deep CNNs such as VGG16 and VGG19 to a dataset of facial images. The models were trained with a batch size of 12 over ten epochs to achieve optimal accuracy in ASD identification. Additionally, Alkahtani et al. [82] used a Kaggle dataset of 2940 facial photos of children with and without autism to diagnose ASD through facial landmark analysis. They combined ML classifiers, such as LR, linear SVM, RF, DT, gradient boosting, MLP, and kNN, with deep CNNs, including MobileNetV2 and a hybrid VGG19, relying on transfer learning to improve model performance. Finally, Li et al. [83] employed DL techniques and focused on improving ASD classification on a facial image dataset using MobileNetV2 and MobileNetV3-Large models through two-phase transfer learning. Their study showed improvements in accuracy and Area Under the Curve (AUC), supporting the two-phase transfer learning method.

More recently, Reddy [84] employed three pre-trained CNN models, VGG16, VGG19, and EfficientNetB0, for feature extraction and classification of ASD using a dataset of facial images from Kaggle. The results revealed that DL models, especially EfficientNetB0, perform well in the classification of ASD.

At this point, we would like to provide deeper comparative insights regarding the performance of convolutional neural networks (CNNs) and support vector machines (SVMs) in facial image analysis. Before CNNs became dominant, SVMs were considered state-of-the-art for many classification tasks. However, in recent years, CNNs have established themselves as the leading approach for object recognition in computer vision. One reason for their success is that convolutional layers can learn hierarchical and spatially-aware representations of input data, while fully connected layers subsequently map these representations to output classes. The superiority of CNNs is often attributed to several factors, including translation invariance, hierarchical feature learning, reduction in overfitting through techniques such as dropout and data augmentation, and efficiency in processing image data. By contrast, SVMs do not learn feature representations from data. Instead, they rely on a pre-specified kernel function to transform the data into a higher-dimensional space where it may become linearly separable. If the chosen kernel is appropriate, the SVM can perform well; if not, its performance degrades. This process can be seen as a form of “educated guessing”. In contrast, CNNs can be interpreted as learning both the feature transformation (analogous to a data-driven kernel) and the classifier simultaneously. This capacity to adaptively learn representations makes CNNs particularly advantageous in complex or exploratory domains [85,86].

Finally, regarding the architectures of the CNNs, VGG-16 is a relatively straightforward CNN architecture that does not include residual connections. In contrast, ResNet architectures incorporate residual blocks, allowing for the effective training of much deeper networks. As a result, ResNets generally outperform VGG models in both accuracy and computational efficiency. For example, ResNet-50 requires approximately 3.8 billion floating point operations (FLOPs), compared to 15.3 billion FLOPs for VGG-16 [87]. In terms of classification accuracy on the ImageNet benchmark, ResNet-50 achieves an accuracy of 85.7% [88], compared to 79.01% for VGG-16 [89].
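Taking the figures quoted above at face value, the gap between the two architectures can be expressed with a short worked calculation. The values are those cited from [87,88,89]; only the ratio and the difference are computed here.

```latex
\frac{\mathrm{FLOPs}_{\text{VGG-16}}}{\mathrm{FLOPs}_{\text{ResNet-50}}}
  = \frac{15.3 \times 10^{9}}{3.8 \times 10^{9}} \approx 4.0,
\qquad
85.7\% - 79.01\% = 6.69 \text{ percentage points}
```

In other words, under these reported numbers ResNet-50 needs roughly a quarter of the computation of VGG-16 while being about 6.7 percentage points more accurate on ImageNet.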

Furthermore, we present the summary of all studies on the facial image data in Table 2.

4.3. MRI-Based Diagnosis of ASD

Strong magnets and radio waves are used in MRI to produce precise images of the body. MRI works similarly to a powerful camera that can see through muscle and skin, providing doctors with incredibly detailed images of organs, tissues, and the brain [90]. Since MRIs do not use radiation, they are safer than CT scans and X-rays. Because of this, they are ideal for evaluating soft tissues, such as the brain and muscles, which aid in diagnosing everything from malignancies to brain damage [91,92].

AI-based systems can decipher the complex information concealed in MRI images, particularly those built on ML, DL, and computer vision. These scans capture the structure and functionality of the brain, and AI decodes the intricate patterns inside them like a codebreaker. AI may detect minute variations in brain areas and their connections, frequently connected to ASD, by training on enormous volumes of MRI data. AI can recognize these subtleties that human specialists could overlook. This makes it possible to diagnose ASD more accurately and comprehend its brain roots on a deeper level. Additionally, MRI is a noninvasive, accurate method of making an early diagnosis of ASD in collaboration with medical specialists, which improves treatment strategies and provides insightful information about this complex disease [93,94].

For instance, Heinsfeld et al. [95] used a resting-state functional MRI (rs-fMRI) dataset from the Autism Brain Imaging Data Exchange (ABIDE I), which included 530 typical controls (TC) and 505 people with ASD. The data, collected at seventeen distinct imaging sites, were used to train a classification model. To improve model accuracy on new data, the researchers used denoising autoencoders, which reconstruct inputs from corrupted copies. In the unsupervised pre-training phase, two stacked autoencoders were used to reduce the dimensionality of the data while minimizing the reconstruction loss. The autoencoder weights were then transferred to a multilayer perceptron (MLP), the final model, which was fine-tuned to reduce prediction errors. The output layer of the MLP yielded the probability that an input would be classified as either TC or ASD. In addition, Li et al. [96] used fMRI scans to identify biomarkers for ASD using a two-stage technique. In the first stage, spatial-temporal data from 4D fMRI, compressed to $32 \times 32 \times 32$ dimensions, are used to train a DNN classifier (2CC3D). The classifier combines convolutional and fully connected layers with dropout and L2 regularization to avoid overfitting. The technique uses task-based fMRI images, acquired from participants with ASD and control subjects performing a pointing task and preprocessed by motion correction, slice timing correction, brain extraction, spatial smoothing, and high-pass filtering. Region segmentation is carried out using the AAL atlas, and sliding-window analysis over the fMRI time dimension produces 144 3D volume pairs per participant. Experimental results demonstrate the method’s effectiveness in distinguishing ASD-related brain characteristics between participants with ASD and healthy controls.
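The sliding-window step described for Li et al. [96] can be illustrated with a small sketch that enumerates windows along the time axis of a scan. The window length and stride used in that study are not stated here, so the values below are hypothetical, and the sketch does not attempt to reproduce the 144 volume pairs reported.

```python
def sliding_windows(n_timepoints, window_size, stride):
    """Return (start, end) index pairs, end exclusive, for windows that fit in the series."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    last_start = n_timepoints - window_size
    return [(s, s + window_size) for s in range(0, last_start + 1, stride)]


def window_mean(series, start, end):
    """Average a 1D signal over one window (a stand-in for a per-window summary volume)."""
    return sum(series[start:end]) / (end - start)


# Hypothetical settings: 10 time points, windows of 4 frames, moved 2 frames at a time.
windows = sliding_windows(10, window_size=4, stride=2)
signal = [float(t) for t in range(10)]
means = [window_mean(signal, s, e) for s, e in windows]
```

Each window would yield one input sample for the classifier, so a single scan contributes many training examples.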

Just a year later, Mostafa et al. [97] used the ABIDE dataset, which comprises structural MRI, rs-fMRI, and phenotypic data from 1112 participants (539 ASD and 573 TC), to diagnose ASD using 264 raw brain features derived from the eigenvalues of the Laplacian matrix of the brain network, plus three network centrality features. The 264 features were reduced to 64 using a feature selection technique. These features were then used to train several ML models, namely NN, SVM, LDA, LR, and kNN, to diagnose ASD and to evaluate classification performance for ASD and TC participants. The NN was implemented in Python using PyCharm, whereas LDA, LR, SVM, and kNN were run in MATLAB. The same year, Eslami and Saeed [98] presented the Auto-ASD-Network model, which uses fMRI data to differentiate between individuals with ASD and healthy individuals. To increase classification accuracy and avoid overfitting, this model uses a multilayer perceptron (MLP) with two hidden layers and adds SMOTE for data augmentation. The features extracted by the MLP are passed to an SVM classifier, which is optimized with Auto-Tune Models (ATM). SMOTE constructs synthetic samples by interpolating between a sample and randomly chosen nearest neighbors, increasing the size of the training set. The fMRI data from the ABIDE initiative are preprocessed with the C-PAC pipeline. The preprocessing steps include motion correction, slice timing correction, drift correction, nuisance signal removal, and voxel normalization. The spatially constrained spectral clustering parcellation provided with ABIDE partitions the brain into 200 regions.
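The interpolation at the core of SMOTE, as used for augmentation by Eslami and Saeed [98], can be sketched in a few lines. This is a simplified illustration under stated assumptions: Euclidean distance, a hypothetical toy dataset, at least two input samples, and a hypothetical function name; it is not the implementation used in the cited work.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote_samples(samples, n_new, k=5, seed=0):
    """Create n_new synthetic points between samples and their k nearest neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by distance to the chosen base point.
        others = [p for p in samples if p is not base]
        neighbors = sorted(others, key=lambda p: euclidean(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Pick a random position on the segment from base to neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical 2D feature vectors, for illustration only.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_samples(points, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two real samples, the augmented set stays within the region already covered by the data.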

Sharif and Khan [99] presented an ML system that uses brain volume and corpus callosum data to identify ASD. This method simplifies model training through selective feature extraction while achieving excellent recognition accuracy for ASD. The study also investigates the use of DL for neuroimaging data analysis, using the ABIDE-I dataset to build a VGG16 model. This dataset includes 573 healthy controls and 539 autistic persons from 17 international sites. The effectiveness of ML in analyzing structural MRI data for ASD identification was demonstrated by evaluating several classifiers, including LDA, SVM (with an RBF kernel), RF, MLP, and kNN (with $k = 3$). Additionally, in 2022, Othmani et al. [100] used MRI data to classify ASD with a comprehensible DNN method. When evaluated on the ABIDE dataset, the technique outperformed VGG16 and ResNet-50 and produced encouraging results. The pipeline consists of preprocessing the data, isolating and then integrating local ROIs, augmenting the data to prevent overfitting, and feeding the augmented dataset to the LeNet-5 network. It used data processed by the Configurable Pipeline for the Analysis of Connectomes (C-PAC). The results support the use of LeNet-5 within an ASD diagnostic assistance system by showing that adding MRI images to the annotated training set considerably improves performance.

Using the ABIDE fMRI dataset, Zhang et al. [101] proposed a DL strategy combined with the F-score feature selection method for ASD diagnosis. Their method identifies functional connectivity features in the full ABIDE dataset and in intra-site datasets with high accuracy. Feature analysis showed that the path length and clustering coefficient decreased significantly in ASD, suggesting a shift from a small-world toward a random network architecture. The model is tested using k-fold cross-validation, which splits the data into k equal subsets and uses each in turn for testing while the rest are used for training. An autoencoder is used to extract lower-dimensional features. Classification accuracy (ACC), sensitivity (SEN), and specificity (SPE) are the performance measures; they quantify the percentage of correctly classified subjects, correctly identified subjects with ASD, and correctly identified healthy subjects, respectively. Similarly, Park and Cho [102] applied a residual attention network integrated with a graph convolutional network to extract features from 4D brain scans. Using 10-fold cross-validation on the ABIDE I dataset, which in their study comprises 800 individuals (389 with ASD and 411 controls), the model attains a better accuracy rate. The Adam optimizer was employed to improve training efficiency.
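The three performance measures defined above (ACC, SEN, and SPE) follow directly from the confusion matrix. The sketch below computes them for binary labels, assuming that 1 denotes ASD and 0 denotes a control; the example labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives, and false negatives."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def acc_sen_spe(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (ASD recall), and specificity (control recall)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    acc = (tp + tn) / (tp + fp + tn + fn)
    sen = tp / (tp + fn) if (tp + fn) else 0.0
    spe = tn / (tn + fp) if (tn + fp) else 0.0
    return acc, sen, spe


# Hypothetical labels: three ASD subjects and five controls.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
acc, sen, spe = acc_sen_spe(y_true, y_pred)
```

Reporting SEN and SPE alongside ACC matters when the two groups differ in size, because accuracy alone can hide poor detection of the smaller group.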

More recently, Bahathiq et al. [103] employed ABIDE datasets containing sMRI and rs-fMRI scans. Seven ML models were tested for their ability to detect early signs of ASD in children between the ages of five and ten. Combining Grey Wolf Optimizer (GWO) with support vector machines (SVM) yielded the most significant results. The assessment was conducted using 10-fold cross-validation to ensure robustness, and the models were trained using an 80/20 training–validation split. Additionally, Li et al. [104] employed a unique framework, LD-MILCT, for analyzing spatial and morphological data across different brain areas. It utilized a two-stage multi-instance learning technique. The multi-instance learning head (MIL head) of a Vision Transformer is integrated into the framework to optimize the use of essential characteristics for categorization. The study used the ABIDE dataset for ASD, which included 342 typically developing controls and 213 ASD patients, and the ADNI dataset for Alzheimer’s, which included 399 controls and 308 AD patients. In addition, also in 2024, Wang et al. [105] employed fMRI data from 264 participants, which consisted of 134 with ASD and 130 TD children. The study utilized cutting-edge techniques for detecting ASD using graph attention networks (GATs). The model considered every brain region of interest (ROI) as a node and used wavelet decomposition to extract features from the BOLD signal. The adjacency matrix is an optimized functional connectivity (FC) matrix. Self-attention methods capture long-range relationships between features, and node-selection pooling layers ascertain the importance of each ROI for prediction. Finally, Wang et al. [106] presented a multisite fMRI investigation using the ABIDE dataset. The study used a class-consistency and site-independence multiview hyperedge-aware hypergraph embedding learning (CcSi-MHAHGEL) framework to integrate functional connectivity networks (FCNs) from different brain atlases. 
A multiview hyperedge-aware hypergraph convolutional network (HGCN) learned adaptive hyperedge weights, producing a multi-atlas-based FCN embedding. The framework includes a site-independence module to minimize site-related disparities resulting from different scanning procedures and a class-consistency module to preserve intra-class compactness and inter-class separation. Finally, fully connected layers and a softmax classifier are applied to the multi-atlas-based FCN embeddings to produce the diagnosis.

Table 3 summarizes the studies related to the MRI image data.

4.4. Eye Tracking-Based Diagnosis of ASD

The use of eye-tracking technology has provided evidence in favor of theories about the gaze habits of autistic children. Eye movement analysis is an effective way to evaluate ASD, as it can provide important information about a person’s cognitive abilities, social communication skills, and visual attention. Eye-tracking technologies allow for the recording of several types of gaze patterns, such as saccadic eye movements, blinking, and fixation [107]. By highlighting minute differences in gaze-related behaviors, it becomes possible to pinpoint characteristics that differentiate people with ASD from the rest. Technological obstacles, such as the variety of equipment types and intricate analytical methods, hinder the collection and processing of eye tracking data. Cutting-edge AI techniques such as ML and DL are being applied to improve eye-tracking equipment for examining gaze patterns and attentional mechanisms in people with and without an ASD diagnosis [62,107]. Numerous studies use AI algorithms to examine gaze patterns and diagnose autism. By analyzing and classifying these gaze patterns, the integration of eye-tracking technology with AI algorithms has the potential to support the early identification of ASD [62,108,109]. Some relevant works that have explored eye tracking to diagnose ASD are described below.

Fabiano et al. [110] categorized varying degrees of ASD risk using eye gaze data and demographic characteristics like age and gender. They developed feature descriptors incorporating these variables, and examining eye-gazing patterns verified their efficacy. Their work utilized the National Database for Autism Research (NDAR) “Eye Tracking Subject-Experiment” (ETS-E) dataset, which included 229 participants with varying degrees of ASD risk. PART, a deep NN, RF, and C4.5 DTs were among the classifiers put to the test. In addition to age and gender data, the dataset included information on eye gazing. By using age, gender, average fixation time, and raw eye gaze points, they created homogeneous feature vectors that enable precise ASD categorization determined by gaze patterns for in-depth analysis.

A year later, using ML and visualization approaches, Cilia et al. [111] investigated the application of eye tracking for ASD screening. Fifty-nine school-age participants watched age-appropriate pictures and videos while their eye movements were monitored. The study included children diagnosed with ASD, whose diagnosis was confirmed by medical professionals, as well as typically developing children. Eye-tracking scan paths were converted into images, which were analyzed using a CNN. This approach aimed to simplify the diagnostic task, and it achieved high classification accuracy, suggesting that visual representations can capture the details of gaze motion well. A threshold of 200 points was used to ensure optimal visual clarity in the scan path representations. The study also looked at relationships between eye movement patterns and the degree of autism.
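The idea of turning a scan path into an image that a CNN can consume can be illustrated with a deliberately simplified sketch. It only counts gaze samples per cell of a coarse grid; the encoding used by Cilia et al. [111] is not described here and is likely richer, so the function name, grid size, and screen dimensions below are all assumptions.

```python
def rasterize_scan_path(points, width, height, grid=8):
    """Count gaze samples per cell of a grid x grid image; off-screen samples are dropped."""
    image = [[0] * grid for _ in range(grid)]
    for x, y in points:
        if not (0 <= x < width and 0 <= y < height):
            continue
        # Map screen coordinates to a cell, guarding against rounding at the border.
        col = min(grid - 1, int(x / width * grid))
        row = min(grid - 1, int(y / height * grid))
        image[row][col] += 1
    return image


# Hypothetical gaze samples on an 800 x 600 screen; the last one is off-screen.
gaze = [(0, 0), (10, 10), (799, 599), (900, 10)]
img = rasterize_scan_path(gaze, width=800, height=600, grid=4)
```

The resulting 2D array plays the role of a single-channel image, with brighter cells marking regions that attracted more gaze samples.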

In 2022, Ahmed et al. [112] presented three AI approaches, ML, DL, and hybrid, for the early detection of autism. The first technique used neural networks (FFNNs and ANNs) on combined features extracted with the grey-level co-occurrence matrix (GLCM) and local binary pattern (LBP) methods. The second technique extracts deep features with pre-trained CNN models, such as GoogleNet and ResNet-18. The third strategy is a hybrid approach that combines DL and ML methods, namely GoogleNet + SVM and ResNet-18 + SVM. The study also applied image preprocessing and feature extraction to eye-tracking scan paths. The main contributions of their study are the hybrid strategy and the thorough examination of machine learning and deep learning techniques for eye-tracking ASD detection. Using visible eye-tracking scan path (ETSP) images, Kanhirakadavath and Chandran [113] analyzed ML approaches to determine the optimal model for autism prediction. They used a publicly accessible dataset of 547 ETSPs from 328 typically developing and 219 autistic children to test three conventional ML models and a DNN classifier. To avoid overfitting, they used image augmentation. The study aimed to replace subjective manual screenings with an objective and trustworthy way of diagnosing autism using 2D ETSP images. Boosted DT (BDT), deep SVM, decision jungle (DJ), and DNN served as classifiers, while principal component analysis (PCA) and CNN were employed for feature extraction. They assessed performance using AUC, specificity, sensitivity, PPV, and NPV.

Ahmed et al. [114] combined eye tracking with deep-learning algorithms such as LSTM, CNN-LSTM, Bi-LSTM, and GRU networks. They trained these models using an eye tracking dataset specifically curated for ASD research. In performance evaluations, LSTM outperformed the other models in terms of accuracy. Moreover, in 2023, Thanarajan et al. [115] aimed to detect ASD using eye tracking and deep learning techniques. The researchers integrated eye tracking data with the Chaotic Butterfly Optimization (CBO) method to enhance diagnostic accuracy in the ETASD-CBODL framework, which analyzes patterns in an image dataset of eye-tracking scan path visualizations for ASD. The ETASD-CBODL approach uses U-Net for segmentation to find regions of interest and Inception v3 for feature extraction.

More recently, Alsaidi et al. [109] presented T-CNN-ASD, which analyzes eye tracking data to categorize subjects into typically developing (TD) and ASD groups. The researchers used quantitative eye movement analysis to study attentional processes. This is a potential technique for establishing biomarkers in clinical experiments for ASD because of its high accuracy, low cost, and ease of use. The framework compared the abnormal visual attention patterns displayed by children with ASD with those of TD children. The T-CNN-ASD model has two hidden layers comprising 300 and 150 neurons, respectively, and it was trained with a 20% dropout rate and evaluated with 10-fold cross-validation.

A summary of the studies related to the eye tracking data is presented in Table 4.

4.5. EEG-Based Diagnosis of ASD

EEG is a noninvasive method for evaluating electrical activity in the brain, making it crucial for clarifying the neurophysiological underpinnings of ASD. An EEG allows us to investigate the brain and detect minute electrical vibrations. This method of measuring brain activity and displaying it as wave patterns is painless. Epilepsy and sleep disturbances are among the illnesses that doctors identify with EEGs. Moreover, ASD can benefit from this technology if AI technologies are introduced in the diagnosis workflow. EEG recordings include intricate patterns that may be analyzed by AI techniques such as ML and DL, which allow the identification of ASD-specific patterns [52,116]. They search for indicators such as wave frequency, the electrical interactions between various brain areas, and the brain’s response to stimuli. AI can increase the precision of ASD diagnosis, particularly in the critical early stages, by identifying these minimal variations. Even better, AI can grow and learn over time. Large-scale EEG data analysis improves its ability to recognize ASD patterns. Because of this, AI techniques, such as ML and DL, can evaluate EEG data to identify abnormal brainwave patterns and connectivity and are valuable tools for medical professionals and researchers, providing a noninvasive and ever more trustworthy means of diagnosing and comprehending ASD [117,118,119].

For instance, Radhakrishnan et al. [120] investigated several deep convolutional architectures for ASD diagnosis, emphasizing classification accuracy and automated feature extraction. Their study assesses models including AlexNet, DenseNet, SqueezeNet, ShuffleNet, VGGNet, InceptionNet, and ResNet based on precision, recall, and accuracy. Under k-fold cross-validation, ResNet50 outperformed the other models in terms of specificity and F1-score. Therefore, the study suggests using ResNet50 to diagnose ASD from EEG signals. EEG data were collected from 10 typically developing and autistic children using the internationally recognized 10-20 electrode system. The procedure adhered to the Helsinki Ethical Principles and ICMR Ethical Guidelines, and parental consent was obtained.

Rogala et al. [53] investigated the classification of ASD in children by applying statistical methods and ML approaches to EEG data. The authors performed a retrospective study employing clinical EEG recordings from two patient groups aged from 2.08 to 5.92 years. The statistical procedures were the Wilcoxon rank-sum test with FDR correction and the Friedman test. For the ML part, a logistic regression model with L2 regularization was used, with cross-validation employed for optimization and to assess the model’s performance. In the same year, Alhassan et al. [121] presented sensor-based techniques for early detection of ASD using EEG signals. The study emphasized energy-efficient signal processing for feature extraction and classification using machine learning techniques. The complexity of brain activity was measured using multiscale entropy (MSE), which was calculated using digital Haar wavelet decomposition. The three ML classifiers examined were SVM, logistic regression, and decision trees.
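The Haar wavelet decomposition mentioned for Alhassan et al. [121] can be sketched as repeated pairwise averaging and differencing. The code below implements the standard orthonormal Haar transform on a short hypothetical signal; how the entropy is subsequently computed at each scale is not specified here and is therefore omitted.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length signal."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Apply haar_step repeatedly; return the final approximation and per-level details."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical four-sample signal, decomposed over two levels.
samples = [4.0, 4.0, 2.0, 2.0]
approx, details = haar_decompose(samples, levels=2)
```

Each level halves the length of the approximation, so successive levels give progressively coarser views of the signal, which is the property a multiscale measure relies on.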

More recently, Menaka et al. [122] evaluated the effectiveness of several deep learning models, such as AlexNet, VGG16, and ResNet50, in diagnosing ASD from EEG data using five cepstral coefficient features. One of the main contributions is the use of cepstral coefficients to create spectrograms during the processing stage. According to the experimental data, the maximum accuracy is obtained when AlexNet is combined with Linear Frequency Cepstral Coefficients (LFCC). In addition, Al-Qazzaz et al. [123] used several pre-trained models, such as AlexNet, SqueezeNet, and MobileNetV2, to classify ASD based on EEG data. The EEG signals are segmented into five-second intervals, and power spectral density (PSD) is used to transform the signals into greyscale spectrograms. The researchers used hybrid models that combined DL models with traditional machine learning models such as SVM, kNN, and DT. According to the results, SqueezeNet increased accuracy, and performance improved further when it was combined with the SVM classifier. Additionally, in 2024, Ullah and Yu [124] presented a weighted ensemble model based on DL for the classification of ASD. They used the short-time Fourier transform (STFT) to extract features from EEG recordings and convert them into 2D time-frequency spectrograms. The dataset included EEG data from 17 participants, 12 with ASD and 5 TD participants. EEG data were recorded with Ag/AgCl electrodes. An ensemble model is formed by combining three CNN models, and a grid search is used to build the final weighted ensemble.
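The weighted-ensemble idea described for Ullah and Yu [124] can be illustrated with a small grid search over the weights of three models. This is a sketch under stated assumptions rather than the authors' procedure: the predicted probabilities are hypothetical, the weights are restricted to a coarse grid that sums to one, and accuracy is the selection criterion. In practice the weights would be chosen on held-out validation data.

```python
from itertools import product


def ensemble_predict(prob_lists, weights, threshold=0.5):
    """Combine per-model probabilities with the given weights and threshold the result."""
    n_samples = len(prob_lists[0])
    combined = [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists))
        for i in range(n_samples)
    ]
    return [1 if p >= threshold else 0 for p in combined]


def grid_search_weights(prob_lists, y_true, steps=10):
    """Try every weight combination on a grid summing to 1; keep the most accurate."""
    best_weights, best_acc = None, -1.0
    for combo in product(range(steps + 1), repeat=len(prob_lists)):
        if sum(combo) != steps:
            continue
        weights = [c / steps for c in combo]
        preds = ensemble_predict(prob_lists, weights)
        acc = sum(p == t for p, t in zip(preds, y_true)) / len(y_true)
        if acc > best_acc:
            best_weights, best_acc = weights, acc
    return best_weights, best_acc


# Hypothetical outputs of three models on four samples (probability of ASD).
labels = [1, 0, 1, 0]
model_a = [0.9, 0.1, 0.8, 0.2]
model_b = [0.4, 0.6, 0.3, 0.7]
model_c = [0.5, 0.5, 0.5, 0.5]
best_weights, best_acc = grid_search_weights([model_a, model_b, model_c], labels)
```

The search simply favors whichever combination classifies the most samples correctly, so models that agree with the labels receive larger weights.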

We provide a summary of these studies in Table 5.

4.6. Genetics-Based Diagnosis of ASD

Genetics has enabled significant new insights into the biology of ASD [125,126]. More than a thousand genes are thought to increase the risk of ASD when they experience functional perturbations, including de novo mutations, disruptions to expression quantitative trait loci (eQTLs), and inheritance of harmful rare variants. Despite these estimates, recent studies have identified about 100 genes that are strongly associated with the incidence of ASD. Extensive genetic research is being conducted to find more genes linked to ASD [127]. The brain’s frontal and parietal cortex frequently exhibit low expression of genes linked to ASD. Notable genes include newly discovered candidates like MYCBP2 and CAND1 and well-known ones like NBEA, HERC1, and TCF20. CNNs and recurrent neural networks are two examples of contemporary DL applications in genomics that improve the analysis of DNA and RNA sequences and forecast the biological and phenotypic effects of mutations, including the single-cell methylation state of CpG dinucleotides. By applying ML and DL, AI can recognize patterns, linkages, and genetic markers that may point to an ASD diagnosis or suggest a person’s predisposition to the disorder [128]. These algorithms can navigate the intricate web of genetics and identify subtle distinctions between people with and without ASD, which results in more accurate and customized diagnoses. Additionally, AI-driven genomic analysis opens the door to creating early-stage targeted therapies and treatments by helping to decipher the underlying genetic pathways driving ASD [59,129].

For example, Gök [130] used brain developmental gene expression data to create an ML model for binary classification of ASD risk genes. The model has two main parts: feature extraction, which combines discretization with the Haar wavelet transform, and classification with a Bayes network learning algorithm. Using lncRNA gene data, the author evaluated the proposed model against several stand-alone classifiers and existing techniques, such as LR, RF, NB, Bayes network, kNN, and linear and polynomial SVM. A 10-fold cross-validation protocol was used for the ASD classification problem in all classifier assessments.
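
As a rough illustration of wavelet-based feature extraction followed by discretization, the sketch below applies one level of the orthonormal Haar transform to a short expression profile and then bins the coefficients. The order of the two steps, the equal-width binning, the number of bins, and the profile values are all assumptions made for illustration; they are not taken from [130].

```python
import math
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar wavelet transform."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Approximation: scaled pairwise sums. Detail: scaled pairwise differences.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def discretize(values: Sequence[float], n_bins: int = 3) -> List[int]:
    """Equal-width binning of values into integer codes 0..n_bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical expression profile of one gene across eight time points.
profile = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 8.0, 6.0]
approx, detail = haar_step(profile)
# Discrete features that a Bayes network classifier could consume.
features = discretize(approx + detail)
```

The approximation coefficients summarize the overall expression level in each pair of time points, while the detail coefficients capture local changes; discretizing them yields categorical inputs of the kind Bayes network learners typically expect.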

A few years later, Wang et al. [131] used ML techniques, particularly RF, to identify ASD using IgA levels and VFGM genes. The research used metagenomic datasets to compare typically developing (TD) children (n = 31) with children with ASD (n = 43). According to the data, there is a correlation between IgA levels and VFGM gene diversity, which is more pronounced in ASD. Most of the 24 VFGMs that distinguished between TD and ASD were attributed to genes related to the group B streptococcus (GBS). Using the Random Forest approach, the study found nine important virulence factors. These included five unique GBS genes (such as YP_329683.1) and four non-GBS genes (such as kfiC, pvdM, mtrE, and hasA). Additionally, in 2021, Lin et al. [132] examined ASD-specific gene expression, predicting ASD risk genes and identifying the temporospatial areas of brain structures at different developmental stages to enable prospective detection. Employing the BrainSpan atlas, the research analyzed a dataset spanning 13 developmental stages, from eight weeks post-conception to eight years, across 26 different brain regions. The proposed work utilized the SVM technique to differentiate ASD risk genes from non-ASD genes. It used the transmissible bi-objective combinatorial genetic algorithm to select the best features, improving the model’s performance. At 13 post-conception weeks, the posteroventral parietal cortex was the most predictive brain area for these genes.

Alsuliman and Al-Baity [133] explored two datasets: the GE dataset, which contains gene expression data from 30 samples with 43,931 features differentiating ASD and non-ASD cases, and the PBC dataset, from the University of California, Irvine (UCI), which contains 292 samples and 20 features related to ASD diagnosis and personal details. The study addressed the high dimensionality of the datasets by optimizing feature selection with a combination of bio-inspired algorithms and machine learning (ML) models to boost classification accuracy. For this purpose, the authors utilized four algorithms: GWO, FPA, BA, and ABC. The models were assessed using AUC, recall, accuracy, precision, and F1-score. Of all the proposed models, the GWO-SVM model had the best performance, with an accuracy of 99.66%.
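
For reference, the threshold-based metrics listed above can be computed from the confusion counts as in the sketch below. The labels and predictions are hypothetical, and AUC is omitted because it requires ranked scores rather than hard predictions.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = ASD)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test-set labels and predictions (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

In this toy example, accuracy (0.8) is higher than precision, recall, and F1 (0.75 each) because the majority class is easier to predict, which is why accuracy alone can be misleading on small or imbalanced datasets such as the 30-sample GE dataset.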

More recently, Suratanee and Plaimas [134] created a differential gene expression profile utilizing gene expression data and a gene embedding profile, which incorporated intricate gene connections using the data to determine disease-related genes for ASD. They utilized these profiles and the XGBoost classifier to find new connections for ASD. The proposed method found 10,848 putative gene–gene connections and 125 candidate genes, with the top three possibilities being DNA Topoisomerase I, ATP Synthase F1 Subunit Gamma, and Neuronal Calcium Sensor. They used statistical analysis to assess these candidate genes regarding specific pathways and activities. Additionally, they discovered sub-networks inside the prospective gene network, which aided in identifying association subgroups for putative genes linked to ASD.

A summary of the studies related to genetics data is presented in Table 6.

To summarize, we provide a plot showing the number of works in our review related to each modality. Figure 3 makes it possible to distinguish the most used modalities and thus estimate their contribution to detecting or classifying ASD. Based on these results, we can observe that MRI and questionnaires are the most used modalities for detecting ASD in our study, with 12 and 11 works, respectively.

5. Popular Datasets for ASD Detection or Classification

This section describes some popular ASD-related datasets relative to the various modalities available in the current literature.

MRI datasets. The ABIDE I and ABIDE II datasets are open-access and publicly available, consisting of resting-state fMRI (r-fMRI) data samples along with regional and whole-brain characteristics of the brain connectome. The ABIDE I dataset, launched in 2012, consists of 1112 samples collected from over 17 sites worldwide, including 539 ASD and 573 typical control participants aged from 7 to 64 years. ABIDE II, launched in 2016 with contributions from 19 locations, provided 1114 samples with enhanced phenotypic data, including 521 individuals with ASD and 593 controls, plus additional longitudinal samples from 38 individuals; participants were aged 5 to 64 years. The major limitation of the ABIDE I dataset is its imbalanced gender distribution. For instance, over 85% of participants are male, and a 4:1 male-to-female diagnosis ratio is reported. This imbalance makes it difficult for AI models to detect ASD accurately in females, since girls may exhibit distinct brain patterns and behaviors. As a result, models trained on predominantly male data may not perform well in real clinical settings where the gender distribution is more balanced [135]. Additionally, the dataset was collected in specific high-income countries, which biases it toward particular ethnicities or regions and limits its diversity. Further, the samples come from specific age groups and are relatively few, so both the number of samples and the age range covered need to be expanded to improve the usefulness of these datasets. Lastly, the quality of the data samples can differ depending on the imaging technology used by the contributing institute, which affects preprocessing and, in turn, the model’s performance after training [38,136].
Questionnaire datasets. The Autism Screening on Adults dataset, the ASD children traits dataset, and the Autism Screening data for toddlers dataset are all publicly available on Kaggle. These datasets consist of ASD questionnaire responses filled out by parents or caregivers of autistic individuals using the ASD Test application. The Autism Screening on Adults dataset consists of 21 attributes and 704 participants aged 17–64 years. The ASD children traits dataset consists of 19 attributes and 1054 instances, with participants aged from 12 to 36 months. The Autism Screening data for toddlers dataset comprises 1985 instances and 28 attributes, with participants aged from 1 to 18 years. These datasets have a few limitations. First, they were compiled from user responses, and a few outliers caused by typing errors or user mistakes during data entry compromise data integrity. Second, they offer limited detail: they consist of a few questions and do not include attributes from expert-administered medical tests such as MRI or EEG, which reduces their value for ASD classification. Moreover, all these datasets are class-imbalanced, with fewer samples from ASD patients than from non-ASD individuals, making them less effective for ASD detection [137,138].
The eye tracking dataset. The eye tracking dataset is an open-source resource publicly available in the Figshare repository. It was created to support further research on ASD by sharing detailed records of gaze behavior in children. The dataset comprises 59 participants and is limited to children aged 3 to 12 years, including ASD and TD participants, recruited from educational institutions in Hauts-de-France. The gender distribution of participants was 64% male and 36% female. Eye movements, such as fixations and saccades, pupil diameter, and point-of-gaze coordinates were recorded for each participant using a screen-based SMI RED-M eye tracker at a sampling rate of 60 Hz. This dataset provides valuable opportunities for researchers to explore ASD detection. However, it has limitations, such as small sample size, limited diversity, gender imbalance, and potential measurement errors [139,140].
NDAR dataset. The National Database for Autism Research (NDAR) is a scalable and adaptable data informatics platform that facilitates research on ASD. It supports data from all layers of biological and behavioral structures, such as synthetics, genes, brain tissue, personality, and relationships with the environment and society. NDAR also supports different sorts of data, including text, pictures, time series, and numerical values, since it holds multi-dimensional data types such as behavioral, genetic, and neuroimaging data that may be in raw format. This adds complexity to data analysis; extensive preprocessing is therefore required to develop benchmark datasets, which demands greater computational resources such as GPUs. Moreover, the datasets contain few samples and lack diversity, introducing biases toward specific ethnicities and age groups [141,142].
SFARI dataset. The SFARI Autism Inpatient Collection dataset was released in 2013 and consists of behavioral and genetics-based samples from individuals with ASD, with 527 instances. The SFARI dataset has some limitations. First, the number of instances is low, and more data samples are required. Second, the genetic dataset reflects the socioeconomic and demographic factors of participants from specific countries with different diagnostic criteria for ASD, limiting overall diversity and generalizability. Moreover, the behavioral dataset contains missing values because the data are collected via parents and caregivers, so human error can reduce data integrity [143,144,145].
EEG datasets. The P300 speller KAU dataset is an open-source dataset from King AbdulAziz University, KSA. It consists of two categories, the global dataset for autism disorder and the global P300 speller dataset, with EEG samples of individuals with ASD and of people without neurological disorders. Ten boys, ages 9 to 16, who are not diagnosed with any neurological impairments make up the control group; eight boys, ages 10 to 16, represent the ASD group. The BCIAUT_P30 dataset was launched in 2020 and is freely available on Kaggle. Comprising EEG recordings developed for a Brain-Computer Interface (BCI) system based on the P300, it contains 105 individual recordings from 15 people diagnosed with ASD, collected over seven sessions. The main limitations of EEG datasets are the lack of data diversity and demographic imbalance [146], since they contain very few EEG recordings, only from male participants. Moreover, the EEG samples belong to a particular age group, which biases models toward that age group. Lastly, EEG samples are challenging to acquire. They are prone to noise during recording due to interference from other electromagnetic sources, which adds distortion and requires advanced preprocessing techniques to remove it and enhance data quality [147,148,149]. Overall, these problems can reduce a model’s performance and generalization to real-world settings.
Genetics datasets. The Autism WGS is a collection of genomic data where 32 ASD-affected families underwent whole-genome sequencing (WGS), revealing potentially dangerous genetic abnormalities in 31% of the families and 19% of the probands. These included new and recognized susceptible genes, such as CAPRIN1, VIP, SCN2A, and numerous others connected to disorders, including fragile X syndrome and epilepsy with de novo and hereditary mutations. Due to the rarity of certain phenotypic expressions, the dataset lacks comprehensive representation, limiting insights into specific factors contributing to ASD. Furthermore, the absence of fully developed tools for analyzing structural variants related to ASD hinders the dataset from achieving its full potential in research applications [150,151].
Facial images dataset. The Autistic Children Facial Dataset is an open-source dataset publicly available on Kaggle, consisting of 2936 facial images of children with and without an ASD diagnosis. The facial images cover various ethnicities from around the world. One limitation of this dataset is that the images come from various sources, so they differ considerably in size, and many are of poor quality or noisy. The major limitation is that facial images are not the best medium for detecting or classifying ASD: children labeled as having ASD can have typical facial features, and children labeled as non-ASD can resemble those labeled as having ASD. Because facial features are so similar across the two classes, it is difficult to achieve high accuracy with CNN-based models. In addition, the number of samples is limited and needs to be increased for more effective and efficient results [56]. Lastly, this dataset lacks demographic diversity and presents racial facial feature disparity and gender imbalance, which decreases generalizability and model performance. To overcome these issues, some researchers suggest building and training race-specific models to remove the bias that other race factors introduce into the reliability and accuracy of the models. They also recommend balanced demographic representation within datasets to reduce the misclassification caused by facial anthropometric disparities or differences in facial structure when detecting ASD from facial images [79].
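
Several of the datasets above are imbalanced by gender or class. One common mitigation, shown here as a minimal sketch rather than a method used by the cited works, is to weight each sample inversely to the frequency of its group during training. The cohort below is hypothetical: 100 participants with an 85/15 male-to-female split, loosely mirroring the "over 85% male" figure reported for ABIDE I.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def inverse_frequency_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each group by n / (k * count), so rarer groups weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {group: n / (k * count) for group, count in counts.items()}


# Hypothetical cohort (not real ABIDE records): 85 male, 15 female.
sex_labels = ["M"] * 85 + ["F"] * 15
weights = inverse_frequency_weights(sex_labels)

# After weighting, each group contributes the same total weight.
total_m = weights["M"] * 85
total_f = weights["F"] * 15
# Ratio of male to female participants in this cohort.
male_to_female = 85 / 15
```

Reweighting balances the groups' influence on the loss, but it does not add new information about the under-represented group, so it complements rather than replaces the collection of more diverse data.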

To summarize, we provide a list of popular ASD-related datasets available in the current literature in Table 7.

6. Discussion

This document analyzes how ML and DL techniques have been used in ASD research. ASD is a broad category of conditions frequently characterized by speech and social interaction difficulties. People with ASD have a wide range of needs, which may change over time. Some people can live independently, while others have severe impairments that require care and assistance for the rest of their lives. As discussed in Section 1, ASD in children is diagnosed at approximately four and a half years of age in the US [152]. However, parents and caregivers often report concerns about their children’s behavior as early as two years of age [13]. Thus, early detection of ASD is essential for timely intervention and support. Accordingly, our investigation emphasizes the various modalities used for the early detection of ASD, as well as the significant role of AI in improving diagnostic accuracy. Moreover, we have described the datasets used for the early detection and classification of ASD and their limitations (Table 7). The ML- and DL-based approaches discussed in this document can be used to diagnose ASD and to develop tools or apps that assist healthcare professionals, clinicians, and caregivers in the early detection of ASD. Although incorporating AI in identifying and categorizing ASD has shown significant potential, numerous limitations remain in the existing research that must be addressed to improve the effectiveness and usability of AI models for ASD detection.

The most significant limitations found in the reviewed literature include the lack of data augmentation, small dataset sizes, imbalanced data, sample homogeneity, and generalizability concerns, all of which limit the performance of the models. We found that, in many cases, data augmentation approaches are necessary for enhancing the model’s performance. For example, Wang et al. [153] determined that various augmentation approaches can directly enhance classification tasks. Moreover, Frid-Adar et al. [154] recommended using GANs to produce synthetic data that represent the characteristics of ASD datasets, increasing dataset size and helping to prevent overfitting. Limited sample sizes in many studies on ASD detection and classification can result in overfitting and poor model generalizability: models trained on small datasets frequently perform well on training data but show poor classification accuracy on unseen data. Researchers should therefore collect larger, more varied datasets that cover a wider range of demographics, geographic areas, and clinical symptoms of ASD. Furthermore, they should collaborate with research institutions and centers to enhance data exchange and aggregation. This can enable more rigorous analysis and improve the generalizability of results.
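
A simple form of data augmentation for one-dimensional signals, such as EEG segments, is sketched below. It is a generic illustration under assumed settings (circular time shift, amplitude scaling between 0.9 and 1.1, and Gaussian noise), not the augmentation used in [153] or the GAN-based approach of [154]; the signals are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def augment_signal(signal: Sequence[float], rng: random.Random,
                   noise_std: float = 0.05, max_shift: int = 3) -> List[float]:
    """Return a perturbed copy: circular shift, amplitude scaling, noise."""
    n = len(signal)
    shift = rng.randint(-max_shift, max_shift)
    # Circular shift keeps the length unchanged (an illustrative choice).
    shifted = [signal[(i - shift) % n] for i in range(n)]
    scale = rng.uniform(0.9, 1.1)
    return [scale * x + rng.gauss(0.0, noise_std) for x in shifted]


def augment_dataset(signals: Sequence[Sequence[float]], labels: Sequence[int],
                    copies: int, seed: int = 0) -> Tuple[List[List[float]], List[int]]:
    """Append `copies` augmented versions of every signal, keeping labels."""
    rng = random.Random(seed)
    out_x = [list(s) for s in signals]
    out_y = list(labels)
    for signal, label in zip(signals, labels):
        for _ in range(copies):
            out_x.append(augment_signal(signal, rng))
            out_y.append(label)
    return out_x, out_y


# Hypothetical training signals (two short segments) and their labels.
train_x = [[0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5],
           [1.0, 0.8, 0.6, 0.4, 0.2, 0.0, -0.2, -0.4]]
train_y = [1, 0]
aug_x, aug_y = augment_dataset(train_x, train_y, copies=4)
```

Augmentation should be applied to the training split only; augmenting before splitting would leak near-duplicates of training samples into the test set and inflate the reported accuracy.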

Similarly, various modalities exist for ASD detection, including EEG, eye gaze tracking, facial images, questionnaires, genetics, biomarkers, and MRI images. Most researchers have relied on a single modality for ASD classification rather than combining several. This largely unexplored direction could enable a research breakthrough: for example, MRI could be combined with facial images, or EEG could be paired with biomarkers, for more detailed and accurate detection of ASD. Genetics is a modality that remains comparatively underexplored, yet it can also be beneficial for early detection.

Various authors have presented ML and DL approaches to detect and classify ASD across different modalities. From this review, we identified the limitations and research gaps of these approaches. For instance, researchers have used questionnaire data to detect ASD with ML techniques. An ASD test app and various other ASD questionnaires have been designed and developed by experts, and they consist of multiple parameters that can assist in detecting ASD. Patients’ parents or caregivers fill out the questionnaires, and the responses are processed into structured datasets. AI-based models are then trained and evaluated for ASD classification using these datasets [22]. However, this modality has some limitations: few datasets are available to researchers, and the existing ones consist of similar features and samples belonging to a specific ethnicity. In addition, these datasets can carry human biases because they are filled out manually by parents or caregivers, which could affect their overall validity and accuracy. Moreover, these datasets do not include attributes from medical tests performed by professionals, which contain factual details relevant to ASD detection [155]. Adding more clinical assessment and medical examination attributes to the datasets would improve the accuracy of predictions [67].

Furthermore, recent research shows several facial features linked to ASD, such as a wide upper mouth, wide-set eyes, and shorter middle regions such as cheeks and nose [56]. For early detection of ASD, the authors of [56] used the facial images dataset sourced from Kaggle. The dataset was collected from different platforms such as Facebook and Google. It comprises 2D RGB images of children aged from 2 to 8 years. Facial images have been utilized by various authors as a research modality for the detection and classification of ASD using ML and DL techniques, mainly CNNs, to extract hidden features and assess their potential for detecting ASD [56]. For example, Alkahtani et al. [82] focused on the potential of CNN-based facial features and employed transfer learning techniques, using MobileNetV2 and Hybrid VGG19 to improve the detection performance of ASD. However, this modality has some limitations: the dataset contains no information on clinical records, ASD severity, ethnicity, or the socioeconomic condition of the children. Further, the image quality is suboptimal in terms of brightness, image size, and facial alignment. To obtain reliable predictions, the training data for DL algorithms should be a thoroughly inclusive dataset that includes all the details related to ASD.

Moreover, magnetic resonance imaging (MRI), a noninvasive technique, has been widely utilized to analyze the brain’s regional networks and provides structural information such as regional volumes, white matter, cerebrospinal fluid, and cortical thickness, all of which can help identify ASD using ML or DL techniques [99]. MRI scans are further separated into two categories based on scanning techniques: structural MRI (sMRI) and functional MRI (fMRI). sMRI scans are employed to assess the brain’s anatomy and neurology and to determine the brain’s volume. Meanwhile, fMRI identifies variations in blood flow for functional connectivity analysis. However, this modality exhibits a few limitations, as the samples utilized in various studies are neither sufficient nor diverse enough to train the proposed techniques [97]. These datasets consist of data from the populations of specific regions, reducing their overall diversity, range, and usability. They are also challenging to acquire and preprocess into a form that deep learning models or ML classifiers can be trained on. Other limitations include difficulties in interpreting results, heterogeneity among individuals with ASD, and the impact of data quality on results. If preprocessing and analysis are not conducted correctly, there might be a significant loss of accuracy in the results. For neuroimaging data to provide accurate and significant insights into ASD, these issues must be addressed appropriately.
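
One common way to derive functional connectivity features from fMRI, sketched here under simplifying assumptions, is to compute the Pearson correlation between the time series of each pair of brain regions and use the upper triangle of the resulting matrix as a feature vector. The three region-of-interest (ROI) series below are hypothetical and far shorter than real recordings; the sketch also assumes that no series is constant.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def connectivity_matrix(rois: Sequence[Sequence[float]]) -> List[List[float]]:
    """Correlation between every pair of ROI time series."""
    return [[pearson(a, b) for b in rois] for a in rois]


def upper_triangle(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Flatten the entries above the diagonal into a feature vector."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


# Hypothetical ROI time series (illustrative values only).
roi_series = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
]
fc = connectivity_matrix(roi_series)
fc_features = upper_triangle(fc)
```

With R regions, this yields R(R-1)/2 features per participant, which quickly exceeds the number of participants in typical datasets and is one reason why sample size is such a pressing limitation for this modality.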

Similarly, an electroencephalogram (EEG) records the brain’s electrical activity. Small sensors are applied to the scalp during this noninvasive process to pick up electrical impulses produced by the brain. A machine then records these signals, and a physician examines them [121]. Various authors have explored identifying ASD based on features extracted from EEG data using ML algorithms, and the efficiency of these methods depends closely on obtaining relevant features from the signals [124]. However, this modality has a few limitations: the publicly available datasets contain very few samples, suffer from class imbalance, and are skewed toward specific genders and ethnicities [120]. Moreover, EEG data are challenging to record, as the procedure can cause discomfort to the individual even though it is entirely painless. During recording, EEG signals can pick up unwanted noise that affects the ASD detection process and is difficult to remove. Finally, EEG presents challenges in deciphering intricate brain patterns and in accounting for the potential influence of confounding factors on EEG signals [53,120,122].

Eye tracking is an essential tool for examining a person’s gaze position. It is a promising marker for ASD due to its speed, affordability, ease of analysis, and suitability for all age groups [112]. Researchers have concentrated on detecting ASD using eye tracking and examining the biological and behavioral patterns of eye movement with ML algorithms, particularly in children with developmental problems, including ASD [109]. As a biomarker tool for evaluating children with ASD, eye tracking offers a few advantages. Firstly, it can be used with young children, enabling early identification of ASD concerns. Secondly, eye tracking data furnish information that can be utilized as biomarkers to signify unusual visual attention. Thirdly, eye-tracking technology provides a straightforward metric associated with the diagnostic instruments employed for ASD screening [112,115]. Unfortunately, this modality has some limitations. Few eye-tracking datasets are publicly available, and those that exist are imbalanced. Moreover, they are not appropriate for accurate ASD screening since they aggregate several eye-tracking scan path images from a small number of participants rather than a single trial from a large sample. Additionally, the accuracy of the suggested ASD screening method depends on the quality and consistency of the training data. Therefore, the clinician needs to make sure the child is looking at the eye tracker during the trials and that there are no distractions in the recording environment that could cause the child to lose focus [109,115].

Genetics significantly contributes to the early identification of ASD by finding particular variations linked to ASD. Advancements in genomic technology, including whole exome sequencing (WES) and genome-wide association studies (GWAS), have enabled the identification of many genetic variants associated with ASD, but these are time-consuming and expensive. Computational tools, in contrast, offer rapid, more dependable, and cost-effective alternatives for predicting or prioritizing potential disease genes. Computational methods integrate many data sources and gene functional information related to connected disorders to predict disease genes using machine learning techniques [130]. However, genetic data pose a few challenges, such as data quality and interpretability, and most genomic datasets have far fewer samples than gene features [133,134].

AI is developing rapidly and has many possible uses in multiple domains. However, several issues in ASD detection must be resolved if AI systems are to be implemented safely and efficiently. The major difficulties concern data quality, data availability, interpretability, ethical considerations such as privacy and data protection, and algorithm complexity. To fully realize AI’s promise, these challenges must be overcome. They are described below.

Data quality. AI models’ accuracy and predictive ability are greatly influenced by the quality of the data used to train them. Ensuring good data quality through thoughtful gathering, preprocessing, cleaning, feature extraction, and validation is crucial for ASD research because it depends on identifying nuanced patterns and behaviors [56,124].
Data availability. Large datasets are necessary for AI systems, especially DL models, to learn efficiently and generalize correctly. However, limited and insufficiently diverse data are a common problem in ASD research, which might make AI models underperform [156].
Interpretability. Understanding the fundamental mechanisms of DL algorithms can be difficult because they map complex, nonlinear functions. The interpretability of results is essential in the healthcare sector, as understanding the factors that influence outcomes is as vital as generating accurate predictions. Interpretability enhances confidence and supports the prompt integration of these technologies into clinical procedures, improving decision-making and allowing medical practitioners to make informed choices based on algorithmic insights [156].
Ethical consideration. Strict adherence to legal and ethical obligations is crucial for managing medical data, especially for ASD. Obtaining informed consent, protecting patient privacy, and securing data are paramount [157,158].
Algorithm complexity. Comprehensive AI techniques typically demand a lot of computing power, necessitating the use of suitable hardware and expertise in model building and training. This complexity may be a challenge in clinical and research environments for ASD, where resources are often limited [159].

7. Conclusions

ASD detection is a complex process that requires a significant amount of resources, including financial resources, time, and specialized expertise. The prompt detection of ASD may represent a significant difference in the future for individuals, as early intervention translates into a more promising outcome. Aiming to improve how ASD detection or classification is performed, researchers worldwide have been developing techniques involving machine learning (ML) and deep learning (DL) models and training them on respective datasets based on the requirements and modalities under consideration. These techniques can be very helpful in ASD detection, as they can accelerate the diagnostic process while maintaining the main aim of accuracy, efficiency, and correctness.

In this document, we have explored various modalities for detecting or classifying Autism Spectrum Disorder (ASD), including questionnaires, facial images, MRI, EEG, eye tracking, and genetics, and how Artificial Intelligence (AI) can interact with these modalities to enhance their effectiveness. As part of the exploration, we have also studied various available datasets related to ASD research and discussed their use and limitations. Finally, we have carefully reviewed the limitations and research gaps in existing work, including the challenges and issues related to ASD detection or classification based on AI techniques.

Overall, our review discusses the role of AI in the detection of ASD: it describes the various modalities and the recent relevant work conducted by researchers in the ML and DL domains, while highlighting the limitations and research gaps that require attention and can pave the way for future research. We have also discussed popular ASD datasets from the literature, providing their download links, to accelerate the development of behavioral and technical studies on ASD. The aim of this research was to highlight the relevant aspects of ASD detection based on research questions designed to extract extensive details on the topic. The main finding is that numerous studies have introduced ML and DL techniques for detecting and classifying ASD, although they share the limitations discussed in this review. The field of ML and DL for ASD identification can make great progress by tackling these constraints and investigating the noted research gaps, which will open the door to more reliable, applicable, and practical models for diagnosing and treating ASD.

One significant limitation identified in the reviewed literature is that most studies rely on a single modality for detecting ASD and therefore do not leverage the potential benefits of multimodal approaches. AI can significantly assist in this process by integrating diverse data sources, including behavioral and biological indicators, which can enhance the diagnostic accuracy and timely detection of ASD. For example, we could consider a framework that combines fMRI and genetics with attention mechanisms to prioritize cross-modal features.
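
To make the idea of attention-based fusion more concrete, the sketch below combines two modality embeddings with scaled dot-product attention weights. It is a toy illustration of the general mechanism, not a proposed architecture: the embeddings and the query vector are hypothetical fixed numbers, whereas in a real model they would be learned from fMRI and genetic data.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings: Sequence[Sequence[float]],
                   query: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Fuse modality embeddings using scaled dot-product attention weights."""
    dim = len(query)
    # One relevance score per modality: scaled dot product with the query.
    scores = [sum(q * e for q, e in zip(query, emb)) / math.sqrt(dim)
              for emb in embeddings]
    weights = softmax(scores)
    # Weighted sum of the embeddings, dimension by dimension.
    fused = [sum(w * emb[j] for w, emb in zip(weights, embeddings))
             for j in range(dim)]
    return fused, weights


# Hypothetical embeddings produced by separate fMRI and genetics encoders.
fmri_embedding = [0.2, 0.9, -0.1, 0.4]
genetics_embedding = [0.7, -0.3, 0.5, 0.1]
# Hypothetical query vector; in practice this would be a learned parameter.
query_vector = [1.0, 0.0, 1.0, 0.0]

fused_embedding, modality_weights = attention_fuse(
    [fmri_embedding, genetics_embedding], query_vector)
```

The attention weights indicate how much each modality contributes to the fused representation for a given input, which also offers a simple handle for inspecting which data source drives a prediction.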

Moreover, we observed that many of the reviewed studies used small and demographically imbalanced datasets. Such conditions can limit the robustness and generalization of AI models for ASD detection when they are applied to real-world populations. Therefore, future research should prioritize the collection of larger, more balanced, and ethnically diverse datasets. Such datasets would help ASD detection models generalize across demographic groups, thereby improving their clinical reliability in different global settings.
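One simple way to make demographic imbalance visible is to report sample size, sensitivity, and specificity separately for each subgroup instead of a single pooled score. The sketch below assumes binary labels (1 = ASD, 0 = non-ASD) and uses a hypothetical `subgroup_report` helper with made-up records; it is an illustration, not an evaluation protocol taken from the reviewed studies.

```python
from collections import defaultdict


def subgroup_report(records):
    """Compute per-group sample size, sensitivity, and specificity.

    records: iterable of (group, y_true, y_pred) tuples, where labels are
        1 for ASD and 0 for non-ASD (an assumption of this sketch).
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for group, y_true, y_pred in records:
        if y_true == 1:
            key = "tp" if y_pred == 1 else "fn"
        else:
            key = "tn" if y_pred == 0 else "fp"
        counts[group][key] += 1

    report = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        report[group] = {
            "n": positives + negatives,
            # None marks a metric that is undefined for this group.
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return report


# Made-up predictions for two subgroups of unequal size.
records = [
    ("male", 1, 1), ("male", 1, 1), ("male", 1, 0),
    ("male", 0, 0), ("male", 0, 0), ("male", 0, 1),
    ("female", 1, 0), ("female", 1, 1), ("female", 0, 0),
]

report = subgroup_report(records)
for group, metrics in report.items():
    print(group, metrics)
```

A large gap between subgroups in either `n` or the metrics would suggest that a pooled accuracy figure overstates how well the model generalizes.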

Finally, AI models are typically evaluated under controlled experimental conditions rather than validated in different clinical environments. Future work should evaluate these models in diverse clinical settings to assess their performance in real-world scenarios.
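When data from several acquisition sites are available, a leave-one-site-out split is one common way to approximate evaluation in an unseen setting. It does not replace prospective clinical validation. The sketch below assumes each sample carries a site label; the `leave_one_site_out` helper and the site names are hypothetical.

```python
def leave_one_site_out(site_labels):
    """Yield (held_out_site, train_indices, test_indices) for each site.

    site_labels: list with one site identifier per sample.
    Each distinct site is held out once as the test set, so the model is
    always tested on a site it never saw during training.
    """
    for held_out in sorted(set(site_labels)):
        train = [i for i, s in enumerate(site_labels) if s != held_out]
        test = [i for i, s in enumerate(site_labels) if s == held_out]
        yield held_out, train, test


# Hypothetical site label for each of six samples.
sites = ["clinic_a", "clinic_a", "clinic_b", "clinic_c", "clinic_b", "clinic_c"]

splits = list(leave_one_site_out(sites))
for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)
```

Reporting the metric for each held-out site, rather than only the average, shows whether performance depends on where the data were collected.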

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app15148056/s1, File S1: The PRISMA 2020 27-item checklist.

Author Contributions

Conceptualization, M.A. and J.C.O.-B.; methodology, M.A., F.A., S.H., I.A. and J.C.O.-B.; formal analysis, M.A., F.A., S.H., G.O.-R. and J.C.O.-B.; investigation, M.A., F.A. and S.H.; data curation, M.A., F.A. and S.H.; writing—original draft preparation, M.A., F.A., S.H. and J.C.O.-B.; writing—review and editing, A.K.G.-E., I.A., G.O.-R., J.C.O.-B.; visualization, M.A. and J.C.O.-B.; supervision, G.O.-R.; project administration, A.K.G.-E., I.A.; funding acquisition, J.C.O.-B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Links to the publicly archived datasets described in this study are provided in the text.

Conflicts of Interest

The authors declare no conflict of interest.

References

Mayo Clinic. Autism Spectrum Disorder. 2018. Available online: https://www.mayoclinic.org/diseases-conditions/autism-spectrum-disorder/symptoms-causes/syc-20352928 (accessed on 13 March 2025).
Eslami, T.; Almuqhim, F.; Raiker, J.S.; Saeed, F. Machine learning methods for diagnosing autism spectrum disorder and attention-deficit/hyperactivity disorder using functional and structural MRI: A survey. Front. Neuroinform. 2021, 14, 575999. [Google Scholar] [CrossRef] [PubMed]
National Institute of Mental Health. Autism Spectrum Disorder. 2024. Available online: https://www.nimh.nih.gov/health/topics/autism-spectrum-disorders-asd (accessed on 13 March 2025).
Christensen, D.L.; Baio, J.; Braun, K.V.N. Prevalence and Characteristics of Autism Spectrum Disorder Among Children Aged 8 Years—Autism and Developmental Disabilities Monitoring Network, 11 Sites, United States, 2012. MMWR Surveill. Summ. 2018, 65, 1–3. Available online: https://www.cdc.gov/mmwr/volumes/65/ss/ss6513a1.htm?s_cid=ss6513a1_w (accessed on 13 March 2025). [CrossRef] [PubMed]
World Health Organization. Autism. 2023. Available online: https://www.who.int/news-room/fact-sheets/detail/autism-spectrum-disorders (accessed on 13 March 2025).
Talantseva, O.I.; Romanova, R.S.; Shurdova, E.M.; Dolgorukova, T.A.; Sologub, P.S.; Titova, O.S.; Kleeva, D.F.; Grigorenko, E.L. The global prevalence of autism spectrum disorder: A three-level meta-analysis. Front. Psychiatry 2023, 14, 1071181. [Google Scholar] [CrossRef] [PubMed]
Salari, N.; Rasoulpoor, S.; Rasoulpoor, S.; Shohaimi, S.; Jafarpour, S.; Abdoli, N.; Khaledi-Paveh, B.; Mohammadi, M. The global prevalence of autism spectrum disorder: A comprehensive systematic review and meta-analysis. Ital. J. Pediatr. 2022, 48, 112. [Google Scholar] [CrossRef] [PubMed]
Connnect N Care ABA Client. Breaking Boundaries: Autism Prevalence by Country Revealed. 2024. Available online: https://www.connectncareaba.com/autism-prevalence-by-country (accessed on 13 March 2025).
Maenner, M.J. Prevalence and characteristics of autism spectrum disorder among children aged 8 years—Autism and Developmental Disabilities Monitoring Network, 11 sites, United States, 2020. MMWR Surveill. Summ. 2023, 72, 1–14. [Google Scholar] [CrossRef] [PubMed]
Antonio, U.H.S. Data of Interest: Autism Rates by Country-Hoffman Program for Chemical Intolerance. 2023. Available online: https://tiltresearch.org/2023/06/15/data-of-interest-autism-rates-by-country/ (accessed on 19 August 2024).
BioSpace. Autism Spectrum Disorder Treatment Market Size, Growth, Report 2022–2030. 2022. Available online: https://www.biospace.com/article/autism-spectrum-disorder-treatment-market-size-growth-report-2022-2030 (accessed on 13 March 2025).
Khodatars, M.; Shoeibi, A.; Sadeghi, D.; Ghaasemi, N.; Jafari, M.; Moridian, P.; Khadem, A.; Alizadehsani, R.; Zare, A.; Kong, Y.; et al. Deep learning for neuroimaging-based diagnosis and rehabilitation of autism spectrum disorder: A review. Comput. Biol. Med. 2021, 139, 104949. [Google Scholar] [CrossRef] [PubMed]
De Giacomo, A.; Fombonne, E. Parental recognition of developmental abnormalities in autism. Eur. Child Adolesc. Psychiatry 1998, 7, 131–136. [Google Scholar] [CrossRef] [PubMed]
Schopler, E.; Reichler, R.J.; DeVellis, R.F.; Daly, K. Toward objective classification of childhood autism: Childhood Autism Rating Scale (CARS). J. Autism Dev. Disord. 1980, 10, 91–103. [Google Scholar] [CrossRef] [PubMed]
Skuse, D.; Warrington, R.; Bishop, D.; Chowdhury, U.; Lau, J.; Mandy, W.; Place, M. The developmental, dimensional and diagnostic interview (3di): A novel computerized assessment for autism spectrum disorders. J. Am. Acad. Child Adolesc. Psychiatry 2004, 43, 548–558. [Google Scholar] [CrossRef] [PubMed]
Kohli, M.; Kar, A.K.; Sinha, S. The role of intelligent technologies in early detection of autism spectrum disorder (asd): A scoping review. IEEE Access 2022, 10, 104887–104913. [Google Scholar] [CrossRef]
Gillberg, C.; Gillberg, C.; Råstam, M.; Wentz, E. The Asperger Syndrome (and high-functioning autism) Diagnostic Interview (ASDI): A preliminary study of a new structured clinical interview. Autism 2001, 5, 57–66. [Google Scholar] [CrossRef] [PubMed]
Lecavalier, L. An evaluation of the Gilliam autism rating scale. J. Autism Dev. Disord. 2005, 35, 795–805. [Google Scholar] [CrossRef] [PubMed]
De Belen, R.A.J.; Bednarz, T.; Sowmya, A.; Del Favero, D. Computer vision in autism spectrum disorder research: A systematic review of published studies from 2009 to 2019. Transl. Psychiatry 2020, 10, 333. [Google Scholar] [CrossRef] [PubMed]
Noorbakhsh-Sabet, N.; Zand, R.; Zhang, Y.; Abedi, V. Artificial intelligence transforms the future of health care. Am. J. Med. 2019, 132, 795–801. [Google Scholar] [CrossRef] [PubMed]
Niu, K.; Guo, J.; Pan, Y.; Gao, X.; Peng, X.; Li, N.; Li, H. Multichannel deep attention neural networks for the classification of autism spectrum disorder using neuroimaging and personal characteristic data. Complexity 2020, 2020, 1357853. [Google Scholar] [CrossRef]
Hossain, M.D.; Kabir, M.A.; Anwar, A.; Islam, M.Z. Detecting autism spectrum disorder using machine learning techniques: An experimental analysis on toddler, child, adolescent and adult datasets. Health Inf. Sci. Syst. 2021, 9, 17. [Google Scholar] [CrossRef] [PubMed]
Thabtah, F. An accessible and efficient autism screening method for behavioural data and predictive analyses. Health Inform. J. 2019, 25, 1739–1755. [Google Scholar] [CrossRef] [PubMed]
Sapiro, G.; Hashemi, J.; Dawson, G. Computer vision and behavioral phenotyping: An autism case study. Curr. Opin. Biomed. Eng. 2019, 9, 14–20. [Google Scholar] [CrossRef] [PubMed]
Hyde, K.K.; Novack, M.N.; LaHaye, N.; Parlett-Pelleriti, C.; Anden, R.; Dixon, D.R.; Linstead, E. Applications of supervised machine learning in autism spectrum disorder research: A review. Rev. J. Autism Dev. Disord. 2019, 6, 128–146. [Google Scholar] [CrossRef]
Lamani, M.R.; Pernabas, J.B. A Thorough Review of Deep Learning in Autism Spectrum Disorder Detection: From Data to Diagnosis. Recent Adv. Comput. Sci. Commun. 2024, 17, 73–91. [Google Scholar] [CrossRef]
Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. Sn Comput. Sci. 2021, 2. [Google Scholar] [CrossRef] [PubMed]
Tufail, S.; Riggs, H.; Tariq, M.; Sarwat, A.I. Advancements and challenges in machine learning: A comprehensive review of models, libraries, applications, and algorithms. Electronics 2023, 12, 1789. [Google Scholar] [CrossRef]
Rahman, M.M.; Usman, O.L.; Muniyandi, R.C.; Sahran, S.; Mohamed, S.; Razak, R.A. A review of machine learning methods of feature selection and classification for autism spectrum disorder. Brain Sci. 2020, 10, 949. [Google Scholar] [CrossRef] [PubMed]
Feng, M.; Xu, J. Detection of ASD children through deep-learning application of fMRI. Children 2023, 10, 1654. [Google Scholar] [CrossRef] [PubMed]
Bezemer, M.; Blijd-Hoogewys, E.; Meek-Heekelaar, M. The Predictive Value of the AQ and the SRS-A in the Diagnosis of ASD in Adults in Clinical Practice. J. Autism Dev. Disord. 2021, 51, 2402–2415. [Google Scholar] [CrossRef] [PubMed]
Amit, G.; Bilu, Y.; Sudry, T.; Tsadok, M.A.; Zimmerman, D.R.; Baruch, R.; Kasir, N.; Akiva, P.; Sadaka, Y. Early Prediction of Autistic Spectrum Disorder Using Developmental Surveillance Data. JAMA Netw. Open 2024, 7, e2351052. [Google Scholar] [CrossRef] [PubMed]
Islam, M.M.; Karray, F.; Alhajj, R.; Zeng, J. A Review on Deep Learning Techniques for the Diagnosis of Novel Coronavirus (COVID-19). IEEE Access 2021, 9, 30551–30572. [Google Scholar] [CrossRef] [PubMed]
de Barros, F.R.D.; da Silva, C.N.F.; de Castro Michelassi, G.; Brentani, H.; Nunes, F.L.; Machado-Lima, A. Computer aided diagnosis of neurodevelopmental disorders and genetic syndromes based on facial images—A systematic literature review. Heliyon 2023, 9, e20517. [Google Scholar] [CrossRef] [PubMed]
Liu, X.; Hasan, M.R.; Gedeon, T.; Hossain, M.Z. MADE-for-ASD: A multi-atlas deep ensemble network for diagnosing Autism Spectrum Disorder. Comput. Biol. Med. 2024, 182, 109083. [Google Scholar] [CrossRef] [PubMed]
Fan, Y.; Xiong, H.; Sun, G. DeepASDPred: A CNN-LSTM-based deep learning method for Autism spectrum disorders risk RNA identification. BMC Bioinform. 2023, 24, 261. [Google Scholar] [CrossRef] [PubMed]
Simeoli, R.; Rega, A.; Cerasuolo, M.; Nappo, R.; Marocco, D. Using machine learning for motion analysis to early detect autism spectrum disorder: A systematic review. Rev. J. Autism Dev. Disord. 2024, 1–20. [Google Scholar] [CrossRef]
Huda, S.; Khan, D.M.; Masroor, K.; Warda; Rashid, A.; Shabbir, M. Advancements in automated diagnosis of autism spectrum disorder through deep learning and resting-state functional mri biomarkers: A systematic review. Cogn. Neurodyn. 2024, 18, 3585–3601. [Google Scholar] [CrossRef] [PubMed]
Dcouto, S.S.; Pradeepkandhasamy, J. Multimodal Deep Learning in Early Autism Detection—Recent Advances and Challenges. Eng. Proc. 2024, 59, 205. [Google Scholar] [CrossRef]
Song, D.Y.; Kim, S.Y.; Bong, G.; Kim, J.M.; Yoo, H.J. The use of artificial intelligence in screening and diagnosis of autism spectrum disorder: A literature review. J. Korean Acad. Child Adolesc. Psychiatry 2019, 30, 145. [Google Scholar] [CrossRef] [PubMed]
Minissi, M.E.; Chicchi Giglioli, I.A.; Mantovani, F.; Alcaniz Raya, M. Assessment of the autism spectrum disorder based on machine learning and social visual attention: A systematic review. J. Autism Dev. Disord. 2022, 52, 2187–2202. [Google Scholar] [CrossRef] [PubMed]
Jeyarani, R.A.; Senthilkumar, R. Eye tracking biomarkers for autism spectrum disorder detection using machine learning and deep learning techniques. Res. Autism Spectr. Disord. 2023, 108, 102228. [Google Scholar] [CrossRef]
Joudar, S.S.; Albahri, A.S.; Hamid, R.A. Triage and priority-based healthcare diagnosis using artificial intelligence for autism spectrum disorder and gene contribution: A systematic review. Comput. Biol. Med. 2022, 146, 105553. [Google Scholar] [CrossRef] [PubMed]
Parlett-Pelleriti, C.M.; Stevens, E.; Dixon, D.; Linstead, E.J. Applications of unsupervised machine learning in autism spectrum disorder research: A review. Rev. J. Autism Dev. Disord. 2023, 10, 406–421. [Google Scholar] [CrossRef]
Uddin, M.Z.; Shahriar, M.A.; Mahamood, M.N.; Alnajjar, F.; Pramanik, M.I.; Ahad, M.A.R. Deep learning with image-based autism spectrum disorder analysis: A systematic review. Eng. Appl. Artif. Intell. 2024, 127, 107185. [Google Scholar] [CrossRef]
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef] [PubMed]
Frank, R.A.; Bossuyt, P.M.; McInnes, M.D. Systematic reviews and meta-analyses of diagnostic test accuracy: The PRISMA-DTA statement. Radiology 2018, 289. [Google Scholar] [CrossRef] [PubMed]
PRISMA. PRISMA 2020 Statement Paper. 2020. Available online: https://www.prisma-statement.org/prisma-2020-statement (accessed on 1 June 2025).
Helmy, E.; Elnakib, A.; ElNakieb, Y.; Khudri, M.; Abdelrahim, M.; Yousaf, J.; Ghazal, M.; Contractor, S.; Barnes, G.N.; El-Baz, A. Role of artificial intelligence for autism diagnosis using DTI and fMRI: A survey. Biomedicines 2023, 11, 1858. [Google Scholar] [CrossRef] [PubMed]
Abraham, A.; Milham, M.P.; Di Martino, A.; Craddock, R.C.; Samaras, D.; Thirion, B.; Varoquaux, G. Deriving reproducible biomarkers from multi-site resting-state data: An Autism-based example. NeuroImage 2017, 147, 736–745. [Google Scholar] [CrossRef] [PubMed]
Saad, M.; Islam, S.M.R. Brain connectivity network analysis and classifications from diffusion tensor imaging. In Proceedings of the 2019 International Conference on Robotics, Electrical and Signal Processing Techniques (ICREST), Dhaka, Bangladesh, 10–12 January 2019; pp. 422–427. [Google Scholar] [CrossRef]
Li, J.; Kong, X.; Sun, L.; Chen, X.; Ouyang, G.; Li, X.; Chen, S. Identification of autism spectrum disorder based on electroencephalography: A systematic review. Comput. Biol. Med. 2024, 108075. [Google Scholar] [CrossRef] [PubMed]
Rogala, J.; Żygierewicz, J.; Malinowska, U.; Cygan, H.; Stawicka, E.; Kobus, A.; Vanrumste, B. Enhancing autism spectrum disorder classification in children through the integration of traditional statistics and classical machine learning techniques in EEG analysis. Sci. Rep. 2023, 13, 21748. [Google Scholar] [CrossRef] [PubMed]
Lim, J.Z.; Mountstephens, J.; Teo, J. Eye-tracking feature extraction for biometric machine learning. Front. Neurorobot. 2022, 15, 796895. [Google Scholar] [CrossRef] [PubMed]
Meng, F.; Li, F.; Wu, S.; Yang, T.; Xiao, Z.; Zhang, Y.; Liu, Z.; Lu, J.; Luo, X. Machine learning-based early diagnosis of autism according to eye movements of real and artificial faces scanning. Front. Neurosci. 2023, 17, 1170951. [Google Scholar] [CrossRef] [PubMed]
Alam, M.S.; Rashid, M.M.; Roy, R.; Faizabadi, A.R.; Gupta, K.D.; Ahsan, M.M. Empirical study of autism spectrum disorder diagnosis using facial images by improved transfer learning approach. Bioengineering 2022, 9, 710. [Google Scholar] [CrossRef] [PubMed]
Awaji, B.; Senan, E.M.; Olayah, F.; Alshari, E.A.; Alsulami, M.; Abosaq, H.A.; Alqahtani, J.; Janrao, P. Hybrid techniques of facial feature image analysis for early detection of autism spectrum disorder based on combined CNN features. Diagnostics 2023, 13, 2948. [Google Scholar] [CrossRef] [PubMed]
Chaplot, N.; Pandey, D.; Kumar, Y.; Sisodia, P.S. A comprehensive analysis of artificial intelligence techniques for the prediction and prognosis of genetic disorders using various gene disorders. Arch. Comput. Methods Eng. 2023, 30, 3301–3323. [Google Scholar] [CrossRef]
Nahas, L.D.; Datta, A.; Alsamman, A.M.; Adly, M.H.; Al-Dewik, N.; Sekaran, K.; Sasikumar, K.; Verma, K.; Doss, G.P.C.; Zayed, H. Genomic insights and advanced machine learning: Characterizing autism spectrum disorder biomarkers and genetic interactions. Metab. Brain Dis. 2024, 39, 29–42. [Google Scholar] [CrossRef] [PubMed]
Uddin, M.; Wang, Y.; Woodbury-Smith, M. Artificial intelligence for precision medicine in neurodevelopmental disorders. NPJ Digit. Med. 2019, 2, 112. [Google Scholar] [CrossRef] [PubMed]
Hyman, S.L.; Levy, S.E.; Myers, S.M.; Kuo, D.Z.; Apkon, S.; Davidson, L.F.; Ellerbeck, K.A.; Foster, J.E.; Noritz, G.H.; Leppert, M.O.; et al. Identification, evaluation, and management of children with autism spectrum disorder. Pediatrics 2020, 145, e20193447. [Google Scholar] [CrossRef] [PubMed]
Frazier, T.W.; Klingemier, E.W.; Beukemann, M.; Speer, L.; Markowitz, L.; Parikh, S.; Wexberg, S.; Giuliano, K.; Schulte, E.; Delahunty, C.; et al. Development of an objective autism risk index using remote eye tracking. J. Am. Acad. Child Adolesc. Psychiatry 2016, 55, 301–309. [Google Scholar] [CrossRef] [PubMed]
Shahamiri, S.R.; Thabtah, F. Autism AI: A new autism screening system based on artificial intelligence. Cogn. Comput. 2020, 12, 766–777. [Google Scholar] [CrossRef]
Erkan, U.; Thanh, D.N. Autism spectrum disorder detection with machine learning methods. Curr. Psychiatry Res. Rev. 2019, 15, 297–308. [Google Scholar] [CrossRef]
Akter, T.; Satu, M.S.; Khan, M.I.; Ali, M.H.; Uddin, S.; Lio, P.; Quinn, J.M.; Moni, M.A. Machine learning-based models for early stage detection of autism spectrum disorders. IEEE Access 2019, 7, 166509–166527. [Google Scholar] [CrossRef]
Raj, S.; Masood, S. Analysis and detection of autism spectrum disorder using machine learning techniques. Procedia Comput. Sci. 2020, 167, 994–1004. [Google Scholar] [CrossRef]
Vakadkar, K.; Purkayastha, D.; Krishnan, D. Detection of autism spectrum disorder in children using machine learning techniques. SN Comput. Sci. 2021, 2, 386. [Google Scholar] [CrossRef] [PubMed]
Bala, M.; Ali, M.H.; Satu, M.S.; Hasan, K.F.; Moni, M.A. Efficient machine learning models for early stage detection of autism spectrum disorder. Algorithms 2022, 15, 166. [Google Scholar] [CrossRef]
Kumar, C.J.; Das, P.R. The diagnosis of ASD using multiple machine learning techniques. Int. J. Dev. Disabil. 2022, 68, 973–983. [Google Scholar] [CrossRef] [PubMed]
Farooq, M.S.; Tehseen, R.; Sabir, M.; Atal, Z. Detection of autism spectrum disorder (ASD) in children and adults using machine learning. Sci. Rep. 2023, 13, 9605. [Google Scholar] [CrossRef] [PubMed]
Khudhur, D.D.; Khudhur, S.D. The classification of autism spectrum disorder by machine learning methods on multiple datasets for four age groups. Meas. Sens. 2023, 27, 100774. [Google Scholar] [CrossRef]
Mukherjee, P.; Godse, M.; Chakraborty, B. Detection of Autism Spectrum Disorder (ASD) Symptoms using LSTM Model. WSEAS Trans. Biol. Biomed. 2024, 21, 40–54. [Google Scholar] [CrossRef]
Rasul, R.A.; Saha, P.; Bala, D.; Karim, S.R.U.; Abdullah, M.I.; Saha, B. An evaluation of machine learning approaches for early diagnosis of autism spectrum disorder. Healthc. Anal. 2024, 5, 100293. [Google Scholar] [CrossRef]
Priya, K.L.; Jyothirmai, I.; Akshaya, G.; Chowdary, C.S. Identification of Autism in Children Using Static Facial Features and Deep Neural Networks. Turk. J. Comput. Math. Educ. (TURCOMAT) 2023, 14, 704–715. [Google Scholar]
Golden Steps ABA. Facial Features & Physical Characteristics of Autism. 2025. Available online: https://www.goldenstepsaba.com/resources/facial-features-autism (accessed on 19 August 2024).
Therapy, G.C. Autism Facial Features|Golden Care Therapy. 2022. Available online: https://goldencaretherapy.com/autism-facial-features/, (accessed on 19 August 2024).
Farhat, T.; Akram, S.; AlSagri, H.S.; Ali, Z.; Ahmad, A.; Jaffar, A. Facial Image-Based Autism Detection: A Comparative Study of Deep Neural Network Classifiers. Comput. Mater. Contin. 2024, 78, 105–126. [Google Scholar] [CrossRef]
Liu, W.; Li, M.; Yi, L. Identifying children with autism spectrum disorder based on their face processing abnormality: A machine learning framework. Autism Res. 2016, 9, 888–898. [Google Scholar] [CrossRef] [PubMed]
Lu, A.; Perkowski, M. Deep learning approach for screening autism spectrum disorder in children with facial images and analysis of ethnoracial factors in model development and application. Brain Sci. 2021, 11, 1446. [Google Scholar] [CrossRef] [PubMed]
Mujeeb Rahman, K.; Subashini, M.M. Identification of autism in children using static facial features and deep neural networks. Brain Sci. 2022, 12, 94. [Google Scholar] [CrossRef] [PubMed]
Gaddala, L.K.; Kodepogu, K.R.; Surekha, Y.; Tejaswi, M.; Ameesha, K.; Kollapalli, L.S.; Kotha, S.K.; Manjeti, V.B. Autism Spectrum Disorder Detection Using Facial Images and Deep Convolutional Neural Networks. Rev. D’Intelligence Artif. 2023, 37, 801–806. [Google Scholar] [CrossRef]
Alkahtani, H.; Aldhyani, T.H.; Alzahrani, M.Y. Deep learning algorithms to identify autism spectrum disorder in children-based facial landmarks. Appl. Sci. 2023, 13, 4855. [Google Scholar] [CrossRef]
Li, Y.; Huang, W.C.; Song, P.H. A face image classification method of autistic children based on the two-phase transfer learning. Front. Psychol. 2023, 14, 1226470. [Google Scholar] [CrossRef] [PubMed]
Reddy, P. Diagnosis of Autism in Children Using Deep Learning Techniques by Analyzing Facial Features. Eng. Proc. 2024, 59, 198. [Google Scholar] [CrossRef]
Peng, X.; Li, Y.; Murphey, Y.L.; Luo, J. Domain adaptation by stacked local constraint auto-encoder learning. IEEE Access 2019, 7, 108248–108260. [Google Scholar] [CrossRef]
Maji, S.; Malik, J. Fast and Accurate Digit Classification; Tech. Rep. UCB/EECS-2009-159; EECS Department, University of California: Berkeley, CA, USA, 2009; Available online: https://www2.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-159.pdf (accessed on 8 March 2025).
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
Wightman, R.; Touvron, H.; Jégou, H. Resnet strikes back: An improved training procedure in timm. arXiv 2021. [Google Scholar] [CrossRef]
Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar] [CrossRef]
Cherney, K. MRI vs. X-Ray: Pros, Cons, Costs & More. Available online: https://www.healthline.com/health/mri-vs-xray#vs-ct-scans (accessed on 19 August 2024).
Dichter, G.S. Functional magnetic resonance imaging of autism spectrum disorders. Dialogues Clin. Neurosci. 2012, 14, 319–351. [Google Scholar] [CrossRef] [PubMed]
Stigler, K.A.; McDonald, B.C.; Anand, A.; Saykin, A.J.; McDougle, C.J. Structural and functional magnetic resonance imaging of autism spectrum disorders. Brain Res. 2011, 1380, 146–161. [Google Scholar] [CrossRef] [PubMed]
Mengi, M.; Malhotra, D. SSMDA: Semi-supervised multi-source domain adaptive autism prediction model using neuroimaging. Biomed. Signal Process. Control 2024, 95, 106337. [Google Scholar] [CrossRef]
Liang, D.; Liang, S.; Xiaa, S.; Zhang, X.; Yind, Q. Diagnosis Classification of Autism in Ages 7-15 Years Based on Deep Learning. In Volume 385: Artificial Intelligence and Human-Computer Interaction; IOP: Bristol, UK, 2024. [Google Scholar] [CrossRef]
Heinsfeld, A.S.; Franco, A.R.; Craddock, R.C.; Buchweitz, A.; Meneguzzi, F. Identification of autism spectrum disorder using deep learning and the ABIDE dataset. NeuroImage Clin. 2018, 17, 16–23. [Google Scholar] [CrossRef] [PubMed]
Li, X.; Dvornek, N.C.; Zhuang, J.; Ventola, P.; Duncan, J.S. Brain biomarker interpretation in ASD using deep learning and fMRI. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2018: 21st International Conference, Granada, Spain, 16–20 September 2018; pp. 206–214. [Google Scholar] [CrossRef]
Mostafa, S.; Tang, L.; Wu, F.X. Diagnosis of autism spectrum disorder based on eigenvalues of brain networks. IEEE Access 2019, 7, 128474–128486. [Google Scholar] [CrossRef]
Eslami, T.; Saeed, F. Auto-ASD-network: A technique based on deep learning and support vector machines for diagnosing autism spectrum disorder using fMRI data. In Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Niagara Falls, NY, USA, 7–10 September 2019; pp. 646–651. [Google Scholar] [CrossRef]
Sharif, H.; Khan, R.A. A novel machine learning based framework for detection of autism spectrum disorder (ASD). Appl. Artif. Intell. 2022, 36, 2004655. [Google Scholar] [CrossRef]
Othmani, A.; Bizet, T.; Pellerin, T.; Hamdi, B.; Bock, M.A.; Dev, S. Significant cc400 functional brain parcellations based lenet5 convolutional neural network for autism spectrum disorder detection. In Proceedings of the International Conference on Recent Trends in Image Processing and Pattern Recognition, Kingsville, TX, USA, 1–2 December 2022; pp. 34–45. [Google Scholar] [CrossRef]
Zhang, J.; Feng, F.; Han, T.; Gong, X.; Duan, F. Detection of autism spectrum disorder using fMRI functional connectivity with feature selection and deep learning. Cogn. Comput. 2023, 15, 1106–1117. [Google Scholar] [CrossRef]
Park, K.W.; Cho, S.B. A residual graph convolutional network with spatio-temporal features for autism classification from fMRI brain images. Appl. Soft Comput. 2023, 142, 110363. [Google Scholar] [CrossRef]
Bahathiq, R.A.; Banjar, H.; Jarraya, S.K.; Bamaga, A.K.; Almoallim, R. Efficient diagnosis of autism spectrum disorder using optimized machine learning models based on structural MRI. Appl. Sci. 2024, 14, 473. [Google Scholar] [CrossRef]
Li, G.; Ji, Z.; Sun, Q. Deep Multi-Instance Conv-Transformer Frameworks for Landmark-Based Brain MRI Classification. Electronics 2024, 13, 980. [Google Scholar] [CrossRef]
Wang, C.; Xiao, Z.; Xu, Y.; Zhang, Q.; Chen, J. A novel approach for ASD recognition based on graph attention networks. Front. Comput. Neurosci. 2024, 18, 1388083. [Google Scholar] [CrossRef] [PubMed]
Wang, W.; Xiao, L.; Qu, G.; Calhoun, V.D.; Wang, Y.P.; Sun, X. Multiview hyperedge-aware hypergraph embedding learning for multisite, multiatlas fMRI based functional connectivity network analysis. Med. Image Anal. 2024, 94, 103144. [Google Scholar] [CrossRef] [PubMed]
Falck-Ytter, T.; Bölte, S.; Gredebäck, G. Eye tracking in early autism research. J. Neurodev. Disord. 2013, 5, 1–13. [Google Scholar] [CrossRef] [PubMed]
Zhou, W.; Yang, M.; Tang, J.; Wang, J.; Hu, B. Gaze Patterns in Children With Autism Spectrum Disorder to Emotional Faces: Scanpath and Similarity. IEEE Trans. Neural Syst. Rehabil. Eng. 2024, 32, 865–874. [Google Scholar] [CrossRef] [PubMed]
Alsaidi, M.; Obeid, N.; Al-Madi, N.; Hiary, H.; Aljarah, I. A Convolutional Deep Neural Network Approach to Predict Autism Spectrum Disorder Based on Eye-Tracking Scan Paths. Information 2024, 15, 133. [Google Scholar] [CrossRef]
Fabiano, D.; Canavan, S.; Agazzi, H.; Hinduja, S.; Goldgof, D. Gaze-based classification of autism spectrum disorder. Pattern Recognit. Lett. 2020, 135, 204–212. [Google Scholar] [CrossRef]
Cilia, F.; Carette, R.; Elbattah, M.; Dequen, G.; Guérin, J.L.; Bosche, J.; Vandromme, L.; Le Driant, B. Computer-aided screening of autism spectrum disorder: Eye-tracking study using data visualization and deep learning. JMIR Hum. Factors 2021, 8, e27706. [Google Scholar] [CrossRef] [PubMed]
Ahmed, I.A.; Senan, E.M.; Rassem, T.H.; Ali, M.A.; Shatnawi, H.S.A.; Alwazer, S.M.; Alshahrani, M. Eye tracking-based diagnosis and early detection of autism spectrum disorder using machine learning and deep learning techniques. Electronics 2022, 11, 530. [Google Scholar] [CrossRef]
Kanhirakadavath, M.R.; Chandran, M.S.M. Investigation of eye-tracking scan path as a biomarker for autism screening using machine learning algorithms. Diagnostics 2022, 12, 518. [Google Scholar] [CrossRef] [PubMed]
Ahmed, Z.A.; Albalawi, E.; Aldhyani, T.H.; Jadhav, M.E.; Janrao, P.; Obeidat, M.R.M. Applying eye tracking with deep learning techniques for early-stage detection of autism spectrum disorders. Data 2023, 8, 168. [Google Scholar] [CrossRef]
Thanarajan, T.; Alotaibi, Y.; Rajendran, S.; Nagappan, K. Eye-Tracking Based Autism Spectrum Disorder Diagnosis Using Chaotic Butterfly Optimization with Deep Learning Model. Comput. Mater. Contin. 2023, 76, 1995–2013. [Google Scholar] [CrossRef]
Milovanovic, M.; Grujicic, R. Electroencephalography in assessment of autism spectrum disorders: A review. Front. Psychiatry 2021, 12, 686021. [Google Scholar] [CrossRef] [PubMed]
Melinda, M.; Arnia, F.; Yafi, A.; Andryani, N.A.C.; Enriko, I.K.A. Design and Implementation of Mobile Application for CNN-Based EEG Identification of Autism Spectrum Disorder. Int. J. Adv. Sci. Eng. Inf. Technol. 2024, 14, 57. [Google Scholar] [CrossRef]
Xu, Y.; Yu, Z.; Li, Y.; Liu, Y.; Li, Y.; Wang, Y. Autism spectrum disorder diagnosis with EEG signals using time series maps of brain functional connectivity and a combined CNN–LSTM model. Comput. Methods Prog. Biomed. 2024, 250, 108196. [Google Scholar] [CrossRef] [PubMed]
Billeci, L.; Sicca, F.; Maharatna, K.; Apicella, F.; Narzisi, A.; Campatelli, G.; Calderoni, S.; Pioggia, G.; Muratori, F. On the application of quantitative EEG for characterizing autistic brain: A systematic review. Front. Hum. Neurosci. 2013, 7, 442. [Google Scholar] [CrossRef] [PubMed]
Radhakrishnan, M.; Ramamurthy, K.; Choudhury, K.K.; Won, D.; Manoharan, T.A. Performance analysis of deep learning models for detection of autism spectrum disorder from EEG signals. Trait. Du Signal 2021, 38, 853–863. [Google Scholar] [CrossRef]
Alhassan, S.; Soudani, A.; Almusallam, M. Energy-efficient EEG-based scheme for autism spectrum disorder detection using wearable sensors. Sensors 2023, 23, 2228. [Google Scholar] [CrossRef] [PubMed]
Menaka, R.; Karthik, R.; Saranya, S.; Niranjan, M.; Kabilan, S. An improved AlexNet model and cepstral coefficient-based classification of autism using EEG. Clin. EEG Neurosci. 2024, 55, 43–51.
Al-Qazzaz, N.K.; Aldoori, A.A.; Buniya, A.; Ali, S.H.B.M.; Ahmad, S.A. Transfer Learning and Hybrid Deep Convolutional Neural Networks Models for Autism Spectrum Disorder Classification from EEG Signals. IEEE Access 2024, 12, 64510–64530.
Ullah, M.Z.; Yu, D. Grid-tuned ensemble models for 2D spectrogram-based autism classification. Biomed. Signal Process. Control 2024, 93, 106151.
Vorstman, J.A.; Parr, J.R.; Moreno-De-Luca, D.; Anney, R.J.; Nurnberger Jr, J.I.; Hallmayer, J.F. Autism genetics: Opportunities and challenges for clinical translation. Nat. Rev. Genet. 2017, 18, 362–376.
Geschwind, D.H. Genetics of autism spectrum disorders. Trends Cogn. Sci. 2011, 15, 409–416.
Rylaarsdam, L.; Guemez-Gamboa, A. Genetic causes and modifiers of autism spectrum disorder. Front. Cell. Neurosci. 2019, 13, 385.
Saleem, S.; Habib, S.H. Implications of Genetic Factors and Modifiers in Autism Spectrum Disorders: A Systematic Review. Rev. J. Autism Dev. Disord. 2024, 11, 172–183.
Abdullah, A.A.; Rijal, S.; Dash, S.R. Evaluation on machine learning algorithms for classification of autism spectrum disorder (ASD). J. Phys. Conf. Ser. 2019, 1372, 012052.
Gök, M. A novel machine learning model to predict autism spectrum disorders risk gene. Neural Comput. Appl. 2019, 31, 6711–6717.
Wang, M.; Doenyas, C.; Wan, J.; Zeng, S.; Cai, C.; Zhou, J.; Liu, Y.; Yin, Z.; Zhou, W. Virulence factor-related gut microbiota genes and immunoglobulin A levels as novel markers for machine learning-based classification of autism spectrum disorder. Comput. Struct. Biotechnol. J. 2021, 19, 545–554.
Lin, Y.; Yerukala Sathipati, S.; Ho, S.Y. Predicting the risk genes of autism spectrum disorders. Front. Genet. 2021, 12, 665469.
Alsuliman, M.; Al-Baity, H.H. Efficient diagnosis of autism with optimized machine learning models: An experimental analysis on genetic and personal characteristic datasets. Appl. Sci. 2022, 12, 3812.
Suratanee, A.; Plaimas, K. Gene association classification for autism spectrum disorder: Leveraging gene embedding and differential gene expression profiles to identify disease-related genes. Appl. Sci. 2023, 13, 8980.
Di Martino, A.; Yan, C.G.; Li, Q.; Denio, E.; Castellanos, F.X.; Alaerts, K.; Anderson, J.S.; Assaf, M.; Bookheimer, S.Y.; Dapretto, M.; et al. The autism brain imaging data exchange: Towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol. Psychiatry 2014, 19, 659–667.
Liu, M.; Li, B.; Hu, D. Autism spectrum disorder studies using fMRI data and machine learning: A review. Front. Neurosci. 2021, 15, 697870.
Priyadarshini, I. Autism screening in toddlers and adults using deep learning and fair AI techniques. Future Internet 2023, 15, 292.
Francese, R.; Yang, X. Supporting autism spectrum disorder screening and intervention with machine learning and wearables: A systematic literature review. Complex Intell. Syst. 2022, 8, 3659–3674.
Cilia, F.; Carette, R.; Elbattah, M.; Guérin, J.L.; Dequen, G. Eye-tracking dataset to support the research on autism spectrum disorder. In Proceedings of the 1st Workshop on Scarce Data in Artificial Intelligence for Healthcare—SDAIH; SciTePress: Setúbal, Portugal, 2022.
Guillon, Q.; Hadjikhani, N.; Baduel, S.; Rogé, B. Visual social attention in autism spectrum disorder: Insights from eye tracking studies. Neurosci. Biobehav. Rev. 2014, 42, 279–297.
Torgerson, C.M.; Quinn, C.; Dinov, I.; Liu, Z.; Petrosyan, P.; Pelphrey, K.; Haselgrove, C.; Kennedy, D.N.; Toga, A.W.; Van Horn, J.D. Interacting with the National Database for Autism Research (NDAR) via the LONI Pipeline workflow environment. Brain Imaging Behav. 2015, 9, 89–103.
Hall, D.; Huerta, M.F.; McAuliffe, M.J.; Farber, G.K. Sharing heterogeneous data: The national database for autism research. Neuroinformatics 2012, 10, 331–339.
Weiner, D.J.; Ling, E.; Erdin, S.; Tai, D.J.; Yadav, R.; Grove, J.; Fu, J.M.; Nadig, A.; Carey, C.E.; Baya, N.; et al. Statistical and functional convergence of common and rare genetic influences on autism at chromosome 16p. Nat. Genet. 2022, 54, 1630–1639.
Berto, S.; Treacher, A.H.; Caglayan, E.; Luo, D.; Haney, J.R.; Gandal, M.J.; Geschwind, D.H.; Montillo, A.A.; Konopka, G. Association between resting-state functional brain connectivity and gene expression is altered in autism spectrum disorder. Nat. Commun. 2022, 13, 3328.
Siegel, M.; Smith, K.A.; Mazefsky, C.; Gabriels, R.L.; Erickson, C.; Kaplan, D.; Morrow, E.M.; Wink, L.; Santangelo, S.L. The autism inpatient collection: Methods and preliminary sample description. Mol. Autism 2015, 6, 61.
Roy, Y.; Banville, H.; Albuquerque, I.; Gramfort, A.; Falk, T.H.; Faubert, J. Deep learning-based electroencephalography analysis: A systematic review. J. Neural Eng. 2019, 16, 051001.
Ramirez-Quintana, J.A.; Madrid-Herrera, L.; Chacon-Murguia, M.I.; Corral-Martínez, L.F. Brain-Computer Interface System Based on P300 Processing with Convolutional Neural Network, Novel Speller, and Low Number of Electrodes. Cogn. Comput. 2020, 13, 108–124.
Simões, M.; Borra, D.; Santamaría-Vázquez, E.; Bittencourt-Villalpando, M.; Krzemiński, D.; Miladinović, A.; Schmid, T.; Zhao, H.; Amaral, C.P.; Direito, B.; et al. BCIAUT-P300: A Multi-Session and Multi-Subject Benchmark Dataset on Autism for P300-Based Brain-Computer-Interfaces. Front. Neurosci. 2020, 14, 568104.
Peketi, S.; Dhok, S.B. Machine Learning Enabled P300 Classifier for Autism Spectrum Disorder Using Adaptive Signal Decomposition. Brain Sci. 2023, 13, 315.
Zhang, Y.; Liu, X.; Guo, R.; Xu, W.; Guo, Q.; Hao, C.; Ni, X.; Li, W. Biological implications of genetic variations in autism spectrum disorders from genomics studies. Biosci. Rep. 2021, 41, BSR20210593.
Trost, B.; Thiruvahindrapuram, B.; Chan, A.J.S.; Engchuan, W.; Higginbotham, E.J.; Howe, J.L.; Loureiro, L.; Reuter, M.S.; Roshandel, D.; Whitney, J.; et al. Genomic architecture of autism spectrum disorder from comprehensive whole-genome sequence annotation. medRxiv 2022.
Baio, J. Prevalence of autism spectrum disorder among children aged 8 years—Autism and developmental disabilities monitoring network, 11 sites, United States, 2014. MMWR. Surveill. Summ. 2018, 67, 1–23.
Wang, J.; Perez, L. The effectiveness of data augmentation in image classification using deep learning. Convolutional Neural Netw. Vis. Recognit. 2017, 11, 1–8.
Frid-Adar, M.; Diamant, I.; Klang, E.; Amitai, M.; Goldberger, J.; Greenspan, H. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing 2018, 321, 321–331.
Bougeard, C.; Picarel-Blanchot, F.; Schmid, R.; Campbell, R.; Buitelaar, J. Prevalence of autism spectrum disorder and co-morbidities in children and adolescents: A systematic literature review. Front. Psychiatry 2021, 12, 744709.
Valliani, A.A.A.; Ranti, D.; Oermann, E.K. Deep learning and neurology: A systematic review. Neurol. Ther. 2019, 8, 351–365.
MacIntyre, M.R.; Cockerill, R.G.; Mirza, O.F.; Appel, J.M. Ethical considerations for the use of artificial intelligence in medical decision-making capacity assessments. Psychiatry Res. 2023, 328, 115466.
Rehan, H. AI-Driven Cloud Security: The Future of Safeguarding Sensitive Data in the Digital Age. J. Artif. Intell. Gen. Sci. (JAIGS) 2024, 1, 132–151.
Hu, X.; Chu, L.; Pei, J.; Liu, W.; Bian, J. Model complexity of deep learning: A survey. Knowl. Inf. Syst. 2021, 63, 2585–2619.

Figure 1. Global ASD treatment market size (in USD billion) [11].

Figure 2. PRISMA flow diagram illustrating the study-selection process.

Figure 3. A bar plot with the number of works in our review related to each modality.

Table 1. A summary of works using AI techniques for detecting ASD through questionnaire data.

Citation	Dataset	Technique	Reported Metrics	Limitations
[64]	Autism Screening Adult, Autistic Spectrum Disorder Screening Data for Children, Autistic Spectrum Disorder Screening Data for Adolescence Data Set (public)	kNN, SVM, and RF	Accuracy: 100.00% Sensitivity: 100.00% F-Score: 100.00% AUC: 100.00%	Limited feature exploration, generalizability concerns, and demographic constraints.
[65]	ASD Dataset of toddler, child, adolescent, adult and ASDTest App: Public	Adaboost, FDA, C5.0, Glmboost, LDA, MDA, PDA, SVM and CART	Accuracy: 98.36% AUROC: 100.00%	Model generalization, and feature interpretation.
[66]	ASD dataset for Children, Adult and Adolescent, Attributes based dataset	LR, SVM, NB, kNN, ANN, CNN	Accuracy: 99.53% Sensitivity: 99.30% Specificity: 100.00%	Small dataset, feature interpretation, class imbalance, and preprocessing limitations.
[22]	ASD Dataset, Autism detection in adults dataset (public)	ANN, SVM, Multi-Layer Perceptron (MLP), CNN, RF, and LR	Accuracy: 100.00% Recall: 100.00%	Age and race bias, lack of dataset validation, and restricted generalization.
[67]	Autism Screening dataset for toddlers (public)	SVM, RF, NB, LR, and kNN	Accuracy: 97.15% F1-Score: 98.00%	Generalizability concerns.
[68]	ASD dataset toddlers, children, youths, and adults	NB, Bagging Classifier, Random Tree, SVM, kNN, Classification and Regression Tree (CART), DT, k-Star (kS)	Accuracy: 99.61% Kappa Stat.: 99.21% F1-Score: 99.60% AUROC: 99.60%	Data source bias, limited dataset diversity, lack of model comprehensibility, and restricted clinical applicability.
[69]	Autistic Spectrum Disorder Screening Data for adults	ANN, SVM, DT, RF, LR	Accuracy: 99.40%	Restricted model variety, underutilized ML models, and potential for model expansion.
[70]	Autism Screening dataset for toddlers, Autism screening on adults dataset, Autism screening dataset, ASD dataset (public)	FL with SVM and LR	Accuracy: 99.00% Precision: 97.00% Recall: 58.00% F1-Score: 73.00%	Insufficient data instances, risk of misclassification, diagnostic complexity, and limited real-world validation.
[71]	ASD datasets for children, adult, teenagers, and adolescent, Questionnaire based datasets: Public	DT, kNN, LR, NB, RF, SVM	Accuracy: 100.00% Precision: 100.00% Recall: 100.00% F1-Score: 100.00%	Class imbalance, no sampling techniques, and lack of feature extraction.
[72]	Self-collected ASD dataset based on sentiments: Private	LSTM	Accuracy: 97.00% Precision: 97.00% Recall: 97.00% F1-Score: 97.00%	Unspecified dataset size, missing feature interpretation, and real time detection.
[73]	ASD Children and Adult dataset, questionnaire-based instances: Public	NB, kNN, LR, SVM, DT, ANN, RF, BIRCH, XG Boost	Accuracy: 94.28% Precision: 89.90% Recall: 92.00% F1-score: 91.19% Specificity: 95.05% Kappa: 86.95% AUC: 98.91%	Small size of data, lack of augmentation, and diversity constraints.
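
Because several rows above report precision, recall, and F1-score together, a reader can sanity-check them against one another. The following is a minimal sketch, assuming the reported F1-score is the plain harmonic mean of the reported precision and recall (the cited works may instead average per class or per fold); the function name `f1_score` is illustrative and does not come from the reviewed papers.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for [70] in Table 1, converted from percentages to fractions.
reported_precision = 0.97
reported_recall = 0.58
reported_f1 = 0.73

recomputed = f1_score(reported_precision, reported_recall)
print(f"recomputed F1 = {recomputed:.4f}, reported F1 = {reported_f1:.2f}")
```

Under that assumption the recomputed value is roughly 0.726, which rounds to the 73.00% listed for [70]. It also shows how a high accuracy (99.00%) can coexist with a low recall.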

Table 2. A summary of works using AI techniques for detecting ASD through facial image data.

Citation	Dataset	Technique	Reported Metrics	Limitations
[78]	Self-collected dataset of Chinese children with ASD (private)	k-means and SVM	Accuracy: 88.51% Sensitivity: 93.10% Specificity: 86.21% AUC: 89.93%	Limited data sample, algorithmic limitations, no proper data augmentation and lack of comparative analysis.
[79]	East Asia ASD Children Facial Image Dataset, ASD Facial Images Kaggle Dataset	VGG16	Accuracy: 95.00% F1-Score: 95.00%	Small data size, no data augmentation, and unjustified accuracy.
[80]	Autistic children dataset, facial images of autism vs normal children: Public	Pre-trained MobileNet, Xception, EfficientNetB0, EfficientNetB1, EfficientNetB2, and DNN	Sensitivity: 88.46% Specificity: 91.66% NPV: 88.00% PPV: 92.00% AUC: 96.63%	Inappropriate train/test split, unjustified accuracy, and missing data preprocessing.
[82]	Autistic children dataset, Facial Images of ASD and Non-ASD Children	MobileNetV2, Hybrid VGG19, LR, Linear SVC, RF, DT, Gradient Boosting, MLP, and KNN	Accuracy: 92.00% Precision: 90.40% Recall: 92.00% F1-Score: 92.00%	No data augmentation, missing feature interpretation, and improper train/test split.
[83]	Autism Image data, Facial Images of ASD and Non-ASD Children: Public	MobileNetV2, MobileNetV3, Hybrid Integrated Classifier	Accuracy: 90.50% Sensitivity: 92.30% Specificity: 88.60% G_Mean: 90.40% AUC: 96.40%	No data augmentation and lack of dataset comparison.
[81]	Autistic children dataset, 2D facial images of ASD and Non-ASD Children	VGG16, VGG19	Accuracy: 84.00%	Small data size, no data augmentation, and unjustified accuracy.
[84]	Autistic children dataset, 2D facial images of ASD and Non-ASD Children	VGG16, VGG19, and EfficientNetB0	Accuracy: 88.30% AUC: 95.44%	Missing confusion matrix, and class-wise accuracy unclear.
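
The G_Mean column reported for [83] can be related to the sensitivity and specificity in the same row. The sketch below assumes the common definition of the geometric mean as the square root of sensitivity times specificity; whether [83] used exactly this definition is an assumption, and the helper name `g_mean` is illustrative.

```python
import math


def g_mean(sensitivity: float, specificity: float) -> float:
    """Geometric mean of sensitivity and specificity (fractions in [0, 1])."""
    return math.sqrt(sensitivity * specificity)


# Values reported for [83] in Table 2, converted from percentages to fractions.
sensitivity_83 = 0.923
specificity_83 = 0.886

value = g_mean(sensitivity_83, specificity_83)
print(f"G-mean = {value:.4f}")
```

With these inputs the result is close to 0.904, in line with the 90.40% listed in the table. Unlike accuracy, this measure drops sharply when either class is poorly recognized.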

Table 3. A summary of works using AI techniques for detecting ASD through MRI image data.

Citation	Dataset	Technique	Reported Metrics	Limitations
[96]	ABIDE dataset, fMRI images	DNN	Accuracy: 87.40%	Lack of practical context, small sample size, and limited generalizability.
[95]	ABIDE dataset, rs-fMRI images: Public	Transfer learning autoencoder	Accuracy: 70.00% Sensitivity: 74.00% Specificity: 63.00%	Limited sample size, limited accuracy, lack of augmentation, and insufficient preprocessing.
[98]	ABIDE dataset, MRI images: Public	DL with SVM, MLP, ATM and Auto ASD encoder method	Accuracy: 80.00%	Low sample size, missing data states, absent hyperparameter tuning, and incomplete implementation details.
[97]	ABIDE-1 Dataset, rs-fMRI Images of ASD and non-ASD patients: Public	kNN, SVM, LR, ANN, LDA	Accuracy: 77.70% AUC: 83.10%	Small data samples, lack of augmentation, feature insufficiency, and accuracy constraints.
[99]	ABIDE 1, s-MRI Images (public)	VGG16, SVM, kNN, MLP, Linear Discriminant Analysis	Accuracy: 65.00%	Data dependency, integration challenges, and dataset diversity.
[100]	ABIDE, rs-fMRI Images of ASD and non-ASD patients: Public	LeNet-5, VGG16, ResNet-50	Accuracy: 95.00%	No data augmentation comparison, missing evaluation metrics, and limited performance analysis.
[102]	ABIDE dataset, 4D rs-fMRI Images of ASD and non-ASD patients: Private	Graph DNN	Accuracy: 97.66% Precision: 98.00% Recall: 98.00% F1-Score: 98.00%	Limited data size, no data augmentation, and insufficient validation data.
[101]	ABIDE-1 dataset, fMRI images: Private	Autoencoder	Accuracy: 70.90% Sensitivity: 70.70% Specificity: 75.50%	Limited dataset size, missing data augmentation, lack of model comparison, and incomplete preprocessing.
[103]	ABIDE I, ABIDE II, KAU dataset, rs-fMRI Images of ASD and non-ASD individuals: Public	SVM, NB, DT, ANN, XG Boost, CatBoost, MLP	Accuracy: 62.70% Sensitivity: 61.70% Specificity: 60.14%	Limited data, lack of augmentation, and accuracy constraints.
[104]	ABIDE dataset and Alzheimer’s Disease dataset, MRI Images of ASD, and non-ASD, Images of AD and Non-AD individuals: Public	Multi-instance Conv-Transformer (LD-MILCT)	Accuracy: 70.00% Sensitivity: 50.00% Specificity: 69.00% F1-Score: 33.00%	Small dataset size, and limited task generalization.
[105]	ABIDE dataset, rs-fMRI Images of ASD and non-ASD patients	GNN	Accuracy: 74.00% Sensitivity: 69.00% Specificity: 74.00%	Limited dataset size, limited data preprocessing, and low accuracy.
[106]	ABIDE dataset, rs-fMRI Images of ASD and non-ASD patients	Multiview hyperedge-aware hypergraph convolutional network (HGCN), GNN	Accuracy: 78.54% Sensitivity: 74.40% Precision: 82.00% F1-Score: 77.00% AUC: 87.00%	Imbalanced dataset, overlooked demographic variables, and ignored temporal dynamics.

Table 4. A summary of works using AI techniques for detecting ASD through eye tracking data.

Citation	Dataset	Technique	Reported Metrics	Limitations
[110]	Eye Tracking Subject-Experiment (ETS-E) dataset, Public	RF, C4.5 DT, PART, and feedforward neural network (FFNN)	Accuracy: 93.45%	Limited data, restricted generalizability, and handcrafted features.
[111]	Self-collected data from 59 children: Private	CNN	Accuracy: 71.00% AUC: 90.00%	Limited sample size, unsatisfactory accuracy, and lack of model comparison.
	Early Screening ASD dataset, Eye-Tracking Scan Path (ETSP) images	Boosted DT (BDT), Deep SVM, and Decision Jungle (DJ), DNN	Sensitivity: 93.00% Specificity: 91.00% PPV: 94.00% NPV: 90.00% AUC: 97.00%	Lack of feature interpretation, missing accuracy metric, and incomplete performance evaluation.
[112]	Eye-Tracking Scanpaths in ASD dataset, ASD eye-tracking scanpaths images of ASD and developing patients	ANN, FFNN, GoogleNet, ResNet-18, Hybrid GoogleNet + SVM, Hybrid ResNet-18 + SVM	Accuracy: 97.60% Precision: 96.50% Sensitivity: 97.00% Specificity: 97.00% AUC: 97.57%	Missing feature interpretation, lack of dataset comparison, and incomplete research findings.
[114]	Eye tracking dataset	LSTM, CNN-LSTM, BiLSTM, GRU	Accuracy: 98.33% Sensitivity: 97.25% Specificity: 98.94% F1-Score: 98.70% AUC: 98.00%	Demographic imbalance.
[115]	Visualization of Eye-Tracking Scanpaths image dataset	U-net, InceptionV3, LSTM	Accuracy: 99.29% Precision: 98.78% Sensitivity: 99.29% Specificity: 99.29%	Class imbalance, no proper preprocessing.
[109]	Eye-Tracking Scanpath Image Dataset	CNN	Accuracy: 95.59% Sensitivity: 77.60% Specificity: 79.91% F1-Score: 78.73%	Limited data sample, no data augmentation.

Table 5. A summary of works using AI techniques for detecting ASD through EEG data.

Citation	Dataset	Technique	Reported Metrics	Limitations
[120]	Self-collected EEG signal data from 10 children: Private	AlexNet, DenseNet, SqueezeNet, ShuffleNet, VGGNet, InceptionNet, and ResNet	Accuracy: 81.00% Precision: 92.00% Recall: 91.00% Specificity: 100.00% F1-Score: 91.00%	Limited dataset size, lack of oversampling, missing dataset comparison, and incomplete model validation.
[53]	EEG Recording collected from Institute in Warsaw	LR and Statistical Methods	Accuracy: 83.00% Sensitivity: 83.00% Specificity: 83.00% AUC: 83.00%	Class imbalance, small size of data.
[121]	EEG signal dataset	SVM, LR and DT	Accuracy: 96.67% Sensitivity: 100.00% Specificity: 95.00% PPV: 93.33% NPV: 100.00% F1-Score: 96.55%	Limited dataset.
[123]	EEG Signal Dataset	AlexNet, SqueezeNet, MobileNetV2, SVM, kNN, DT	Accuracy: 85.50% Precision: 85.50% Specificity: 95.20% Recall: 85.40% F1-Score: 97.73%	Limited dataset size.
[122]	Self-collected EEG recording	AlexNet, VGG16, ResNet50	Accuracy: 90.12%	Sample homogeneity.
[124]	KAU and TUOS dataset, EEG recordings and spectrogram images of ASD and Non-ASD individuals: Public	CNN-based weighted ensemble model	Accuracy: 96.22%	Missing feature interpretation, and lack of real-time relevance.

Table 6. A summary of works using AI techniques for detecting ASD by analyzing genetic data.

Citation	Dataset	Technique	Reported Metrics	Limitations
[130]	lncRNA gene dataset	NB, BN, kNN, LR, RF, linear and polynomial SVM	Accuracy: 78.31% Sensitivity: 90.02% Specificity: 66.50% F1-Score: 80.60% MCC: 58.30%	Customization.
[131]	ASD gut metagenomic dataset (SRP182132)	RF	AUC: 97.00%	Lacking a comparative evaluation with other ML classifiers.
[132]	Gene Expression Dataset	RF, LMT, SMO, LR, SVM, IBCGA	Accuracy: 81.83% Sensitivity: 84.00% Specificity: 79.00% AUC: 84.00%	Limited dataset samples.
[133]	Self-collected data from ASD Tests application and GE dataset	NB, kNN, DT, SVM, FPA and GWO	Accuracy: 99.69% Precision: 99.65% Recall: 99.67% F1-Score: 100.00% AUC: 99.66%	Feature interpretation, small sample size, practical applicability, and data augmentation.
[134]	A whole-genome transcriptomic dataset GSE6575, GSE28521	XGBoost, NB, NN, and RF	Accuracy: 82.49% AUC: 75.32%	Limited diversity.

Table 7. A summary of the most popular ASD-related available datasets.

Dataset	Data Type	Description
Autism screening data for children https://archive.ics.uci.edu/dataset/419/autistic+spectrum+disorder+screening+data+for+children (accessed on 13 March 2025)	ASD questionnaires	This dataset, released in 2017, includes 292 instances with 20 features. The data was taken from children aged 4 to 11 years, including 141 ASD and 151 TD (with a gender distribution of 208 males and 84 females). It focuses on screening children for ASD and contains ten behavioral features (Q-Chat-10) and other individual characteristics to identify ASD traits.
Autism Screening on Adults https://www.kaggle.com/datasets/andrewmvd/autism-screening-on-adults (accessed on 13 March 2025)	ASD questionnaires	Released in 2020, it consists of 704 participants, including 189 ASD and 515 TD, aged 17 to 58, resulting in a gender distribution of 367 males and 337 females. The dataset contains responses to surveys based on questions indicating ASD.
Autism screening data for toddlers https://www.kaggle.com/datasets/fabdelja/autism-screening-for-toddlers (accessed on 13 March 2025)	ASD questionnaires	Released in 2018, it includes 1054 records with 18 attributes with 728 ASD and 326 TD participants (735 males and 319 females), focusing on screening autism in toddlers. It contains ten behavioral features (Q-Chat-10) and other individual characteristics to identify ASD traits.
ASD children traits https://www.kaggle.com/datasets/uppulurimadhuri/dataset (accessed on 13 March 2025)	ASD questionnaires	Released in 2022, it covers 1074 ASD and 911 TD participants aged 1 to 18 (1447 males and 538 females). It includes Autism Spectrum Quotient items (AQ1–AQ10), the Social Responsiveness Scale, and family history of ASD, aiding in predictive analysis of ASD tendencies. The Autism Spectrum Quotient helps identify potential ASD traits in individuals aged 16 or older.
Autistic Children Facial Dataset https://www.kaggle.com/datasets/imrankhan77/autistic-children-facial-data-set?select=consolidated (accessed on 13 March 2025)	Facial images	Last updated in 2022, the dataset consists of 2940 subjects (1470 ASD and 1470 TD), aged 2 to 14, with a male-to-female ratio of about 3:1.
ABIDE I https://fcon_1000.projects.nitrc.org/indi/abide/abide_I.html (accessed on 13 March 2025)	MRI	Released in 2012, it includes 539 ASD and 573 TC, aged 7 to 64, resulting in 1112 resting-state fMRI and structural MRI datasets from 17 worldwide locations.
ABIDE II https://fcon_1000.projects.nitrc.org/indi/abide/abide_II.html (accessed on 13 March 2025)	MRI	Released in 2016, ABIDE II aggregates 1114 MRI datasets from 19 sites, including 521 ASD and 593 controls, aged 5 to 64 years, with enhanced phenotypic data and some longitudinal samples.
Eye-tracking Scanpaths Dataset https://figshare.com/articles/dataset/Eye-Tracking_Dataset_to_Support_the_Research_on_Autism_Spectrum_Disorder/20113592 (accessed on 13 March 2025)	Scanpath Images	The dataset consists of 547 scanpath images from 59 participants (29 ASD and 30 TD), aged 2 to 12, with a gender distribution of 38 male and 21 female participants.
P300 speller KAU https://malhaddad.kau.edu.sa/Pages-BCI-Datasets-En.aspx (accessed on 13 March 2025)	EEG	Published in 2019, it comprises EEGs representing subjects with 15 ASD disorders and normal subjects. The disorder group includes eight boys aged 10 to 16, while the normal group consists of ten boys aged 9 to 16 without neurological disorders.
BCIAUT_P300 https://www.kaggle.com/datasets/disbeat/bciaut-p300 (accessed on 13 March 2025)	EEG	Published in 2020, it comprises EEG recordings from a P300-based Brain-Computer Interface used for training individuals with ASD. It includes recordings from 15 ASD participants across seven sessions each, totaling 105 sessions.
SFARI Autism Inpatient Collection https://www.sfari.org/resource/autism-brainnet/ (accessed on 13 March 2025)	ASD questionnaires and genetics	Initiated in 2013, it aims to gather phenotypic and genetic data from children diagnosed with ASD. It includes surveys on social communication, behaviors, and other domains from 527 individuals.
Autism WGS https://www.omicsdi.org/dataset/ega/EGAS00001000850 (accessed on 13 March 2025)	Genetics	Published in 2014, the Detection of Clinically Relevant Genetic Variants in Autism Spectrum Disorder by Whole-Genome Sequencing explores genetic variants in 32 ASD-affected families.
National Database for Autism Research (NDAR) https://catalog.data.gov/dataset/national-database-for-autism-research-ndar (accessed on 13 March 2025)	MRI, Genetics	Last updated in 2023, the dataset is a scalable platform for sharing ASD-related data, supporting various data types, including biological, behavioral, genetic, and medical imaging data.
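
The participant counts in Table 7 make it possible to quantify class and gender balance for the questionnaire datasets. The following is a minimal sketch that uses only the counts stated in the table; the `Cohort` class and its method names are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    """Participant counts for one dataset, copied from Table 7."""
    name: str
    asd: int
    td: int
    male: int
    female: int

    @property
    def total(self) -> int:
        return self.asd + self.td

    def asd_share(self) -> float:
        """Fraction of participants labeled ASD."""
        return self.asd / self.total

    def male_share(self) -> float:
        """Fraction of participants recorded as male."""
        return self.male / (self.male + self.female)


COHORTS = [
    Cohort("Autism screening data for children", 141, 151, 208, 84),
    Cohort("Autism Screening on Adults", 189, 515, 367, 337),
    Cohort("Autism screening data for toddlers", 728, 326, 735, 319),
    Cohort("ASD children traits", 1074, 911, 1447, 538),
]

for cohort in COHORTS:
    # The diagnosis counts and the gender counts should describe the same people.
    consistent = cohort.total == cohort.male + cohort.female
    print(
        f"{cohort.name}: n={cohort.total}, "
        f"ASD share={cohort.asd_share():.1%}, "
        f"male share={cohort.male_share():.1%}, "
        f"counts consistent={consistent}"
    )
```

A check of this kind makes skew toward one gender or one class visible before a model is trained, which is relevant when judging how far results on these datasets generalize.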

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

