A Real Time Multi Modal Computer Vision Framework for Automated Autism Spectrum Disorder Screening
Abstract
1. Introduction
2. Related Work
2.1. Early Screening and Clinical Assessment Limitations
2.2. Video-Based Machine Learning Approaches
2.3. Pose Estimation and Skeleton-Based Representations
2.4. Eye-Tracking and Sensor-Based Measurements
2.5. Modeling Motor and Audio–Motor Stereotypies
2.6. Deep Learning for Abnormal Hand and Body Movements
2.7. Facial Expression and Gesture Analysis
2.8. Summary and Motivation
3. Materials and Methods
3.1. System Overview
3.2. Datasets and Input Modalities
3.2.1. Video Datasets
3.2.2. Image Datasets
3.2.3. External Validation Dataset
3.3. Feature Extraction and Preprocessing
3.3.1. Motion Feature Extraction from Video
Stereotypical Motion Detection via Spectral Analysis
Bilateral Asymmetry, Variability, and Smoothness
3.3.2. Statistical Characterization of Motion Distributions
3.3.3. Facial Feature Extraction
3.3.4. Multi-Modal Feature Integration
3.4. Classification Models
3.4.1. Random Forest Classification
3.4.2. Transfer Learning and Fine Tuning for Pose Estimation
3.5. Evaluation Protocol
3.6. Implementation Details
4. Results
4.1. Performance of Individual Modalities
4.1.1. ComplexVideos Motion-Based Classification
4.1.2. KinectStickman Skeleton-Based Classification
4.1.3. Facial Expression and Morphology Models
4.2. Multi-Modal Fusion Results
4.3. Cross-Validation Stability Analysis
4.4. Feature Importance Analysis
4.5. Computational Performance
5. Discussion
5.1. Interpretation of Motion-Based Behavioral Patterns
5.2. Effectiveness of Multi-Modal Fusion
5.3. Cross-Dataset Generalization and Robustness
5.4. Clinical Relevance and Practical Applicability
5.5. Limitations
5.6. Future Research Directions
6. Conclusions
Conclusions and Future Directions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Dénes-Fazakas, L.; Mateas, I.C.; Berciu, A.G.; Szilágyi, L.; Kovács, L.; Dulf, E.-H. A Real Time Multi Modal Computer Vision Framework for Automated Autism Spectrum Disorder Screening. Electronics 2026, 15, 1287. https://doi.org/10.3390/electronics15061287

