Acoustics
  • Opinion
  • Open Access

10 November 2025

Envisioning the Future of Machine Learning in the Early Detection of Neurodevelopmental and Neurodegenerative Disorders via Speech and Language Biomarkers

1 Department of Languages and Literature, University of Nicosia, Nicosia 2417, Cyprus
2 Phonetic Lab, University of Nicosia, Nicosia 2417, Cyprus
This article belongs to the Special Issue Artificial Intelligence in Acoustic Phonetics

Abstract

Speech and language offer a rich, non-invasive window into brain health. Advances in machine learning (ML) have enabled increasingly accurate detection of neurodevelopmental and neurodegenerative disorders through these modalities. This paper envisions the future of ML in the early detection of neurodevelopmental disorders like autism spectrum disorder and attention-deficit/hyperactivity disorder, and neurodegenerative disorders, such as Parkinson’s disease and Alzheimer’s disease, through speech and language biomarkers. We explore the current landscape of ML techniques, including deep learning and multimodal approaches, and review their applications across various conditions, highlighting both successes and inherent limitations. Our core contribution lies in outlining future trends across several critical dimensions. These include the enhancement of data availability and quality, the evolution of models, the development of multilingual and cross-cultural models, the establishment of regulatory and clinical translation frameworks, and the creation of hybrid systems enabling human–artificial intelligence (AI) collaboration. Finally, we conclude with a vision for future directions, emphasizing the potential integration of ML-driven speech diagnostics into public health infrastructure, the development of patient-specific explainable AI, and its synergistic combination with genomics and brain imaging for holistic brain health assessment. Overcoming substantial hurdles in validation, generalization, and clinical adoption, the field is poised to shift toward ubiquitous, accessible, and highly personalized tools for early diagnosis.

1. Introduction

The global burden of neurodevelopmental and neurodegenerative disorders presents an escalating public health challenge. Conditions such as autism spectrum disorder (ASD), attention-deficit/hyperactivity disorder (ADHD), Parkinson’s disease (PD), and Alzheimer’s disease (AD) significantly impact quality of life, necessitate extensive care, and impose immense socio-economic costs []. Globally, ASD affects roughly 1 in 100 children and about 1 in 127 individuals across the lifespan [,]. AD affects more than 55 million people worldwide and is projected to reach ~78 million by 2030 and ~139–153 million by 2050 [,]. PD affects ~8.5 million people and is projected to more than double by 2040–2050 [,].
It is important to mention that neurodevelopmental and neurodegenerative disorders differ fundamentally in etiology, diagnostic frameworks, and developmental trajectories. Neurodevelopmental disorders (e.g., ASD, ADHD) arise in the developmental period and are defined behaviorally in Diagnostic and Statistical Manual of Mental Disorders (DSM)/International Classification of Diseases (ICD) as disturbances in the acquisition of cognitive, language, motor, or social functions, with multifactorial genetic and environmental risk architecture and no validated biological diagnostic biomarkers to date [,,]. By contrast, neurodegenerative diseases reflect progressive proteinopathies with characteristic pathobiology—β-amyloid and tau for AD within the AT(N) framework, and α-synuclein pathology for PD—supported by evolving fluid and imaging biomarkers that increasingly anchor diagnosis [,,]. Diagnostic criteria also diverge: ADHD/ASD require developmental history and symptom-based thresholds in DSM/ICD, whereas AD now emphasizes biological definition and staged criteria, and PD employs Movement Disorder Society clinical diagnostic criteria with specified supportive and exclusion features [,].
A critical unmet need across these diverse conditions is the capacity for early and accurate diagnosis. Early detection is paramount, as it enables timely intervention, potentially slowing disease progression, improving developmental outcomes, and facilitating access to support services, thereby enhancing patient and caregiver well-being [,]. However, current diagnostic pathways often involve subjective clinical assessments, lengthy observation periods, and expensive, invasive neuroimaging or biochemical tests, leading to diagnostic delays and disparities in access [,].
In this context, speech and language emerge as exceptionally promising biomarkers. They are ubiquitous, non-invasive, cost-effective, and highly accessible, requiring only a microphone-equipped device for data collection []. Speech production and language processing are complex cognitive functions that rely on intricate neural networks. Consequently, disruptions in these networks, whether due to atypical development or neurodegeneration, often manifest as subtle yet detectable changes in vocal characteristics, prosody, articulation, fluency, syntax, semantics, and pragmatics []. These changes can precede overt clinical symptoms by years, offering a unique window for proactive intervention.
The advent of machine learning (ML) has revolutionized the ability to identify subtle patterns within complex data, making it an ideal candidate for extracting meaningful diagnostic signals from speech and language. ML algorithms can analyze vast quantities of acoustic (e.g., formant frequencies, pitch), linguistic (e.g., syntactic complexity, lexical choice), and paralinguistic (e.g., speech rate, intonation) features, identifying intricate relationships that are imperceptible to the human ear or traditional statistical methods []. This capability has propelled ML to the forefront of research in speech-based diagnostics, transforming raw vocalizations into actionable diagnostic insights.
This paper aims to discuss the future trajectories of ML-driven, speech- and language-based diagnostics for neurodevelopmental and neurodegenerative disorders. It provides a narrative review that synthesizes representative findings from current research, analyzes emerging technological and societal trends, and examines how these developments are shaping the landscape of early detection. Rather than offering a systematic evidence synthesis, the paper takes a conceptual and integrative approach, connecting advances across linguistics, ML, and clinical neuroscience to provide a forward-looking outlook on the transformative potential of ML in brain health diagnostics.

2. Background and Current Landscape

The application of ML to speech and language biomarkers for neurodevelopmental and neurodegenerative disorders has seen significant progress in recent years. Researchers have leveraged a diverse array of ML techniques to extract meaningful patterns from complex vocal and linguistic data, aiming to identify subtle deviations indicative of neurological conditions.

2.1. Overview of Current Machine Learning Techniques

The bedrock of most current diagnostic applications is supervised learning, a paradigm where models learn from labeled data, such as speech samples explicitly marked as healthy or disordered. Historically, Support Vector Machines have been prominent, proving effective for classification tasks by identifying an optimal hyperplane to separate different classes in a high-dimensional feature space. These have been widely applied with hand-crafted acoustic features, including Mel-frequency cepstral coefficients, jitter, and shimmer, particularly for conditions like PD []. Ensemble learning methods, such as Random Forests, also offer robustness and strong performance by building multiple decision trees and aggregating their predictions, especially when dealing with high-dimensional feature sets and complex interactions [].
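The jitter and shimmer features mentioned above are cycle-to-cycle perturbation measures computable from sequences of pitch periods and peak amplitudes. The sketch below uses invented toy values and the simplified "local" formulas; clinical toolkits such as Praat apply more refined variants.

```python
def jitter_local(periods):
    """Mean absolute difference of consecutive pitch periods,
    relative to the mean period (simplified 'local jitter')."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def shimmer_local(amplitudes):
    """Same relative perturbation formula, applied to cycle amplitudes."""
    return jitter_local(amplitudes)

# Toy cycle-to-cycle measurements (seconds; arbitrary amplitude units)
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
amps = [0.80, 0.78, 0.82, 0.79, 0.81]

# Feature vector that could feed an SVM alongside MFCCs
features = [jitter_local(periods), shimmer_local(amps)]
```

In a full pipeline, such perturbation features would be concatenated with other hand-crafted descriptors and standardized before classification.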
The most transformative development in recent years has been the rise of deep neural networks. Convolutional Neural Networks (CNNs) are primarily utilized for analyzing spectrograms of speech, which are visual representations of sound frequencies over time. CNNs excel at capturing local patterns and hierarchical features, much like their application in image recognition [], and have shown promise in detecting vocal tremors or atypical prosody []. Recurrent Neural Networks and Long Short-Term Memory networks are specifically designed to process sequential data, making them well-suited for capturing temporal dependencies in speech and language, such as fluency patterns, pauses, or grammatical structures []. These are particularly relevant for analyzing conversational dynamics or narrative coherence in conditions like AD. More recently, the Transformer architecture, with its self-attention mechanism, has become dominant, especially in natural language processing. While initially applied to textual data, the Transformer’s ability to model long-range dependencies and contextual relationships makes it increasingly relevant for speech analysis, particularly in conjunction with large pre-trained models []. Figure 1 illustrates a general ML pipeline for detecting speech or language disorders.
Figure 1. General ML pipeline for detecting speech or language disorders.
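As a small illustration of the spectrogram input such CNNs consume, the following sketch computes a magnitude spectrogram via a short-time Fourier transform. The frame length, hop size, and synthetic 440 Hz tone are illustrative choices, not values from the cited literature.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: Hann-windowed frames -> FFT magnitudes.
    Returns an array of shape (time frames, frequency bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

sr = 8000
t = np.arange(sr) / sr                 # 1 s of audio at 8 kHz
tone = np.sin(2 * np.pi * 440 * t)     # synthetic 440 Hz "voice"
spec = spectrogram(tone)               # 2-D array a CNN treats like an image
```

Each row of `spec` is one time frame; a CNN slides its filters over this time-frequency plane exactly as it would over a grayscale image.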
Beyond supervised learning, self-supervised and unsupervised models address the challenge of limited labeled data. Self-supervised learning involves models learning representations from unlabeled data by solving pretext tasks, such as predicting masked words or future frames in a sequence. Models like Wav2Vec 2.0 and HuBERT, pre-trained on vast amounts of raw speech, learn rich, transferable representations that can then be fine-tuned on smaller, labeled datasets for specific diagnostic tasks [], significantly reducing the need for extensive manual annotation. Unsupervised learning techniques, including clustering or anomaly detection, can identify unusual patterns in speech without prior labels [], potentially flagging individuals who deviate significantly from a “normative” speech profile. For example, Bi et al. [] introduced a fully unsupervised deep learning approach for diagnosing AD using magnetic resonance imaging (MRI) scans. The method employed unsupervised networks to automatically extract features from the MRI data, followed by an unsupervised prediction model to classify disease states. The model achieved high accuracy in distinguishing AD from mild cognitive impairment (MCI) and in differentiating MCI from normal controls. Parlett-Pelleriti et al. [] presented a review of 43 studies that applied unsupervised ML techniques to ASD research. The authors concluded that unsupervised methods are particularly well-suited for ASD research due to the wide variability among individuals with the diagnosis, and both current applications and future developments in this area show significant potential.
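The normative-profile idea behind such anomaly detection can be sketched with a simple z-score rule. All feature values below are invented; real systems would use richer feature sets and learned density models rather than a single threshold.

```python
import statistics

def flag_outliers(normative, candidates, z_cut=2.5):
    """Flag candidate values deviating from the normative sample
    by more than z_cut standard deviations."""
    mu = statistics.mean(normative)
    sd = statistics.stdev(normative)
    return [x for x in candidates if abs(x - mu) / sd > z_cut]

# Hypothetical feature: mean pause duration (s) per speaker
normative = [0.42, 0.47, 0.45, 0.50, 0.44, 0.48, 0.46, 0.43]
candidates = [0.46, 0.95, 0.44]   # 0.95 s is atypically long
flagged = flag_outliers(normative, candidates)
```

Here only the 0.95 s speaker is flagged; in practice such a flag would trigger follow-up assessment rather than a diagnosis.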
Finally, multimodal learning recognizes that speech and language are often accompanied by other behavioral cues, integrating information from various sources. This approach might combine speech features with facial expressions, gaze patterns, body movements, or even physiological signals like heart rate variability, to provide a more comprehensive diagnostic picture []. For instance, in autism detection, combining vocalizations with eye-tracking data can enhance diagnostic accuracy [].
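A minimal way to combine modalities is late fusion of per-modality risk scores. The scores and weights below are hypothetical; practical systems typically learn the fusion (or fuse intermediate representations) rather than fixing the weights by hand.

```python
def late_fusion(scores, weights):
    """Weighted combination of per-modality risk scores."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(s * w for s, w in zip(scores, weights))

# Hypothetical probabilities from separate speech and eye-tracking models
speech_score, gaze_score = 0.70, 0.40
combined = late_fusion([speech_score, gaze_score], [0.6, 0.4])
```

The appeal of late fusion is that each modality-specific model can be developed and validated independently before combination.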

2.2. Review of Current Applications

ML-driven speech and language diagnostics are actively being explored across a spectrum of neurodevelopmental and neurodegenerative disorders. For ASD, early detection efforts often focus on vocalizations in infants and toddlers, such as atypical cry patterns or reduced babbling complexity, and later, on prosodic abnormalities, repetitive language, or pragmatic difficulties in older children []. ML models analyze acoustic features, turn-taking patterns, and semantic content in these cases. Research into ADHD explores speech rate, variability, and disfluencies, as well as the coherence and organization of narratives, which can reflect executive function deficits [].
In PD, dysarthria, affecting voice quality, articulation, and prosody, is a hallmark. ML models extensively analyze features like fundamental frequency (pitch) variations (jitter, shimmer), voice intensity, speech rate, and pauses to detect early signs of vocal impairment []. For AD and MCI, language decline is a prominent feature. ML models assess lexical diversity, semantic coherence, grammatical complexity, increased pauses, and repetitions in spontaneous speech or narrative tasks, such as picture description [].
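Two of the text-side measures mentioned above, lexical diversity and repetition, can be sketched in a few lines. The sample sentence is a toy input loosely echoing a picture-description task, not real patient data.

```python
def type_token_ratio(tokens):
    """Lexical diversity: distinct words / total words (lower may
    suggest reduced vocabulary range)."""
    return len(set(tokens)) / len(tokens)

def immediate_repetitions(tokens):
    """Count of immediately repeated words, a crude proxy for
    word-finding difficulty."""
    return sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)

sample = "the the boy is is taking a cookie from the cookie jar".split()
ttr = type_token_ratio(sample)
reps = immediate_repetitions(sample)
```

Real pipelines normalize such measures for utterance length and combine them with acoustic features, since raw type-token ratio is sensitive to sample size.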
Notwithstanding these research advances, the vast majority of speech- and language-based ML applications remain experimental. To date, no model has completed prospective, multi-site clinical trials or obtained regulatory clearance for standalone diagnosis in these conditions.

2.3. Datasets Used in Previous Work

The development of these ML models heavily relies on specialized datasets. The ADReSS (Alzheimer’s Dementia Recognition through Spontaneous Speech) corpus is a benchmark dataset for AD detection from speech, containing audio recordings and transcripts from individuals with AD and healthy controls []. The DAIC-WOZ (Distress Analysis Interview Corpus—Wizard of Oz), while primarily for depression detection, contains conversational speech data relevant for analyzing psychiatric and neurological conditions, including features like prosody and turn-taking []. Coswara is a large crowdsourced dataset of respiratory sounds and speech, initially for COVID-19 detection, but valuable for exploring vocal biomarkers across various health conditions due to its scale and diversity []. The Child Language Data Exchange System (CHILDES) is a vast repository of transcribed child language data, crucial for studying typical and atypical language development []. Many studies also rely on smaller, institution-specific private datasets collected from clinical populations, though these often limit generalizability due to their size and specific demographics.

2.4. Advantages and Limitations

The advantages of ML in this field include its non-invasiveness and accessibility, offering a low-cost alternative to traditional diagnostics and enabling broader access, especially in underserved areas. ML models have demonstrated significant potential for early detection, identifying subtle speech and language changes that precede overt clinical symptoms, thus offering a crucial window for early intervention []. Furthermore, these models offer scalability, capable of processing large volumes of data efficiently [], making them suitable for population-level screening initiatives, and provide objective quantification of speech and language features, reducing subjectivity inherent in clinical assessments.
However, significant limitations persist. Generalizability is a major hurdle, as models trained on specific datasets often perform poorly when applied to new populations, different recording conditions, or diverse linguistic backgrounds, a phenomenon known as “dataset shift” []. Although group-level studies frequently report high accuracy, these findings rarely generalize across datasets, institutions, or populations, a limitation widely attributed to dataset shift and the absence of standardized data collection protocols [,]. Diagnostic reliability at the individual level remains elusive, and many well-known neurophysiological or behavioral biomarkers, such as the Mismatch Negativity (MMN) component, illustrate this gap. While MMN reliably differentiates groups with ASD from neurotypical controls, meta-analyses have concluded that it lacks the precision and reproducibility necessary for clinical implementation [,]. Similarly, although numerous ML studies report promising results in detecting ASD, no clinical speech-language pathology (SLP) service currently employs ML tools for routine assessment or diagnosis, underscoring the persistent translational divide between algorithmic performance in research settings and real-world clinical utility. Data scarcity is another challenge; high-quality, clinically annotated speech datasets for rare disorders or specific disease stages are often small, imbalanced, and difficult to acquire due to privacy concerns and the cost of expert annotation, which limits the training of robust deep learning models []. Interpretability remains a further obstacle: the “black-box” nature of many advanced deep learning models makes it difficult to understand why a particular diagnosis was made. This lack of transparency is a significant barrier to clinical adoption, as clinicians require insights into the decision-making process for trust and validation [].
Confounding factors also complicate analysis, as speech patterns are influenced by numerous variables beyond neurological conditions, including age, gender, medication, emotional state, education level, and cultural background, making it challenging to isolate disease-specific biomarkers []. Finally, no ML system has yet been approved by any regulatory agency as a standalone diagnostic tool for autism or related neurodevelopmental conditions. Existing digital platforms, such as Canvas Dx, have only been authorized as diagnostic aids rather than definitive instruments, reflecting the early, investigational status of ML-based tools in this domain [,].
Despite these limitations, the rapid advancements in ML, particularly in deep learning and self-supervised learning, suggest a promising future for speech and language biomarkers in neurological diagnostics once key challenges are addressed. The following sections outline how these emerging trends may evolve and address current limitations in the use of ML for detection through speech and language biomarkers. Figure 2 illustrates an overview of the advantages and limitations of ML models in clinical practice.
Figure 2. Overview of the advantages and limitations of ML models in clinical practice.

4. Conclusions and Future Directions

Despite major research advances, current ML systems have not yet achieved the reliability or regulatory readiness required for routine clinical use; their role remains exploratory and adjunctive rather than diagnostic. Nevertheless, as evidence and oversight frameworks mature, we anticipate integration into public health infrastructure. The non-invasive and scalable nature of speech-based diagnostics makes them potential candidates for integration into broader public health initiatives. We foresee a future where these tools are deployed for population-level screening, particularly in primary care settings or even through consumer-facing applications. This integration would significantly reduce diagnostic delays, improve access to care, and enable earlier interventions, leading to better long-term outcomes and reduced healthcare costs. This requires robust, validated, and user-friendly systems that can be operated by non-specialist healthcare providers or even directly by individuals in their homes, with results securely transmitted to clinicians for review.
Explainable, patient-specific ML systems will overcome the black-box problem through a strong emphasis on AI that provides actionable insights to clinicians and patients. Future systems will not just provide a diagnostic label but will highlight the specific acoustic, prosodic, or linguistic features that contributed to the prediction. Furthermore, these systems might become highly patient-specific, adapting to individual baseline speech patterns and tracking longitudinal changes. This personalized approach offers the opportunity to move beyond population-level averages, enabling more precise monitoring of disease progression or response to therapy for each individual. The development of “digital twins” (i.e., virtual models that mirror a patient’s speech profile and are continuously updated with new data) could allow for highly individualized risk assessments and treatment recommendations.
A powerful future direction lies in the synergistic integration with genomics and brain imaging, combining speech and language biomarkers with other omics data and neuroimaging modalities. Speech provides a functional, behavioral phenotype, while genomics offers insights into genetic predispositions, and brain imaging, such as functional MRI (fMRI) and positron emission tomography (PET), reveals structural and functional brain changes. Multi-omics integration, combining speech data with genomic information, could identify individuals at high genetic risk who also exhibit subtle speech atypicalities, allowing for ultra-early identification and preventive strategies []. Neuroimaging correlation, linking specific speech biomarkers with changes observed in brain imaging, could deepen our understanding of the neural correlates of speech production and language processing in disease. This could lead to the discovery of novel, more specific biomarkers and provide biological validation for speech-based diagnostic findings. The ultimate vision is a holistic brain health assessment platform that integrates diverse data streams—speech, genomics, neuroimaging, cognitive test results, and even wearable sensor data—to provide a comprehensive, multi-dimensional view of an individual’s neurological health. ML models will be crucial for fusing these disparate data types, identifying complex interactions, and providing a unified risk profile or diagnostic assessment. This integrated approach will move beyond single-modality diagnostics to a more nuanced and accurate understanding of neurodevelopmental and neurodegenerative conditions, paving the way for truly personalized and preventive neurological care [].
The present paper offers a conceptual and forward-looking synthesis of emerging trends for the use of ML in clinical translation. Despite rapid research progress, it is important to recognize that ML models in speech and language remain at the proof-of-concept stage. Bridging the gap between promising research and validated, generalizable clinical tools will require large-scale, longitudinal validation and appropriate regulatory oversight. Looking ahead, these developments point to a paradigm shift—from reactive diagnosis to proactive health management—where accessible, intelligent, and integrated technologies empower both individuals and clinicians in the continuous pursuit of brain health. Another important point lies in the joint discussion of neurodevelopmental (e.g., ASD, ADHD) and neurodegenerative (e.g., PD, AD) disorders within a shared analytical framework. While this comparative approach emphasizes the common value of speech and language as non-invasive biomarkers across the lifespan, these groups differ substantially in etiology, biomarker expression, and methodological requirements. These distinctions entail different data collection paradigms, ML architectures, and clinical validation pathways. Therefore, the integrated treatment adopted here should be viewed as a conceptual synthesis rather than a disorder-specific analytical framework.

Funding

This research received no external funding.

Data Availability Statement

No data are available.

Acknowledgments

This study was supported by the Phonetic Lab of the University of Nicosia.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ding, C.; Wu, Y.; Chen, X.; Chen, Y.; Wu, Z.; Lin, Z.; Kang, D.; Fang, W.; Chen, F. Global, regional, and national burden and attributable risk factors of neurological disorders: The Global Burden of Disease Study 1990–2019. Front. Public Health 2022, 10, 952161. [Google Scholar] [CrossRef] [PubMed]
  2. World Health Organization. Autism Spectrum Disorders (Fact Sheet). 2025. Available online: https://www.who.int/news-room/fact-sheets/detail/autism-spectrum-disorders (accessed on 9 October 2025).
  3. Santomauro, D.F.; Erskine, H.E.; Herrera, A.M.M.; Miller, P.A.; Shadid, J.; Hagins, H.; Addo, I.Y.; Adnani, Q.E.S.; Ahinkorah, B.O.; Ahmed, A.; et al. The Global Epidemiology and Health Burden of the Autism Spectrum: Findings from the Global Burden of Disease Study 2021. Lancet Psychiatry 2025, 12, 111–121. [Google Scholar] [CrossRef] [PubMed]
  4. Alzheimer’s Disease International. Dementia Statistics (Facts & Figures); Alzheimer’s Disease International: London, UK, 2020; Available online: https://www.alzint.org/about/dementia-facts-figures/dementia-statistics/ (accessed on 9 October 2025).
  5. Alzheimer’s Disease International. Numbers of People with Dementia Worldwide: An Update to the Estimates in the World Alzheimer Report 2015; Alzheimer’s Disease International: London, UK, 2020; Available online: https://www.alzint.org/resource/numbers-of-people-with-dementia-worldwide/ (accessed on 9 October 2025).
  6. World Health Organization (WHO). Parkinson Disease—Fact Sheet; World Health Organization: Geneva, Switzerland, 2023; Available online: https://www.who.int/news-room/fact-sheets/detail/parkinson-disease (accessed on 9 October 2025).
  7. Bhidayasiri, R.; Sringean, J.; Phumphid, S.; Anan, C.; Thanawattano, C.; Deoisres, S.; Panyakaew, P.; Phokaewvarangkul, O.; Maytharakcheep, S.; Buranasrikul, V.; et al. The Rise of Parkinson’s Disease Is a Global Challenge, but Efforts to Tackle This Must Begin at a National Level: A Protocol for National Digital Screening and “Eat, Move, Sleep” Lifestyle Interventions to Prevent or Slow the Rise of Non-Communicable Diseases in Thailand. Front. Neurol. 2024, 15, 1386608. [Google Scholar]
  8. Reed, G.M.; First, M.B.; Kogan, C.S.; Hyman, S.E.; Gureje, O.; Gaebel, W.; Maj, M.; Stein, D.J.; Maercker, A.; Tyrer, P.; et al. Innovations and Changes in the ICD-11 Classification of Mental, Behavioural and Neurodevelopmental Disorders. World Psychiatry 2019, 18, 3–19. [Google Scholar] [CrossRef]
  9. World Health Organization. Clinical Descriptions and Diagnostic Requirements for ICD-11 Mental, Behavioural and Neurodevelopmental Disorders; World Health Organization: Geneva, Switzerland, 2024. [Google Scholar]
  10. Sauer, A.K.; Stanton, J.; Hans, S.; Grabrucker, A. Autism Spectrum Disorders: Etiology and Pathology; Exon Publications: Brisbane, Australia, 2021; pp. 1–15. [Google Scholar]
  11. Jack, C.R., Jr.; Bennett, D.A.; Blennow, K.; Carrillo, M.C.; Dunn, B.; Haeberlein, S.B.; Holtzman, D.M.; Jagust, W.; Jessen, F.; Karlawish, J.; et al. NIA-AA Research Framework: Toward a Biological Definition of Alzheimer’s Disease. Alzheimer’s Dement. 2018, 14, 535–562. [Google Scholar] [CrossRef]
  12. Jack, C.R., Jr.; Andrews, J.S.; Beach, T.G.; Buracchio, T.; Dunn, B.; Graf, A.; Hansson, O.; Ho, C.; Jagust, W.; McDade, E.; et al. Revised Criteria for Diagnosis and Staging of Alzheimer’s Disease: Alzheimer’s Association Workgroup. Alzheimer’s Dement. 2024, 20, 5143–5169. [Google Scholar] [CrossRef]
  13. Gauthier, S.; Rosa-Neto, P. Alzheimer’s Disease Biomarkers: Amyloid, Tau, Neurodegeneration (ATN)—Where Do We Go from Here? Pract. Neurol. 2019, 88, 60–61. [Google Scholar]
  14. Postuma, R.B.; Berg, D.; Stern, M.; Poewe, W.; Olanow, C.W.; Oertel, W.; Obeso, J.; Marek, K.; Litvan, I.; Lang, A.E.; et al. MDS Clinical Diagnostic Criteria for Parkinson’s Disease. Mov. Disord. 2015, 30, 1591–1601. [Google Scholar] [CrossRef]
  15. Franke, B.; Michelini, G.; Asherson, P.; Banaschewski, T.; Bilbow, A.; Buitelaar, J.K.; Cormand, B.; Faraone, S.V.; Ginsberg, Y.; Haavik, J.; et al. Live Fast, Die Young? A Review on the Developmental Trajectories of ADHD across the Lifespan. Eur. Neuropsychopharmacol. 2018, 28, 1059–1088. [Google Scholar] [CrossRef]
  16. Zwaigenbaum, L.; Bauman, M.L.; Choueiri, R.; Kasari, C.; Carter, A.; Granpeesheh, D.; Mailloux, Z.; Smith Roley, S.; Wagner, S.; Fein, D.; et al. Early intervention for children with autism spectrum disorder under 3 years of age: Recommendations for practice and research. Pediatrics 2015, 136 (Suppl. 1), S60–S81. [Google Scholar] [CrossRef]
  17. Leibing, A. The Earlier the Better: Alzheimer’s Prevention, Early Detection, and the Quest for Pharmacological Interventions. Cult. Med. Psychiatry 2014, 38, 217–236. [Google Scholar] [CrossRef]
  18. McLoughlin, C.; Lee, W.H.; Carson, A.; Stone, J. Iatrogenic harm in functional neurological disorder. Brain 2025, 148, 27–38. [Google Scholar] [CrossRef] [PubMed]
  19. Wen, Q.; Wittens, M.M.J.; Engelborghs, S.; van Herwijnen, M.H.; Tsamou, M.; Roggen, E.; Smeets, B.; Krauskopf, J.; Briedé, J.J. Beyond CSF and neuroimaging assessment: Evaluating plasma miR-145-5p as a potential biomarker for mild cognitive impairment and Alzheimer’s disease. ACS Chem. Neurosci. 2024, 15, 1042–1054. [Google Scholar] [CrossRef] [PubMed]
  20. Cao, F.; Vogel, A.P.; Gharahkhani, P.; Renteria, M.E. Speech and language biomarkers for Parkinson’s disease prediction, early diagnosis and progression. NPJ Parkinson’s Dis. 2025, 11, 57. [Google Scholar] [CrossRef] [PubMed]
  21. Georgiou, G.P.; Theodorou, E. Abilities of children with developmental language disorders in perceiving phonological, grammatical, and semantic structures. J. Autism Dev. Disord. 2023, 53, 4483–4487. [Google Scholar] [CrossRef]
  22. Myszczynska, M.A.; Ojamies, P.N.; Lacoste, A.M.; Neil, D.; Saffari, A.; Mead, R.; Hautbergue, G.M.; Holbrook, J.D.; Ferraiuolo, L. Applications of machine learning to diagnosis and treatment of neurodegenerative diseases. Nat. Rev. Neurol. 2020, 16, 440–456. [Google Scholar] [CrossRef]
  23. Moro-Velazquez, L.; Gomez-Garcia, J.A.; Arias-Londoño, J.D.; Dehak, N.; Godino-Llorente, J.I. Advances in Parkinson’s disease detection and assessment using voice and speech: A review of the articulatory and phonatory aspects. Biomed. Signal Process. Control 2021, 66, 102418. [Google Scholar] [CrossRef]
  24. Quaye, G.E. Random Forest for High-Dimensional Data. Ph.D. Thesis, The University of Texas at El Paso, El Paso, TX, USA, 2024. [Google Scholar]
  25. Vásquez-Correa, J.C.; Arias-Vergara, T.; Orozco-Arroyave, J.R.; Eskofier, B.; Klucken, J.; Nöth, E. Multimodal assessment of Parkinson’s disease: A deep learning approach. IEEE J. Biomed. Health Inform. 2018, 23, 1618–1630. [Google Scholar] [CrossRef]
  26. Weede, P.; Smietana, P.D.; Kuhlenbäumer, G.; Deuschl, G.; Schmidt, G. Two-stage convolutional neural network for classification of movement patterns in tremor patients. Information 2024, 15, 231. [Google Scholar] [CrossRef]
  27. Graves, A. Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
  28. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; Curran Associates: Red Hook, NY, USA, 2017; Volume 30. [Google Scholar]
29. Baevski, A.; Zhou, Y.; Mohamed, A.; Auli, M. Wav2vec 2.0: A framework for self-supervised learning of speech representations. Adv. Neural Inf. Process. Syst. 2020, 33, 12449–12460.
30. Beke, A.; Szaszák, G. Unsupervised clustering of prosodic patterns in spontaneous speech. In Proceedings of the Text, Speech and Dialogue, TSD 2012, Brno, Czech Republic, 3–7 September 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 648–655.
31. Bi, X.; Li, S.; Xiao, B.; Li, Y.; Wang, G.; Ma, X. Computer aided Alzheimer’s disease diagnosis by an unsupervised deep learning technology. Neurocomputing 2020, 392, 296–304.
32. Parlett-Pelleriti, C.M.; Stevens, E.; Dixon, D.; Linstead, E.J. Applications of unsupervised machine learning in autism spectrum disorder research: A review. Rev. J. Autism Dev. Disord. 2023, 10, 406–421.
33. Kumar, S.; Rani, S.; Sharma, S.; Min, H. Multimodality fusion aspects of medical diagnosis: A comprehensive review. Bioengineering 2024, 11, 1233.
34. Rashid, A.F.; Shaker, S.H. Review of autistic detection using eye tracking and vocalization based on deep learning. J. Algebr. Stat. 2022, 13, 286–297.
35. Oller, D.K.; Niyogi, P.; Gray, S.; Richards, J.A.; Gilkerson, J.; Xu, D.; Yapanel, U.; Warren, S.F. Automated vocal analysis of naturalistic recordings from children with autism, language delay, and typical development. Proc. Natl. Acad. Sci. USA 2010, 107, 13354–13359.
36. Kuijper, S.J.; Hartman, C.A.; Bogaerds-Hazenberg, S.; Hendriks, P. Narrative production in children with autism spectrum disorder (ASD) and children with attention-deficit/hyperactivity disorder (ADHD): Similarities and differences. J. Abnorm. Psychol. 2017, 126, 63.
37. Tsanas, A.; Little, M.A.; McSharry, P.E.; Spielman, J.; Ramig, L.O. Novel speech signal processing algorithms for high-accuracy classification of Parkinson’s disease. IEEE Trans. Biomed. Eng. 2012, 59, 1264–1271.
38. Thaler, F.; Gewald, H. Language characteristics supporting early Alzheimer’s diagnosis through machine learning—A literature review. Health Inform. Int. J. 2021, 10, 10102.
39. Luz, S.; Haider, F.; de la Fuente Garcia, S.; Fromm, D.; MacWhinney, B. Alzheimer’s dementia recognition through spontaneous speech. Front. Comput. Sci. 2021, 3, 780169.
40. Gratch, J.; Artstein, R.; Lucas, G.M.; Stratou, G.; Scherer, S.; Nazarian, A.; Wood, R.; Boberg, J.; DeVault, D.; Marsella, S.; et al. The Distress Analysis Interview Corpus of human and computer interviews. In Proceedings of the LREC, Reykjavik, Iceland, 26–31 May 2014; pp. 3123–3128.
41. Sharma, N.; Krishnan, P.; Kumar, R.; Ramoji, S.; Chetupalli, S.R.; Ghosh, P.K.; Ganapathy, S. Coswara—A database of breathing, cough, and voice sounds for COVID-19 diagnosis. arXiv 2020, arXiv:2005.10548.
42. MacWhinney, B. The CHILDES Project: Tools for Analyzing Talk. Volume I: Transcription Format and Programs; Psychology Press: New York, NY, USA, 2014.
43. Brahmi, Z.; Mahyoob, M.; Al-Sarem, M.; Algaraady, J.; Bousselmi, K.; Alblwi, A. Exploring the role of machine learning in diagnosing and treating speech disorders: A systematic literature review. Psychol. Res. Behav. Manag. 2024, 2205–2232.
44. Potla, R.T. Scalable machine learning algorithms for big data analytics: Challenges and opportunities. J. Artif. Intell. Res. 2022, 2, 124–141.
45. Quiñonero-Candela, J.; Sugiyama, M.; Schwaighofer, A.; Lawrence, N.D. Dataset Shift in Machine Learning; MIT Press: Cambridge, MA, USA, 2022.
46. Washington, P.; Wall, D.P. A review of and roadmap for data science and machine learning for the neuropsychiatric phenotype of autism. Annu. Rev. Biomed. Data Sci. 2023, 6, 211–228.
47. Health Care Service Corporation (HCSC). Digital Health Technologies: Diagnostic Applications; Policy No. PSY301.024; Effective 15 December 2024; Health Care Service Corporation (HCSC): Chicago, IL, USA, 2024.
48. Schwartz, S.; Shinn-Cunningham, B.; Tager-Flusberg, H. Meta-analysis and systematic review of the literature characterizing auditory mismatch negativity in individuals with autism. Neurosci. Biobehav. Rev. 2018, 87, 106–117.
49. Lassen, J.; Oranje, B.; Vestergaard, M.; Foldager, M.; Kjær, T.W.; Arnfred, S.; Aggernæs, B. Reduced mismatch negativity in children and adolescents with autism spectrum disorder is associated with their impaired adaptive functioning. Autism Res. 2022, 15, 1469–1481.
50. Ghasemzadeh, H.; Hillman, R.E.; Mehta, D.D. Toward generalizable machine learning models in speech, language, and hearing sciences: Estimating sample size and reducing overfitting. J. Speech Lang. Hear. Res. 2024, 67, 753–781.
51. Jiang, X.; Hu, Z.; Wang, S.; Zhang, Y. Deep learning for medical image-based cancer diagnosis. Cancers 2023, 15, 3608.
52. Ramanarayanan, V.; Lammert, A.C.; Rowe, H.P.; Quatieri, T.F.; Green, J.R. Speech as a biomarker: Opportunities, interpretability, and challenges. Perspect. ASHA Spec. Interest Groups 2022, 7, 276–283.
53. Premera Blue Cross. Prescription Digital Health Diagnostic Aid for Autism Spectrum Disorder (Medical Policy 3.03.01); Premera Blue Cross: Spokane, WA, USA, 2024.
54. Kleine, A.K.; Kokje, E.; Hummelsberger, P.; Lermer, E.; Gaube, S. AI-enabled clinical decision support tools for mental healthcare: A product review. OSF Preprints 2023. Available online: https://osf.io/ez43g (accessed on 3 November 2025).
55. Schmitz, H.; Howe, C.L.; Armstrong, D.G.; Subbian, V. Leveraging mobile health applications for biomedical research and citizen science: A scoping review. J. Am. Med. Inform. Assoc. 2018, 25, 1685–1698.
56. Majumder, S.; Mondal, T.; Deen, M.J. Wearable sensors for remote health monitoring. Sensors 2017, 17, 130.
57. Rieke, N.; Hancox, J.; Li, W.; Milletari, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.N.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. NPJ Digit. Med. 2020, 3, 119.
58. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Advances in Neural Information Processing Systems; Curran Associates: Red Hook, NY, USA, 2014; Volume 27.
59. Radford, A.; Kim, J.W.; Xu, T.; Brockman, G.; McLeavey, C.; Sutskever, I. Robust speech recognition via large-scale weak supervision. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; PMLR: Honolulu, HI, USA, 2023; pp. 28492–28518.
60. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359.
61. Baltrušaitis, T.; Ahuja, C.; Morency, L.P. Multimodal machine learning: A survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 423–434.
62. Song, Y.; Wang, T.; Cai, P.; Mondal, S.K.; Sahoo, J.P. A comprehensive survey of few-shot learning: Evolution, applications, challenges, and opportunities. ACM Comput. Surv. 2023, 55, 1–40.
63. Larasati, R. Inclusivity of AI speech in healthcare: A decade look back. arXiv 2025, arXiv:2505.10596.
64. Chen, Z.; Zhang, J.M.; Sarro, F.; Harman, M. A comprehensive empirical study of bias mitigation methods for machine learning classifiers. ACM Trans. Softw. Eng. Methodol. 2023, 32, 1–30.
65. Mohamed, A.; Lee, H.Y.; Borgholt, L.; Havtorn, J.D.; Edin, J.; Igel, C.; Kirchhoff, K.; Li, S.-W.; Livescu, K.; Maaløe, L.; et al. Self-supervised speech representation learning: A review. IEEE J. Sel. Top. Signal Process. 2022, 16, 1179–1217.
66. Babu, A.; Wang, C.; Tjandra, A.; Lakhotia, K.; Xu, Q.; Goyal, N.; Singh, K.; von Platen, P.; Saraf, Y.; Pino, J.; et al. XLS-R: Self-supervised cross-lingual speech representation learning at scale. arXiv 2021, arXiv:2111.09296.
67. Gunning, D.; Stefik, M.; Choi, J.; Miller, T.; Stumpf, S.; Yang, G.Z. XAI—Explainable artificial intelligence. Sci. Robot. 2019, 4, eaay7120.
68. Nohara, Y.; Matsumoto, K.; Soejima, H.; Nakashima, N. Explanation of machine learning models using Shapley additive explanation and application for real data in hospital. Comput. Methods Programs Biomed. 2022, 214, 106584.
69. Zafar, M.R.; Khan, N. Deterministic local interpretable model-agnostic explanations for stable explainability. Mach. Learn. Knowl. Extr. 2021, 3, 525–541.
70. Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–60.
71. US Food and Drug Administration. Artificial Intelligence and Machine Learning in Software as a Medical Device. Available online: https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device (accessed on 19 August 2025).
72. European Medicines Agency. Reflection Paper on the Use of Artificial Intelligence (AI) in the Lifecycle of Medicinal Products; EMA: London, UK, 2024; Available online: https://www.ema.europa.eu/en/documents/scientific-guideline/reflection-paper-use-artificial-intelligence-ai-medicinal-product-lifecycle_en.pdf (accessed on 3 November 2025).
73. Talitckii, A.; Kovalenko, E.; Anikina, A.; Zimniakova, O.; Semenov, M.; Bril, E.; Shcherbak, A.; Dylov, D.V.; Somov, A. Avoiding misdiagnosis of Parkinson’s disease with the use of wearable sensors and artificial intelligence. IEEE Sens. J. 2020, 21, 3738–3747.
74. Pham, T.D.; Holmes, S.B.; Zou, L.; Patel, M.; Coulthard, P. Diagnosis of pathological speech with streamlined features for long short-term memory learning. Comput. Biol. Med. 2024, 170, 107976.
75. Kumari, P.; Chauhan, J.; Bozorgpour, A.; Huang, B.; Azad, R.; Merhof, D. Continual learning in medical image analysis: A comprehensive review of recent advancements and future prospects. arXiv 2023, arXiv:2312.17004.
76. Hassan, S.; Akaila, D.; Arjemandi, M.; Papineni, V.; Yaqub, M. MINDSETS: Multi-omics integration with neuroimaging for dementia subtyping and effective temporal study. Sci. Rep. 2025, 15, 15835.
77. Bhonde, S.B.; Prasad, J.R. Machine learning approach to revolutionize use of holistic health records for personalized healthcare. Int. J. Adv. Sci. Technol. 2020, 29, 313–321.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
