Artificial Intelligence-Enabled End-To-End Detection and Assessment of Alzheimer’s Disease Using Voice
Abstract
1. Introduction
Aims
2. Materials and Methods
2.1. Dataset Descriptions
2.1.1. ADReSSo Dataset
2.1.2. DementiaBank Pitt Database
2.2. AI-Based Model
2.2.1. Embeddings from Pre-Trained Model
2.2.2. AD Classifier
2.2.3. AD Severity Predictor
2.3. Model Development and Validation
2.3.1. Model Training
2.3.2. Model Evaluation and Calibration
2.4. Performance Metrics and Statistical Analysis
2.5. Benchmark Studies
2.5.1. Comparison with Acoustic Features
2.5.2. Comparison with Machine Learning Baselines
3. Results
3.1. Evaluation of AD Diagnosis
3.2. Model Calibration
3.3. External Validation
3.4. Evaluation of AD Severity Prediction
4. Discussion
Strengths and Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Dataset | Age | AD M | AD F | AD MMSE (sd) | Non-AD M | Non-AD F | Non-AD MMSE (sd) |
---|---|---|---|---|---|---|---|
Train | [50, 55) | 1 | 0 | 23.0 (n/a) | 1 | 1 | 29.0 (1.4) |
Train | [55, 60) | 4 | 3 | 17.7 (4.4) | 3 | 11 | 29.2 (1.0) |
Train | [60, 65) | 6 | 5 | 19.2 (6.1) | 5 | 8 | 29.1 (1.3) |
Train | [65, 70) | 4 | 17 | 18.9 (4.2) | 8 | 18 | 28.9 (1.1) |
Train | [70, 75) | 5 | 17 | 17.1 (4.9) | 5 | 13 | 28.6 (1.3) |
Train | [75, 80) | 8 | 16 | 15.8 (5.6) | 4 | 1 | 29.4 (0.5) |
Train | [80, 85) | 1 | 0 | 3.0 (n/a) | 1 | 0 | 29.0 (n/a) |
Train | Total | 29 | 58 | 17.4 (5.3) | 27 | 52 | 29.0 (1.1) |
Test | [55, 60) | 3 | 3 | 16.8 (4.5) | 3 | 5 | 29.1 (1.4) |
Test | [60, 65) | 1 | 3 | 18.5 (7.6) | 2 | 5 | 28.7 (0.8) |
Test | [65, 70) | 3 | 5 | 19.8 (4.5) | 3 | 5 | 28.8 (0.7) |
Test | [70, 75) | 4 | 4 | 17.6 (6.4) | 5 | 5 | 28.5 (2.5) |
Test | [75, 80) | 3 | 6 | 20.7 (6.7) | 0 | 3 | 29.0 (1.0) |
Test | Total | 14 | 21 | 18.8 (5.8) | 13 | 23 | 28.8 (1.5) |

Age | AD M | AD F | AD MMSE (sd) | Non-AD M | Non-AD F | Non-AD MMSE (sd) |
---|---|---|---|---|---|---|
[45, 50) | 0 | 0 | n/a | 1 | 3 | 30.0 (0.0) |
[50, 55) | 2 | 0 | 23.5 (0.5) | 4 | 4 | 29.4 (0.7) |
[55, 60) | 6 | 6 | 18.7 (3.9) | 6 | 13 | 29.4 (1.0) |
[60, 65) | 8 | 10 | 20.4 (4.8) | 9 | 9 | 28.9 (1.3) |
[65, 70) | 8 | 22 | 20.5 (4.5) | 9 | 15 | 29.0 (1.0) |
[70, 75) | 9 | 22 | 18.3 (5.1) | 8 | 10 | 28.6 (1.2) |
[75, 80) | 17 | 25 | 18.6 (4.9) | 3 | 4 | 29.1 (0.8) |
[80, 85) | 4 | 12 | 20.1 (4.2) | 1 | 0 | 29.0 (n/a) |
[85, 90) | 0 | 9 | 20.1 (3.9) | 0 | 0 | n/a |
Total | 54 | 106 | 19.4 (4.8) | 41 | 58 | 29.1 (1.1) |

Embedding | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|
eGeMAPs | 0.682 (0.101) | 0.699 (0.124) | 0.704 (0.113) | 0.696 (0.097) |
wav2vec2 | 0.721 (0.106) | 0.759 (0.151) | 0.711 (0.088) | 0.727 (0.096) |
data2vec | 0.730 (0.074) | 0.778 (0.136) | 0.703 (0.096) | 0.728 (0.071) |
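
The accuracy, precision, recall, and F1 columns in the table above are standard binary-classification metrics. As a minimal sketch of how such values are typically derived from predictions, assuming AD is coded as the positive class (1) and non-AD as 0: the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study, and the parenthesized spread values in the table are not reproduced here.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example (hypothetical labels, not study data).
example = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

Because F1 is the harmonic mean of precision and recall, it sits between the two and is pulled toward the smaller one, so a model whose precision and recall differ is penalized relative to their simple average.
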

Model | RMSE | MAE |
---|---|---|
SVR | 4.941 (3.961, 5.887) | 3.784 (3.083, 4.559) |
RFR | 6.346 (5.239, 7.410) | 5.059 (4.177, 6.044) |
NN | 4.906 (3.872, 5.912) | 3.493 (2.754, 4.201) |

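
RMSE and MAE in the table above measure the error of predicted scores against true scores, and each value is followed by a pair of numbers that reads like an interval estimate. The text given here does not state how those intervals were obtained, so the following is only a sketch under the assumption of a percentile bootstrap over test samples; the function names, the number of resamples, and the seed are all illustrative choices, not the authors' procedure.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(metric, y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric` (assumed method, see lead-in)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample (true, predicted) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Toy example (hypothetical scores, not study data).
true_scores = [29, 17, 23, 30, 12, 26]
pred_scores = [27, 20, 22, 28, 16, 25]
point = rmse(true_scores, pred_scores)
interval = bootstrap_ci(rmse, true_scores, pred_scores)
```

RMSE squares each error before averaging, so it weights large misses more heavily than MAE does; reporting both gives a sense of whether a few large errors dominate.
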
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Agbavor, F.; Liang, H. Artificial Intelligence-Enabled End-To-End Detection and Assessment of Alzheimer’s Disease Using Voice. Brain Sci. 2023, 13, 28. https://doi.org/10.3390/brainsci13010028