Automatic Detection of Dyspnea in Real Human–Robot Interaction Scenarios
Abstract
1. Introduction
1.1. Voice-Based Estimation of Respiratory Distress
1.2. Human–Robot Interaction
2. Beamforming
2.1. Delay-and-Sum
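The outline only names the method here, so a minimal frequency-domain delay-and-sum (D&S) sketch follows for orientation. The array geometry, variable names, and STFT layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

C = 343.0  # speed of sound [m/s]

def steering_vector(freq_hz, mic_x, doa_deg):
    """Far-field steering vector for a linear array laid along the x-axis."""
    delays = mic_x * np.sin(np.deg2rad(doa_deg)) / C   # per-mic delay [s]
    return np.exp(-2j * np.pi * freq_hz * delays)      # shape (n_mics,)

def delay_and_sum(stft, mic_x, doa_deg, freqs_hz):
    """Align every channel toward the DOA, then average them.

    stft: (n_mics, n_bins, n_frames) complex STFT of the array signals.
    freqs_hz: length-n_bins vector of bin center frequencies.
    Returns a single-channel enhanced STFT of shape (n_bins, n_frames).
    """
    n_mics = stft.shape[0]
    out = np.empty(stft.shape[1:], dtype=complex)
    for k, f in enumerate(freqs_hz):
        d = steering_vector(f, mic_x, doa_deg)
        # w = d / M: conjugate steering undoes the propagation delays
        out[k] = (d.conj()[:, None] * stft[:, k, :]).sum(axis=0) / n_mics
    return out
```

For a four-microphone array with 5 cm spacing, `mic_x = np.arange(4) * 0.05`; because the weights only undo propagation delays, D&S is robust to steering errors but only mildly directive.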
2.2. MVDR
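Again as a hedged sketch rather than the authors' implementation: per frequency bin, MVDR minimizes output power subject to a distortionless constraint toward the source, w = R⁻¹d / (dᴴR⁻¹d), where R is the spatial covariance and d the steering vector. Diagonal loading is a common way to stabilize the inversion of R; estimating R from the noisy data itself (an MPDR-style choice) is one option among several.

```python
import numpy as np

C = 343.0  # speed of sound [m/s]

def mvdr_beamform(stft, mic_x, doa_deg, freqs_hz, loading=1e-3):
    """Per-bin MVDR: w = R^-1 d / (d^H R^-1 d), with diagonal loading.

    stft: (n_mics, n_bins, n_frames) complex STFT of the array signals.
    R is estimated from the noisy data itself; a noise-only covariance
    is an equally common alternative.
    """
    n_mics, n_bins, n_frames = stft.shape
    out = np.empty((n_bins, n_frames), dtype=complex)
    eye = np.eye(n_mics)
    for k, f in enumerate(freqs_hz):
        X = stft[:, k, :]                                # (n_mics, n_frames)
        R = X @ X.conj().T / n_frames                    # spatial covariance
        R += loading * np.trace(R).real / n_mics * eye   # diagonal loading
        delays = mic_x * np.sin(np.deg2rad(doa_deg)) / C
        d = np.exp(-2j * np.pi * f * delays)             # steering vector
        Rinv_d = np.linalg.solve(R, d)
        w = Rinv_d / (d.conj() @ Rinv_d)                 # MVDR weights
        out[k] = w.conj() @ X
    return out
```

Unlike D&S, MVDR adapts its spatial nulls to the interference structure captured in R, at the price of sensitivity to covariance-estimation errors.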
3. Testing Databases Using Static and Dynamic HRI Scenarios
3.1. Robotic Platform and Indoor Environment
3.2. Recording Scenarios
4. System for Respiratory Distress Estimation in HRI Scenarios
4.1. Source Localization and Beamforming
4.2. Deep Learning-Based Respiratory Distress Estimation
4.2.1. MLP Architectures for Time-Independent Features
4.2.2. Neural Network Architectures for Time-Dependent Features
4.2.3. K-Fold Training with Double Validation
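The exact protocol behind "double validation" is not spelled out in this outline. The sketch below shows one plausible reading, in which each fold's training portion yields two disjoint validation subsets (e.g., one for early stopping and one for model selection); the fold count and split sizes are assumptions, and only the patient count (66, from the demographics table below) comes from the paper.

```python
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
patients = np.arange(66)  # 66 patients, matching the demographics table

for fold, (train_idx, test_idx) in enumerate(
        KFold(n_splits=5, shuffle=True, random_state=0).split(patients)):
    train_idx = rng.permutation(train_idx)
    n_val = len(train_idx) // 5
    val_a = train_idx[:n_val]            # e.g., early-stopping set
    val_b = train_idx[n_val:2 * n_val]   # e.g., model-selection set
    train = train_idx[2 * n_val:]
    print(f"fold {fold}: {len(train)} train, {len(val_a)}+{len(val_b)} val, "
          f"{len(test_idx)} test")
```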
4.2.4. Acoustic Modeling Training for Respiratory Distress Estimation in HRI
4.2.5. Performance Metrics
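The results tables below report classification accuracy and AUC. A minimal sketch of computing both with scikit-learn follows, assuming a multi-class severity label and macro-averaged one-vs-rest AUC (one common convention; the paper's exact averaging may differ). The toy arrays are purely illustrative.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical outputs: y_true holds class labels, y_prob holds the
# classifier's per-class probabilities, shape (n_samples, n_classes).
y_true = np.array([0, 2, 1, 2, 0, 1])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6],
                   [0.2, 0.6, 0.2],
                   [0.2, 0.2, 0.6],
                   [0.5, 0.4, 0.1],
                   [0.5, 0.3, 0.2]])

acc = accuracy_score(y_true, y_prob.argmax(axis=1))
auc = roc_auc_score(y_true, y_prob, multi_class="ovr", average="macro")
print(f"Accuracy: {100 * acc:.0f}%  AUC: {auc:.2f}")
```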
4.2.6. Training and Testing Databases
- Training database labels: Both the telephone database and the simulated database (obtained with the acoustic modeling explained in Section 4.2.4) were used to train the neural-network-based respiratory distress classifier. The telephone database is denoted Telephone_training_data. The training dataset that resulted from incorporating the acoustic model of the HRI scenario (see Section 4.2.4) is named Simulated_training_data and corresponds to static simulations. When the D&S or MVDR beamforming responses are also included, the resulting training databases are labeled Simulated_training_data + D&S and Simulated_training_data + MVDR, respectively.
- Testing database labels: The testing telephone database is referred to as Telephone_testing_data. For simplicity in presenting the results, the outcomes of experiments with the Static 1 and Static 2 datasets (see Section 3.2) are averaged and reported under a single static label. Accordingly, results with the data re-recorded in the real static HRI scenario are referred to as HRI_static_data; when the D&S or MVDR beamforming responses are also included, the results are labeled HRI_static_data + D&S and HRI_static_data + MVDR, respectively. Similarly, for the dynamic HRI scenario, the corresponding labels are HRI_dynamic_data, HRI_dynamic_data + D&S, and HRI_dynamic_data + MVDR. As with the training data, the testing dataset that resulted from incorporating the acoustic model of the HRI scenario is named Simulated_testing_data and corresponds to static simulations; with the D&S or MVDR beamforming responses included, the resulting testing databases are labeled Simulated_testing_data + D&S and Simulated_testing_data + MVDR, respectively. (The full naming grid is summarized in the sketch below.)
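Since this naming convention drives all of the result tables, a tiny helper makes the grid explicit. The function is hypothetical, but the label strings it produces are exactly those defined above.

```python
# Hypothetical helper: a base corpus name plus an optional beamformer suffix.
def label(base, beamformer=None):
    return base if beamformer is None else f"{base} + {beamformer}"

beamformers = (None, "D&S", "MVDR")
training_sets = (["Telephone_training_data"]
                 + [label("Simulated_training_data", b) for b in beamformers])
testing_sets = (["Telephone_testing_data"]
                + [label(base, b)
                   for base in ("Simulated_testing_data",
                                "HRI_static_data", "HRI_dynamic_data")
                   for b in beamformers])
print(training_sets)
print(testing_sets)
```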
5. Results and Discussion
5.1. Architecture and Hyperparameter Tuning
5.2. Speech Enhancement with Beamforming Methods
5.3. Results with Telephone Training and Real HRI Testing Data
5.4. Results with Simulated Training and Testing Data
5.5. Results with Simulated Training Data and Real HRI Testing Data
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Diagnosis | n | Age, Years (Average ± SD) | Females (n) | Smoking (n) | Pack-Year Index (Average ± SD) | FEV1/FVC (Average ± SD) | FEV1 % Pred (Average ± SD) | FVC % Pred (Average ± SD)
---|---|---|---|---|---|---|---|---
COPD | 43 | 74.8 ± 8.1 | 15 | 43 | 34.11 ± 21.8 | 54.3 ± 11.1 | 65.1 ± 22.9 | 94.7 ± 21.8
PF | 19 | 64.6 ± 8.6 | 9 | 9 | 17.8 ± 14.7 | 88.7 ± 4.3 | 88.9 ± 22.7 | 78.9 ± 20
COVID-19 | 4 | 56.25 ± 8.9 | 3 | 2 | 20 ± 10 | 88.5 ± 2.1 | 95.5 ± 17.7 | 89.5 ± 14.8
Total | 66 | 70.7 ± 10 | 27 | 54 | 31.4 ± 21.4 | 63.8 ± 18.3 | 71.1 ± 24.8 | 90.4 ± 22
Training Data | Testing Data | Accuracy (%) | AUC
---|---|---|---
Telephone_training_data | Telephone_testing_data | 51 | 0.92
Telephone_training_data | HRI_static_data | 38 | 0.82
Telephone_training_data | HRI_static_data + D&S | 42 | 0.84
Telephone_training_data | HRI_static_data + MVDR | 47 | 0.86
Telephone_training_data | HRI_dynamic_data | 40 | 0.84
Telephone_training_data | HRI_dynamic_data + D&S | 43 | 0.85
Telephone_training_data | HRI_dynamic_data + MVDR | 42 | 0.84
Training Data | Testing Data | Accuracy (%) | AUC
---|---|---|---
Simulated_training_data | Simulated_testing_data | 41 | 0.86
Simulated_training_data + D&S | Simulated_testing_data + D&S | 46 | 0.87
Simulated_training_data + MVDR | Simulated_testing_data + MVDR | 45 | 0.91
Training Data | Testing Data | Accuracy (%) | AUC
---|---|---|---
Simulated_training_data | HRI_static_data | 42 | 0.86
Simulated_training_data + D&S | HRI_static_data + D&S | 45 | 0.87
Simulated_training_data + MVDR | HRI_static_data + MVDR | 46 | 0.89
Training Data | Testing Data | Accuracy (%) | AUC
---|---|---|---
Simulated_training_data | HRI_dynamic_data | 43 | 0.86
Simulated_training_data + D&S | HRI_dynamic_data + D&S | 44 | 0.87
Simulated_training_data + MVDR | HRI_dynamic_data + MVDR | 44 | 0.87