Respiratory Sound Classification by Applying Deep Neural Network with a Blocking Variable
Abstract
1. Introduction
- A simplified loss function that enables the network to handle a four-class classification with only two outputs.
- Mix-up data augmentation within the clusters is proposed to address the imbalanced data problem.
- A two-stage training process with the blocking variable is developed to address the problem of data that are not independent and identically distributed (non-IID).
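The first contribution in the list above can be illustrated with a minimal sketch. It assumes, which the list itself does not state, that the two outputs are independent sigmoid units signalling "crackle present" and "wheeze present", so that normal, crackle, wheeze, and both map to the bit pairs (0, 0), (1, 0), (0, 1), and (1, 1), and that the loss is the summed binary cross-entropy of the two outputs. The names `encode`, `decode`, and `two_output_loss` are hypothetical and are not taken from the paper.

```python
import numpy as np

# Assumed encoding (not taken from the paper):
# output 0 = crackle present, output 1 = wheeze present.
CLASSES = ["normal", "crackle", "wheeze", "both"]
TARGETS = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)

def encode(labels):
    """Map class indices (0..3) to two binary targets."""
    return TARGETS[np.asarray(labels)]

def two_output_loss(logits, labels, eps=1e-12):
    """Mean over samples of the summed binary cross-entropy of both outputs."""
    p = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    t = encode(labels)
    bce = -(t * np.log(p + eps) + (1.0 - t) * np.log(1.0 - p + eps))
    return bce.sum(axis=1).mean()

def decode(logits, threshold=0.0):
    """Threshold each logit and map the bit pair back to a class index."""
    bits = (np.asarray(logits) > threshold).astype(int)
    return bits[:, 0] + 2 * bits[:, 1]

# Four confident predictions, one per class.
logits = np.array([[-3.0, -3.0], [4.0, -2.0], [-1.0, 5.0], [2.0, 2.0]])
print([CLASSES[i] for i in decode(logits)])  # ['normal', 'crackle', 'wheeze', 'both']
print(two_output_loss(logits, [0, 1, 2, 3]))
```

Under this assumed encoding the "both" class needs no output of its own, because it is simply the case in which both units fire.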
2. Pre-Processing the Sound
2.1. ICBHI 2017 Data
2.2. Short-Time Fourier Transform
2.3. Wavelet Transform
2.4. Data Augmentation within the Clusters
3. The Architecture of the Network
3.1. The Block-I, Block-II, and the Self-Attention Block
3.2. Two Stage Training with the Blocking Variable
3.3. The Loss Function
4. Experimental Results
4.1. The Evaluation Method and Implementation Device
4.2. Accuracy and Confusion Matrix
4.3. Discussion
5. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Mangione, S.; Nieman, L.Z. Pulmonary auscultatory skills during training in internal medicine and family practice. Am. J. Respir. Crit. Care Med. 1999, 159, 1119–1124.
- Xie, J.; Tong, Z.; Guan, X.; Du, B.; Qiu, H.; Slutsky, A.S. Critical care crisis and some recommendations during the COVID-19 epidemic in China. Intensive Care Med. 2020, 46, 837–840.
- Adeloye, D.; Elneima, O.; Daines, L.; Poinasamy, K.; Quint, J.K.; Walker, S.; Brightling, C.E.; Siddiqui, S.; Hurst, J.R.; Chalmers, J.D.; et al. The long-term sequelae of COVID-19: An international consensus on research priorities for patients with pre-existing and new-onset airways disease. Lancet Respir. Med. 2021, 9, 1467–1478.
- Nagasaka, Y. Lung sounds in bronchial asthma. Allergol. Int. 2012, 61, 353–363.
- Weiss, E.B.; Carlson, C.J. Recording of breath sounds. Am. Rev. Respir. Dis. 1972, 105, 835–839.
- Vyshedskiy, A.; Alhashem, R.M.; Paciej, R.; Ebril, M.; Rudman, I.; Fredberg, J.J.; Murphy, R. Mechanism of inspiratory and expiratory crackles. Chest 2009, 135, 156–164.
- Munakata, M.; Ukita, H.; Doi, I.; Ohtsuka, Y.; Masaki, Y.; Homma, Y.; Kawakami, Y. Spectral and waveform characteristics of fine and coarse crackles. Thorax 1991, 46, 651–657.
- Rocha, B.M.; Filos, D.; Mendes, L.; Serbes, G.; Ulukaya, S.; Kahya, Y.P.; Jakovljevic, N.; Turukalo, T.L.; Vogiatzis, I.M.; Perantoni, E.; et al. An open access database for the evaluation of respiratory sound classification algorithms. Physiol. Meas. 2019, 40, 035001.
- Jakovljević, N.; Lončar-Turukalo, T. Hidden Markov model based respiratory sound classification. In Precision Medicine Powered by pHealth and Connected Health, Proceedings of the ICBHI 2017, Thessaloniki, Greece, 18–21 November 2017; Springer: Singapore, 2017; pp. 39–43.
- Chambres, G.; Hanna, P.; Desainte-Catherine, M. Automatic detection of patient with respiratory diseases using lung sound analysis. In Proceedings of the 2018 International Conference on Content-Based Multimedia Indexing (CBMI), La Rochelle, France, 4–6 September 2018; pp. 1–6.
- Serbes, G.; Ulukaya, S.; Kahya, Y.P. An automated lung sound preprocessing and classification system based on spectral analysis methods. In Precision Medicine Powered by pHealth and Connected Health, Proceedings of the ICBHI 2017, Thessaloniki, Greece, 18–21 November 2017; Springer: Singapore, 2017; pp. 45–49.
- Kochetov, K.; Putin, E.; Balashov, M.; Filchenkov, A.; Shalyto, A. Noise masking recurrent neural network for respiratory sound classification. In Artificial Neural Networks and Machine Learning—ICANN 2018, Proceedings of the 27th International Conference on Artificial Neural Networks, Rhodes, Greece, 4–7 October 2018; Springer: Singapore, 2018; pp. 208–217.
- Acharya, J.; Basu, A. Deep neural network for respiratory sound classification in wearable devices enabled by patient specific model tuning. IEEE Trans. Biomed. Circuits Syst. 2020, 14, 535–544.
- Ma, Y.; Xu, X.; Yu, Q.; Zhang, Y.; Li, Y.; Zhao, J.; Wang, G. LungBRN: A smart digital stethoscope for detecting respiratory disease using bi-resnet deep learning algorithm. In Proceedings of the 2019 IEEE Biomedical Circuits and Systems Conference (BioCAS), Nara, Japan, 17–19 October 2019; pp. 1–4.
- Ma, Y.; Xu, X.; Li, Y. LungRN+ NL: An Improved Adventitious Lung Sound Classification Using Non-Local Block ResNet Neural Network with Mixup Data Augmentation. In Proceedings of the Interspeech, Shanghai, China, 25–29 October 2020; pp. 2902–2906.
- Gairola, S.; Tom, F.; Kwatra, N.; Jain, M. Respirenet: A deep neural network for accurately detecting abnormal lung sounds in limited data setting. In Proceedings of the 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Virtual, 1–5 November 2021; pp. 527–530.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 12 June 2015; pp. 1–9.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
- Bohadana, A.; Izbicki, G.; Kraman, S.S. Fundamentals of lung auscultation. N. Engl. J. Med. 2014, 370, 744–751.
- Bracewell, R.N. The Fourier Transform and Its Applications; McGraw-Hill: New York, NY, USA, 1986; Volume 31999.
- Bentley, P.M.; McDonnell, J. Wavelet transforms: An introduction. Electron. Commun. Eng. J. 1994, 6, 175–186.
- Zhang, H.; Cisse, M.; Dauphin, Y.; Lopez-Paz, D. mixup: Beyond empirical risk minimization. In Proceedings of the 6th International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–13.
- Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics; JMLR Workshop and Conference Proceedings. Mlr Press: Mishawaka, IN, USA, 2011; pp. 315–323.
- Wu, Y.; He, K. Group normalization. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
Class | Cluster1 Before | Cluster1 After | Cluster2 Before | Cluster2 After | Cluster3 Before | Cluster3 After | Cluster4 Before | Cluster4 After | Cluster5 Before | Cluster5 After | Cluster6 Before | Cluster6 After | Cluster7 Before | Cluster7 After | Total Before | Total After |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Normal | 530 | 530 | 580 | 580 | 566 | 566 | 333 | 333 | 427 | 427 | 202 | 202 | 271 | 271 | 2910 | 2910 |
Wheeze | 89 | 178 | 225 | 450 | 222 | 444 | 308 | 616 | 202 | 404 | 188 | 376 | 264 | 528 | 1498 | 2996 |
Crackle | 99 | 297 | 114 | 342 | 150 | 450 | 106 | 318 | 100 | 300 | 51 | 153 | 75 | 225 | 695 | 2085 |
Both | 20 | 20 | 87 | 87 | 65 | 65 | 69 | 69 | 80 | 80 | 38 | 38 | 56 | 56 | 415 | 415 |
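The counts above show the wheeze samples doubled and the crackle samples tripled in every cluster, while the normal and both classes are unchanged. A minimal sketch of mix-up restricted to one class inside one cluster follows. It assumes that each synthetic sample is a convex combination of two samples drawn from the same class and the same cluster, with a Beta-distributed weight as in standard mixup (Zhang et al.). The function name `mixup_within_cluster`, the value of `alpha`, the random pairing rule, and the placeholder spectrogram shape are all hypothetical.

```python
import numpy as np

def mixup_within_cluster(features, factor, alpha=0.4, seed=0):
    """Grow one class of one cluster to `factor` times its size.

    features: array of shape (n, ...) whose samples share a class and a
    cluster. The originals are kept and (factor - 1) * n synthetic samples
    are appended. `alpha` and the pairing rule are assumptions.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(features, dtype=float)
    n = len(x)
    n_new = (factor - 1) * n
    i = rng.integers(0, n, size=n_new)   # first partner of each pair
    j = rng.integers(0, n, size=n_new)   # second partner of each pair
    lam = rng.beta(alpha, alpha, size=n_new)
    lam = lam.reshape((n_new,) + (1,) * (x.ndim - 1))  # broadcast over feature axes
    synthetic = lam * x[i] + (1.0 - lam) * x[j]
    return np.concatenate([x, synthetic], axis=0)

# Placeholder data standing in for the 89 wheeze and 99 crackle samples of Cluster1.
wheeze_c1 = np.random.default_rng(1).normal(size=(89, 64, 32))
crackle_c1 = np.random.default_rng(2).normal(size=(99, 64, 32))
print(mixup_within_cluster(wheeze_c1, factor=2).shape)   # (178, 64, 32)
print(mixup_within_cluster(crackle_c1, factor=3).shape)  # (297, 64, 32)
```

Because both partners carry the same label in this sketch, the synthetic samples inherit that label unchanged and no soft labels are needed.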
Split | Method | Sensitivity | Specificity | Score |
---|---|---|---|---|
Official Split | Jakovljevic et al. [9] | - | - | 39.56% |
Official Split | Chambres et al. [10] | 20.81% | 78.05% | 49.43% |
Official Split | Serbes et al. [11] | - | - | 49.86% |
Official Split | Ma et al. [14] | 31.12% | 69.20% | 50.16% |
Official Split | Ma et al. [15] | 41.32% | 63.20% | 52.26% |
Official Split | Gairola et al. [16] (Pre-trained) | 40.1% | 72.3% | 56.2% |
Official Split | Proposed model stage 1 | 42.88% | 60.05% | 51.47% |
Official Split | Proposed model stage 2 with blocking | 42.63% | 61.33% | 51.98% |
8:2 Split | Kochetov et al. [12] | 58.43% | 73.00% | 65.70% |
8:2 Split | Acharya et al. [13] | 48.63% | 84.14% | 66.38% |
8:2 Split | Ma et al. [15] | 63.69% | 64.73% | 64.21% |
8:2 Split | Gairola et al. [16] (Pre-trained) | 53.7% | 83.3% | 68.5% |
8:2 Split | Our proposed model stage 1 | 63.99% | 78.72% | 71.35% |
8:2 Split | Our proposed model stage 2 with blocking | 66.31% | 79.13% | 72.72% |
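Reading the two columns before Score as sensitivity and specificity, every complete row of the table above is consistent with the score being their arithmetic mean, which is how the ICBHI 2017 challenge defines its score. The short check below recomputes the score for the rows of the proposed model; the function name `icbhi_score` is hypothetical, and the numbers are copied from the table.

```python
def icbhi_score(sensitivity, specificity):
    """Arithmetic mean of sensitivity and specificity, in the same units."""
    return (sensitivity + specificity) / 2.0

# (sensitivity %, specificity %, reported score %) copied from the table above.
rows = {
    "Official split, stage 1": (42.88, 60.05, 51.47),
    "Official split, stage 2 with blocking": (42.63, 61.33, 51.98),
    "8:2 split, stage 1": (63.99, 78.72, 71.35),
    "8:2 split, stage 2 with blocking": (66.31, 79.13, 72.72),
}

for name, (se, sp, reported) in rows.items():
    computed = icbhi_score(se, sp)
    print(f"{name}: computed {computed:.3f}, reported {reported:.2f}")
```

Small differences in the last digit come from rounding the published percentages to two decimals.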
Method | Sensitivity | Specificity | Score |
---|---|---|---|
Our proposed method | 63.99% | 78.72% | 71.35% |
Without fixed window size | 61.10% | 79.94% | 70.52% |
Without multiple kernels | 62.19% | 78.09% | 70.14% |
Without wavelet transform | 58.21% | 81.13% | 69.67% |
Without Self-attention | 59.01% | 79.61% | 69.31% |
Four-output loss function | 59.51% | 80.31% | 69.91% |
Four Block-I layers | 59.75% | 81.51% | 70.63% |
Six Block-I layers | 55.52% | 78.59% | 67.06% |
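To make the ablation table above easier to read, the following sketch ranks the variants by how far their score falls below the first row. It is plain arithmetic on the published numbers; the function name `score_drops` is hypothetical, and treating the first row as the baseline is an assumption about how the table is meant to be read.

```python
def score_drops(baseline, variants):
    """Return (name, baseline - score) pairs, largest drop first."""
    drops = [(name, round(baseline - score, 2)) for name, score in variants.items()]
    return sorted(drops, key=lambda pair: pair[1], reverse=True)

baseline_score = 71.35  # "Our proposed method" row
ablation_scores = {
    "Without fixed window size": 70.52,
    "Without multiple kernels": 70.14,
    "Without wavelet transform": 69.67,
    "Without Self-attention": 69.31,
    "Four-output loss function": 69.91,
    "Four Block-I layers": 70.63,
    "Six Block-I layers": 67.06,
}

for name, drop in score_drops(baseline_score, ablation_scores):
    print(f"{name}: -{drop:.2f} points")
```

On these numbers the largest drop belongs to the six Block-I layer variant and the smallest to the four Block-I layer variant.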
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, R.; Lv, K.; Huang, Y.; Sun, M.; Li, J.; Yang, J. Respiratory Sound Classification by Applying Deep Neural Network with a Blocking Variable. Appl. Sci. 2023, 13, 6956. https://doi.org/10.3390/app13126956