Development and Comparison of Machine Learning and Deep Learning Models for Speech Audiometry Prediction
Abstract
1. Introduction
- This study proposes deep learning and machine learning multi-class classification models that predict speech audiometry results from pure-tone audiometry data, and compares the performance of multilayer perceptron (MLP), recurrent neural network (RNN), XGBoost, and gradient boosting models.
- Deep learning and machine learning models effectively learn the nonlinear patterns of pure-tone audiometry data.
- The performance of each deep learning and machine learning model was evaluated using accuracy, loss, and a confusion matrix. The results indicate that the recurrent neural network model slightly outperformed the multilayer perceptron model, and the gradient boosting model demonstrated superior performance to XGBoost.
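The evaluation metrics named above can be illustrated with a minimal sketch in plain Python. The function names, the toy labels, and the choice of macro averaging for the F1 score are assumptions made for illustration; the averaging scheme and tooling actually used in the study are not stated here.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions; rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def macro_f1(matrix):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(matrix[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Toy labels for three hypothetical classes (not data from the study).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
print(cm, accuracy(cm), macro_f1(cm))
```

Deriving both accuracy and F1 from the same confusion matrix keeps the three reported quantities consistent with one another.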
2. Related Work
2.1. Speech Audiometry
2.2. Pure-Tone Audiometry
3. Materials and Methods
3.1. MLP
MLP Model and Methodology in This Paper
3.2. RNN
RNN Model and Methodology in This Paper
3.3. Gradient Boosting
Gradient Boosting Model and Methodology in This Paper
3.4. XGBoost
XGBoost Model and Methodology in This Paper
4. Results
4.1. Data Collection and Segmentation
4.2. Experimental Setup
4.3. MLP and RNN Accuracy, F1 Score, Loss
MLP and RNN Confusion Matrix
4.4. Gradient Boosting and XGBoost Accuracy, F1 Score, Loss
Gradient Boosting and XGBoost Confusion Matrix
4.5. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Description |
---|---|
PTA | pure-tone audiometry |
SRT | speech recognition threshold |
SDT | speech discrimination testing |
SDS | speech discrimination score |
Val_loss | validation loss |
LSTM | long short-term memory |
GRU | gated recurrent unit |
BPTT | backpropagation through time |
MSE | mean squared error |
MAE | mean absolute error |
AUC | area under the curve |
OvR | one vs. rest |
MLP | multilayer perceptron |
RNN | recurrent neural network |
XGBoost | extreme gradient boosting |
AC | air conduction |
BC | bone conduction |
dB | decibel |
Hz | hertz |
TP | true positive |
FP | false positive |
FN | false negative |
Parameter | Value |
---|---|
Batch size | 32 |
Momentum | 0.9 |
Weight decay | L2 0.01 |
Epochs | 300 |
Learning rate | 0.001 |
Optimizer | Adam |
Workers | 1 |
Parameter | Value |
---|---|
Batch size | 128 |
Weight decay | L2 0.001 |
Epochs | 300 |
Learning rate | 0.0005 |
Optimizer | Adam |
Workers | 1 |
Age | SA | AC250 | AC500 | AC1000 | AC2000 | AC4000 | BC250 | BC500 | BC1000 | BC2000 | BC4000 |
---|---|---|---|---|---|---|---|---|---|---|---|
68 | 1.0 | 30.0 | 30.0 | 40.0 | 45.0 | 55.0 | 20.0 | 35.0 | 40.0 | 40.0 | 50.0 |
51 | 2.0 | 50.0 | 60.0 | 65.0 | 85.0 | 95.0 | 40.0 | 60.0 | 70.0 | 70.0 | 70.0 |
55 | 0.0 | 15.0 | 20.0 | 35.0 | 50.0 | 40.0 | 15.0 | 20.0 | 35.0 | 50.0 | 35.0 |
55 | 0.0 | 15.0 | 20.0 | 35.0 | 50.0 | 40.0 | 15.0 | 15.0 | 35.0 | 50.0 | 35.0 |
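As a rough illustration of how one row of the sample table above might be turned into model input, the sketch below parses a pipe-delimited row into an 11-value feature vector (age, five air-conduction thresholds, five bone-conduction thresholds) and an integer class label taken from the SA column. The function name `parse_row`, the feature ordering, and the treatment of the seventh column as the 4000 Hz air-conduction threshold are assumptions, not the authors' preprocessing.

```python
FREQS_HZ = (250, 500, 1000, 2000, 4000)  # test frequencies shown in the table header


def parse_row(line):
    """Parse one pipe-delimited table row into (features, label).

    Assumed column order: Age, SA, five AC thresholds, five BC thresholds.
    """
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    values = [float(c) for c in cells]
    expected = 2 + 2 * len(FREQS_HZ)
    if len(values) != expected:
        raise ValueError(f"expected {expected} columns, got {len(values)}")
    age = values[0]
    label = int(values[1])   # SA column used as the class label (assumption)
    ac = values[2:7]         # air-conduction thresholds in dB
    bc = values[7:12]        # bone-conduction thresholds in dB
    return [age] + ac + bc, label


row = "68 | 1.0 | 30.0 | 30.0 | 40.0 | 45.0 | 55.0 | 20.0 | 35.0 | 40.0 | 40.0 | 50.0 |"
features, label = parse_row(row)
print(len(features), label)
```

Keeping the label separate from the thresholds makes it harder to leak the target into the feature vector by accident.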
Software/Parameters | Value |
---|---|
Operating system | Windows 10 (64-bit) |
Language and frameworks | Python (Keras, TensorFlow) |
CPU | Intel(R) Core(TM) i9-14900K |
GPU | Nvidia GeForce RTX 4090 |
RAM | 128 GB |
Batch size | 16 |
Validation split | 0.2 |
Test split | 0.2 |
Optimizer | Adam |
Learning rate | 0.001 |
Loss function | Huber |
Epochs | 300 |
Dataset size | 12,972 |
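The split fractions in the table above can be turned into concrete subset sizes with a short sketch. It assumes that the 0.2 test fraction is held out first and that the 0.2 validation fraction is then taken from the remaining data; the table does not say whether this is the case or whether both fractions refer to the full dataset. The helper names and the seed are hypothetical.

```python
import random


def split_sizes(n_samples, test_fraction=0.2, val_fraction=0.2):
    """Return (n_train, n_val, n_test); validation is carved from the non-test part."""
    n_test = int(n_samples * test_fraction)
    n_remaining = n_samples - n_test
    n_val = int(n_remaining * val_fraction)
    return n_remaining - n_val, n_val, n_test


def split_indices(n_samples, seed=0, test_fraction=0.2, val_fraction=0.2):
    """Shuffle sample indices reproducibly and cut them into three disjoint lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train, n_val, _ = split_sizes(n_samples, test_fraction, val_fraction)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


n_train, n_val, n_test = split_sizes(12972)
train_idx, val_idx, test_idx = split_indices(12972)
print(n_train, n_val, n_test)
```

Under this assumption the 12,972 samples divide into 8303 training, 2075 validation, and 2594 test samples; applying both fractions to the full dataset instead would give a 60/20/20 split.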
Method | Accuracy | Loss | F1 Score | Time |
---|---|---|---|---|
MLP | 85.77% | 0.3181 | 0.8596 | 91 s |
RNN | 85.41% | 0.3796 | 0.8548 | 102 s |
Gradient Boosting | 86.22% | 0.3083 | 0.8635 | 297 s |
XGBoost | 86.04% | 0.3056 | 0.8619 | 34 s |
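The differences in the results table above are small, so it can help to read them side by side. The following sketch copies the reported values into a dictionary and ranks the models by a chosen column; the dictionary layout and helper name are assumptions, and the Time column is treated simply as a duration in seconds, since the table does not state what was timed.

```python
# Values copied from the results table (accuracy in percent, time in seconds).
RESULTS = {
    "MLP": {"accuracy": 85.77, "loss": 0.3181, "f1": 0.8596, "time_s": 91},
    "RNN": {"accuracy": 85.41, "loss": 0.3796, "f1": 0.8548, "time_s": 102},
    "Gradient Boosting": {"accuracy": 86.22, "loss": 0.3083, "f1": 0.8635, "time_s": 297},
    "XGBoost": {"accuracy": 86.04, "loss": 0.3056, "f1": 0.8619, "time_s": 34},
}


def rank_by(metric, higher_is_better=True):
    """Return model names ordered from best to worst on the given metric."""
    return sorted(RESULTS, key=lambda m: RESULTS[m][metric], reverse=higher_is_better)


by_accuracy = rank_by("accuracy")
by_time = rank_by("time_s", higher_is_better=False)

# Gap between the best and worst accuracy, in percentage points.
accuracy_spread = RESULTS[by_accuracy[0]]["accuracy"] - RESULTS[by_accuracy[-1]]["accuracy"]

# How many times longer gradient boosting took than XGBoost.
time_ratio = RESULTS["Gradient Boosting"]["time_s"] / RESULTS["XGBoost"]["time_s"]

print(by_accuracy, by_time, round(accuracy_spread, 2), round(time_ratio, 2))
```

Read this way, the four models lie within about 0.8 percentage points of one another in accuracy, while the reported times differ by almost a factor of nine between gradient boosting and XGBoost.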
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shin, J.s.; Ma, J.; Makara, M.; Sung, N.-J.; Choi, S.J.; Kim, S.y.; Hong, M. Development and Comparison of Machine Learning and Deep Learning Models for Speech Audiometry Prediction. Appl. Sci. 2025, 15, 3071. https://doi.org/10.3390/app15063071