Investigation of the Clinical Effectiveness and Prognostic Factors of Voice Therapy in Voice Disorders: A Pilot Study
Abstract
1. Introduction
- Using a gender-specific analysis that reflects the vocal characteristics of men and women, we investigate the relationship between the effectiveness of voice therapy and its predictive factors for each gender.
- We analyze the correlation between acoustic parameters measured before and after treatment in women’s and men’s voices, separately for the effective (+) and non-effective (−) groups.
- New parameters, derived from the characteristics of the acoustic parameters, are proposed and used to predict the effectiveness of voice therapy through binomial logistic regression analysis.
- Multiple experiments with a multilayer perceptron model were conducted to validate the utility of the proposed parameters.
- The results support the usefulness of this system, which predicts the effectiveness of voice therapy by combining gender-analysis-based perceptual modeling with the new parameters (test accuracy of 87.5% for women and 85.71% for men).
2. Materials and Methods
2.1. Materials
2.2. Acoustic Analysis
2.3. Perceptual Analysis
2.4. Statistical Analysis
2.5. Multilayer Perceptron Model
3. Results
3.1. Descriptive Statistics
3.2. Binomial Logistic Regression Analysis
3.3. Multilayer Perceptron Model
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

Characteristic | Female | Male |
---|---|---|
Number of samples | 55 (27 voice users) | 26 (12 voice users) |
Average age (years) | 51 | 48 |
Types of voice disorders (number of patients) | Vocal fold polyp (15), vocal nodule (16), thyroid nodule (2), hoarseness (1), muscle tension dysphonia (5), sulcus vocalis (1), dysphonia (8), presbyphonia (3), vocal cyst (4) | Vocal fold polyp (11), vocal cord paralysis (2), mutational dysphonia (1), vocal nodule (1), vocal cyst (2), leukoplakia (1), dysphonia (2), sulcus vocalis (1), muscle tension dysphonia (2), presbyphonia (2), vallecular cyst (1) |
Number of responsive samples (effectiveness, +) | 41 | 18 |
Variables | Smoking status, alcohol status, voice user status, coffee status, fundamental frequency before and after treatment (F0, Hz), jitter before and after treatment (%), shimmer before and after treatment (%), noise-to-harmonic ratio before and after treatment (NHR, dB), speaking fundamental frequency before and after treatment (SFF, Hz), maximum phonation time before and after treatment (MPT, s) | Same variables as for the female group |

Parameter (pre vs. post treatment) | Women: correlation coefficient | Women: p-value | Men: correlation coefficient | Men: p-value |
---|---|---|---|---|
F0 (Hz) | 0.399 * | 0.256 | 0.587 * | 0.647 |
Jitter (%) | 0.307 ** | <0.001 ** | 0.111 | <0.001 ** |
Shimmer (%) | 0.154 | <0.001 ** | 0.325 | 0.013 ** |
NHR (dB) | −0.015 | 0.004 ** | 0.208 | 0.884 |
SFF (Hz) | 0.792 * | 0.635 | 0.771 * | 0.182 |
MPT (s) | 0.695 * | 0.061 | 0.688 * | 0.004 ** |

Parameter (before vs. after treatment) | Female, effectiveness (+): correlation coefficient | Female, effectiveness (+): p-value | Female, effectiveness (−): correlation coefficient | Female, effectiveness (−): p-value | Male, effectiveness (+): correlation coefficient | Male, effectiveness (+): p-value | Male, effectiveness (−): correlation coefficient | Male, effectiveness (−): p-value |
---|---|---|---|---|---|---|---|---|
F0 (Hz) | 0.484 * | 0.359 | 0.101 | 0.462 | 0.149 | 0.291 | 0.984 * | 0.916 |
Jitter (%) | 0.223 | <0.001 ** | 0.818 * | 0.129 | 0.068 | <0.001 ** | 0.661 | 0.753 |
Shimmer (%) | 0.475 | <0.001 ** | 0.682 * | 0.270 | 0.257 | 0.002 ** | 0.681 | 0.916 |
NHR (dB) | −0.033 | 0.018 ** | 0.013 | 0.108 | 0.330 | 0.338 | −0.485 | 0.114 |
SFF (Hz) | 0.791 * | 0.735 | 0.781 * | 0.678 | 0.634 * | 0.129 | 0.950 * | 0.674 |
MPT (s) | 0.737 * | 0.104 | 0.507 | 0.382 | 0.635 * | 0.002 ** | 0.975 * | 0.916 |

Sex | Dependent variable | Independent variable | B | S.E. | Wald | p-value | Exp(B) | Model |
---|---|---|---|---|---|---|---|---|
Female | Effectiveness | Alcohol | 2.154 | 1.204 | 3.202 | 0.074 | 8.622 | −2 log likelihood = 34.834; Cox and Snell R² = 0.502; Nagelkerke R² = 0.682; p < 0.001 |
Female | Effectiveness | Coffee | 2.794 | 1.243 | 5.049 | 0.025 * | 16.340 | |
Female | Effectiveness | Jitter (Post-tx) | −1.951 | 0.947 | 4.243 | 0.039 * | 0.142 | |
Female | Effectiveness | Jitter comparison | 4.471 | 2.008 | 4.957 | 0.026 * | 87.431 | |
Female | Effectiveness | Shimmer (Pre-tx) | 0.494 | 0.224 | 4.871 | 0.027 * | 1.640 | |
Female | Effectiveness | Constant | −2.155 | 1.914 | 1.267 | 0.260 | 0.116 | |
Male | Effectiveness | Jitter (Pre-tx) | 1.282 | 0.606 | 4.470 | 0.034 * | 3.603 | −2 log likelihood = 21.60; Cox and Snell R² = 0.332; Nagelkerke R² = 0.468; p = 0.005 |
Male | Effectiveness | Jitter (Post-tx) | −2.358 | 1.150 | 4.206 | 0.040 * | 0.095 | |
Male | Effectiveness | MPT (Pre-tx) | −1.202 | 0.533 | 5.089 | 0.024 * | 0.301 | |
Male | Effectiveness | Constant | 0.619 | 1.093 | 0.320 | 0.571 | 1.857 | |
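
The Exp(B) and Wald columns of the regression table above can be recomputed from B and S.E. The following is a minimal sketch, assuming the usual definitions Exp(B) = e^B and Wald = (B / S.E.)²; the function names are illustrative and the input values are copied from the female rows of the table. Small differences from the printed values are expected because B and S.E. are rounded to three decimals.

```python
import math

# (B, S.E.) pairs copied from the female rows of the regression table.
female_coefficients = {
    "Alcohol": (2.154, 1.204),
    "Coffee": (2.794, 1.243),
    "Jitter (Post-tx)": (-1.951, 0.947),
    "Jitter comparison": (4.471, 2.008),
    "Shimmer (Pre-tx)": (0.494, 0.224),
}


def odds_ratio(b):
    """Exp(B): multiplicative change in the odds per unit increase of the predictor."""
    return math.exp(b)


def wald_statistic(b, se):
    """Wald chi-square statistic for a single coefficient (1 degree of freedom)."""
    return (b / se) ** 2


for name, (b, se) in female_coefficients.items():
    print(f"{name}: Exp(B) = {odds_ratio(b):.3f}, Wald = {wald_statistic(b, se):.3f}")
```

An Exp(B) above 1 (for example, shimmer before treatment) means the odds of an effective outcome rise as the predictor increases, while a value below 1 (jitter after treatment) means they fall.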
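
Parameter | Female | Male |
---|---|---|
F0 comparison | [formula image not available] | [formula image not available] |
Jitter comparison | [formula image not available] | [formula image not available] |
Shimmer comparison | [formula image not available] | [formula image not available] |
NHR comparison | [formula image not available] | [formula image not available] |
SFF comparison | Same as F0 comparison | Same as F0 comparison |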

Sex | Dependent variable | Independent variable | B | S.E. | Wald | df | p-value | Exp(B) (odds ratio) | Model |
---|---|---|---|---|---|---|---|---|---|
Female | Effectiveness | Jitter comparison | 2.151 | 1.169 | 3.386 | 1 | 0.066 | 8.596 | −2 log likelihood = 63.417; Cox and Snell R² = 0.162; Nagelkerke R² = 0.220; p = 0.008 |
Female | Effectiveness | NHR comparison | 1.335 | 0.686 | 3.794 | 1 | 0.051 | 3.801 | |
Female | Effectiveness | Constant | −2.424 | 1.228 | 3.897 | 1 | 0.048 | 0.089 | |
Male | Effectiveness | Jitter comparison | 2.303 | 1.378 | 2.790 | 1 | 0.095 | 10.000 | −2 log likelihood = 26.769; Cox and Snell R² = 0.185; Nagelkerke R² = 0.261; p = 0.07 |
Male | Effectiveness | NHR comparison | 2.015 | 1.111 | 3.292 | 1 | 0.070 | 7.500 | |
Male | Effectiveness | Constant | −2.708 | 1.653 | 2.683 | 1 | 0.101 | 0.067 | |
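
To illustrate how a fitted binomial logistic model turns predictor values into a probability of an effective outcome, the sketch below plugs the female coefficients from the table above into the logistic function. It assumes, purely for illustration, that the jitter and NHR comparison parameters are coded as 0 or 1; the actual coding is not recoverable from this text, and the function name is hypothetical.

```python
import math

# Coefficients copied from the female rows of the table above.
CONSTANT = -2.424
B_JITTER_COMPARISON = 2.151
B_NHR_COMPARISON = 1.335


def predicted_probability(jitter_comparison, nhr_comparison):
    """Probability of effectiveness (+) under the logistic model.

    ASSUMPTION: both predictors are coded 0/1; this coding is hypothetical.
    """
    logit = (CONSTANT
             + B_JITTER_COMPARISON * jitter_comparison
             + B_NHR_COMPARISON * nhr_comparison)
    return 1.0 / (1.0 + math.exp(-logit))


# Enumerate the four possible combinations of the two assumed binary predictors.
for jitter in (0, 1):
    for nhr in (0, 1):
        p = predicted_probability(jitter, nhr)
        print(f"jitter comparison={jitter}, NHR comparison={nhr}: p = {p:.3f}")
```

Under this assumed coding, the predicted probability is lowest when both indicators are 0 and highest when both are 1, since both coefficients are positive.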

Layer | Property | Female | Male |
---|---|---|---|
Input layer | Input factors | Coffee status, jitter (Post-tx), shimmer (Pre-tx), jitter comparison | Jitter (Pre-tx), jitter (Post-tx), MPT (Pre-tx) |
Input layer | Number of units | 6 | 3 |
Hidden layer | Number of hidden layers | 1 | 1 |
Hidden layer | Number of units | 2 | 1 |
Hidden layer | Activation function | Hyperbolic tangent | Hyperbolic tangent |
Output layer | Dependent variable | Effectiveness | Effectiveness |
Output layer | Number of units | 2 | 2 |
Output layer | Rescaling of scale-dependent variables | Standardized | Standardized |
Output layer | Activation function | Softmax | Softmax |
Output layer | Error function | Cross entropy | Cross entropy |
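
The network structure listed above (one hidden layer with hyperbolic tangent activation, a two-unit softmax output, and a cross-entropy error function) can be sketched as a plain forward pass. This is only an illustration of the male configuration (3 inputs, 1 hidden unit, 2 outputs): all weights, biases, and input values below are hypothetical, not the trained values, and the inputs are assumed to be standardized already.

```python
import math


def tanh_layer(inputs, weights, biases):
    """Hidden layer: weighted sum of the inputs followed by hyperbolic tangent."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Output activation: converts raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Full forward pass: inputs -> tanh hidden layer -> softmax output."""
    hidden = tanh_layer(inputs, w_hidden, b_hidden)
    scores = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(scores)


def cross_entropy(probabilities, true_index):
    """Error function: negative log-probability assigned to the true class."""
    return -math.log(probabilities[true_index])


# HYPOTHETICAL parameters for a 3-1-2 network (male configuration).
w_hidden = [[0.5, -0.8, -0.6]]   # one hidden unit, three inputs
b_hidden = [0.1]
w_out = [[1.2], [-1.2]]          # two output units: effectiveness (+), (−)
b_out = [0.0, 0.0]

# HYPOTHETICAL standardized inputs: jitter (Pre-tx), jitter (Post-tx), MPT (Pre-tx).
x = [0.3, -0.5, -0.2]

probs = forward(x, w_hidden, b_hidden, w_out, b_out)
print("P(effectiveness +), P(effectiveness −):", probs)
print("Cross-entropy if the true class is (+):", cross_entropy(probs, 0))
```

The predicted class is the output unit with the larger probability; training adjusts the weights to reduce the cross-entropy summed over the training samples.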

Sex | Predicted | Reference: effectiveness (+) | Reference: effectiveness (−) | Total |
---|---|---|---|---|
Female | Effectiveness (+) | 11 | 1 | 12 |
Female | Effectiveness (−) | 1 | 3 | 4 |
Female | Total | 12 | 4 | 16 |
Male | Effectiveness (+) | 4 | 1 | 5 |
Male | Effectiveness (−) | 0 | 2 | 2 |
Male | Total | 4 | 3 | 7 |

Performance metric | Female | Male |
---|---|---|
Accuracy (%) | 87.50 | 85.71 |
Precision | 0.92 | 1.00 |
Specificity | 0.75 | 1.00 |
Recall | 0.92 | 0.67 |
G value | 0.83 | 0.82 |
F score | 0.92 | 0.80 |
AUC | 0.853 | 0.861 |
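
The metrics above, except AUC, can be recomputed from the confusion-matrix cells. The sketch below assumes the standard definitions and takes the G value to be the geometric mean of recall and specificity, which reproduces the reported figures. The female values are reproduced with effectiveness (+) as the positive class, whereas the male values appear to correspond to treating effectiveness (−) as the positive class; that reading is inferred from the numbers rather than stated in the source.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from the four cells of a 2x2 confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        # ASSUMPTION: G value = geometric mean of recall and specificity.
        "g_value": math.sqrt(recall * specificity),
        "f_score": 2 * precision * recall / (precision + recall),
    }


# Female test set, positive class = effectiveness (+).
female = classification_metrics(tp=11, fp=1, fn=1, tn=3)

# Male test set, positive class = effectiveness (−) (inferred reading):
# predicted (−) & reference (−) = 2, predicted (−) & reference (+) = 0,
# predicted (+) & reference (−) = 1, predicted (+) & reference (+) = 4.
male = classification_metrics(tp=2, fp=0, fn=1, tn=4)

for label, result in (("Female", female), ("Male", male)):
    print(label, {name: round(value, 4) for name, value in result.items()})
```

With effectiveness (+) as the positive class for the male test set, the same cells would instead give a precision of 4/5 and a recall of 4/4, so the choice of positive class matters when comparing the two columns.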

Female: input variable | Female: importance | Male: input variable | Male: importance |
---|---|---|---|
Jitter (Post-tx) | 0.363 | MPT (Pre-tx) | 0.395 |
Shimmer (Pre-tx) | 0.260 | Jitter (Post-tx) | 0.361 |
Coffee-drinking status | 0.226 | Jitter (Pre-tx) | 0.244 |
Jitter comparison | 0.151 | | |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, J.-Y.; Park, J.-H.; Lee, J.-N.; Jung, A.-R. Investigation of the Clinical Effectiveness and Prognostic Factors of Voice Therapy in Voice Disorders: A Pilot Study. Appl. Sci. 2023, 13, 11523. https://doi.org/10.3390/app132011523