Speech Enhancement Based on Enhanced Empirical Wavelet Transform and Teager Energy Operator
Abstract
1. Introduction
2. Proposed Approach
- Determination of the Teager Energy Operator [36];
- Mask construction based on ;
- Mask processing ;
- Determination of the time–space adaptation of thresholds based on ;
- Realization of soft thresholding considering for the component signal (a minimal sketch of this pipeline follows the list).
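A minimal Python sketch of this pipeline, assuming hypothetical helper callables (`eewt_decompose`, `build_mask`, `process_mask`, `adapt_threshold`) in place of the formulas of Sections 2.1 and 2.3 through 2.5; only the Teager energy operator [36] and the classic soft-thresholding rule are written out explicitly, so this illustrates the order of operations rather than the authors' exact implementation.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy operator (Kaiser [36]): psi[n] = x[n]**2 - x[n-1]*x[n+1]."""
    psi = np.zeros_like(x, dtype=float)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

def soft_threshold(c, thr):
    """Classic soft thresholding: shrink samples toward zero by thr (scalar or per-sample)."""
    return np.sign(c) * np.maximum(np.abs(c) - thr, 0.0)

def enhance(noisy, fs, eewt_decompose, build_mask, process_mask, adapt_threshold):
    """Wire the five steps together; the four callables are hypothetical stand-ins
    for the EEWT decomposition, mask rules, and threshold adaptation of the paper."""
    enhanced = np.zeros(len(noisy))
    for comp in eewt_decompose(noisy, fs):     # component signals from the EEWT (Section 2.1)
        psi = teager_energy(comp)              # Step 1: Teager energy of the component
        mask = process_mask(build_mask(psi))   # Steps 2-3: mask construction and processing
        thr = adapt_threshold(comp, mask)      # Step 4: time-space adaptive threshold
        enhanced += soft_threshold(comp, thr)  # Step 5: soft thresholding of the component
    return enhanced                            # components are recombined in the final processing
```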
2.1. Enhanced Empirical Wavelet Transform (EEWT)
- The use of fast Fourier transform (FFT) to determine the spectrum of the analyzed signal.
- The calculation of the upper envelope of the signal spectrum using the l-th order statistics filter (OSF). In the enhanced method [35] (in contrast to the conventional empirical wavelet transform (EWT) [39]), this envelope is used to identify the trend of spectrum variation. The filter order l is determined according to the relationship defined in the enhanced method [35] (a simplified sketch of these steps is given after this list).
- The determination of the spectrum frequency peaks from the designated envelope and the selection of useful ones based on the following criteria: (a) the width of a flat top cannot be shorter than the order statistics filter size; (b) the most representative flat top among neighboring ones is picked out; (c) useful flat tops do not appear in the downtrend of the analyzed signal spectrum.
- The calculation of the spectrum segmentation boundaries based on the flat tops obtained in Step 3.
- The construction of the empirical scaling function and empirical wavelet as in the EWT method [39], and the decomposition of the analyzed signal into component signals.
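The order-selection relationship and the full flat-top screening criteria belong to the enhanced EWT of [35] and are not restated above, so the sketch below is only a simplified illustration of Steps 1 through 4: the upper envelope is taken as a running-maximum (order statistics) filter of an assumed size `osf_size`, only criterion (a) is enforced via SciPy's plateau-width test, and the boundaries are placed midway between consecutive retained peaks, which is one common EWT choice rather than necessarily the rule used in [35].

```python
import numpy as np
from scipy.ndimage import maximum_filter1d
from scipy.signal import find_peaks

def eewt_segmentation(x, osf_size):
    """Simplified sketch of Steps 1-4; osf_size stands for the OSF order l,
    whose selection rule is not reproduced here."""
    spectrum = np.abs(np.fft.rfft(x))                       # Step 1: FFT magnitude spectrum
    envelope = maximum_filter1d(spectrum, size=osf_size)    # Step 2: upper envelope via a max (OSF) filter
    peaks, _ = find_peaks(envelope, plateau_size=osf_size)  # Step 3: flat tops at least osf_size bins wide
    boundaries = (peaks[:-1] + peaks[1:]) // 2              # Step 4: boundaries between retained peaks
    return peaks, boundaries
```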
2.2. Teager Energy Operator (TEO)
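For reference, the discrete-time Teager energy operator introduced by Kaiser [36] and used in the first step of the pipeline is commonly written as

$$\Psi\{x(n)\} = x^{2}(n) - x(n-1)\,x(n+1),$$

which for a narrowband component approximately tracks $A^{2}(n)\,\omega^{2}(n)$, the squared product of the instantaneous amplitude and frequency of the component signal.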
2.3. Mask Construction
2.4. Mask Processing
2.5. Thresholding
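As a generic point of reference only, the sketch below applies plain soft thresholding with the global universal threshold of Donoho and Johnstone and a MAD noise estimate, both standard wavelet-shrinkage choices; the method described in this paper instead adapts its thresholds in time and space, so this baseline is not the authors' rule.

```python
import numpy as np

def universal_soft_threshold(comp):
    """Baseline only: global universal threshold with a MAD noise estimate,
    followed by soft shrinkage of one component signal; the paper adapts the
    threshold in time and space instead of using a single global value."""
    mad = np.median(np.abs(comp - np.median(comp)))       # robust estimate of the spread
    sigma = mad / 0.6745                                  # noise standard deviation under a Gaussian model
    thr = sigma * np.sqrt(2.0 * np.log(len(comp)))        # universal threshold sigma * sqrt(2 ln N)
    return np.sign(comp) * np.maximum(np.abs(comp) - thr, 0.0)  # soft thresholding
```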
2.6. Final Processing
3. Materials
4. Results and Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Recording History. Available online: http://www.recording-history.org (accessed on 1 January 2022).
- Lindos Electronics. Available online: http://www.lindos.co.uk/ (accessed on 1 January 2022).
- Shapley, G.J. Sound of Failure: Experimental Electronic Music in Our Post–Digital Era; University of Technology: Sydney, Australia, 2012. [Google Scholar]
- Yang, Y.; SooCho, J.; Lee, B.; Kim, S. A Sound Activity Detector Embedded Low-Power MEMS Microphone Readout Interface for Speech Recognition. In Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), Lausanne, Switzerland, 29–31 July 2019; pp. 1–6. [Google Scholar] [CrossRef]
- Schneider, M. Electromagnetic interference, microphones and cables. AES J. Audio Eng. Soc. 2005, 6339. Available online: https://www.aes.org/e-lib/browse.cfm?elib=13055 (accessed on 1 January 2023).
- Gannot, S.; Burshtein, D.; Weinstein, E. Iterative and sequential Kalman filter-based speech enhancement algorithms. IEEE Trans. Speech Audio Process. 1998, 6, 373–385. [Google Scholar] [CrossRef] [Green Version]
- Vaseghi, S.V. Advanced Digital Signal Processing and Noise Reduction; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2008. [Google Scholar]
- Kim, J.B.; Lee, K.; Lee, C. On the applications of the interacting multiple model algorithm for enhancing noisy speech. IEEE Trans. Speech Audio Process. 2000, 8, 349–352. [Google Scholar] [CrossRef]
- Dendrinos, M.N.; Bakamidis, S.G.; Carayannis, G. Speech enhancement from noise: A regenerative approach. Speech Commun. 1991, 10, 45–57. [Google Scholar] [CrossRef]
- Loizou, P.C. Speech Enhancement: Theory and Practice; CRC Press: Boca Raton, FL, USA, 2013. [Google Scholar]
- Ephraim, Y.; Malah, D. Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process. 1984, 32, 1109–1121. [Google Scholar] [CrossRef] [Green Version]
- Jax, P.; Vary, P. Artificial bandwidth extension of speech signals using MMSE estimation based on a hidden Markov model. In Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’03), Hong Kong, China, 6–10 April 2003; Volume 1. [Google Scholar] [CrossRef]
- Boll, S. Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech Signal Process. 1979, 27, 113–120. [Google Scholar] [CrossRef] [Green Version]
- Kwan, C.; Chu, S.; Yin, J.; Liu, X.; Kruger, M.; Sityar, I. Enhanced speech in noisy multiple speaker environment. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, China, 1–6 June 2008; pp. 1640–1643. [Google Scholar] [CrossRef]
- Mallat, S. A Wavelet Tour of Signal Processing; Academic Press: Cambridge, MA, USA, 2009. [Google Scholar]
- Virag, N. Single channel speech enhancement based on masking properties of the human auditory system. IEEE Trans. Speech Audio Process. 1999, 7, 126–137. [Google Scholar] [CrossRef] [Green Version]
- Sun, C.; Zhu, Q.; Wan, M. A novel speech enhancement method based on constrained low-rank and sparse matrix decomposition. Speech Commun. 2014, 60, 44–55. [Google Scholar] [CrossRef]
- Sun, C.; Xie, J.; Leng, Y. A Signal Subspace Speech Enhancement Approach Based on Joint Low-Rank and Sparse Matrix Decomposition. Arch. Acoust. 2016, 41, 245–254. [Google Scholar] [CrossRef] [Green Version]
- Xian, Y.; Sun, Y.; Wang, W.; Naqvi, S.M. A Multi-Scale Feature Recalibration Network for End-to-End Single Channel Speech Enhancement. IEEE J. Sel. Top. Signal Process. 2021, 15, 143–155. [Google Scholar] [CrossRef]
- Wood, S.U.N.; Rouat, J. Unsupervised Low Latency Speech Enhancement With RT-GCC-NMF. IEEE J. Sel. Top. Signal Process. 2019, 13, 332–346. [Google Scholar] [CrossRef] [Green Version]
- Chakrabarty, S.; Habets, E.A.P. Time-Frequency Masking Based Online Multi-Channel Speech Enhancement With Convolutional Recurrent Neural Networks. IEEE J. Sel. Top. Signal Process. 2019, 13, 787–799. [Google Scholar] [CrossRef]
- Lavanya, T.; Nagarajan, T.; Vijayalakshmi, P. Multi-Level Single-Channel Speech Enhancement Using a Unified Framework for Estimating Magnitude and Phase Spectra. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 1315–1327. [Google Scholar] [CrossRef]
- Tu, Y.H.; Du, J.; Lee, C.H. Speech Enhancement Based on Teacher-Student Deep Learning Using Improved Speech Presence Probability for Noise-Robust Speech Recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 2019, 27, 2080–2091. [Google Scholar] [CrossRef]
- Ming, J.; Crookes, D. Speech Enhancement Based on Full-Sentence Correlation and Clean Speech Recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 2017, 25, 531–543. [Google Scholar] [CrossRef] [Green Version]
- Kim, M.; Shin, J.W. Improved Speech Enhancement Considering Speech PSD Uncertainty. IEEE/ACM Trans. Audio Speech Lang. Process. 2022, 30, 1939–1951. [Google Scholar] [CrossRef]
- Saleem, N.; Khattak, M.I.; Ahmad, S.; Ali, M.Y.; Mohmand, M.I. Machine Learning Approach for Improving the Intelligibility of Noisy Speech. In Proceedings of the 2020 17th International Bhurban Conference on Applied Sciences and Technology (IBCAST), Islamabad, Pakistan, 14–18 January 2020; pp. 303–308. [Google Scholar] [CrossRef]
- Choudhury, A.; Roy, P.; Bandyopadhyay, S. Review of Various Machine Learning and Deep Learning Techniques for Audio Visual Automatic Speech Recognition. In Proceedings of the 2023 International Conference on Intelligent Systems, Advanced Computing and Communication (ISACC), Taza, Morocco, 26–27 October 2023; pp. 1–10. [Google Scholar] [CrossRef]
- Casey, O.; Dave, R.; Seliya, N.; Sowells Boone, E.R. Machine Learning: Challenges, Limitations, and Compatibility for Audio Restoration Processes. In Proceedings of the 2021 International Conference on Computing, Computational Modelling and Applications (ICCMA), Brest, France, 14–16 July 2021; pp. 27–32. [Google Scholar] [CrossRef]
- Ayhan, B.; Kwan, C. Robust Speaker Identification Algorithms and Results in Noisy Environments. In Advances in Neural Networks; Huang, T., Lv, J., Sun, C., Tuzikov, A.V., Eds.; Springer: Cham, Switzerland, 2018; pp. 443–450. [Google Scholar]
- Rehr, R.; Gerkmann, T. SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 1937–1949. [Google Scholar] [CrossRef]
- Zhang, Q.; Nicolson, A.; Wang, M.; Paliwal, K.K.; Wang, C. DeepMMSE: A Deep Learning Approach to MMSE-Based Noise Power Spectral Density Estimation. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 1404–1415. [Google Scholar] [CrossRef]
- Takeuchi, D.; Yatabe, K.; Koizumi, Y.; Oikawa, Y.; Harada, N. Real-Time Speech Enhancement Using Equilibriated RNN. In Proceedings of the ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Virtual, 4–9 May 2020; pp. 851–855. [Google Scholar] [CrossRef] [Green Version]
- Zhu, Q.S.; Zhang, J.; Zhang, Z.Q.; Dai, L.R. A Joint Speech Enhancement and Self-Supervised Representation Learning Framework for Noise-Robust Speech Recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 2023, 31, 1927–1939. [Google Scholar] [CrossRef]
- Shifas, M.P.; Zorila, C.; Stylianou, Y. End-to-End Neural Based Modification of Noisy Speech for Speech-in-Noise Intelligibility Improvement. IEEE/ACM Trans. Audio Speech Lang. Process. 2022, 30, 162–173. [Google Scholar] [CrossRef]
- Hu, Y.; Li, F.; Li, H.G.; Liu, C. An enhanced empirical wavelet transform for noisy and non-stationary signal processing. Digit. Signal Process. 2017, 60, 220–229. [Google Scholar] [CrossRef]
- Kaiser, J. On a simple algorithm to calculate the ’energy’ of a signal. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, NM, USA, 3–6 April 1990; Volume 1, pp. 381–384. [Google Scholar] [CrossRef]
- Deller, J.; Hansen, J.; Proakis, J. Discrete-Time Processing of Speech Signals; Wiley-IEEE Press: Hoboken, NJ, USA, 2000. [Google Scholar]
- Rix, A.; Beerends, J.; Hollier, M.; Hekstra, A. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. In Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (Cat. No. 01CH37221), Salt Lake City, UT, USA, 7–11 May 2001; Volume 2, pp. 749–752. [Google Scholar] [CrossRef]
- Gilles, J. Empirical Wavelet Transform. IEEE Trans. Signal Process. 2013, 61, 3999–4010. [Google Scholar] [CrossRef]
- Carvalho, V.R.; Moraes, M.F.; Braga, A.P.; Mendes, E.M. Evaluating five different adaptive decomposition methods for EEG signal seizure detection and classification. Biomed. Signal Process. Control 2020, 62, 102073. [Google Scholar] [CrossRef]
- Donoho, D.; Johnstone, I. Ideal Spatial Adaptation via Wavelet Shrinkage. Biometrika 1994, 81, 425–455. [Google Scholar] [CrossRef]
- Panayotov, V.; Chen, G.; Povey, D.; Khudanpur, S. Librispeech: An ASR corpus based on public domain audio books. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Australia, 19–24 April 2015; pp. 5206–5210. [Google Scholar] [CrossRef]
- Michael; Li, E.X.D. scivision/Soothing-Sounds: Src/Layout, Black Format, Type Anno. 2021. Available online: https://zenodo.org/record/5574886 (accessed on 1 January 2022).
- Hu, Y.; Loizou, P. A generalized subspace approach for enhancing speech corrupted by colored noise. IEEE Trans. Speech Audio Process. 2003, 11, 334–341. [Google Scholar] [CrossRef] [Green Version]
- Scheibler, R.; Bezzam, E.; Dokmanic, I. Pyroomacoustics: A Python Package for Audio Room Simulation and Array Processing Algorithms. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 351–355. [Google Scholar] [CrossRef] [Green Version]
- Djahnine, A. Suppression-of-Acoustic-Noise-in-Speech-Using-Spectral-Subtraction. 2021. Available online: https://github.com/AissamDjahnine/Suppression-of-Acoustic-Noise-in-Speech-Using-Spectral-Subtraction- (accessed on 1 January 2022).
- Bahoura, M.; Rouat, J. Wavelet Speech Enhancement Based on Time-Scale Adaptation. Speech Commun. 2006, 48, 1620–1637. [Google Scholar] [CrossRef] [Green Version]
- Sainburg, T.; Thielk, M.; Gentner, T.Q. Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires. PLoS Comput. Biol. 2020, 16, e1008228. [Google Scholar] [CrossRef]
- Sainburg, T. timsainb/noisereduce: V1.0. 2019. Available online: https://zenodo.org/record/3243139 (accessed on 1 January 2022). [CrossRef]
- Yang, Y.; Rao, J. Robust and Efficient Harmonics Denoising in Large Dataset Based on Random SVD and Soft Thresholding. IEEE Access 2019, 7, 77607–77617. [Google Scholar] [CrossRef]
- Yang, Z.X.; Zhong, J.H. A Hybrid EEMD-Based SampEn and SVD for Acoustic Signal Processing and Fault Diagnosis. Entropy 2016, 18, 112. [Google Scholar] [CrossRef] [Green Version]
- PESQ (Perceptual Evaluation of Speech Quality). Wrapper for Python Users. Available online: https://pypi.org/project/pesq/ (accessed on 1 January 2022).
- Emiru, E.D.; Li, Y.; Xiong, S.; Fesseha, A. Speech Recognition System Based on Deep Neural Network Acoustic Modeling for Low Resourced Language-Amharic. In Proceedings of the 3rd International Conference on Telecommunications and Communication Engineering (ICTCE), Tokyo, Japan, 9–12 November 2019; pp. 141–145. [Google Scholar] [CrossRef]
- Cecko, R.; Jamrozy, J.; Jesko, W.; Kusmierek, E.; Lange, M.; Owsianny, M. Automatic Speech Recognition and its Application to Media Monitoring. Comput. Methods Sci. Technol. CMST 2021, 27, 41–55. [Google Scholar] [CrossRef]
- Jesko, W. Vocalization Recognition of People with Profound Intellectual and Multiple Disabilities (PIMD) Using Machine Learning Algorithms. In Proceedings of the Interspeech 2021, Brno, Czech Republic, 30 August–3 September 2021; pp. 2921–2925. [Google Scholar] [CrossRef]
- Ravanelli, M.; Parcollet, T.; Plantinga, P.; Rouhe, A.; Cornell, S.; Lugosch, L.; Subakan, C.; Dawalatabad, N.; Heba, A.; Zhong, J.; et al. SpeechBrain: A General-Purpose Speech Toolkit. arXiv 2021, arXiv:2106.04624. [Google Scholar]
- Pan, J.; Liu, C.; Wang, Z.; Hu, Y.; Jiang, H. Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMS in acoustic modeling. In Proceedings of the 2012 8th International Symposium on Chinese Spoken Language Processing, Hong Kong, China, 5–8 December 2012; pp. 301–305. [Google Scholar] [CrossRef]
- Banaszek, A.; Lisaj, A. The Concept of Intelligent Radiocommunication System for Support Decision of Yacht Captains in distress situations with use of neural network computer systems. Procedia Comput. Sci. 2022, 207, 398–407. [Google Scholar] [CrossRef]
- Banaszek, A.; Lisaj, A. Advanced methodology for multi-way transmission of ship data treatment from mechanical-navigational technical state sensors with using computational neural network computer systems. Procedia Comput. Sci. 2022, 207, 388–397. [Google Scholar] [CrossRef]
| Disturbance Type | SNRu (dB) | EEWT-TO (max) | KLT (max) | SSboll (max) | WT-TO (max) | NR (max) | H-SVD (max) | EEMD-SVD (max) | EEWT-TO (median) | KLT (median) | SSboll (median) | WT-TO (median) | NR (median) | H-SVD (median) | EEMD-SVD (median) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| “white” noise | 20 | 3.90 | 2.00 | 4.01 | 3.50 | 3.81 | 2.55 | 2.28 | 2.97 | 1.42 | 2.82 | 2.57 | 2.96 | 1.44 | 1.51 |
| | 10 | 3.00 | 1.91 | 2.89 | 2.61 | 2.71 | 2.57 | 1.91 | 2.31 | 1.44 | 2.01 | 1.93 | 1.96 | 1.47 | 1.40 |
| | 0 | 2.24 | 1.50 | 1.77 | 1.90 | 2.18 | 2.24 | 1.55 | 1.71 | 1.21 | 1.37 | 1.45 | 1.63 | 1.48 | 1.24 |
| | −10 | 1.86 | 1.46 | 1.24 | 1.61 | 1.68 | 1.87 | 1.48 | 1.36 | 1.22 | 1.10 | 1.27 | 1.31 | 1.27 | 1.19 |
| | −20 | 1.49 | 1.60 | 1.20 | 1.39 | 1.40 | 1.31 | 1.42 | 1.19 | 1.07 | 1.05 | 1.18 | 1.05 | 1.08 | 1.16 |
| “violet” noise | 20 | 4.45 | 1.94 | 3.93 | 3.97 | 3.65 | 2.58 | 2.51 | 3.29 | 1.51 | 2.68 | 3.03 | 2.74 | 1.46 | 1.57 |
| | 10 | 3.78 | 1.75 | 2.49 | 3.23 | 3.20 | 2.56 | 2.09 | 2.79 | 1.40 | 1.69 | 2.36 | 2.28 | 1.43 | 1.46 |
| | 0 | 3.10 | 1.60 | 1.86 | 2.49 | 2.65 | 2.57 | 1.64 | 2.23 | 1.24 | 1.30 | 1.81 | 1.87 | 1.47 | 1.26 |
| | −10 | 2.05 | 1.53 | 1.44 | 1.68 | 1.88 | 1.84 | 1.48 | 1.54 | 1.22 | 1.15 | 1.29 | 1.53 | 1.23 | 1.26 |
| | −20 | 1.70 | 1.50 | 1.27 | 1.50 | 1.25 | 1.20 | 1.51 | 1.32 | 1.18 | 1.10 | 1.26 | 1.08 | 1.09 | 1.19 |
| “brown” noise | 20 | 4.50 | 1.50 | 3.01 | 4.36 | 3.81 | 2.55 | 2.97 | 4.11 | 1.20 | 1.71 | 3.14 | 2.73 | 1.44 | 1.71 |
| | 10 | 4.50 | 1.50 | 2.03 | 4.28 | 3.87 | 2.68 | 2.94 | 4.02 | 1.20 | 1.34 | 3.13 | 2.73 | 1.44 | 1.70 |
| | 0 | 4.00 | 1.53 | 1.51 | 3.74 | 3.30 | 2.34 | 2.13 | 3.20 | 1.23 | 1.15 | 2.65 | 2.37 | 1.12 | 1.50 |
| | −10 | 3.45 | 2.05 | 1.20 | 2.58 | 2.85 | 1.17 | 3.00 | 2.37 | 1.42 | 1.09 | 1.78 | 1.97 | 1.06 | 1.84 |
| | −20 | 2.20 | 1.80 | 1.14 | 1.69 | 1.76 | 1.11 | 1.37 | 1.42 | 1.25 | 1.03 | 1.26 | 1.32 | 1.08 | 1.23 |
| “blue” noise | 20 | 4.00 | 2.00 | 3.77 | 3.70 | 3.59 | 2.42 | 2.29 | 3.06 | 1.31 | 2.77 | 2.77 | 2.48 | 1.40 | 1.50 |
| | 10 | 3.78 | 2.13 | 2.96 | 3.22 | 3.25 | 2.70 | 2.13 | 2.83 | 1.36 | 1.98 | 2.33 | 2.28 | 1.49 | 1.47 |
| | 0 | 2.50 | 1.75 | 1.87 | 2.07 | 2.31 | 2.36 | 1.53 | 1.87 | 1.28 | 1.35 | 1.58 | 1.69 | 1.43 | 1.21 |
| | −10 | 1.83 | 1.66 | 1.41 | 1.67 | 1.75 | 2.05 | 1.41 | 1.40 | 1.32 | 1.16 | 1.32 | 1.35 | 1.26 | 1.17 |
| | −20 | 1.52 | 1.51 | 1.27 | 1.20 | 1.66 | 1.27 | 1.51 | 1.23 | 1.21 | 1.10 | 1.08 | 1.24 | 1.08 | 1.19 |
| “pink” noise | 20 | 4.42 | 1.91 | 3.45 | 4.28 | 3.78 | 2.83 | 2.81 | 3.35 | 1.28 | 1.96 | 3.22 | 2.72 | 1.53 | 1.69 |
| | 10 | 3.10 | 1.70 | 2.03 | 2.91 | 2.93 | 2.66 | 2.09 | 2.37 | 1.28 | 1.37 | 2.16 | 2.06 | 1.46 | 1.44 |
| | 0 | 2.00 | 1.47 | 1.47 | 1.75 | 1.91 | 2.49 | 1.57 | 1.63 | 1.14 | 1.16 | 1.39 | 1.51 | 1.28 | 1.23 |
| | −10 | 1.55 | 1.49 | 1.70 | 1.42 | 1.62 | 1.19 | 1.52 | 1.24 | 1.17 | 1.19 | 1.14 | 1.27 | 1.08 | 1.20 |
| | −20 | 1.33 | 1.43 | 1.69 | 1.17 | 1.65 | 1.17 | 1.36 | 1.15 | 1.05 | 1.05 | 1.06 | 1.11 | 1.08 | 1.11 |
| traffic sounds | 20 | 4.17 | 1.35 | 3.07 | 3.84 | 3.40 | 2.58 | 2.76 | 3.22 | 1.18 | 2.08 | 2.96 | 2.43 | 1.47 | 1.70 |
| | 10 | 2.89 | 1.44 | 2.07 | 2.79 | 2.47 | 2.65 | 2.24 | 2.31 | 1.23 | 1.51 | 2.14 | 1.88 | 1.52 | 1.58 |
| | 0 | 1.92 | 1.38 | 1.56 | 1.85 | 1.72 | 1.83 | 1.61 | 1.52 | 1.14 | 1.18 | 1.43 | 1.37 | 1.14 | 1.27 |
| | −10 | 1.40 | 1.30 | 1.90 | 1.40 | 1.26 | 1.52 | 1.51 | 1.27 | 1.11 | 1.13 | 1.15 | 1.10 | 1.09 | 1.15 |
| | −20 | 1.47 | 1.27 | 1.60 | 1.44 | 1.28 | 1.33 | 1.60 | 1.26 | 1.08 | 1.13 | 1.12 | 1.10 | 1.07 | 1.13 |
| hair dryer | 20 | 4.00 | 1.36 | 2.96 | 3.70 | 3.60 | 2.71 | 2.64 | 3.05 | 1.14 | 2.12 | 2.71 | 2.56 | 1.49 | 1.58 |
| | 10 | 2.80 | 1.33 | 1.96 | 2.38 | 2.54 | 2.60 | 1.91 | 2.12 | 1.13 | 1.41 | 1.78 | 1.90 | 1.45 | 1.37 |
| | 0 | 1.91 | 1.31 | 1.34 | 1.74 | 1.79 | 1.90 | 1.54 | 1.59 | 1.11 | 1.13 | 1.36 | 1.41 | 1.40 | 1.23 |
| | −10 | 1.62 | 1.85 | 1.90 | 1.58 | 1.41 | 1.90 | 1.63 | 1.42 | 1.16 | 1.21 | 1.16 | 1.15 | 1.39 | 1.13 |
| | −20 | 1.36 | 1.35 | 1.44 | 1.25 | 1.51 | 1.60 | 1.31 | 1.22 | 1.08 | 1.10 | 1.07 | 1.11 | 1.18 | 1.09 |
| fan | 20 | 3.56 | 1.33 | 2.79 | 3.25 | 3.17 | 2.54 | 2.41 | 2.69 | 1.11 | 1.91 | 2.49 | 2.30 | 1.43 | 1.59 |
| | 10 | 2.52 | 1.32 | 1.96 | 2.37 | 2.36 | 2.50 | 1.89 | 1.95 | 1.12 | 1.41 | 1.74 | 1.77 | 1.41 | 1.39 |
| | 0 | 1.95 | 1.37 | 1.54 | 1.90 | 1.97 | 2.05 | 1.71 | 1.55 | 1.13 | 1.17 | 1.42 | 1.49 | 1.30 | 1.30 |
| | −10 | 1.45 | 1.32 | 1.11 | 1.33 | 1.36 | 1.60 | 1.30 | 1.26 | 1.12 | 1.04 | 1.13 | 1.13 | 1.26 | 1.13 |
| | −20 | 1.30 | 1.24 | 2.05 | 1.24 | 1.17 | 1.92 | 1.24 | 1.18 | 1.09 | 1.12 | 1.10 | 1.06 | 1.28 | 1.09 |
| Disturbance Type | SNRu (dB) | EEWT-TO (max) | KLT (max) | SSboll (max) | WT-TO (max) | NR (max) | H-SVD (max) | EEMD-SVD (max) | EEWT-TO (median) | KLT (median) | SSboll (median) | WT-TO (median) | NR (median) | H-SVD (median) | EEMD-SVD (median) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| “white” noise | 20 | 22.43 | 5.00 | 21.72 | 19.56 | 13.07 | 4.75 | 11.47 | 17.18 | 0.25 | 17.32 | 13.73 | 9.23 | 0.62 | 1.51 |
| | 10 | 19.38 | 1.12 | 17.36 | 18.32 | 13.47 | 3.13 | 11.14 | 15.29 | −2.05 | 11.78 | 13.69 | 8.84 | −2.51 | −1.20 |
| | 0 | 16.49 | 0.83 | 15.12 | 10.89 | 11.63 | 5.54 | 6.29 | 11.63 | −1.68 | 9.78 | 8.69 | 8.99 | −0.60 | 0.02 |
| | −10 | 5.14 | 0.13 | 4.46 | 1.50 | 3.20 | 3.50 | −0.81 | 1.11 | −1.58 | 0.78 | −1.60 | −0.25 | −2.94 | −3.86 |
| | −20 | 0.14 | −0.84 | −1.71 | −1.30 | −1.04 | −1.04 | −1.97 | −3.63 | −3.19 | −4.05 | −7.16 | −5.96 | −7.20 | −8.32 |
| “violet” noise | 20 | 27.63 | 6.06 | 25.07 | 23.65 | 13.10 | 2.21 | 11.25 | 19.82 | 1.28 | 19.45 | 13.80 | 8.25 | −3.44 | −2.42 |
| | 10 | 21.08 | 2.84 | 19.56 | 20.44 | 12.01 | 2.30 | 11.41 | 15.48 | −1.20 | 14.17 | 11.04 | 7.74 | −2.90 | −1.64 |
| | 0 | 22.76 | 2.08 | 15.81 | 18.94 | 12.16 | 6.04 | 2.70 | 14.71 | −1.95 | 10.35 | 11.06 | 8.63 | −1.81 | −2.18 |
| | −10 | 13.82 | 0.23 | 4.42 | 6.51 | 6.53 | 4.19 | −1.24 | 7.36 | −3.53 | 1.39 | 4.00 | 2.01 | −4.75 | −6.06 |
| | −20 | 1.91 | 0.43 | 0.36 | 1.16 | 0.21 | −1.37 | −2.08 | 0.14 | −1.33 | −1.29 | −0.97 | −4.15 | −6.16 | −8.34 |
| “brown” noise | 20 | 25.02 | 3.11 | 20.63 | 18.73 | 12.57 | 3.86 | 10.20 | 18.31 | 0.05 | 11.59 | 16.01 | 7.85 | −3.35 | −2.23 |
| | 10 | 14.98 | −0.02 | 12.41 | 10.44 | 13.82 | 4.11 | 11.72 | 12.64 | −3.61 | 6.24 | 9.92 | 9.44 | −4.11 | −0.04 |
| | 0 | 5.25 | 0.00 | 3.40 | 3.03 | 10.00 | −0.83 | 6.95 | 2.59 | −2.34 | −2.16 | 1.35 | 7.24 | −8.78 | −1.08 |
| | −10 | 2.10 | −3.28 | 1.83 | −0.27 | 10.00 | −1.29 | 0.29 | −3.26 | −5.54 | −7.01 | −6.93 | 5.69 | −8.88 | −7.47 |
| | −20 | −1.04 | −5.16 | −1.78 | −9.58 | −0.61 | −10.08 | 0.00 | −6.49 | −7.55 | −9.95 | −11.68 | −3.00 | −12.89 | −7.00 |
| “blue” noise | 20 | 24.93 | 6.39 | 23.84 | 24.09 | 10.39 | 0.70 | 9.50 | 18.33 | 1.37 | 18.49 | 14.94 | 6.06 | −4.30 | −3.88 |
| | 10 | 18.20 | 5.58 | 18.03 | 19.23 | 11.76 | 2.93 | 10.73 | 14.13 | 2.70 | 13.42 | 11.97 | 7.88 | −1.72 | −0.19 |
| | 0 | 15.02 | 1.99 | 12.14 | 10.40 | 8.34 | 2.30 | 1.65 | 9.32 | −0.65 | 7.58 | 7.10 | 5.68 | −1.99 | −1.58 |
| | −10 | 9.98 | 2.08 | 5.35 | 4.01 | 4.06 | 3.77 | −2.92 | 4.91 | −0.15 | 1.20 | 1.02 | 0.02 | −4.65 | −6.85 |
| | −20 | 0.48 | 0.01 | −0.30 | 0.02 | −1.16 | −2.40 | −3.25 | −0.81 | −1.54 | −2.00 | −3.62 | −5.60 | −7.19 | −9.03 |
| “pink” noise | 20 | 25.02 | 2.18 | 23.40 | 21.33 | 13.50 | 2.44 | 11.25 | 19.21 | −1.12 | 14.89 | 17.77 | 8.41 | −3.30 | −2.14 |
| | 10 | 15.02 | 0.82 | 14.04 | 12.92 | 13.34 | 5.65 | 10.99 | 12.84 | −1.32 | 8.83 | 11.69 | 9.22 | −0.84 | 0.90 |
| | 0 | 4.01 | −0.90 | 3.38 | 2.84 | 7.00 | −0.79 | 5.71 | 2.05 | −3.02 | −0.82 | 0.76 | 4.92 | −8.15 | −1.87 |
| | −10 | 3.75 | −0.87 | 0.55 | 0.94 | 4.73 | −1.53 | 2.12 | −1.54 | −3.68 | −5.44 | −3.86 | 1.93 | −8.59 | −2.90 |
| | −20 | 2.07 | 0.00 | 0.03 | −2.78 | 1.42 | −2.84 | 1.47 | −2.98 | −4.05 | −8.31 | −10.22 | −2.58 | −10.24 | −2.39 |
| traffic sounds | 20 | 25.27 | 2.44 | 21.62 | 21.49 | 13.69 | 5.29 | 12.86 | 20.62 | 0.02 | 17.66 | 18.80 | 9.06 | −1.94 | −0.68 |
| | 10 | 15.01 | 2.15 | 13.07 | 11.96 | 12.08 | 4.96 | 11.56 | 12.74 | −0.24 | 9.71 | 11.17 | 8.13 | −1.84 | −0.82 |
| | 0 | 11.10 | 4.07 | 7.99 | 7.01 | 10.04 | 3.33 | 8.23 | 8.26 | 0.81 | 4.20 | 4.57 | 7.01 | −2.90 | −1.13 |
| | −10 | 2.93 | 0.00 | −0.28 | 0.12 | 1.39 | −0.90 | 0.09 | 0.36 | −2.40 | −3.74 | −3.93 | 0.00 | −4.76 | −4.71 |
| | −20 | 1.77 | 0.23 | −0.23 | −4.52 | −0.23 | −0.23 | −0.23 | −1.56 | −3.51 | −8.88 | −11.78 | −3.10 | −10.42 | −2.19 |
| hair dryer | 20 | 25.23 | 1.07 | 22.49 | 21.65 | 13.59 | 2.46 | 12.48 | 18.91 | −1.65 | 16.87 | 17.80 | 8.60 | −4.34 | −1.91 |
| | 10 | 17.24 | 1.65 | 14.47 | 12.08 | 12.07 | 4.27 | 9.10 | 13.42 | −0.91 | 9.27 | 10.62 | 7.99 | −2.31 | −1.35 |
| | 0 | 11.10 | 2.27 | 7.69 | 4.86 | 8.07 | 4.71 | 4.11 | 7.23 | −0.97 | 3.63 | 2.79 | 6.01 | −3.01 | −1.50 |
| | −10 | 6.41 | −0.20 | −0.20 | −0.32 | 1.66 | −2.55 | −1.29 | −0.20 | −0.39 | −3.04 | −5.02 | −2.16 | −7.35 | −5.16 |
| | −20 | 2.21 | 1.37 | 0.73 | −1.56 | 0.09 | −1.59 | 0.17 | −2.70 | 1.19 | −2.80 | −10.32 | −4.99 | −6.56 | −4.93 |
| fan | 20 | 24.77 | 0.86 | 22.35 | 18.94 | 13.24 | 2.25 | 12.50 | 18.38 | −1.77 | 15.10 | 16.01 | 8.16 | −3.37 | −2.16 |
| | 10 | 15.00 | 1.04 | 12.76 | 10.59 | 10.78 | 2.30 | 7.84 | 11.89 | −1.09 | 7.36 | 10.09 | 6.96 | −2.53 | −1.60 |
| | 0 | 10.96 | 2.27 | 8.04 | 5.23 | 8.39 | 3.44 | 4.94 | 7.46 | −0.56 | 3.37 | 2.49 | 5.60 | −3.79 | −1.50 |
| | −10 | 2.78 | −0.14 | 0.27 | −0.42 | 2.45 | −1.44 | −0.24 | −0.25 | −3.85 | −0.98 | −4.40 | −1.20 | −4.29 | −3.90 |
| | −20 | 0.53 | 0.62 | 0.53 | −4.77 | 0.68 | 0.53 | 0.53 | −1.43 | −2.89 | −3.26 | −12.89 | −5.57 | −8.27 | −4.27 |