# An Experimental Study on Speech Enhancement Based on a Combination of Wavelets and Deep Learning

## Abstract

## 1. Introduction

- (a) Spectral Subtraction Method: This method uses estimates of the statistics of the signal and the noise, and its simplicity makes it suitable for real-time applications. The underlying assumption is that the noisy signal is the sum of the clean speech and an additive noise component:$$y\left(t\right)=s\left(t\right)+n\left(t\right), \tag{1}$$which in the frequency domain becomes$$Y\left(\omega \right)=S\left(\omega \right)+D\left(\omega \right), \tag{2}$$where $D\left(\omega \right)$ is the spectrum of the noise. The estimation of the enhanced speech $\tilde{S}\left(\omega \right)$ can be expressed as$${\left|\tilde{S}\left(\omega \right)\right|}^{2}={\left|Y\left(\omega \right)\right|}^{2}-E\left[{\left|D\left(\omega \right)\right|}^{2}\right]. \tag{3}$$The enhanced signal can be obtained in the time domain by applying the inverse Fourier transform to the estimate in Equation (3), together with the phase information.
- (b) Spectral Subtraction with Over-subtraction Model (SSOM): The previous method applies a difference in the spectral domain based on a statistical average of the noise. If that average is not representative of the signal, for example, with musical background noise, a floor of minimum spectrum values is established, which reduces the narrow spectral peaks by decreasing the spectral excursions.
- (c) Non-Linear Spectral Subtraction: This method combines the two previous algorithms and adapts the subtraction to the signal-to-noise ratio (SNR) of each frame. This makes the process nonlinear: less is subtracted where the noise is less present, and more where it is more present.
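The basic power spectral subtraction described in item (a) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the implementation used in the study: the frame length, the non-overlapping rectangular frames, the zero `floor`, and the helper names `estimate_noise_psd` and `spectral_subtraction` are all hypothetical choices, and the noise statistics are assumed to come from a separate noise-only recording. The phase of the noisy frame is reused for reconstruction.

```python
import numpy as np

def estimate_noise_psd(noise, frame_len=256):
    """Average power spectrum E|D(w)|^2 over frames of a noise-only recording."""
    n_frames = len(noise) // frame_len
    frames = np.asarray(noise[:n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    return np.mean(np.abs(np.fft.rfft(frames, axis=1)) ** 2, axis=0)

def spectral_subtraction(noisy, noise_psd, frame_len=256, floor=0.0):
    """Frame-wise power spectral subtraction (non-overlapping frames)."""
    n_frames = len(noisy) // frame_len
    out = np.zeros(n_frames * frame_len)
    for i in range(n_frames):
        frame = noisy[i * frame_len:(i + 1) * frame_len]
        Y = np.fft.rfft(frame)
        # |S~|^2 = |Y|^2 - E|D|^2, clipped at `floor` so the power stays non-negative
        clean_power = np.maximum(np.abs(Y) ** 2 - noise_psd, floor)
        # Combine the estimated magnitude with the phase of the noisy frame
        S = np.sqrt(clean_power) * np.exp(1j * np.angle(Y))
        out[i * frame_len:(i + 1) * frame_len] = np.fft.irfft(S, n=frame_len)
    return out

# Toy usage: a 500 Hz tone sampled at 8 kHz, degraded with white noise
rng = np.random.default_rng(0)
t = np.arange(8192) / 8000.0
clean = np.sin(2 * np.pi * 500.0 * t)
noisy = clean + 0.5 * rng.standard_normal(t.size)
noise_psd = estimate_noise_psd(0.5 * rng.standard_normal(8192))
enhanced = spectral_subtraction(noisy, noise_psd)
print("MSE noisy:   ", np.mean((noisy - clean) ** 2))
print("MSE enhanced:", np.mean((enhanced - clean) ** 2))
```

Raising `floor` above zero is one simple way to approximate the spectral floor mentioned for the over-subtraction variant; practical systems usually also use overlapping, windowed frames.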

- (a) Adaptive Noise Cancellation: This method takes advantage of the principle of destructive interference between sound waves, using a reference signal to generate an anti-noise wave of equal amplitude but opposite phase. Several strategies have been applied to define this reference signal, for example, using sensors located near the noise and interference sources.
- (b) Multisensor Beamforming: This method is applicable when the sound is recorded using a geometric array of microphones. The sound signals are amplified or attenuated (in the time or frequency domain) depending on their direction of arrival. The phase information is particularly important, because most methods reject all the noisy components that are not aligned in phase.

#### 1.1. Related Work

#### 1.2. Problem Statement

- Transform $y\left(t\right)$ using a wavelet: $Y=W\left(y\left(t\right)\right)$.
- Obtain the denoised version using the threshold, in the wavelet domain: $Z=D(Y,\lambda )$.
- Transform the denoised version into the time domain: $\tilde{{s}_{1}}={W}^{-1}\left(Z\right)$.
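The thresholding operator $D(Y,\lambda )$ is not spelled out in the steps above. Two common choices in the wavelet denoising literature are hard and soft thresholding, applied element-wise to each coefficient. As a sketch, with the coefficient index $k$ and the labels "hard" and "soft" introduced here only for illustration:

$$D_{\mathrm{hard}}\left(Y_{k},\lambda \right)=\begin{cases}Y_{k}& \left|Y_{k}\right|>\lambda \\ 0& \left|Y_{k}\right|\le \lambda \end{cases}$$

$$D_{\mathrm{soft}}\left(Y_{k},\lambda \right)=\operatorname{sign}\left(Y_{k}\right)\max \left(\left|Y_{k}\right|-\lambda ,0\right)$$

For example, with $Y=(3,\,-0.5,\,1.2)$ and $\lambda =1$, hard thresholding gives $(3,\,0,\,1.2)$, while soft thresholding gives $(2,\,0,\,0.2)$: both remove the small coefficient, and soft thresholding additionally shrinks the surviving ones by $\lambda$.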

## 2. Materials and Methods

#### 2.1. Wavelets

- Apply the wavelet transform to the noisy signal, to obtain the wavelet coefficients.
- Apply the thresholding function and procedure to obtain new wavelet coefficients.
- Reconstruct the signal by inverse transforming the coefficients after the threshold.
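The three steps above can be sketched with a hand-written Haar transform so that the example depends only on NumPy. This is a minimal illustration under stated assumptions rather than the configuration used in the study: the Haar wavelet, the three decomposition levels, soft thresholding, the universal threshold, and the MAD-based noise estimate are all choices made here for the example, and the function names are hypothetical.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed even/odd samples."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def soft_threshold(c, lam):
    """Shrink coefficients towards zero by lam; zero those below lam."""
    return np.sign(c) * np.maximum(np.abs(c) - lam, 0.0)

def wavelet_denoise(noisy, levels=3):
    noisy = np.asarray(noisy, dtype=float)
    if len(noisy) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    # Step 1: forward transform, keeping the detail coefficients of each level
    approx, details = noisy, []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    # Noise level estimated from the finest details (assumption: MAD / 0.6745)
    sigma = np.median(np.abs(details[0])) / 0.6745
    # Step 2: threshold the detail coefficients (assumption: universal threshold)
    lam = sigma * np.sqrt(2.0 * np.log(len(noisy)))
    details = [soft_threshold(d, lam) for d in details]
    # Step 3: inverse transform from the coarsest level back to the signal
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

# Toy usage: a piecewise-constant signal degraded with white noise
rng = np.random.default_rng(0)
clean = np.repeat([0.0, 4.0, -2.0, 3.0], 64)
noisy = clean + 0.5 * rng.standard_normal(clean.size)
denoised = wavelet_denoise(noisy)
print("MSE noisy:   ", np.mean((noisy - clean) ** 2))
print("MSE denoised:", np.mean((denoised - clean) ** 2))
```

The Haar wavelet suits the piecewise-constant toy signal; speech would normally call for a smoother mother wavelet and a wavelet library rather than this hand-rolled transform.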

#### 2.1.1. Thresholding

- Minimax criterion: In statistics, an estimator must recover a deterministic parameter from observations, and the minimax method minimizes the cost of the estimator in the worst case. For threshold selection, the principle is applied by treating the de-noised signal as the estimator of the unknown regression function. The threshold can then be expressed as:$$\lambda =\begin{cases}\sigma \left(0.3936+0.1829\,{\log}_{2}N\right)& N>32\\ 0& N\le 32\end{cases}$$
- Sqtwolog criterion: The threshold is calculated using the equation$${\lambda}_{j}={\sigma}_{j}\sqrt{2\log \left({N}_{j}\right)}$$
- Rigrsure: The soft threshold can be expressed as$$\lambda =\sigma \sqrt{{\omega}_{b}}$$
- Heursure: This threshold combines Sqtwolog and Rigrsure, because the Rigrsure threshold does not perform well at a low SNR, where the Sqtwolog method gives a better threshold estimate. If the estimate from Sqtwolog is ${\lambda}_{1}$ and that from Rigrsure is ${\lambda}_{2}$, then Heursure uses:$$\lambda =\begin{cases}{\lambda}_{1}& A<B\\ \min \left({\lambda}_{1},{\lambda}_{2}\right)& A\ge B,\end{cases}$$with$$A=\frac{s-N}{N},\qquad B=\frac{{\left({\log}_{2}N\right)}^{\frac{3}{2}}}{\sqrt{N}},$$where $s$ is the sum of the squared wavelet coefficients and $N$ is the number of coefficients.
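The four selection rules can be compared side by side with a short NumPy sketch. It follows the commonly used formulation of these rules and rests on a few assumptions that should be checked against the reference implementation being reproduced: the coefficients are rescaled by the noise level `sigma` before the rule is applied, the minimax constant is taken as 0.3936, the Rigrsure threshold is the square root of the squared coefficient that minimizes Stein's unbiased risk estimate, and the switching level $B$ is computed as $(\log_2 N)^{3/2}/\sqrt{N}$. The function names are illustrative.

```python
import numpy as np

def sqtwolog(n, sigma=1.0):
    """Universal threshold sigma * sqrt(2 log n)."""
    return sigma * np.sqrt(2.0 * np.log(n))

def minimaxi(n, sigma=1.0):
    """Minimax threshold; zero for short vectors (n <= 32)."""
    if n <= 32:
        return 0.0
    return sigma * (0.3936 + 0.1829 * np.log2(n))

def rigrsure(coeffs, sigma=1.0):
    """Soft threshold that minimizes Stein's unbiased risk estimate (SURE)."""
    scaled = np.abs(np.asarray(coeffs, dtype=float)) / sigma
    x = np.sort(scaled) ** 2              # squared coefficients, ascending
    n = len(x)
    k = np.arange(1, n + 1)
    # Estimated risk when the threshold equals the k-th smallest coefficient
    risk = (n - 2 * k + np.cumsum(x) + (n - k) * x) / n
    return sigma * np.sqrt(x[np.argmin(risk)])

def heursure(coeffs, sigma=1.0):
    """Use sqtwolog when the signal energy is low, else the smaller threshold."""
    c = np.asarray(coeffs, dtype=float) / sigma
    n = len(c)
    lam1 = sqtwolog(n)
    a = (np.sum(c ** 2) - n) / n
    b = np.log2(n) ** 1.5 / np.sqrt(n)
    lam = lam1 if a < b else min(lam1, rigrsure(c))
    return sigma * lam

# Toy usage: mostly-noise coefficients with a few large "signal" entries
rng = np.random.default_rng(0)
coeffs = rng.standard_normal(1024)
coeffs[:20] += 8.0
print("sqtwolog:", sqtwolog(coeffs.size))
print("minimaxi:", minimaxi(coeffs.size))
print("rigrsure:", rigrsure(coeffs))
print("heursure:", heursure(coeffs))
```

In practice `sigma` would itself be estimated from the data, for instance from the median absolute deviation of the finest-scale coefficients.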

#### 2.1.2. No Thresholding Alternative

#### 2.2. Deep Learning

#### 2.3. Proposed System

- Select a suitable mother wavelet.
- Transform each speech signal using the mother wavelet.
- Select the appropriate threshold to remove the noise.
- Apply the inverse wavelet transform to obtain the denoised signal.

- Select the network architecture: In our experiments, we used the stacked dual-signal transformation LSTM network (DTLN) architecture presented in [47]. The architecture was based on two LSTM layers followed by a fully connected (FC) layer.
- Train the deep neural network with pairs of noisy speech at the inputs and clean speech at the outputs. For the hybrid approach, the outputs of the wavelet denoising were used as the inputs of the neural network, which was completely re-trained using pairs of wavelet-denoised and clean speech.
- Establish a stop criterion for the training procedure.

#### 2.4. Experimental Setup

#### 2.4.1. Dataset

#### 2.4.2. Noise

- Clean, as the dataset described in the previous section.
- The clean dataset degraded with additive White noise at five SNR levels: −10, −5, 0, 5, and 10 dB.
- The clean dataset degraded with additive Pink noise at five SNR levels: −10, −5, 0, 5, and 10 dB.
- The clean dataset degraded with additive Babble noise at five SNR levels: −10, −5, 0, 5, and 10 dB.
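Degrading a clean utterance at a prescribed SNR amounts to scaling the noise so that the ratio of signal power to noise power matches the target. The sketch below shows one way to do this, assuming the SNR is measured over the whole utterance and that the noise recording is at least as long as the speech; the function name `add_noise_at_snr` and the synthetic signals are hypothetical and stand in for the actual dataset and noise recordings.

```python
import numpy as np

def add_noise_at_snr(clean, noise, snr_db):
    """Return clean + g * noise, with g chosen so the mixture has the given SNR (dB)."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)[:len(clean)]
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # SNR = 10 log10(p_clean / (g^2 p_noise))  =>  g = sqrt(p_clean / (p_noise 10^(SNR/10)))
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise

# Toy usage: a synthetic tone mixed with white noise at the five levels
rng = np.random.default_rng(1)
clean = np.sin(np.linspace(0.0, 100.0, 4000))
noise = rng.standard_normal(5000)
for snr_db in (-10, -5, 0, 5, 10):
    noisy = add_noise_at_snr(clean, noise, snr_db)
    measured = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
    print(snr_db, round(measured, 3))
```

The same function applies to Pink or Babble noise once the corresponding noise signal is loaded in place of the synthetic white noise.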

#### 2.4.3. Evaluation

## 3. Results and Discussion

## 4. Conclusions

## Author Contributions

## Funding

## Informed Consent Statement

## Acknowledgments

## Conflicts of Interest

## Sample Availability

## Abbreviations

Abbreviation | Meaning |
---|---|
ASR | Automatic speech recognition |
CNNs | Convolutional neural networks |
DTLN | Dual-signal transformation LSTM network |
DL | Deep learning |
EEG | Electroencephalogram |
FC | Fully connected |
GPU | Graphics processing unit |
MLP | Multi-layer perceptron |
MMSE | Minimum mean-squared estimation |
MAD | Median Absolute Deviation |
PCA | Principal component analysis |
PESQ | Perceptual evaluation of speech quality |
RBFN | Radial basis function network |
STFT | Short-time Fourier transform |
SegSNR | Segmental signal-to-noise ratio |
LSTM | Long Short-Term Memory |
SNR | Signal-to-noise ratio |
SSOM | Spectral Subtraction with Over-subtraction Model |
VoIP | Voice over Internet Protocol |
MDPI | Multidisciplinary Digital Publishing Institute |

## References

1. Tan, L.; Chen, Y.; Wu, F. Research on Speech Signal Denoising Algorithm Based on Wavelet Analysis. J. Phys. Conf. Ser. **2020**, 1627, 012027.
2. Krishna, G.; Tran, C.; Yu, J.; Tewfik, A.H. Speech recognition with no speech or with noisy speech. In Proceedings of the ICASSP 2019—2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 1090–1094.
3. Meyer, B.T.; Mallidi, S.H.; Martinez, A.M.C.; Payá-Vayá, G.; Kayser, H.; Hermansky, H. Performance monitoring for automatic speech recognition in noisy multi-channel environments. In Proceedings of the 2016 IEEE Spoken Language Technology Workshop (SLT), San Diego, CA, USA, 13–16 December 2016; pp. 50–56.
4. Coto-Jimenez, M.; Goddard-Close, J.; Di Persia, L.; Rufiner, H.L. Hybrid speech enhancement with wiener filters and deep LSTM denoising autoencoders. In Proceedings of the 2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), San Carlos, Costa Rica, 18–20 July 2018; pp. 1–8.
5. Lai, Y.H.; Zheng, W.Z. Multi-objective learning based speech enhancement method to increase speech quality and intelligibility for hearing aid device users. Biomed. Signal Process. Control **2019**, 48, 35–45.
6. Park, G.; Cho, W.; Kim, K.S.; Lee, S. Speech Enhancement for Hearing Aids with Deep Learning on Environmental Noises. Appl. Sci. **2020**, 10, 6077.
7. Kulkarni, D.S.; Deshmukh, R.R.; Shrishrimal, P.P. A review of speech signal enhancement techniques. Int. J. Comput. Appl. **2016**, 139.
8. Chaudhari, A.; Dhonde, S. A review on speech enhancement techniques. In Proceedings of the 2015 International Conference on Pervasive Computing (ICPC), Pune, India, 8–10 January 2015; pp. 1–3.
9. Benesty, J.; Makino, S.; Chen, J. Speech Enhancement; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2005.
10. Fukane, A.R.; Sahare, S.L. Different approaches of spectral subtraction method for enhancing the speech signal in noisy environments. Int. J. Sci. Eng. Res. **2011**, 2, 1.
11. Evans, N.W.; Mason, J.S.; Liu, W.M.; Fauve, B. An assessment on the fundamental limitations of spectral subtraction. In Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, Toulouse, France, 14–19 May 2006; Volume 1, pp. 145–148.
12. Liu, D.; Smaragdis, P.; Kim, M. Experiments on deep learning for speech denoising. In Proceedings of the Fifteenth Annual Conference of the International Speech Communication Association, Singapore, 14–18 September 2014.
13. Han, K.; Wang, Y.; Wang, D.; Woods, W.S.; Merks, I.; Zhang, T. Learning spectral mapping for speech dereverberation and denoising. IEEE/ACM Trans. Audio Speech Lang. Process. **2015**, 23, 982–992.
14. Coto-Jiménez, M. Robustness of LSTM neural networks for the enhancement of spectral parameters in noisy speech signals. In Proceedings of the Mexican International Conference on Artificial Intelligence, Guadalajara, Mexico, 22–27 October 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 227–238.
15. Zhong, X.; Dai, Y.; Dai, Y.; Jin, T. Study on processing of wavelet speech denoising in speech recognition system. Int. J. Speech Technol. **2018**, 21, 563–569.
16. Saleem, N.; Khattak, M.I. A review of supervised learning algorithms for single channel speech enhancement. Int. J. Speech Technol. **2019**, 22, 1051–1075.
17. Azarang, A.; Kehtarnavaz, N. A review of multi-objective deep learning speech denoising methods. Speech Commun. **2020**, 122, 1–10.
18. Lun, D.P.K.; Shen, T.W.; Hsung, T.C.; Ho, D.K. Wavelet based speech presence probability estimator for speech enhancement. Digit. Signal Process. **2012**, 22, 1161–1173.
19. Balaji, V.; Sathiya Priya, J.; Dinesh Kumar, J.; Karthi, S. Radial basis function neural network based speech enhancement system using SLANTLET transform through hybrid vector wiener filter. In Inventive Communication and Computational Technologies; Springer: Berlin/Heidelberg, Germany, 2021; pp. 711–723.
20. Bahadur, I.; Kumar, S.; Agarwal, P. Performance measurement of a hybrid speech enhancement technique. Int. J. Speech Technol. **2021**, 24, 665–677.
21. Lun, D.P.K.; Hsung, T.C. Improved wavelet based a-priori SNR estimation for speech enhancement. In Proceedings of the 2010 IEEE International Symposium on Circuits and Systems, Paris, France, 30 May–2 June 2010; pp. 2382–2385.
22. Bahoura, M.; Rouat, J. Wavelet speech enhancement based on time–scale adaptation. Speech Commun. **2006**, 48, 1620–1637.
23. Bouzid, A.; Ellouze, N. Speech enhancement based on wavelet packet of an improved principal component analysis. Comput. Speech Lang. **2016**, 35, 58–72.
24. Ram, R.; Mohanty, M.N. Use of radial basis function network with discrete wavelet transform for speech enhancement. Int. J. Comput. Vis. Robot. **2019**, 9, 207–223.
25. Mihov, S.G.; Ivanov, R.M.; Popov, A.N. Denoising speech signals by wavelet transform. Annu. J. Electron. **2009**, 6, 2–5.
26. Chui, C.K. An Introduction to Wavelets; Elsevier: Amsterdam, The Netherlands, 2016.
27. Chavan, M.S.; Mastorakis, N. Studies on implementation of Harr and Daubechies wavelet for denoising of speech signal. Int. J. Circuits Syst. Signal Process. **2010**, 4, 83–96.
28. Priyadarshani, N.; Marsland, S.; Castro, I.; Punchihewa, A. Birdsong denoising using wavelets. PLoS ONE **2016**, 11, e0146790.
29. Al-Qazzaz, N.K.; Ali, S.; Ahmad, S.A.; Islam, M.S.; Ariff, M.I. Selection of mother wavelets thresholding methods in denoising multi-channel EEG signals during working memory task. In Proceedings of the 2014 IEEE Conference on Biomedical Engineering and Sciences (IECBES), Miri, Sarawak, Malaysia, 8–10 December 2014; pp. 214–219.
30. Gargour, C.; Gabrea, M.; Ramachandran, V.; Lina, J.M. A short introduction to wavelets and their applications. IEEE Circuits Syst. Mag. **2009**, 9, 57–68.
31. Mallat, S. A Wavelet Tour of Signal Processing: The Sparse Way; Academic Press: Cambridge, MA, USA, 2008.
32. Taswell, C. The what, how, and why of wavelet shrinkage denoising. Comput. Sci. Eng. **2000**, 2, 12–19.
33. Donoho, D.; Johnstone, I. Ideal Spatial Adaptation via Wavelet Shrinkage. Biometrika, to appear; Technical Report; Department of Statistics, Stanford University: Stanford, CA, USA, 1992.
34. Donoho, D.L. De-noising by soft-thresholding. IEEE Trans. Inf. Theory **1995**, 41, 613–627.
35. Xiu-min, Z.; Gui-tao, C. A novel de-noising method for heart sound signal using improved thresholding function in wavelet domain. In Proceedings of the 2009 International Conference on Future BioMedical Information Engineering (FBIE), Sanya, China, 13–14 December 2009; pp. 65–68.
36. Oktar, M.A.; Nibouche, M.; Baltaci, Y. Denoising speech by notch filter and wavelet thresholding in real time. In Proceedings of the 2016 24th Signal Processing and Communication Application Conference (SIU), Zonguldak, Turkey, 16–19 May 2016; pp. 813–816.
37. Verma, N.; Verma, A. Performance analysis of wavelet thresholding methods in denoising of audio signals of some Indian Musical Instruments. Int. J. Eng. Sci. Technol. **2012**, 4, 2040–2045.
38. Valencia, D.; Orejuela, D.; Salazar, J.; Valencia, J. Comparison analysis between rigrsure, sqtwolog, heursure and minimaxi techniques using hard and soft thresholding methods. In Proceedings of the 2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA), Bucaramanga, Colombia, 30 August–2 September 2016; pp. 1–5.
39. Schimmack, M.; Mercorelli, P. An on-line orthogonal wavelet denoising algorithm for high-resolution surface scans. J. Frankl. Inst. **2018**, 355, 9245–9270.
40. Schimmack, M.; Mercorelli, P. A structural property of the wavelet packet transform method to localise incoherency of a signal. J. Frankl. Inst. **2019**, 356, 10123–10137.
41. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Volume 1.
42. Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon **2018**, 4, e00938.
43. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature **2015**, 521, 436–444.
44. Waseem, M.; Lin, Z.; Liu, S.; Jinai, Z.; Rizwan, M.; Sajjad, I.A. Optimal BRA based electric demand prediction strategy considering instance-based learning of the forecast factors. Int. Trans. Electr. Energy Syst. **2021**, 31, e12967.
45. Purwins, H.; Li, B.; Virtanen, T.; Schlüter, J.; Chang, S.Y.; Sainath, T. Deep learning for audio signal processing. IEEE J. Sel. Top. Signal Process. **2019**, 13, 206–219.
46. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv **2014**, arXiv:1412.6980.
47. Westhausen, N.L.; Meyer, B.T. Dual-Signal Transformation LSTM Network for Real-Time Noise Suppression. In Proceedings of the Interspeech 2020, Shanghai, China, 25–29 October 2020; pp. 2477–2481.
48. Mercorelli, P. A Fault Detection and Data Reconciliation Algorithm in Technical Processes with the Help of Haar Wavelets Packets. Algorithms **2017**, 10, 13.
49. Kominek, J.; Black, A.W. The CMU Arctic speech databases. In Proceedings of the Fifth ISCA Workshop on Speech Synthesis, Vienna, Austria, 20–22 September 2004.
50. Rix, A.W.; Beerends, J.G.; Hollier, M.P.; Hekstra, A.P. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs. In Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, Proceedings (Cat. No. 01CH37221), Salt Lake City, UT, USA, 7–11 May 2001; Volume 2, pp. 749–752.
51. Rix, A.W.; Hollier, M.P.; Hekstra, A.P.; Beerends, J.G. Perceptual Evaluation of Speech Quality (PESQ) The New ITU Standard for End-to-End Speech Quality Assessment Part I–Time-Delay Compensation. J. Audio Eng. Soc. **2002**, 50, 755–764.
52. Wang, L.; Zheng, W.; Ma, X.; Lin, S. Denoising speech based on deep learning and wavelet decomposition. Sci. Program. **2021**, 2021, 8677043.
53. Gnanamanickam, J.; Natarajan, Y.; KR, S.P. A hybrid speech enhancement algorithm for voice assistance application. Sensors **2021**, 21, 7025.

**Figure 1.** Different scales and shifts of the Ricker wavelet, also known as the “Mexican hat” wavelet.

**Figure 3.** Illustration of a multi-layer perceptron. Information flows from inputs to outputs through connections between unit i and unit j denoted as ${w}_{j}^{1}$. In each node, outputs ${s}_{k}^{i}$ are produced and propagated towards the outputs ${o}_{m}$ of the network. Hidden layers may differ in the number of units.

**Figure 4.** The four implementations for the experimental setup: wavelet enhancement (**a**), deep learning enhancement (**b**), wavelet + deep learning enhancement (**c**), deep learning + wavelet enhancement (**d**).

**Figure 5.** Sample of waveforms with and without degradation with White noise and the results after several procedures presented in the study.

**Figure 6.** Spectrograms with results of the enhancement process. First row: clean utterance. Second row: noisy utterance, with Babble SNR 0. Third row: wavelet enhancement. Fourth row: hybrid wavelet + deep learning enhancement.

**Table 1.** Babble noise PESQ. The higher values represent better results. In bold is the best result for each SNR level.

SNR (dB) | Noisy | Wavelets | DL | Wavelets + DL | DL + Wavelets |
---|---|---|---|---|---|
−10 | 0.44 | 0.49 | **0.53** | 0.51 | 0.52 |
−5 | 0.53 | 0.54 | 0.95 | **1.43** | 0.95 |
0 | 0.82 | 0.83 | 1.85 | **1.86** | 1.85 |
5 | 1.32 | 1.32 | **2.20** | 2.16 | **2.20** |
10 | 1.94 | 1.94 | 2.42 | **2.53** | 2.43 |

**Table 2.** Babble noise SegSNR. The higher values represent better results. In bold is the best result for each SNR level.

SNR (dB) | Noisy | Wavelets | DL | Wavelets + DL | DL + Wavelets |
---|---|---|---|---|---|
−10 | −15.74 | −15.72 | −0.98 | −0.99 | **−0.94** |
−5 | −10.75 | −10.74 | **0.76** | 0.69 | 0.621 |
0 | −5.80 | −5.82 | 4.90 | **4.94** | 4.62 |
5 | −0.98 | −1.04 | **6.38** | 6.02 | 5.92 |
10 | 3.60 | 3.43 | 7.12 | **7.58** | 6.45 |

**Table 3.** Pink noise PESQ. The higher values represent better results. In bold is the best result for each SNR level.

SNR (dB) | Noisy | Wavelets | DL | Wavelets + DL | DL + Wavelets |
---|---|---|---|---|---|
−10 | 0.16 | 0.04 | 1.27 | **1.29** | 1.26 |
−5 | 0.46 | 0.42 | 1.50 | **1.54** | 1.49 |
0 | 0.83 | 0.83 | 1.65 | **1.74** | 1.63 |
5 | 1.39 | 1.39 | **2.14** | 2.13 | 2.13 |
10 | 1.99 | 1.99 | 2.31 | **2.32** | 2.30 |

**Table 4.** Pink noise SegSNR. The higher values represent better results. In bold is the best result for each SNR level.

SNR (dB) | Noisy | Wavelets | DL | Wavelets + DL | DL + Wavelets |
---|---|---|---|---|---|
−10 | −15.11 | −9.98 | 4.26 | **4.51** | 4.36 |
−5 | −10.14 | −5.11 | 4.95 | **5.23** | 5.09 |
0 | −5.22 | −5.11 | 5.05 | **5.65** | 5.15 |
5 | −0.43 | −0.42 | **7.31** | 7.22 | 7.16 |
10 | 4.08 | 3.92 | **7.57** | 7.53 | 7.12 |

**Table 5.** White noise PESQ. The higher values represent better results. In bold is the best result for each SNR level.

SNR (dB) | Noisy | Wavelets | DL | Wavelets + DL | DL + Wavelets |
---|---|---|---|---|---|
−10 | 0.28 | 0.11 | 1.34 | **1.36** | 1.34 |
−5 | 0.58 | 0.56 | 1.67 | **1.75** | 1.65 |
0 | 0.94 | 0.94 | 1.76 | **1.81** | 1.75 |
5 | 1.43 | 1.43 | **1.92** | **1.92** | 1.90 |
10 | 1.95 | 1.94 | 2.23 | **2.44** | 2.20 |

**Table 6.** White noise SegSNR. The higher values represent better results. In bold is the best result for each SNR level.

SNR (dB) | Noisy | Wavelets | DL | Wavelets + DL | DL + Wavelets |
---|---|---|---|---|---|
−10 | −15.74 | −12.77 | 2.83 | **3.64** | 2.90 |
−5 | −10.77 | −7.84 | 5.50 | **6.40** | 5.70 |
0 | −5.84 | −3.03 | 7.71 | **8.61** | 7.85 |
5 | −1.03 | 1.49 | 9.54 | **10.21** | 9.66 |
10 | 3.54 | 5.51 | 11.29 | **11.51** | 11.34 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Gutiérrez-Muñoz, M.; Coto-Jiménez, M.
An Experimental Study on Speech Enhancement Based on a Combination of Wavelets and Deep Learning. *Computation* **2022**, *10*, 102.
https://doi.org/10.3390/computation10060102
