Multiresolution Speech Enhancement Based on Proposed Circular Nested Microphone Array in Combination with Sub-Band Affine Projection Algorithm
Abstract
1. Introduction
2. The Microphone Model and Proposed Nested Microphone Array
2.1. Microphone Signal Model
2.2. The Proposed Uniform Circular Nested Microphone Array
3. The Proposed Multiresolution Sub-band-APA for the Speech Enhancement
4. Results and Discussion
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
AP | Affine projection |
APA | Affine projection algorithm |
AR | Auto-regressive |
ARHMM | Auto-regressive hidden Markov model |
CNMA | Circular nested microphone array |
CNMA-SBAPA | Circular nested microphone array in combination with sub-band affine projection algorithm |
DB-MWF | Distributed multichannel Wiener filter |
DNN | Deep neural network |
FBK | Fondazione Bruno Kessler |
FIR | Finite impulse response |
HMM | Hidden Markov model |
HPF | High-pass filter |
IFD | Instantaneous frequency deviation |
LMR-APA | Levenberg–Marquardt regularized affine projection algorithm |
LMS | Least mean square |
LPF | Low-pass filter |
ML | Maximum likelihood |
MSE | Mean square error |
MMSE | Minimum mean square error |
MNMF | Multichannel nonnegative matrix factorization |
MNMF-MVDR | Multichannel nonnegative matrix factorization-minimum variance distortionless response |
MOS | Mean opinion score |
MVDR | Minimum variance distortionless response |
NLMS | Normalized least mean square |
NMA | Nested microphone array |
OCF-NLMS | Orthogonal correction factor-Normalized least mean square |
PESQ | Perceptual evaluation of speech quality |
PRAPA | Partial rank affine projection algorithm |
RLS | Recursive least square |
SBAPA | Sub-band affine projection algorithm |
SCM | Spatial covariance matrix |
SegSNR | Segmental signal-to-noise ratio |
SNR | Signal-to-noise ratio |
STOI | Short-time objective intelligibility |
STP | Short time predictor |
TTS | Text-to-speech |
VAD | Voice activity detector |
WF | Wiener filter |
Band | Bandwidth | Analysis Filter Bank Sub-bands | Center Frequency fc | Inter-Microphone Spacing Limit dlim |
---|---|---|---|---|
1 | B1 = [3900–7800] Hz | B1,1 = [6825–7800] Hz B1,2 = [5850–6825] Hz B1,3 = [4875–5850] Hz B1,4 = [3900–4875] Hz | 5850 Hz | <2.2 cm |
2 | B2 = [1950–3900] Hz | B2,1 = [2925–3900] Hz B2,2 = [1950–2925] Hz | 2925 Hz | <4.4 cm |
3 | B3 = [975–1950] Hz | B3,1 = [1425–1950] Hz B3,2 = [975–1425] Hz | 1462 Hz | <8.8 cm |
4 | B4 = [50–975] Hz | B4,1 = [512–975] Hz B4,2 = [50–512] Hz | 512 Hz | <17.6 cm |
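The dlim values in the table above are consistent with the usual half-wavelength spatial sampling condition, d < c / (2 f_max), evaluated at the upper edge of each band. The following is a minimal sketch of that arithmetic, not the authors' code: the speed of sound (343 m/s) and the function name `max_spacing_cm` are assumptions introduced here for illustration.

```python
# Assumed speed of sound in air at room temperature (not stated in the table).
SPEED_OF_SOUND_M_S = 343.0


def max_spacing_cm(f_max_hz, c=SPEED_OF_SOUND_M_S):
    """Half-wavelength limit on microphone spacing, in centimetres."""
    return 100.0 * c / (2.0 * f_max_hz)


# Upper band edges (Hz) taken from the Bandwidth column, bands 1 to 4.
band_upper_edges_hz = {1: 7800, 2: 3900, 3: 1950, 4: 975}

limits_cm = {
    band: round(max_spacing_cm(f_max), 1)
    for band, f_max in band_upper_edges_hz.items()
}
print(limits_cm)  # {1: 2.2, 2: 4.4, 3: 8.8, 4: 17.6}
```

Under this assumption each halving of the upper band edge doubles the admissible spacing, which is what allows the lower bands to use the larger sub-arrays of a nested array.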
Filters | Bandwidth (Hz) | fmin (Hz) | fmax (Hz) | Filter Length (Samples) |
---|---|---|---|---|
F1[n] | 462 | 50 | 512 | 93 |
F2[n] | 462 | 512 | 975 | 115 |
F3[n] | 450 | 975 | 1425 | 102 |
F4[n] | 525 | 1425 | 1950 | 124 |
F5[n] | 975 | 1950 | 2925 | 109 |
F6[n] | 975 | 2925 | 3900 | 118 |
F7[n] | 975 | 3900 | 4875 | 131 |
F8[n] | 975 | 4875 | 5850 | 140 |
F9[n] | 975 | 5850 | 6825 | 146 |
F10[n] | 975 | 6825 | 7800 | 151 |
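The table above gives only the band edges and lengths of the ten analysis filters. As a rough illustration of how such a bank could be instantiated, the sketch below builds linear-phase band-pass FIR filters with a Hamming-windowed sinc design. The design method, the 16 kHz sampling rate, and the helper names (`bandpass_fir`, `gain_at`) are assumptions made for this example; the source does not specify them here.

```python
import math

FS_HZ = 16000.0  # assumed sampling rate; not stated in the table

# (f_min in Hz, f_max in Hz, length in samples) for F1..F10, copied from the table.
FILTER_SPECS = [
    (50, 512, 93), (512, 975, 115), (975, 1425, 102), (1425, 1950, 124),
    (1950, 2925, 109), (2925, 3900, 118), (3900, 4875, 131),
    (4875, 5850, 140), (5850, 6825, 146), (6825, 7800, 151),
]


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with the limit value at x = 0."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def bandpass_fir(f_min, f_max, length, fs=FS_HZ):
    """Hamming-windowed-sinc band-pass taps (difference of two ideal low-pass responses)."""
    center = (length - 1) / 2.0
    lo, hi = 2.0 * f_min / fs, 2.0 * f_max / fs  # edges normalized to Nyquist
    taps = []
    for n in range(length):
        t = n - center
        ideal = hi * sinc(hi * t) - lo * sinc(lo * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
        taps.append(ideal * window)
    return taps


def gain_at(taps, freq_hz, fs=FS_HZ):
    """Magnitude of the filter's frequency response at a single frequency."""
    w = 2.0 * math.pi * freq_hz / fs
    re = sum(h * math.cos(w * n) for n, h in enumerate(taps))
    im = -sum(h * math.sin(w * n) for n, h in enumerate(taps))
    return math.hypot(re, im)


bank = [bandpass_fir(f_min, f_max, length) for f_min, f_max, length in FILTER_SPECS]

# Inspect F5 (1950-2925 Hz): near unity in the passband, small far outside it.
print(gain_at(bank[4], 2437.5), gain_at(bank[4], 6000.0))
```

With the short lengths listed for the narrow low-frequency filters, a windowed design of this kind has transition regions comparable to the bandwidth itself, so adjacent sub-bands overlap noticeably; a different design method would change that trade-off.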
Rating | Quality (Standard ITU-T P.800) | Impairment |
---|---|---|
5 | Excellent | Imperceptible |
4 | Good | Perceptible but not annoying |
3 | Fair | Slightly annoying |
2 | Poor | Annoying |
1 | Bad | Very annoying |
SNR (dB) | Method | Babble SegSNR (dB) | Babble PESQ | Babble STOI | Babble MOS | Train SegSNR (dB) | Train PESQ | Train STOI | Train MOS | Car SegSNR (dB) | Car PESQ | Car STOI | Car MOS | Restaurant SegSNR (dB) | Restaurant PESQ | Restaurant STOI | Restaurant MOS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
−10 | LMS | −5.23 | 0.34 | 0.44 | 1.10 | −5.74 | 0.29 | 0.41 | 1.05 | −6.25 | 0.25 | 0.36 | 1.05 | −6.87 | 0.18 | 0.37 | 1.00 |
−10 | APA | −4.63 | 0.63 | 0.51 | 1.15 | −5.15 | 0.56 | 0.47 | 1.10 | −5.96 | 0.48 | 0.43 | 1.05 | −6.08 | 0.52 | 0.39 | 1.10 |
−10 | RLS | −2.91 | 0.86 | 0.58 | 1.45 | −3.29 | 0.81 | 0.51 | 1.35 | −3.98 | 0.74 | 0.48 | 1.30 | −4.61 | 0.65 | 0.44 | 1.30 |
−10 | DB-MWF | −2.48 | 0.92 | 0.57 | 1.55 | −3.08 | 0.89 | 0.53 | 1.45 | −3.46 | 0.78 | 0.50 | 1.45 | −4.77 | 0.79 | 0.47 | 1.40 |
−10 | MNMF-MVDR | −2.23 | 1.04 | 0.60 | 1.60 | −2.83 | 0.96 | 0.56 | 1.50 | −3.29 | 0.85 | 0.53 | 1.55 | −4.29 | 0.84 | 0.48 | 1.45 |
−10 | CNMA-SBAPA | −1.71 | 1.19 | 0.65 | 1.85 | −2.09 | 1.16 | 0.63 | 1.70 | −2.69 | 1.03 | 0.59 | 1.65 | −3.14 | 0.99 | 0.56 | 1.60 |
−5 | LMS | −2.66 | 0.48 | 0.53 | 1.50 | −3.02 | 0.44 | 0.51 | 1.40 | −3.59 | 0.39 | 0.46 | 1.35 | −3.78 | 0.41 | 0.47 | 1.35 |
−5 | APA | −1.89 | 0.76 | 0.59 | 1.55 | −2.13 | 0.68 | 0.56 | 1.45 | −2.41 | 0.61 | 0.51 | 1.40 | −3.26 | 0.65 | 0.54 | 1.4 |
−5 | RLS | 0.57 | 0.91 | 0.65 | 1.75 | −0.08 | 0.82 | 0.62 | 1.70 | −1.14 | 0.79 | 0.58 | 1.60 | −1.97 | 0.72 | 0.60 | 1.55 |
−5 | DB-MWF | 1.18 | 0.98 | 0.67 | 1.90 | 0.81 | 0.94 | 0.62 | 1.80 | −0.35 | 0.87 | 0.60 | 1.65 | −0.29 | 0.93 | 0.59 | 1.70 |
−5 | MNMF-MVDR | 1.97 | 1.13 | 0.68 | 1.95 | 1.36 | 1.06 | 0.65 | 1.95 | 0.87 | 1.03 | 0.61 | 1.75 | 0.96 | 0.98 | 0.61 | 1.70 |
−5 | CNMA-SBAPA | 3.63 | 1.43 | 0.73 | 2.30 | 3.06 | 1.36 | 0.70 | 2.20 | 2.73 | 1.32 | 0.68 | 2.15 | 2.26 | 1.20 | 0.64 | 2.10 |
0 | LMS | 3.58 | 0.75 | 0.61 | 1.65 | 3.13 | 0.66 | 0.56 | 1.55 | 2.69 | 0.59 | 0.52 | 1.5 | 2.24 | 0.54 | 0.49 | 1.55 |
0 | APA | 4.07 | 1.14 | 0.63 | 1.8 | 3.75 | 1.05 | 0.61 | 1.70 | 3.21 | 0.99 | 0.57 | 1.65 | 3.56 | 0.93 | 0.55 | 1.65 |
0 | RLS | 4.12 | 1.53 | 0.71 | 2.05 | 3.98 | 1.47 | 0.66 | 1.95 | 3.67 | 1.41 | 0.64 | 1.9 | 3.92 | 1.36 | 0.62 | 1.85 |
0 | DB-MWF | 4.83 | 1.61 | 0.72 | 2.25 | 4.39 | 1.59 | 0.68 | 2.00 | 3.98 | 1.58 | 0.66 | 1.95 | 4.11 | 1.49 | 0.65 | 1.95 |
0 | MNMF-MVDR | 5.07 | 1.72 | 0.75 | 2.35 | 4.56 | 1.67 | 0.7 | 2.25 | 4.31 | 1.69 | 0.69 | 2.15 | 4.39 | 1.61 | 0.65 | 2.10 |
0 | CNMA-SBAPA | 5.40 | 2.04 | 0.78 | 2.85 | 5.13 | 1.98 | 0.75 | 2.70 | 4.95 | 1.91 | 0.73 | 2.65 | 4.78 | 1.84 | 0.75 | 2.55 |
5 | LMS | 8.59 | 1.22 | 0.67 | 1.75 | 8.46 | 1.14 | 0.64 | 1.70 | 8.25 | 1.09 | 0.60 | 1.7 | 8.17 | 1.01 | 0.57 | 1.65 |
5 | APA | 8.96 | 1.63 | 0.69 | 2.20 | 8.74 | 1.57 | 0.66 | 2.15 | 8.39 | 1.5 | 0.63 | 2.05 | 8.22 | 1.46 | 0.64 | 2.00 |
5 | RLS | 9.28 | 2.21 | 0.76 | 2.40 | 9.08 | 2.12 | 0.73 | 2.35 | 8.80 | 2.07 | 0.71 | 2.35 | 8.64 | 1.95 | 0.67 | 2.30 |
5 | DB-MWF | 9.56 | 2.32 | 0.75 | 2.60 | 9.32 | 2.08 | 0.71 | 2.60 | 9.12 | 2.19 | 0.70 | 2.40 | 8.82 | 2.07 | 0.66 | 2.45 |
5 | MNMF-MVDR | 10.12 | 2.44 | 0.78 | 2.75 | 9.66 | 2.34 | 0.75 | 2.70 | 9.54 | 2.31 | 0.72 | 2.60 | 9.25 | 2.24 | 0.68 | 2.50 |
5 | CNMA-SBAPA | 10.63 | 2.59 | 0.82 | 3.25 | 10.34 | 2.53 | 0.80 | 3.10 | 10.21 | 2.48 | 0.78 | 2.95 | 10.03 | 2.49 | 0.73 | 2.90 |
10 | LMS | 12.52 | 1.53 | 0.69 | 2.1 | 12.37 | 1.47 | 0.68 | 2.05 | 12.11 | 1.39 | 0.65 | 2.00 | 11.95 | 1.41 | 0.63 | 1.95 |
10 | APA | 12.86 | 2.2 | 0.73 | 2.55 | 12.62 | 2.13 | 0.70 | 2.45 | 12.43 | 2.07 | 0.65 | 2.40 | 12.27 | 2.08 | 0.66 | 2.45 |
10 | RLS | 13.47 | 2.45 | 0.78 | 2.70 | 13.28 | 2.39 | 0.75 | 2.6 | 12.86 | 2.33 | 0.71 | 2.55 | 12.54 | 2.38 | 0.70 | 2.50 |
10 | DB-MWF | 13.21 | 2.53 | 0.78 | 2.80 | 13.31 | 2.48 | 0.77 | 2.75 | 12.75 | 2.25 | 0.73 | 2.65 | 12.56 | 2.49 | 0.72 | 2.55 |
10 | MNMF-MVDR | 13.52 | 2.67 | 0.79 | 2.95 | 13.43 | 2.61 | 0.78 | 2.85 | 12.98 | 2.46 | 0.76 | 2.70 | 12.71 | 2.65 | 0.75 | 2.65 |
10 | CNMA-SBAPA | 14.02 | 3.08 | 0.83 | 3.20 | 13.59 | 2.96 | 0.82 | 3.10 | 13.28 | 2.87 | 0.79 | 3.05 | 13.09 | 2.92 | 0.79 | 2.95 |
15 | LMS | 15.12 | 1.76 | 0.72 | 2.45 | 15.09 | 1.69 | 0.72 | 2.35 | 15.13 | 1.64 | 0.69 | 2.30 | 15.4 | 1.68 | 0.7 | 2.25 |
15 | APA | 15.55 | 2.44 | 0.76 | 2.65 | 15.48 | 2.37 | 0.75 | 2.55 | 15.35 | 2.28 | 0.76 | 2.50 | 15.32 | 2.25 | 0.74 | 2.45 |
15 | RLS | 15.74 | 2.61 | 0.81 | 2.85 | 15.56 | 2.58 | 0.79 | 2.75 | 15.35 | 2.51 | 0.80 | 2.65 | 15.48 | 2.40 | 0.79 | 2.65 |
15 | DB-MWF | 15.52 | 2.99 | 0.83 | 3.00 | 15.38 | 2.83 | 0.76 | 2.90 | 15.24 | 2.72 | 0.79 | 2.75 | 15.52 | 2.71 | 0.80 | 2.80 |
15 | MNMF-MVDR | 15.63 | 3.03 | 0.84 | 3.10 | 15.42 | 2.96 | 0.78 | 3.05 | 15.42 | 2.93 | 0.81 | 2.90 | 15.45 | 2.80 | 0.83 | 2.85 |
15 | CNMA-SBAPA | 15.71 | 3.31 | 0.89 | 3.35 | 15.52 | 3.22 | 0.84 | 3.30 | 15.69 | 3.24 | 0.80 | 3.25 | 15.44 | 3.19 | 0.82 | 3.20 |
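For readers unfamiliar with the SegSNR columns, the sketch below shows one common way a segmental SNR is computed: the per-frame SNR in dB between a clean reference and the enhanced signal is clamped to a fixed range and then averaged over frames. The frame length, the clamping range of −10 to 35 dB, and the function name `segmental_snr_db` are assumptions chosen for illustration; the exact definition used to produce the table values is not given here and may differ.

```python
import math


def segmental_snr_db(clean, enhanced, frame_len=256, floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNRs (dB), each clamped to [floor_db, ceil_db]."""
    if len(clean) != len(enhanced):
        raise ValueError("clean and enhanced signals must have equal length")
    eps = 1e-12  # guards against division by zero and log of zero
    frame_values = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, est))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        frame_values.append(min(max(snr_db, floor_db), ceil_db))
    if not frame_values:
        raise ValueError("signals are shorter than one frame")
    return sum(frame_values) / len(frame_values)


# Toy check: a 440 Hz tone whose estimate is attenuated by 10 percent.
# The error is 0.1 times the signal, so every frame sits near 20 dB.
clean = [math.sin(2.0 * math.pi * 440.0 * n / 16000.0) for n in range(1024)]
enhanced = [0.9 * x for x in clean]
demo_db = segmental_snr_db(clean, enhanced)
print(round(demo_db, 3))
```

Because each frame is clamped before averaging, silent or heavily corrupted frames cannot dominate the result, which is the usual motivation for reporting a segmental rather than a global SNR.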
SNR (dB) | Method | Babble SegSNR (dB) | Babble PESQ | Babble STOI | Babble MOS | Train SegSNR (dB) | Train PESQ | Train STOI | Train MOS | Car SegSNR (dB) | Car PESQ | Car STOI | Car MOS | Restaurant SegSNR (dB) | Restaurant PESQ | Restaurant STOI | Restaurant MOS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
−10 | LMS | −5.56 | 0.32 | 0.41 | 1.00 | −5.82 | 0.25 | 0.39 | 1.00 | −6.46 | 0.18 | 0.36 | 1.00 | −6.92 | 0.14 | 0.35 | 1.00 |
−10 | APA | −4.82 | 0.61 | 0.47 | 1.10 | −5.42 | 0.55 | 0.46 | 1.15 | −5.83 | 0.51 | 0.42 | 1.10 | −6.21 | 0.48 | 0.39 | 1.05 |
−10 | RLS | −3.04 | 0.82 | 0.52 | 1.35 | −3.87 | 0.74 | 0.51 | 1.30 | −4.52 | 0.71 | 0.49 | 1.20 | −4.86 | 0.64 | 0.45 | 1.15 |
−10 | DB-MWF | −2.98 | 0.86 | 0.55 | 1.40 | −3.43 | 0.76 | 0.53 | 1.35 | −4.13 | 0.75 | 0.50 | 1.30 | −4.51 | 0.67 | 0.45 | 1.25 |
−10 | MNMF-MVDR | −2.54 | 0.93 | 0.57 | 1.50 | −3.01 | 0.83 | 0.52 | 1.45 | −3.81 | 0.80 | 0.51 | 1.35 | −4.09 | 0.75 | 0.47 | 1.30 |
−10 | CNMA-SBAPA | −2.21 | 1.08 | 0.58 | 1.65 | −2.67 | 1.01 | 0.56 | 1.50 | −3.14 | 0.96 | 0.53 | 1.45 | −3.46 | 0.92 | 0.53 | 1.40 |
−5 | LMS | −2.83 | 0.46 | 0.50 | 1.45 | −3.2 | 0.44 | 0.48 | 1.40 | −3.54 | 0.41 | 0.47 | 1.30 | −3.92 | 0.38 | 0.43 | 1.25 |
−5 | APA | −2.04 | 0.72 | 0.55 | 1.50 | −2.57 | 0.68 | 0.53 | 1.45 | −3.09 | 0.64 | 0.52 | 1.45 | −3.47 | 0.61 | 0.49 | 1.35 |
−5 | RLS | 0.14 | 0.88 | 0.62 | 1.60 | −0.52 | 0.82 | 0.61 | 1.55 | −1.45 | 0.74 | 0.58 | 1.50 | −1.88 | 0.69 | 0.54 | 1.50 |
−5 | DB-MWF | 0.93 | 0.91 | 0.64 | 1.75 | 0.12 | 0.85 | 0.64 | 1.65 | −0.78 | 0.79 | 0.60 | 1.60 | −1.57 | 0.73 | 0.56 | 1.65 |
−5 | MNMF-MVDR | 1.53 | 1.02 | 0.67 | 1.90 | 0.84 | 0.97 | 0.65 | 1.70 | 0.21 | 0.93 | 0.59 | 1.75 | −0.84 | 0.88 | 0.59 | 1.75 |
−5 | CNMA-SBAPA | 3.29 | 1.29 | 0.70 | 2.10 | 2.76 | 1.21 | 0.67 | 2.05 | 2.25 | 1.16 | 0.64 | 1.95 | 1.76 | 1.11 | 0.61 | 1.90 |
0 | LMS | 3.25 | 0.72 | 0.57 | 1.5 | 2.84 | 0.65 | 0.52 | 1.45 | 2.43 | 0.59 | 0.51 | 1.35 | 2.07 | 0.53 | 0.46 | 1.30 |
0 | APA | 3.99 | 1.05 | 0.62 | 1.75 | 3.71 | 1.01 | 0.57 | 1.65 | 3.67 | 0.94 | 0.52 | 1.6 | 3.44 | 0.89 | 0.51 | 1.50 |
0 | RLS | 4.15 | 1.49 | 0.69 | 1.95 | 4.03 | 1.44 | 0.65 | 1.85 | 3.95 | 1.38 | 0.62 | 1.8 | 3.79 | 1.33 | 0.59 | 1.70 |
0 | DB-MWF | 4.32 | 1.54 | 0.7 | 2.05 | 4.26 | 1.51 | 0.67 | 1.95 | 4.04 | 1.44 | 0.64 | 1.9 | 3.93 | 1.40 | 0.60 | 1.75 |
0 | MNMF-MVDR | 4.51 | 1.69 | 0.72 | 2.20 | 4.49 | 1.58 | 0.67 | 2.05 | 4.29 | 1.56 | 0.65 | 2.00 | 4.17 | 1.58 | 0.63 | 1.95 |
0 | CNMA-SBAPA | 5.03 | 1.95 | 0.74 | 2.55 | 4.86 | 1.87 | 0.71 | 2.50 | 4.71 | 1.76 | 0.70 | 2.45 | 4.58 | 1.72 | 0.67 | 2.45 |
5 | LMS | 8.41 | 1.18 | 0.66 | 1.6 | 8.22 | 1.09 | 0.62 | 1.55 | 8.14 | 1.02 | 0.59 | 1.50 | 8.01 | 0.96 | 0.56 | 1.50 |
5 | APA | 8.73 | 1.58 | 0.65 | 2.05 | 8.52 | 1.52 | 0.63 | 2.00 | 8.39 | 1.48 | 0.60 | 1.95 | 8.25 | 1.41 | 0.61 | 1.90 |
5 | RLS | 9.04 | 2.15 | 0.74 | 2.25 | 8.96 | 2.04 | 0.71 | 2.2 | 8.71 | 1.96 | 0.70 | 2.15 | 8.52 | 1.89 | 0.66 | 2.05 |
5 | DB-MWF | 9.32 | 2.23 | 0.73 | 2.5 | 9.38 | 2.05 | 0.7 | 2.4 | 8.83 | 2.06 | 0.70 | 2.25 | 8.86 | 1.92 | 0.65 | 2.10 |
5 | MNMF-MVDR | 9.58 | 2.34 | 0.76 | 2.65 | 9.62 | 2.26 | 0.72 | 2.55 | 9.22 | 2.19 | 0.71 | 2.40 | 9.32 | 2.08 | 0.66 | 2.35 |
5 | CNMA-SBAPA | 10.27 | 2.48 | 0.78 | 3.00 | 10.08 | 2.41 | 0.75 | 2.90 | 9.95 | 2.36 | 0.72 | 2.85 | 9.76 | 2.32 | 0.69 | 2.75 |
10 | LMS | 12.43 | 1.48 | 0.67 | 2.05 | 12.26 | 1.45 | 0.66 | 1.95 | 12.09 | 1.40 | 0.64 | 1.90 | 11.87 | 1.37 | 0.65 | 1.95 |
10 | APA | 12.91 | 2.15 | 0.70 | 2.40 | 12.62 | 2.11 | 0.69 | 2.35 | 12.47 | 2.07 | 0.67 | 2.35 | 12.31 | 2.01 | 0.66 | 2.25 |
10 | RLS | 13.28 | 2.41 | 0.75 | 2.55 | 13.01 | 2.39 | 0.74 | 2.50 | 12.76 | 2.36 | 0.72 | 2.40 | 12.62 | 2.31 | 0.73 | 2.35 |
10 | DB-MWF | 13.36 | 2.48 | 0.76 | 2.75 | 13.22 | 2.49 | 0.73 | 2.60 | 12.89 | 2.47 | 0.71 | 2.50 | 12.75 | 2.38 | 0.74 | 2.45 |
10 | MNMF-MVDR | 13.51 | 2.56 | 0.76 | 2.90 | 13.34 | 2.55 | 0.75 | 2.75 | 12.96 | 2.51 | 0.74 | 2.65 | 12.83 | 2.48 | 0.72 | 2.70 |
10 | CNMA-SBAPA | 13.88 | 2.92 | 0.81 | 3.05 | 13.51 | 2.88 | 0.79 | 3.00 | 13.19 | 2.82 | 0.77 | 2.95 | 12.99 | 2.79 | 0.75 | 3.00 |
15 | LMS | 15.08 | 1.74 | 0.70 | 2.40 | 15.1 | 1.69 | 0.67 | 2.30 | 15.04 | 1.65 | 0.67 | 2.15 | 15.25 | 1.60 | 0.66 | 2.10 |
15 | APA | 15.41 | 2.38 | 0.73 | 2.55 | 15.38 | 2.35 | 0.72 | 2.50 | 15.86 | 2.28 | 0.70 | 2.40 | 15.45 | 2.21 | 0.7 | 2.45 |
15 | RLS | 15.24 | 2.56 | 0.84 | 2.70 | 15.29 | 2.51 | 0.77 | 2.70 | 15.92 | 2.44 | 0.75 | 2.60 | 15.76 | 2.37 | 0.74 | 2.55 |
15 | DB-MWF | 15.39 | 2.81 | 0.85 | 2.85 | 15.36 | 2.62 | 0.77 | 2.80 | 15.81 | 2.50 | 0.78 | 2.75 | 15.78 | 2.46 | 0.77 | 2.65 |
15 | MNMF-MVDR | 15.48 | 2.94 | 0.82 | 3.00 | 15.42 | 2.74 | 0.79 | 2.90 | 15.80 | 2.68 | 0.80 | 2.85 | 15.73 | 2.59 | 0.74 | 2.70 |
15 | CNMA-SBAPA | 15.35 | 3.22 | 0.83 | 3.20 | 15.49 | 3.15 | 0.82 | 3.25 | 15.83 | 3.13 | 0.77 | 3.15 | 15.61 | 3.09 | 0.73 | 3.10 |
SNR (dB) | Method | Speed of Convergence (s), White Noise | Speed of Convergence (s), Babble Noise | Speed of Convergence (s), Train Noise | Speed of Convergence (s), Car Noise | Speed of Convergence (s), Restaurant Noise |
---|---|---|---|---|---|---|
−10 | LMS | 0.541 | 0.582 | 0.61 | 0.654 | 0.668 |
−10 | APA | 0.516 | 0.551 | 0.592 | 0.611 | 0.627 |
−10 | RLS | 0.422 | 0.468 | 0.496 | 0.539 | 0.546 |
−10 | DB-MWF | 0.586 | 0.612 | 0.652 | 0.673 | 0.695 |
−10 | MNMF-MVDR | 0.51 | 0.539 | 0.57 | 0.592 | 0.637 |
−10 | CNMA-SBAPA | 0.356 | 0.367 | 0.393 | 0.419 | 0.427 |
−5 | LMS | 0.537 | 0.556 | 0.579 | 0.601 | 0.634 |
−5 | APA | 0.497 | 0.527 | 0.541 | 0.56 | 0.572 |
−5 | RLS | 0.403 | 0.429 | 0.447 | 0.482 | 0.506 |
−5 | DB-MWF | 0.545 | 0.593 | 0.615 | 0.636 | 0.658 |
−5 | MNMF-MVDR | 0.494 | 0.509 | 0.527 | 0.553 | 0.587 |
−5 | CNMA-SBAPA | 0.337 | 0.356 | 0.379 | 0.391 | 0.411 |
0 | LMS | 0.516 | 0.538 | 0.562 | 0.568 | 0.595 |
0 | APA | 0.473 | 0.482 | 0.502 | 0.536 | 0.539 |
0 | RLS | 0.396 | 0.409 | 0.427 | 0.435 | 0.452 |
0 | DB-MWF | 0.531 | 0.563 | 0.579 | 0.602 | 0.621 |
0 | MNMF-MVDR | 0.485 | 0.498 | 0.516 | 0.531 | 0.546 |
0 | CNMA-SBAPA | 0.318 | 0.329 | 0.35 | 0.358 | 0.362 |
5 | LMS | 0.492 | 0.505 | 0.517 | 0.525 | 0.529 |
5 | APA | 0.464 | 0.479 | 0.492 | 0.467 | 0.503 |
5 | RLS | 0.388 | 0.395 | 0.401 | 0.412 | 0.418 |
5 | DB-MWF | 0.507 | 0.512 | 0.544 | 0.565 | 0.586 |
5 | MNMF-MVDR | 0.459 | 0.466 | 0.487 | 0.503 | 0.525 |
5 | CNMA-SBAPA | 0.327 | 0.336 | 0.347 | 0.359 | 0.365 |
10 | LMS | 0.488 | 0.498 | 0.509 | 0.521 | 0.538 |
10 | APA | 0.451 | 0.467 | 0.483 | 0.499 | 0.507 |
10 | RLS | 0.369 | 0.381 | 0.389 | 0.395 | 0.411 |
10 | DB-MWF | 0.478 | 0.496 | 0.513 | 0.543 | 0.559 |
10 | MNMF-MVDR | 0.437 | 0.458 | 0.479 | 0.491 | 0.516 |
10 | CNMA-SBAPA | 0.305 | 0.328 | 0.339 | 0.352 | 0.368 |
15 | LMS | 0.472 | 0.485 | 0.493 | 0.498 | 0.506 |
15 | APA | 0.463 | 0.474 | 0.478 | 0.485 | 0.49 |
15 | RLS | 0.372 | 0.376 | 0.385 | 0.396 | 0.408 |
15 | DB-MWF | 0.443 | 0.455 | 0.462 | 0.481 | 0.494 |
15 | MNMF-MVDR | 0.41 | 0.427 | 0.448 | 0.463 | 0.48 |
15 | CNMA-SBAPA | 0.299 | 0.319 | 0.325 | 0.349 | 0.374 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Dehghan Firoozabadi, A.; Irarrazaval, P.; Adasme, P.; Zabala-Blanco, D.; Durney, H.; Sanhueza, M.; Palacios-Játiva, P.; Azurdia-Meza, C. Multiresolution Speech Enhancement Based on Proposed Circular Nested Microphone Array in Combination with Sub-Band Affine Projection Algorithm. Appl. Sci. 2020, 10, 3955. https://doi.org/10.3390/app10113955