Search Results (7)

Search Parameters:
Keywords = speech dereverberation

20 pages, 5218 KB  
Article
A Robust Bilinear Framework for Real-Time Speech Separation and Dereverberation in Wearable Augmented Reality
by Alon Nemirovsky, Gal Itzhak and Israel Cohen
Sensors 2025, 25(17), 5484; https://doi.org/10.3390/s25175484 - 3 Sep 2025
Viewed by 1390
Abstract
This paper presents a bilinear framework for real-time speech source separation and dereverberation tailored to wearable augmented reality devices operating in dynamic acoustic environments. Using the Speech Enhancement for Augmented Reality (SPEAR) Challenge dataset, we perform extensive validation with real-world recordings and review key algorithmic parameters, including the forgetting factor and regularization. To enhance robustness against direction-of-arrival (DOA) estimation errors caused by head movements and localization uncertainty, we propose a region-of-interest (ROI) beamformer that replaces conventional point-source steering. Additionally, we introduce a multi-constraint beamforming design capable of simultaneously preserving multiple sources or suppressing known undesired sources. Experimental results demonstrate that ROI-based steering significantly improves robustness to localization errors while maintaining effective noise and reverberation suppression. However, this comes at the cost of increased high-frequency leakage from both desired and undesired sources. The multi-constraint formulation further enhances source separation with a modest trade-off in noise reduction. The proposed integration of ROI and LCMP within the low-complexity frameworks, validated comprehensively on the SPEAR dataset, offers a practical and efficient solution for real-time audio enhancement in wearable augmented reality systems. Full article
(This article belongs to the Special Issue Sensors and Wearables for AR/VR Applications)
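
As a point of reference for the multi-constraint (LCMP) design mentioned in this abstract, the sketch below computes standard linearly constrained minimum power weights, w = R^-1 C (C^H R^-1 C)^-1 g, with diagonal loading. It is a generic numpy illustration on toy inputs, not the authors' bilinear or ROI implementation, and the loading value and steering model are assumptions.

    # Generic LCMP beamformer sketch (illustration only, not the paper's method).
    import numpy as np

    def lcmp_weights(R, C, g, loading=1e-3):
        # R: spatial covariance (M x M), C: constraint/steering matrix (M x K),
        # g: desired responses (K,). Diagonal loading stands in for the
        # regularization discussed in the abstract (value is illustrative).
        M = R.shape[0]
        R_loaded = R + loading * np.trace(R).real / M * np.eye(M)
        Rinv_C = np.linalg.solve(R_loaded, C)
        return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, g)

    # Toy usage: 4 mics, preserve one direction (response 1), null another (response 0).
    M = 4
    rng = np.random.default_rng(0)
    A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
    R = A @ A.conj().T / M                      # stand-in spatial covariance
    C = np.stack([np.exp(-1j * np.pi * np.arange(M) * np.sin(th))
                  for th in (0.3, -0.8)], axis=1)   # two assumed steering vectors
    w = lcmp_weights(R, C, np.array([1.0, 0.0]))
    # Beamformer output for one STFT frame x would be y = w.conj() @ x.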

19 pages, 903 KB  
Article
Deep-Learning Framework for Efficient Real-Time Speech Enhancement and Dereverberation
by Tomer Rosenbaum, Emil Winebrand, Omer Cohen and Israel Cohen
Sensors 2025, 25(3), 630; https://doi.org/10.3390/s25030630 - 22 Jan 2025
Cited by 2 | Viewed by 8277
Abstract
Deep learning has revolutionized speech enhancement, enabling impressive high-quality noise reduction and dereverberation. However, state-of-the-art methods often demand substantial computational resources, hindering their deployment on edge devices and in real-time applications. Computationally efficient approaches like deep filtering and Deep Filter Net offer an attractive alternative by predicting linear filters instead of directly estimating the clean speech. While Deep Filter Net excels in noise reduction, its dereverberation performance remains limited. In this paper, we present a generalized framework for computationally efficient speech enhancement and, based on this framework, identify an inherent constraint within Deep Filter Net that hinders its dereverberation capabilities. We propose an extension to the Deep Filter Net framework designed to overcome this limitation, demonstrating significant improvements in dereverberation performance while maintaining competitive noise-reduction quality. Our experimental results highlight the potential of this enhanced framework for real-time speech enhancement on resource-constrained devices. Full article
(This article belongs to the Section Physical Sensors)
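
For readers unfamiliar with deep filtering, the sketch below shows only the output stage the abstract alludes to: applying a short per-bin complex filter, assumed to be predicted by a network, along past STFT frames. It is a generic illustration with random stand-ins for the STFT and the predicted coefficients, not the Deep Filter Net architecture or the authors' extension.

    # Applying predicted per-bin complex filters (the network itself is not shown).
    import numpy as np

    def apply_deep_filter(X, H):
        # X: noisy STFT, shape (T, F), complex.
        # H: predicted filters, shape (T, F, N), complex (assumed network output).
        T, F = X.shape
        N = H.shape[-1]
        Y = np.zeros_like(X)
        for i in range(N):
            X_shift = np.zeros_like(X)
            X_shift[i:, :] = X[:T - i, :]   # STFT delayed by i frames
            Y += H[..., i] * X_shift        # tap i weights the delayed frames
        return Y

    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 257)) + 1j * rng.standard_normal((100, 257))
    H = rng.standard_normal((100, 257, 5)) + 1j * rng.standard_normal((100, 257, 5))
    Y = apply_deep_filter(X, H)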

13 pages, 3138 KB  
Article
Iteratively Refined Multi-Channel Speech Separation
by Xu Zhang, Changchun Bao, Xue Yang and Jing Zhou
Appl. Sci. 2024, 14(14), 6375; https://doi.org/10.3390/app14146375 - 22 Jul 2024
Cited by 1 | Viewed by 2026
Abstract
The combination of neural networks and beamforming has proven very effective in multi-channel speech separation, but its performance faces a challenge in complex environments. In this paper, an iteratively refined multi-channel speech separation method is proposed to meet this challenge. The proposed method is composed of initial separation and iterative separation. In the initial separation, a time–frequency domain dual-path recurrent neural network (TFDPRNN), minimum variance distortionless response (MVDR) beamformer, and post-separation are cascaded to obtain the first additional input in the iterative separation process. In iterative separation, the MVDR beamformer and post-separation are iteratively used, where the output of the MVDR beamformer is used as an additional input to the post-separation network and the final output comes from the post-separation module. This iteration of the beamformer and post-separation is fully employed for promoting their optimization, which ultimately improves the overall performance. Experiments on the spatialized version of the WSJ0-2mix corpus showed that our proposed method achieved a signal-to-distortion ratio (SDR) improvement of 24.17 dB, which was significantly better than the current popular methods. In addition, the method also achieved an SDR of 20.2 dB on joint separation and dereverberation tasks. These results indicate our method’s effectiveness and significance in the multi-channel speech separation field. Full article
(This article belongs to the Special Issue Advanced Technology in Speech and Acoustic Signal Processing)
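
The MVDR stage reused throughout the iteration is the classical minimum variance distortionless response solution. The sketch below covers only that weight computation under assumed toy inputs; the TFDPRNN, the post-separation network, and the iteration itself are omitted.

    # Classical MVDR weights: w = R^-1 d / (d^H R^-1 d), with diagonal loading.
    import numpy as np

    def mvdr_weights(R_noise, d, loading=1e-3):
        M = R_noise.shape[0]
        R = R_noise + loading * np.trace(R_noise).real / M * np.eye(M)
        Rinv_d = np.linalg.solve(R, d)
        return Rinv_d / (d.conj() @ Rinv_d)

    M = 4
    rng = np.random.default_rng(0)
    A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
    R_noise = A @ A.conj().T / M                     # stand-in noise covariance
    d = np.exp(-1j * np.pi * np.arange(M) * 0.25)    # assumed steering vector
    w = mvdr_weights(R_noise, d)
    # In the described pipeline, the beamformed output would then be passed to
    # the post-separation network and the statistics re-estimated for the next pass.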

15 pages, 473 KB  
Article
Crossband Filtering for Weighted Prediction Error-Based Speech Dereverberation
by Tomer Rosenbaum, Israel Cohen and Emil Winebrand
Appl. Sci. 2023, 13(17), 9537; https://doi.org/10.3390/app13179537 - 23 Aug 2023
Cited by 2 | Viewed by 2025
Abstract
Weighted prediction error (WPE) is a linear prediction-based method extensively used to predict and attenuate the late reverberation component of an observed speech signal. This paper introduces an extended version of the WPE method to enhance the modeling accuracy in the time–frequency domain by incorporating crossband filters. Two approaches to extending the WPE while considering crossband filters are proposed and investigated. The first approach improves the model’s accuracy. However, it increases the computational complexity, while the second approach maintains the same computational complexity as the conventional WPE while still achieving improved accuracy and comparable performance to the first approach. To validate the effectiveness of the proposed methods, extensive simulations are conducted. The experimental results demonstrate that both methods outperform the conventional WPE regarding dereverberation performance. These findings highlight the potential of incorporating crossband filters in improving the accuracy and efficacy of the WPE method for dereverberation tasks. Full article
(This article belongs to the Special Issue Automatic Speech Signal Processing)
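
The conventional WPE baseline that this paper extends admits a compact single-channel, per-frequency-bin sketch: alternately estimate the speech PSD and solve the PSD-weighted linear prediction of the late reverberation. The sketch below is that baseline only, with illustrative order, delay, and iteration counts; the proposed crossband filters are not included.

    # Conventional single-channel WPE for one frequency bin (baseline only).
    import numpy as np

    def wpe_band(x, order=10, delay=3, iters=3, eps=1e-8):
        # x: complex STFT coefficients of one frequency bin, shape (T,).
        T = len(x)
        d = x.copy()
        for _ in range(iters):
            lam = np.maximum(np.abs(d) ** 2, eps)        # speech PSD estimate
            X = np.zeros((T, order), dtype=complex)       # delayed observations
            for k in range(order):
                tau = delay + k
                X[tau:, k] = x[:T - tau]
            A = (X.conj().T / lam) @ X                   # PSD-weighted normal equations
            b = (X.conj().T / lam) @ x
            g = np.linalg.solve(A + eps * np.eye(order), b)
            d = x - X @ g                                # subtract predicted late reverb
        return d

    rng = np.random.default_rng(4)
    x = rng.standard_normal(200) + 1j * rng.standard_normal(200)  # stand-in bin track
    d_hat = wpe_band(x)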

22 pages, 2743 KB  
Article
Effective Dereverberation with a Lower Complexity at Presence of the Noise
by Fengqi Tan, Changchun Bao and Jing Zhou
Appl. Sci. 2022, 12(22), 11819; https://doi.org/10.3390/app122211819 - 21 Nov 2022
Cited by 5 | Viewed by 2514
Abstract
Adaptive beamforming and deconvolution techniques have shown effectiveness for reducing noise and reverberation. The minimum variance distortionless response (MVDR) beamformer is the most widely used for adaptive beamforming, whereas multichannel linear prediction (MCLP) is an excellent approach for the deconvolution. How to solve the problem where the noise and reverberation occur together is a challenging task. In this paper, the MVDR beamformer and MCLP are effectively combined for noise reduction and dereverberation. Especially, the MCLP coefficients are estimated by the Kalman filter and the MVDR filter based on the complex Gaussian mixture model (CGMM) is used to enhance the speech corrupted by the reverberation with the noise and to estimate the power spectral density (PSD) of the target speech required by the Kalman filter, respectively. The final enhanced speech is obtained by the Kalman filter. Furthermore, a complexity reduction method with respect to the Kalman filter is also proposed based on the Kronecker product. Compared to two advanced algorithms, the integrated sidelobe cancellation and linear prediction (ISCLP) method and the weighted prediction error (WPE) method, which are very effective for removing reverberation, the proposed algorithm shows better performance and lower complexity. Full article
(This article belongs to the Special Issue Advances in Speech and Language Processing)
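
The Kronecker-product idea behind the complexity reduction can be illustrated outside the Kalman filter: constraining a long filter to h = kron(h1, h2) replaces L1*L2 coefficients with L1 + L2 parameters, and filtering with h factors into two short convolutions. The numpy sketch below demonstrates only this structural equivalence on toy signals; the paper's Kalman-filter and CGMM-based MVDR formulations are not reproduced.

    # Kronecker-structured filter: L1*L2 taps represented by L1 + L2 parameters.
    import numpy as np

    L1, L2 = 8, 25
    rng = np.random.default_rng(2)
    h1 = rng.standard_normal(L1)
    h2 = rng.standard_normal(L2)
    h = np.kron(h1, h2)               # full filter, length 200
    print(h.shape[0], L1 + L2)        # 200 coefficients from 33 parameters

    # Filtering with h is equivalent to filtering with h2 and then with an
    # L2-fold upsampled h1, which is the kind of structure such schemes exploit.
    x = rng.standard_normal(1000)
    y_full = np.convolve(x, h)
    h1_up = np.zeros(L1 * L2)
    h1_up[::L2] = h1                  # insert L2 - 1 zeros between the taps of h1
    y_fact = np.convolve(np.convolve(x, h2), h1_up)
    print(np.allclose(y_full, y_fact[:len(y_full)]))   # True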

20 pages, 5998 KB  
Article
De-Noising Process in Room Impulse Response with Generalized Spectral Subtraction
by Min Chen and Chang-Myung Lee
Appl. Sci. 2021, 11(15), 6858; https://doi.org/10.3390/app11156858 - 26 Jul 2021
Cited by 5 | Viewed by 2700
Abstract
The generalized spectral subtraction algorithm (GBSS), which has extraordinary ability in background noise reduction, is historically one of the first approaches used for speech enhancement and dereverberation. However, the algorithm has not been applied to de-noise the room impulse response (RIR) to extend the reverberation decay range. The application of the GBSS algorithm in this study is stated as an optimization problem, that is, subtracting the noise level from the RIR while maintaining the signal quality. The optimization process conducted in the measurements of the RIRs with artificial noise and natural ambient noise aims to determine the optimal sets of factors to achieve the best noise reduction results regarding the largest dynamic range improvement. The optimal factors are set variables determined by the estimated SNRs of the RIRs filtered in the octave band. The acoustic parameters, the reverberation time (RT), and early decay time (EDT), and the dynamic range improvement of the energy decay curve were used as control measures and evaluation criteria to ensure the reliability of the algorithm. The de-noising results were compared with noise compensation methods. With the achieved optimal factors, the GBSS contributes to a significant effect in terms of dynamic range improvement and decreases the estimation errors in the RTs caused by noise levels. Full article
(This article belongs to the Section Acoustics and Vibrations)
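
Generalized spectral subtraction itself reduces to a few lines: subtract a scaled power (or magnitude) noise spectrum, apply a spectral floor, and resynthesize with the noisy phase. The sketch below uses illustrative exponent, over-subtraction, and floor values; the paper's per-octave-band, SNR-dependent optimization of these factors for RIR de-noising is not reproduced.

    # Generic generalized spectral subtraction in the STFT domain (illustrative factors).
    import numpy as np

    def generalized_spectral_subtraction(X, noise_mag, a=2.0, alpha=2.0, beta=0.01):
        # X: STFT of the noisy signal (complex); noise_mag: estimated noise
        # magnitude spectrum (e.g., averaged over a noise-only segment).
        mag = np.abs(X) ** a
        sub = mag - alpha * noise_mag ** a            # over-subtraction
        floored = np.maximum(sub, beta * mag)         # spectral floor
        return floored ** (1.0 / a) * np.exp(1j * np.angle(X))   # keep noisy phase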

15 pages, 2096 KB  
Article
A Novel Scheme for Single-Channel Speech Dereverberation
by Nikolaos Kilis and Nikolaos Mitianoudis
Acoustics 2019, 1(3), 711-725; https://doi.org/10.3390/acoustics1030042 - 5 Sep 2019
Cited by 8 | Viewed by 4759
Abstract
This paper presents a novel scheme for speech dereverberation. The core of our method is a two-stage single-channel speech enhancement scheme. Degraded speech obtains a sparser representation of the linear prediction residual in the first stage of our proposed scheme by applying orthogonal matching pursuit on overcomplete bases, trained by the K-SVD algorithm. Our method includes an estimation of reverberation and mixing time from a recorded hand clap or a simulated room impulse response, which are used to create a time-domain envelope. Late reverberation is suppressed at the second stage by estimating its energy from the previous envelope and removed with spectral subtraction. Further speech enhancement is applied on minimizing the background noise, based on optimal smoothing and minimum statistics. Experimental results indicate favorable quality, compared to two state-of-the-art methods, especially in real reverberant environments with increased reverberation and background noise. Full article
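
The first stage's sparse coding step can be illustrated with a plain orthogonal matching pursuit loop over a fixed overcomplete dictionary. The K-SVD training, the reverberation-envelope estimation, and the spectral-subtraction stage are not reproduced here, and the dictionary below is a random stand-in rather than one trained on linear-prediction residuals.

    # Plain OMP over a fixed dictionary (stand-in for the first-stage sparse coding).
    import numpy as np

    def omp(D, y, n_nonzero):
        # D: dictionary (n_samples x n_atoms) with unit-norm columns; y: signal frame.
        residual = y.copy()
        support = []
        coeffs = np.zeros(D.shape[1])
        for _ in range(n_nonzero):
            k = int(np.argmax(np.abs(D.T @ residual)))      # most correlated atom
            support.append(k)
            sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
            residual = y - D[:, support] @ sol              # re-fit on the support
        coeffs[support] = sol
        return coeffs

    rng = np.random.default_rng(3)
    D = rng.standard_normal((160, 512))
    D /= np.linalg.norm(D, axis=0)                          # unit-norm atoms
    y = D[:, [10, 200, 400]] @ np.array([1.0, -0.5, 0.8])   # 3-sparse test frame
    x_hat = omp(D, y, n_nonzero=3)
    print(np.flatnonzero(x_hat))                            # typically recovers 10, 200, 400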
