Applied Sciences

Application of Deep Learning in Speech Enhancement Technology

Special Issue Information

Dear Colleagues,

Speech enhancement aims to improve the quality of speech degraded by environmental noise using signal-processing techniques, and it is used in many applications such as voice communication, hearing aids, speech recognition, and human–robot interaction. In recent years, research in speech enhancement has advanced significantly thanks to deep learning and artificial intelligence techniques. When sufficient training data are available, deep neural networks can learn to predict clean speech from the noisy signal, achieving promising results in non-stationary and highly noisy acoustic environments. For this reason, deep learning-based speech enhancement has been investigated intensively and has become an active research topic in speech processing. A number of methods have been developed with the aim of solving speech enhancement problems in extremely challenging environments, developing new deep architectures, increasing the generality and explainability of deep models, incorporating deep learning into multi-channel signal processing, and enabling multi-modal speech enhancement. This Special Issue aims to accelerate research progress by reporting the latest theoretical and practical advances in applying deep learning to speech enhancement and by discussing emerging problems, creative solutions, and novel insights in the field. This Special Issue will mainly focus on (but is not limited to) the following deep learning-related topics:

Single-channel speech enhancement;
Multi-channel speech enhancement;
Multi-modal speech enhancement;
Explainable speech enhancement;
Novel applications of speech enhancement.

Dr. Lin Wang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.

Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.

Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.

External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.

Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (2 papers)

Research

12 pages, 494 KB

Open Access Article

Design of a Dual-Path Speech Enhancement Model

by Seorim Hwang, Sung Wook Park and Youngcheol Park

Appl. Sci. 2025, 15(11), 6358; https://doi.org/10.3390/app15116358 - 5 Jun 2025

Viewed by 1777

Abstract

Although both noise suppression and speech restoration are fundamental to speech enhancement, many deep neural network (DNN)-based approaches tend to focus disproportionately on one, often overlooking the importance of their joint handling. In this study, we propose a dual-path architecture designed to balance noise suppression and speech restoration. The main path consists of an encoder and two specialized decoders: one dedicated to estimating the clean speech spectrum and the other to predicting a noise suppression mask. To reinforce the joint modeling of noise suppression and speech restoration, we introduce an auxiliary refinement path. This path consists of a separate encoder–decoder structure and is designed to further refine the enhanced speech by incorporating complementary information learned independently from the main path. By using this dual-path architecture, the model better preserves fine speech details while reducing residual noise. Experimental results on the VoiceBank + DEMAND dataset show that our model surpasses conventional methods across multiple evaluation metrics in the causal setup. Specifically, it achieves a PESQ score of 3.33, reflecting improved speech quality, and a CSIG score of 4.48, indicating enhanced intelligibility. Furthermore, it demonstrates superior noise suppression, achieving an SNRseg of 10.44 and a CBAK score of 3.75.

(This article belongs to the Special Issue Application of Deep Learning in Speech Enhancement Technology)
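The abstract above reports a segmental SNR (SNRseg) without describing how it is computed. As a rough illustration only, the sketch below shows one common way to compute this metric: the SNR is evaluated per frame, clamped to a fixed range, and averaged. The function name `segmental_snr`, the frame and hop lengths, and the clamping range of -10 dB to 35 dB are assumptions made for this example; they are not taken from the paper, whose evaluation code may differ.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, hop=128,
                  lo=-10.0, hi=35.0, eps=1e-10):
    """Average per-frame SNR in dB between a clean and an enhanced signal.

    frame_len, hop, lo, and hi are illustrative defaults (assumptions),
    not values reported in the abstract.
    """
    n = min(len(clean), len(enhanced))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, hop):
        s = clean[start:start + frame_len]
        y = enhanced[start:start + frame_len]
        signal_energy = sum(a * a for a in s)
        error_energy = sum((a - b) ** 2 for a, b in zip(s, y))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp each frame so silent or perfectly matched frames
        # do not dominate the average.
        frame_snrs.append(min(max(snr_db, lo), hi))
    if not frame_snrs:
        raise ValueError("signals are shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

With these assumed defaults, an enhanced signal identical to the clean one saturates at the upper clamp, while an all-zero output yields 0 dB per frame because the error energy equals the signal energy.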

18 pages, 2345 KB

Open Access Article

SGM-EMA: Speech Enhancement Method Score-Based Diffusion Model and EMA Mechanism

by Yuezhou Wu, Zhiri Li and Hua Huang

Appl. Sci. 2025, 15(10), 5243; https://doi.org/10.3390/app15105243 - 8 May 2025

Viewed by 2308

Abstract

The score-based diffusion model has made significant progress in the field of computer vision, surpassing the performance of generative models, such as variational autoencoders, and has been extended to applications such as speech enhancement and recognition. This paper proposes a U-Net architecture using a score-based diffusion model and an efficient multi-scale attention mechanism (EMA) for the speech enhancement task. The model leverages the symmetric structure of U-Net to extract speech features and captures contextual information and local details across different scales using the EMA mechanism, improving speech quality in noisy environments. We evaluate the method on the VoiceBank-DEMAND (VB-DMD) dataset and the DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus–TUT Sound Events 2017 (TIMIT-TUT) dataset. The experimental results show that the proposed model performed well in terms of perceptual evaluation of speech quality (PESQ), extended short-time objective intelligibility (ESTOI), and scale-invariant signal-to-distortion ratio (SI-SDR). Especially when processing out-of-dataset noisy speech, the proposed method achieved excellent speech enhancement results compared to other methods, demonstrating the model’s strong generalization capability. We also conducted an ablation study on the SDE solver and the EMA mechanism, and the results show that the reverse diffusion method outperformed the Euler–Maruyama method, and the EMA strategy could improve the model performance. The results demonstrate the effectiveness of these two techniques in our system. Nevertheless, since the model is specifically designed for Gaussian noise, its performance under non-Gaussian or complex noise conditions may be limited.

(This article belongs to the Special Issue Application of Deep Learning in Speech Enhancement Technology)
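The abstract above lists SI-SDR among its evaluation metrics. As a minimal sketch, the commonly used definition projects the estimate onto the reference signal and compares the energy of that projection with the energy of the residual, which makes the score independent of the estimate's gain. The function name `si_sdr` and the mean-removal step are assumptions for illustration (mean removal is a frequent convention, not something stated in the abstract), and this is not the authors' evaluation code.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB (illustrative sketch)."""
    if len(estimate) != len(reference) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    # Remove the mean of each signal (a common convention, assumed here).
    mean_e = sum(estimate) / len(estimate)
    mean_r = sum(reference) / len(reference)
    e = [x - mean_e for x in estimate]
    r = [x - mean_r for x in reference]
    # Optimal scaling of the reference that best explains the estimate.
    alpha = sum(a * b for a, b in zip(e, r)) / (sum(b * b for b in r) + eps)
    target = [alpha * b for b in r]
    residual = [a - t for a, t in zip(e, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(v * v for v in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))
```

Because the reference is rescaled before the comparison, multiplying the estimate by a constant gain leaves the score essentially unchanged, which is the property that distinguishes SI-SDR from a plain SDR.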
