
A Multi-Resolution Approach to GAN-Based Speech Enhancement

Department of Electrical and Computer Engineering and the Institute of New Media and Communications, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul 08826, Korea
Author to whom correspondence should be addressed.
Appl. Sci. 2021, 11(2), 721;
Received: 2 December 2020 / Revised: 8 January 2021 / Accepted: 10 January 2021 / Published: 13 January 2021
(This article belongs to the Special Issue Artificial Intelligence for Multimedia Signal Processing)
Recently, generative adversarial networks (GANs) have been successfully applied to speech enhancement. However, two issues remain: (1) GAN-based training is typically unstable due to its non-convex property, and (2) most conventional methods do not fully exploit the characteristics of speech, which can result in a sub-optimal solution. To address these problems, we propose a progressive generator that handles speech in a multi-resolution fashion. Additionally, we propose a multi-scale discriminator that discriminates between real and generated speech at various sampling rates to stabilize GAN training. The proposed structure was compared with conventional GAN-based speech enhancement algorithms on the VoiceBank-DEMAND dataset. Experimental results showed that the proposed approach makes training faster and more stable, which improves performance on various speech enhancement metrics.
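As a rough illustration of what "various sampling rates" could mean in practice, the sketch below builds several views of one waveform, each at half the rate of the previous one. This is the kind of input a multi-scale discriminator might receive. The function names (`downsample_by_two`, `multi_rate_views`), the 16 kHz base rate, the number of levels, and the pair-averaging downsampler are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
def downsample_by_two(signal):
    """Halve the sampling rate by averaging each pair of adjacent samples.

    Pair averaging is a crude low-pass filter followed by decimation. It is
    an illustrative stand-in, not the resampling used in the paper.
    """
    usable = len(signal) - len(signal) % 2  # drop a trailing odd sample
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, usable, 2)]


def multi_rate_views(signal, base_rate, levels):
    """Return [(rate, samples), ...], starting at base_rate and halving per level."""
    views = [(base_rate, list(signal))]
    for _ in range(levels - 1):
        rate, samples = views[-1]
        views.append((rate // 2, downsample_by_two(samples)))
    return views


if __name__ == "__main__":
    # Toy waveform; the base rate of 16000 Hz is a hypothetical choice.
    toy = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    for rate, samples in multi_rate_views(toy, base_rate=16000, levels=3):
        print(rate, len(samples))
```

In a real system, each view would be passed to its own discriminator (or discriminator branch), and a proper anti-aliasing resampler would replace the pair averaging used here.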
Keywords: speech enhancement; generative adversarial network; relativistic GAN; convolutional neural network

MDPI and ACS Style

Kim, H.Y.; Yoon, J.W.; Cheon, S.J.; Kang, W.H.; Kim, N.S. A Multi-Resolution Approach to GAN-Based Speech Enhancement. Appl. Sci. 2021, 11, 721.

AMA Style

Kim HY, Yoon JW, Cheon SJ, Kang WH, Kim NS. A Multi-Resolution Approach to GAN-Based Speech Enhancement. Applied Sciences. 2021; 11(2):721.

Chicago/Turabian Style

Kim, Hyung Y.; Yoon, Ji W.; Cheon, Sung J.; Kang, Woo H.; Kim, Nam S. 2021. "A Multi-Resolution Approach to GAN-Based Speech Enhancement" Appl. Sci. 11, no. 2: 721.
