Open Access Article

Improvement of Speech/Music Classification for 3GPP EVS Based on LSTM

Department of Electronic Engineering, Inha University, Incheon 22212, Korea
* Author to whom correspondence should be addressed.
Symmetry 2018, 10(11), 605; https://doi.org/10.3390/sym10110605
Received: 18 October 2018 / Revised: 1 November 2018 / Accepted: 5 November 2018 / Published: 7 November 2018
(This article belongs to the Special Issue Emerging Approaches and Advances in Big Data)
Competition in smartphone-related speech recognition technology is intensifying as Internet of Things (IoT) devices become widespread. Robust speech recognition requires detecting speech signals in a variety of acoustic environments. Speech/music classification, which allows subsequent signal processing to be optimized according to the classification result, has been widely adopted as an essential part of various electronics applications, such as multi-rate audio codecs, automatic speech recognition, and multimedia document indexing. In this paper, we propose a new technique that uses long short-term memory (LSTM) to improve the robustness of the speech/music classifier in the enhanced voice service (EVS) codec, which has been adopted as a voice-over-LTE (VoLTE) speech codec. For effective speech/music classification, the feature vectors used with the LSTM are chosen from the features of the EVS. To cope with the diversity of music data, a large-scale dataset is used for training. Experiments show that LSTM-based speech/music classification provides better results than the conventional EVS speech/music classification algorithm across various conditions and types of speech/music data, especially at low signal-to-noise ratios (SNRs).
Keywords: speech/music classification; Enhanced Voice Service; long short-term memory; big data
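
As a rough illustration of the idea summarized in the abstract, the sketch below runs a single LSTM layer over a sequence of per-frame feature vectors and emits, for each frame, a probability that the frame is music. It is a minimal sketch under stated assumptions, not the authors' network: the abstract does not specify the feature set, layer sizes, or training procedure, so the feature dimension, the hidden size, the function names (`init_params`, `lstm_music_probs`), and the random untrained weights are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def init_params(n_features, hidden, seed=0):
    # Hypothetical, untrained weights; a real classifier would learn these.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "hidden": hidden,
        "W": mat(4 * hidden, n_features),  # input weights for gates i, f, o, g
        "U": mat(4 * hidden, hidden),      # recurrent weights for the same gates
        "b": [0.0] * (4 * hidden),
        "w_out": [rng.gauss(0.0, 0.1) for _ in range(hidden)],
        "b_out": 0.0,
    }


def lstm_music_probs(frames, p):
    """Return one music probability per frame (speech would be 1 - probability)."""
    H = p["hidden"]
    h = [0.0] * H  # hidden state carried from frame to frame
    c = [0.0] * H  # cell state carried from frame to frame
    probs = []
    for x in frames:
        wx = matvec(p["W"], x)
        uh = matvec(p["U"], h)
        z = [a + b + bias for a, b, bias in zip(wx, uh, p["b"])]
        i = [sigmoid(v) for v in z[:H]]            # input gate
        f = [sigmoid(v) for v in z[H:2 * H]]       # forget gate
        o = [sigmoid(v) for v in z[2 * H:3 * H]]   # output gate
        g = [math.tanh(v) for v in z[3 * H:]]      # candidate cell update
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        logit = sum(w * hk for w, hk in zip(p["w_out"], h)) + p["b_out"]
        probs.append(sigmoid(logit))
    return probs


if __name__ == "__main__":
    params = init_params(n_features=6, hidden=4, seed=1)
    # Five made-up frames of six features each, standing in for codec features.
    frames = [[0.1 * k + 0.01 * j for j in range(6)] for k in range(5)]
    for t, prob in enumerate(lstm_music_probs(frames, params)):
        print(t, "music" if prob >= 0.5 else "speech", round(prob, 3))
```

Because the hidden and cell states persist across frames, each decision depends on the preceding frames as well as the current one, which is the property that motivates using an LSTM rather than a frame-by-frame classifier.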

MDPI and ACS Style

Kang, S.-I.; Lee, S. Improvement of Speech/Music Classification for 3GPP EVS Based on LSTM. Symmetry 2018, 10, 605.
