An Investigation of a Feature-Level Fusion for Noisy Speech Emotion Recognition

1 Team Networks, Telecoms & Multimedia, University of Hassan II Casablanca, Casablanca 20000, Morocco
2 COSIM Lab, Higher School of Communications of Tunis, Carthage University, Ariana 2083, Tunisia
* Author to whom correspondence should be addressed.
Computers 2019, 8(4), 91; https://doi.org/10.3390/computers8040091
Received: 18 October 2019 / Revised: 4 December 2019 / Accepted: 11 December 2019 / Published: 13 December 2019
(This article belongs to the Special Issue Mobile, Secure and Programmable Networking (MSPN'2019))
Because one of the key issues in improving the performance of Speech Emotion Recognition (SER) systems is the choice of an effective feature representation, most research has focused on developing a feature-level fusion using a large set of features. In our study, we propose a relatively low-dimensional feature set that combines three features: baseline Mel Frequency Cepstral Coefficients (MFCCs), MFCCs derived from Discrete Wavelet Transform (DWT) sub-band coefficients, denoted DMFCC, and pitch-based features. Moreover, the performance of the proposed feature extraction method is evaluated in clean conditions and in the presence of several real-world noises. Furthermore, conventional Machine Learning (ML) and Deep Learning (DL) classifiers are employed for comparison. The proposal is tested on speech utterances from both the Berlin Emotional Database (EMO-DB) and the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database in speaker-independent experiments. Experimental results show improvement in speech emotion detection over the baselines.
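The fusion described above concatenates, per frame, baseline cepstral features, cepstral features computed from DWT sub-bands, and a pitch estimate. The following numpy-only sketch illustrates that feature-level fusion under simplifying assumptions: a real cepstrum stands in for proper MFCCs (which would use a mel filterbank, e.g., via librosa), a one-level Haar transform stands in for the paper's DWT decomposition, and pitch is estimated with a crude autocorrelation peak pick. Dimensions and function names are illustrative, not the authors' implementation.

```python
import numpy as np

def cepstral_features(frame, n_coeffs=13):
    """Simplified cepstral coefficients (real cepstrum) as a
    stand-in for MFCCs; a real system would use a mel filterbank."""
    spectrum = np.abs(np.fft.rfft(frame)) + 1e-10
    cepstrum = np.fft.irfft(np.log(spectrum))
    return cepstrum[:n_coeffs]

def haar_dwt(signal):
    """One-level Haar DWT: approximation and detail sub-bands
    (the paper's DWT; the wavelet choice here is a simplification)."""
    x = signal[: len(signal) // 2 * 2]
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def pitch_autocorr(frame, sr=16000, fmin=50, fmax=400):
    """Crude pitch estimate: largest autocorrelation peak in a
    plausible lag range for speech (50-400 Hz)."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = sr // fmax, sr // fmin
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

def fused_features(frame, sr=16000):
    """Feature-level fusion: concatenate MFCC-like features of the
    frame, DMFCC-like features of each DWT sub-band, and pitch."""
    mfcc_like = cepstral_features(frame)
    approx, detail = haar_dwt(frame)
    dmfcc_like = np.concatenate(
        [cepstral_features(approx), cepstral_features(detail)])
    f0 = pitch_autocorr(frame, sr)
    return np.concatenate([mfcc_like, dmfcc_like, [f0]])

# Example: one 512-sample frame of a 200 Hz synthetic tone
sr = 16000
t = np.arange(512) / sr
frame = np.sin(2 * np.pi * 200 * t)
feat = fused_features(frame, sr)
print(feat.shape)  # (40,): 13 MFCC-like + 26 DMFCC-like + 1 pitch
```

The resulting per-frame vector would then be fed to the ML/DL classifiers; in a full system, frame-level vectors are typically aggregated (e.g., by mean and standard deviation) into one utterance-level feature vector before classification.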
Keywords: speech emotion recognition; feature fusion; SVM; naive Bayes; wavelet
Figure 1
MDPI and ACS Style

Sekkate, S.; Khalil, M.; Adib, A.; Ben Jebara, S. An Investigation of a Feature-Level Fusion for Noisy Speech Emotion Recognition. Computers 2019, 8, 91.

