Open Access Article
Sensors 2017, 17(7), 1694; doi:10.3390/s17071694

Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN

1. Department of Software Engineering, China University of Petroleum, No. 66 Changjiang West Road, Qingdao 266031, China
2. Department of Information Processing Science, University of Oulu, Oulu FI-91004, Finland
* Authors to whom correspondence should be addressed.
Received: 8 March 2017 / Revised: 12 June 2017 / Accepted: 14 July 2017 / Published: 24 July 2017

Abstract

Accurate emotion recognition from speech is important for applications such as smart health care, smart entertainment, and other smart services. High-accuracy emotion recognition from Chinese speech is challenging due to the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, covering both speech signal feature extraction and emotion classification methods. Five types of features are extracted from each speech sample: Mel-frequency cepstral coefficients (MFCC), pitch, formants, short-term zero-crossing rate, and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the features that best identify the emotional state of speech. We propose a novel classification method that combines a DBN with a support vector machine (SVM) rather than using either alone. In addition, a conjugate gradient method is applied to train the DBN in order to speed up training. Gender-dependent experiments are conducted on an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features reflect emotional state better than hand-crafted features, and that our combined classification approach achieves an accuracy of 95.8%, higher than using either the DBN or the SVM alone. The results also show that a properly designed DBN can work very well even on small training databases.
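Two of the simpler features named in the abstract, short-term energy and short-term zero-crossing rate, can be computed directly from framed audio; MFCC, pitch, and formant extraction would normally rely on a dedicated audio library (e.g., librosa or openSMILE) and are omitted here. Below is a minimal sketch on a synthetic tone; the frame length and hop size are illustrative choices, not values taken from the paper:

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    # Split a 1-D signal into overlapping frames of length frame_len.
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def short_term_energy(frames):
    # Sum of squared samples per frame.
    return np.sum(frames ** 2, axis=1)

def short_term_zcr(frames):
    # Fraction of adjacent sample pairs whose sign differs, per frame.
    signs = np.sign(frames)
    return np.mean(np.abs(np.diff(signs, axis=1)) > 0, axis=1)

sr = 16000
t = np.arange(sr) / sr
x = 0.5 * np.sin(2 * np.pi * 220 * t)          # 1 s of a 220 Hz tone

frames = frame_signal(x, frame_len=400, hop=160)  # 25 ms frames, 10 ms hop
energy = short_term_energy(frames)
zcr = short_term_zcr(frames)
```

A 220 Hz tone crosses zero about 440 times per second, so the mean per-sample ZCR comes out near 440/16000 ≈ 0.0275; a real system would compute these per frame alongside the spectral features and feed the resulting vectors to the classifier.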
Keywords: speech emotion recognition; speech features; support vector machine; Deep Belief Networks
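The DBN-plus-SVM combination described in the abstract, in which learned deep features feed a conventional SVM classifier, can be approximated with off-the-shelf tools. The sketch below is a hypothetical stand-in, not the authors' implementation: it uses scikit-learn's single-layer BernoulliRBM in place of a full DBN, and toy random data in place of real acoustic features:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy stand-in data: 200 samples, 40-dim "acoustic features", 4 emotion classes.
X = rng.normal(size=(200, 40))
y = rng.integers(0, 4, size=200)
X[np.arange(200), y] += 3.0  # shift one dimension per class to make it learnable

model = Pipeline([
    ("scale", MinMaxScaler()),  # RBM inputs are expected to lie in [0, 1]
    ("rbm", BernoulliRBM(n_components=64, learning_rate=0.05,
                         n_iter=20, random_state=0)),
    ("svm", SVC(kernel="rbf", C=10.0)),
])
model.fit(X, y)
acc = model.score(X, y)
```

A real reproduction would stack several RBM layers (a true DBN), fine-tune them, and train on extracted speech features rather than random vectors; the pipeline structure, unsupervised feature learner followed by an SVM, is the part that mirrors the paper's approach.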
Figure 1
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Share & Cite This Article

MDPI and ACS Style

Zhu, L.; Chen, L.; Zhao, D.; Zhou, J.; Zhang, W. Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN. Sensors 2017, 17, 1694.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.


Sensors EISSN 1424-8220. Published by MDPI AG, Basel, Switzerland.