Open Access Article

Recognition of Cross-Language Acoustic Emotional Valence Using Stacked Ensemble Learning

ICT and Society Research Group, South Africa Luban Workshop, Durban University of Technology, Durban 4001, South Africa
* Author to whom correspondence should be addressed.
Algorithms 2020, 13(10), 246; https://doi.org/10.3390/a13100246
Received: 14 July 2020 / Revised: 13 September 2020 / Accepted: 24 September 2020 / Published: 27 September 2020
Most studies on speech emotion recognition have used single-language corpora, and little research has addressed cross-language recognition of emotional valence from speech. Research has shown that models developed for single-language speech recognition systems perform poorly when used in different environments. Cross-language speech recognition is a desirable alternative, but it is highly challenging because the corpora involved are recorded in different environments and under varying conditions. Differences in the quality of recording devices, elicitation techniques, languages, and speaker accents make the recognition task even more arduous. In this paper, we propose a stacked ensemble learning algorithm to recognize valence emotion in a cross-language speech environment. The proposed ensemble algorithm was developed from random decision forest, AdaBoost, logistic regression, and gradient boosting machine, and is therefore called RALOG. In addition, we propose feature scaling using random forest recursive feature elimination and a feature selection algorithm to boost the performance of RALOG. The algorithm was evaluated against four widely used ensemble algorithms to appraise its performance. Five benchmark corpora were combined into a cross-language corpus to validate the performance of RALOG trained with the selected acoustic features. The comparative analysis showed that RALOG performed better than the other ensemble learning algorithms investigated in this study.
Keywords: deep learning; ensemble learning; feature elimination; feature selection; speech emotion; speech recognition
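
The abstract names the four learners behind RALOG and a random forest recursive feature elimination step, but it does not say how they are wired together. The following is a minimal sketch of one plausible arrangement using scikit-learn, not the authors' implementation. Several things here are assumptions: logistic regression acts as the meta-learner over the three tree-based learners, standardization stands in for the feature scaling step, all hyperparameters are arbitrary, and synthetic binary data replaces the acoustic features and valence labels. The helper name `build_stacked_model` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_stacked_model(n_features_to_select=10, random_state=0):
    """Hypothetical helper: scaling -> RF-based RFE -> stacked ensemble."""
    # Recursive feature elimination ranked by random forest importances.
    selector = RFE(
        estimator=RandomForestClassifier(n_estimators=50, random_state=random_state),
        n_features_to_select=n_features_to_select,
        step=5,  # drop five features per elimination round
    )
    # Base learners named in the abstract (assumed to be the first level).
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
        ("ada", AdaBoostClassifier(n_estimators=50, random_state=random_state)),
        ("gbm", GradientBoostingClassifier(n_estimators=50, random_state=random_state)),
    ]
    # Assumption: logistic regression combines the base learners' predictions.
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=3,  # out-of-fold predictions are used to train the meta-learner
    )
    return make_pipeline(StandardScaler(), selector, stack)


# Synthetic stand-in for acoustic features with two valence classes.
X, y = make_classification(
    n_samples=300, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = build_stacked_model()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping selection and stacking in a single pipeline keeps the feature elimination inside the training fold, so the held-out data never influences which features are kept. The accuracy printed here reflects only the synthetic data and says nothing about the results reported in the paper.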

MDPI and ACS Style

Zvarevashe, K.; Olugbara, O.O. Recognition of Cross-Language Acoustic Emotional Valence Using Stacked Ensemble Learning. Algorithms 2020, 13, 246.

