Open Access Article

Multi-Modal Emotion Aware System Based on Fusion of Speech and Brain Information

Department of Computer, Mansoura University, Mansoura 35516, Egypt
Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 84428, Saudi Arabia
Faculty of Engineering & IT, The British University in Dubai, Dubai 345015, United Arab Emirates
Author to whom correspondence should be addressed.
Information 2019, 10(7), 239
Received: 28 May 2019 / Revised: 26 June 2019 / Accepted: 27 June 2019 / Published: 11 July 2019


In multi-modal emotion aware frameworks, emotional features must be estimated for each modality and then fused, following either a feature-level or a decision-level strategy. Although features from several modalities may improve classification performance, they can be high-dimensional and make learning difficult for the most commonly used machine learning algorithms. To address these feature extraction and multi-modal fusion issues, hybrid fuzzy-evolutionary computation methodologies are employed for their strong capability in feature learning and dimensionality reduction. This paper proposes a novel multi-modal emotion aware system that fuses the speech and EEG modalities. First, a mixed feature set of speaker-dependent and speaker-independent characteristics is estimated from the speech signal. Second, EEG is used as an inner channel that complements speech for more reliable recognition, by extracting multiple features from the time, frequency, and time–frequency domains. To classify unimodal data from either speech or EEG, a hybrid fuzzy c-means–genetic algorithm–neural network model is proposed, whose fitness function finds the optimal number of fuzzy clusters that reduces the classification error. To fuse speech with EEG information, a separate classifier is used for each modality, and the output is computed by integrating their posterior probabilities. The results show the superiority of the proposed model, with overall average accuracy rates of 98.06%, 97.28%, and 98.53% for EEG, speech, and multi-modal recognition, respectively. The proposed model is also applied to two public databases for speech and EEG, SAVEE and MAHNOB, on which it achieves accuracies of 98.21% and 98.26%, respectively.
Keywords: multi-modal emotion aware systems; speech processing; EEG signal processing; hybrid classification models
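
The decision-level fusion described in the abstract (one classifier per modality, with the final output computed from their posterior probabilities) can be illustrated with a minimal sketch. The abstract does not state how the posteriors are integrated, so the weighted-sum rule below is an assumption, as are the function names, the `w_speech` weight, and the four-class label set; it is not the authors' implementation.

```python
from typing import List, Sequence

# Hypothetical label set, chosen only for illustration.
EMOTIONS = ["happy", "sad", "angry", "neutral"]


def fuse_posteriors(p_speech: Sequence[float],
                    p_eeg: Sequence[float],
                    w_speech: float = 0.5) -> List[float]:
    """Combine two posterior distributions over the same classes.

    Assumption: a weighted sum rule, renormalised so the result sums to 1.
    """
    if len(p_speech) != len(p_eeg):
        raise ValueError("both modalities must score the same classes")
    if not 0.0 <= w_speech <= 1.0:
        raise ValueError("w_speech must lie in [0, 1]")
    fused = [w_speech * s + (1.0 - w_speech) * e
             for s, e in zip(p_speech, p_eeg)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("posteriors must contain positive mass")
    return [f / total for f in fused]


def predict(p_speech: Sequence[float],
            p_eeg: Sequence[float],
            labels: Sequence[str] = EMOTIONS,
            w_speech: float = 0.5) -> str:
    """Return the label with the highest fused posterior."""
    fused = fuse_posteriors(p_speech, p_eeg, w_speech)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


if __name__ == "__main__":
    # Made-up posteriors from two hypothetical unimodal classifiers.
    speech = [0.5, 0.3, 0.1, 0.1]
    eeg = [0.2, 0.6, 0.1, 0.1]
    print(predict(speech, eeg))
```

With equal weights the two modalities contribute equally; setting `w_speech` to 1.0 or 0.0 reduces the sketch to the speech-only or EEG-only decision, which makes it easy to compare unimodal and fused predictions on the same inputs.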


This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).


MDPI and ACS Style

Ghoniem, R.M.; Algarni, A.D.; Shaalan, K. Multi-Modal Emotion Aware System Based on Fusion of Speech and Brain Information. Information 2019, 10, 239.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Information, EISSN 2078-2489, published by MDPI AG, Basel, Switzerland