Article

Quantum AI in Speech Emotion Recognition

Michael Norval and Zenghui Wang
Department of Electrical and Smart Systems Engineering, University of South Africa, Florida 1709, South Africa
* Author to whom correspondence should be addressed.
Entropy 2025, 27(12), 1201; https://doi.org/10.3390/e27121201
Submission received: 26 October 2025 / Revised: 22 November 2025 / Accepted: 25 November 2025 / Published: 26 November 2025
(This article belongs to the Special Issue The Future of Quantum Machine Learning and Quantum AI, 2nd Edition)

Abstract

We evaluate a hybrid quantum–classical pipeline for speech emotion recognition (SER) on a custom Afrikaans corpus using MFCC-based spectral features with pitch and energy variants, explicitly comparing three quantum approaches—a variational quantum classifier (VQC), a quantum support vector machine (QSVM), and a Quantum Approximate Optimisation Algorithm (QAOA)-based classifier—against a CNN–LSTM (CLSTM) baseline. We detail the classical-to-quantum data encoding (angle embedding with bounded rotations and an explicit feature-to-qubit map) and report test accuracy, weighted precision, recall, and F1. Under ideal analytic simulation, the quantum models reach 41–43% test accuracy; under a realistic 1% NISQ noise model (100–1000 shots) this degrades to 34–40%, versus 73.9% for the CLSTM baseline. Despite the markedly lower empirical accuracy—expected in the NISQ era—we provide an end-to-end, noise-aware hybrid SER benchmark and discuss the asymptotic advantages of quantum subroutines (Chebyshev-based quantum singular value transformation, quantum walks, and block encoding) that become relevant only in the fault-tolerant regime.
Keywords: speech emotion recognition; quantum machine learning; QSVM; QAOA; variational quantum classifier; MFCC; Afrikaans; noise mitigation
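
The classical-to-quantum encoding described in the abstract (bounded angle rotations with an explicit feature-to-qubit map feeding a variational classifier) can be illustrated with a minimal sketch. The following is a hypothetical PennyLane-style example, not the authors' implementation: the qubit count, feature scaling, ansatz, and shot budget are illustrative assumptions.

    # Hypothetical sketch: angle embedding of pooled MFCC features onto qubits,
    # followed by a small variational classifier, evaluated with finite shots.
    import numpy as np
    import pennylane as qml

    n_qubits = 4  # one qubit per (reduced) MFCC feature -- assumption
    dev = qml.device("default.qubit", wires=n_qubits, shots=1000)  # finite shots mimic NISQ sampling

    def encode(features):
        # Bound each feature to [0, pi] so every rotation angle stays in a fixed range
        angles = np.pi * (features - np.min(features)) / (np.ptp(features) + 1e-9)
        qml.AngleEmbedding(angles, wires=range(n_qubits), rotation="Y")

    @qml.qnode(dev)
    def vqc(features, weights):
        encode(features)
        qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
        return qml.expval(qml.PauliZ(0))  # sign of <Z> used as a binary decision score

    weights = np.random.uniform(0, 2 * np.pi, size=(2, n_qubits, 3))
    mfcc_vec = np.random.randn(n_qubits)  # stand-in for a pooled MFCC feature vector
    print(vqc(mfcc_vec, weights))

To approximate the 1% NISQ noise model mentioned in the abstract, a mixed-state simulator (e.g., PennyLane's "default.mixed" device with a depolarizing channel after each layer) could replace the ideal device; that detail is omitted here for brevity.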

