Article

Quantum AI in Speech Emotion Recognition

Michael Norval and Zenghui Wang
Department of Electrical and Smart Systems Engineering, University of South Africa, Florida 1709, South Africa
* Author to whom correspondence should be addressed.
Entropy 2025, 27(12), 1201; https://doi.org/10.3390/e27121201
Submission received: 26 October 2025 / Revised: 22 November 2025 / Accepted: 25 November 2025 / Published: 26 November 2025
(This article belongs to the Special Issue The Future of Quantum Machine Learning and Quantum AI, 2nd Edition)

Abstract

We evaluate a hybrid quantum–classical pipeline for speech emotion recognition (SER) on a custom Afrikaans corpus using MFCC-based spectral features with pitch and energy variants, explicitly comparing three quantum approaches—a variational quantum classifier (VQC), a quantum support vector machine (QSVM), and a Quantum Approximate Optimisation Algorithm (QAOA)-based classifier—against a CNN–LSTM (CLSTM) baseline. We detail the classical-to-quantum data encoding (angle embedding with bounded rotations and an explicit feature-to-qubit map) and report test accuracy, weighted precision, recall, and F1. Under ideal analytic simulation, the quantum models reach 41–43% test accuracy; under a realistic 1% NISQ noise model (100–1000 shots) this degrades to 34–40%, versus 73.9% for the CLSTM baseline. Despite the markedly lower empirical accuracy—expected in the NISQ era—we provide an end-to-end, noise-aware hybrid SER benchmark and discuss the asymptotic advantages of quantum subroutines (Chebyshev-based quantum singular value transformation, quantum walks, and block encoding) that become relevant only in the fault-tolerant regime.
Keywords: speech emotion recognition; quantum machine learning; QSVM; QAOA; variational quantum classifier; MFCC; Afrikaans; noise mitigation
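
The classical-to-quantum encoding described in the abstract (bounded angle rotations with an explicit feature-to-qubit map feeding a variational classifier) can be illustrated with a minimal sketch. The following is a hypothetical PennyLane-style example, not the authors' implementation: the qubit count, feature scaling, ansatz, and shot budget are illustrative assumptions.

    # Hypothetical sketch: angle embedding of pooled MFCC features onto qubits,
    # followed by a small variational classifier, evaluated with finite shots.
    import numpy as np
    import pennylane as qml

    n_qubits = 4  # one qubit per (reduced) MFCC feature -- assumption
    dev = qml.device("default.qubit", wires=n_qubits, shots=1000)  # finite shots mimic NISQ sampling

    def encode(features):
        # Bound each feature to [0, pi] so every rotation angle stays in a fixed range
        angles = np.pi * (features - np.min(features)) / (np.ptp(features) + 1e-9)
        qml.AngleEmbedding(angles, wires=range(n_qubits), rotation="Y")

    @qml.qnode(dev)
    def vqc(features, weights):
        encode(features)
        qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
        return qml.expval(qml.PauliZ(0))  # sign of <Z> used as a binary decision score

    weights = np.random.uniform(0, 2 * np.pi, size=(2, n_qubits, 3))
    mfcc_vec = np.random.randn(n_qubits)  # stand-in for a pooled MFCC feature vector
    print(vqc(mfcc_vec, weights))

To approximate the 1% NISQ noise model mentioned in the abstract, a mixed-state simulator (e.g., PennyLane's "default.mixed" device with a depolarizing channel after each layer) could replace the ideal device; that detail is omitted here for brevity.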

