This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Open Access Article
Joint Learning of Emotion and Singing Style for Enhanced Music Style Understanding
by Yuwen Chen 1, Jing Mao 1,* and Rui-Feng Wang 2,*
1 School of Humanities and Arts, Hunan Institute of Traffic Engineering, Hengyang 421219, China
2 Department of Crop and Soil Sciences, College of Agriculture and Environmental Sciences, University of Georgia, Tifton, GA 31793, USA
* Authors to whom correspondence should be addressed.
Sensors 2025, 25(24), 7575; https://doi.org/10.3390/s25247575
Submission received: 5 November 2025 / Revised: 30 November 2025 / Accepted: 10 December 2025 / Published: 13 December 2025
Abstract
Understanding music styles is essential for music information retrieval, personalized recommendation, and AI-assisted content creation. However, existing work typically treats tasks such as emotion classification and singing style classification independently, neglecting the intrinsic relationships between them. In this study, we introduce a multi-task learning framework that jointly models the two tasks to enable explicit knowledge sharing and mutual enhancement. Our results show that joint optimization consistently outperforms single-task counterparts, demonstrating the value of leveraging inter-task correlations for more robust singing style analysis. To assess the generality and adaptability of the proposed framework, we evaluate it across several backbone architectures, including Transformer, TextCNN, and BERT, and observe stable performance improvements in all cases. Experiments on a self-constructed benchmark dataset, recorded with professional equipment, further show that the framework not only achieves the best accuracy on both tasks under a singer-wise split but also yields interpretable insights into the interplay between emotional expression and stylistic characteristics in vocal performance.
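The multi-task setup described in the abstract — a shared encoder feeding two task-specific classification heads trained with a joint loss — can be illustrated with a minimal sketch. This is not the authors' implementation; the dimensions, the ReLU encoder, and the loss weight `alpha` are illustrative assumptions, shown here as a plain NumPy forward pass rather than a full training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 128-d acoustic features, 4 emotion classes, 5 style classes.
FEAT_DIM, HIDDEN, N_EMOTION, N_STYLE = 128, 64, 4, 5

# One shared encoder plus one task-specific head per task (random init for the sketch).
W_shared = rng.normal(scale=0.1, size=(FEAT_DIM, HIDDEN))
W_emotion = rng.normal(scale=0.1, size=(HIDDEN, N_EMOTION))
W_style = rng.normal(scale=0.1, size=(HIDDEN, N_STYLE))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x):
    """The shared representation feeds both task heads (the knowledge-sharing step)."""
    h = np.maximum(0.0, x @ W_shared)  # ReLU encoder
    return softmax(h @ W_emotion), softmax(h @ W_style)

def joint_loss(p_emotion, p_style, y_emotion, y_style, alpha=0.5):
    """Weighted sum of the two cross-entropy losses; alpha balances the tasks."""
    ce_e = -np.log(p_emotion[np.arange(len(y_emotion)), y_emotion]).mean()
    ce_s = -np.log(p_style[np.arange(len(y_style)), y_style]).mean()
    return alpha * ce_e + (1.0 - alpha) * ce_s

# A batch of 8 random feature vectors with random labels, just to exercise the graph.
x = rng.normal(size=(8, FEAT_DIM))
y_e = rng.integers(0, N_EMOTION, size=8)
y_s = rng.integers(0, N_STYLE, size=8)
p_e, p_s = forward(x)
loss = joint_loss(p_e, p_s, y_e, y_s)
```

In joint training, gradients from both cross-entropy terms flow into `W_shared`, which is the mechanism by which the two tasks regularize and enhance one another.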
Share and Cite
MDPI and ACS Style
Chen, Y.; Mao, J.; Wang, R.-F.
Joint Learning of Emotion and Singing Style for Enhanced Music Style Understanding. Sensors 2025, 25, 7575.
https://doi.org/10.3390/s25247575
Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.