Music Genre Classification Using Prosodic, Stylistic, Syntactic and Sentiment-Based Features
Abstract
1. Introduction
2. Literature Review
3. Data and Methods
3.1. Training Data
3.2. Labeling
3.3. Preprocessing
3.4. Lexical Analysis
3.5. Feature Encoding
3.6. Models and Fitting
4. Results
4.1. SentiLex Improvements
4.2. Genre Classification
5. Discussion
5.1. Analysis
5.2. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Fabbri, F. A Theory of Musical Genres: Two Applications. In Proceedings of the Popular Music Perspectives, First International Conference on Popular Music Studies, Göteborg and Exeter: International Association for the Study of Popular Music, Amsterdam, The Netherlands, 22–26 June 1981; Available online: https://www.semanticscholar.org/paper/1-A-THEORY-OF-MUSICAL-GENRES-%3A-TWO-APPLICATIONS-Fabbri/feb8161666c893ed53a22eec9a3e7bba1f54fd57 (accessed on 14 November 2025).
2. Holt, F. Genre in Popular Music; The University of Chicago Press: Chicago, IL, USA, 2007; ISBN 0-226-35039-8.
3. Bronstein, M.; Droge, A.; Fredner, E.; Heuser, R.; Manshel, X.; Nomura, N.; Porter, J.D.; Walser, H. Microgenres. Available online: https://litlab.stanford.edu/projects/microgenres/ (accessed on 3 October 2025).
4. Udrea, A.C.; Ruseti, S.; Pojoga, V.; Baghiu, S.; Terian, A.; Dascalu, M. Identifying Literary Microgenres and Writing Style Differences in Romanian Novels with ReaderBench and Large Language Models. Future Internet 2025, 17, 397.
5. Poell, T.; Nieborg, D.; Duffy, B.E.; Prey, R.; Cunningham, S. The Platformization of Cultural Production: Theorizing the Contingent Cultural Commodity. New Media Soc. 2018, 20, 4275–4292.
6. Washington, C.J.I. “It’s All the Same”: Genre Generalization in the American Music Industry. Master’s Thesis, Florida State University, Tallahassee, FL, USA, 2024.
7. Blakeley, R. Against the Stream: Niche Music Streaming Services and the Streaming Paradigm. Ph.D. Thesis, University of Rochester, New York, NY, USA, 2024.
8. Bowsher, A. Authenticity and the Commodity: Physical Music Media and the Independent Music Marketplace. Ph.D. Thesis, University of Oxford, Oxford, UK, 2014.
9. Baghiu, Ș. Apartenența multiplă de subgen: O propunere pentru istoria formelor romanești. Transilvania 2022, 11–12, 45–49.
10. Griffiths, D. From Lyric to Anti-Lyric: Analyzing the Words in Pop Song. In Analyzing Popular Music; Cambridge University Press: Cambridge, UK, 2003.
11. Bories, A.-S.; Plecháč, P.; Fabo, P.R. Computational Stylistics in Poetry, Prose, and Drama; De Gruyter: Berlin, Germany, 2022; ISBN 978-3-11-078150-2.
12. Lazer, D.; Pentland, A.; Adamic, L.; Aral, S.; Barabási, A.-L.; Brewer, D.; Christakis, N.; Contractor, N.; Fowler, J.; Gutmann, M.; et al. Computational Social Science. Science 2009, 323, 721–723.
13. Kovacs, E.-R.; Cotfas, L.-A.; Delcea, C. A Deep Learning Approach to Fine-Grained Political Ideology Classification on Social Media Texts. In Computational Collective Intelligence; Nguyen, N.T., Franczyk, B., Ludwig, A., Núñez, M., Treur, J., Vossen, G., Kozierkiewicz, A., Eds.; Springer Nature: Cham, Switzerland, 2024; pp. 3–14.
14. Chen, Y.; Skiena, S. Building Sentiment Lexicons for All Major Languages. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers); Toutanova, K., Wu, H., Eds.; Association for Computational Linguistics: Baltimore, MD, USA, 2014; pp. 383–389.
15. Li, T.; Ogihara, M.; Li, Q. A Comparative Study on Content-Based Music Genre Classification. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Toronto, ON, Canada, 28 July–1 August 2003; Association for Computing Machinery: New York, NY, USA, 2003; pp. 282–289.
16. Aguiar, L.; Martens, B. Digital Music Consumption on the Internet: Evidence from Clickstream Data. Inf. Econ. Policy 2016, 34, 27–43.
17. Jones, S. Music and the Internet. In The Handbook of Internet Studies; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2011; pp. 440–451. ISBN 978-1-4443-1486-1.
18. McKay, C.; Fujinaga, I. Improving Automatic Music Classification Performance by Extracting Features from Different Types of Data. In Proceedings of the International Conference on Multimedia Information Retrieval, Philadelphia, PA, USA, 29–31 March 2010; Association for Computing Machinery: New York, NY, USA, 2010; pp. 257–266.
19. Sijbesma, D. Evaluating Audio Feature Importances and Machine Learning Models to Enhance Music Genre Classification and Recommendations. Ph.D. Thesis, Utrecht University, Utrecht, The Netherlands, 2024.
20. Flederus, D. Enhancing Music Genre Classification with Neural Networks by Using Extracted Musical Features. Available online: https://purl.utwente.nl/essays/80549 (accessed on 14 November 2025).
21. Pelchat, N.; Gelowitz, C.M. Neural Network Music Genre Classification. Can. J. Electr. Comput. Eng. 2020, 43, 170–173.
22. Tsaptsinos, A. Lyrics-Based Music Genre Classification Using a Hierarchical Attention Network. arXiv 2017, arXiv:1707.04678.
23. Bahdanau, D.; Cho, K.; Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv 2014, arXiv:1409.0473.
24. Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical Attention Networks for Document Classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; Knight, K., Nenkova, A., Rambow, O., Eds.; Association for Computational Linguistics: San Diego, CA, USA, 2016; pp. 1480–1489.
25. Singhi, A.; Brown, D.G. On Cultural, Textual and Experiential Aspects of Music Mood. In Proceedings of the International Society for Music Information Retrieval Conference, Taipei, Taiwan, 27–31 October 2014; Available online: https://zenodo.org/records/1417391 (accessed on 14 November 2025).
26. Akalp, H.; Furkan Cigdem, E.; Yilmaz, S.; Bolucu, N.; Can, B. Language Representation Models for Music Genre Classification Using Lyrics. In Proceedings of the 2021 International Symposium on Electrical, Electronics and Information Engineering, Seoul, Republic of Korea, 19–21 February 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 408–414.
27. Mayer, R.; Neumayer, R.; Rauber, A. Rhyme and Style Features for Musical Genre Classification by Song Lyrics. In Proceedings of the ISMIR 2008, 9th International Conference on Music Information Retrieval, Drexel University, Philadelphia, PA, USA, 14–18 September 2008; Available online: https://www.ifs.tuwien.ac.at/~mayer/publications/pdf/may_ismir08.pdf (accessed on 14 November 2025).
28. Mayer, R.; Rauber, A. Music Genre Classification by Ensembles of Audio and Lyrics Features. In Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011, Miami, FL, USA, 24–28 October 2011; Available online: https://archives.ismir.net/ismir2011/paper/000127.pdf (accessed on 14 November 2025).
29. Li, Y.; Zhang, Z.; Ding, H.; Chang, L. Music Genre Classification Based on Fusing Audio and Lyric Information. Multimed. Tools Appl. 2023, 82, 20157–20176.
30. Watanabe, K.; Goto, M. Lyrics Information Processing: Analysis, Generation, and Applications. In Proceedings of the 1st Workshop on NLP for Music and Audio (NLP4MusA), Online, 16 October 2020; Oramas, S., Espinosa-Anke, L., Epure, E., Jones, R., Sordo, M., Quadrana, M., Watanabe, K., Eds.; Association for Computational Linguistics: Vienna, Austria, 2020; pp. 6–12.
31. Moretti, F. Distant Reading; Verso Books: London, UK; New York, NY, USA, 2013; ISBN 1-78168-084-1.
32. Schiop, A. Smecherie Şi Lume Rea. Universul Social al Manelelor; Cartier: Bucharest, Romania, 2017; ISBN 978-9975-86-032-1.
33. Beissinger, M.; Rădulescu, S.; Giurchescu, A. Manele in Romania: Cultural Expression and Social Meaning in Balkan Popular Music; Rowman & Littlefield Publishers: Lanham, MD, USA, 2016.
34. Dumitrescu, S.; Avram, A.-M.; Pyysalo, S. The Birth of Romanian BERT. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, Punta Cana, Dominican Republic, 16–20 November 2020; Association for Computational Linguistics: Vienna, Austria, 2020; pp. 4324–4328.
35. Li, T.; Ogihara, M. Music Genre Classification with Taxonomy. In Proceedings of the ICASSP ’05. IEEE International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, PA, USA, 23–23 March 2005; Volume 5, pp. v/197–v/200.






| Original Genre Label | Merged Genre Label | Notes on Context |
|---|---|---|
| Etno/Folclor | Muzica populara | Traditional Romanian folklore music |
| Aniversări | Altele | Birthday songs, sorted into Altele [Other] |
| Din Republica Moldova | Altele | From the Republic of Moldova |
| Muzică ușoară | Pop | Literally means “light music”, a type of pop music focused on romantic themes popular in the second half of the 20th century. Sometimes used in Romanian as a synonym for all popular music (used in opposition to muzică cultă, art music) |
| Despre mama | Altele | Songs about mom |
| Lăutărești | Manele/Lautareasca | Lăutărească is a type of traditional folk music played by professional Roma musicians; Manele is a form of modern pop music sung by Roma singers. The genres share some stylistic traits and some musicians have worked within both genres, thus the association |
| Instrumentală | Altele | Instrumental music |
| Pop-Rock | Pop | |
| Cântece pentru copii | Altele | Children’s songs |
| Dance | Pop | |
| Rock | Rock | |
| Romanțe | Pop | An early form of pop music: romantic songs sung in a heightened emotional style, sometimes accompanied by piano and strings, popular in the early-to-mid 20th century |
| Blues | Rock | |
| Școala și profesorii | Altele | Songs about school and teachers |
| Pop | Pop | |
| Folk | Folk | Folk (not to be confused with traditional folklore music) draws its style from Western folk music, centering on guitar and voice, and often tackles historical, literary or introspective themes |
| Country | Rock | |
| Social | Altele | |
| Crăciun | Muzica religioasa | Christmas songs and religious music, respectively |
| Cenaclul ‘Flacăra’ | Folk | Cenaclul ‘Flacăra’ was an influential youth culture circle during the last decades of the Communist period, mostly associated with the Folk genre |
| Satiră și umor | Altele | Humorous songs |
| Cântece de mahala | Manele/Lautareasca | Songs from marginalized communities/the ghetto |
| Cântece țigănești | Manele/Lautareasca | Gypsy (Roma) songs |
| Cântece de munte | Folk | Mountain songs |
| Parodii | Altele | Parodies |
| Imnuri | Altele | Anthems |
| Colinde | Muzica religioasa | Christmas Carols |
| Cinema și TV | Pop | Cinema and TV songs |
| Despre Patrie | Altele | About the Fatherland/patriotic songs |
| Muzică armânească | Muzica populara | Aromanian songs |
| Despre tata | Altele | Songs about dad |
| Creștine | Muzica religioasa | Christian songs |
| De la Autori | Altele | Original songs (provided by site users) |
| Experimental | Altele | |
| Manele | Manele/Lautareasca | |
| Hip-hop | Hip/Hop | |
| Latino | Pop | |
| Fotbal | Altele | Songs about football |
| Metal | Rock | |
| Punk/Ska | Rock | |
| Reggae | Rock | |
| Retro | Pop | |

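
The label merge in the table above amounts to a lookup from the site's original genre tag to one of the merged classes. The sketch below shows this for a handful of rows copied from the table; the names `GENRE_MERGE` and `merge_label` are illustrative and do not come from the paper, and raising an error on an unknown tag is an assumption rather than the authors' documented behavior.

```python
# Excerpt of the merge table: original site label -> merged label.
# Only a few rows are reproduced; the remaining rows follow the same pattern.
GENRE_MERGE = {
    "Etno/Folclor": "Muzica populara",
    "Muzică ușoară": "Pop",
    "Lăutărești": "Manele/Lautareasca",
    "Manele": "Manele/Lautareasca",
    "Blues": "Rock",
    "Country": "Rock",
    "Colinde": "Muzica religioasa",
    "Cenaclul ‘Flacăra’": "Folk",
    "Hip-hop": "Hip/Hop",
    "Parodii": "Altele",
}


def merge_label(original: str) -> str:
    """Return the merged genre for an original site label.

    Assumption: labels missing from the mapping are treated as errors
    instead of being silently assigned to a class.
    """
    try:
        return GENRE_MERGE[original]
    except KeyError:
        raise KeyError(f"no merged label defined for {original!r}") from None
```

Keeping the mapping as plain data makes it easy to audit which original tags feed each merged class, for instance that both Blues and Country end up in Rock.
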
| Feature | Definition |
|---|---|
| Lexical | |
| swear_word_ratio | number of swear words/number of words |
| ethnic_slur_ratio | number of ethnic slurs/number of words |
| sexual_slur_ratio | number of sexual slurs/number of words |
| all_vulgarities_ratio | (number of ethnic slurs + number of sexual slurs + number of swear words)/number of words |
| stopword_ratio | number of stopwords/number of words |
| word_count | number of words |
| mean_word_length | average length, in characters, of the words used in a song |
| vocab_size | total number of different words used in a song |
| Sentiment-based | |
| positive_sentiment | number of words with positive valence, based on SentiLex/number of words |
| negative_sentiment | number of words with negative valence, based on SentiLex/number of words |
| positive_sentiment_sentilex_v2 | number of words with positive valence, based on SentiLex-v2/number of words |
| negative_sentiment_sentilex_v2 | number of words with negative valence, based on SentiLex-v2/number of words |
| Stylistic and prosodic | |
| repetitions_max_count | maximum of the longest repeated sequence of the most frequent word out of each verse |
| repetitions_beginning | number of repetitions of the most frequent word in each verse considering only the first half of the verse |
| repetitions_end | number of repetitions of the most frequent word in each verse considering only the latter half of the verse |
| mean_verse_length | average verse length in words |
| mean_phrase_length | average phrase length in words |
| char_count | total number of characters in the song |
| enjabement_count | number of phrases that start or end inside a verse rather than at the end of a verse |

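
Several of the features defined above are simple ratios and counts over the tokens of a song. The following is a minimal sketch of a few of them, under stated assumptions: a word is taken to be a run of letters or digits after lowercasing, a verse is taken to be one non-empty line of the lyrics, and `STOPWORDS` and `SWEAR_WORDS` are tiny placeholder sets, not the lexicons used in the paper.

```python
import re

# Placeholder lexicons (assumption): the paper's actual word lists are not reproduced here.
STOPWORDS = {"și", "în", "la", "de", "cu"}
SWEAR_WORDS = {"naiba"}


def tokenize(text):
    # Assumption: lowercase the text and treat each run of word characters as a word.
    return re.findall(r"\w+", text.lower())


def lexical_features(lyrics):
    """Compute a subset of the table's features for one song."""
    # Assumption: one verse corresponds to one non-empty line.
    verses = [line for line in lyrics.splitlines() if line.strip()]
    words = tokenize(lyrics)
    n = len(words)
    if n == 0:
        raise ValueError("lyrics contain no words")
    return {
        "word_count": n,
        "vocab_size": len(set(words)),
        "mean_word_length": sum(len(w) for w in words) / n,
        "stopword_ratio": sum(w in STOPWORDS for w in words) / n,
        "swear_word_ratio": sum(w in SWEAR_WORDS for w in words) / n,
        "mean_verse_length": sum(len(tokenize(v)) for v in verses) / len(verses),
    }


features = lexical_features("La la la\nde dor")
```

The repetition and enjambment features depend on details of verse and phrase segmentation that the table does not spell out, so they are left out of this sketch.
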
| Lexicon | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|
| SentiLex [14] | 61.48% | 40.73% | 88.63% | 55.81% |
| SentiLex-v2 | 68.39% | 61.59% | 80.91% | 69.94% |
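
The F1 scores in the table above can be recomputed from the reported precision and recall with the usual harmonic-mean formula, F1 = 2PR / (P + R). The short check below does this for both lexicons; the function name `f1_from_pr` is illustrative.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the table, converted from percent to fractions.
f1_sentilex = f1_from_pr(0.4073, 0.8863)
f1_sentilex_v2 = f1_from_pr(0.6159, 0.8091)

print(f"SentiLex    F1: {100 * f1_sentilex:.2f}%")
print(f"SentiLex-v2 F1: {100 * f1_sentilex_v2:.2f}%")
```

Both values agree with the reported 55.81% and 69.94% to two decimal places, which shows that the gain of SentiLex-v2 comes mainly from higher precision at a moderate cost in recall.
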

| Class | Accuracy | Precision | Recall | F1 Score |
|---|---|---|---|---|
| Muzica populara | 71.84% | 67.86% | 85.39% | 75.62% |
| Pop | 59.32% | 62.34% | 53.70% | 57.70% |
| Manele/Lautareasca | 64.00% | 69.92% | 61.87% | 65.65% |
| Rock | 57.42% | 58.35% | 62.53% | 60.37% |
| Folk | 63.99% | 63.74% | 65.44% | 64.58% |
| Muzica religioasa | 68.14% | 65.19% | 81.97% | 72.62% |
| Hip/Hop | 54.54% | 54.76% | 67.65% | 60.53% |
| Mean | 62.75% | 63.16% | 68.37% | 65.29% |
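
The Mean row above appears to be the unweighted (macro) average of the seven per-class rows. The sketch below recomputes it from the values in the table; small differences in the last digit are expected because the per-class figures are already rounded. The variable names are illustrative.

```python
# Per-class metrics copied from the table, in percent:
# (accuracy, precision, recall, F1)
PER_CLASS = {
    "Muzica populara": (71.84, 67.86, 85.39, 75.62),
    "Pop": (59.32, 62.34, 53.70, 57.70),
    "Manele/Lautareasca": (64.00, 69.92, 61.87, 65.65),
    "Rock": (57.42, 58.35, 62.53, 60.37),
    "Folk": (63.99, 63.74, 65.44, 64.58),
    "Muzica religioasa": (68.14, 65.19, 81.97, 72.62),
    "Hip/Hop": (54.54, 54.76, 67.65, 60.53),
}


def macro_average(rows):
    """Unweighted mean of each metric column across classes."""
    n = len(rows)
    columns = zip(*rows.values())
    return tuple(sum(col) / n for col in columns)


mean_acc, mean_prec, mean_rec, mean_f1 = macro_average(PER_CLASS)
```

Note that the mean F1 here is the average of the per-class F1 scores, not the harmonic mean of the averaged precision and recall, which would give a slightly different number.
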

| Model | Feature (Normalized) | Coefficient | Odds Ratio |
|---|---|---|---|
| Muzica populara | mean_verse_length | 1.1853 | 3.27 |
| | stopword_ratio | 0.3411 | 1.41 |
| | mean_word_length | 0.2005 | 1.22 |
| | word_count | −0.8684 | 0.42 |
| | vocab_size | −0.4737 | 0.62 |
| | swear_word_ratio | 0.1743 | 1.19 |
| Pop | ethnic_slur_ratio | −0.0748 | 0.93 |
| | stopword_ratio | 0.1924 | 1.21 |
| | mean_word_length | −0.2440 | 0.78 |
| | vocab_size | −0.2921 | 0.75 |
| | swear_word_ratio | 0.0578 | 1.06 |
| | negative_sentiment_sentilex_v2 | −0.0864 | 0.92 |
| | repetitions_beginning | 0.1892 | 1.21 |
| | repetitions_end | 0.5047 | 1.66 |
| | ethnic_slur_ratio | 0.4597 | 1.58 |
| Manele/Lautareasca | sexual_slur_ratio | 7.5093 | 1824.94 |
| | stopword_ratio | 0.4955 | 1.64 |
| | mean_word_length | −0.1770 | 0.84 |
| | vocab_size | −0.8577 | 0.42 |
| | swear_word_ratio | 0.7599 | 2.14 |
| | negative_sentiment_sentilex_v2 | −0.2603 | 0.77 |
| | repetitions_end | 0.2055 | 1.23 |
| | mean_phrase_length | 0.1239 | 1.13 |
| Rock | stopword_ratio | −0.3424 | 0.71 |
| | mean_word_length | 0.2025 | 1.22 |
| | vocab_size | −0.1091 | 0.90 |
| | swear_word_ratio | −0.1374 | 0.87 |
| | positive_sentiment_sentilex_v2 | −0.0904 | 0.91 |
| | repetitions_beginning | 0.0730 | 1.08 |
| Folk | mean_verse_length | −0.5696 | 0.57 |
| | mean_phrase_length | −0.6225 | 0.54 |
| | stopword_ratio | −0.1436 | 0.87 |
| | vocab_size | 0.6660 | 1.95 |
| | swear_word_ratio | −0.0582 | 0.94 |
| | negative_sentiment_sentilex_v2 | 0.2485 | 1.28 |
| | positive_sentiment_sentilex_v2 | −0.1444 | 0.87 |
| | repetitions_beginning | −0.2485 | 0.78 |
| | repetitions_end | −0.4740 | 0.62 |
| | mean_verse_length | −0.3007 | 0.74 |
| Muzica religioasa | mean_phrase_length | 0.2903 | 1.34 |
| | stopword_ratio | −0.5145 | 0.60 |
| | mean_word_length | 0.4355 | 1.55 |
| | vocab_size | −0.3784 | 0.68 |
| | enjabement_count | −0.2578 | 0.77 |
| | negative_sentiment_sentilex_v2 | −0.4002 | 0.67 |
| | positive_sentiment_sentilex_v2 | 0.3175 | 1.37 |
| | repetitions_end | −0.2445 | 0.78 |
| | stopword_ratio | 0.4305 | 1.54 |
| Hip/Hop | mean_word_length | −0.5938 | 0.55 |
| | vocab_size | 0.9149 | 2.50 |
| | repetitions_beginning | −0.1945 | 0.82 |
| | repetitions_end | −0.7356 | 0.48 |
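
The odds-ratio column in the table above is consistent with exponentiating the coefficient, OR = exp(β), the usual reading of a logistic-regression coefficient. The sketch below checks this on a few rows copied from the table; the helper name `odds_ratio` is illustrative, and the assumption that the per-genre models are logistic regressions is inferred from the coefficient and odds-ratio columns, not stated in this excerpt.

```python
import math


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase of the (normalized) feature."""
    return math.exp(coefficient)


# (model, feature, coefficient, reported odds ratio) copied from the table.
ROWS = [
    ("Muzica populara", "mean_verse_length", 1.1853, 3.27),
    ("Muzica populara", "word_count", -0.8684, 0.42),
    ("Manele/Lautareasca", "sexual_slur_ratio", 7.5093, 1824.94),
    ("Hip/Hop", "vocab_size", 0.9149, 2.50),
]

for model, feature, coef, reported in ROWS:
    print(f"{model:20s} {feature:20s} exp({coef:+.4f}) = {odds_ratio(coef):.2f} (reported {reported})")
```

An odds ratio above 1 means that higher values of the normalized feature raise the odds of the genre, and a value below 1 means they lower it. Very large values such as the one for `sexual_slur_ratio` should be read with care, because they reflect a large coefficient on a feature that is likely close to zero for most songs.
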

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kovacs, E.-R.; Baghiu, S. Music Genre Classification Using Prosodic, Stylistic, Syntactic and Sentiment-Based Features. Big Data Cogn. Comput. 2025, 9, 296. https://doi.org/10.3390/bdcc9110296

