Automatic Transcription of Polyphonic Vocal Music†
Abstract: This paper presents a method for automatic music transcription applied to audio recordings of a cappella performances with multiple singers. We propose a system for multi-pitch detection and voice assignment that integrates an acoustic model and a music language model. The acoustic model performs spectrogram decomposition, extending probabilistic latent component analysis (PLCA) using a six-dimensional dictionary with pre-extracted log-spectral templates. The music language model performs voice separation and assignment using hidden Markov models that apply musicological assumptions. By integrating the two models, the system is able to detect multiple concurrent pitches in polyphonic vocal music and assign each detected pitch to a specific voice type such as soprano, alto, tenor or bass (SATB). We compare our system against multiple baselines, achieving state-of-the-art results for both multi-pitch detection and voice assignment on a dataset of Bach chorales and another of barbershop quartets. We also present an additional evaluation of our system using varied pitch tolerance levels to investigate its performance at 20-cent pitch resolution.
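To make the notion of a pitch tolerance in cents concrete, the following minimal sketch converts a frequency ratio to cents (1200 cents per octave, 100 per equal-tempered semitone) and checks whether an estimated pitch falls within a given tolerance of a reference pitch. The function names `cents_between` and `within_tolerance` are hypothetical, and the simple absolute-deviation check is an assumption for illustration; the abstract does not specify how the tolerance is applied in the paper's evaluation.

```python
import math


def cents_between(f_est_hz: float, f_ref_hz: float) -> float:
    """Signed interval from f_ref_hz to f_est_hz in cents (1200 per octave)."""
    return 1200.0 * math.log2(f_est_hz / f_ref_hz)


def within_tolerance(f_est_hz: float, f_ref_hz: float,
                     tol_cents: float = 20.0) -> bool:
    """True if the estimate deviates from the reference by at most tol_cents.

    Assumption: a symmetric absolute-deviation check, chosen for illustration.
    """
    return abs(cents_between(f_est_hz, f_ref_hz)) <= tol_cents


if __name__ == "__main__":
    # 442 Hz is slightly sharp of A4 = 440 Hz; 446 Hz is further off.
    for f in (442.0, 446.0):
        print(f, round(cents_between(f, 440.0), 2), within_tolerance(f, 440.0))
```

Under this check, a tighter tolerance such as 20 cents separates estimates that a semitone-level tolerance (50 cents either side of the nominal pitch) would treat as equally correct.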
McLeod, A.; Schramm, R.; Steedman, M.; Benetos, E. Automatic Transcription of Polyphonic Vocal Music. Appl. Sci. 2017, 7, 1285.
McLeod A, Schramm R, Steedman M, Benetos E. Automatic Transcription of Polyphonic Vocal Music. Applied Sciences. 2017; 7(12):1285.
McLeod, Andrew; Schramm, Rodrigo; Steedman, Mark; Benetos, Emmanouil. 2017. "Automatic Transcription of Polyphonic Vocal Music." Appl. Sci. 7, no. 12: 1285.
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.