MDPI - Publisher of Open Access Journals

20 pages, 1278 KB

Open AccessReview

A Comprehensive Review on Music Transcription

by Bhuwan Bhattarai and Joonwhoan Lee

Appl. Sci. 2023, 13(21), 11882; https://doi.org/10.3390/app132111882 - 30 Oct 2023

Cited by 13 | Viewed by 13342

Music transcription is the process of transforming recorded sound of musical performances into symbolic representations such as sheet music or MIDI files. Extensive research and development have been carried out in the field of music transcription and technology. This comprehensive review paper surveys the diverse methodologies, techniques, and advancements that have shaped the landscape of music transcription. The paper outlines the significance of music transcription in preserving, analyzing, and disseminating musical compositions across various genres and cultures. It also provides a historical perspective by tracing the evolution of music transcription from traditional manual methods to modern automated approaches. It also highlights the challenges in transcription posed by complex singing techniques, variations in instrumentation, ambiguity in pitch, tempo changes, rhythm, and dynamics. The review also categorizes four different types of transcription techniques, frame-level, note-level, stream-level, and notation-level, discussing their strengths and limitations. It also encompasses the various research domains of music transcription from general melody extraction to vocal melody, note-level monophonic to polyphonic vocal transcription, single-instrument to multi-instrument transcription, and multi-pitch estimation. The survey further covers a broad spectrum of music transcription applications in music production and creation. It also reviews state-of-the-art open-source as well as commercial music transcription tools for pitch estimation, onset and offset detection, general melody detection, and vocal melody detection. In addition, it also encompasses the currently available python libraries that can be used for music transcription. Furthermore, the review highlights the various open-source benchmark datasets for different areas of music transcription. It also provides a wide range of references supporting the historical context, theoretical frameworks, and foundational concepts to help readers understand the background of music transcription and the context of our paper. Full article

(This article belongs to the Section Computing and Artificial Intelligence)

► Show Figures

Figure 1

13 pages, 1567 KB

Open AccessArticle

Annotated-VocalSet: A Singing Voice Dataset

by Behnam Faghih and Joseph Timoney

Appl. Sci. 2022, 12(18), 9257; https://doi.org/10.3390/app12189257 - 15 Sep 2022

Cited by 7 | Viewed by 9233

Abstract

There are insufficient datasets of singing files that are adequately annotated. One of the available datasets that includes a variety of vocal techniques (n = 17) and several singers (m = 20) with several WAV files (p = 3560) is the VocalSet dataset. However, although several categories, including techniques, singers, tempo, and loudness, are in the dataset, they are not annotated. Therefore, this study aims to annotate VocalSet to make it a more powerful dataset for researchers. The annotations generated for the VocalSet audio files include fundamental frequency contour, note onset, note offset, the transition between notes, note F0, note duration, Midi pitch, and lyrics. This paper describes the generated dataset and explains our approaches to creating and testing the annotations. Moreover, four different methods to define the onset/offset are compared. Full article

(This article belongs to the Special Issue Algorithmic Music and Sound Computing)

► Show Figures

Figure 1

Search Results (2)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Saved Queries

Search Filter Reset All

Years

Feature Papers

Subjects

Journals

Article Types

Countries / Regions

Search Results (2)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI