A Two-Stage Approach to Note-Level Transcription of a Specific Piano
Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
University of Chinese Academy of Sciences, Beijing 100190, China
Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumchi 830001, China
Author to whom correspondence should be addressed.
Academic Editor: Tapio Lokki
Appl. Sci. 2017, 7(9), 901; https://doi.org/10.3390/app7090901
Received: 22 July 2017 / Revised: 25 August 2017 / Accepted: 29 August 2017 / Published: 2 September 2017
(This article belongs to the Special Issue Sound and Music Computing)
This paper presents a two-stage transcription framework for a specific piano that combines deep learning with spectrogram factorization. In the first stage, two convolutional neural networks (CNNs) perform preliminary note recognition; in the second stage, the recognized notes are verified against the specific instrument. The note recognition stage is independent of the individual piano: one CNN detects onsets and another estimates the pitch probabilities at each detected onset, so the first stage yields candidate pitches at candidate onsets. For note verification, templates are generated for the specific piano to model the attack of the note at each pitch. The spectrogram of the segment around each candidate onset is then factorized using the attack templates of the candidate pitches. In this way, the note activations not only select the pitches but also refine the onsets. Experiments show that CNNs outperform other types of neural networks in both onset detection and pitch estimation, and that combining two CNNs yields better note recognition performance than a single CNN. We also observe that note verification further improves transcription performance. In the transcription of a specific piano, the proposed system achieves a note-wise F-measure of 82%, which outperforms the state of the art.
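The verification step described above, factorizing a spectrogram segment with fixed per-pitch templates, can be illustrated with a minimal sketch. The abstract does not specify the template model or the cost function, so the code below is an assumption: it uses plain non-negative matrix factorization with a Euclidean cost and standard multiplicative updates, and it reduces each attack template to a single spectral column. The function name `activations_with_fixed_templates` and the toy matrices are hypothetical and chosen only for illustration.

```python
import numpy as np


def activations_with_fixed_templates(V, W, n_iter=500, eps=1e-9):
    """Estimate non-negative activations H with V ~= W @ H, keeping W fixed.

    V: (n_bins, n_frames) magnitude spectrogram of a segment.
    W: (n_bins, n_pitches) one spectral template per candidate pitch.
    Assumption: Euclidean cost with multiplicative updates; the paper's
    actual cost function and template structure may differ.
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # positive initialization
    WtV = W.T @ V    # constant across iterations because W is fixed
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)  # update keeps H non-negative
    return H


# Toy templates for three candidate pitches (6 frequency bins each).
W = np.array([
    [1.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.2, 0.5, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.2, 0.5],
])

# Toy segment: pitch 0 sounds at frame 1, pitch 2 at frame 3, pitch 1 never.
H_true = np.zeros((3, 5))
H_true[0, 1] = 1.0
H_true[2, 3] = 0.5
V = W @ H_true

H_est = activations_with_fixed_templates(V, W)

# A candidate pitch is kept if its activation exceeds a threshold
# (0.1 is an arbitrary illustrative value); the frame of the peak
# activation serves as the revised onset position.
kept = {p: int(H_est[p].argmax()) for p in range(3) if H_est[p].max() > 0.1}
print(kept)
```

In this toy case the candidate pitch that never sounds receives a near-zero activation and is discarded, while the two sounding pitches are kept and their peak frames give the refined onset positions. A real system would operate on spectrogram frames of the recorded piano and on templates learned from that instrument.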