Open Access Article
Appl. Sci. 2017, 7(9), 901;

A Two-Stage Approach to Note-Level Transcription of a Specific Piano

1,2,* and 1,2,3
1 Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
2 University of Chinese Academy of Sciences, Beijing 100190, China
3 Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumchi 830001, China
* Author to whom correspondence should be addressed.
Academic Editor: Tapio Lokki
Received: 22 July 2017 / Revised: 25 August 2017 / Accepted: 29 August 2017 / Published: 2 September 2017
(This article belongs to the Special Issue Sound and Music Computing)


This paper presents a two-stage transcription framework for a specific piano that combines deep learning and spectrogram factorization techniques. In the first stage, two convolutional neural networks (CNNs) produce a preliminary recognition of the notes; in the second stage, the notes are verified against the specific instrument. The note recognition stage is independent of the individual piano: one CNN detects onsets, and another estimates the probabilities of the pitches at each detected onset, yielding candidate pitches at candidate onsets. During note verification, templates are generated for the specific piano to model the attack of the note at each pitch. The spectrogram of the segment around each candidate onset is then factorized using the attack templates of the candidate pitches; in this way, the pitches are selected according to the note activations and the onsets are refined. Experiments show that CNNs outperform other types of neural networks in both onset detection and pitch estimation, and that the combination of two CNNs yields better note-recognition performance than a single CNN. We also observe that note verification further improves transcription performance. In the transcription of a specific piano, the proposed system achieves a note-wise F-measure of 82%, outperforming the state of the art.
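As a rough illustration of the verification stage described above, the sketch below factorizes a toy spectrogram segment against fixed per-pitch attack templates using multiplicative non-negative updates, and ranks candidate pitches by their activation energy. The function name, the Euclidean-distance update rule, and the toy data are illustrative assumptions for the sketch, not the authors' implementation.

```python
import numpy as np

def factorize_with_templates(V, W, n_iter=200, eps=1e-9):
    """Estimate activations H (pitches x frames) for a non-negative
    spectrogram segment V (freq bins x frames), keeping the attack
    templates W (freq bins x pitches) fixed. Uses multiplicative
    updates that decrease the Euclidean cost ||V - W H||."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        # Multiplicative update for H only; W stays fixed because the
        # templates were learned from the specific piano beforehand.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Toy example: two synthetic "pitch" templates over 3 frequency bins,
# and a 3-frame segment dominated by pitch 0 with a decaying envelope.
W = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [0.0, 0.5]])
V = np.outer(W[:, 0], [1.0, 0.8, 0.2])

H = factorize_with_templates(V, W)
energies = H.sum(axis=1)  # per-pitch activation energy over the segment
# A candidate pitch is kept when its activation energy dominates;
# here pitch 0 should clearly win over pitch 1.
```

In the paper's pipeline the candidate pitches come from the pitch-estimation CNN and the segment is taken around a candidate onset; a decision of this form (thresholding activation energy per candidate pitch) is one simple way to realize the "pitches picked up by note activations" step.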
Keywords: music information retrieval; piano transcription; note recognition; note verification; onset detection; multi-pitch estimation

Figure 1

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Wang, Q.; Zhou, R.; Yan, Y. A Two-Stage Approach to Note-Level Transcription of a Specific Piano. Appl. Sci. 2017, 7, 901.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.


Appl. Sci. EISSN 2076-3417, published by MDPI AG, Basel, Switzerland.