Open Access Article

Learning Low-Dimensional Embeddings of Audio Shingles for Cross-Version Retrieval of Classical Music

International Audio Laboratories Erlangen, 91058 Erlangen, Germany
* Authors to whom correspondence should be addressed.
Appl. Sci. 2020, 10(1), 19; https://doi.org/10.3390/app10010019
Received: 23 August 2019 / Revised: 3 December 2019 / Accepted: 13 December 2019 / Published: 18 December 2019
(This article belongs to the Special Issue Digital Audio Effects)
Cross-version music retrieval aims at identifying all versions of a given piece of music using a short query audio fragment. One previous approach, which is particularly suited for Western classical music, is based on a nearest neighbor search using short sequences of chroma features, also referred to as audio shingles. From the viewpoint of efficiency, indexing and dimensionality reduction are important aspects. In this paper, we extend previous work by adapting two embedding techniques: one is based on classical principal component analysis (PCA), and the other is based on neural networks with triplet loss. Furthermore, we report on systematically conducted experiments with Western classical music recordings and discuss the trade-off between retrieval quality and embedding dimensionality. As one main result, we show that, using neural networks, one can reduce the audio shingles from 240 to fewer than 8 dimensions with only a moderate loss in retrieval accuracy. In addition, we present extended experiments with databases of different sizes and different query lengths to test the scalability and generalizability of the dimensionality reduction methods. We also provide a more detailed view into the retrieval problem by analyzing the distances that appear in the nearest neighbor search.
Keywords: music information retrieval; version identification; audio matching; embedding; PCA; deep learning; triplet loss
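
The pipeline summarized in the abstract (reduce 240-dimensional shingles to a few dimensions, then run a nearest neighbor search) can be illustrated with a minimal sketch. It is not the authors' implementation. The data are synthetic random vectors standing in for real chroma-based shingles, the function names (`fit_pca`, `embed`, `nearest_neighbors`, `triplet_loss`) are hypothetical, and the triplet loss is shown in a common hinge form with an assumed margin; the network architecture and exact loss used in the paper are not specified here.

```python
import numpy as np

def fit_pca(shingles, n_components):
    """Return (mean, components) for projecting onto the top principal axes."""
    mean = shingles.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(shingles - mean, full_matrices=False)
    return mean, vt[:n_components]

def embed(shingles, mean, components):
    """Project shingles into the low-dimensional embedding space."""
    return (shingles - mean) @ components.T

def nearest_neighbors(query, database, k):
    """Indices of the k database items closest to the query (Euclidean)."""
    dists = np.linalg.norm(database - query, axis=1)
    return np.argsort(dists)[:k]

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on squared Euclidean distances (assumed form)."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return float(max(0.0, d_pos - d_neg + margin))

rng = np.random.default_rng(0)
# Synthetic stand-in for a database of 240-dimensional audio shingles.
database = rng.standard_normal((500, 240))
mean, components = fit_pca(database, n_components=8)
db_low = embed(database, mean, components)

# Query: a slightly perturbed copy of database item 42.
query = database[42] + 0.01 * rng.standard_normal(240)
q_low = embed(query[None, :], mean, components)[0]
top = nearest_neighbors(q_low, db_low, k=5)
print(top)
```

In this toy setting the search runs entirely in the 8-dimensional space, which is the efficiency argument made above. A learned embedding would replace `embed` with a network trained so that `triplet_loss` is small when the anchor and positive come from versions of the same passage and the negative does not.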

MDPI and ACS Style

Zalkow, F.; Müller, M. Learning Low-Dimensional Embeddings of Audio Shingles for Cross-Version Retrieval of Classical Music. Appl. Sci. 2020, 10, 19. https://doi.org/10.3390/app10010019

AMA Style

Zalkow F, Müller M. Learning Low-Dimensional Embeddings of Audio Shingles for Cross-Version Retrieval of Classical Music. Applied Sciences. 2020; 10(1):19. https://doi.org/10.3390/app10010019

Chicago/Turabian Style

Zalkow, Frank; Müller, Meinard. 2020. "Learning Low-Dimensional Embeddings of Audio Shingles for Cross-Version Retrieval of Classical Music" Appl. Sci. 10, no. 1: 19. https://doi.org/10.3390/app10010019
