Open Access Article

Towards End-to-End Acoustic Localization Using Deep Learning: From Audio Signals to Source Position Coordinates

Department of Electronics, University of Alcalá, Campus Universitario s/n, Alcalá de Henares, 28805 Madrid, Spain
*
Author to whom correspondence should be addressed.
Sensors 2018, 18(10), 3418; https://doi.org/10.3390/s18103418
Received: 29 July 2018 / Revised: 14 September 2018 / Accepted: 6 October 2018 / Published: 12 October 2018
(This article belongs to the Section Physical Sensors)
This paper presents a novel approach for indoor acoustic source localization using microphone arrays, based on a Convolutional Neural Network (CNN). In the proposed solution, the CNN is designed to directly estimate the three-dimensional position of a single acoustic source, using the raw audio signal as input and avoiding hand-crafted audio features. Given the limited amount of available localization data, we propose a two-step training strategy. We first train our network using semi-synthetic data generated from close-talk speech recordings, simulating the time delays and distortion suffered by the signal as it propagates from the source to the array of microphones. We then fine-tune this network using a small amount of real data. Our experimental results, evaluated on a publicly available dataset recorded in a real room, show that this approach produces networks that significantly outperform existing localization methods based on SRP-PHAT strategies, as well as very recent proposals based on Convolutional Recurrent Neural Networks (CRNN). In addition, our experiments show that the performance of our CNN method does not depend noticeably on the speaker's gender or on the size of the signal window being used.
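As a rough illustration of the semi-synthetic data idea described above, the following minimal sketch delays and attenuates a close-talk signal for each microphone of an array. It is not the authors' simulation: the function name `simulate_array_signals`, the free-field model with integer-sample delays and 1/distance attenuation, and the speed-of-sound constant are all assumptions made for this example, and reverberation and noise are ignored.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def simulate_array_signals(signal, source_pos, mic_positions, sample_rate):
    """Delay and attenuate a close-talk signal for each microphone.

    Simplified free-field model (an assumption of this sketch):
    an integer-sample delay derived from the source-to-microphone
    distance, and a 1/distance amplitude decay.
    """
    outputs = []
    for mic in mic_positions:
        distance = math.dist(source_pos, mic)  # Euclidean distance in metres
        delay = round(distance / SPEED_OF_SOUND * sample_rate)  # in samples
        gain = 1.0 / max(distance, 1e-3)  # guard against division by zero
        delayed = [0.0] * delay + [gain * s for s in signal]
        outputs.append(delayed[:len(signal)])  # keep the original length
    return outputs
```

Under these assumptions, each simulated multichannel window could be paired with its known `source_pos` as a training label, which is the kind of labeled data that is scarce in real recordings.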
Keywords: acoustic source localization; microphone arrays; deep learning; convolutional neural networks
MDPI and ACS Style

Vera-Diaz, J.M.; Pizarro, D.; Macias-Guarasa, J. Towards End-to-End Acoustic Localization Using Deep Learning: From Audio Signals to Source Position Coordinates. Sensors 2018, 18, 3418.

