Special Issue "Computational Acoustic Scene Analysis"

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Acoustics and Vibrations".

Deadline for manuscript submissions: closed (15 June 2018)

Special Issue Editors

Guest Editor
Dr. Maurizio Omologo

Fondazione Bruno Kessler (FBK), Trento 38122, Italy
Interests: digital signal processing; audio and music signal processing; robustness in ASR; audio and speech corpora; microphone arrays for speech recognition and acoustic scene analysis
Guest Editor
Prof. Dr. Stefano Squartini

Department of Information Engineering, Università Politecnica delle Marche, Ancona 60121, Italy
Interests: digital signal processing; computational intelligence; computational audio processing; digital music
Guest Editor
Prof. Dr. Tuomas Virtanen

Laboratory of Signal Processing, Tampere University of Technology, Tampere 33720, Finland
Interests: machine listening; audio content analysis; audio signal processing; sound source separation; sound event detection

Special Issue Information

Dear Colleagues,

Computational acoustic scene analysis is a highly active research field where audio signal processing and machine learning meet several scientific topics, such as room acoustics, microphone arrays, sound source localization, source separation, acoustic event detection, and pattern classification. Emerging application fields include surveillance, environmental monitoring, hearing aids, and distant-speech interaction, for example in smart homes and industrial automation. In most of these cases, state-of-the-art techniques are still inadequate for deployment in real-world contexts.

Indeed, very challenging research problems remain to be solved, since real-world noisy and reverberant environments are typically characterized by multiple speakers and noise sources that often overlap one another. With recent advances in machine learning, a significant transformation is under way in these fields, as witnessed by the many papers presented at recent conferences and workshops, and by challenges such as DCASE, ACE, REVERB, and CHiME.

In this Special Issue, we aim to present current advances in computational methods for acoustic scene analysis, including, but not limited to, the following topics:

  • Acoustic event detection and classification

  • Acoustic scene classification

  • Environmental monitoring by means of audio signals

  • Sound source localization and tracking

  • Sound source and speech activity detection

  • Blind source separation

  • Acoustic scene understanding

Preference will be given to works describing advanced digital signal processing and machine learning techniques applied to challenging contexts, such as multiple, often overlapping, sound sources in real-world noisy and reverberant environments.

Dr. Maurizio Omologo
Prof. Dr. Stefano Squartini
Prof. Dr. Tuomas Virtanen
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (2 papers)


Research

Open Access Article: Deep Learning for Audio Event Detection and Tagging on Low-Resource Datasets
Appl. Sci. 2018, 8(8), 1397; https://doi.org/10.3390/app8081397
Received: 15 June 2018 / Revised: 11 August 2018 / Accepted: 14 August 2018 / Published: 18 August 2018
Abstract
In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of events present in each recording without any temporal information for training. Secondly, deep neural networks need a very large amount of labelled training data to achieve good performance, yet in practice it is difficult to collect enough samples for most classes of interest. In this paper, we propose factorising the final task of audio transcription into multiple intermediate tasks in order to improve training performance when dealing with such low-resource datasets. We evaluate three data-efficient approaches to training a stacked convolutional and recurrent neural network for the intermediate tasks. Our results show that the different training methods have different advantages and disadvantages.
(This article belongs to the Special Issue Computational Acoustic Scene Analysis)
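For readers new to the topic, the sketch below illustrates the core technique the paper builds on: a stacked convolutional and recurrent network (CRNN) whose frame-level event probabilities are pooled over time, so that training needs only clip-level (weak) labels. This is a minimal PyTorch sketch under assumed layer sizes and names (CRNNTagger, n_mels, n_classes are all illustrative), not the authors' implementation.

    # Minimal CRNN sketch for weakly labelled audio tagging (assumed sizes).
    import torch
    import torch.nn as nn

    class CRNNTagger(nn.Module):
        def __init__(self, n_mels=40, n_classes=10):
            super().__init__()
            # Convolutional front end over (batch, 1, time, mel) spectrograms.
            self.conv = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d((1, 2)),   # pool along the mel axis only
                nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d((1, 2)),
            )
            # Recurrent layer models temporal context across frames.
            self.gru = nn.GRU(64 * (n_mels // 4), 128,
                              batch_first=True, bidirectional=True)
            self.frame_fc = nn.Linear(256, n_classes)

        def forward(self, spec):                   # spec: (b, 1, time, mel)
            x = self.conv(spec)                    # (b, 64, time, mel // 4)
            b, c, t, f = x.shape
            x = x.permute(0, 2, 1, 3).reshape(b, t, c * f)
            x, _ = self.gru(x)                     # (b, time, 256)
            frame_probs = torch.sigmoid(self.frame_fc(x))  # per-frame activity
            clip_probs = frame_probs.max(dim=1).values     # weak-label pooling
            return clip_probs, frame_probs

    # Training needs only clip-level labels:
    # loss = nn.functional.binary_cross_entropy(clip_probs, weak_labels)

Max pooling over time is one common choice for weak-label training; the frame-level outputs remain available for event detection at test time.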

Open Access Article: Acoustic Scene Classification Using Efficient Summary Statistics and Multiple Spectro-Temporal Descriptor Fusion
Appl. Sci. 2018, 8(8), 1363; https://doi.org/10.3390/app8081363
Received: 12 February 2018 / Revised: 29 July 2018 / Accepted: 9 August 2018 / Published: 13 August 2018
Abstract
This paper presents a novel approach to acoustic scene classification based on efficient acoustic feature extraction using spectro-temporal descriptor fusion. Grounded in the finding from neuroscience that the auditory system summarizes the temporal details of sounds using time-averaged statistics to understand acoustic scenes, we devise an efficient computational framework for sound scene classification using fusion of multiple time-frequency descriptors with discriminant information enhancement. To characterize the rich information of sound, i.e., local structures on the time-frequency plane, we adopt two-dimensional local descriptors. A more critical issue is how to logically 'summarize' those local details into a compact feature vector for scene classification. Although 'time-averaged statistics' are suggested by the psychological investigation, directly computing the time average of local acoustic features is not sound, since the arithmetic mean is vulnerable to extreme values, which are expected to be generated by interfering sounds irrelevant to the scene category. To tackle this problem, we develop a time-frame weighting approach that enhances sound textures while suppressing scene-irrelevant events. Robust acoustic features for scene classification can then be efficiently characterized. The proposed method was validated on the Rouen dataset, which consists of 19 acoustic scene categories with 3029 real samples. Extensive results demonstrate the effectiveness of the proposed scheme.
(This article belongs to the Special Issue Computational Acoustic Scene Analysis)
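To make the 'time-averaged statistics' idea concrete, the sketch below summarizes frame-level features with a weighted time average in which frames far from the median frame are down-weighted, so a loud interfering event barely moves the scene summary. The inverse-distance weighting and the weighted_time_average helper are simplifying assumptions for illustration, not the authors' actual scheme.

    # Weighted time average of local features (NumPy sketch, assumed weighting).
    import numpy as np

    def weighted_time_average(features, eps=1e-8):
        """Summarize (n_frames, n_dims) local features into one vector,
        down-weighting frames far from the median (likely transient,
        scene-irrelevant events) so the summary reflects the stable
        sound texture of the scene."""
        median = np.median(features, axis=0)
        dist = np.linalg.norm(features - median, axis=1)  # per-frame outlierness
        weights = 1.0 / (dist + eps)
        weights /= weights.sum()
        return weights @ features   # weighted mean over time

    # Example: 500 frames of 60-dimensional spectro-temporal descriptors.
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(500, 60))
    frames[100] += 50.0         # a loud interfering event in one frame
    summary = weighted_time_average(frames)   # barely affected by the outlier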
