Article

A Study of Features and Deep Neural Network Architectures and Hyper-Parameters for Domestic Audio Classification

1 Faculty of Engineering and Information Sciences, University of Wollongong in Dubai, Dubai 20183, United Arab Emirates
2 School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Northfields Ave, Wollongong, NSW 2522, Australia
3 Department of Musicology, University of Oslo, Sem Sælands vei 2, 0371 Oslo, Norway
* Author to whom correspondence should be addressed.
Academic Editor: Cheonshik Kim
Appl. Sci. 2021, 11(11), 4880; https://doi.org/10.3390/app11114880
Received: 1 May 2021 / Revised: 24 May 2021 / Accepted: 25 May 2021 / Published: 26 May 2021
Recent methodologies for audio classification frequently involve cepstral and spectral features, applied to single-channel recordings of acoustic scenes and events. Further, transfer learning has been widely used over the years and has proven to be an efficient alternative to training neural networks from scratch. The lower time and resource requirements of pre-trained models allow for more versatility in developing classification systems. However, information on classification performance when using different features for multi-channel recordings is often limited. Furthermore, pre-trained networks are initially trained on larger databases and are often unnecessarily large. This poses a challenge when developing systems for devices with limited computational resources, such as mobile or embedded devices. This paper presents a detailed study of the most prominent and widely used cepstral and spectral features for multi-channel audio applications. Accordingly, we propose the use of spectro-temporal features. Additionally, the paper details the development of a compact version of the AlexNet model for computationally limited platforms, through studies of performance under various architectural and parameter modifications of the original network. The aim is to minimize the network size while maintaining the series network architecture and preserving the classification accuracy. Considering that other state-of-the-art compact networks are built as complex directed acyclic graphs, a series architecture offers an advantage in customizability. Experimentation was carried out in MATLAB, using a database that we generated for this task, which consists of four-channel synthetic recordings of both sound events and scenes. The top-performing methodology achieved a weighted F1-score of 87.92% for scalogram features classified via the modified AlexNet-33 network, which has a size of 14.33 MB. By comparison, the original AlexNet network achieved 86.24% at a size of 222.71 MB.
Keywords: neural network; transfer learning; scalograms; MFCC; Log-mel; pre-trained models
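
The weighted F1-score quoted in the abstract can be illustrated with a minimal sketch. It assumes the common definition in which per-class F1 scores are averaged with weights proportional to each class's number of true samples; the abstract does not state the exact computation used in the study, and the function name and class labels below are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (assumed definition)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of the data.
        score += (n / total) * f1
    return score


# Hypothetical labels, for illustration only (not data from the paper).
y_true = ["speech", "speech", "alarm", "silence"]
y_pred = ["speech", "alarm", "alarm", "silence"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Weighting by class support keeps a class with many samples from being outvoted by rare classes, which matters when the classes in a database are unevenly represented.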
MDPI and ACS Style

Copiaco, A.; Ritz, C.; Abdulaziz, N.; Fasciani, S. A Study of Features and Deep Neural Network Architectures and Hyper-Parameters for Domestic Audio Classification. Appl. Sci. 2021, 11, 4880. https://doi.org/10.3390/app11114880

AMA Style

Copiaco A, Ritz C, Abdulaziz N, Fasciani S. A Study of Features and Deep Neural Network Architectures and Hyper-Parameters for Domestic Audio Classification. Applied Sciences. 2021; 11(11):4880. https://doi.org/10.3390/app11114880

Chicago/Turabian Style

Copiaco, Abigail, Christian Ritz, Nidhal Abdulaziz, and Stefano Fasciani. 2021. "A Study of Features and Deep Neural Network Architectures and Hyper-Parameters for Domestic Audio Classification" Applied Sciences 11, no. 11: 4880. https://doi.org/10.3390/app11114880
