Article

Domestic Cat Sound Classification Using Learned Features from Deep Neural Nets

Division of Computer Science and Engineering, Chonbuk National University, Jeonju 54896, Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2018, 8(10), 1949; https://doi.org/10.3390/app8101949
Received: 6 September 2018 / Revised: 6 October 2018 / Accepted: 15 October 2018 / Published: 16 October 2018
The domestic cat (Felis catus) is one of the most attractive pets in the world, and it produces a variety of mysterious sounds depending on its mood and situation. In this paper, we deal with the automatic classification of cat sounds using machine learning. A machine learning approach to classification requires class-labeled data, so our work starts with building a small dataset named CatSound covering 10 categories. In addition to the original dataset, we increase the amount of data with various audio data augmentation methods to help our classification task. In this study, we use two types of learned features from deep neural networks: one from a convolutional neural net (CNN) pre-trained on music data and applied by transfer learning, and the other from an unsupervised convolutional deep belief network (CDBN) trained solely on a collected set of cat sounds. In addition to conventional global average pooling (GAP), we propose an effective pooling method called frequency division average pooling (FDAP) to explore a larger number of meaningful features. In FDAP, the frequency dimension is roughly divided and average pooling is then applied within each division. For the classification, we use five different machine learning algorithms and an ensemble of them. We compare the classification performance with respect to the following factors: the amount of data increased by augmentation, the learned features from the pre-trained CNN or the unsupervised CDBN, conventional GAP or FDAP, and the machine learning algorithm used for the classification. As expected, the proposed FDAP features, combined with the larger augmented dataset and the ensemble approach, produced the best accuracy. Moreover, the learned features from both the pre-trained CNN and the unsupervised CDBN produce good results in the experiment. With the combination of all those positive factors, we obtained the best result of 91.13% in accuracy, 0.91 in F1-score, and 0.995 in area under the curve (AUC) score.
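The difference between GAP and FDAP described in the abstract can be illustrated with a minimal sketch. The abstract does not state how many frequency divisions are used or where their boundaries fall, so the equal-width contiguous bands, the `num_bands` parameter, and both function names below are assumptions made for illustration only, not the authors' implementation. A feature map is represented as a nested list indexed as channel, frequency bin, time frame.

```python
from statistics import fmean


def global_average_pooling(feature_map):
    """GAP: one average per channel over all frequency bins and time frames."""
    return [fmean(v for row in channel for v in row) for channel in feature_map]


def frequency_division_average_pooling(feature_map, num_bands):
    """FDAP sketch: split the frequency axis into `num_bands` contiguous
    bands (assumed roughly equal in width) and average each band separately,
    giving `num_bands` values per channel instead of one."""
    pooled = []
    for channel in feature_map:
        n_freq = len(channel)
        if not 1 <= num_bands <= n_freq:
            raise ValueError("num_bands must be between 1 and the number of frequency bins")
        for b in range(num_bands):
            # Integer arithmetic keeps the bands contiguous and non-empty.
            start = b * n_freq // num_bands
            stop = (b + 1) * n_freq // num_bands
            pooled.append(fmean(v for row in channel[start:stop] for v in row))
    return pooled


# Toy feature map: 1 channel, 4 frequency bins, 2 time frames.
toy = [[[1.0, 1.0], [3.0, 3.0], [5.0, 5.0], [7.0, 7.0]]]

gap_features = global_average_pooling(toy)                  # [4.0]
fdap_features = frequency_division_average_pooling(toy, 2)  # [2.0, 6.0]
```

In this toy case GAP collapses the channel to a single value, while the two-band FDAP keeps the low-frequency and high-frequency averages apart, which is the sense in which FDAP yields more features per channel.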
Keywords: balanced dataset; data augmentation; deep belief network; feature extraction; frequency division average pooling; ensemble

MDPI and ACS Style

Pandeya, Y.R.; Kim, D.; Lee, J. Domestic Cat Sound Classification Using Learned Features from Deep Neural Nets. Appl. Sci. 2018, 8, 1949. https://doi.org/10.3390/app8101949

AMA Style

Pandeya YR, Kim D, Lee J. Domestic Cat Sound Classification Using Learned Features from Deep Neural Nets. Applied Sciences. 2018; 8(10):1949. https://doi.org/10.3390/app8101949

Chicago/Turabian Style

Pandeya, Yagya R., Dongwhoon Kim, and Joonwhoan Lee. 2018. "Domestic Cat Sound Classification Using Learned Features from Deep Neural Nets" Applied Sciences 8, no. 10: 1949. https://doi.org/10.3390/app8101949
