Resource-Efficient Pet Dog Sound Events Classification Using LSTM-FCN Based on Time-Series Data
Abstract
1. Introduction
2. Background
2.1. Time-Series Classification
2.2. LSTM-FCN
3. Proposed Methods
- Labeling each sound event as barking, growling, howling, or whining.
- Applying normalization to obtain a consistent data distribution.
- Extending the dimension of the training data by interpolation.
- Applying the LSTM-FCN model to classify the pet dog sound events.
3.1. Pet Dog Sound Event Intensity Data Acquired by Noise Sensor
3.2. Analysis of Pet Dog Sound Intensity
3.3. Bicubic Interpolation
3.4. Classification of Pet Dog Sound Events Using LSTM-FCN
Algorithm 1. Overall algorithm of the proposed method.
Input: Intensity data obtained from pet dog sound events using the noise sensor
Output: Classification accuracy for the pet dog sound events
// Load the intensity data
Value = Load(noise sensor)
// 0–1 normalization for a uniform distribution
for (i = 0; i < the number of columns in Value; i++)
    for (j = 0; j < the number of rows in Value; j++)
        ValueNormalize[i, j] = 0–1_Normalization(Value[i, j])
// Extend the dimension of the training data by interpolation
for (i = 0; i < the number of columns in Value; i++)
    for (j = 0; j < the number of rows in Value; j++)
        ValueInterpolation[i, j] = BicubicInterpolation(ValueNormalize[i, j])
// Classify the pet dog sound events using LSTM-FCN
for each ValueInterpolation
    Calculate the accuracy of each pet dog sound event using LSTM-FCN
Return
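The normalization and interpolation steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that 0–1 normalization means min-max scaling, and it stands in for bicubic interpolation on a one-dimensional intensity series with cubic convolution (the kernel that bicubic interpolation applies along each axis, here with a = -0.5). The function names, the target length of 64, and the sample values are hypothetical. The LSTM-FCN classifier is omitted.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale a series linearly into [0, 1] (assumed meaning of 0-1 normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def cubic_kernel(x: float, a: float = -0.5) -> float:
    """Cubic convolution kernel, the 1-D building block of bicubic interpolation."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def resample_cubic(values: Sequence[float], target_len: int) -> List[float]:
    """Resample a series to target_len points using cubic convolution."""
    n = len(values)
    if n == 0 or target_len <= 0:
        raise ValueError("need a non-empty series and a positive target length")
    if n == 1 or target_len == 1:
        return [float(values[0])] * target_len
    step = (n - 1) / (target_len - 1)  # spacing of output points in input units
    out = []
    for k in range(target_len):
        pos = k * step
        base = int(pos)  # floor, since pos is non-negative
        total = 0.0
        # Weighted sum over the four nearest input samples.
        for idx in range(base - 1, base + 3):
            clamped = min(max(idx, 0), n - 1)  # replicate the edge samples
            total += values[clamped] * cubic_kernel(pos - idx)
        out.append(total)
    return out


# Hypothetical short event (illustrative values, not taken from the dataset).
event = [1.0, 6.0, 17.5, 25.0, 8.0]
normalized = min_max_normalize(event)
resampled = resample_cubic(normalized, 64)  # 64 is an assumed fixed input length
```

Cubic interpolation can overshoot slightly, so resampled values may fall just outside [0, 1]. Whether to clip them is a design choice that this sketch leaves open.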
4. Experimental Results
4.1. Experimental Environment
4.2. Comparison of Results Based on Sound and Intensity Data
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Pet Dog Sound Events | CM | NC | SR (Hz) | TS | Duration (s) | BPS |
---|---|---|---|---|---|---|
Barking | Uncompressed | 1 | 22,050 | 5327 | 0.24 | 16 |
Growling | Uncompressed | 1 | 22,050 | 11,461 | 0.51 | 16 |
Howling | Uncompressed | 1 | 22,050 | 32,628 | 1.47 | 16 |
Whining | Uncompressed | 1 | 22,050 | 6311 | 0.28 | 16 |
Sound Sensor (Intensity) | Noise Sensor (Intensity Level): Barking | Noise Sensor (Intensity Level): Growling | Noise Sensor (Intensity Level): Howling | Noise Sensor (Intensity Level): Whining |
---|---|---|---|---|
Barking | 4.61 | 14.79 | 8.63 | 13.57 |
Growling | 10.41 | 4.70 | 8.14 | 8.34 |
Howling | 8.89 | 8.89 | 3.54 | 7.93 |
Whining | 9.57 | 8.38 | 7.81 | 3.13 |
Pet Dog Sound Events | Minimum Length | Maximum Length | Mean Length | Median Length |
---|---|---|---|---|
Barking | 5 | 47 | 19.24 | 19 |
Growling | 16 | 405 | 59.59 | 56 |
Howling | 51 | 646 | 188.60 | 161 |
Whining | 5 | 198 | 27.97 | 19 |
Pet Dog Sound Events | Minimum Voltage | Maximum Voltage | Mean Voltage | Median Voltage |
---|---|---|---|---|
Barking | 0.98 | 25.39 | 15.70 | 17.58 |
Growling | 0.98 | 8.79 | 6.29 | 5.86 |
Howling | 0.98 | 11.72 | 7.68 | 7.81 |
Whining | 0.98 | 13.67 | 8.32 | 7.81 |
Pet Dog Sound Events | Sample Index (1/138 s) | Intensity Level (16 consecutive samples per row) | | | | | | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Barking | 1–16 | 4.82 | 4.81 | 4.79 | 4.81 | 4.81 | 4.34 | 4.66 | 4.46 | 4.47 | 4.19 | 3.48 | 2.84 | 2.23 | 1.68 | 1.59 | 2.38 |
Barking | 17–32 | 2.84 | 3.62 | 4.34 | 3.92 | 2.97 | 2.31 | 2.28 | 2.54 | 2.82 | 3.09 | 3.38 | 3.50 | 3.26 | 2.85 | 2.63 | 2.84 |
Barking | 33–48 | 2.23 | 1.68 | 1.59 | 2.38 | 3.62 | 4.34 | 3.92 | 2.97 | 2.31 | 2.28 | 2.54 | 2.82 | 3.09 | 3.38 | 3.50 | 3.26 |
Barking | 49–64 | 2.85 | 2.63 | 2.84 | 3.23 | 3.40 | 3.07 | 2.52 | 1.94 | 1.91 | 2.16 | 2.22 | 2.55 | 2.85 | 3.00 | 3.11 | 3.33 |
Growling | 1–16 | 4.45 | 4.28 | 3.83 | 2.99 | 2.49 | 2.63 | 3.11 | 3.54 | 3.80 | 4.01 | 4.04 | 3.73 | 3.26 | 2.93 | 2.94 | 3.11 |
Growling | 17–32 | 3.17 | 2.96 | 2.64 | 2.37 | 2.14 | 1.96 | 2.03 | 2.58 | 3.37 | 3.91 | 3.85 | 3.52 | 3.39 | 3.74 | 4.28 | 4.55 |
Growling | 33–48 | 4.29 | 3.76 | 3.22 | 2.66 | 2.09 | 1.81 | 2.01 | 2.49 | 2.94 | 3.25 | 3.53 | 3.66 | 3.48 | 3.15 | 3.06 | 3.49 |
Growling | 49–64 | 4.15 | 4.50 | 4.25 | 3.70 | 3.14 | 2.57 | 1.99 | 1.65 | 1.68 | 1.94 | 2.25 | 2.63 | 3.05 | 3.19 | 2.77 | 2.03 |
Howling | 1–16 | 3.61 | 4.08 | 4.28 | 4.15 | 3.95 | 3.52 | 2.64 | 2.18 | 1.95 | 1.88 | 2.01 | 2.47 | 3.13 | 3.58 | 3.56 | 3.32 |
Howling | 17–32 | 3.19 | 3.31 | 3.54 | 3.75 | 3.88 | 3.99 | 4.03 | 3.92 | 3.75 | 3.65 | 3.77 | 2.55 | 2.15 | 2.62 | 2.57 | 2.08 |
Howling | 33–48 | 2.48 | 3.30 | 3.89 | 3.97 | 3.81 | 3.06 | 2.48 | 2.20 | 2.54 | 3.18 | 3.54 | 3.27 | 2.73 | 2.37 | 2.38 | 2.57 |
Howling | 49–64 | 2.80 | 3.11 | 3.45 | 3.50 | 2.93 | 2.08 | 1.55 | 1.71 | 2.19 | 2.50 | 2.35 | 2.03 | 1.82 | 1.85 | 2.00 | 2.11 |
Whining | 1–16 | 0.01 | 0.33 | 0.76 | 1.05 | 1.04 | 0.89 | 0.71 | 0.48 | 0.19 | 0.76 | 1.41 | 2.20 | 2.62 | 2.27 | 1.56 | 1.11 |
Whining | 17–32 | 1.69 | 1.98 | 2.16 | 2.96 | 3.16 | 3.31 | 3.41 | 3.46 | 3.47 | 3.35 | 3.02 | 2.57 | 2.20 | 2.01 | 1.85 | 1.65 |
Whining | 33–48 | 1.22 | 0.70 | 0.31 | 0.10 | 0.01 | 0.15 | 0.39 | 0.52 | 0.40 | 0.18 | 2.20 | 2.54 | 3.18 | 3.54 | 2.73 | 2.37 |
Whining | 49–64 | 2.08 | 1.94 | 1.81 | 1.63 | 1.44 | 1.38 | 1.57 | 1.90 | 2.09 | 1.97 | 1.72 | 1.58 | 1.94 | 2.21 | 1.95 | 1.68 |
Pet Dog Sound Events | # of Events | Total # of Intensity Data (All Events) |
---|---|---|
Barking | 300 | 5771 |
Growling | 300 | 17,877 |
Howling | 300 | 56,579 |
Whining | 300 | 8390 |
Total Number of Data | 1200 | 88,617 |
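Type of Sensor | Type of Data | Feature Extraction Method | Classification Method | Accuracy |
---|---|---|---|---|
Sound Sensor (Typical) | Sound Data | MFCC | SVM | 0.8545 |
Sound Sensor (Typical) | Sound Data | MFCC | K-NN | 0.7944 |
Sound Sensor (Typical) | Sound Data | Spectrogram | SVM | 0.8633 |
Sound Sensor (Typical) | Sound Data | Spectrogram | K-NN | 0.7834 |
Sound Sensor (Typical) | Sound Data | Mel-Spectrum | SVM | 0.8432 |
Sound Sensor (Typical) | Sound Data | Mel-Spectrum | K-NN | 0.7855 |
Noise Sensor (Proposed) | Intensity Data | None | Shapelet | 0.6788 |
Noise Sensor (Proposed) | Intensity Data | None | LSTM-FCN | 0.7396 |
Noise Sensor (Proposed) | Intensity Data | None | Bicubic + LSTM-FCN | 0.8368 |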
Type of Device | Average Data Size (KB) | Current (mA) | Voltage (V) | Energy (J) |
---|---|---|---|---|
Sound sensor (MQ-U300) | 66.4 | 180 | 5.0 | 0.9 |
Noise sensor (LM-393) | 0.9 | 20 | 5.0 | 0.1 |
Wi-Fi (ESP8266) | — | 170 | 3.3 | 0.5 |
Metric | Sensor | 300 KB/s | 600 KB/s | 900 KB/s | 1200 KB/s |
---|---|---|---|---|---|
Sensing Energy (J) | Sound | 0.9 | 0.9 | 0.9 | 0.9 |
Sensing Energy (J) | Noise | 0.1 | 0.1 | 0.1 | 0.1 |
Transmission Energy (J) | Sound | 0.111 | 0.056 | 0.037 | 0.028 |
Transmission Energy (J) | Noise | 0.002 | 0.001 | 0.001 | 0.001 |
Total Energy (J) | Sound | 1.011 | 0.956 | 0.937 | 0.928 |
Total Energy (J) | Noise | 0.102 | 0.101 | 0.101 | 0.101 |
Battery Usage Time (h) | Sound | 1.9 | 2.0 | 2.1 | 2.2 |
Battery Usage Time (h) | Noise | 19.6 | 19.8 | 19.8 | 19.8 |
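The energy figures in the two tables above can be approximated with simple arithmetic. The sketch below is an illustration under assumptions, not the authors' stated procedure. It assumes the sensing energy is the electrical power (current × voltage) drawn over one second, and that the transmission energy is the transfer time (data size / speed) multiplied by the Wi-Fi energy figure of 0.5 J from the device table. The function names are hypothetical. Under these assumptions the sensing values (0.9 J and 0.1 J) and the sound-sensor transmission value at 300 KB/s (0.111 J) are reproduced. The Wi-Fi product of 170 mA and 3.3 V is 0.561 W, slightly above the listed 0.5 J, and a few other table cells differ in the last digit, presumably because of rounding.

```python
def power_watts(current_ma: float, voltage_v: float) -> float:
    """Electrical power P = I * V, with the current given in milliamperes."""
    return current_ma / 1000.0 * voltage_v


def transmission_energy_j(size_kb: float, speed_kb_s: float,
                          energy_per_second_j: float = 0.5) -> float:
    """Assumed model: transfer time (s) times the Wi-Fi energy drawn per second."""
    return size_kb / speed_kb_s * energy_per_second_j


# Values copied from the device table (data size in KB, current in mA, voltage in V).
sound_sensing_j = power_watts(180, 5.0) * 1.0  # one second of sensing (assumption)
noise_sensing_j = power_watts(20, 5.0) * 1.0

for speed in (300, 600, 900, 1200):
    sound_total = sound_sensing_j + transmission_energy_j(66.4, speed)
    noise_total = noise_sensing_j + transmission_energy_j(0.9, speed)
    print(f"{speed} KB/s: sound {sound_total:.3f} J, noise {noise_total:.3f} J")
```

In this model the sensing term dominates the total for both sensors, which is consistent with the total energy changing only slightly as the transmission speed increases.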
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kim, Y.; Sa, J.; Chung, Y.; Park, D.; Lee, S. Resource-Efficient Pet Dog Sound Events Classification Using LSTM-FCN Based on Time-Series Data. Sensors 2018, 18, 4019. https://doi.org/10.3390/s18114019