A Novel Deep Transfer Learning Approach Based on Depth-Wise Separable CNN for Human Posture Detection
Abstract
1. Introduction
Contribution
- The study implemented a novel hybrid DenseNet121 and SVM technique to automatically recognize human postures.
- The suggested model used L1 and L2 regularization, early stopping, and dropout to prevent overfitting.
- The layers of the DenseNet121 deep transfer learning (DTL) model were fine-tuned to achieve better results.
- A comparative analysis of the implemented model and existing systems was conducted.
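The depth-wise separable convolution named in the title factorizes a standard convolution into a per-channel spatial (depth-wise) filter followed by a 1x1 (point-wise) channel-mixing convolution, cutting parameters from k²·C_in·C_out to k²·C_in + C_in·C_out. A minimal NumPy sketch of this operation (illustrative shapes only, not the paper's actual implementation):

```python
import numpy as np

def depthwise_separable_conv(x, depthwise_k, pointwise_w):
    """Depth-wise separable convolution on one image (valid padding, stride 1).

    x:            (H, W, C_in) input feature map
    depthwise_k:  (k, k, C_in) one spatial filter per input channel
    pointwise_w:  (C_in, C_out) 1x1 convolution that mixes channels
    """
    H, W, C_in = x.shape
    k = depthwise_k.shape[0]
    Ho, Wo = H - k + 1, W - k + 1

    # Depth-wise step: each channel is filtered independently.
    dw = np.zeros((Ho, Wo, C_in))
    for i in range(Ho):
        for j in range(Wo):
            patch = x[i:i + k, j:j + k, :]                 # (k, k, C_in)
            dw[i, j, :] = np.sum(patch * depthwise_k, axis=(0, 1))

    # Point-wise step: a 1x1 conv mixes channels at every spatial position.
    return dw @ pointwise_w                                # (Ho, Wo, C_out)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 3))       # toy 8x8 image, 3 channels
dk = rng.standard_normal((3, 3, 3))      # 3x3 depth-wise filters
pw = rng.standard_normal((3, 4))         # point-wise mixing to 4 channels
y = depthwise_separable_conv(x, dk, pw)
print(y.shape)  # (6, 6, 4)
```

Here the separable version uses 3·3·3 + 3·4 = 39 weights instead of the 3·3·3·4 = 108 a full convolution would need, which is why backbones built on this operation stay small.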
2. Related Works
3. Materials and Methods
3.1. Dataset
3.2. Data Preprocessing
3.3. Existing DTL Image Models
3.3.1. Suggested Approach
3.3.2. InceptionV3
3.3.3. ResNet
3.3.4. DenseNet
3.4. Fine-Tuning the Models
3.5. Model Uncertainty
4. Results
4.1. Implementation
4.1.1. Implementation Setup
4.1.2. Training
4.2. Results of Implementation
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Verma, A.; Suman, A.; Biradar, V.G.; Brunda, S. Human Activity Classification Using a Deep Convolutional Neural Network. In Recent Advances in Artificial Intelligence and Data Engineering; Springer: Singapore, 2022; pp. 41–50. [Google Scholar]
- Ogundokun, R.O.; Maskeliunas, R.; Misra, S.; Damaševičius, R. Improved CNN Based on Batch Normalization and Adam Optimizer. In International Conference on Computational Science and Its Applications; Springer: Cham, Switzerland, 2022; pp. 593–604. [Google Scholar]
- Le, N.Q.K.; Ho, Q.T. Deep transformers and convolutional neural network in identifying DNA N6-methyladenine sites in cross-species genomes. Methods 2022, 204, 199–206. [Google Scholar] [CrossRef] [PubMed]
- Danilatou, V.; Nikolakakis, S.; Antonakaki, D.; Tzagkarakis, C.; Mavroidis, D.; Kostoulas, T.; Ioannidis, S. Outcome Prediction in Critically Ill Patients with Venous Thromboembolism and/or Cancer Using Machine Learning Algorithms: External Validation and Comparison with Scoring Systems. Int. J. Mol. Sci. 2022, 23, 7132. [Google Scholar] [CrossRef] [PubMed]
- Ogundokun, R.O.; Maskeliūnas, R.; Damaševičius, R. Human Posture Detection Using Image Augmentation and Hyperparameter-Optimized Transfer Learning Algorithms. Appl. Sci. 2022, 12, 10156. [Google Scholar] [CrossRef]
- Le, N.Q.K. Potential of deep representative learning features to interpret sequence information in proteomics. Proteomics 2021, 22, e2100232. [Google Scholar] [CrossRef] [PubMed]
- Ali, F.; El-Sappagh, S.; Islam, S.R.; Kwak, D.; Ali, A.; Imran, M.; Kwak, K.S. A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Inf. Fusion 2020, 63, 208–222. [Google Scholar] [CrossRef]
- Choi, Y.A.; Park, S.J.; Jun, J.A.; Pyo, C.S.C.; Cho, K.H.; Lee, H.S.; Yu, J.H. Deep learning-based stroke disease prediction system using real-time biosignals. Sensors 2021, 21, 4269. [Google Scholar] [CrossRef] [PubMed]
- Pan, Y.; Fu, M.; Cheng, B.; Tao, X.; Guo, J. Enhanced deep learning-assisted convolutional neural network for heart disease prediction on the Internet of Medical Things platform. IEEE Access 2020, 8, 189503–189512. [Google Scholar] [CrossRef]
- Cao, Z.; Simon, T.; Wei, S.E.; Sheikh, Y. Realtime multiperson 2d pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7291–7299. [Google Scholar]
- Mehr, H.D.; Polat, H. Recognition of human activity in the smart home with the deep learning approach. In Proceedings of the 2019 7th International Istanbul Smart Grids and Cities Congress and Fair (ICSG), Istanbul, Turkey, 25–26 April 2019; pp. 149–153. [Google Scholar]
- Du, H.; He, Y.; Jin, T. Transfer learning for human activities classification using micro-Doppler spectrograms. In Proceedings of the 2018 IEEE International Conference on Computational Electromagnetics (ICCEM), Chengdu, China, 26–28 March 2018; pp. 1–3. [Google Scholar]
- Shi, X.; Li, Y.; Zhou, F.; Liu, L. Human activity recognition based on deep learning method. In Proceedings of the 2018 International Conference on Radar (RADAR), Brisbane, Australia, 27–31 August 2018; pp. 1–5. [Google Scholar]
- Sung, J.; Ponce, C.; Selman, B.; Saxena, A. Human activity detection from RGBD images. In Proceedings of the Workshops at the 25th AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015. [Google Scholar]
- Karpathy, A.; Toderici, G.; Shetty, S.; Leung, T.; Sukthankar, R.; Fei-Fei, L. Large-scale video classification with convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1725–1732. [Google Scholar]
- Laptev, I.; Marszalek, M.; Schmid, C.; Rozenfeld, B. Learning realistic human actions from movies. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 24–26 June 2008; pp. 1–8. [Google Scholar]
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef] [Green Version]
- Kumar, A.; Raj, E.D. Indian Institute of Information Technology Kottayam. Available online: https://ieee-dataport.org/ (accessed on 5 September 2022).
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 630–645. [Google Scholar]
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
- Keleş, S.; Günlü, A.; Ercanli, I. Estimation of aboveground stand carbon by combining Sentinel-1 and Sentinel-2 satellite data: A case study from Turkey. In Forest Resources Resilience and Conflicts; Elsevier: Amsterdam, The Netherlands, 2021; pp. 117–126. [Google Scholar]
- Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2009, 22, 1345–1359. [Google Scholar] [CrossRef]
- Mohanty, S.P.; Hughes, D.P.; Salathé, M. Using deep learning for image-based plant disease detection. Front. Plant Sci. 2016, 7, 1419. [Google Scholar] [CrossRef] [PubMed]
- Ho, E.S.L.; Chan, J.C.P.; Chan, D.C.K.; Shum, H.P.H.; Cheung, Y.; Yuen, P.C. Improving posture classification accuracy for depth sensor-based human activity monitoring in smart environments. Comput. Vis. Image Underst. 2016, 148, 97–110. [Google Scholar] [CrossRef] [Green Version]
- Gal, Y.; Ghahramani, Z. Dropout as a Bayesian Approximation: Representing model uncertainty in deep learning. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; pp. 1050–1059. [Google Scholar]
- Zhang, C.; Ma, Y. (Eds.) Ensemble Machine Learning: Methods and Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
- Chu, G. Machine learning for automation of Chromosome based Genetic Diagnostics. Digit. Vetensk. Ark. 2020, 832, 46. [Google Scholar]
- Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 315–323, JMLR Workshop and Conference Proceedings. [Google Scholar]
- Mishkin, D.; Matas, J. All you need is a good init. arXiv 2015, arXiv:1511.06422. [Google Scholar]
- Le, Q.V.; Ngiam, J.; Coates, A.; Lahiri, A.; Prochnow, B.; Ng, A.Y. On optimization methods for deep learning. In Proceedings of the 28th International Conference on International Conference on Machine Learning, Bellevue, WA, USA, 28 June 2011; pp. 265–272. [Google Scholar]
- Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 448–456. [Google Scholar]
- Samanta, D.; Chaudhury, P.P.; Ghosh, A. Scab disease detection of potatoes using image processing. Int. J. Comput. Trends Technol. 2012, 3, 109–113. [Google Scholar]
- Athanikar, G.; Badar, P. Potato leaf disease detection and classification system. Int. J. Comput. Sci. Mob. Comput. 2016, 5, 76–88. [Google Scholar]
- Wang, H.; Li, G.; Ma, Z.; Li, X. Application of neural networks to image recognition of plant diseases. In Proceedings of the 2012 International Conference on Systems and Informatics (ICSAI2012), Shandong, China, 19–20 May 2012; pp. 2159–2164. [Google Scholar]
- Ogundokun, R.O.; Misra, S.; Douglas, M.; Damaševičius, R.; Maskeliūnas, R. Medical Internet-of-Things Based Breast Cancer Diagnosis Using Hyperparameter-Optimized Neural Networks. Future Internet 2022, 14, 153. [Google Scholar] [CrossRef]
- Sladojevic, S.; Arsenovic, M.; Anderla, A.; Culibrk, D.; Stefanovic, D. Deep neural network-based recognition of plant diseases by leaf image classification. Comput. Intell. Neurosci. 2016, 2016, 3289801. [Google Scholar] [CrossRef] [PubMed]
Authors | Approaches | Contributions | Limitations |
---|---|---|---|
Du et al. [12] | A technique for identifying human activity using a transfer-learned residual network based on micro-Doppler spectrograms. | The authors attained an accuracy of 97%; jitter is low and the results deflect little. | The study lacked the computational capacity to train on the dataset used in this study. |
Shi et al. [13] | DL approaches applied to spectrogram data, employing a generative adversarial network and a DCNN. | The study offers good scalability and a short training time. | Only 82% accuracy was obtained; jitter is high and the results deflect. |
Cao et al. [3] | An effective technique for recognizing 2D poses from multiple images containing several persons, using a greedy bottom-up analysis technique. | Delivers high accuracy with low jitter; the outcomes deflect very little and are suitable for real-time use. | The study requires extra computational power. |
Ning et al. [7] | A technique to increase classification accuracy by enhancing the single-shot detection (SSD) technique, developed by integrating architectural features. | The technique is robust. | Jitter is too high; the results frequently deflect, with lower accuracy for small images. |
Caba et al. [10] | Three possible applications of ActivityNet were suggested: untrimmed video classification, trimmed activity classification, and activity recognition. | Additional varieties of classification and activity. | The accuracy attained is low. |
Sung et al. [14] | An inexpensive RGBD sensor provided the input dataset; the authors employed a two-layer maximum-entropy Markov model (MEMM). | The study achieved a training accuracy of 84.3%. | The research achieved a low accuracy of 64.2% on the testing dataset. |
Karpathy et al. [15] | The UCF-101 video dataset was used to develop a slow-fusion CNN model. | The slow-fusion model outperformed the early- and late-fusion networks, reaching 80% accuracy on UCF-101. | Little or no distinction was made among human-object interaction, person-person body-motion interaction, and playing instruments. |
Laptev et al. [16] | Local space-time features, space-time pyramids, and multi-channel nonlinear SVMs were employed to improve existing results on the standard KTH action dataset. | The suggested technique attains a precision of 91.8% on the KTH action dataset. | The suggested technique requires scaling the script-to-video collection to a much larger dataset. |
Posture Classes | Training Set | Validation Set | Testing Set |
---|---|---|---|
0 | 867 | 153 | 180 |
1 | 867 | 153 | 180 |
2 | 867 | 153 | 180 |
3 | 867 | 153 | 180 |
Total | 3468 | 612 | 720 |
Class Label | Posture Class Name | Number of Postures |
---|---|---|
0 | Bending | 1200 |
1 | Lying | 1200 |
2 | Sitting | 1200 |
3 | Standing | 1200 |
Total | | 4800
Model | Layers | Parameters | Layers in Base Model | Size (MB) | Depth |
---|---|---|---|---|---|
InceptionV3 | 48 | 23.9M | 311 | 92 | 189 |
DenseNet-121 | 121 | 8.1M | 427 | 33 | 242 |
ResNet-50V2 | 164 | 25.6M | 190 | 98 | 103 |
Model | Training Accuracy (%) | Validation Accuracy (%) | Testing Accuracy (%) | Training Loss | Validation Loss | Testing Loss |
---|---|---|---|---|---|---|
InceptionV3 | 80.07 | 90.20 | 92.08 | 0.4436 | 0.3824 | 0.3361 |
ResNet-50V2 | 85.47 | 89.54 | 91.53 | 0.3518 | 0.7053 | 0.3784 |
DenseNet-121 | 91.06 | 91.99 | 93.47 | 0.2420 | 0.3242 | 0.2421 |
DeneSVM | 97.06 | 93.79 | 94.72 | 0.1318 | 0.4338 | 0.2918 |
Model | Label | Precision | Recall | F1-Score | Accuracy | AUC (%) |
---|---|---|---|---|---|---|
InceptionV3 | Bending | 0.97 | 0.88 | 0.92 | 0.92 | 98.46 |
InceptionV3 | Lying | 0.97 | 0.94 | 0.96 | | |
InceptionV3 | Sitting | 0.84 | 0.93 | 0.88 | | |
InceptionV3 | Standing | 0.91 | 0.93 | 0.92 | | |
DenseNet-121 | Bending | 0.94 | 0.94 | 0.94 | 0.93 | 99.05 |
DenseNet-121 | Lying | 0.99 | 0.94 | 0.96 | | |
DenseNet-121 | Sitting | 0.86 | 0.95 | 0.90 | | |
DenseNet-121 | Standing | 0.96 | 0.92 | 0.94 | | |
ResNet-50V2 | Bending | 0.93 | 0.93 | 0.93 | 0.93 | 98.73 |
ResNet-50V2 | Lying | 1.00 | 0.93 | 0.96 | | |
ResNet-50V2 | Sitting | 0.86 | 0.95 | 0.90 | | |
ResNet-50V2 | Standing | 0.94 | 0.90 | 0.92 | | |
DeneSVM (proposed model) | Bending | 0.96 | 0.94 | 0.95 | 0.95 | 99.36 |
DeneSVM (proposed model) | Lying | 0.98 | 0.96 | 0.97 | | |
DeneSVM (proposed model) | Sitting | 0.90 | 0.96 | 0.93 | | |
DeneSVM (proposed model) | Standing | 0.96 | 0.93 | 0.94 | | |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ogundokun, R.O.; Maskeliūnas, R.; Misra, S.; Damasevicius, R. A Novel Deep Transfer Learning Approach Based on Depth-Wise Separable CNN for Human Posture Detection. Information 2022, 13, 520. https://doi.org/10.3390/info13110520