Sensor-Based Human Activity Recognition Using Adaptive Class Hierarchy
Abstract
1. Introduction
- We show the effectiveness of the B-CNN model in sensor-based activity recognition. In addition, by examining how the number of subjects used for training affects the model's recognition performance, we find that the B-CNN model is particularly effective when the amount of training data is small.
- By examining the relationship between the class hierarchy provided to the B-CNN and its recognition performance, we found that an inappropriate class hierarchy decreases the model's recognition performance, indicating that the class hierarchy is an important factor affecting B-CNN performance.
- This verification also revealed that class hierarchies designed by humans are not always optimal.
- To construct class hierarchies that work effectively for B-CNNs, we propose a method for automatically constructing class hierarchies based on the distances among classes in the feature space, and we demonstrate the method's effectiveness.
2. Related Works
2.1. Human Activity Recognition
2.2. Usage of Class Hierarchy in Human Activity Recognition
2.3. Usage of Class Hierarchy in Computer Vision
2.4. Automatically Constructing Class Hierarchy
2.5. Hierarchical Multi-Label Classification
3. Class Hierarchy-Adaptive B-CNN Model
Algorithm 1 Class hierarchy-adaptive B-CNN.
Input: training dataset for the B-CNN; split rate; dimension for PCA, d; number of coarse levels in the class hierarchy, L.
Output: trained B-CNN model.
- The task can be formulated as a classification problem, and the estimation targets can be grouped into abstract concepts.
- The classification model is a deep network composed of stacked convolutional layers, such as VGG or ResNet.
- The entire model can be pre-trained end-to-end (e.g., with softmax cross-entropy loss) so that our method can design a class hierarchy from its features.
3.1. Branch Convolutional Neural Network (B-CNN)
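The B-CNN attaches a branch classifier at each coarse level and trains the network with a weighted sum of per-branch cross-entropy losses, shifting the loss weights from coarse to fine branches as training progresses. The sketch below illustrates this weighting scheme in plain Python; the epoch thresholds and weight values are illustrative assumptions, not the settings used in the paper.

```python
import math

def cross_entropy(probs, label):
    # Negative log-likelihood of the true class; probs is a probability vector.
    return -math.log(max(probs[label], 1e-12))

def bcnn_loss(branch_probs, branch_labels, weights):
    # Weighted sum of per-branch cross-entropy losses, ordered coarse -> fine.
    assert len(branch_probs) == len(branch_labels) == len(weights)
    return sum(w * cross_entropy(p, y)
               for w, p, y in zip(weights, branch_probs, branch_labels))

def loss_weights(epoch, schedule):
    # Epoch-based schedule: emphasis moves from the coarsest branch to the
    # fine branch over training (thresholds and values are assumptions).
    for threshold, weights in schedule:
        if epoch < threshold:
            return weights
    return schedule[-1][1]

# Example: a 3-level hierarchy (two coarse branches plus the fine output).
schedule = [
    (10, [0.98, 0.01, 0.01]),  # early epochs: focus on the coarsest branch
    (20, [0.10, 0.80, 0.10]),  # middle: focus on the intermediate branch
    (30, [0.10, 0.20, 0.70]),  # late: focus on the fine branch
]
probs = [[0.7, 0.3], [0.5, 0.3, 0.2], [0.4, 0.2, 0.2, 0.1, 0.1]]
labels = [0, 0, 0]
loss = bcnn_loss(probs, labels, loss_weights(5, schedule))
```

In an actual training loop, `branch_probs` would be the softmax outputs of the branch heads and the schedule would be tuned per dataset; the scheme above only reproduces the coarse-to-fine weighting idea from the B-CNN paper [Zhu and Bain].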
3.2. Automatic Construction of Class Hierarchies
Algorithm 2 Constructing a class hierarchy.
Input: training dataset for the B-CNN; split rate; dimension for PCA, d; number of target classes, C; number of coarse levels in the class hierarchy, L.
Output: hierarchical multi-labels based on the constructed class hierarchy, P.
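The construction step can be sketched as follows: compute a mean feature vector per fine class (in the paper these features come from a pre-trained backbone and are reduced by PCA; here raw vectors stand in for them), group the classes by agglomerative clustering, and map each fine label to its coarse label at every level. This is a minimal pure-Python sketch that substitutes average linkage for Ward's method; all function names and the toy data are illustrative.

```python
def class_means(features, labels):
    # Mean feature vector per fine class.
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def euclidean(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

def agglomerate(means, n_clusters):
    # Average-linkage agglomerative clustering of classes down to n_clusters
    # groups; each group becomes one coarse class.
    clusters = [[y] for y in means]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(euclidean(means[a], means[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters.pop(j)
    return clusters

def hierarchical_labels(fine_label, hierarchy_levels):
    # Map a fine label to its coarse label at each level, coarse -> fine.
    out = [next(k for k, c in enumerate(clusters) if fine_label in c)
           for clusters in hierarchy_levels]
    return out + [fine_label]

# Toy example: four classes with 1-D mean features; the two slow classes and
# the two fast classes should form the two coarse groups.
means = {"stay": [0.0], "sit": [0.2], "jog": [5.0], "skip": [5.1]}
coarse = agglomerate(means, 2)
labels_jog = hierarchical_labels("jog", [coarse])
```

The real method additionally uses the split rate to hold out data for feature extraction and repeats the clustering for each of the L coarse levels; Ward's linkage (reference [39]) would replace the average linkage used here.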
4. Experimental Settings
4.1. Dataset
4.2. Model Structure
4.3. Training Model
4.4. Evaluating Model
5. Experimental Results
5.1. Discussion on the Effectiveness of B-CNNs
5.1.1. Effectiveness of B-CNNs
5.1.2. A Study on the Effect of Backbone Architecture on the Recognition Performance of B-CNNs
5.1.3. A Study on the Effect of Different Class Hierarchies on the Recognition Performance of B-CNNs
5.1.4. Search Costs of Class Hierarchy
5.2. Discussion of the Proposed Method for Automatic Construction of Class Hierarchies
5.3. Discussion on Class Hierarchy Designed Using the Proposed Method
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Lara, O.; Labrador, M. A Survey on Human Activity Recognition Using Wearable Sensors. IEEE Commun. Surv. Tut. 2013, 15, 1192–1209.
- Wang, J.; Chen, Y.; Hao, S.; Peng, X.; Hu, L. Deep Learning for Sensor-based Activity Recognition: A Survey. Pattern Recogn. Lett. 2017, 119, 3–11.
- Silla, C.; Freitas, A. A survey of hierarchical classification across different application domains. Data Min. Knowl. Disc. 2011, 22, 31–72.
- Bilal, A.; Jourabloo, A.; Ye, M.; Liu, X.; Ren, L. Do Convolutional Neural Networks Learn Class Hierarchy? IEEE Trans. Vis. Comput. Graph. 2018, 24, 152–162.
- Fazli, M.; Kowsari, K.; Gharavi, E.; Barnes, L.; Doryab, A. HHAR-net: Hierarchical Human Activity Recognition using Neural Networks. In Proceedings of the 12th International Conference on Intelligent Human Computer Interaction (IHCI), Daegu, South Korea, 24–26 December 2020; pp. 48–58.
- Cho, H.; Yoon, S.M. Divide and Conquer-Based 1D CNN Human Activity Recognition Using Test Data Sharpening. Sensors 2018, 18, 1055.
- Zhu, X.; Bain, M. B-CNN: Branch Convolutional Neural Network for Hierarchical Classification. arXiv 2017, arXiv:1709.09890.
- Minh Dang, L.; Min, K.; Wang, H.; Jalil Piran, M.; Hee Lee, C.; Moon, H. Sensor-based and vision-based human activity recognition: A comprehensive survey. Pattern Recognit. 2020, 108, 107561.
- Rodríguez-Moreno, I.; Martínez-Otzeta, J.M.; Sierra, B.; Rodriguez, I.; Jauregi, E. Video Activity Recognition: State-of-the-Art. Sensors 2019, 19, 3160.
- Zeng, M.; Nguyen, L.T.; Yu, B.; Mengshoel, O.; Zhu, J.; Wu, P. Convolutional Neural Networks for Human Activity Recognition using Mobile Sensors. In Proceedings of the 2014 6th International Conference on Mobile Computing, Applications and Services (MobiCASE), Austin, TX, USA, 6–7 November 2014; pp. 197–205.
- Chen, Y.; Xue, Y. A Deep Learning Approach to Human Activity Recognition Based on Single Accelerometer. In Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics, Hong Kong, China, 9–12 October 2015; pp. 1488–1492.
- Yang, J.B.; Nguyen, M.N.; San, P.P.; Li, X.L.; Krishnaswamy, S. Deep Convolutional Neural Networks on Multichannel Time Series for Human Activity Recognition. In Proceedings of the 24th International Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, 25–31 July 2015; pp. 3995–4001.
- Ordóñez, F.J.; Roggen, D. Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition. Sensors 2016, 16, 115.
- Xu, C.; Chai, D.; He, J.; Zhang, X.; Duan, S. InnoHAR: A Deep Neural Network for Complex Human Activity Recognition. IEEE Access 2019, 7, 9893–9902.
- Xia, K.; Huang, J.; Wang, H. LSTM-CNN Architecture for Human Activity Recognition. IEEE Access 2020, 8, 56855–56866.
- Gao, W.; Zhang, L.; Teng, Q.; He, J.; Wu, H. DanHAR: Dual Attention Network for multimodal human activity recognition using wearable sensors. Appl. Soft Comput. 2021, 111, 107728.
- Ma, H.; Li, W.; Zhang, X.; Gao, S.; Lu, S. AttnSense: Multi-level Attention Mechanism For Multimodal Human Activity Recognition. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), Macao, China, 10–16 August 2019; pp. 3109–3115.
- Zheng, Y. Human Activity Recognition Based on the Hierarchical Feature Selection and Classification Framework. J. Electr. Comput. Eng. 2015, 2015, 140820.
- Wang, A.; Chen, G.; Wu, X.; Liu, L.; An, N.; Chang, C.Y. Towards Human Activity Recognition: A Hierarchical Feature Selection Framework. Sensors 2018, 18, 3629.
- Khan, A.; Lee, S.Y.; Kim, T.S. A Triaxial Accelerometer-Based Physical-Activity Recognition via Augmented-Signal Features and a Hierarchical Recognizer. IEEE Trans. Inf. Technol. Biomed. 2010, 14, 1166–1172.
- Leutheuser, H.; Schuldhaus, D.; Eskofier, B.M. Hierarchical, Multi-Sensor Based Classification of Daily Life Activities: Comparison with State-of-the-Art Algorithms Using a Benchmark Dataset. PLoS ONE 2013, 8, e75196.
- van Kasteren, T.L.M.; Englebienne, G.; Kröse, B.J.A. Hierarchical Activity Recognition Using Automatically Clustered Actions. In Proceedings of the 2nd International Conference on Ambient Intelligence, Amsterdam, The Netherlands, 16–18 November 2011; pp. 82–91.
- Yan, Z.; Zhang, H.; Piramuthu, R.; Jagadeesh, V.; DeCoste, D.; Di, W.; Yu, Y. HD-CNN: Hierarchical Deep Convolutional Neural Networks for Large Scale Visual Recognition. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 2740–2748.
- Liu, Y.; Dou, Y.; Jin, R.; Qiao, P. Visual Tree Convolutional Neural Network in Image Classification. In Proceedings of the 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; pp. 758–763.
- Fu, R.; Li, B.; Gao, Y.; Wang, P. CNN with coarse-to-fine layer for hierarchical classification. IET Comput. Vision 2018, 12, 892–899.
- Huo, Y.; Lu, Y.; Niu, Y.; Lu, Z.; Wen, J.R. Coarse-to-Fine Grained Classification. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 1033–1036.
- Deng, J.; Ding, N.; Jia, Y.; Frome, A.; Murphy, K.; Bengio, S.; Li, Y.; Neven, H.; Adam, H. Large-Scale Object Classification Using Label Relation Graphs. In Proceedings of the 13th European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014; pp. 48–64.
- Koo, J.; Klabjan, D.; Utke, J. Combined Convolutional and Recurrent Neural Networks for Hierarchical Classification of Images. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; pp. 1354–1361.
- Liu, S.; Yi, H.; Chia, L.T.; Rajan, D. Adaptive hierarchical multi-class SVM classifier for texture-based image classification. In Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands, 6 July 2005; p. 4.
- Wang, Y.C.; Casasent, D. A hierarchical classifier using new support vector machine. In Proceedings of the 8th International Conference on Document Analysis and Recognition (ICDAR), Seoul, Korea, 31 August–1 September 2005; Volume 2, pp. 851–855.
- Yuan, X.; Lai, W.; Mei, T.; Hua, X.; Wu, X.; Li, S. Automatic Video Genre Categorization using Hierarchical SVM. In Proceedings of the 2006 International Conference on Image Processing, Atlanta, GA, USA, 8–11 October 2006; pp. 2905–2908.
- Marszalek, M.; Schmid, C. Constructing Category Hierarchies for Visual Recognition. In Proceedings of the 10th European Conference on Computer Vision (ECCV), Marseille, France, 12–18 October 2008; Volume 5305, pp. 479–491.
- Griffin, G.; Perona, P. Learning and using taxonomies for fast visual categorization. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, AK, USA, 23–28 June 2008; pp. 1–8.
- Zhigang, L.; Wenzhong, S.; Qianqing, Q.; Xiaowen, L.; Donghui, X. Hierarchical support vector machines. In Proceedings of the 2005 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Seoul, Korea, 29 July 2005; Volume 1, p. 4.
- Cevikalp, H. New Clustering Algorithms for the Support Vector Machine Based Hierarchical Classification. Pattern Recogn. Lett. 2010, 31, 1285–1291.
- Ge, W. Deep Metric Learning with Hierarchical Triplet Loss. In Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 269–285.
- Jin, R.; Dou, Y.; Wang, Y.; Niu, X. Confusion Graph: Detecting Confusion Communities in Large Scale Image Classification. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, Australia, 19–25 August 2017; pp. 1980–1986.
- Blondel, V.; Guillaume, J.L.; Lambiotte, R.; Lefebvre, E. Fast Unfolding of Communities in Large Networks. J. Stat. Mech. Theory Exp. 2008, 2008, 10008.
- Wehrmann, J.; Cerri, R.; Barros, R. Hierarchical Multi-Label Classification Networks. In Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July 2018; Volume 80, pp. 5075–5084.
- Giunchiglia, E.; Lukasiewicz, T. Coherent Hierarchical Multi-Label Classification Networks. In Proceedings of the 34th Conference on Advances in Neural Information Processing Systems (NeurIPS), Online, 6–12 December 2020; Volume 33, pp. 9662–9673.
- Ji, S.; Xu, W.; Yang, M.; Yu, K. 3D Convolutional Neural Networks for Human Action Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 221–231.
- Zhou, Y.; Sun, X.; Zha, Z.J.; Zeng, W. MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 449–458.
- Ward, J.H., Jr. Hierarchical Grouping to Optimize an Objective Function. J. Am. Stat. Assoc. 1963, 58, 236–244.
- Ichino, H.; Kaji, K.; Sakurada, K.; Hiroi, K.; Kawaguchi, N. HASC-PAC2016: Large Scale Human Pedestrian Activity Corpus and Its Baseline Recognition. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, Heidelberg, Germany, 12–16 September 2016; pp. 705–714.
- Kwapisz, J.R.; Weiss, G.M.; Moore, S.A. Activity Recognition Using Cell Phone Accelerometers. SIGKDD Explor. Newsl. 2011, 12, 74–82.
- Micucci, D.; Mobilio, M.; Napoletano, P. UniMiB SHAR: A Dataset for Human Activity Recognition Using Acceleration Data from Smartphones. Appl. Sci. 2017, 7, 1101.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015; pp. 1–14.
- Hasegawa, T.; Koshino, M. Representation Learning by Convolutional Neural Network for Smartphone Sensor Based Activity Recognition. In Proceedings of the 2019 2nd International Conference on Computational Intelligence and Intelligent Systems, Bangkok, Thailand, 23–25 November 2019; pp. 99–104.
- Takahashi, R.; Matsubara, T.; Uehara, K. RICAP: Random Image Cropping and Patching Data Augmentation for Deep CNNs. In Proceedings of the 10th Asian Conference on Machine Learning, Beijing, China, 14–16 November 2018; Volume 95, pp. 786–798.
- Hasegawa, T. Octave Mix: Data Augmentation Using Frequency Decomposition for Activity Recognition. IEEE Access 2021, 9, 53679–53686.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
| Dataset | Train | Validation | Test |
|---|---|---|---|
| HASC | 10 | 50 | 50 |
| WISDM | 12 | 12 | 12 |
| UniMib SHAR | 10 | 10 | 10 |
| Model | Class Hierarchy | Branch | HASC Accuracy | HASC F-Score | WISDM Accuracy | WISDM F-Score | UniMib Accuracy | UniMib F-Score |
|---|---|---|---|---|---|---|---|---|
| stdvgg16 | - | - | 0.803 | 0.806 | 0.866 | 0.799 | 0.725 | 0.607 |
| branchvgg16 | hand-crafted | ✓ | 0.819 | 0.823 | 0.887 | 0.830 | 0.723 | 0.620 |
| branchvgg16 | Jin et al. [37] | ✓ | 0.810 | 0.814 | 0.870 | 0.801 | 0.719 | 0.607 |
| branchvgg16 | ours | ✓ | 0.814 | 0.817 | 0.881 | 0.827 | 0.728 | 0.614 |
| Backbone Architecture | w/o Branch Accuracy | w/o Branch F-Score | w/ Branch Accuracy | w/ Branch F-Score |
|---|---|---|---|---|
| VGG11 | 0.802 | 0.806 | 0.816 | 0.820 |
| VGG13 | 0.808 | 0.812 | 0.821 | 0.824 |
| VGG16 | 0.803 | 0.806 | 0.819 | 0.823 |
| VGG16-S | 0.811 | 0.815 | 0.821 | 0.825 |
| VGG16-W | 0.802 | 0.805 | 0.811 | 0.814 |
| VGG19 | 0.801 | 0.805 | 0.808 | 0.811 |
| ResNet18 | 0.809 | 0.809 | 0.807 | 0.811 |
| ResNet50 | 0.797 | 0.799 | 0.798 | 0.801 |
| LSTM-CNN | 0.815 | 0.818 | 0.820 | 0.824 |
HASC best hierarchy:

| Level | Classes |
|---|---|
| 1 | stay, skip, walk, stup, jog, stdown |
| 2 | stay, skip, walk, stup, jog, stdown |
| 3 | stay, skip, walk, stup, jog, stdown |

HASC worst hierarchy:

| Level | Classes |
|---|---|
| 1 | stay, jog, walk, stdown, skip, stup |
| 2 | stay, jog, walk, stdown, skip, stup |
| 3 | stay, jog, walk, stdown, skip, stup |
(a) HASC

| Level | Hand-Crafted | Proposed Method |
|---|---|---|
| 1 | stay, walk, stup, stdown, jog, skip | stay, walk, stup, stdown, jog, skip |
| 2 | stay, walk, stup, stdown, jog, skip | stay, walk, stup, stdown, jog, skip |
| 3 | stay, walk, stup, stdown, jog, skip | stay, walk, stup, stdown, jog, skip |

(b) WISDM

| Level | Hand-Crafted | Proposed Method |
|---|---|---|
| 1 | jog, walk, stup, stdown, sit, stand | jog, walk, stup, stdown, sit, stand |
| 2 | jog, walk, stup, stdown, sit, stand | jog, walk, stup, stdown, sit, stand |
| 3 | jog, walk, stup, stdown, sit, stand | jog, walk, stup, stdown, sit, stand |
Hand-crafted method:

| Level | Classes |
|---|---|
| 1 | jog, walk, stup, stdown, jump, standFS, standFL, layFS, sit, fallF, fallPS, fallR, fallL, fallB, fallBSC, hitO, syncope |
| 2 | jog, walk, stup, stdown, jump, standFS, standFL, layFS, sit, fallF, fallPS, fallR, fallL, fallB, fallBSC, hitO, syncope |
| 3 | jog, walk, stup, stdown, jump, standFS, standFL, layFS, sit, fallF, fallPS, fallR, fallL, fallB, fallBSC, hitO, syncope |

Proposed method:

| Level | Classes |
|---|---|
| 1 | jog, walk, stup, stdown, jump, standFS, standFL, layFS, sit, fallF, fallPS, fallR, fallL, fallB, fallBSC, hitO, syncope |
| 2 | jog, walk, stup, stdown, jump, standFS, standFL, layFS, sit, fallF, fallPS, fallR, fallL, fallB, fallBSC, hitO, syncope |
| 3 | jog, walk, stup, stdown, jump, standFS, standFL, layFS, sit, fallF, fallPS, fallR, fallL, fallB, fallBSC, hitO, syncope |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kondo, K.; Hasegawa, T. Sensor-Based Human Activity Recognition Using Adaptive Class Hierarchy. Sensors 2021, 21, 7743. https://doi.org/10.3390/s21227743