Deep Learning Algorithms for Human Activity Recognition in Manual Material Handling Tasks
Abstract
1. Introduction
1. BiLSTM;
2. Sparse Denoising Autoencoder (Sp-DAE);
3. Recurrent Sp-DAE;
4. Recurrent CNN (RCNN).
2. Background
2.1. Recurrent Neural Network
2.2. Autoencoder
2.3. Convolutional Neural Networks
2.4. Hybrid Networks
2.4.1. Recurrent Convolutional Neural Networks
2.4.2. Recurrent Autoencoder Networks
2.5. Selected Architectures
3. Dataset and Methodology
3.1. Dataset and Networks Input
3.2. Network Architectures
3.2.1. BiLSTM
3.2.2. Autoencoder
3.2.3. Recurrent Sp-DAE
3.2.4. RCNN
3.3. Network Training and Testing
Metrics
4. Results
4.1. Best Parameters Selection
4.1.1. BiLSTM
4.1.2. Sp-DAE
4.1.3. Recurrent Sp-DAE
4.1.4. RCNN
4.2. Network Architecture Comparison
4.3. Performance of the Selected Networks with LOSO Validation
4.4. Comparison of the Selected Networks with SoA
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
BiLSTM training hyperparameters:

| Hyperparameter | Value |
|---|---|
| Input size | 10 |
| Optimizer | Adam |
| Maximum epochs | 100, 300, 500, 700, 1000, 1500, 2000, 2500, 3000, 3500 |
| Hidden units | 100, 300, 500, 700, 900 |
| Batch size | 128 |
| Initial learning rate | 1 × |
| Learning rate drop factor | |
| Learning rate drop period | 10 |
| L2 regularization | 1 × |
| Loss function | Cross-entropy loss |
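
As a concrete reference for the configuration above, the following is a minimal PyTorch sketch of such a BiLSTM classifier, not the authors' implementation. The framework choice, the 7-class head (one class per action in the results below), the 240-sample window (taken from the RCNN input size listed later), and the numeric learning-rate, drop-factor, and L2 values (truncated in the table) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, input_size=10, hidden_units=300, num_classes=7):
        super().__init__()
        # Bidirectional LSTM reads each window forwards and backwards.
        self.bilstm = nn.LSTM(input_size, hidden_units,
                              batch_first=True, bidirectional=True)
        # Forward and backward states are concatenated, hence 2 * hidden.
        self.fc = nn.Linear(2 * hidden_units, num_classes)

    def forward(self, x):              # x: (batch, time, 10 channels)
        out, _ = self.bilstm(x)
        return self.fc(out[:, -1, :])  # classify from the last time step

model = BiLSTMClassifier()
# lr, weight_decay, and gamma stand in for the truncated table values;
# step_size=10 matches the learning rate drop period.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
criterion = nn.CrossEntropyLoss()      # cross-entropy loss, as in the table

x = torch.randn(128, 240, 10)          # one batch of windows (batch size 128)
loss = criterion(model(x), torch.randint(0, 7, (128,)))
loss.backward()
optimizer.step()
scheduler.step()
```
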
Sp-DAE training hyperparameters:

| Hyperparameter | Value |
|---|---|
| Input size | 10 |
| Maximum epochs | 100, 300, 500, 700, 1000, 1300, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500 |
| Hidden units | 100, 300, 500, 700, 900, 1100, 1300, 2000, 2500, 3000, 3500 |
| Training algorithm | Conjugate gradient descent |
| Sparsity regularization | 1 |
| Sparsity proportion | |
| L2 regularization | 1 × |
| Transfer function | log-sigmoid |
| Loss function | Sparse MSE |
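
The sparse-MSE objective above combines an MSE reconstruction term with a sparsity penalty on the hidden activations; a hedged PyTorch sketch follows. The KL-divergence form of the penalty, the rho = 0.05 sparsity target (the table's proportion is truncated), the corruption level, and the use of Adam in place of conjugate-gradient training are assumptions; the log-sigmoid transfer and the unit sparsity-regularization weight follow the table.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpDAE(nn.Module):
    def __init__(self, n_in=10 * 240, hidden=300):
        super().__init__()
        self.enc = nn.Linear(n_in, hidden)
        self.dec = nn.Linear(hidden, n_in)

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))           # log-sigmoid transfer
        return torch.sigmoid(self.dec(h)), h

def sparse_mse(x, x_hat, h, rho=0.05, beta=1.0):
    # MSE reconstruction term plus a KL sparsity penalty (weight beta = 1,
    # matching the table; rho is a placeholder for the truncated proportion).
    rho_hat = h.mean(dim=0).clamp(1e-6, 1 - 1e-6)
    kl = (rho * torch.log(rho / rho_hat)
          + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat))).sum()
    return F.mse_loss(x_hat, x) + beta * kl

model = SpDAE()
opt = torch.optim.Adam(model.parameters(), weight_decay=1e-4)  # L2 term
x = torch.rand(128, 10 * 240)                    # flattened windows in [0, 1]
x_noisy = (x + 0.1 * torch.randn_like(x)).clamp(0, 1)  # denoising corruption
x_hat, h = model(x_noisy)
loss = sparse_mse(x, x_hat, h)                   # reconstruct the clean input
loss.backward()
opt.step()
```
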
RCNN architecture and training hyperparameters:

| Hyperparameter | Value |
|---|---|
| Input size | 10 × 240 |
| Filter size | 3 × 3 |
| Number of filters | 32 |
| Padding | 0 |
| 1st CNN layer stride | 1 × 1 |
| 2nd CNN layer stride | 1 × 4 |
| 1st CNN layer dilation factor | 1 × 1 |
| 2nd CNN layer dilation factor | 2 × 2 |
| Maximum epochs | 100, 300, 500, 700, 1000 |
| Hidden units | 100, 300, 500, 700, 900 |
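
With these settings (no padding), a 1 × 10 × 240 input shrinks to 32 × 8 × 238 after the first convolution and 32 × 4 × 59 after the second (the 2 × 2 dilation makes the effective kernel 5 × 5). One plausible sketch of the recurrent stage, shown below in PyTorch, unrolls the remaining width as the time axis of an LSTM; the LSTM placement, ReLU activations, and the 7-class head are assumptions, only the convolutional settings come from the table.

```python
import torch
import torch.nn as nn

class RCNN(nn.Module):
    def __init__(self, hidden_units=100, num_classes=7):
        super().__init__()
        # Two conv layers with the table's kernel, stride, and dilation.
        self.conv1 = nn.Conv2d(1, 32, 3, stride=(1, 1), dilation=(1, 1))
        self.conv2 = nn.Conv2d(32, 32, 3, stride=(1, 4), dilation=(2, 2))
        self.relu = nn.ReLU()
        # A 1 x 10 x 240 window becomes 32 x 4 x 59; the width (59) is
        # treated as the time axis of the recurrent layer.
        self.lstm = nn.LSTM(32 * 4, hidden_units, batch_first=True)
        self.fc = nn.Linear(hidden_units, num_classes)

    def forward(self, x):                     # x: (batch, 1, 10, 240)
        z = self.relu(self.conv2(self.relu(self.conv1(x))))
        z = z.permute(0, 3, 1, 2).flatten(2)  # (batch, 59, 32 * 4)
        out, _ = self.lstm(z)
        return self.fc(out[:, -1, :])

logits = RCNN()(torch.randn(8, 1, 10, 240))   # -> (8, 7)
```
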
Per-action recognition performance of the two selected networks, BiLSTM and RCNN:

| Action | BiLSTM | RCNN |
|---|---|---|
| N-pose (N) | 89.1% | 90.1% |
| Lifting from the Table (LT) | 94.5% | 92.4% |
| Placing on the Table (PT) | 84.0% | 85.1% |
| Lifting from the Floor (LF) | 90.3% | 88.7% |
| Placing on the Floor (PF) | 88.8% | 86.7% |
| Keeping lifted (K) | 92.3% | 91.4% |
| Carrying (W) | 93.4% | 90.2% |
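
Assuming the per-action figures above are per-class F1-scores (the table does not name the metric), they can be reproduced from predictions with scikit-learn; the label order and placeholder arrays below are illustrative.

```python
import numpy as np
from sklearn.metrics import f1_score

actions = ["N", "LT", "PT", "LF", "PF", "K", "W"]
y_true = np.random.randint(0, 7, 1000)    # placeholder ground-truth labels
y_pred = np.random.randint(0, 7, 1000)    # placeholder network predictions

# average=None returns one score per class, in label order.
for name, score in zip(actions, f1_score(y_true, y_pred, average=None)):
    print(f"{name}: {score:.1%}")
```
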
Comparison of the selected networks with the state-of-the-art DeepConvLSTM:

| Parameter | BiLSTM | RCNN | DeepConvLSTM |
|---|---|---|---|
| F1-score 70–30 split | 95.7% | 95.9% | 95.2% |
| Accuracy 70–30 split | 96.0% | 96.3% | 95.5% |
| Precision 70–30 split | 95.8% | 96.1% | 95.3% |
| F1-score LOSO | 90.6% | 89.2% | 90.3% |
| Accuracy LOSO | 97.2% | 96.9% | 97.2% |
| Precision LOSO | 91.0% | 89.4% | 90.3% |
| MAC (multiply-accumulate operations) | 1,492,200 | 4,212,028 | 327,027,584 |
| MA | 1,532,400 | 6,972,056 | 543,762,176 |
| LP | 750,607 | 756,675 | 15,989,191 |
| Memory (MB) | 2.863 | 2.886 | 60.994 |
| Latency (ms) | 0.076 | 0.346 | 7.7 |
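
The LOSO rows hold out one participant per fold. A minimal sketch with scikit-learn's LeaveOneGroupOut is shown below; the subject count, array shapes, and random placeholders are assumptions, not the dataset's actual composition.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

X = np.random.randn(1000, 10 * 240)        # flattened windows (placeholder)
y = np.random.randint(0, 7, 1000)          # action labels
subjects = np.random.randint(0, 12, 1000)  # hypothetical subject IDs

for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    # Fit on all subjects but one, evaluate on the held-out subject,
    # then average F1 / accuracy / precision across the folds.
    X_train, y_train = X[train_idx], y[train_idx]
    X_test, y_test = X[test_idx], y[test_idx]
```
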
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).