Article

Fall-from-Height Detection Using Deep Learning Based on IMU Sensor Data for Accident Prevention at Construction Sites

Department of Biomedical Engineering, Yonsei University, Wonju 26493, Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Sensors 2022, 22(16), 6107; https://doi.org/10.3390/s22166107
Submission received: 15 July 2022 / Revised: 5 August 2022 / Accepted: 8 August 2022 / Published: 16 August 2022
(This article belongs to the Collection Wearable Sensors for Risk Assessment and Injury Prevention)

Abstract
Workers at construction sites are prone to fall-from-height (FFH) accidents. The severity of injury can be represented by the peak acceleration value. In this study, the risk of FFH was predicted using IMU sensor data for accident prevention at construction sites. Fifteen general working movements (NF: non-fall), five low-hazard fall movements (LF), and five high-hazard FFH movements (HF) were performed by twenty male subjects and a dummy. An IMU sensor was attached at the T7 position of each subject to measure the three-axis acceleration and angular velocity. The peak acceleration value, calculated from the IMU data, was 4 g or less in general working movements and 9 g or more in FFHs. Regression analysis was performed by applying various deep learning models, including a 1D-CNN, a 2D-CNN, an LSTM, and a Conv-LSTM, to the risk prediction, and the models were compared in terms of their mean absolute error (MAE) and mean squared error (MSE). The FFH risk level was estimated based on the predicted peak acceleration. The Conv-LSTM model trained with the MAE showed the smallest error (MAE: 1.36 g), and the classification with the predicted peak acceleration showed the best accuracy (97.6%). This study successfully predicted FFH risk levels and could help reduce fatal injuries at construction sites.

1. Introduction

Fall-from-height (FFH) accidents account for an extremely high proportion of accidents at construction sites, with a fairly high mortality rate. Choi et al. [1] conducted a comparative analysis of accidents that occurred between 2011 and 2015 in three countries: the United States, Korea, and China. Accidents were found to occur frequently at construction sites, with the U.S. showing a 26% increase (from 781 to 985), while China and Korea showed a 28% decrease (from 2634 to 1891) and a 21% decrease (from 621 to 493), respectively. The average mortality rate was the highest in Korea (17.9 persons), followed by the U.S. and China (9.4 and 5.3 persons, respectively).
The Occupational Safety and Health Administration (OSHA) requires implementing physical safety measures to reduce such accidents at construction sites [2]. Primary protection measures include implementing guardrails, covers, safety nets, and physical safety devices, while secondary protection measures include the use of a personal fall arrest system (PFAS) whereby the impact of an FFH accident can be minimized [3]. A PFAS comprises a connector, full-body harness, lanyard, and rescue line, and may prevent a person from falling when properly configured [4]. A PFAS cannot prevent FFHs but can effectively avoid fatalities from FFHs [5]. Yang et al. [2] reported that fatalities due to losing balance can be avoided when the PFAS is properly used; however, if a worker is suspended from a PFAS for a prolonged time, there is a risk of suspension trauma, orthostatic intolerance, or other serious injuries, and workers still sustain injuries due to the incompleteness of a PFAS [6]. Furthermore, the effect of a PFAS is insignificant when a person falls from a height below 15 ft, and accidents occur because workers do not properly wear PFAS due to their inconvenience during work [7].
To address the limitations of interrupting the movements of workers, some studies have considered the use of visual devices [8,9,10]. Han et al. [8] analyzed four actions on ladders using depth information measured by Kinect and classified unsafe actions with 90.9% accuracy. Fang et al. [9] defined unsafe behaviors as situations in which workers and the structural supports overlap at construction sites. They developed an automatic computer-vision system with 90% recall and 75% precision using a CNN. Kong et al. [10] combined computer vision with long short-term memory (LSTM) to predict unsafe actions from video data, determining the safety of actions based on the predicted trajectories. Their model showed a mean intersection-over-union of 73.4%, a mean absolute precision of 92.9% (IOU: 0.5), and a mean absolute precision of 68.1% (IOU: 0.7). However, vision-based systems are limited in practice since workers might be obscured by structures on construction sites.
Researchers have actively developed fall detection algorithms to minimize injuries in the elderly from falls using wearable sensors [11,12,13]. Threshold-based methods have mainly been used for fall detection. Jung et al. [11] developed a fall detection algorithm with an accuracy of 92.4% and a lead time of 280.25 ± 10.29 ms, evaluated on the SisFall public dataset, and used a complementary filter to compute the vertical angles from the IMU sensor data. Ahn et al. [12] developed a hip protection system for the elderly, which comprises an IMU sensor, a non-gunpowder type inflator, and a wearable airbag with a threshold-based fall detection algorithm, and 100% accuracy and a 401.9 ± 46.9 ms lead time were obtained. Koo et al. [13] developed a post-fall detection algorithm based on machine learning using an IMU sensor. They used five different ranking algorithms to select feature subsets. The feature subsets selected by the T-score showed the best accuracy of 99.86%.
Several studies have been conducted on developing FFH detection algorithms by extending the aforementioned research [2,14,15]. Yang et al. [2] performed near-miss fall detection based on machine learning with IMU sensor data, where the algorithm showed 86.8% accuracy in the laboratory and 85.2% accuracy outdoors. Dogan and Akcamete [14] performed an FFH detection study by calculating the fall height from three-axis acceleration data, with an overall error rate of 10.8%. Kim et al. [15] developed an FFH detection algorithm by calculating the vertical velocity and the trunk angles from IMU data and reported 100% accuracy and a lead time of 301.8 ± 87.8 ms.
In order to increase survival rates after FFHs [16], it is important to predict the risk levels of falls. Arena et al. [17] experimentally confirmed that the peak acceleration of the head is between 4 and 11 m/s² during falls. The peak acceleration value is one of the key measurements affecting the severity of injury [18]. Kim et al. [19] predicted the impact of falls in the elderly from the peak acceleration value: a regression analysis using a deep learning algorithm on IMU sensor data yielded a mean absolute percent error of 6.69 ± 0.33% and an r value of 0.93. The risk of FFH accidents can thus be represented by the peak acceleration value. Risk prediction is both more necessary and more challenging for FFH accidents, since they result in more fatal injuries.
In this study, an IMU sensor was attached to the subject to obtain data regarding frequently observed or dangerous behaviors at construction sites. A human dummy was used to acquire data for falls from 2 m or above. Regression analysis was performed using various deep learning models (1D-CNN, 2D-CNN, LSTM, and Conv-LSTM) applied to the three-axis acceleration, three-axis angular velocity, and their SVM feature vectors. The risk level was estimated based on the predicted peak acceleration.

2. Materials and Methods

2.1. Experiment

A total of 20 healthy adult males (24.8 ± 2.0 years old, 173.5 ± 6.1 cm, 76.6 ± 13.0 kg) were recruited from Yonsei University for the study. Participants who had musculoskeletal problems were excluded. The experiment was conducted with the approval of the Yonsei University Mirae Campus IRB (1041849-202004-BM-042-02), and written consent was obtained from the participants [20].
An IMU module based on the MPU-9250 (InvenSense, San Jose, CA, USA) was attached to the T7 position of each subject. The three-axis acceleration and angular velocity signals were measured at a sampling frequency of 100 Hz. LabVIEW (National Instruments, Austin, TX, USA) was used to save the data on a PC.
Fifteen general working movements (NF: non-fall), five low-hazard fall movements (LF), and five high-hazard FFH movements (HF) were selected based on safety and health reports [8,21,22], and each was measured three times. For safety reasons, a dummy (Madamade, Chuncheon, Gangwon-do, Korea; height: 180 cm, weight: 10 kg) was used for the HF experiments with fall heights higher than 2 m. Vertical and forward falling movements (FFH) were performed by the dummy at heights of 2 m and 3 m, and these movements were repeated five times. Our previous study [15] revealed no differences between the data obtained when a person fell forward from 0.7 m and when a dummy fell forward from the same height, using an SPSS-based independent-sample t-test.
Table 1 lists the experimental movements in this study, comprising NFs, LFs, and HFs. Four HF movements (HF01, HF02, HF04, and HF05) were performed by the dummy, while HF03 was performed by the participants. All subjects performed the movements by following demonstration videos found on the Internet.

2.2. Pre-Processing

For training, 70% of the human data and 60% of the dummy data were used; the remaining 30% of the human data and 40% of the dummy data were used for testing. Two sum vector magnitude (SVM) values, the acceleration SVM (ASVM) and the gyro SVM (GSVM), were calculated from the acceleration and angular velocity data measured by the IMU sensor [19]. Eight features (Table 2) were used to train the deep learning models, which predicted the magnitude of the impact acceleration. The data from 0.7 s to 0.2 s prior to the peak ASVM value were extracted for the analysis (Figure 1). The data analysis was conducted on a desktop equipped with an Intel i7-12700 2.1 GHz CPU, 32 GB RAM, an NVIDIA GeForce RTX 3060 GPU, and Windows 11.
$$\mathrm{ASVM} = \sqrt{Acc_x^2 + Acc_y^2 + Acc_z^2}$$

$$\mathrm{GSVM} = \sqrt{Gyro_x^2 + Gyro_y^2 + Gyro_z^2}$$
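As a concrete illustration, the ASVM/GSVM computation and the extraction of the 0.7 s-to-0.2 s pre-peak window can be sketched as follows (a minimal NumPy sketch; the function names and default arguments are our own assumptions, not from the paper):

```python
import numpy as np

FS = 100  # sampling frequency (Hz), as used in the experiment

def sum_vector_magnitude(xyz):
    """Euclidean norm across the three axes for each sample (N x 3 -> N)."""
    return np.sqrt(np.sum(np.asarray(xyz, dtype=float) ** 2, axis=1))

def extract_window(asvm, gsvm, pre=0.7, post=0.2):
    """Extract the segment from `pre` s to `post` s before the peak ASVM.

    At 100 Hz this yields a 50-sample window ending 0.2 s before impact.
    """
    peak = int(np.argmax(asvm))
    start = peak - int(round(pre * FS))
    stop = peak - int(round(post * FS))
    if start < 0:
        raise ValueError("signal too short before the peak")
    return asvm[start:stop], gsvm[start:stop]
```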

2.3. Deep Learning Models

Python 3.9 (Python Software Foundation, Wilmington, DE, USA) and the TensorFlow 2.9.0 library were used for deep learning. One-dimensional CNNs are mostly used for 1D time-series analyses [23], whereas 2D-CNNs, with their 2D filters, are generally used for image classification or facial recognition [24]. Long short-term memory (LSTM) is a neural network model devised to overcome a drawback of the recurrent neural network (RNN): the loss of early input information through multiple layers. Unlike an RNN, an LSTM introduces a structure in which major information from the previous step is fed into the next step [25]. The Conv-LSTM combines the advantages of the CNN and the LSTM and is particularly useful for time-series prediction [26]: data features are extracted by the convolutional layers, and the LSTM receives the extracted features as input [27].
Figure 2 illustrates the structures of the four deep learning models. In Figure 2A,B, the 1D-CNN and 2D-CNN models each contain two convolution layers, with max-pooling layers reducing the computation and thereby helping to prevent overfitting [28]. Figure 2C shows two LSTM layers before the dropout layer. The Conv-LSTM comprised two convolution layers and two LSTM layers, with a max-pooling layer positioned after each convolution layer. Overfitting was further prevented by a dropout layer (rate 0.25) before the output. The basic hyperparameters of the four models were as follows: the Conv1D layers had a kernel size of 2 and a stride of 2; the max-pooling 1D and 2D layers had pool sizes of 2 and (2, 2), respectively; and the Conv2D layers had a kernel size of (4, 4) and a stride of (2, 2). The activation function was ReLU, and the batch size of each model was 1. In addition, an early stopping function was used to prevent overfitting: the models with an LSTM layer were trained for up to 50 epochs with a patience of 10, while for the two CNNs the epochs and patience were set to 200 and 100, respectively.
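A minimal Keras sketch of a Conv-LSTM of this kind is shown below. The kernel size, stride, pool size, ReLU activation, dropout rate, and MAE loss follow the text; the exact layer ordering, the input window shape (50 samples × 8 features), and the filter/unit counts (64, 64, 64, 16, matching the MAE-best setting in the Results) are our assumptions, not the authors' published code:

```python
from tensorflow.keras import layers, models

def build_conv_lstm(window_len=50, n_features=8):
    """Conv-LSTM regressor: two Conv1D blocks feeding two LSTM layers.

    Kernel size 2, stride 2, pool size 2, ReLU, and dropout 0.25 follow
    the text; filter/unit counts are illustrative (the MAE-best setting).
    """
    model = models.Sequential([
        layers.Input(shape=(window_len, n_features)),
        layers.Conv1D(64, kernel_size=2, strides=2, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(64, kernel_size=2, strides=2, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.LSTM(64, return_sequences=True),
        layers.LSTM(16),
        layers.Dropout(0.25),
        layers.Dense(1),  # predicted peak acceleration (g)
    ])
    model.compile(optimizer="adam", loss="mae")
    return model
```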
The number of convolution filters and LSTM memory units were optimized using a grid search method. The numbers of filters in the first and the second layers for both the 1D-CNN and 2D-CNN were set to 8, 16, 32, and 64, and the numbers of units in the LSTM were 8, 16, 32, and 64. The numbers of filters in the conv1D layer of the Conv-LSTM were 16 and 64, while the numbers of units in the LSTM layer were 16 and 64.
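The grid search above amounts to exhaustively training one model per hyperparameter combination and keeping the one with the lowest validation error. A plain-Python sketch (the `train_and_evaluate` callable is hypothetical, standing in for a full train-and-score run):

```python
from itertools import product

# Candidate filter counts for the two convolution layers (from the text)
CNN_FILTERS = [8, 16, 32, 64]

def grid_search(train_and_evaluate):
    """Try every (first-layer, second-layer) filter combination.

    `train_and_evaluate` is a user-supplied callable (hypothetical here)
    that trains a model with the given filter counts and returns its
    validation error (MAE or MSE); the lowest error wins.
    """
    best = None
    for f1, f2 in product(CNN_FILTERS, CNN_FILTERS):
        err = train_and_evaluate(f1, f2)
        if best is None or err < best[0]:
            best = (err, f1, f2)
    return best  # (error, filters_layer1, filters_layer2)
```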

2.4. Evaluation Methods

The errors between the measured and predicted values were calculated using two error functions: the mean absolute error (MAE) and the mean squared error (MSE).
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right|$$

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$
where $\hat{y}_i$ is the peak acceleration value predicted by the model, and $y_i$ is the peak acceleration value measured in the experiment. The units of MAE and MSE are g and g², respectively.
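The two error functions can be computed directly from their definitions; a small self-contained sketch:

```python
def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted peaks (units: g)."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error between measured and predicted peaks (units: g^2)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```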
The accuracy, sensitivity, and specificity were calculated as

$$\mathrm{Sensitivity}\ (\%) = \frac{TP}{TP + FN} \times 100,$$

$$\mathrm{Specificity}\ (\%) = \frac{TN}{TN + FP} \times 100,$$

$$\mathrm{Accuracy}\ (\%) = \frac{TP + TN}{TP + TN + FP + FN} \times 100,$$

where true positives (TP) is the number of FFHs detected as FFHs, false positives (FP) is the number of non-FFHs detected as FFHs, true negatives (TN) is the number of non-FFHs detected as non-FFHs, and false negatives (FN) is the number of FFHs detected as non-FFHs.
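The three metrics follow directly from the confusion counts; a minimal sketch:

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy (in %) from confusion counts."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy
```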

3. Results

Table 3 presents the best performance of each model for each training loss. With the MAE loss, the smallest errors were 1.46 g for the 1D-CNN (16 filters in each of the two convolution layers), 1.61 g for the 2D-CNN (32 and 16 filters in the first and second layers), 2.07 g for the LSTM (8 and 16 units), and 1.36 g for the Conv-LSTM (64 and 64 filters in the two convolution layers; 64 and 16 units in the two LSTM layers). With the MSE loss, the smallest errors were 6.02 g² for the 1D-CNN (64 and 8 filters), 9.51 g² for the 2D-CNN (32 and 16 filters), 12.20 g² for the LSTM (8 and 64 units), and 5.69 g² for the Conv-LSTM (64 and 16 filters; 16 and 64 units).
Figure 3 shows the means and standard deviations of the true and predicted values for all experimental movements. The NF movements showed true values between 1 g and 4 g, except for NF15 (jumping), which ranged from 5 g to 9 g. The 2D-CNN and LSTM predicted peak accelerations below 10 g for NF15, but the 1D-CNN and Conv-LSTM trained with the MSE predicted values above 10 g. The LF movements fell within the range of 5 g~13 g, whereas most HF movements exceeded 9 g. The LF movements, especially LF03 and LF04, showed larger deviations than the NF or HF movements: LF03 and LF04 had peak accelerations above 10 g, while the other LF movements ranged from 4 g to 9 g. Most predicted values for the LFs were below 9 g. Although peak accelerations of at least 9 g were measured in HF01, the 1D-CNN, 2D-CNN, and LSTM (but not the Conv-LSTM) significantly underestimated them. The vertical FFHs (HF01 and HF02) produced smaller peak accelerations than the forward FFHs (HF04 and HF05). When trained with the MAE, the 1D-CNN and Conv-LSTM predicted peak accelerations above 10 g for the HF movements; the 2D-CNN predicted peak accelerations with the largest deviation. Overall, the 1D-CNN and Conv-LSTM predicted the peak values in the HF movements better than the others. Considering the predicted peak accelerations in the LFs and HFs, 9 g could be defined as the FFH threshold.
Table 4 presents the classification performance, assuming that a peak acceleration of 9 g or higher indicates FFH movements. The 1D-CNN showed better sensitivity (MAE: 83.3%, MSE: 87.5%) than the 2D-CNN (MAE: 4.2%, MSE: 79.2%), even though more than 90% accuracy was found in the 1D-CNN (MAE: 92%, MSE: 93.9%) and 2D-CNN (MAE: 90.7%, MSE: 96.5%). The LSTM demonstrated very poor sensitivity since its predicted values were mostly small. Among all the deep learning models, the Conv-LSTM showed the highest classification accuracy when trained by the MAE (97.6%).
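Applying the 9 g threshold to the true and predicted peak accelerations yields the confusion counts behind Table 4; a small sketch (the function and variable names are illustrative):

```python
FFH_THRESHOLD_G = 9.0  # a peak acceleration at/above this is treated as an FFH

def confusion_counts(true_peaks, pred_peaks, thr=FFH_THRESHOLD_G):
    """Threshold true vs. predicted peaks and tally TP/TN/FP/FN."""
    tp = tn = fp = fn = 0
    for t, p in zip(true_peaks, pred_peaks):
        actual, predicted = t >= thr, p >= thr
        if actual and predicted:
            tp += 1
        elif not actual and not predicted:
            tn += 1
        elif predicted:  # predicted FFH, but actually not
            fp += 1
        else:            # missed an actual FFH
            fn += 1
    return tp, tn, fp, fn
```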

4. Discussion

Our study shows that the 1D-CNN had smaller errors than the 2D-CNN, implying that a 1D-CNN is more appropriate for time-series data. The Conv-LSTM showed the smallest errors of all the models in both the MAE and MSE: its CNN layers efficiently extracted features, while its LSTM layers predicted the peak values.
NF15 (a 0.7 m jump) can frequently occur at construction sites; it is not a falling movement as such, but a dangerous one that can lead to an FFH. It showed larger predicted peak accelerations than the other NF movements; thus, it can be classified as an FFH. LF movements can easily occur at construction sites and were mostly predicted within the range of 4 g~9 g, for which the 1D-CNN and Conv-LSTM models obtained reasonable predictions. Such movements might not be dangerous enough for workers to require a life-saving wearable airbag, but they could still lead to serious injuries for the elderly in daily life.
In Figure 3, HF01 and HF02 (vertical FFHs) showed relatively smaller peak acceleration values than HF03, HF04, and HF05 (forward FFHs). In a vertical FFH, the feet touch the ground first, which results in a smaller transfer of mechanical energy and, therefore, a smaller impact than in a forward FFH. In Figure 3, the 1D-CNN and Conv-LSTM predicted the HFs more accurately than the 2D-CNN and LSTM, capturing the high peak acceleration values. The 2D-CNN and LSTM underestimated the peak accelerations because of the data imbalance between the large number of NFs and the small number of HFs; this could be mitigated with data augmentation techniques, although these carry a risk of overfitting.
Regarding the estimation errors, the MSE produced smaller deviations between the sensitivity and specificity of all deep learning models than the MAE. The MSE amplifies errors by squaring, and the errors in the LFs and HFs were larger than those in the NFs; since deep learning models train to minimize the loss, the MSE tended to focus on the LFs and HFs more than the NFs. The 2D-CNN trained with the MAE had the poorest sensitivity (4.2%). The LSTM showed high accuracy (MAE: 94.4%, MSE: 92.0%) but markedly low sensitivity (MAE: 45.8%, MSE: 50.0%). Sensitivity is more important than specificity here because it is directly related to the notification of dangerous situations. The 1D-CNN and Conv-LSTM showed higher sensitivity than the others; thus, 1D-CNN- or Conv-LSTM-based risk prediction algorithms would be more effective for safety applications. A risk prediction algorithm could serve as a device to warn of risks and call for emergency support when integrated into a mobile application. In addition, classifying NFs and LFs apart from HFs may be helpful when used with protective equipment such as an airbag. These applications could help reduce the fatality rate.
There are some limitations to the study. First, our dataset does not represent all movements at construction sites, since it was obtained from the simulated movements of only twenty participants. However, we consider it sufficient to represent the major movements at construction sites and to develop an algorithm for predicting the risk of movements. Second, the developed algorithm tended to underestimate the risk of each movement, since the amount of data in each class (NF, LF, and HF) was unbalanced. Several augmentation techniques could be applied to address this problem in the future.

5. Conclusions

The risks of FFHs at construction sites were predicted using IMU sensor data for accident prevention. A total of 15 non-falls (NF), 5 low-hazard falls (LF), and 5 high-hazard FFHs (HF) were selected as the experimental movements. Four deep learning models (1D-CNN, 2D-CNN, LSTM, and Conv-LSTM) and two error functions (MAE and MSE) were applied to predict the peak acceleration values at impact. The Conv-LSTM trained with the MAE showed the smallest error (1.36 g). When 9 g was applied as the peak acceleration threshold, the Conv-LSTM showed the highest accuracy (97.6%). Our algorithm successfully predicted the risk of movements and can be applied to real construction sites in the future. This study could be very helpful in preventing fatal injuries at construction sites by not only providing proper feedback to avoid unsafe behaviors but also prompting treatment after accidents.

Author Contributions

Conceptualization, S.L., B.K. and Y.K.; Data curation, B.K., S.Y., J.K. and Y.N.; Investigation, S.L. and J.K.; Methodology, S.L.; Project administration, Y.K.; Software, S.L.; Supervision, Y.K.; Writing—original draft, S.L. and B.K.; Writing—review and editing, S.L., B.K. and Y.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by a major project of the Korea Institute of Machinery and Materials (Project ID: NK238F) and the Technology Innovation Program (20006386) funded by the Ministry of Trade, Industry, and Energy (MOTIE, Korea).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of the Yonsei University Mirae Campus (1041849-202004-BM-042-02).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Choi, S.; Guo, L.; Kim, J.; Xiong, S. Comparison of fatal occupational injuries in construction industry in the United States, South Korea, and China. Int. J. Ind. Ergon. 2019, 71, 64–74. [Google Scholar] [CrossRef]
  2. Yang, K.; Ahn, C.R.; Vuran, M.C.; Aria, S.S. Semi-supervised near-miss fall detection for ironworkers with a wearable inertial measurement unit. Automat. Constr. 2016, 68, 194–202. [Google Scholar] [CrossRef]
  3. Liu, D.; Sommerich, C.M.; Sanders, E.B.-N.; Lavender, S.A. Application of a participatory methodology for investigating Personal Fall Arrest System (PFAS) usage in the construction industry. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, San Antonio, TX, USA, 19–23 October 2009; Volume 53, pp. 925–929. [CrossRef]
  4. Pomares, J.C.; Carrión, E.Á.; González, A.; Saez, P.I. Optimization on Personal Fall Arrest Systems. Experimental Dynamic Studies on Lanyard Prototypes. Int. J. Environ. Res. Public Health 2020, 17, 1107. [Google Scholar] [CrossRef] [PubMed]
  5. Goh, Y.M. Empirical investigation of the average deployment force of personal fall-arrest energy absorbers. J. Constr. Eng. Manag. 2015, 141, 04014059. [Google Scholar] [CrossRef]
  6. Lee, C.; Porter, K.M. Suspension trauma. Emerg. Med. J. 2007, 24, 237–238. [Google Scholar] [CrossRef]
  7. Dong, X.S.; Largay, J.A.; Choi, S.; Wang, X.; Cain, C.T.; Romano, N. Fatal falls and PFAS use in the construction industry: Findings from the NIOSH FACE reports. Accid. Anal. Prev. 2017, 102, 136–143. [Google Scholar] [CrossRef]
  8. Han, S.; Lee, S.; Peña-Mora, F. Vision-based detection of unsafe actions of a construction worker: Case study of ladder climbing. J. Comput. Civ. Eng. 2013, 27, 635–644. [Google Scholar] [CrossRef]
  9. Fang, W.; Zhong, B.; Zhao, N.; Love, P.E.D.; Luo, H.; Xue, J.; Xu, S. A deep learning-based approach for mitigating falls from height with computer vision: Convolutional neural network. Adv. Eng. Inform. 2019, 39, 170–177. [Google Scholar] [CrossRef]
  10. Kong, T.; Fang, W.; Love, E.D.; Luo, H.; Xu, S.; Li, H. Computer vision and long short-term memory: Learning to predict unsafe behaviour in construction. Adv. Eng. Inform. 2021, 50, 101400. [Google Scholar] [CrossRef]
  11. Jung, H.; Koo, B.; Kim, J.; Kim, T.; Nam, Y.; Kim, Y. Enhanced algorithm for the detection of preimpact fall for wearable airbags. Sensors 2020, 20, 1277. [Google Scholar] [CrossRef]
  12. Ahn, S.; Choi, D.; Kim, J.; Kim, S.; Jeong, Y.; Jo, M.; Kim, Y. Optimization of a pre-impact fall detection algorithm and development of hip protection airbag system. Sens. Mater. 2018, 30, 1743–1752. [Google Scholar] [CrossRef]
  13. Koo, B.; Kim, J.; Nam, Y.; Kim, Y. The Performance of Post-Fall Detection Using the Cross-Dataset: Feature Vectors, Classifiers and Processing Conditions. Sensors 2021, 21, 4638. [Google Scholar] [CrossRef] [PubMed]
  14. Dogan, O.; Akcamete, A. Detecting falls-from-height with wearable sensors and reducing consequences of occupational fall accidents leveraging IoT. In Advances in Informatics and Computing in Civil and Construction Engineering; Springer: Cham, Switzerland, 2019; pp. 207–214. [Google Scholar] [CrossRef]
  15. Kim, Y.; Jung, H.; Koo, B.; Kim, J.; Kim, T.; Nam, Y. Detection of pre-impact falls from heights using an inertial measurement unit sensor. Sensors 2020, 20, 5388. [Google Scholar] [CrossRef] [PubMed]
  16. Alizo, G.; Sciarretta, J.D.; Gibson, S.; Muertos, K.; Romano, A.; Davis, J.; Pepe, A. Fall from heights: Does height really matter? Eur. J. Trauma Emerg. Surg. 2018, 44, 411–416. [Google Scholar] [CrossRef] [PubMed]
  17. Arena, S.; Davis, J.; Grant, J.; Madigan, M. Tripping elicits earlier and larger deviations in linear head acceleration compared to slipping. PLoS ONE 2016, 11, e0165670. [Google Scholar] [CrossRef]
  18. Hajiaghamemar, M.; Seidi, M.; Ferguson, J.R.; Caccese, V. Measurement of Head Impact Due to Standing Fall in Adults Using Anthropomorphic Test Dummies. Ann. Biomed. Eng. 2015, 43, 2143–2152. [Google Scholar] [CrossRef]
  19. Kim, T.; Choi, A.; Heo, H.; Kim, H.; Mun, J. Acceleration Magnitude at Impact Following Loss of Balance Can Be Estimated Using Deep Learning Model. Sensors 2020, 20, 6126. [Google Scholar] [CrossRef]
  20. Koo, B.; Kim, J.; Nam, Y.; Sung, D.; Shim, J.; Yang, S.; Kim, Y. Multi-class Classification of Industrial Fall from Height based on Machine Learning Algorithm. Trans. Korean Soc. Mech. Eng. A 2022, 46, 259–265. [Google Scholar] [CrossRef]
  21. Occupational Safety and Health Administration, U.S. Department of Labor, Directorate of Technical Support and Emergency Management, OSHA Technical Manual (OTM)—Section V: Chapter 4. Available online: https://www.osha.gov/otm/section-5-construction-operations/chapter-4#fall_arrest (accessed on 2 August 2022).
  22. Albers, J.T.; Estill, C.F. Simple Solutions: Ergonomics for Construction Workers; National Institute for Occupational Safety and Health: Washington, DC, USA, 2007. [Google Scholar]
  23. Huang, S.; Tang, J.; Dai, J.; Wang, Y. Signal Status Recognition Based on 1DCNN and Its Feature Extraction Mechanism Analysis. Sensors 2019, 19, 2018. [Google Scholar] [CrossRef]
  24. Trigueros, D.S.; Meng, L.; Hartnett, M. Enhancing convolutional neural networks for face recognition with occlusion maps and batch triplet loss. Image Vis. Comput. 2018, 79, 99–108. [Google Scholar] [CrossRef]
  25. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  26. Yu, X.; Qiu, H.; Xiong, S. A novel hybrid deep neural network to predict pre-impact fall for older people based on wearable inertial sensors. Front. Bioeng. Biotechnol. 2020, 8, 63. [Google Scholar] [CrossRef] [PubMed]
  27. Lu, W.; Li, J.; Li, Y.; Sun, A.; Wang, J. A CNN-LSTM-based model to forecast stock prices. Complexity 2020, 2020, 6622927. [Google Scholar] [CrossRef]
  28. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377. [Google Scholar] [CrossRef]
Figure 1. An example of the window extraction.
Figure 2. Structures of deep learning models: (A) 1D-CNN, (B) 2D-CNN, (C) LSTM, and (D) Conv-LSTM.
Figure 3. True vs. predicted (MSE, MAE) values of trained models for different movements.
Table 1. Experimental movements.

Non-Fall (NF)
NF01: Sitting quickly and getting up
NF02: Sitting and getting up comfortably
NF03: Going up and down the stairs
NF04: Going up and down a ladder
NF05: Working with a pickaxe
NF06: Lifting (front)
NF07: Lifting (back)
NF08: Lifting (side)
NF09: Moving up and down in an elevator
NF10: Walking on a beam
NF11: Walking on a beam with luggage
NF12: Shoveling
NF13: Stretching
NF14: Climbing up and down a scaffold
NF15: 0.7 m jump

Low-Hazard Fall (LF)
LF01: Forward trip
LF02: Lateral trip
LF03: Forward slip
LF04: Backward slip
LF05: Fainting

High-Hazard FFH (HF)
HF01: 2 m vertical FFH
HF02: 3 m vertical FFH
HF03: 0.7 m forward FFH
HF04: 2 m forward FFH
HF05: 3 m forward FFH
Table 2. Eight features used in this study.

1. A_X: x-axis acceleration
2. A_Y: y-axis acceleration
3. A_Z: z-axis acceleration
4. ASVM: sum vector magnitude of acceleration
5. G_X: x-axis angular velocity
6. G_Y: y-axis angular velocity
7. G_Z: z-axis angular velocity
8. GSVM: sum vector magnitude of angular velocity
Table 3. Best performances of deep learning models (in terms of MAE and MSE).

Model       MAE (Epoch)     MSE (Epoch)
1D-CNN      1.46 g (183)    6.02 g² (151)
2D-CNN      1.61 g (130)    9.51 g² (187)
LSTM        2.07 g (18)     12.20 g² (13)
Conv-LSTM   1.36 g (25)     5.69 g² (49)
It is noted that the 1D-CNN showed smaller prediction errors than the 2D-CNN in both the MAE and MSE. Among the four deep learning models, the Conv-LSTM demonstrated the best prediction results, and the LSTM the poorest.
Table 4. Classification performances.

                  1D-CNN          2D-CNN          LSTM            Conv-LSTM
Error Function    MAE     MSE     MAE     MSE     MAE     MSE     MAE     MSE
Accuracy (%)      92.0    93.9    90.7    96.5    94.4    92.0    97.6    92.3
Sensitivity (%)   83.3    87.5    4.2     79.2    45.8    50.0    62.5    95.8
Specificity (%)   92.6    94.3    96.6    97.7    97.7    94.9    100     92.0
Share and Cite

MDPI and ACS Style

Lee, S.; Koo, B.; Yang, S.; Kim, J.; Nam, Y.; Kim, Y. Fall-from-Height Detection Using Deep Learning Based on IMU Sensor Data for Accident Prevention at Construction Sites. Sensors 2022, 22, 6107. https://doi.org/10.3390/s22166107
