Assessment System for Child Head Injury from Falls Based on Neural Network Learning

Yang, Ziqian; Tsui, Baiyu; Wu, Zhihui

doi:10.3390/s23187896


by Ziqian Yang ^1,2,*, Baiyu Tsui ^1,2 and Zhihui Wu ^1,2

¹ College of Furnishings and Industrial Design, Nanjing Forestry University, Nanjing 210037, China

² Jiangsu Co-Innovation Center of Efficient Processing and Utilization of Forest Resources, Nanjing 210037, China

* Author to whom correspondence should be addressed.

Sensors 2023, 23(18), 7896; https://doi.org/10.3390/s23187896

Submission received: 26 June 2023 / Revised: 19 August 2023 / Accepted: 13 September 2023 / Published: 15 September 2023

(This article belongs to the Special Issue AI-based Sensing for Health Monitoring and Medical Diagnosis)


Abstract:

Toddlers face serious health hazards if they fall from relatively high places at home during everyday activities and are not swiftly rescued. Still, few effective, precise, and comprehensive solutions exist for this task. This research aims to create a real-time assessment system for head injury from falls. The framework operates in two phases. In phase I, joint data are obtained by processing surveillance video with Open Pose; a long short-term memory (LSTM) network and a 3D transform model then integrate the spatial and temporal information of the key points across frames. In phase II, the head acceleration is derived and substituted into the head injury criterion (HIC) calculation, and a classification model is developed to assess the injury. We collected 200 RGB videos of the daily activities of 13- to 30-month-old toddlers playing near furniture edges and guardrails and falling upside down. Five hundred video clips extracted from these videos were divided in an 8:2 ratio into training and validation sets. We prepared an additional collection of 300 video clips (test set), provided by parents, of toddlers' daily falls at home to evaluate the framework's performance. The experiments yielded a classification accuracy of 96.67%, demonstrating the feasibility of a real-time AI technique for assessing head injuries in falls through monitoring.

Keywords:

head injury from falls; deep learning; Open Pose; LSTM; 3D transform model

1. Introduction

For toddlers aged three years and younger, falls pose the most significant risk to life, leading to injuries such as craniofacial fractures, concussions, cervical spine sprains, and neck strains [1,2,3,4,5]. According to the Injury Review of Chinese Children and Adolescents [6], falls are the most common cause of accidental injury and death among children aged 1 to 4 years in China. A total of 40.18 percent of these falls occurred at home, and 46.99 percent resulted in head injuries. During normal development, toddlers most often fall from steps, daycare furniture, or playground equipment [7]. Among these events, toddlers climbing over bed guardrails, tumbling, and sustaining injuries occur with high incidence nationwide. The majority of toddler overturning falls involve head contact.

Research on fall injury detection can be categorized into the wearable sensor-based method (WSM), the finite element method (FEM), and visual detection. In wearable sensor-based systems, accelerometers and gyroscopes are commonly employed to measure acceleration and deflection angles. Gina et al. [8] recorded falls in children aged 12 to 36 months who wore sensor-laden headbands (SIMG) and were monitored via video in a daycare setting. The drop heights varied between 0.01 and 1 m. No child was critically injured in the falls, although one child sustained minor injuries. Kakara et al. [9] outfitted a licensed creche facility with an embedded video surveillance system and equipped test subjects with accelerometer and gyroscope sensors. The head injury criterion (HIC) was calculated using fall simulations via FEM. They concluded that biomechanical simulation cannot be used to estimate the severity of fall-related injuries without actual fall data. Most of the sensor-based falls took place in a “safe” environment, and the researchers safeguarded children at a predefined rate for ethical reasons, so too few injury events were recorded to support accurate experiments. The precision of such experiments therefore needs to be improved.

In addition to wearable sensors, the mechanical indicators of a toddler’s head can be measured with FEM. This process discretizes the solution domain into a system of elements that are recombined in a computer to generate a finite element model [10,11]. Angela et al. [12] investigated the effects of altering the fall environment and child proxy characteristics on fall kinetics and probable injury consequences using a bed fall FEM. Injury outcome indicators were more sensitive to changes in environmental parameters (bed height, onset force, and impact surface stiffness) than to changes in proxy parameters (body segment stiffness, neck stiffness, and head stiffness). Fahlstedt et al. [13] used FEM to test the effects of three playground surface stiffnesses and three head impact locations on seven age groups ranging from 1.5 to 18 years. These techniques require knowledge of children’s injury tolerance and biomechanical responses, which limits their fidelity. Their models have stiffer heads and necks than real children but share anthropometric characteristics with a 12-month-old child [10,11,12]. It should be stressed that computer models are discrete representations of actual events. Hence, this class of models may lack accuracy for bed falls.

Children are reluctant to wear sensors daily because of discomfort and restricted mobility, and these expensive devices are often affected by complex home environments. Therefore, vision-based fall detection approaches have become popular in recent years. Skeletal information [14,15] is retrieved from RGB images using a deep-learning model [16,17,18] such as OpenPose. Movement features are then learned with a neural network model, such as a Region Convolutional Neural Network (R-CNN) [19], Spatio-Temporal Graph Convolutional Networks (STGCN) [20], or a one-dimensional convolutional neural network model (1D-CNN) [21]. This class of techniques offers high detection precision and recall. Still, their processing speed is insufficient for real-time detection and fall injury assessment of children, which requires a predictive analysis of changes over time. Time series-based models offer both high precision and high speed; among them, LSTM [22,23,24] performs multimodal feature fusion for fall detection from skeleton frame sequences, which satisfies the needs of fall injury assessment. Lin et al. [24] reported that the accuracy of the Recurrent Neural Network (RNN), LSTM, and Gate Recurrent Unit (GRU) models did not improve after interpolation, indicating that the interpolation procedure does not have the desired effect when key point data are lacking.

In light of the preceding discussion, and to make intelligent child fall injury assessment more accessible and practical, we devised an algorithm that estimates the impact force on a falling child’s head through real-time monitoring. As depicted in Figure 1, the system assesses whether the child has sustained an injury and issues a real-time alarm.

For the self-built dataset, we selected 200 RGB videos of the daily activities of 13- to 30-month-old toddlers playing near furniture edges or guardrails and falling upside down. These recordings ranged in length from 15 s to 10 min. One hundred parents consented to the use of their video surveillance content for research purposes. Five hundred video clips were cut from these videos based on rapid head movement. The lighting conditions, camera settings, and furnishings around the children’s beds varied, which enhances the dataset’s diversity and may improve the framework’s generalization capacity. The processed falling video data are fed into a 3D transform model to identify the features. The recognition results are output via the regression layer to construct a model for assessing the severity of the head injury based on the Head Injury Criterion (HIC) [25]. Subsequently, the trained classification model is employed to carry out the assessment. Finally, a separate dataset of 300 video clips of children’s falls is used to test the performance of the framework.

The following are the main contributions of this research:

A two-stage deep-learning-based architecture for high-precision detection is proposed. The LSTM neural network and 3D transform model correct the coordinates of omitted 2D skeleton points and derive the 3D coordinates by combining the spatial and temporal information of the key points. We aim to decrease missed or inaccurate detections caused by complicated environments, small target sizes, and density distributions, bringing intelligent child fall injury assessment one step closer to being convenient and fast.
A HIC-based approach for classifying the degree of injury is developed, based on the triaxial acceleration of the center of the head during a fall.
A dataset of 500 video clips of children’s falls is generated from 200 real daily videos, and another test set of 300 video clips is used to evaluate the performance of the system. These videos encompass a range of lighting conditions, camera settings, and interior furnishings, which increases the dataset’s diversity and boosts the framework’s generalization ability.
The feasibility of a real-time AI technique for assessing head injuries in falls is proven. After future upgrades and optimizations, it is implementable on hardware platforms such as intelligent surveillance cameras.

The rest of this paper is structured as follows. The approach is described in Section 2. Experiments are performed, and the outcomes are discussed in Section 3. Section 4 compares the methodology and accuracy of our classification system with those of other fall injury classification systems. Section 5 outlines the conclusions of this work and possible further research.

2. The Proposed Method

Before designing the framework, we tested how to extract the key point parameters. We compared YOLO and OpenPose. YOLO failed to attain the expected results for two reasons: when capturing the target child, it also captures other interfering objects, which impedes its ability to detect children; and it cannot capture the child under excessive occlusion or low light intensity. Addressing these issues would require the creation of YOLO plugins. Therefore, OpenPose was ultimately chosen to extract the child’s key point parameters.

The framework is divided into two stages: extraction of the child’s key point information and classification of the head injury assessment.

The process commences with selecting and cutting the frames that depict the child’s fall. These frames are subsequently sent into OpenPose and combined with the LSTM to identify the child’s key point information within the input image. Then, the frames are passed through the 3D transform model to estimate the 3D human pose. Finally, the transformation results are integrated with the key point information output by the previous step to complete the 3D skeletal points. This means that the head point positions can be estimated from the body joints even if the child’s head is concealed.

The head acceleration is computed using the head point’s coordinates, then entered into the HIC value calculation method, and the degree of head injury is determined by comparing it to the international standard value. Brain injury severity can be used to designate the human body as safe or dangerous.

Using these two elements, we create an algorithm with temporal continuity and an exceptional classifier for identifying potential threats. Figure 2 shows the pipeline of the framework.

2.1. Open Pose Obtains Human Body Skeleton Information

Open Pose tracks human key points stably even under lower-body occlusion or non-frontal views, which makes it suitable for extracting human key points from datasets [26]. It can extract 25 human key points from 2D pictures in low-light conditions, as shown in Figure 2. Compared with the Kinect sensor-based method for obtaining fall recognition points, it overcomes the detection distance limitation.

2.2. 2D Key Points Inspection and Repair

Because the 2D key points recognized via Open Pose are easily obscured by other objects or affected by lighting in the home, coordinate points may be missed or misdetected, which affects the subsequent 3D coordinate estimation. The 2D joint points must therefore be verified and supplemented.

Only spatial information variables are considered in Open Pose’s 2D key point extraction process. We integrate each frame’s spatial and temporal information to ensure no critical key points are neglected. We use the LSTM [27] to learn long-term dependencies between data based on the correlation of human joint coordinates in continuous time increments and extract complete 2D key point coordinates frame by frame.

This study mathematically describes its forward propagation:

F_{t} = σ (T_{t} \times W_{f} + h_{t - 1} \times {(W h)}_{f})

(1)

where T_{t} is the input at time t, W_{f} is the weight corresponding to the input, h_{t - 1} is the hidden state at time t−1, {(W h)}_{f} is the matrix of weights applied to the hidden state, and σ is the sigmoid function that produces the final result at that gate.

I_{t} = σ (T_{t} \times W_{i} + h_{t - 1} \times {(W h)}_{i})

(2)

\tilde{C}_{t} = \tanh (T_{t} \times W_{c} + h_{t - 1} \times {(W h)}_{c})

(3)

where T_{t} is the input at time t, W_{i} is the weight corresponding to the input, h_{t - 1} is the hidden state at time t−1, {(W h)}_{i} is the matrix of weights applied to the hidden state, and σ maps the gate value into the range from 0 to 1.

C_{t} = F_{t} \times C_{t - 1} + I_{t} \times \tilde{C}_{t}

(4)

O_{t} = σ (T_{t} \times W_{o} + h_{t - 1} \times {(W h)}_{o})

(5)

h_{t} = O_{t} \times \tanh (C_{t})

(6)

C_{t - 1} \times F_{t} = \begin{cases} 0, & F_{t} = 0 \\ C_{t - 1}, & F_{t} = 1 \end{cases}

(7)

where F_{t} represents the forget gate, I_{t} is the input gate, and \tilde{C}_{t} denotes the candidate state. The forget gate decides whether to keep the information from the previous timestamp based on Equation (1). The input gate employs Equation (2) to compute the significance of the new information and Equation (3) to form the candidate update. Equation (4) then updates the cell state. Finally, the output gate uses Equation (5) to generate the output and updates the hidden state by Equation (6). The input sent to the LSTM unit is denoted as F (distinct from the forget gate F_{t}); it represents the key point data for the missed detection occurring in the kth frame of the video. As Equation (7) shows, the previous cell state is disregarded when F_{t} = 0 and preserved when F_{t} = 1. The LSTM network was configured to memorize up to 100 dependencies, which proved adequate for achieving the desired results. The results were then forwarded to the fully connected layer, which analyzes the nonlinear relationships between the higher-level features.
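To make the gate arithmetic concrete, the following is a minimal sketch of a single LSTM time step in plain Python. It uses the standard additive cell-state update (forget-gated previous state plus input-gated candidate state) and, like the equations above, omits bias terms. The function names and the dictionary keys (`W_f`, `U_f`, and so on, where `U_*` plays the role of the hidden-state weights (W h)) are hypothetical choices for illustration and are not taken from the authors' implementation.

```python
import math


def sigmoid(values):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def affine(x, W, h, U):
    """Return x*W + h*U for row vectors x, h and weight matrices W, U."""
    n_out = len(W[0])
    return [
        sum(x[i] * W[i][j] for i in range(len(x)))
        + sum(h[k] * U[k][j] for k in range(len(h)))
        for j in range(n_out)
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step; returns the new hidden state and cell state."""
    # Forget gate, Eq. (1)
    f = sigmoid(affine(x_t, params["W_f"], h_prev, params["U_f"]))
    # Input gate, Eq. (2)
    i = sigmoid(affine(x_t, params["W_i"], h_prev, params["U_i"]))
    # Candidate state, Eq. (3)
    c_tilde = [math.tanh(v) for v in affine(x_t, params["W_c"], h_prev, params["U_c"])]
    # Cell state update, Eq. (4): keep part of the old state, add part of the candidate
    c = [fj * cp + ij * ct for fj, cp, ij, ct in zip(f, c_prev, i, c_tilde)]
    # Output gate, Eq. (5)
    o = sigmoid(affine(x_t, params["W_o"], h_prev, params["U_o"]))
    # Hidden state, Eq. (6)
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c
```

With all weights set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved; this gives a quick sanity check of the update rule.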

Algorithm 1 details the 2D key point inspection and repair process.

Algorithm 1: Correction of Skeleton Key Point Coordinates

Input: Bone point coordinate matrix 'F', size T \times N_{t} \times 2
Output: Missing point coordinates, size N_{t} \times 2

Procedure:
1: Input: enter key point data 'F';
2: Forget gate: F_{t} is 0 or 1
   If F_{t} = 0 (so that C_{t - 1} \times F_{t} = 0):
       The input value is forgotten
   Else:
       The input value is retained;
3: Input gate: using the sigmoid and tanh functions, determine the value affecting C_{t};
4: Output gate: using the sigmoid function, compute O_{t}; pass C_{t} through tanh to obtain C_{t}^{'}, then output C_{t}^{'} \times O_{t};
5: Combine the T results by linear regression and output the missing point’s coordinates.
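The bookkeeping around Algorithm 1 can be sketched as follows: locate the joints that were not detected in each frame, then fill them in from the joint's earlier positions. This is only an illustration under stated assumptions. It assumes that an undetected joint is stored as (0.0, 0.0), and the `predict` callback is a hypothetical stand-in for the trained LSTM and regression layer, which are not reproduced here.

```python
def find_missing(frames):
    """Return (frame_index, joint_index) pairs whose 2D coordinates are missing.

    frames: list of T frames, each a list of N (x, y) pairs.
    Assumption: an undetected joint is stored as (0.0, 0.0).
    """
    return [
        (t, j)
        for t, frame in enumerate(frames)
        for j, (x, y) in enumerate(frame)
        if x == 0.0 and y == 0.0
    ]


def repair(frames, predict):
    """Fill each missing joint using a caller-supplied predictor.

    predict(history, joint_index) -> (x, y), where history holds the joint's
    positions in all earlier frames (already repaired where necessary).
    """
    repaired = [list(frame) for frame in frames]
    # find_missing returns pairs in frame order, so earlier repairs feed later ones.
    for t, j in find_missing(frames):
        history = [repaired[s][j] for s in range(t)]
        repaired[t][j] = predict(history, j)
    return repaired
```

For a quick trial, a "hold the last known position" predictor such as `lambda history, j: history[-1] if history else (0.0, 0.0)` can be passed as `predict`; the framework described here would use the LSTM output instead.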

2.3. Map 2D Key Joint to 3D Positions

After obtaining the 2D key point coordinates, we use the 2D input to estimate the human key point locations in 3D space. 2D detections carry less information than images, but their low dimensionality reduces overall training time and greatly accelerates network design and the search for training hyperparameters. The value of the z coordinate is derived from the known 2D coordinates of the key point (x, y), as shown in Figure 3.

Our method is founded on a simple, deep, multilayer neural network. The network consists of a linear layer, batch normalization, rectified linear units (ReLUs), and dropout. This block is repeated three times, with a final connection linking the two sections [28]. Our method accepts a 2D matrix of joint positions as input and generates a series of 3D joint positions. The objective is to identify a function that minimizes the prediction error across a set of N poses:

h^{*} = \arg \min_{h} \frac{1}{N} \sum_{i = 1}^{N} L (h (x_{i}) - y_{i})

(8)

The input is a sequence of 2D points x \in R^{2 n}, and the output is a sequence of 3D points y \in R^{3 n}. The x_{i} are the 2D point coordinates generated via OpenPose during camera image processing.

Batch normalization and dropout are utilized to enhance the efficacy of our system, but they slightly increase training and testing time. Additionally, in conjunction with batch normalization, we constrain the weights of each layer so that their maximum norm is less than or equal to 1. Training starts with a learning rate of 0.001 percent, applies exponential decay, and uses batches of 32 samples.
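The two training devices mentioned above, a norm constraint on the weights and an exponentially decaying learning rate, can be sketched as follows. This is an illustration under assumptions rather than the authors' training code: applying the constraint to each row of the weight matrix is one common convention, and the `decay_rate` and `decay_steps` defaults are placeholder values that the text does not specify.

```python
import math


def clip_max_norm(weights, max_norm=1.0):
    """Rescale each row of a weight matrix so its L2 norm is at most max_norm."""
    clipped = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        scale = max_norm / norm if norm > max_norm else 1.0
        clipped.append([w * scale for w in row])
    return clipped


def exponential_decay(initial_lr, step, decay_rate=0.96, decay_steps=1000):
    """Learning rate after `step` updates (decay_rate and decay_steps are assumed)."""
    return initial_lr * decay_rate ** (step / decay_steps)
```

In a training loop, `clip_max_norm` would be applied to each layer after every optimizer update, and `exponential_decay` would supply the step size for the next update.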

The coordinate conversion is shown clearly in Algorithm 2.

Algorithm 2: Convert 2D to 3D coordinates

Input: 2D bone point coordinates
Output: Z_{i}
1: Enter the linear layer and increase the dimension to 1024;
2: Apply batch normalization and dropout;
3: Enter ReLUs for activation processing;
4: Select the Z value with the smallest error;
5: Enter the linear layer once more, generating an output of size 3 n;
6: Combine with the input and output the result.
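The data flow of Algorithm 2 can be sketched as an inference-time forward pass: flatten the n joints into a vector of length 2n, expand it with a linear layer, normalize, activate, and project to 3n outputs. The sketch below makes several assumptions that should be read as such: the weights are random rather than trained, the batch normalization statistics are fixed at mean 0 and variance 1, only a single hidden block is shown, and dropout is omitted because it is inactive at inference. All function names are hypothetical.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weights and zero biases; a trained model would load learned values."""
    scale = (2.0 / n_in) ** 0.5
    W = [[rng.gauss(0.0, scale) for _ in range(n_out)] for _ in range(n_in)]
    return W, [0.0] * n_out


def linear(x, layer):
    W, b = layer
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j] for j in range(len(b))]


def batch_norm(x, mean, var, eps=1e-5):
    """Normalize with stored statistics, as done at inference time."""
    return [(v - m) / (s + eps) ** 0.5 for v, m, s in zip(x, mean, var)]


def relu(x):
    return [max(0.0, v) for v in x]


def build_model(n_joints, hidden=1024, seed=0):
    rng = random.Random(seed)
    layers = {
        "in": make_linear(2 * n_joints, hidden, rng),   # step 1: lift to `hidden` units
        "out": make_linear(hidden, 3 * n_joints, rng),  # step 5: project to 3n outputs
    }
    stats = {"mean": [0.0] * hidden, "var": [1.0] * hidden}
    return layers, stats


def lift_2d_to_3d(pose_2d, layers, stats):
    """pose_2d: list of n (x, y) pairs -> list of n (x, y, z) triples."""
    x = [coord for joint in pose_2d for coord in joint]  # flatten to length 2n
    hidden = linear(x, layers["in"])
    hidden = relu(batch_norm(hidden, stats["mean"], stats["var"]))  # steps 2-3
    out = linear(hidden, layers["out"])
    return [tuple(out[k:k + 3]) for k in range(0, len(out), 3)]
```

For the 25 key points extracted by OpenPose, `build_model(25)` produces a 50-input, 75-output network; a smaller `hidden` value keeps the pure-Python version fast when experimenting.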

2.4. Values Calculated for Head Injuries

Research has accumulated a wealth of knowledge on brain injury biomechanics, and different injury evaluation standards have been proposed for various types of head injuries. The renowned Wayne State Tolerance Curve (WSTC) [29] has become the basis for most acknowledged craniocerebral tolerance indices. It describes the connection between linear head acceleration, acceleration duration, and the onset of concussion. Gadd (1966) later proposed the severity index (SI) [30] based on the WSTC, and in 1971 Versace modified the SI into the head injury criterion (HIC). The HIC is the most frequently used injury criterion and is calculated as a function of the acceleration at the head’s center of gravity and its duration [31]. In 2011, Hideyuki proposed the Rotational Injury Criterion (RIC) [32], derived from the HIC by substituting the resultant angular acceleration for the resultant linear acceleration. However, since this paper is based on visual detection from a single surveillance camera in daily life, the viewpoint and number of devices are insufficient to obtain accurate angular acceleration. Consequently, this paper employs the HIC as the evaluation method for children’s fall injuries.

From the head acceleration at the center of mass, the HIC can be calculated with the following formula.

HIC = {\left\{ (t_{2} - t_{1}) {\left[ \frac{1}{t_{2} - t_{1}} \int_{t_{1}}^{t_{2}} a (t) d t \right]}^{2.5} \right\}}_{max}

(9)

where t_{1} and t_{2} are the initial and final impact times (in seconds), and a (t) is the head acceleration at the point of impact (in g, with g being the standard gravitational acceleration). The interval t_{2} - t_{1} is the impact time; t_{1} and t_{2} are chosen so that the HIC is maximal, with the interval restricted to 36 ms or 15 ms based on historical data and Federal Motor Vehicle Safety Standard 208. Because the impact lasts only a short time, we employ the 15 ms limit.
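As an illustration of how the HIC formula can be evaluated numerically, the sketch below searches all windows of at most 15 ms over a uniformly sampled acceleration trace, integrating with the trapezoidal rule. It is a brute-force sketch under assumptions: the samples are taken to be the resultant head acceleration in g at a fixed interval `dt`, for example resampled from a fitted curve, and the function name is hypothetical.

```python
def hic(accel_g, dt, max_window=0.015):
    """Head injury criterion from uniformly sampled resultant acceleration.

    accel_g: acceleration samples in g; dt: sampling interval in seconds;
    max_window: longest allowed interval t2 - t1 in seconds (15 ms here).
    """
    n = len(accel_g)
    # Cumulative trapezoidal integral: cum[k] approximates the integral from 0 to k*dt.
    cum = [0.0]
    for k in range(1, n):
        cum.append(cum[-1] + 0.5 * (accel_g[k - 1] + accel_g[k]) * dt)

    best = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            duration = (j - i) * dt
            if duration > max_window + 1e-12:
                break  # windows only get longer as j grows
            mean_a = (cum[j] - cum[i]) / duration
            if mean_a > 0.0:
                best = max(best, duration * mean_a ** 2.5)
    return best
```

For a constant 100 g pulse lasting longer than the window, the maximum is reached by the full 15 ms window and equals 0.015 × 100^2.5 = 1500.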

Table 1 shows that the specific HIC criteria are 250, 700, and 1000, defining a spectrum of injury severity levels [33,34]. We employ the conventional HIC criteria as a classification standard for injuries sustained by toddlers during overturning falls.

Using the previously obtained 3D coordinates, we estimated the head’s acceleration. The acceleration data were then used to calculate the HIC by curve fitting in MATLAB in order to identify and evaluate the fall-related head injury. As shown in Table 2, the HIC-derived head injury thresholds define four categories of injury severity: no injury, minor injury, moderate injury, and severe injury. This category serves as the label for the training data that follows.
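The step from 3D head coordinates to a severity label can be sketched in two parts: a finite-difference estimate of the resultant acceleration and a threshold lookup on the resulting HIC value. Both parts rest on assumptions that are not stated in the text. The coordinates are assumed to be expressed in meters and sampled at a fixed interval, and the boundaries between categories are assumed to fall exactly at the HIC values 250, 700, and 1000, with each threshold belonging to the more severe category.

```python
G = 9.80665  # standard gravitational acceleration in m/s^2

SEVERITY_LABELS = ("no injury", "minor injury", "moderate injury", "severe injury")


def head_acceleration_g(positions, dt):
    """Resultant head acceleration in g, by central second differences.

    positions: list of (x, y, z) head coordinates in meters, sampled every dt seconds.
    Returns one value per interior sample.
    """
    acc = []
    for k in range(1, len(positions) - 1):
        comps = [
            (positions[k + 1][d] - 2.0 * positions[k][d] + positions[k - 1][d]) / dt ** 2
            for d in range(3)
        ]
        acc.append(sum(c * c for c in comps) ** 0.5 / G)
    return acc


def severity(hic_value):
    """Map a HIC value to a class index 0-3 using the 250 / 700 / 1000 criteria."""
    if hic_value < 250:
        return 0
    if hic_value < 700:
        return 1
    if hic_value < 1000:
        return 2
    return 3
```

A free-fall trajectory provides a simple check: positions following z = -0.5·G·t² should yield an acceleration of 1 g at every interior sample.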

2.5. Classification Based on Machine Learning

We compare two standard classification algorithms, random forest (RF) and support vector machine (SVM), to determine the most effective method for assessing fall injuries.

SVM is well suited to small-sample pattern recognition and was originally designed for binary classification. A kernel function projects the input feature vectors onto a high-dimensional space, where they can be separated linearly and categorized.

Several advantages of the RF algorithm have been highlighted in the literature. RF yields high output precision and is resistant to overfitting [35,36]. It is computationally faster than other algorithms, such as SVM. In addition, it allows us to select the most essential variables [36] and eliminate the least important attributes [35,37].

3. Experimental Setup

3.1. Dataset and Test

The proposed methodology is evaluated on fall events. A total of 200 videos capturing the daily activities of toddlers aged 13 to 30 months were obtained from parents. These videos specifically focus on instances where the toddlers tumbled over guardrails. The sample comprises 100 toddlers and exhibits diverse lighting situations, camera settings, and interior decor, which enhances the dataset’s diversity and improves the generalization capability of the framework. Several examples are illustrated in Figure 4. Each video lasts between 15 s and 10 min and depicts children’s daily activities, such as falling forward or backward on the bed, sometimes with more than one fall in a single video. We can therefore sample two to three video clips from each video, depending on the direction and outcome of the falls. A total of 500 video clips extracted from the 200 videos were divided in an 8:2 ratio into training and validation sets. We prepared an additional set of 300 video clips (test set), provided by parents, of children’s daily falls at home to evaluate the framework’s performance.
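The 8:2 division of the 500 clips can be reproduced with a seeded shuffle, which yields 400 training clips and 100 validation clips. The sketch below is only an illustration: the function name, the integer clip identifiers, and the seed are hypothetical, and the text does not state how the authors performed the split.

```python
import random


def split_clips(clip_ids, train_fraction=0.8, seed=0):
    """Shuffle clip identifiers reproducibly and split them into two lists."""
    ids = list(clip_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


train_ids, val_ids = split_clips(range(500))
```

Fixing the seed keeps the two sets identical across runs, so results on the validation set remain comparable between experiments.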

Based on the size of the dataset and the training pace, the batch size and number of epochs are set to 16 and 300, respectively. The dataset images are uniformly scaled to 640 × 640 pixels and passed through the network to train the foreground extraction model. The initial and final learning rates are set to 0.01 and 0.10. The momentum is set to 0.937 to help the optimizer move past poor local optima, and the weight decay coefficient is set to 0.0005 to reduce overfitting. The toddler’s key point model is trained using images from the dataset scaled to 300 × 300 pixels. The weight decay coefficient and the minimum and maximum learning rates are set to 0.0005, 0.0002, and 0.02, respectively.

To evaluate the performance of our proposed model, we distinguish four possible outcomes in the classification of head injuries.

In the first, the system correctly categorizes the injury following a fall.

In the second, the algorithm wrongly labels a fall event that did not result in injury as an injury.

In the third, a fall incident causes an injury that the system does not recognize.

In the fourth, no injury occurred, and the algorithm correctly did not classify the event as an injury.

Accordingly, there are four categories: TP, FP, TN, and FN.

True Positive (TP): Injury occurred, and the equipment correctly classifies injury.

False Positive (FP): No injury happened; however, the equipment misclassified it.

True Negative (TN): No injury occurred, and the equipment correctly classified it.

False Negative (FN): Injury occurred; however, the equipment misclassified it.

The classification’s reliability was evaluated using sensitivity, specificity, accuracy, precision, and F-score to assess these four cases.

\mathrm{Sensitivity} = \frac{TP}{TP + FN}

(10)

\mathrm{Specificity} = \frac{TN}{TN + FP}

(11)

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(12)

\mathrm{Precision} = \frac{TP}{TP + FP}

(13)

\mathrm{F\text{-}Score} = \frac{2 TP}{2 TP + FP + FN}

(14)

These five metrics serve as performance indicators for classifier selection and model evaluation. Sensitivity characterizes the model’s responsiveness in classifying injuries. Specificity refers to the model’s capacity to avoid misidentification; the higher the index, the less likely the model is to mistake one class for another. These two metrics intuitively highlight the properties of the classification model and can therefore be used as a more rigorous reference standard.
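The five metrics follow directly from the four counts, as the short sketch below shows. The function name is hypothetical, and the counts in the example are illustrative numbers rather than results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, accuracy, precision, and F-score."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "f_score": 2 * tp / (2 * tp + fp + fn),
    }


# Illustrative counts only: 40 TP, 5 FP, 45 TN, 10 FN.
example = classification_metrics(tp=40, fp=5, tn=45, fn=10)
```

In this example, sensitivity is 40/50 = 0.8 and specificity is 45/50 = 0.9, which shows how the two metrics can differ even when the overall accuracy (0.85) looks balanced.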

3.2. Experimental Results Analysis

Figure 5 displays the change of acceleration in 3D data from our self-built fall dataset when four types of injury events occur: no injury, minor injury, moderate injury, and serious injury.

The coordinates of the body change rapidly during the descent. Consequently, the curve in the figure changes swiftly and dramatically. After the skull hits the ground, a secondary collision occurs due to sliding or bouncing, which is also related to the impact surface stiffness [12]. The curves at this stage therefore gather into a cluster in the figure.

Figure 5d depicts a scenario in which the head injury is serious. Compared to Figure 5a–c, the acceleration graph in Figure 5d covers a larger region of the 3D coordinate system. This represents a larger change in acceleration, possibly related to falling height, toddler weight, and onset force. However, the clusters representing secondary bounces in Figure 5d did not differ substantially from those in the other three cases. With our framework, we cannot infer the association between ground stiffness and injury severity, either qualitatively or quantitatively. Doing so would require AI identification of ground material hardness through visual detection.

We utilized the additional set of 300 video clips of children’s daily falling to evaluate the model’s dependability. We have chosen three classifiers to compare classification outcomes, including random forest (RF) [37,38,39,40] and two kernel functions in the SVM [41,42,43,44,45,46] classifier: linear and RBF [42,47]. Table 3 outlines the test results.

According to Table 3, the RBF kernel achieved the lowest classification results and is clearly unsuitable for this task. Figure 6 depicts the classification results using linear SVM, in which the four categories of No injury, Minor injury, Moderate injury, and Serious injury are labeled 0, 1, 2, and 3, respectively. Three out of ninety No injury samples were misclassified as Minor injury, while seven out of eighty-four Minor injury samples were misclassified as No injury. Two samples were misclassified as Moderate injury, while four out of eighty-two Moderate injury samples were misclassified as Minor injury. Two out of forty-four Serious injury samples were incorrectly classified as Moderate injury. While the method obtains a precision of 98.48%, its accuracy is only 93.67%, and the remaining evaluation criteria are unimpressive.

As demonstrated in Table 3, the random forest outperformed the other classifiers in sensitivity, specificity, accuracy, precision, and F-score; its classification results on the test set are depicted in Figure 7. Two out of ninety No injury samples were misclassified as Minor injury, two out of eighty-four Minor injury samples were misclassified as No injury, three out of eighty-two Moderate injury samples were misclassified as Minor injury, and another three Moderate injury samples were misclassified as Serious injury. All 44 Serious injury samples were classified correctly.

This study’s technique effectively identifies human head injuries with a low rate of missed detection, achieving 96.67% accuracy and 99.02% precision. It also provides a specificity of 97.78% and an F-score of 97.58%, indicating that the procedure effectively distinguishes between injury levels and generates few false positives. The classification system thus classifies the four levels of head injury reliably.
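The reported figures can be cross-checked against the misclassification counts given for the random forest on the 300 test clips (90 No injury, 84 Minor, 82 Moderate, and 44 of the most severe class). The worked calculation below rests on one inferred convention that the text does not state explicitly: TP counts injured samples assigned their correct grade, FN counts injured samples assigned any other label, FP counts uninjured samples labeled as injured, and TN counts uninjured samples labeled correctly. Under this reading, the arithmetic reproduces the reported values, including the sensitivity of 96.19%.

```latex
\begin{aligned}
TP &= 82 + 76 + 44 = 202, \quad FN = 2 + 3 + 3 = 8, \quad FP = 2, \quad TN = 88 \\
\mathrm{Sensitivity} &= \frac{202}{202 + 8} \approx 96.19\% \\
\mathrm{Specificity} &= \frac{88}{88 + 2} \approx 97.78\% \\
\mathrm{Accuracy} &= \frac{202 + 88}{300} \approx 96.67\% \\
\mathrm{Precision} &= \frac{202}{202 + 2} \approx 99.02\% \\
\mathrm{F\text{-}Score} &= \frac{2 \times 202}{2 \times 202 + 2 + 8} \approx 97.58\%
\end{aligned}
```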

The random forest (RF) is highly effective for this classification task, and evaluating fall injuries through the measurement of head acceleration also proves effective.

4. Discussions

This research proposes a real-time assessment of head injury from falls based on HIC values. The detection procedure consists of two phases and aims to accurately detect toddler fall injuries in real time. Phase I begins with extracting the video frames. Joint data are obtained by processing surveillance video with Open Pose. The LSTM neural network and 3D transform model then integrate the spatial and temporal information of the key points across frames. In phase II, the head acceleration is derived and substituted into the HIC calculation. The experimental results demonstrate that the system has an accuracy of 96.67%, a sensitivity of 96.19%, and a specificity of 97.78%. The proposed detection system can be extended functionally and structurally to outdoor environments. For example, in community playgrounds, surveillance cameras near slides and climbing frames could monitor the occurrence of head injuries and issue a real-time alarm.

Table 4 compares the classification results of the proposed method to those of other classifiers, such as ANN [48,49], CNN [50], and STGCN+1D-CNN [21]. The proposed framework’s classification accuracy on the self-built dataset is 96.67%. By comparison, the other classifiers’ classification accuracy on the self-built dataset is 93.14%, 94.3%, 96.43%, and 95.8%, respectively. The classification accuracy of STGCN+1D-CNN [21] with URFD is 96.53%. Thus, the proposed method attains a slight advantage over the other classifiers.

Even though the proposed method produces good detection results, it has shortcomings.

As we are using home environment monitoring, some interior furnishings obstruct the camera angles, resulting in insufficient data due to the absence of key joints. We will optimize the ability of our framework to repair small-size toddlers’ skeleton joints.
Since our framework needs to be uploaded to the server for processing, we should also perform real-time optimization to increase the model’s inference efficiency. Throughout each iteration, the convergence rate of the optimization techniques should be improved.
Our framework does not allow us to deduce how the stiffness of the ground affects the severity of injury. Identifying ground hardness through AI-based visual detection, and quantifying its effect on the injury value, is left for future work.
In practice, it is necessary to determine whether a fall has occurred before assessing the presence of a head injury. A vision-based fall detection module therefore needs to be added to the framework in the future.

5. Conclusions

This study demonstrates the feasibility of a real-time AI technique for assessing head injuries from falls through video monitoring, and it moves child injury detection research toward practical use. The detection procedure consists of two phases: key point repair with 3D transformation, followed by an HIC-based injury classifier. In phase I, the LSTM neural network and 3D transform model integrate the spatial and temporal information of the key points across frames and repair the coordinates of missing skeletal points. In phase II, the head acceleration is derived and substituted into the HIC calculation to build the classifier. Reliable video surveillance of falls in children’s daily lives is crucial to improving accuracy. Nearly 14.7% of the falls in our 300 video clips (test set) involving toddlers resulted in serious injuries requiring medical treatment, as confirmed by the parents. Compared with traditional head injury assessment methods, the system avoids the discomfort that wearable sensors cause toddlers and addresses the limited injury detection accuracy of FEM models. The experimental results show an accuracy of 96.67%, a sensitivity of 96.19%, and a specificity of 97.78%. Our solution therefore offers relatively high detection accuracy for toddler fall-related injuries and substantial application value.
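For readers unfamiliar with the HIC step in phase II, the following sketch evaluates the standard Head Injury Criterion on a sampled resultant acceleration: the maximum, over all time windows [t1, t2], of (t2 - t1) times the mean acceleration in that window raised to the power 2.5, with acceleration in g and time in seconds. The function name `hic`, the uniform sampling interval `dt`, and the 36 ms window limit are assumptions made for illustration; this section does not state which window or sampling rate the authors used.

```python
import numpy as np


def hic(accel_g, dt, max_window=0.036):
    """Head Injury Criterion of a uniformly sampled acceleration trace.

    accel_g:    resultant head acceleration samples, in g.
    dt:         sampling interval in seconds.
    max_window: longest window (t2 - t1) considered, in seconds (assumed).
    """
    a = np.asarray(accel_g, dtype=float)
    n = len(a)
    # Cumulative trapezoidal integral of a(t); integral[k] covers samples 0..k.
    integral = np.concatenate(([0.0], np.cumsum((a[1:] + a[:-1]) * 0.5 * dt)))
    max_steps = max(1, int(round(max_window / dt)))
    best = 0.0
    for i in range(n - 1):
        for j in range(i + 1, min(n, i + max_steps + 1)):
            duration = (j - i) * dt
            mean_a = (integral[j] - integral[i]) / duration
            if mean_a > 0:
                best = max(best, duration * mean_a ** 2.5)
    return best
```

The brute-force search over window pairs is quadratic in the window length but is adequate for short clips; at typical surveillance frame rates, the window limit would have to be chosen with the frame interval in mind.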

Author Contributions

Conceptualization, methodology, writing—review and editing, supervision, Z.Y.; data curation, formal analysis, and investigation, writing—original draft preparation, B.T.; supervision, Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been co-funded by the National Key Research and Development Program (2017YFD0601104).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in this study.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

RGB	Red, green, blue
RNN	Recurrent Neural Network
LSTM	Long short-term memory
GRU	Gated Recurrent Unit
STGCN	Spatio-Temporal Graph Convolutional Network
CNN	Convolutional neural network
FEM	Finite element method
WSM	Wearable sensor-based method
DANN	Deep Adversarial Neural Network
HIC	Head Injury Criterion
SIMG	Sensor-laden headbands
ReLU	Rectified Linear Unit
SVM	Support vector machine
RBF	Radial basis function
RF	Random forest
DT	Decision tree

References

1. Brown, C.W.; Akbar, S.P.; Cooper, J.G. Things that go bump in the day or night: The aetiology of infant head injuries presenting to a Scottish Paediatric Emergency Department. Eur. J. Emerg. Med. 2014, 21, 447–450.
2. Clayton, J.L.; Harris, M.B.; Weintraub, S.L.; Marr, A.B.; Timmer, J.; Stuke, L.E.; Mcswain, N.E.; Duchesne, J.C.; Hunt, J.P. Risk factors for cervical spine injury. Injury 2012, 43, 431–435.
3. Crowe, L.M.; Catroppa, C.; Anderson, V.; Babl, F.E. Head injuries in children under 3 years. Inj.-Int. J. Care Inj. 2012, 43, 2141–2145.
4. Yang, R.T.; Li, Z.; Li, Z.B. Maxillofacial Injuries in Infants and Preschools: A 2.5-Year Study. J. Craniofacial Surg. 2014, 25, 964–967.
5. Zhou, Y.; Wang, X.B.; Kan, S.L.; Ning, G.Z.; Li, Y.L.; Yang, B.; Li, Y.; Sun, J.C.; Feng, S.Q. Traumatic spinal cord injury in Tianjin, China: A single-center report of 354 cases. Spinal Cord 2016, 54, 670–674.
6. National Center for Chronic and Noncommunicable Disease Control and Prevention, China CDC. Review of Chinese Children and Adolescents; People’s Health Electronic Audio and Video Publishing House: Beijing, China, 2018; pp. 2–3.
7. Flavin, M.P.; Dostaler, S.M.; Simpson, K.; Brison, R.J.; Pickett, W. Stages of development and injury patterns in the early years: A population-based analysis. BMC Public Health 2006, 6, 10.
8. Bertocci, G.; Smalley, C.; Brown, N.; Dsouza, R.; Hilt, B.; Thompson, A.; Bertocci, K.; McKinsey, K.; Cory, D.; Pierce, M.C. Head biomechanics of video recorded falls involving children in a childcare setting. Sci. Rep. 2022, 12, 13.
9. Kakara, H.; Nishida, Y.; Yoon, S.M.; Miyazaki, Y.; Koizumi, Y.; Mizoguchi, H.; Yamanaka, T. Development of childhood fall motion database and browser based on behavior measurements. Accid. Anal. Prev. 2013, 59, 432–442.
10. He, J.Y.; Yan, J.W.; Margulies, S.; Coats, B.; Spear, A.D. An adaptive-remeshing framework to predict impact-induced skull fracture in infants. Biomech. Model. Mechanobiol. 2020, 19, 1595–1605.
11. Hu, J.; Li, Z.; Zhang, J. Development and Preliminary Validation of a Parametric Pediatric Head Finite Element Model for Population-Based Impact Simulations. In Proceedings of the ASME Summer Bioengineering Conference, Farmington, PA, USA, 22–25 June 2011.
12. Thompson, A.; Bertocci, G. Pediatric bed fall computer simulation model: Parametric sensitivity analysis. Med. Eng. Phys. 2014, 36, 110–118.
13. Fahlstedt, M.; Kleiven, S.; Li, X.G. Current playground surface test standards underestimate brain injury risk for children. J. Biomech. 2019, 89, 1–10.
14. D’Orazio, T.; Marani, R.; Reno, V.; Cicirelli, G. Recent trends in gesture recognition: How depth data has improved classical approaches. Image Vis. Comput. 2016, 52, 56–72.
15. Uddin, M.K.; Bhuiyan, A.; Bappee, F.K.; Islam, M.M.; Hasan, M. Person Re-Identification with RGB-D and RGB-IR Sensors: A Comprehensive Survey. Sensors 2023, 23, 29.
16. Shafay, M.; Ahmad, R.W.; Salah, K.; Yaqoob, I.; Jayaraman, R.; Omar, M. Blockchain for deep learning: Review and open challenges. Clust. Comput.-J. Netw. Softw. Tools Appl. 2023, 26, 197–221.
17. Shoaib, M.; Bosch, S.; Incel, O.D.; Scholten, H.; Havinga, P.J.M. Fusion of Smartphone Motion Sensors for Physical Activity Recognition. Sensors 2014, 14, 10146–10176.
18. Yang, C.G.; Chen, C.Z.; He, W.; Cui, R.X.; Li, Z.J. Robot Learning System Based on Adaptive Neural Control and Dynamic Movement Primitives. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 777–787.
19. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
20. Wu, W.W.; Tu, F.B.; Niu, M.Q.; Yue, Z.H.; Liu, L.B.; Wei, S.J.; Li, X.Y.; Hu, Y.; Yin, S.Y. STAR: An STGCN ARchitecture for Skeleton-Based Human Action Recognition. IEEE Trans. Circuits Syst. I-Regul. Pap. 2023, 70, 2370–2383.
21. Amsaprabhaa, M.; Jane, Y.N.; Nehemiah, H.K. Multimodal spatiotemporal skeletal kinematic gait feature fusion for vision-based fall detection. Expert Syst. Appl. 2023, 212, 15.
22. Geun, S.B.; Ho, K.U.; Woo, L.S.; Young, Y.J.; Wongyum, K. Fall Detection Based on 2-Stacked Bi-LSTM and Human-Skeleton Keypoints of RGBD Camera. KIPS Trans. Softw. Data Eng. 2021, 10, 491–500.
23. Han, K.; Yang, Q.Q.; Huang, Z.F. A Two-Stage Fall Recognition Algorithm Based on Human Posture Features. Sensors 2020, 20, 21.
24. Lin, C.B.; Dong, Z.; Kuan, W.K.; Huang, Y.F. A Framework for Fall Detection Based on OpenPose Skeleton and LSTM/GRU Models. Appl. Sci. 2020, 11, 329.
25. Anh, L.H.; Nguyen, P.T.L.; Vu, N.A. Effects of Impact Location, Impact Angle and Impact Speed on Head Injury Risk of Vietnamese Pedestrian Hit by a Sedan. Int. J. Automot. Technol. 2023, 24, 411–420.
26. Zhou, C.M.; Huang, T.; Luo, X.; Kaner, J.; Fu, X.M. Cluster analysis of kitchen cabinet operation posture based on OpenPose technology. Int. J. Ind. Ergon. 2022, 91, 12.
27. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
28. Shafiq, M.; Gu, Z.Q. Deep Residual Learning for Image Recognition: A Survey. Appl. Sci. 2022, 12, 43.
29. Mellander, H. HIC-the head injury criterion. Practical significance for the automotive industry. Acta Neurochir. Suppl. 1986, 36, 18–20.
30. Long, K.J.; Gao, Z.B.; Yuan, Q.; Xiang, W.; Hao, W. Safety evaluation for roadside crashes by vehicle-object collision simulation. Adv. Mech. Eng. 2018, 10, 12.
31. Wang, F.; Wang, Z.; Hu, L.; Xu, H.Z.; Yu, C.; Li, F. Evaluation of Head Injury Criteria for Injury Prediction Effectiveness: Computational Reconstruction of Real-World Vulnerable Road User Impact Accidents. Front. Bioeng. Biotechnol. 2021, 9, 16.
32. Kimpara, H.; Iwamoto, M. Mild Traumatic Brain Injury Predictors Based on Angular Accelerations During Impacts. Ann. Biomed. Eng. 2012, 40, 114–126.
33. Thompson, A.K.; Bertocci, G.E. Paediatric bed fall computer simulation model development and validation. Comput. Methods Biomech. Biomed. Eng. 2013, 16, 592–601.
34. Kendrick, D.; Maula, A.; Reading, R.; Hindmarch, P.; Coupland, C.; Watson, M.; Hayes, M.; Deave, T. Risk and Protective Factors for Falls From Furniture in Young Children Multicenter Case-Control Study. JAMA Pediatr. 2015, 169, 145–153.
35. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. J. Photogramm. Remote Sens. 2016, 114, 24–31.
36. Rodriguez-Galiano, V.F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J.P. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS-J. Photogramm. Remote Sens. 2012, 67, 93–104.
37. Liaw, A.; Wiener, M. Classification and Regression by randomForest. R News 2002, 23, 18–22.
38. Jiang, P.; Lu, H.X.; Liu, Z.B. Drugs Identification Using Near-Infrared Spectroscopy Based on Random Forest and CatBoost. Spectrosc. Spectr. Anal. 2022, 42, 2148–2155.
39. Chen, F.X.; Yang, T.W.; Li, J.Q.; Liu, H.G.; Fan, M.P.; Wang, Y.Z. Identification of Boletus Species Based on Discriminant Analysis of Partial Least Squares and Random Forest Algorithm. Spectrosc. Spectr. Anal. 2022, 42, 549–554.
40. Lin, X.W.; Wang, L.Q.; Zeng, Y.G.; Chen, Y.Z.; Wang, M.Y.; Zhong, J.P.; Wang, X.H.; Xiong, H.L.; Chen, Y. Random Forest Retinal Segmentation in OCT Images Based on Principal Component Analysis. Prog. Biochem. Biophys. 2021, 48, 336–343.
41. dos Santos, C.M.; Escobedo, J.F.; Teramoto, E.T.; da Silva, S. Assessment of ANN and SVM models for estimating normal direct irradiation (H-b). Energy Convers. Manag. 2016, 126, 826–836.
42. Li, X.; Gao, W.K.; Gu, L.X.; Gong, C.L.; Jing, Z.; Su, H. A cooperative radial basis function method for variable-fidelity surrogate modeling. Struct. Multidiscip. Optim. 2017, 56, 1077–1092.
43. Aburomman, A.A.; Reaz, M.B. A novel weighted support vector machines multiclass classifier based on differential evolution for intrusion detection systems. Inf. Sci. 2017, 414, 225–246.
44. Chauhan, V.K.; Dahiya, K.; Sharma, A. Problem formulations and solvers in linear SVM: A review. Artif. Intell. Rev. 2019, 52, 803–855.
45. Tian, Y.J.; Shi, Y.; Liu, X.H. Recent advances on support vector machines research. Technol. Econ. Dev. Econ. 2012, 18, 5–33.
46. Singla, M.; Shukla, K.K. Robust statistics-based support vector machine and its variants: A survey. Neural Comput. Appl. 2020, 32, 11173–11194.
47. Jiang, Q.H.; Zhu, L.L.; Shu, C.; Sekar, V. An efficient multilayer RBF neural network and its application to regression problems. Neural Comput. Appl. 2022, 34, 4133–4150.
48. Dusenberry, M.W.; Brown, C.K.; Brewer, K.L. Artificial neural networks: Predicting head CT findings in elderly patients presenting with minor head injury after a fall. Am. J. Emerg. Med. 2017, 35, 260–267.
49. Sinha, M.; Kennedy, C.S.; Ramundo, M.L. Artificial neural network predicts CT scan abnormalities in pediatric patients with closed head injury. J. Trauma-Inj. Infect. Crit. Care 2001, 50, 308–312.
50. Yhdego, H.; Li, J.; Morrison, S.; Audette, M.; Paolini, C.; Sarkar, M.; Okhravi, H. Towards musculoskeletal simulation-aware fall injury mitigation: Transfer learning with deep cnn for fall detection. In Proceedings of the Spring Simulation Conference (SpringSim), Tucson, AZ, USA, 29 April–2 May 2019.

Figure 1. General structure of child head injury assessment system.

Figure 2. General structure of assessment system using OpenPose and LSTM.

Figure 3. Map 2D points into 3D points.

Figure 4. Images in the self-built fall dataset of toddlers.

Figure 5. The change of acceleration in x, y, and z directions.

Figure 6. Test results of 300 samples (SVM).

Figure 7. Test results of 300 samples (RF) and Error Curves.

Table 1. Correspondence between HIC values and injury types.

HIC value	Injury type
250	Cerebral concussion
700	Serious injury probability: 5% (AIS = 4)
1000	Probability of a malignant skull fracture: 33%

Table 2. Classification of head injury severity.

HIC value (α)	Injury type
α < 250	No injury
250 < α < 700	Minor injury
700 < α < 1000	Moderate injury
1000 < α	Heavy injury
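The thresholds in Table 2 translate directly into a small rule-based labelling function, sketched below. The function name `classify_hic` is hypothetical, and because the table uses strict inequalities on both sides, the handling of values exactly equal to 250, 700, or 1000 is an assumption here (each boundary value is assigned to the more severe class).

```python
def classify_hic(hic_value):
    """Map an HIC value to the severity classes of Table 2.

    Boundary values (250, 700, 1000) fall into the more severe class;
    this tie-breaking rule is an assumption, not stated in the table.
    """
    if hic_value < 250:
        return "No injury"
    if hic_value < 700:
        return "Minor injury"
    if hic_value < 1000:
        return "Moderate injury"
    return "Heavy injury"
```

Labels produced this way could serve as targets for the learned classifiers compared in Table 3.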

Table 3. The experimental calculation results.

Classifier	Sensitivity	Specificity	Accuracy	Precision	F-Score
LibSVM	92.38%	96.67%	93.67%	98.48%	95.33%
RBF	91.43%	94.44%	92.33%	97.46%	94.35%
RF	96.19%	97.78%	96.67%	99.02%	97.58%
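The five measures in Table 3 all follow from a binary confusion matrix. The sketch below computes them from true/false positive and negative counts. The counts used in the usage example (202, 8, 88, 2) are not reported in the paper; they are back-calculated here as one set of integers that sums to the 300 test clips and reproduces the RF row, and should be read as an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, precision and F-score as fractions."""
    sensitivity = tp / (tp + fn)            # recall on the injury class
    specificity = tn / (tn + fp)            # recall on the no-injury class
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f_score": f_score,
    }


# Hypothetical counts consistent with the RF row of Table 3 (300 clips).
rf = binary_metrics(tp=202, fn=8, tn=88, fp=2)
for name, value in rf.items():
    print(f"{name}: {value * 100:.2f}%")
```

Reporting sensitivity and specificity alongside accuracy matters here because the two classes are unbalanced, so accuracy alone could hide missed injuries.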

Table 4. Comparison of the proposed algorithm with other methods.

Reference	Theme	Dataset	Approach	Accuracy
Dusenberry et al. [48]	predict head injury	self-built	ANN	93.14%
Sinha et al. [49]	predict head injury	self-built	ANN	94.3%
Yhdego et al. [50]	fall detection	self-built	Sensor data + CNN	96.43%
Amsaprabhaa et al. [21]	vision-based fall detection	URFD, self-built	STGCN+1D-CNN	96.53%, 95.8%
Ours	head injury assessment	self-built	LSTM+3D transform model	96.67%


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
