A Novel Feature Set Extraction Based on Accelerometer Sensor Data for Improving the Fall Detection System

Abstract: Falls are the second leading cause of injury deaths, especially among the elderly, according to WHO statistics, so many studies have sought to develop fall detection and warning systems. Many approaches based on wearable sensors, cameras, infrared sensors, radar, etc., have been proposed to detect falls efficiently. However, fall detection still faces many challenges due to noise and the lack of a clear definition of fall activities. To address this, this paper proposes a new way to extract 44 features based on the time domain, frequency domain, and Hjorth parameters. The effect of the proposed feature set has been evaluated on several classification algorithms, such as SVM, k-NN, ANN, J48, and RF. Our method achieves relatively high performance (F1-score) in detecting fall and non-fall activities, i.e., 95.23% (falls) and 99.11% (non-falls) for the MobiAct 2.0 dataset, and 96.16% (falls) and 99.90% (non-falls) for the UP-Fall dataset.


Introduction
Human activity recognition (HAR) has a wide range of applications, such as ambient assisted living, smart homes, rehabilitation, and health care. Human activities can be categorized into normal activities of daily living (ADL), such as sitting, standing, and walking, and abnormal activities such as falls [1,2]. Falls are the second leading cause of unintentional injury deaths worldwide [3]. Thus, fall detection and prevention play an important role in daily living assistance, especially for seniors, who face a higher risk when they fall. Consequently, many researchers have recently paid considerable attention to fall detection systems (FDS).
The performance of an FDS depends on the type of sensor used. Camera-based and wearable sensors are generally more common than ambient sensors. FDS based on ambient sensors give high false-positive rates due to environmental influences [4,5]. Camera-based FDS have a limited recognition space, and their detection performance is highly dependent on the lighting conditions of the environment. They are also expensive, computationally intensive, and slow to process data, and they may affect the privacy of users [4,5]. The advantages of FDS based on wearable sensors are that they are easy to deploy, allow continuous monitoring, and are unaffected by the environment. Implementations of FDS based on wearable sensors are generally inexpensive, highly portable, and low in power consumption [4]. In addition, wearable sensors support outdoor monitoring and make data collection easy. The smartphone can be used as a wearable because of its ubiquity. Smartphones have built-in sensors, such as accelerometers, magnetometers, and gyroscopes. Beyond that, the ability to connect and communicate is a clear advantage of smartphones. Many works have shown that it is possible to achieve high performance in fall detection when relying solely on accelerometer data [6–11].
Recent fall detection algorithms mostly belong to two main categories: machine learning-based methods and deep learning-based methods. Each approach has its advantages and disadvantages. For instance, the performance of machine learning methods largely depends on the proper selection of feature sets [2,6,12]. In contrast, deep learning methods depend on model building and optimization [7–9,13], and this latter approach usually requires a large amount of computational and storage resources.
In general, existing fall detection algorithms can also be classified into binary and multi-class detection methods. Binary detection methods focus on distinguishing falls from non-fall activities [6,7,9–11,14–16]. Fall activities show an abnormal and sudden change in their patterns, whereas non-fall activities (ADLs) usually have regular patterns. Thus, works that distinguish only falls from non-fall activities are likely to achieve high accuracy. In contrast, multi-class detection methods try to recognize specific fall activities (for example, falling forward or falling backward) and specific non-fall activities (for example, sitting, standing, jogging, going upstairs, or going downstairs) [2,8,12,13]. A classifier might confuse one activity with another because some of their patterns are similar. Therefore, the complexity of the classification problem increases as the number of labels grows.
Notably, because data for fall activities are difficult to collect, existing public datasets used in fall detection contain very little data representing fall activities, possibly less than 10% of the total data samples [1,2]. These datasets are therefore unbalanced. As a result, most existing works on multi-class activity detection achieve high performance in detecting specific non-fall activities but lower performance in detecting fall activities [2,8,12,13].
Given these challenges, we propose a new feature set based on accelerometer data to establish a fall detection framework. In addition to classifying fall and non-fall activities, this framework maintains high detection accuracy on unbalanced datasets. Two sets of raw data from different sources (the MobiAct dataset [1], obtained with a smartphone, and the UP-Fall dataset [2], obtained with an IMU device) are used to test the effectiveness of the proposed feature set. The results show that our proposed feature set gives better fall detection performance than previous works in both cases. Specifically, the F1-scores are 95.23% (fall) and 99.11% (non-fall) on MobiAct, and 96.16% (fall) and 99.90% (non-fall) on UP-Fall.
The rest of this article is organized as follows: related work is presented in Section 2. The details of the proposed method are described in Section 3. The experimental settings and results are shown in Section 4. Finally, Section 5 provides concluding remarks.

Related Works
Comparing the performance of FDSs is difficult because each study uses a different dataset [17,18]. Many authors state that their results cannot be compared directly with previous studies because of differences in data collection methods, target groups, and environmental settings [18,19]. Pannurat et al. [19] indicate that a factor affecting performance is the number of samples used to train the FDS. The results obtained from FDS studies based on accelerometer data are therefore difficult to compare unless the same dataset is used. To evaluate the effectiveness of the proposed feature set, we surveyed datasets that many research teams have used and that are relatively consistent with our research target (Table 1). Based on these characteristics, we selected two open-access datasets, MobiAct v2.0 and UP-Fall, for feature extraction and FDS construction.
Electronics 2022, 11, 1030
These two datasets were collected methodically and cover diverse conditions. The MobiAct v2.0 dataset was collected from the accelerometer sensor of a smartphone. During data collection, volunteers were allowed to choose the location and orientation of the phone at random, so the data are as close to real use as possible. The UP-Fall dataset was collected from an IMU (Inertial Measurement Unit) device. Both are inexpensive and common devices. We use two datasets with different collection methods to check the stability of the results and to reduce bias when evaluating system performance.
Many works on fall detection systems based on the MobiAct dataset use only the Accuracy (ACC) metric to evaluate the performance of the model [6,7,14]. Accuracy is a popular metric for evaluating classification models. However, with unbalanced datasets, the ACC metric is no longer suitable for evaluating model performance [8].
In the MobiAct and UP-Fall datasets, samples of non-fall activities account for more than 90% of the data, while samples of fall activities, which are the ones that must be detected correctly, account for less than 10%. Non-fall activities (ADLs) are generally easier to classify than fall activities [2,12]. Consequently, overall ACC results were very high even when the results for identifying fall activities were low [2,8,12,13]. The ACC measure alone is therefore not suitable for evaluating a fall detection model on an unbalanced dataset. In this case, metrics such as Precision, Sensitivity, Specificity, and F1-Score are preferred [20].
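The effect of class imbalance on accuracy can be illustrated with a small worked example. The sketch below uses hypothetical counts (900 non-fall windows and 100 fall windows, chosen only to mirror the roughly 90/10 split described above) and is not taken from either dataset; `accuracy_and_fall_f1` is an illustrative helper name.

```python
def accuracy_and_fall_f1(tp, tn, fp, fn):
    """Return (accuracy, F1 of the fall class), treating falls as the positive class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # F1 = 2TP / (2TP + FP + FN); set to 0 when the denominator is 0.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1

# Hypothetical test set: 900 non-fall windows and 100 fall windows.
# A degenerate classifier that labels every window as "non-fall":
acc_all_negative, f1_all_negative = accuracy_and_fall_f1(tp=0, tn=900, fp=0, fn=100)

# A classifier that finds 60 of the 100 falls and raises 10 false alarms:
acc_partial, f1_partial = accuracy_and_fall_f1(tp=60, tn=890, fp=10, fn=40)

print(f"all-negative: accuracy={acc_all_negative:.3f}, fall F1={f1_all_negative:.3f}")
print(f"partial:      accuracy={acc_partial:.3f}, fall F1={f1_partial:.3f}")
```

In this hypothetical case the classifier that never detects a fall still reaches 90% accuracy, while its F1-score for the fall class is zero, which is why per-class metrics are more informative here.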
Most recent works focus on using a binary classifier to distinguish fall from non-fall activities. Very few studies deal with the multi-class problem of detecting specific fall and non-fall activities [2,8,12,13]. As discussed, these latter methods must handle difficulties such as the imbalance of the datasets and the complexity of multi-class fall detection. Thus, they often achieve high performance in detecting specific non-fall activities but relatively lower performance in detecting specific fall activities. Le et al. [15] propose a new feature set to reduce the number of input attributes based on the APGWO model. They used three types of data included in the UP-Fall dataset to train the model. Various classifiers, such as Logistic Regression (LR), K-Nearest Neighbor (k-NN), Support Vector Machine (SVM), Decision Tree (DT), Naïve Bayes (NB), Random Forest (RF), Multilayer Perceptron (MLP), and the proposed model (APGWO), were used to evaluate the effectiveness of the proposed feature set. The best Accuracy and F1-score are 99% and 96%, respectively. Although Le et al. [15] proposed a model based on binary classification, the F1-score is not impressive. In addition, this study combines many different types of sensor data, which makes it difficult to apply in practice.
Diponkor Bala and G. M. Waliullah [21] extracted time- and frequency-domain features of four activities, Stand (STD), Walk (WAL), Jog (JOG), and Jump (JUM), from the MobiAct v2.0 dataset. They then reduced the data dimensionality using PCA and Fisher's LDA. The best accuracy they obtained was 99.305% using the k-NN classifier. The number of samples of the two activities STD and WAL is much larger than that of JOG and JUM, so the data can be considered unbalanced. Therefore, the Accuracy measure does not accurately reflect the effectiveness of the proposed feature set. With unbalanced data, it is necessary to consider additional measures, such as Precision, Sensitivity, and F1-Score, to obtain more objective results. In addition, the four activities that Bala and Waliullah [21] selected are cyclic and less complicated, so they are easier to identify. If this model were applied to easily confused activities, such as going upstairs, going downstairs, standing to sitting, or sitting to standing, or to activities with sudden changes such as falls, it would not be effective.
Panhwar et al. [22] presented a fall detection method using a smartphone's inertial sensors: a three-axis accelerometer, a three-axis gyroscope, and orientation data. They used both SVM and ANN for testing. The neural network-based model gives the best overall classification results, with an accuracy of 96.07%. However, Panhwar et al. [22] selected only a subset of the MobiAct dataset, consisting of two fall activities (FOL, BSC) and three ADLs (STD, WAL, and SIT), to build the model. The data for these three daily activities far outweigh the data for the two falls. Therefore, the obtained results do not accurately reflect the fall detection ability of the model.
Shi et al. [14] built a convolutional neural network (CNN) with class activation mapping (CAM). They used fall data and ADLs, including standing (STD), walking (WAL), jumping (JUM), and jogging (JOG), from the MobiAct dataset to train the model. Short segments around the frame that initiates the fall phase, together with short segments of ADLs, were used as training samples. Their model achieved a detection accuracy of 95.55%. However, their proposed method could not give promising results in cases such as a fall [2,6–12,14,23].
Martínez-Villaseñor et al. [2] introduced the UP-Fall Detection dataset with various sources of data, such as IMU, EEG, and cameras. The MLP classifier achieved the best Accuracy (95.0%). However, its precision, sensitivity, and F1-score are 77.7%, 69.9%, and 72.8%, respectively, which are much lower than the Accuracy. This model therefore has relatively low recognition of falls, which suggests that their proposed feature set and model are not well suited to multi-class classification on an unbalanced dataset.
The proper selection of feature sets critically affects the accuracy of the detection system. The ability to recognize fall activities is highly dependent on the feature set extracted from the raw data [24–27]. Table 1 also shows that a large number of features is required in the multi-class problem to obtain a prediction accuracy higher than 94% on both the MobiAct and UP-Fall datasets. High prediction performance with fewer than 40 features is obtained only in simpler problems modeled as binary classification. Chatzaki et al. [12] used 39 features in the time and frequency domains to obtain an average prediction accuracy of 96.8%; however, the highest accuracy of fall detection did not exceed 84%. Therefore, our work aims to investigate an efficient framework with low computational cost for fall detection in the context of a complex multi-class problem.

The Proposed Framework for Fall Detection
A typical FDS is shown in Figure 1; it consists of three main components: data capturing, fall detection, and communication. The data-capturing component, which can be a smartphone or a wearable device, retrieves real-time data from an accelerometer or inertial sensors. The data are then processed by the fall detection component to detect and recognize falls. When a fall is detected, the communication component transmits alert signals to family members and/or to a hospital or caregiver through a wireless communication channel. The fall detection component is the most important in the system, as it requires high accuracy, reliability, and real-time processing ability.
From the survey in Section 2, there are several approaches to recognizing falls among daily activities, such as threshold-based, machine learning-based, and deep learning-based approaches. Recently, machine learning and deep learning algorithms have been widely used because of their high accuracy [8,13,14,28,29]. Hence, in this work, we investigate various machine-learning techniques, such as k-NN, J48, SVM, ANN, and RF, for detecting daily activities as well as falls.
Daily activities, including abnormal behaviors such as falls, are commonly recognized within a Human Activity Recognition (HAR) framework, as described in Figure 2. In the following section, we present our proposed approach to data preprocessing.

Data Preprocessing
In the HAR framework, data preprocessing consists of filtering, sliding-window segmentation, and feature extraction.

Filtering Technique
The signal describing human activity obtained from the accelerometer consists of three basic components: motion, gravity, and noise [30]. To build a highly accurate model for detecting daily human activities, it is necessary to separate these three components in the signal received from the accelerometer [31]. Various types of filters can be used to remove noise from accelerometer data. Depending on the data type and the machine learning model, a low-pass, high-pass, band-pass, or band-stop filter is used. Low-pass filters are often used to remove high-frequency noise [32,33], eliminate misplacement [34], and separate out the gravity component [35,36]. High-pass and band-pass filters are often used to separate the acceleration signal from the gravitational force and noise with which it is mixed [31,37–41]. These filtering steps also make feature extraction for activity recognition more effective.
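As a minimal sketch of the separation described above, the following code splits one accelerometer axis into a slowly varying gravity estimate and a motion residual using a first-order low-pass filter. The filter type and the smoothing factor `alpha` are assumptions made for illustration; the text does not specify which filter or cutoff is used, and `split_gravity_and_motion` is a hypothetical helper name.

```python
def split_gravity_and_motion(samples, alpha=0.9):
    """Split one accelerometer axis into (gravity, motion) estimates.

    A first-order low-pass filter tracks the slowly varying gravity
    component; subtracting it from the raw sample (a high-pass step)
    leaves the motion component. `alpha` is an assumed smoothing factor.
    """
    gravity, motion = [], []
    estimate = samples[0] if samples else 0.0
    for value in samples:
        # Exponential smoothing: keep most of the previous estimate.
        estimate = alpha * estimate + (1.0 - alpha) * value
        gravity.append(estimate)
        motion.append(value - estimate)
    return gravity, motion

# Illustrative axis: device at rest (9.81 m/s^2) with one sudden spike.
axis = [9.81] * 5 + [15.0] + [9.81] * 5
gravity, motion = split_gravity_and_motion(axis)
```

A larger `alpha` makes the gravity estimate smoother but slower to follow changes in device orientation; in practice a filter with an explicit cutoff frequency would usually be chosen against the sampling rate of the dataset.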

Windowing Technique (Sliding Window)
After noise filtering, the resulting acceleration data are split into smaller segments of a predefined size. Each segment is called a data window. A sliding window with a specific overlap ratio is used to avoid losing data samples, which can degrade recognition accuracy. Each type of activity has different properties, so the appropriate window size can also differ between activity classes. Short windows help detect fast activities such as falls, but may lose the periodic characteristics of some activities. Long windows capture enough information about an activity, but may also increase noise interference.
Therefore, finding a window size that effectively detects both ADLs and falls is always challenging. It is necessary to evaluate the effect of the sliding-window parameters on recognition performance in order to select the best values for each dataset.
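The windowing step can be sketched as follows, using the 256-sample window and 80% overlap reported in the experiments of this paper as defaults. Rounding the step to a whole number of samples and dropping the incomplete trailing window are assumptions of this sketch, and `sliding_windows` is an illustrative name.

```python
def sliding_windows(samples, window_size=256, overlap=0.8):
    """Return consecutive windows of `window_size` samples with fractional `overlap`.

    Trailing samples that do not fill a whole window are dropped.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # 256 samples with 80% overlap gives a step of about 51 samples.
    step = max(1, round(window_size * (1.0 - overlap)))
    last_start = len(samples) - window_size
    return [samples[start:start + window_size]
            for start in range(0, last_start + 1, step)]

signal = list(range(1000))        # stand-in for one filtered accelerometer axis
windows = sliding_windows(signal)
print(len(windows), len(windows[0]))
```

At a sampling rate of 100 Hz, a 256-sample window covers about 2.56 s of signal, and consecutive windows start roughly 0.5 s apart.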

The Proposed Feature Extraction Method (Feature Extraction)
Based on successful activity detection models [42], we compute similar sets of features in each data window. The feature sets are divided into three categories: time-domain features (T), frequency-domain features (F), and Hjorth parameters (H) [43]. Because the Hjorth parameters provide useful information in both the time and frequency domains, they have mostly been used in analyzing biomedical signals such as ECG and EEG [43,44]. Therefore, these parameters are proposed as additional features in our work. The set H contains only 3 features because only the rms component is used in the computation. Table 2 summarizes the feature sets extracted in our study. The formulas to compute these features are described in detail in [42].
The feature set plays an important role; the selection of suitable feature sets significantly affects the performance of the activity detection model. For identifying ADLs and falls, it is necessary to combine features in the time domain, the frequency domain, and the Hjorth parameters. Combinations of features from different domains also need to be investigated to determine an effective feature set for ADL recognition and fall detection.
Note: a_x, a_y, a_z are the acceleration signals on the x, y, and z axes, respectively; ϕ is the rotation angle of the accelerometer about the x-axis; θ is the inclination angle of the accelerometer about the y-axis; a_rms is the root-mean-square acceleration of the signal.
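For illustration, the three Hjorth parameters (activity, mobility, and complexity) of a data window can be computed as in the sketch below, using their standard definitions based on the variance of the signal and of its first and second differences. The per-sample definition of a_rms in `rms_acceleration` is an assumption, since the exact formula is given only in [42]; both function names are illustrative.

```python
import math
from statistics import pvariance

def rms_acceleration(ax, ay, az):
    """Per-sample RMS over the three axes (an assumed definition of a_rms)."""
    return [math.sqrt((x * x + y * y + z * z) / 3.0)
            for x, y, z in zip(ax, ay, az)]

def _diff(signal):
    """First-order difference, a discrete stand-in for the derivative."""
    return [b - a for a, b in zip(signal, signal[1:])]

def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a window with at least 3 samples."""
    d1 = _diff(signal)
    d2 = _diff(d1)
    var0, var1, var2 = pvariance(signal), pvariance(d1), pvariance(d2)
    activity = var0                                    # signal power
    mobility = math.sqrt(var1 / var0) if var0 > 0 else 0.0
    mobility_d1 = math.sqrt(var2 / var1) if var1 > 0 else 0.0
    # Complexity compares the mobility of the derivative with that of the signal.
    complexity = mobility_d1 / mobility if mobility > 0 else 0.0
    return activity, mobility, complexity

# Example on a pure sine wave (10 full periods of 50 samples each).
window = [math.sin(2 * math.pi * n / 50) for n in range(500)]
activity, mobility, complexity = hjorth_parameters(window)
```

For a pure sinusoid the complexity is close to 1, so larger values indicate a window whose shape departs from a simple oscillation, as would be expected around an impact.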

Model Selections
Many classification algorithms have been applied to daily activity recognition as well as fall detection. Most HAR studies focus on proposing a combined classifier to improve pedestrian fall detection performance. In those studies, additional basic classifiers, such as ANN, k-NN, LSTM, SVM, RF, J48, and DT, are used to evaluate the effectiveness of the proposed model. Table 1 presents the best recognition model of some representative studies. In addition, many papers use basic classifiers and still achieve very good recognition results, such as the studies of Chatzaki et al. [12], Mahfuz et al. [6], Wu et al. [13], and Martínez-Villaseñor et al. [2].
This paper focuses on constructing a feature set that improves the performance of activity recognition and fall detection models. To evaluate the effectiveness of the proposed feature set, we select basic and common classifiers, the same ones used by the studies included in the performance comparisons in this paper: ANN, k-NN, J48, RF, and SVM. These are common classifiers in HAR studies.

Model Evaluations
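As mentioned above, metrics such as Precision, Sensitivity, Specificity, and F1-Score are preferred for measuring the performance of fall detection methods because the datasets are unbalanced. They are defined as follows [20]:

\[ \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Sensitivity} = \frac{TP}{TP + FN} \]

\[ \mathrm{Specificity} = \frac{TN}{TN + FP}, \qquad \mathrm{F1\text{-}Score} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Sensitivity}}{\mathrm{Precision} + \mathrm{Sensitivity}} \]

where TP is the number of true positives (correct prediction of activity), TN is the number of true negatives (correct prediction of non-activity), FP is the number of false positives (incorrect prediction of activity), and FN is the number of false negatives (incorrect prediction of non-activity).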
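A minimal sketch of how these metrics follow from the four confusion-matrix counts is given below. The counts are hypothetical and serve only as a worked example; returning 0 when a denominator is zero is a convention assumed here, and `classification_metrics` is an illustrative name.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Precision, Sensitivity, Specificity, and F1-Score for one positive class."""
    def ratio(numerator, denominator):
        # Assumed convention: report 0 when the ratio is undefined.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)       # also called recall
    specificity = ratio(tn, tn + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts for the "fall" class: 100 falls, 900 non-falls.
metrics = classification_metrics(tp=90, tn=880, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In a multi-class setting, the same function can be applied once per activity by treating that activity as the positive class and all others as negative.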

Experimental Datasets
We use two public datasets, MobiAct 2.0 and UP-Fall, to investigate ADL and fall detection because of their popularity and large size. The datasets differ in the number of ADLs, the types of falls, and the way the sensor data were collected. MobiAct was collected from smartphone sensors [1], whereas UP-Fall was collected from wearable devices [2]. These datasets have been widely used in HAR studies. The details of the two datasets are presented in Table 3.
The MobiAct dataset was collected from the accelerometer, gyroscope, and orientation sensors of a Samsung Galaxy S3 smartphone. It includes 4 different types of falls and 12 different ADLs from 66 subjects [1]. The details of this dataset (version 2.0) are summarized in Table 4. Unlike previous studies [7,22,45], we use only accelerometer data to train the activity detection model to recognize all sixteen activities of the MobiAct dataset. The sampling frequency for all activities is around 85 Hz.

UP-Fall Detection Dataset
The UP-Fall Detection (UP-Fall) dataset [2] includes 11 activities (six daily activities and five different types of falls) performed by 17 healthy young adults and recorded using a multimodal approach, i.e., wearable sensors, ambient sensors, and vision devices. The details of this dataset are described in Table 5. To collect data for the UP-Fall dataset, Martínez-Villaseñor et al. [2] used five Mbientlab MetaSensor wearable sensors collecting raw data from the 3-axis accelerometer, the 3-axis gyroscope, and the ambient light value. In addition, one NeuroSky MindWave electroencephalograph (EEG) headset was used to measure the raw brainwave signal from its single EEG channel sensor located at the forehead. They installed six infrared sensors as a grid 0.40 m above the floor of the room to measure changes in the interruption of the optical devices. Lastly, two Microsoft LifeCam Cinema cameras were located 1.82 m above the floor, one for a lateral view and the other for a frontal view.
Nevertheless, in this paper, we use only the data obtained from the 3-axis accelerometer of the IMU device in the UP-Fall dataset. This device was placed in the right pocket of the volunteers' pants. The sampling frequency is standardized at 100 Hz for all actions.

Experiment Description
Feature selection significantly affects the performance of the activity recognition classifier. In addition to features extracted from the time and frequency domains, as in other studies, we extract further features from the Hjorth parameters. Experiments on different datasets show that adding the Hjorth features increases the performance of the fall detection system significantly. In this experiment, we study six feature sets to evaluate their influence on different classifiers in fall detection. These include the feature set of each separate domain and the feature sets combined from different domains, as shown in Table 6. As previous studies have shown, the time-domain feature set has advantages over the other domains in terms of the number of features and computational time, and it has more influence on the performance of activity detection [42,46]. Therefore, the time domain is chosen as the base set to combine with the frequency domain and the Hjorth parameters. The combined feature sets are time domain with frequency domain (TF), time domain with Hjorth parameters (TH), and time domain with frequency domain and Hjorth parameters (TFH). In this experiment, the raw accelerometer data of the MobiAct and UP-Fall datasets are split into data windows of 256 samples with 80% overlap. Six different feature sets are extracted from each data window, as shown in Table 6.
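The six feature sets can be viewed as concatenations of per-domain feature vectors. The sketch below shows one way to organize this; the dictionary keys, the helper name `build_feature_vector`, and the placeholder values are illustrative assumptions, and the real vectors would hold the features listed in Table 2.

```python
# Which domains make up each of the six feature sets (T, F, H, TF, TH, TFH).
FEATURE_SETS = {
    "T": ("time",),
    "F": ("frequency",),
    "H": ("hjorth",),
    "TF": ("time", "frequency"),
    "TH": ("time", "hjorth"),
    "TFH": ("time", "frequency", "hjorth"),
}

def build_feature_vector(domain_features, set_name):
    """Concatenate the per-domain feature lists that belong to `set_name`."""
    vector = []
    for domain in FEATURE_SETS[set_name]:
        vector.extend(domain_features[domain])
    return vector

# Placeholder per-domain features for a single data window.
domain_features = {
    "time": [0.1, 0.2, 0.3],
    "frequency": [1.0, 2.0],
    "hjorth": [0.5, 0.12, 1.0],
}
vectors = {name: build_feature_vector(domain_features, name)
           for name in FEATURE_SETS}
```

Extracting each domain once per window and concatenating afterwards means all six feature sets can be evaluated without repeating the feature computation.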
Five popular classifiers, i.e., Random Forest (RF), Artificial Neural Network (ANN), Decision Tree (J48), k-nearest neighbor (k-NN), and Support Vector Machine (SVM), have been used to evaluate the influence of feature sets on different classifiers as well as to select the best classifier for the fall detection system. The parameters of the classifiers have been set as shown in Table 7. The 10-fold cross-validation and the F-score metric are used to evaluate the performance of the classifiers.

Table 7. The parameters of experimented classification algorithms.

RF
Size of each bag P = 100; number of iterations I = 100; number of execution slots = 1; number of attributes to randomly investigate K = 0.

ANN
Learning rate for the backpropagation algorithm L = 0.3; momentum rate for the backpropagation algorithm M = 0.2; number of epochs to train through N = 500; percentage size of validation set to use to terminate training V = 0; the value used to seed the random number generator S = 0; the number of consecutive increases of error allowed for validation testing before training terminates, E = 20.
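The evaluation protocol described above, 10-fold cross-validation scored with the F-score, can be sketched as follows. This is a simplified illustration, not our evaluation code: the helper names are hypothetical, the folds are plain contiguous splits rather than stratified ones, and the weighted average simply weights each class F1 by its number of true samples.

```python
from collections import Counter

def f1_per_class(y_true, y_pred):
    """Per-class F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores

def weighted_f1(y_true, y_pred):
    """Average of per-class F1 weighted by class support (true counts)."""
    support = Counter(y_true)
    scores = f1_per_class(y_true, y_pred)
    return sum(scores[c] * n for c, n in support.items()) / len(y_true)

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Assumption: the samples were shuffled beforehand; no stratification.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

if __name__ == "__main__":
    # Toy labels (hypothetical), with falls as the minority class.
    y_true = ["fall", "fall", "adl", "adl", "adl", "adl"]
    y_pred = ["fall", "adl", "adl", "adl", "adl", "fall"]
    print(f1_per_class(y_true, y_pred))
    print(weighted_f1(y_true, y_pred))
```

Because fall windows are far rarer than ADL windows, the per-class F1 for falls is the more informative number; the weighted average is dominated by the majority classes.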
For the UP-Fall dataset, we focus on the data collected from the IMU's 3-axis accelerometer placed in the right trouser pocket. For the MobiAct dataset, we use only the data collected from the smartphone's 3-axis accelerometer mounted on the right waistband. The experimental results of evaluating the influence of these feature sets are shown in Figure 4, which reports the performance of the ADL and fall detection system for each classifier and feature set. The results show that the influence of the feature sets on the recognition performance is similar across all five classifiers. In both datasets, among the single-domain feature sets, the time-domain set gives the best recognition performance, while the Hjorth set, with only three features, gives the lowest. Combining feature sets improves the recognition performance significantly. In general, the TF feature set performs better than the TH feature set, except with the k-NN classifier on the MobiAct dataset and with the J48 and RF classifiers on the UP-Fall dataset.
Overall, the TFH feature set combining the features of three domains attains the best recognition performance in all classifiers.

Discussion
As shown in Figure 4, the RF algorithm is the best classifier in all cases for both datasets. With the TFH feature set, the RF classifier attains an accuracy of 98.3% and 99.3% for the MobiAct and UP-Fall datasets, respectively. The next-best classifiers are, in order, k-NN, J48, ANN, and SVM for the MobiAct dataset and J48, k-NN, ANN, and SVM for the UP-Fall dataset. Thus, the ranking of the classifiers depends slightly on the dataset and on the feature set. The results also show the decisive role of the time-domain feature set, which contributes the majority of the features. In the MobiAct dataset, the ranking in descending order of recognition performance is RF, k-NN, J48, ANN, and SVM for the combined feature sets and the T feature set. The corresponding ranking is RF, J48, k-NN, ANN, and SVM for the F feature set, and RF, J48, ANN, SVM, and k-NN for the H feature set. In the UP-Fall dataset, however, the ranking is RF, J48, k-NN, ANN, and SVM for all feature sets.
Although the influence of the feature sets on fall detection is similar to their influence on general activity recognition, as described above, the fall detection performance varies somewhat between feature sets (Figure 5). For the MobiAct dataset, the classifiers fall into two groups for every feature set except TFH: a lower-performance group consisting of ANN and SVM and a higher-performance group consisting of k-NN, J48, and RF. The J48 algorithm outperforms the k-NN algorithm for the F and H feature sets, but the order is reversed for the combined feature sets. For the UP-Fall dataset, there is an obvious difference in fall detection between the feature sets of different domains. The model using the time-domain feature set still attains the highest performance among the three separate domains. However, the models using the combined TF and TH sets do not improve on the model using only the T set; the best model is attained only with the combined TFH feature set. Note that there are only three Hjorth features, yet adding them to the feature sets considerably improves the fall detection model. In most classifiers, the TH feature set even performs better than the TF feature set. This improvement arises because the Hjorth parameters carry information related to the frequency domain, such as mean frequency and bandwidth, which helps to detect the fast transients of activities such as falls.

The recognition performance of specific activities using different classifiers is presented in Table 8 for the MobiAct dataset and in Table 9 for the UP-Fall dataset. In general, harmonic activities such as walking and jumping and idle activities such as standing and sitting are recognized much better than transitional activities such as falls and standing to sit. The classifier that achieves the best accuracy for a specific activity varies between the two datasets. For example, for the MobiAct dataset, BSC is the best-detected fall with the RF classifier, at an accuracy of 94%, whereas SDL is the best-detected fall with the other classifiers. In ADL recognition, the JUM activity obtains the best accuracy of 99.9% for all classifiers. The CHU and SCH activities are easily confused with falls, which degrades the fall detection performance. However, the RF classifier attains an accuracy above 93% in detecting these activities.
Similarly, for the UP-Fall dataset, the 01 (FH) fall activity obtains the best accuracy in fall detection for all classifiers, and the 06 (W) walking and 11 (L) laying activities obtain the best accuracy in ADL recognition for most classifiers.

In this section, we compare our model with recent works on the MobiAct dataset [1]. As mentioned above, many works use MobiAct to build fall detection models with only two classes of activities; therefore, the work of Chatzaki et al. [12] is selected for our comparison because both it and our model address the multi-class recognition problem. In this comparison, we set the model parameters, such as the sliding window size of 128 samples (equivalent to 1.5 s) with an 80% overlapping ratio, to match those in the research by Chatzaki et al. [12].
The comparison results are shown in Figure 6. In general, the detection performance of our proposed model is similar to that of Chatzaki's model [12] in terms of the weighted average F1-Score when using the same J48 or k-NN classifier. However, the proposed TFH combined feature set significantly improves the accuracy in recognizing fast-changing activities, including falls. For the J48 classifier, our model attains an accuracy higher than 80% for fast-changing activities, while Chatzaki's model attains at best an accuracy lower than 70% for them. Similarly, with the k-NN classifier, fall detection attains an accuracy higher than 90% in our model but lower than 84% in Chatzaki's model. With the proposed TFH combined feature set, activities that involve fast state changes and are difficult to distinguish, such as sitting to standing (CHU) and standing to sit (SCH), are identified better than in the research work by Chatzaki et al. [12]. The results in Figure 6 show that the proposed feature set is especially suitable for activities that happen quickly and change suddenly, such as falls (BSC, FKL, FOL, and SDL). That was our aim when constructing the feature set.
The identification results of 16 activities in the MobiAct v2.0 dataset are presented as confusion matrices in Figure 7. A closer look at Figure 7a (results of Chatzaki et al. [12]) and Figure 7b (our results) shows that sit to stand (chair up, CHU) is the most frequently confused activity. In the study by Chatzaki et al. [12], CHU is misclassified as stand to sit (SCH) at a rate of up to 22.8%. In our study, CHU is misclassified as car step out (CSO) at a rate of 7.02%. In addition, the rate of misclassifying front-knees-lying (FKL) as forward-lying (FOL), and vice versa, is higher than for other activities. Chatzaki et al. [12] noted that the correct recognition of falls is more problematic. Although their weighted accuracy reaches 96.8%, their detection accuracy for falls is very low, only 70.9% for FOL and at most 83.2% for BSC. Our results indicate a significant improvement in fall detection accuracy and a better balance of detection accuracy across all activities. In particular, the overall detection accuracy is higher than 96%, and the fall detection accuracy is 90.6% for FOL, 90.8% for FKL, and 91.2% and 94.7% for the other two fall types. The balanced accuracy in multi-class recognition of our model also demonstrates the importance of data cleaning, normalization, and feature extraction when using raw data.

The results used for the comparison with the study of Chatzaki et al. are not our best. Our method achieves its best F1-Score of 98.79% on the MobiAct dataset when the TFH feature set is extracted with a sliding window of 128 samples and an 80% overlap rate and an RF classifier is trained on it. Table 10 details this result.

Research Based on UP-Fall Dataset
Similarly, we have compared our model with recent works on the UP-Fall dataset under equivalent experimental conditions. Figure 8 shows the published results of Lai et al. [8] and the experimental results of our proposal. With the research focused on proposing a feature set for the fall detection system, our model achieves very high accuracy. The Accuracy, Precision, Sensitivity, Specificity, and F1-Score metrics have similar values; the lowest is 93.63% and the highest is 99.52%. This indicates that our model rarely produces false negatives or false positives. In the research of Lai et al. [8], the recognition performance differs greatly between falling and non-falling activities. Their ability to detect falls and an easily confused activity (P, picking up an object) is poor, reaching only 62.2% to 85.8%. Meanwhile, our method achieves at least 90.49%, including for the confusing activity (P, picking up an object). The recognition performance for fall activities in Lai's research [8], published in Pattern Recognition, was better than that in Villaseñor's research work [12]. However, with the feature set we propose, our model performs better than both studies, especially for fall activities. Figure 9 shows the recognition results for the UP-Fall dataset as a confusion matrix. The elements on the diagonal represent the proportion of each activity that is classified correctly. The off-diagonal elements are those mislabeled by the classifier. The higher the percentages on the diagonal of the confusion matrix, the better, as they indicate more correct predictions.
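For reference, the five metrics named above and the diagonal percentages of a confusion matrix can be computed as in the following minimal sketch. The counts in the example are invented for illustration and are not results from this study; the sketch also assumes that each row of the confusion matrix holds one true class, which may differ from the layout used in the figures.

```python
def binary_metrics(tp, fp, fn, tn):
    """Fall vs. non-fall metrics from the four confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # recall on falls
    specificity = tn / (tn + fp) if tn + fp else 0.0   # recall on non-falls
    denom = precision + sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
    }

def row_normalize(cm):
    """Convert counts to percentages per true class (rows = true class)."""
    out = []
    for row in cm:
        total = sum(row)
        out.append([100.0 * v / total if total else 0.0 for v in row])
    return out

if __name__ == "__main__":
    # Hypothetical counts: 100 fall windows and 900 non-fall windows.
    print(binary_metrics(tp=90, fp=5, fn=10, tn=895))
    # Hypothetical 2x2 confusion matrix of raw counts.
    print(row_normalize([[8, 2], [1, 9]]))
```

A low false-negative count raises sensitivity and a low false-positive count raises specificity and precision, so similar values across all five metrics mean that both kinds of error are rare.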
In the research work of Lai et al. [8], confusion matrix data appear to be split into two clusters grouped based on performed actions: fall and non-fall. Their model is capable of classifying non-fall activities almost perfectly. However, our model still gives better classification performance in this group of activities.
In the group of fall activities, the model of Lai et al. [8] has only modest recognition performance. For example, falling backward (column 3, FB) is correctly classified only 62.7% of the time. Our model gives an almost perfect performance in recognizing fall activities. It is even better than the model proposed in [8] when it comes to detecting non-falling activities.
With the proposed feature set, our model has good recognition ability for all activities in the UP-Fall dataset. Every activity is detected at a rate above 90%. In particular, the detection of activities such as Walking, Standing, Sitting, Jumping, and Laying achieves excellent performance, 99.8% or more. Table 11 summarizes the best detection results of each activity when using the UP-Fall dataset. Our method has a fall detection efficiency of over 96%.

Conclusions
Building an ML model for sensor-based fall detection is often fraught with difficulties due to unbalanced data, a lot of noise, and the variety of actions. To solve this problem, researchers have combined many different solutions to improve the detection performance of the model. In this paper, we propose a feature extraction method based on the time domain, the frequency domain, and the Hjorth parameters to build a set of 44 features from accelerometer data. We use two sets of accelerometer data with different collection methods (MobiAct v2.0 and UP-Fall) to evaluate the effectiveness of the proposed method. The proposed feature set has also been tested on five different classifiers (SVM, k-NN, J48, RF, and ANN) to confirm its superiority.
Our experimental results illustrate that a proper combination of features from different domains greatly improves the activity recognition performance of all classifiers in the context of fall detection among many ADL activities. In particular, the fall detection accuracy can be significantly improved even though the datasets are strongly unbalanced. The RF algorithm is the best classifier for fall detection in our model. Our method achieves comparably high performance on both datasets: 95.23% (falls), 99.11% (non-falls), and 98.79% (falls and non-falls) for the MobiAct dataset, and 96.16% (falls), 99.90% (non-falls), and 98.79% (falls and non-falls) for the UP-Fall dataset. These figures are presented in detail in Tables 10 and 11.
Our proposed method can be extended to detect abnormal activities other than falls among complex daily activities. Mobile applications for real-time fall detection and warning based on our model are feasible to implement because of its low consumption of computing resources.

Institutional Review Board Statement:
We use approved and publicly available datasets. Many previous studies have also used these datasets.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.