Article

Smartphone Sensor-Based Human Locomotion Surveillance System Using Multilayer Perceptron

1 Department of Computer Science, Air University, Islamabad 44000, Pakistan
2 Department of Computer Science and Software Engineering, Al Ain University, Al Ain 15551, United Arab Emirates
3 Department of Humanities and Social Science, Al Ain University, Al Ain 15551, United Arab Emirates
4 Department of Computer Science, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia
5 Department of Computer Engineering, Korea Polytechnic University, 237 Sangidaehak-ro, Siheung-si 15073, Gyeonggi-do, Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(5), 2550; https://doi.org/10.3390/app12052550
Submission received: 25 January 2022 / Revised: 21 February 2022 / Accepted: 24 February 2022 / Published: 28 February 2022


Featured Application

The proposed methodology is an application for people monitoring, tracking, and localization, evaluated over several challenging benchmark datasets. The technique can be applied in advanced surveillance security systems that help find targeted persons, track their functional movements, and observe their daily actions.

Abstract

Applied sensing technology has made it possible for human beings to experience a revolutionary aspect of the science and technology world. Among the many fields in which this technology is working wonders, human locomotion activity recognition, which finds applications in healthcare, smart homes, life-logging, and many other domains, is proving to be a landmark. The purpose of this study is to develop a novel model that can robustly handle divergent data acquired remotely from various sensors and accurately classify human locomotion activities. The biggest support for remotely sensed human locomotion activity recognition (RS-HLAR) is provided by modern smartphones. In this paper, we propose a robust model for RS-HLAR that is trained and tested on data extracted remotely from smartphone-embedded sensors. Initially, the system denoises the input data and then performs windowing and segmentation. The preprocessed data then go to the feature extraction module, where Parseval’s energy, skewness, kurtosis, Shannon entropy, and statistical features from the time domain and the frequency domain are extracted. Next, using Luca-measure fuzzy entropy (LFE) and Lukasiewicz similarity measure (LS)–based feature selection, the system drops the least-informative features and shrinks the feature set by 25%. The Yeo–Johnson power transform, a maximum-likelihood-based feature optimization algorithm, is then applied. The optimized feature set is forwarded to a multilayer perceptron (MLP) classifier that performs the classification. The MLP uses cross-validation for training and testing to generate reliable results. We designed our system while experimenting on three benchmark datasets, namely MobiAct_v2.0, Real-World HAR, and Real-Life HAR. The proposed model outperforms the existing state-of-the-art models by scoring a mean accuracy of 84.49% on MobiAct_v2.0, 94.16% on Real-World HAR, and 95.89% on Real-Life HAR. Although our system can accurately differentiate among similar activities, excessive noise in the data and complex activities have an adverse effect on its performance.

1. Introduction

With the evolution of the technological world, remote sensing has secured an indispensable role in multitudinous fields. It enables researchers to acquire huge amounts of data for designing smart applications and, in many cases, does not cause any disturbance in the environment. One remote application that has caught the attention of researchers is remotely sensed human locomotion activity recognition (RS-HLAR) [1,2,3,4,5,6,7]. Some examples of human locomotion activities are walking, sitting, jogging, jumping, climbing stairs, coming down the stairs, standing, and running. The accurate recognition of these activities can assist the design of a variety of applications related to smart homes [8,9,10,11], life logging [12,13,14,15,16,17], indoor localization [18], healthcare [19,20,21], rescue [22], fitness [23], surveillance [24], and entertainment [25]. As an exemplary scenario, consider that a person is in a huge and crowded shopping mall and their location is to be identified. This seemingly arduous task can be handled easily if we remotely track the person’s locomotion activities and record them with their timestamps.
When talking about remote devices, smartphones are second to none. Present-age smartphones have become very powerful by virtue of the sensors embedded in them [26]. Another quality that makes them eminent is their ubiquity. With these features, smartphone technology proves to be one of the best means of designing remote applications. The types of smartphone sensors that have been used for remotely sensed human locomotion activity recognition (RS-HLAR) include inertial sensors and proximity sensors [27,28]. A typical inertial sensor used for HLAR is the accelerometer, which measures acceleration along the x, y, and z axes [29,30,31]. However, when using smartphone-embedded sensors, the significant issue of the position and orientation of the smartphone arises, as every individual has a preference for carrying their smartphone in their hand, in a pocket, or in a bag. This issue is resolved by adding assistive sensors to the framework. An accelerometer combined with a gyroscope, which measures the angular velocity about the x, y, and z axes, and a magnetometer, which gives the magnetic field intensity in the x, y, and z directions, resolves the position and orientation issue and makes the system user-independent [32]. In addition to these, other assistive sensors can also be added to the system to enhance its accuracy.
Despite all these advances, several challenges remain in RS-HLAR. In the wireless transmission of sensor signals, there is a large possibility of noise intrusion and, in the worst case, complete distortion of the signal. In such cases, the identification and classification of the performed locomotion activity becomes burdensome. Another challenge is compound and complex locomotion activities, which are composed of two or more individual activities, e.g., a person falling while trying to sit on a chair. The proposed system tackles these challenges in an efficient way and produces better results than the available state-of-the-art (SOTA) methods.
In this article, we propose a comprehensive and robust model, Smartphone Sensors Remote Data Classification (SSRDC), for RS-HLAR in a remote sensing environment. We adopted three challenging datasets that provide human locomotion activity data depicting real life. Every dataset provides data for different locomotion activities using a distinct combination of smartphone sensors, including the accelerometer, gyroscope, magnetometer, global positioning system (GPS), light sensor, and microphone. We introduce an efficient blend of features, including Parseval’s energy, time-domain statistics, and frequency-domain statistics. We also implement feature selection using the Lukasiewicz similarity measure (LS) and Luca-measure fuzzy entropy (LFE), followed by the Yeo–Johnson power transform, a novel feature optimization technique. Finally, locomotion activity recognition is performed by a multilayer perceptron (MLP). The chief contributions of this paper are as follows:
  • We implement a feature selection framework based on Lukasiewicz similarity measure (LS) and Luca-measure fuzzy entropy (LFE). Quality features produce more accurate results.
  • To optimize the selected features, the Yeo–Johnson power transform is implemented, which brings the data into a sophisticated shape and consequently supports better classification.
  • We implement a multilayer perceptron (MLP) and tune its parameters for the final classification of human locomotion activities.
  • We introduce novel features to the field of RS-HLAR in the form of Parseval’s energy and auto-regressive coefficients, which have been popularly used in EEG signal and acoustic signal processing, respectively.
The rest of the article is organized as follows: Section 2 covers the related work in the field of RS-HLAR. Section 3 comprehensively covers the materials and methods used in this study. Section 4 shows the results for all the datasets used in the proposed SSRDC system and compares the system’s performance with the available state-of-the-art models. Section 5 discusses the results from a broader perspective and highlights the strengths and limitations of the system, while Section 6 concludes the research article and describes future plans.

2. Related Work

The recent advances in the recognition of remotely sensed human locomotion activities direct us toward very efficient and productive techniques. We conducted a comprehensive literature review to explore the remote sensing techniques previously adopted, so that we can develop a better and more effective model for the concerned research problem. In this section, we divide the literature review into two sub-sections, i.e., RS-HLAR using inertial sensors and RS-HLAR using proximity sensors.

2.1. RS-HLAR Using Inertial Sensors

Inertial sensors are held in high esteem when it comes to human locomotion activity recognition. As modern smartphones have very useful inertial sensors such as accelerometers, gyroscopes, and magnetometers embedded in them, RS-HLAR also benefits from them. Bashar et al. [33] extracted hand-crafted features from a smartphone-embedded gyroscope and accelerometer and selected the best among them using neighborhood component analysis. Then, they used a dense neural network for the classification of locomotion activities. Using smartphone accelerometer data, Shan et al. [34] extracted deep features from the temporal and spatial domains. Their locomotion activity classification system, namely C4M4BL, consisted of four convolutional layers, four max-pooling layers, and one bidirectional long short-term memory (LSTM) layer. Xie et al. [35] compared the performance of different kernels of a support vector machine (SVM) in recognizing locomotion activities such as climbing stairs, coming down the stairs, walking, and standing. They used a smartphone-embedded accelerometer, gyroscope, and magnetometer to extract various frequency- and time-domain features. After that, they employed a multiclass SVM in one-vs-all mode. Their 10-fold cross-validated results acknowledge the credibility of their system. In [36], the authors created a one-dimensional magnitude vector using instantaneous values of a tri-axial smartphone accelerometer. The magnitude vector was then forwarded to a one-dimensional convolutional neural network (CNN) that auto-generated features and performed the classification. Azmat et al. [37] used the inertial sensors of a smartphone to classify human locomotion activities such as jumping, walking, standing, and jogging. Initially, they split the activities into static and dynamic categories using a cross-correlation-based template matching approach. Then, they separately extracted time- and frequency-domain features for each branch (static and dynamic). Following this, a vector-quantization-based codebook was generated for data optimization. For dynamic activities, they employed a multiclass SVM classification model, while for static activities, they used a Gaussian naïve Bayes (GNB) classification algorithm.
Another approach, followed in [38], used a hybrid CNN-LSTM model with an attention mechanism. The smartphone sensors used in this system were an accelerometer, a gyroscope, and a magnetometer. In another work, a model combining a deep bidirectional long short-term memory (DBLSTM) and a CNN was proposed. Using an abstract approach, the DBLSTM serializes the smartphone accelerometer and gyroscope data and generates a bidirectional output vector. As the DBLSTM model is not well suited to feature extraction, that task is assigned to a CNN. Finally, a SoftMax function is implemented in the final layer of the network to perform the classification of the activities [39].

2.2. RS-HLAR Using Proximity Sensors

Another kind of sensor that has been used for RS-HLAR is the proximity sensor. These are electromagnetic-radiation-based sensors that can sense and recognize activities without any physical contact. Some examples of proximity sensors are acoustic sensors, infrared sensors, and radars. A lot of research has been done on RS-HLAR using proximity sensors. Guo et al. [40] used a two-dimensional array of acoustic sensors that consisted of 4 transmitters and 256 receivers. The transmitters sent ultrasonic sinusoidal waves, and the receivers received the reflected waves. In this way, samples were gathered in the form of tri-axial acoustic data from which frequency-domain and time-domain features were extracted. They then forwarded those features to a vanilla CNN for the classification of walking, sitting, falling, and standing. Tripathi et al. [41] used acoustic sensors to detect the locomotion activities of humans at bus stops and parks and generated perceptual features. They used an ensemble of one-class classifiers based on fuzzy rules. Additionally, they validated their method using real data and then compared the results with an SVM classifier. She et al. [42] proposed a human activity recognition model based on a micro-Doppler radar system. A data augmentation method was utilized that consisted of three major operations, i.e., frequency disturbance, time shift, and frequency shift. Radar data were recorded while targeting the subject performing locomotion, and then these data were converted into a spectrogram. The activity was recognized based on the change in speed and frequency. They used various deep architectures and compared their performances using their augmentation concept. The networks they experimented with include Kim and Moon’s architecture [43], Inception-ResNet [44], Jordan’s net [45], ResNet-18 [46], and CNN-RNN [47].
Li et al. [48] worked on radar-based human activity recognition. Following the traditional approach, they acquired radar signals from the targets and then generated spectrograms. Their major contribution was a transfer-learning-based semi-supervised algorithm consisting of two modules, i.e., supervised semantic transfer and unsupervised domain adaptation. They labeled only a small fraction of the spectrograms and achieved good results in the classification of a total of six locomotion activities.

3. Materials and Methods

Sensory data were remotely acquired from smartphones and then denoising was performed using a second-order Chebyshev type-I filter. After that, we windowed the input signal using a rectangular window of five seconds duration. By combining three windows in each segment, we segmented the data. Then we extracted features from the segmented data. As multiple sensors were used in this research, and some sensors had more than one channel, features were extracted for each channel of each sensor separately. Then by concatenating all the features, feature vectors were generated and placed in a single data frame. Considering the fact that garbage in leads to garbage out, we rejected the least-informative features using feature selection based on Luca-measure fuzzy entropy (LFE) and Lukasiewicz similarity measure (LS). By selecting the useful features, we reduced the size of the feature data frame by 25%. In the next step, we revamped the feature data frame to have a more sophisticated distribution by utilizing the Yeo–Johnson power transform. The architecture of the proposed system is shown in Figure 1.

3.1. Preprocessing

The very first step of data processing is denoising. For this purpose, we used a second-order Chebyshev type-I filter [49] with a cutoff frequency of 0.001. This filter rejects noise very well and provides a signal with a good signal-to-noise ratio (SNR) at the output. Figures 2–4 show the noisy and denoised signals for the x, y, and z channels of the magnetometer during the walking activity for all three datasets. A similar approach was followed to denoise the data obtained from the other sensors and activities. We chose the walking activity for plotting because it is common to all three datasets, thus supporting a good comparison among them. Figure 2a–c shows the results for the MobiAct_v2.0 magnetometer, Figure 3a–c shows the denoising results for the Real-World HAR magnetometer, and Figure 4a–c shows the denoising results for the Real-Life HAR magnetometer.
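For illustration, a minimal denoising sketch in Python using SciPy is given below. The filter order and cutoff follow the description above; the passband ripple value and the zero-phase (forward–backward) application are our assumptions, since the paper does not state them.

```python
from scipy.signal import cheby1, filtfilt

def denoise(channel, order=2, ripple_db=1.0, cutoff=0.001):
    """Low-pass a raw 1-D sensor channel with a Chebyshev type-I filter.

    `order` and `cutoff` follow Section 3.1 (cutoff interpreted as a
    normalized frequency); `ripple_db` and the use of filtfilt for
    zero-phase filtering are assumptions.
    """
    b, a = cheby1(order, ripple_db, cutoff, btype='low')
    return filtfilt(b, a, channel)

# Example: mag_x_clean = denoise(mag_x)  # mag_x: one magnetometer channel
```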

3.2. Windowing and Segmentation

In locomotion activities, there are repeating patterns. For example, if a person is walking, the basic step is “taking a step”, and then, this basic step repeats itself throughout the locomotion activity. Processing a signal’s windows where each window contains the basic pattern of the locomotion activity produces better results as compared to processing the signal as a whole. According to our experimentation, a five-second window worked in the best way to produce precise results. In addition to windowing, we also produced segments of the signal where each segment contained three windows of the input signal and in this way, covered the complete signal. Equation (1) shows the segmentation of the signal.
$$seg_p = \left\{ w_q,\ w_{q+1},\ w_{q+2} \right\}, \quad q < r - 2 \qquad (1)$$
where $seg_p$ is the $p$th segment and $w_q$ represents the $q$th window. Note that $q$ must be less than $r - 2$, where $r$ is the index of the last window of the signal [50].
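A minimal sketch of this windowing and segmentation step is shown below; non-overlapping windows and a unit segment stride are our assumptions, as the paper does not state the hop sizes.

```python
def window_and_segment(x, fs, win_sec=5, wins_per_seg=3):
    """Split a denoised 1-D signal into 5 s rectangular windows, then
    group three consecutive windows per segment as in Equation (1)."""
    win_len = int(win_sec * fs)
    r = len(x) // win_len - 1                      # index of the last window
    windows = [x[i * win_len:(i + 1) * win_len] for i in range(r + 1)]
    # seg_p = {w_q, w_{q+1}, w_{q+2}} with q < r - 2
    return [windows[q:q + wins_per_seg] for q in range(r - 2)]
```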

3.3. Feature Extraction

After denoising the data, we extracted a total of 15 features from it, some of which are listed along with their descriptions and formulations in Table 1.
Other features, including the maximum and minimum point difference and their ratio in the frequency domain, as well as the median, mode, and minimum and maximum points of the time and frequency domains [51,52,53,54,55,56,57,58,59,60,61,62,63], require a complete mathematical procedure for their computation, and therefore, we did not list them in Table 1. The features that are listed in Table 1 can be observed graphically in Figure 5.
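As an illustration, a hedged sketch of computing a few of these features per window follows; the histogram-based estimate of Shannon entropy (and its bin count) is an assumption, as the paper does not specify the estimator.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def extract_features(window):
    """Compute a subset of the Section 3.3 features for one window."""
    spectrum = np.abs(np.fft.rfft(window))
    # Parseval's energy: total signal energy, equal (up to normalization)
    # to the sum of squared FFT magnitudes by Parseval's theorem.
    parseval_energy = np.sum(window ** 2)
    # Shannon entropy from a histogram estimate of the amplitude
    # distribution (16 bins is an assumption).
    p, _ = np.histogram(window, bins=16)
    p = p[p > 0] / p.sum()
    return {
        'parseval_energy': parseval_energy,
        'skewness': skew(window),
        'kurtosis': kurtosis(window),
        'shannon_entropy': -np.sum(p * np.log2(p)),
        'fft_minmax_diff': spectrum.max() - spectrum.min(),
        'fft_minmax_ratio': spectrum.max() / (spectrum.min() + 1e-12),
    }
```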

3.4. Feature Selection

To select the best features from the extracted set, we apply Luca-measure fuzzy entropy (LFE) and Lukasiewicz similarity measure (LS)–based feature selection. It works in such a way that we provide a feature set $A = \{a_1, a_2, \ldots, a_n\}$ to the algorithm, and after the optimization of an objective function, it returns a subset $B = \{a_1, a_2, \ldots, a_m\}$ of $A$ with $m < n$. The parameters used for the algorithm were LFE and LS. LFE is given by Equation (2), where $\mu_A(x_j)$ takes a value in the range $[0, 1]$, and LS is defined by Equation (3), where $f_r$ indexes the $t$ features. Moreover, $x$ is the input and $v$ is the mean vector for a certain class [64].
$$H(A) = -\sum_{j=1}^{n} \left[ \mu_A(x_j) \log \mu_A(x_j) + \left(1 - \mu_A(x_j)\right) \log\left(1 - \mu_A(x_j)\right) \right] \qquad (2)$$
$$S(x, v) = \frac{1}{t} \sum_{f_r = 1}^{t} \left( 1 - \left| x_{f_r} - v_{f_r} \right| \right), \quad \text{for } x, v \in [0, 1] \qquad (3)$$
We experimented with different numbers of features, but we obtained the best results with a set of 11 features. In this way, we reduced the size of our feature set by approximately 25%. The 11 features selected using the mentioned feature selection algorithm were Parseval’s energy, skewness, kurtosis, Shannon entropy, AR coefficients, the minimum, maximum, mode, and median of the time-domain data, the FFT minimum and maximum point difference, and the FFT minimum and maximum point ratio. The feature selection flow diagram is presented in Figure 6.
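A hedged sketch of this selection step is given below. Treating lower fuzzy entropy of the class-wise Lukasiewicz similarities as "more informative", and min-max scaling each feature to [0, 1] beforehand, are our assumptions about the algorithm's details.

```python
import numpy as np

def luca_entropy(mu, eps=1e-12):
    """De Luca-Termini fuzzy entropy (normalized mean form) of
    membership values mu in (0, 1), as in Equation (2)."""
    mu = np.clip(mu, eps, 1 - eps)
    return -np.mean(mu * np.log(mu) + (1 - mu) * np.log(1 - mu))

def rank_features(X, y):
    """Rank features (columns of X, scaled to [0, 1]) by the LFE of their
    Lukasiewicz similarity to the per-class mean, lowest entropy first."""
    scores = []
    for f in range(X.shape[1]):
        sims = np.empty(len(X))
        for c in np.unique(y):
            v = X[y == c, f].mean()                       # class mean, Eq. (3)
            sims[y == c] = 1.0 - np.abs(X[y == c, f] - v)
        scores.append(luca_entropy(sims))                 # Eq. (2)
    return np.argsort(scores)

# Keeping the 11 best-ranked of the 15 extracted features:
# selected = rank_features(X, y)[:11]
```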

3.5. Feature Optimization

By concatenating the selected features, we designed a data frame and labeled each example so that classification could be performed. However, before performing the classification, the data must be optimized so that the classifier can distinguish among classes more efficiently. For optimization, we used the Yeo–Johnson power transform, which brings the data into a more sophisticated form obeying the formulation given in Equation (4), where $s$ is the input value and the value of $\alpha$ can be 0.5, 0, or 1. This transformation generated very efficient results and helped in boosting the accuracy of the system [65].
$$\varphi(\alpha, s) = \begin{cases} \dfrac{(s+1)^{\alpha} - 1}{\alpha} & \text{if } \alpha \neq 0,\ s \geq 0 \\ \log(s+1) & \text{if } \alpha = 0,\ s \geq 0 \\ -\dfrac{(-s+1)^{2-\alpha} - 1}{2-\alpha} & \text{if } \alpha \neq 2,\ s < 0 \\ -\log(-s+1) & \text{if } \alpha = 2,\ s < 0 \end{cases} \qquad (4)$$
Working on MobiAct_v2.0, Real-World HAR, and Real-Life HAR, we also show a comparison of the original feature vector for a specific class and the optimized feature vector resulting from the power transform in Figure 7a–c.
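In practice, this step can be reproduced with scikit-learn's PowerTransformer, which fits the exponent per feature by maximum likelihood, matching the maximum-likelihood-based optimization described above; using scikit-learn here is our assumption rather than the authors' stated tooling, and `X_sel` is a hypothetical name for the selected-feature frame.

```python
from sklearn.preprocessing import PowerTransformer

# Yeo-Johnson power transform over the selected feature frame.
# X_sel: (n_samples, 11) array of the selected features (hypothetical name).
pt = PowerTransformer(method='yeo-johnson', standardize=True)
X_opt = pt.fit_transform(X_sel)
print(pt.lambdas_)  # per-feature exponents estimated by maximum likelihood
```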

3.6. Classification

The final step of the proposed system was classification, which was performed by a multilayer perceptron (MLP). The MLP is a feed-forward neural network (FFNN) that has an input layer, one or more hidden layers, and an output layer. The mathematical representation [66] of an MLP neuron is given in Equation (5), where $Y_m$ is the output of the $m$th perceptron and $w_{mi}$ represents the weight applied to the $i$th input $x_i$. The bias of the $m$th perceptron is represented by $b_m$, $n$ is the number of inputs to the perceptron, and $f$ represents the activation function.
$$Y_m(x) = f\left( \sum_{i=1}^{n} w_{mi} x_i + b_m \right) \qquad (5)$$
The MLP takes features as input, multiplies them by the weights in the hidden layers, and then sends the weighted sums through an activation function, producing an output as a probability distribution over the classes. The class with the highest probability is declared the class of the input. The structure of an MLP is depicted in Figure 8.
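A minimal sketch of this classification stage with 10-fold cross-validation is shown below; the hidden-layer sizes, activation, and solver are assumptions, since the paper reports tuning the MLP's hyperparameters without listing the final values.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# X_opt: power-transformed feature frame; y: activity labels.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), activation='relu',
                    solver='adam', max_iter=500, random_state=0)
scores = cross_val_score(mlp, X_opt, y, cv=10)   # 10-fold cross-validation
print(f'mean accuracy: {scores.mean():.4f}')
```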

4. Experimental Results

The primary evaluation metric for our system was mean accuracy. To compute it, we adopted the 10-fold cross-validation technique so that the results are more general and dependable. All experimentation was performed on a Windows 10 system with 16 GB of RAM and an Intel Core i7-7500U CPU @ 2.70 GHz. After a brief description of the datasets, we present the results of four experiments performed on each dataset. In the first experiment, we generated receiver operating characteristic (ROC) curves for each class separately. In the second experiment, we compared the accuracies for the classification of individual classes with the help of confusion matrices. In the next experiment, we analyzed the proposed system’s performance with some other well-known classifiers. Finally, we compared our system with the available state-of-the-art techniques.

4.1. Datasets Description

The MobiAct_v2.0 dataset provides data from smartphone inertial sensors, i.e., accelerometers, gyroscopes, and orientation sensors. For this dataset, the orientation sensor is a software-based sensor that works as a magnetometer and provides the magnetic field intensity in the x, y, and z directions. There are 15 locomotion activities to be classified based on the provided sensor data, which are shown in Table 2. A notable point is that the subjects performing the activities are not consistent across activities; therefore, some classes have more samples than others.
The Real-World HAR dataset collects data from 15 subjects for a total of 8 locomotion activities, which are listed in Table 2. This dataset collects a large amount of data using six smartphone sensors, i.e., the accelerometer, gyroscope, magnetometer, GPS, light sensor, and microphone. Moreover, for each locomotion activity, the dataset uses seven smartphones simultaneously, attached to seven different positions on the subject’s body. For a single smartphone, the accelerometer, gyroscope, and magnetometer provide data from the following body positions: chest, forearm, head, shin, thigh, upper arm, and waist. However, the GPS, light sensor, and microphone do not provide data for the forearm. Thus, by combining the data coming from all the smartphones and all the sensors, we obtained a 39-dimensional vector for one example of a specific locomotion activity.
The third and last dataset used in this study was the Real-Life HAR dataset, which provides data from four sensors to classify four locomotion activities. This dataset was released at the end of 2020, making it the newest among the datasets used in this study. What makes this dataset extremely challenging is that the subjects were free to use their smartphones in any way they preferred. The sensors used to collect this dataset include an accelerometer, a gyroscope, a magnetometer, and a GPS. The respective locomotion activities for each dataset are provided in Table 2.

4.2. Experiment I: Receiver Operating Characteristic (ROC) Curves of SSRDC

The ROC curve shows the performance of the system in terms of the area covered under it. The larger the area under the curve, the better the performance, and vice versa. For the evaluation of the proposed SSRDC system, we plotted the ROC curves for every class of all the datasets our system deals with. Figure 9 depicts the ROC curves for MobiAct_v2.0.
Figure 9 shows that the average area under the ROC curve was 0.93, which is a very good figure for the performance of the proposed model on MobiAct_v2.0. According to Figure 9, the smallest area, 0.77, was covered by the “front-knees lying” class, and the largest area, 1.00, was achieved by the “sitting” and “car step-out” classes. Following a similar approach, we plotted the ROC curves for Real-World HAR, which are shown in Figure 10.
Figure 10 shows that the average area under the ROC curve was 0.95, which confirms the SSRDC system’s performance on Real-World HAR. According to Figure 10, the smallest area, 0.67, was covered by the “stairs-down” class, and the largest area, 1.00, was achieved by the “jumping”, “lying”, “sitting”, “standing”, and “walking” classes. The ROC curves for Real-Life HAR are given in Figure 11. According to Figure 11, the average area under the curves for Real-Life HAR was 0.96, the greatest value achieved by our system. The proposed SSRDC system performed best on Real-Life HAR, with a minimum area of 0.90 under the curve for the “active” class and a maximum area of 1.00, attained by both the “walking” and “driving” classes.
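For reference, a sketch of the per-class (one-vs-rest) ROC computation is given below, reusing the `mlp`, `X_opt`, and `y` names from the earlier sketches; obtaining out-of-fold class probabilities via cross_val_predict is our assumption about how the cross-validated curves were produced.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import cross_val_predict
from sklearn.preprocessing import label_binarize

# Out-of-fold class-membership probabilities from the tuned MLP.
probs = cross_val_predict(mlp, X_opt, y, cv=10, method='predict_proba')
classes = np.unique(y)
Y = label_binarize(y, classes=classes)   # one column per activity class
for k, name in enumerate(classes):
    fpr, tpr, _ = roc_curve(Y[:, k], probs[:, k])
    print(f'{name}: AUC = {auc(fpr, tpr):.2f}')
```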

4.3. Experiment II: Individual Locomotion Activity Classification Accuracies

One of the best options to find out how accurately the proposed system classifies an individual locomotion activity is the confusion matrix. We plotted the confusion matrices for all the concerned datasets. The confusion matrix for MobiAct_v2.0 is given in Figure 12.
Figure 12 shows that the lowest recognition accuracy was 70%, corresponding to the forward-lying activity, while the highest accuracy, 99%, was achieved by the car step-in activity. All other locomotion activities were also predicted with decent accuracies. Following a similar pattern, we analyzed the individual accuracies for Real-World HAR with the help of the confusion matrix given in Figure 13.
Considering Real-World HAR, the lowest classification accuracy, 73%, was for the sitting activity, while jumping and running were always predicted correctly, with a recognition accuracy of 100%. All other activities were also predicted with appreciable accuracy. Last but not least, the individual locomotion activity classification accuracies for the Real-Life HAR dataset are provided in the confusion matrix shown in Figure 14.
The Real-Life HAR confusion matrix shows that all four activities were classified with exceptional accuracy. The walking and active classes were predicted with 95% accuracy, while the driving and inactive classes were predicted 100% accurately.

4.4. Experiment III: Comparison with Well-Known Classifiers

In this experiment, we analyzed the performance of the proposed system while using two other well-known classifiers namely, K-nearest neighbors (KNN) and AdaBoost. After that, we compared the results with the original multilayer perceptron (MLP)–based system in terms of precision, recall, and F1-score. We repeated the process for all three datasets used in this study. Table 3 represents the comparison results using MobiAct_v2.0, while Table 4 shows the comparison using Real-World HAR, and finally, Table 5 summarizes the comparison results that we obtained using Real-Life HAR.
The last row of Table 3 shows the mean precision, recall, and F1-score for each classifier. The performance statistics clearly show that the AdaBoost classifier was the worst performer among the three, KNN was the second best, and MLP performed the classification best.
Considering the mean values of the performance metrics used, it can be concluded that the performance of the AdaBoost classifier on Real-World HAR was better than its performance on MobiAct_v2.0, but it still performed poorly. The K-nearest neighbors classifier displayed a notable performance but still could not outperform the multilayer perceptron classifier.
Table 5 makes it evident that the MLP classifier again left the other two methods behind with its classification performance. Overall, all the methods performed well, but the SSRDC system was outstanding.

4.5. Experiment IV: Comparison with Available State-of-the-Art Techniques

Considering all the datasets that we used, we compared the performance of the SSRDC system with other available SOTA methods. Various human locomotion activity recognition algorithms have been applied to MobiAct_v2.0, Real-World HAR, and Real-Life HAR. Table 6 supports the fact that the proposed SSRDC system outperformed all of them by a good margin. In [67], the authors used a random forest model with two different system modes while working on the Real-World HAR dataset. In the first mode, the system was aware of the position of the device on the subject’s body, while in the other mode, the system was unaware of the device position. In the position-unaware case, their system produced 80.2% accurate results, while in the position-aware case, it predicted locomotion activities with 83.4% accuracy. In [68], a model based on cross-subject activity recognition was used, and it scored 83.1% accuracy. The work in [69] used signal visualization alongside a CNN to produce 92.4% accurate results on Real-World HAR. Finally, another framework [70] designed for Real-World HAR used a CNN and scored 93.8% accuracy. In comparison to these works, the proposed system attained 94.2% accuracy and left the others behind.
Regarding MobiAct_v2.0, the works referenced in [71,72,73] used SVM, CNN, and thresholding techniques to predict locomotion activities with 77.9%, 80.7%, and 81.3% accuracy, respectively, while our system proved to be 84.5% accurate. We also compared the performance of our system with state-of-the-art works on the Real-Life HAR dataset. In [74], the authors utilized an SVM while working on different combinations of the sensors described in Table 6. Using an accelerometer and a GPS, they achieved 60.1% accuracy; with the addition of a magnetometer to these two sensors, the accuracy increased to 62.6%; and when also using gyroscope data, they obtained 67.2% accurate results. Another work [38] employed an attention-based hybrid model consisting of a CNN and an LSTM. They also experimented with different sensors: using a magnetometer, they achieved 70.3% accuracy; with gyroscope data, they managed 95.2% accurate results; and they predicted with 95.7% accuracy when using an accelerometer. Our SSRDC system produced 95.9% accurate results on the Real-Life HAR dataset, beating the work in [38] by a small margin. The major difference between the two approaches is that we extracted hand-crafted features, selected the best features, and then used a novel power transformation technique for feature optimization, while they simply preprocessed the data and trained their deep learning model. Our approach evidently proved better on the Real-Life HAR dataset.

5. Discussion

In the proposed SSRDC framework, we extracted different features from denoised smartphone sensor data. Then, we performed feature selection, power-transformed the chosen features, and performed the classification using a multilayer perceptron. To get started, the proposed system needs smartphone sensor data. For this purpose, we chose three challenging RS-HLAR datasets that depict real-life scenarios and provide data from various sensors. The very first step of our system was to denoise the data using a Chebyshev type-I filter. From the denoised data, we extracted a total of 15 features, including statistical features from the time and frequency domains and other features. We rejected the four least-informative features with the help of LFE- and LS-based feature selection, which left the 11 most informative features. These 11 features were selected because this set provided the best accuracy; adding or eliminating any feature had a negative effect on the system’s performance. Raw feature values produce poor results, so we employed the power transform for data optimization. Finally, we labeled the data and sent it to a multilayer perceptron (MLP) for classification. After tuning the hyperparameters of the MLP, we achieved excellent classification results.
Although the system classifies human locomotion activities very well, it still has some limitations. When the system has to detect locomotion activities that are a combination of multiple locomotion activities, it finds the task more difficult than the recognition of a single locomotion activity. An example of such a combination is a person performing acrobatics, rapidly going from one position to another. Moreover, as the SSRDC system operates on remotely sensed data, there is a great chance of signal distortion, which in some cases can result in the complete loss of useful data. The failure cases of the SSRDC system are depicted in Figure 15.
The results and analysis of the proposed system are as follows. The mean accuracy of the proposed system on Real-World HAR was 94.16%, while a previous implementation that includes position-unaware and position-aware system modes with a random forest model [67] achieved 80.2% and 83.4% accuracy for the respective modes. The SSRDC system’s mean accuracy on MobiAct_v2.0 was 84.49%, while the approaches used in [71,72,73] show mean accuracies of 77.9%, 80.7%, and 81.3%, respectively. Finally, the techniques used on Real-Life HAR in [38,74] scored 70.3% and 95.7% accuracy, respectively, while our system gave up to 95.89% accurate results. More detailed comparisons can be found in Table 6. These statistics show that the proposed system outperforms the available state-of-the-art techniques.

6. Conclusions

The proposed SSRDC system aims to classify human locomotion activities using remotely sensed data. In this regard, we first denoised the sensor data using a second-order Chebyshev type-I filter. Secondly, we divided the input signal into windows and then generated signal segments from those windows. Then, we extracted divergent features from the data. Feature extraction was followed by an LFE- and LS-based feature selection algorithm that reduced the size of the feature set by 25%. To optimize the features, the Yeo–Johnson power transform was applied, and the data were then sent to an MLP for classification.

6.1. Research Limitations

Although the outstanding performance of the proposed SSRDC system leaves state-of-the-art techniques behind, it certainly has some limitations as well. The first concern is the correct recognition of complex locomotion activities that are a combination of two or more individual locomotion activities. In addition, as our system deals with remote data, there can be cases where data are lost during wireless communication, and the SSRDC system might then fail to recognize the activity.

6.2. Future Work

Revolving around smartphone sensors, the future directions of our research may proceed to pinpointing the location of a human being in an indoor environment along with recognizing their locomotion activity. Meanwhile, we will work to improve our SSRDC system by refining our algorithms and adding more sensors to the framework.

Author Contributions

Conceptualization, U.A. and Y.Y.G.; methodology, U.A. and A.J.; software, U.A., S.A.A. and T.a.S.; validation, U.A., Y.Y.G. and J.P.; formal analysis, T.a.S., S.A.A. and J.P.; resources, Y.Y.G., T.a.S. and J.P.; writing—review and editing, U.A., T.a.S. and J.P.; funding acquisition, Y.Y.G., U.A., S.A.A. and J.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by a grant (2021R1F1A1063634) of the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Korea.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Choudhury, N.A.; Moulik, S.; Choudhury, S. Cloud-based Real-time and Remote Human Activity Recognition System using Wearable Sensors. In Proceedings of the 2020 IEEE International Conference on Consumer Electronics—Taiwan (ICCE-Taiwan), Taoyuan, Taiwan, 28–30 September 2020; pp. 1–2.
  2. Jalal, A.; Lee, S.; Kim, J.; Kim, T. Human activity recognition via the features of labeled depth body parts. In Proceedings of the Smart Homes Health Telematics, Artiminio, Italy, 12–15 June 2012; pp. 246–249.
  3. Jalal, A.; Kim, Y.; Kamal, S.; Farooq, A.; Kim, D. Human daily activity recognition with joints plus body features representation using Kinect sensor. In Proceedings of the IEEE International Conference on Informatics, Electronics and Vision, Fukuoka, Japan, 15–18 June 2015; pp. 1–6.
  4. Damodaran, N.; Schäfer, J. Device Free Human Activity Recognition using WiFi Channel State Information. In Proceedings of the 2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation, Leicester, UK, 19–23 August 2019; pp. 1069–1074.
  5. Jalal, A.; Sharif, N.; Kim, J.T.; Kim, T.-S. Human activity recognition via recognized body parts of human depth silhouettes for residents monitoring services at smart homes. Indoor Built Environ. 2013, 22, 271–279.
  6. Chelli, A.; Muaaz, M.; Pätzold, M. ActRec: A Wi-Fi-Based Human Activity Recognition System. In Proceedings of the 2020 IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland, 7–11 June 2020; pp. 1–6.
  7. Gochoo, M.; Akhter, I.; Jalal, A.; Kim, K. Stochastic remote sensing event classification over adaptive posture estimation via multifused data and deep belief network. Remote Sens. 2021, 13, 912.
  8. Batool, M.; Jalal, A.; Kim, K. Telemonitoring of Daily Activity Using Accelerometer and Gyroscope in Smart Home Environments. J. Electr. Eng. Technol. 2020, 15, 2801–2809.
  9. Jalal, A.; Uddin, M.Z.; Kim, T.-S. Depth Video-based Human Activity Recognition System Using Translation and Scaling Invariant Features for Life Logging at Smart Home. IEEE Trans. Consum. Electron. 2012, 58, 863–871.
  10. Jalal, A.; Quaid, M.A.K.; Sidduqi, M.A. A Triaxial acceleration-based human motion detection for ambient smart home system. In Proceedings of the IEEE International Conference on Applied Sciences and Technology, Islamabad, Pakistan, 8–12 January 2019.
  11. Kim, K.; Jalal, A.; Mahmood, M. Vision-based Human Activity recognition system using depth silhouettes: A Smart home system for monitoring the residents. J. Electr. Eng. Technol. 2019, 14, 2567–2573.
  12. Tahir, B.; Jalal, A.; Kim, K. Daily life Log Recognition based on Automatic Features for Health care Physical Exercise via IMU Sensors. In Proceedings of the 2021 International Bhurban Conference on Applied Sciences and Technologies (IBCAST), Islamabad, Pakistan, 12–16 January 2021.
  13. Jalal, A.; Kamal, S. Real-Time Life Logging via a Depth Silhouette-based Human Activity Recognition System for Smart Home Services. In Proceedings of the IEEE International Conference on Advanced Video and Signal-based Surveillance, Seoul, Korea, 26–29 August 2014; pp. 74–80.
  14. Jalal, A.; Kamal, S.; Kim, D. A depth video sensor-based life-logging human activity recognition system for elderly care in smart indoor environments. Sensors 2014, 14, 11735–11759.
  15. Jalal, A.; Quaid, M.A.K.; Hasan, A.S. Wearable Sensor-Based Human Behavior Understanding and Recognition in Daily Life for Smart Environments. In Proceedings of the IEEE conference on International Conference on Frontiers of Information Technology, Islamabad, Pakistan, 17–19 December 2018.
  16. Jalal, A.; Quaid, M.A.; Tahir, S.B.; Kim, K. A study of accelerometer and gyroscope measurements in physical life-log activities detection systems. Sensors 2020, 20, 6670.
  17. Jalal, A.; Batool, M.; Kim, K. Sustainable Wearable System: Human Behavior Modeling for Life-logging Activities Using K-Ary Tree Hashing Classifier. Sustainability 2020, 12, 10324.
  18. Wei, S.; Wang, J.; Zhao, Z. Poster Abstract: LocTag: Passive WiFi Tag for Robust Indoor Localization via Smartphones. In Proceedings of the IEEE INFOCOM 2020—IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Toronto, ON, Canada, 6–9 July 2020; pp. 1342–1343.
  19. Madiha, J.; Jalal, A.; Kim, K. Wearable sensors based exertion recognition using statistical features and random forest for physical healthcare monitoring. In Proceedings of the IEEE International Conference on Applied Sciences and Technology, Islamabad, Pakistan, 12–16 January 2021.
  20. Jalal, A.; Batool, M.; Kim, K. Stochastic recognition of physical activity and healthcare using tri-axial inertial wearable sensors. Appl. Sci. 2020, 10, 7122.
  21. Madiha, J.; Gochoo, M.; Jalal, A.; Kim, K. HF-SPHR: Hybrid features for sustainable physical healthcare pattern recognition using deep belief networks. Sustainability 2021, 13, 1699.
  22. Kalita, S.; Karmakar, A.; Hazarika, S.M. Human Fall Detection during Activities of Daily Living using Extended CORE9. In Proceedings of the 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP), Gangtok, India, 25–28 February 2019; pp. 1–6.
  23. Golestani, N.; Moghaddam, M. Magnetic Induction-based Human Activity Recognition (MI-HAR). In Proceedings of the 2019 IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, Atlanta, GA, USA, 7–12 July 2019; pp. 17–18.
  24. Liu, C.; Ying, J.; Han, F.; Ruan, M. Abnormal Human Activity Recognition using Bayes Classifier and Convolutional Neural Network. In Proceedings of the 2018 IEEE 3rd International Conference on Signal and Image Processing (ICSIP), Shenzhen, China, 13–15 July 2018; pp. 33–37.
  25. Imran, H.A.; Latif, U. HHARNet: Taking inspiration from Inception and Dense Networks for Human Activity Recognition using Inertial Sensors. In Proceedings of the 2020 IEEE 17th International Conference on Smart Communities: Improving Quality of Life Using ICT, IoT and AI (HONET), Charlotte, NC, USA, 14–16 December 2020; pp. 24–27.
  26. Hasegawa, T. Smartphone Sensor-Based Human Activity Recognition Robust to Different Sampling Rates. IEEE Sens. J. 2020, 21, 6930–6941.
  27. Badar, S.; Jalal, A.; Kim, K. Wearable Inertial Sensors for Daily Activity Analysis Based on Adam Optimization and the Maximum Entropy Markov Model. Entropy 2020, 22, 579.
  28. Kamal, S.; Jalal, A.; Kim, D. Depth Images-based Human Detection, Tracking and Activity Recognition Using Spatiotemporal Features and Modified HMM. J. Electr. Eng. Technol. 2016, 11, 1857–1862.
  29. Quaid, M.A.K.; Jalal, A. Wearable Sensors based Human Behavioral Pattern Recognition using Statistical Features and Reweighted Genetic Algorithm. Multimed. Tools Appl. 2020, 79, 6061–6083.
  30. Badar, S.; Jalal, A.; Batool, M. Wearable Sensors for Activity Analysis using SMO-based Random Forest over Smart home and Sports Datasets. In Proceedings of the 2020 3rd International Conference on Advancements in Computational Sciences (ICACS), Lahore, Pakistan, 17–19 February 2020.
  31. Jalal, A.; Quaid, M.A.K.; Kim, K. A Wrist Worn Acceleration Based Human Motion Analysis and Classification for Ambient Smart Home System. J. Electr. Eng. Technol. 2019, 14, 1733–1739.
  32. Gu, F.; Kealy, A.; Khoshelham, K.; Shang, J. User-Independent Motion State Recognition Using Smartphone Sensors. Sensors 2015, 15, 30636–30652.
  33. Bashar, S.K.; Al Fahim, A.; Chon, K.H. Smartphone Based Human Activity Recognition with Feature Selection and Dense Neural Network. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; pp. 5888–5891.
  34. Shan, C.Y.; Han, P.Y.; Yin, O.S. Deep Analysis for Smartphone-based Human Activity Recognition. In Proceedings of the 2020 8th International Conference on Information and Communication Technology (ICoICT), Yogyakarta, Indonesia, 24–26 June 2020; pp. 1–5.
  35. Xie, L.; Tian, J.; Ding, G.; Zhao, Q. Human activity recognition method based on inertial sensor and barometer. In Proceedings of the 2018 IEEE International Symposium on Inertial Sensors and Systems (INERTIAL), Lake Como, Italy, 26–29 March 2018; pp. 1–4.
  36. Lee, S.-M.; Yoon, S.M.; Cho, H. Human activity recognition from accelerometer data using Convolutional Neural Network. In Proceedings of the 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), Jeju, Korea, 13–16 February 2017; pp. 131–134.
  37. Azmat, U.; Jalal, A. Smartphone Inertial Sensors for Human Locomotion Activity Recognition based on Template Matching and Codebook Generation. In Proceedings of the 2021 International Conference on Communication Technologies (ComTech), Rawalpindi, Pakistan, 21–22 September 2021; pp. 109–114.
  38. Mekruksavanich, S.; Jitpattanakul, A. Recognition of Real-life Activities with Smartphone Sensors using Deep Learning Approaches. In Proceedings of the 2021 IEEE 12th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 20–22 August 2021; pp. 243–246.
  39. Su, T.; Sun, H.; Ma, C.; Jiang, L.; Xu, T. HDL: Hierarchical Deep Learning Model based Human Activity Recognition using Smartphone Sensors. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8.
  40. Guo, X.; Hu, X.; Ye, X.; Hu, C.; Song, C.; Wu, H. Human Activity Recognition Based on Two-Dimensional Acoustic Arrays. In Proceedings of the 2018 IEEE International Ultrasonics Symposium (IUS), Kobe, Japan, 22–25 October 2018; pp. 1–4.
  41. Tripathi, A.M.; Baruah, D.; Baruah, R.D. Acoustic sensor based activity recognition using ensemble of one-class classifiers. In Proceedings of the 2015 IEEE International Conference on Evolving and Adaptive Intelligent Systems (EAIS), Douai, France, 1–3 December 2015; pp. 1–7.
  42. She, D.; Lou, X.; Ye, W. RadarSpecAugment: A Simple Data Augmentation Method for Radar-Based Human Activity Recognition. IEEE Sens. Lett. 2021, 5, 1–4.
  43. Kim, Y.; Moon, T. Human detection and activity classification based on micro-Doppler signatures using deep convolutional neural networks. Remote Sens. Lett. 2015, 13, 8–12.
  44. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, inception–ResNet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 3.
  45. Jordan, T.S. Using convolutional neural networks for human activity classification on micro-Doppler radar spectrograms. In Proceedings of the SPIE DEFENSE + SECURITY, Baltimore, MD, USA, 17–21 April 2016; p. 9825.
  46. Du, H.; He, Y.; Jin, T. Transfer learning for human activities classification using micro-Doppler spectrograms. In Proceedings of the 2018 IEEE International Conference on Computational Electromagnetics (ICCEM), Chengdu, China, 26–28 March 2018; pp. 1–3.
  47. Zhu, J.; Chen, H.; Ye, W. A hybrid CNN–LSTM network for the classification of human activities based on micro-Doppler radar. IEEE Access 2020, 8, 24713–24720.
  48. Li, X.; He, Y.; Fioranelli, F.; Jing, X. Semisupervised Human Activity Recognition With Radar Micro-Doppler Signatures. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–12.
  49. Parvez, S.; Sakib, N.; Mollah, M.N. Chebyshev type-I low pass filter using annular ring resonator: A comparative performance analysis for different substrates. In Proceedings of the 2016 9th International Conference on Electrical and Computer Engineering (ICECE), Dhaka, Bangladesh, 20–22 December 2016; pp. 182–185.
  50. Puterka, B.; Kacur, J.; Pavlovicova, J. Windowing for Speech Emotion Recognition. In Proceedings of the 2019 International Symposium ELMAR, Zadar, Croatia, 23–25 September 2019; pp. 147–150.
  51. Baykara, M.; Abdulrahman, A. Seizure detection based on adaptive feature extraction by applying extreme learning machines. Traitement Signal 2021, 38, 331–340.
  52. Bono, R.; Arnau, J.; Alarcón, R.; Blanca-Mena, M.J. Bias, Precision, and Accuracy of Skewness and Kurtosis Estimators for Frequently Used Continuous Distributions. Symmetry 2020, 12, 19.
  53. Jalal, A.; Ahmed, A.; Rafique, A.; Kim, K. Scene Semantic recognition based on modified Fuzzy c-mean and maximum entropy using object-to-object relations. IEEE Access 2021, 9, 27758–27772.
  54. Mazumder, I. An Analytical Approach of EEG Analysis for Emotion Recognition. In Proceedings of the 2019 Devices for Integrated Circuit (DevIC), Kalyani, India, 23–24 March 2019; pp. 256–260.
  55. Jalal, A.; Batool, M.; Tahir, B. Markerless sensors for physical health monitoring system using ECG and GMM feature extraction. In Proceedings of the 2021 International Bhurban Conference on Applied Sciences and Technologies (IBCAST), Islamabad, Pakistan, 12–16 January 2021; pp. 340–345.
  56. Kamal, S.; Jalal, A. A hybrid feature extraction approach for human detection, tracking and activity recognition using depth sensors. Arab. J. Sci. Eng. 2016, 41, 1043–1051.
  57. Jalal, A.; Kamal, S.; Farooq, A.; Kim, D. A spatiotemporal motion variation features extraction approach for human tracking and pose-based action recognition. In Proceedings of the IEEE International Conference on Informatics, Electronics and Vision, Fukuoka, Japan, 15–18 June 2015.
  58. Jalal, A.; Kamal, S.; Kim, D. Depth Silhouettes Context: A new robust feature for human tracking and activity recognition based on embedded HMMs. In Proceedings of the 12th IEEE International Conference on Ubiquitous Robots and Ambient Intelligence, Goyangi, Korea, 28–30 October 2015; pp. 294–299.
  59. Jalal, A.; Kim, Y.-H.; Kim, Y.-J.; Kamal, S.; Kim, D. Robust human activity recognition from depth video using spatiotemporal multi-fused features. Pattern Recognit. 2017, 61, 295–308.
  60. Jalal, A.; Kamal, S.; Kim, D. Human depth sensors-based activity recognition using spatiotemporal features and hidden markov model for smart environments. J. Comput. Netw. Commun. 2016, 2016, 8087545.
  61. Jalal, A.; Kamal, S.; Kim, D. Facial Expression recognition using 1D transform features and Hidden Markov Model. J. Electr. Eng. Technol. 2017, 12, 1657–1662.
  62. Jalal, A.; Kamal, S.; Kim, D. A depth video-based human detection and activity recognition using multi-features and embedded hidden Markov models for health care monitoring systems. Int. J. Interact. Multimed. Artif. Intell. 2017, 4, 54–62.
  63. Jalal, A.; Mahmood, M.; Sidduqi, M.A. Robust spatio-temporal features for human interaction recognition via artificial neural network. In Proceedings of the IEEE conference on International Conference on Frontiers of Information Technology, Islamabad, Pakistan, 17–19 December 2018.
  64. Ntakolia, C.; Kokkotis, C.; Moustakidis, S.P.; Tsaopoulos, D. Identification of most important features based on a fuzzy ensemble technique: Evaluation on joint space narrowing progression in knee osteoarthritis patients. Int. J. Med. Inform. 2021, 156, 104614.
  65. Abbas, S.A.; Aslam, A.; Rehman, A.U.; Abbasi, W.A.; Arif, S.; Kazmi, S.Z.H. K-Means and K-Medoids: Cluster Analysis on Birth Data Collected in City Muzaffarabad, Kashmir. IEEE Access 2020, 8, 151847–151855.
  66. Gaikwad, N.B.; Tiwari, V.; Keskar, A.; Shivaprakash, N.C. Efficient FPGA Implementation of Multilayer Perceptron for Real-Time Human Activity Classification. IEEE Access 2019, 7, 26696–26706.
  67. Sztyler, T.; Stuckenschmidt, H. On-body localization of wearable devices: An investigation of position-aware activity recognition. In Proceedings of the 2016 IEEE International Conference on Pervasive Computing and Communications (PerCom), Sydney, Australia, 14–19 March 2016; pp. 1–9.
  68. Sztyler, T.; Stuckenschmidt, H. Online personalization of cross-subjects based activity recognition models on wearable devices. In Proceedings of the 2017 IEEE International Conference on Pervasive Computing and Communications (PerCom), Kona, HI, USA, 13–17 March 2017; pp. 180–189.
  69. Ordóñez, F.J.; Roggen, D. Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition. Sensors 2016, 16, 115.
  70. Hur, T.; Bang, J.; Huynh-The, T.; Lee, J.; Kim, J.-I.; Lee, S. Iss2Image: A novel signal-encoding technique for CNN-based human activity recognition. Sensors 2018, 18, 3910.
  71. Ferrari, A.; Micucci, D.; Mobilio, M.; Napoletano, P. Hand-crafted Features vs Residual Networks for Human Activities Recognition using Accelerometer. In Proceedings of the 2019 IEEE 23rd International Symposium on Consumer Technologies (ISCT), Ancona, Italy, 19–21 June 2019; pp. 153–156.
  72. Casilari, E.; Lora-Rivera, R.; García-Lagos, F. A Study on the Application of Convolutional Neural Networks to Fall Detection Evaluated with Multiple Public Datasets. Sensors 2020, 20, 1466.
  73. Colon, L.N.V.; de la Hoz, Y.; Labrador, M. Human fall detection with smartphones. In Proceedings of the 2014 IEEE Latin-America Conference on Communications (LATINCOM), Cartagena, Colombia, 5–7 November 2014; pp. 1–7.
  74. Garcia-Gonzalez, D.; Rivero, D.; Fernandez-Blanco, E.; Luaces, M.R. A public domain dataset for real-life human activity recognition using smartphone sensors. Sensors 2020, 20, 2200.
Figure 1. The architecture of the proposed smartphone sensors remote data classification (SSRDC) system.
Figure 2. Signal denoising using second-order Chebyshev type-I filter with a cutoff frequency equal to 0.001. (a) MobiAct_v2.0 magnetometer x-channel, (b) MobiAct_v2.0 magnetometer y-channel, and (c) MobiAct_v2.0 magnetometer z-channel.
Figure 3. Signal denoising using second-order Chebyshev type-I filter with a cutoff frequency equal to 0.001. (a) Real-World HAR magnetometer x-channel, (b) Real-World HAR magnetometer y-channel, and (c) Real-World HAR magnetometer z-channel.
Figure 4. Signal denoising using second-order Chebyshev type-I filter with a cutoff frequency equal to 0.001. (a) Real-Life HAR magnetometer x-channel, (b) Real-Life HAR magnetometer y-channel, and (c) Real-Life HAR magnetometer z-channel.
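For readers reproducing the denoising stage shown in Figures 2–4, a second-order Chebyshev type-I low-pass filter with a normalized cutoff of 0.001 can be applied with SciPy as sketched below; the 1 dB passband ripple is our assumption, since the figures specify only the filter order and cutoff.

```python
import numpy as np
from scipy.signal import cheby1, filtfilt

def denoise_channel(signal, order=2, cutoff=0.001, ripple_db=1.0):
    """Low-pass one sensor channel with a Chebyshev type-I filter.

    `ripple_db` is an assumed passband ripple; the paper specifies only
    the order (2) and the normalized cutoff frequency (0.001).
    """
    b, a = cheby1(order, ripple_db, cutoff, btype="low")
    # filtfilt runs the filter forward and backward for zero phase shift.
    return filtfilt(b, a, signal)

# Usage on a synthetic noisy magnetometer-like channel:
t = np.linspace(0, 10, 5000)
noisy = np.sin(2 * np.pi * 0.2 * t) + 0.3 * np.random.randn(t.size)
clean = denoise_channel(noisy)
```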
Figure 5. Feature plot for Parseval energy, skewness, kurtosis, Shannon entropy, and AR-coefficients.
Figure 6. Flow diagram for Luca-measure fuzzy entropy (LFE) and Lukasiewicz similarity measure (LS)–based feature selection.
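The LFE/LS selection in Figure 6 can be read as a Luukka-style fuzzy-entropy filter; the sketch below is one plausible implementation under that reading, using the De Luca–Termini entropy and the Lukasiewicz similarity s(a, b) = 1 − |a − b| against class-ideal vectors, and dropping the 25% of features with the highest entropy. The exact scoring rule in the paper may differ.

```python
import numpy as np

def luca_fuzzy_entropy(mu, eps=1e-12):
    """De Luca-Termini fuzzy entropy of membership values in [0, 1]."""
    mu = np.clip(mu, eps, 1.0 - eps)
    return -np.sum(mu * np.log(mu) + (1.0 - mu) * np.log(1.0 - mu))

def select_features(X, y, drop_ratio=0.25):
    """Rank features by fuzzy entropy of Lukasiewicz similarities to class ideals."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    M = (X - mn) / np.where(mx > mn, mx - mn, 1.0)   # memberships in [0, 1]
    H = np.zeros(X.shape[1])
    for c in np.unique(y):
        ideal = M[y == c].mean(axis=0)               # class-ideal vector
        sim = 1.0 - np.abs(M - ideal)                # Lukasiewicz similarity
        H += np.array([luca_fuzzy_entropy(sim[:, j]) for j in range(X.shape[1])])
    keep = int(np.ceil((1.0 - drop_ratio) * X.shape[1]))
    # Keep the lowest-entropy (most informative) features.
    return np.argsort(H)[:keep]

rng = np.random.default_rng(0)
X, y = rng.normal(size=(300, 20)), rng.integers(0, 4, size=300)
print(select_features(X, y))
```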
Figure 7. Original vs. power-transformed feature vector. (a) MobiAct_v2.0 original vs. power-transformed feature vector, (b) Real-World HAR original vs. power-transformed feature vector, and (c) Real-Life HAR original vs. power-transformed feature vector.
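The Yeo–Johnson optimization illustrated in Figure 7 is available off the shelf in scikit-learn's PowerTransformer, which fits one λ per feature by maximum likelihood; a minimal sketch, with random data standing in for the reduced feature set:

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

X_selected = np.random.rand(200, 10) ** 2   # stand-in for the reduced feature set

# method="yeo-johnson" fits one lambda per feature by maximum likelihood and,
# with standardize=True, rescales the output to zero mean and unit variance.
pt = PowerTransformer(method="yeo-johnson", standardize=True)
X_optimized = pt.fit_transform(X_selected)
print(pt.lambdas_)   # the fitted lambda for each feature
```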
Figure 8. Multilayer perceptron (MLP) classifier structure.
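As a sketch of the classification stage in Figure 8, an MLP evaluated with cross-validation can be set up as below; the hidden-layer sizes, solver, and fold count are illustrative assumptions rather than the paper's exact configuration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_optimized = rng.normal(size=(300, 10))   # stand-in for the optimized features
y = rng.integers(0, 4, size=300)           # stand-in activity labels

# Hidden-layer sizes, solver, and fold count are illustrative assumptions.
mlp = MLPClassifier(hidden_layer_sizes=(128, 64), activation="relu",
                    solver="adam", max_iter=500, random_state=0)
scores = cross_val_score(mlp, X_optimized, y, cv=10)
print(f"mean accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```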
Figure 9. MobiAct_v2.0: individual ROC curves for all the classes, class 1 = walking, class 2 = stairs-up, class 3 = stairs-down, class 4 = stand-to-sit, class 5 = sit-to-stand, class 6 = car step-in, class 7 = car step-out, class 8 = forward lying, class 9 = back-sitting chair, class 10 = front-knees lying, class 11 = sideward lying, class 12 = jogging, class 13 = jumping, class 14 = standing, class 15 = sitting.
Figure 10. Real-World HAR: individual ROC curves for all the classes, class 1 = stairs-up, class 2 = stairs-down, class 3 = jumping, class 4 = lying, class 5 = running, class 6 = sitting, class 7 = standing, class 8 = walking.
Figure 11. Real-Life HAR: individual ROC curves for all the classes, class 1 = walking, class 2 = active, class 3 = inactive, class 4 = driving.
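The per-class curves in Figures 9–11 follow the standard one-vs-rest ROC construction; a brief sketch, with random stand-ins for the true labels and the classifier's per-class probability outputs:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(1)
y_test = rng.integers(0, 4, size=200)            # stand-in true labels
y_score = rng.dirichlet(np.ones(4), size=200)    # stand-in probability rows

classes = np.unique(y_test)
Y = label_binarize(y_test, classes=classes)      # one binary column per class
for i, c in enumerate(classes):
    fpr, tpr, _ = roc_curve(Y[:, i], y_score[:, i])   # one-vs-rest ROC
    print(f"class {c}: AUC = {auc(fpr, tpr):.3f}")
```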
Figure 12. Confusion matrix for MobiAct_v2.0. WAL = walking, STU = stairs-up, STN = stairs-down, SCH = stand-to-sit, CHU = sit-to-stand, CSI = car step-in, CSO = car step-out, FOL = forward lying, BSC = back-sitting chair, FKL = front-knees lying, SDL = sideward lying, JOG = jogging, JUM = jumping, STD = standing, SIT = sitting.
Figure 13. Confusion matrix for Real-World HAR.
Figure 14. Confusion matrix for Real-Life HAR.
Figure 15. Failure cases for SSRDC system: (a) complex activity and (b) extreme signal distortion.
Table 1. Features extracted for human locomotion activity signal.
| Feature | Description | Formulation |
| --- | --- | --- |
| Parseval's Energy [43] | The energy of the signal in the time domain equals its energy in the frequency domain (Parseval's theorem). | $\frac{1}{2\pi}\int_{-\infty}^{\infty}\lvert x(j\omega)\rvert^{2}\,d\omega$ |
| Skewness [44] | A measure of the asymmetry of a distribution. | $\frac{n}{(n-1)(n-2)}\sum\left(\frac{X_i-\bar{X}}{S}\right)^{3}$ |
| Kurtosis [44] | Compares the tails of the sample distribution to the tails of a normal distribution. | $\frac{n(n+1)}{(n-1)(n-2)(n-3)}\sum\left(\frac{X_i-\bar{X}}{S}\right)^{4}-\frac{3(n-1)^{2}}{(n-2)(n-3)}$ |
| Shannon Entropy [45] | The expected amount of information in an instance of the distribution. | $-\sum_{i=1}^{n} g_i \log g_i$ |
| AR-Coefficients [46] | The auto-regressive coefficients of the signal model. | $Y(m_t)=\beta_0+\beta_1 m_{t-1}+\beta_2 m_{t-2}+\cdots+\beta_n m_{t-n}+E_t$ |
Parseval's energy: $x(j\omega)$ = frequency-domain signal. Skewness and kurtosis: $\bar{X}$ = sample mean, $S$ = standard deviation, $n$ = number of samples, $X_i$ = $i$th sample. Shannon entropy: $g_i$ = probability of occurrence of a data point. AR-coefficients: $\beta$ = weights, $m$ = signal values, $E_t$ = white noise.
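The Table 1 features can be computed per signal window as in the following NumPy/SciPy sketch; the AR order, histogram bin count, and least-squares AR fit are our assumptions.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def extract_features(window, ar_order=4, bins=32):
    """Compute the Table 1 features for one 1-D signal window."""
    # Parseval's energy: time-domain energy equals (1/N) * sum |FFT|^2.
    energy = np.sum(np.abs(np.fft.fft(window)) ** 2) / window.size
    # Bias-corrected skewness and kurtosis, matching the Table 1 forms.
    sk = skew(window, bias=False)
    ku = kurtosis(window, bias=False)        # excess kurtosis (normal -> 0)
    # Shannon entropy of the amplitude histogram.
    counts, _ = np.histogram(window, bins=bins)
    p = counts[counts > 0] / counts.sum()
    H = -np.sum(p * np.log(p))
    # AR coefficients beta_0..beta_n by least squares on lagged samples.
    lags = np.array([window[i - ar_order:i][::-1]
                     for i in range(ar_order, window.size)])
    A = np.column_stack([np.ones(len(lags)), lags])
    beta, *_ = np.linalg.lstsq(A, window[ar_order:], rcond=None)
    return np.concatenate([[energy, sk, ku, H], beta])

window = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * np.random.randn(256)
print(extract_features(window))
```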
Table 2. Human locomotion activities provided by all three datasets.
| Serial No. | MobiAct_v2.0 | Real-World HAR | Real-Life HAR |
| --- | --- | --- | --- |
| 1 | Walking | Stairs climb-up | Inactive |
| 2 | Stairs-up | Stairs climb-down | Active |
| 3 | Stairs-down | Jumping | Walking |
| 4 | Stand-to-sit | Lying | Driving |
| 5 | Sit-to-stand | Running | - |
| 6 | Car step-in | Sitting | - |
| 7 | Car step-out | Standing | - |
| 8 | Jogging | Walking | - |
| 9 | Jumping | - | - |
| 10 | Standing | - | - |
| 11 | Sitting | - | - |
| 12 | Forward lying | - | - |
| 13 | Back-sitting chair | - | - |
| 14 | Front-knees lying | - | - |
| 15 | Sideward lying | - | - |
Table 3. Performance comparison of multilayer perceptron classifier with KNN and AdaBoost classifiers on MobiAct_v2.0 in terms of precision, recall, and F1-score.
| Activities | KNN Precision | KNN Recall | KNN F1 | AdaBoost Precision | AdaBoost Recall | AdaBoost F1 | MLP Precision | MLP Recall | MLP F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| WAL | 0.95 | 0.90 | 0.92 | 0.52 | 0.52 | 0.52 | 0.95 | 0.97 | 0.96 |
| STU | 0.82 | 0.83 | 0.83 | 0.33 | 0.69 | 0.45 | 0.89 | 0.88 | 0.88 |
| STN | 0.82 | 0.89 | 0.85 | 0.34 | 0.29 | 0.32 | 0.87 | 0.89 | 0.88 |
| SCH | 0.92 | 0.96 | 0.94 | 0.19 | 0.99 | 0.32 | 0.98 | 0.98 | 0.98 |
| CHU | 0.97 | 0.98 | 0.98 | 0.00 | 0.00 | 0.00 | 0.99 | 0.95 | 0.97 |
| CSI | 0.95 | 0.96 | 0.96 | 0.00 | 0.00 | 0.00 | 0.97 | 0.97 | 0.97 |
| CSO | 1.00 | 0.98 | 0.99 | 0.52 | 0.07 | 0.13 | 0.98 | 0.99 | 0.99 |
| FOL | 0.42 | 0.52 | 0.46 | 0.00 | 0.00 | 0.00 | 0.69 | 0.68 | 0.68 |
| BSC | 0.85 | 0.66 | 0.75 | 0.00 | 0.00 | 0.00 | 0.84 | 0.86 | 0.85 |
| FKL | 0.49 | 0.48 | 0.49 | 0.00 | 0.00 | 0.00 | 0.70 | 0.70 | 0.70 |
| SDL | 0.80 | 0.74 | 0.77 | 0.00 | 0.00 | 0.00 | 0.84 | 0.85 | 0.85 |
| JOG | 0.89 | 0.91 | 0.90 | 0.00 | 0.00 | 0.00 | 0.94 | 0.95 | 0.95 |
| JUM | 0.91 | 0.81 | 0.86 | 0.50 | 0.05 | 0.09 | 0.94 | 0.94 | 0.94 |
| STD | 0.96 | 0.83 | 0.89 | 0.60 | 0.05 | 0.09 | 0.95 | 0.90 | 0.92 |
| SIT | 1.00 | 0.63 | 0.77 | 0.25 | 0.95 | 0.39 | 1.00 | 0.74 | 0.85 |
| Mean | 0.85 | 0.81 | 0.82 | 0.25 | 0.24 | 0.15 | 0.90 | 0.88 | 0.89 |
WAL = walking, STU = stairs-up, STN = stairs-down, SCH = stand-to-sit, CHU = sit-to-stand, CSI = car step-in, CSO = car step-out, FOL = forward lying, BSC = back-sitting chair, FKL = front-knees lying, SDL = sideward lying, JOG = jogging, JUM = jumping, STD = standing, SIT = sitting.
Table 4. Performance comparison of multilayer perceptron classifier with KNN and AdaBoost classifiers on Real-World HAR in terms of precision, recall, and F1-score.
| Activities | KNN Precision | KNN Recall | KNN F1 | AdaBoost Precision | AdaBoost Recall | AdaBoost F1 | MLP Precision | MLP Recall | MLP F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| STDN | 0.58 | 0.93 | 0.72 | 0.38 | 0.67 | 0.49 | 0.87 | 0.87 | 0.87 |
| STUP | 0.58 | 0.93 | 0.72 | 0.16 | 0.20 | 0.18 | 0.86 | 0.80 | 0.83 |
| JUM | 1.00 | 1.00 | 1.00 | 0.57 | 0.53 | 0.55 | 1.00 | 1.00 | 1.00 |
| LY | 1.00 | 0.93 | 0.97 | 0.50 | 0.47 | 0.48 | 1.00 | 1.00 | 1.00 |
| RUN | 0.81 | 0.87 | 0.84 | 1.00 | 0.60 | 0.75 | 0.88 | 1.00 | 0.94 |
| SIT | 1.00 | 0.47 | 0.64 | 0.32 | 0.40 | 0.35 | 1.00 | 0.93 | 0.97 |
| STD | 0.67 | 0.53 | 0.59 | 0.16 | 0.20 | 0.18 | 0.72 | 0.87 | 0.79 |
| WAL | 0.62 | 0.33 | 0.43 | 0.00 | 0.00 | 0.00 | 0.92 | 0.73 | 0.81 |
| Mean | 0.78 | 0.75 | 0.74 | 0.39 | 0.38 | 0.37 | 0.91 | 0.90 | 0.90 |
STDN = stairs-down, STUP = stairs-up, JUM = jumping, LY = lying, RUN = running, SIT = sitting, STD = standing, WAL = walking.
Table 5. Performance comparison of multilayer perceptron classifier with KNN and AdaBoost classifiers on Real-Life HAR in terms of precision, recall, and F1-score.
| Activities | KNN Precision | KNN Recall | KNN F1 | AdaBoost Precision | AdaBoost Recall | AdaBoost F1 | MLP Precision | MLP Recall | MLP F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| WAL | 1.00 | 0.84 | 0.91 | 0.53 | 1.00 | 0.69 | 1.00 | 0.95 | 0.97 |
| INAC | 0.79 | 1.00 | 0.88 | 1.00 | 1.00 | 1.00 | 0.95 | 1.00 | 0.97 |
| ACT | 1.00 | 0.74 | 0.85 | 1.00 | 0.16 | 0.27 | 1.00 | 0.95 | 0.97 |
| DRI | 0.86 | 1.00 | 0.93 | 1.00 | 0.95 | 0.97 | 0.95 | 1.00 | 0.97 |
| Mean | 0.91 | 0.90 | 0.89 | 0.88 | 0.78 | 0.73 | 0.98 | 0.98 | 0.97 |
WAL = walking, INAC = inactive, ACT = active, DRI = driving.
Table 6. Comparison of proposed SSRDC system with available state-of-the-art methods.
| Author/Method | Mean Accuracy % (MobiAct_v2.0) | Mean Accuracy % (Real-World HAR) | Mean Accuracy % (Real-Life HAR) |
| --- | --- | --- | --- |
| RF + Position Unaware [67] | - | 80.2 | - |
| Personalized Cross-Subject [68] | - | 83.1 | - |
| RF + Position Aware [67] | - | 83.4 | - |
| Signal Visualization + CNN [69] | - | 92.4 | - |
| Iss2Image + CNN [70] | - | 93.8 | - |
| Support Vector Machine [71] | 77.9 | - | - |
| CNN [72] | 80.7 | - | - |
| Threshold [73] | 81.3 | - | - |
| SVM + Acc. + GPS [74] | - | - | 60.1 |
| SVM + Acc. + GPS + Magn. [74] | - | - | 62.6 |
| SVM + Acc. + GPS + Magn. + Gyro. [74] | - | - | 67.2 |
| Att-CNN-LSTM + Magn. [38] | - | - | 70.3 |
| Att-CNN-LSTM + Gyro. [38] | - | - | 95.2 |
| Att-CNN-LSTM + Acc. [38] | - | - | 95.7 |
| Proposed | 84.5 | 94.2 | 95.9 |