Homogeneous Data Normalization and Deep Learning: A Case Study in Human Activity Classification

Abstract: One class of applications for human activity recognition methods is found in mobile devices for monitoring older adults and people with special needs. Recently, many studies were performed to create intelligent methods for the recognition of human activities. However, the different mobile devices in the market acquire the data from sensors at different frequencies. This paper focuses on implementing four data normalization techniques, i.e., MaxAbsScaler, MinMaxScaler, RobustScaler, and Z-Score. Subsequently, we evaluate the impact of the normalization algorithms with deep neural networks (DNN) for the classification of the human activities. The impact of the data normalization was counterintuitive, resulting in a degradation of performance. Namely, when using the accelerometer data, the accuracy dropped from about 79% to only 53% for the best normalization approach. Similarly, for the gyroscope data, the accuracy without normalization was about 81.5%, whereas with the best normalization, it was only 60%. It can be concluded that data normalization techniques are not helpful in classification problems with homogeneous data.


Introduction
Nowadays, the use of mobile devices in everyday tasks is increasing, and it allows users to stay connected and communicate with ease [1,2]. The current pandemic situation discourages social interaction and personal contacts, enhancing the role of technology in promoting social distancing while staying connected [3,4] and active, avoiding sedentary positions [5,6]. Several studies use mobile devices to identify human activities and create a personal agenda to track people [7][8][9][10]. This is especially important for people with special needs, including older adults or people with chronic diseases [11][12][13]. Constant contact with healthcare professionals can benefit people's quality of life [14][15][16].
Sensors are vital for data acquisition related to human activities [17][18][19] and, lately, even for diagnostic purposes [20,21]. Mobile devices include a large variety of sensors, including accelerometer, magnetometer, gyroscope, acoustic, location, contacts, and other types of sensors [22,23]. Section 2.6 finalizes this study with a comparison of the results. Figure 1 presents the sequence of activities performed for the recognition of human activities.

Dataset
The dataset used in this research is named the "Heterogeneity Activity Recognition Data Set" [2]. This dataset was acquired with smartphones and smartwatches and covers four human activities: walking upstairs, walking downstairs, standing, and walking. The authors of the dataset reported that it was created to apply machine learning methods for automatic activity recognition. The data acquisition was performed with different mobile devices, including smartwatches, i.e., LG G and Samsung Galaxy Gear, smartphones, i.e., Apple iPhone 6, Samsung Galaxy Pocket+, Samsung Galaxy S3 mini, LG Nexus 4, Samsung Galaxy S3, Samsung Galaxy Nexus, Samsung Galaxy S+, LG Optimus 2X, HTC Desire, and HTC Nexus One, and tablets, i.e., Samsung Galaxy Tab 10.1. The devices used have different data acquisition frequencies, between 25 and 200 Hz. The recordings were performed by 9 different users from the accelerometer and gyroscope sensors at the highest frequency.

Data Normalization
The data were normalized with the aim of improving the results of activity recognition with machine learning methods. Four data normalization techniques were applied. Firstly, MaxAbsScaler scales each feature individually by its maximum absolute value in the dataset, so that the largest absolute value of each feature becomes 1; it does not shift the data [34]. Secondly, MinMaxScaler scales and translates each feature individually so that it falls within a given range on the training set, for example between 0 and 1 [35]. Thirdly, RobustScaler removes the median and scales the data according to a quantile range [36], here the interquartile range (IQR) [53], which is the range between the 1st quartile (25th percentile) and the 3rd quartile (75th percentile). Finally, Z-Score normalization subtracts the mean of each feature and divides by its standard deviation, a strategy reported to avoid the outlier issue [33].
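The four techniques can be illustrated on a single feature column. The following is a minimal pure-Python sketch, not the implementation used in this study: the function names are hypothetical, the use of the population standard deviation and of the "inclusive" quartile method are assumptions, and constant features (zero range or zero deviation) are not handled. The names MaxAbsScaler, MinMaxScaler, and RobustScaler match classes in scikit-learn's preprocessing module, but the text does not state which library was used.

```python
import statistics


def max_abs_scale(x):
    """Divide by the largest absolute value; the data are not shifted."""
    m = max(abs(v) for v in x)
    return [v / m for v in x]


def min_max_scale(x, lo=0.0, hi=1.0):
    """Map the observed minimum to lo and the observed maximum to hi."""
    xmin, xmax = min(x), max(x)
    return [lo + (v - xmin) * (hi - lo) / (xmax - xmin) for v in x]


def robust_scale(x):
    """Subtract the median and divide by the interquartile range (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    med = statistics.median(x)
    return [(v - med) / (q3 - q1) for v in x]


def z_score(x):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = statistics.mean(x)
    sigma = statistics.pstdev(x)
    return [(v - mu) / sigma for v in x]


if __name__ == "__main__":
    column = [1.0, 2.0, 3.0, 4.0, 5.0]
    for scaler in (max_abs_scale, min_max_scale, robust_scale, z_score):
        print(scaler.__name__, scaler(column))
```

In practice the statistics (maximum, median, mean, and so on) would be estimated on the training portion only and then reused on the test portion.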

Peak Detection
The detection of the sensors' signal variations and maximum values, commonly named peaks, is important for distinguishing the different activities, because activities with high intensity have more peaks with high values and activities with low intensity have fewer peaks with low values [54].
The detection of peaks may be performed with different methods. This study used a sequential method that smooths the sensors' signal and keeps only the values whose previous and next values are both lower. The process is repeated until the number of peaks reaches its minimum while still retaining at least five peaks.
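One possible reading of this sequential procedure is sketched below. It is an illustration under stated assumptions, not the authors' code: the smoothing step is assumed to be a centered moving average with a hypothetical `window` parameter, a peak is assumed to be strictly greater than both neighbours, and the reduction step is assumed to re-apply the same rule to the list of peaks until one more pass would leave fewer than `min_peaks` values.

```python
def moving_average(signal, window=3):
    """Centered moving average; the window size is an assumed parameter."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def local_maxima(values):
    """Keep values whose previous and next values are both lower."""
    return [values[i] for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


def detect_peaks(signal, min_peaks=5, window=3):
    """Repeat the peak filter while at least min_peaks values survive."""
    peaks = local_maxima(moving_average(signal, window))
    while True:
        reduced = local_maxima(peaks)
        if len(reduced) < min_peaks:
            # One more pass would drop below the minimum: stop here.
            return peaks
        peaks = reduced
```

If the first pass already yields fewer than five peaks, this sketch simply returns them, since the text does not say how such windows were handled.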

Feature Extraction
Defining correct and reliable features is important for obtaining highly accurate results in the automatic recognition of human activities. Based on previous knowledge [46,51,55] and the characteristics of the dataset used in this study, the features extracted from the sensors are as follows:
• Accelerometer: mean, standard deviation, variance, and median values of the measured maximum peaks, and mean, standard deviation, variance, median, maximum, and minimum values of the raw signal;
• Gyroscope: mean, standard deviation, variance, and median values of the measured maximum peaks, and mean, standard deviation, variance, median, maximum, and minimum values of the raw signal.
After the feature extraction, the data classification techniques may be applied to establish the relations between the features and the human activities.
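The ten features listed for each sensor can be computed with the standard library. The sketch below is only an illustration: the function name and the feature keys are hypothetical, and the population forms of the standard deviation and variance are an assumption, since the text does not say whether sample or population statistics were used.

```python
import statistics


def extract_features(raw, peaks):
    """Return the ten per-sensor features as a dict.

    raw   -- raw signal values of one window
    peaks -- maximum peaks detected in that window
    """
    return {
        # Four statistics of the detected peaks.
        "peak_mean": statistics.mean(peaks),
        "peak_std": statistics.pstdev(peaks),
        "peak_var": statistics.pvariance(peaks),
        "peak_median": statistics.median(peaks),
        # Six statistics of the raw signal.
        "raw_mean": statistics.mean(raw),
        "raw_std": statistics.pstdev(raw),
        "raw_var": statistics.pvariance(raw),
        "raw_median": statistics.median(raw),
        "raw_max": max(raw),
        "raw_min": min(raw),
    }
```

The same function would be applied once to the accelerometer window and once to the gyroscope window, giving one ten-value feature vector per sensor.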

Data Classification
This stage includes applying the artificial intelligence method to identify the human activities available in the dataset. For this purpose, the deep neural networks (DNN) method was applied similarly to [46,51].
For the training stage, we used a Sigmoid activation function, a learning rate of 0.1, a maximum of 4 × 10^6 training iterations, 3 hidden layers, the Xavier weight initialization function, backpropagation, and the L2 regularization method [56].
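To make these hyperparameters concrete, the sketch below builds a small fully connected network with Xavier (Glorot) uniform initialization, sigmoid activations, and an L2 penalty term. It is a hedged illustration only: the layer widths, the regularization strength `lam`, the sigmoid on the output layer, and all function names are assumptions, the training loop (backpropagation with a learning rate of 0.1) is omitted, and the framework actually used in this study is not stated here.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def xavier_uniform(fan_in, fan_out, rng):
    """Draw weights from U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


def build_network(layer_sizes, seed=0):
    """One (weights, biases) pair per layer; biases start at zero."""
    rng = random.Random(seed)
    return [(xavier_uniform(n_in, n_out, rng), [0.0] * n_out)
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


def forward(network, x):
    """Propagate one feature vector through all layers."""
    for weights, biases in network:
        x = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


def l2_penalty(network, lam):
    """L2 regularization term added to the loss: lam * sum of squared weights."""
    return lam * sum(w * w for weights, _ in network for row in weights for w in row)


if __name__ == "__main__":
    # 10 input features, 3 hidden layers (width 16 is hypothetical), 4 activities.
    net = build_network([10, 16, 16, 16, 4])
    print(forward(net, [0.5] * 10))
```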
The cross-validation technique was implemented in the testing stage to measure the validity parameters of the implemented method. The results were statistically analyzed, as explained in Section 2.6.
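As an illustration of how a cross-validation split can be produced, the following sketch generates k-fold train and test indices. The number of folds, the shuffling seed, and the function name are assumptions, since the text does not specify the cross-validation configuration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each sample appears in exactly one test fold, so the scores averaged over the folds cover the whole dataset once.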

Statistical Analysis
To evaluate the results obtained with the implemented cross-validation technique, classification performance scores were measured, namely precision, specificity, accuracy, recall, and F1-Score.
Finally, these results are compared with the results obtained with a previously published dataset [51]. Most of the activities included in the dataset used for comparison were also included in the dataset analyzed in this paper, although the captures were performed with other smartphones. Taking the data acquisition frequencies into account, the two datasets may be compared to assess this dataset's reliability and the implementation.

Results
This research uses an unbalanced dataset to recognize four human activities: walking upstairs, walking downstairs, standing, and walking. The following sections present the confusion matrices and related parameters, namely accuracy, precision, recall, and F1-Score. True positives (TP) are cases where the activity was present and correctly detected. True negatives (TN) are cases where the activity was correctly not detected (another activity was present and detected). False positives (FP) are cases where the activity was falsely detected, and false negatives (FN) are cases where the activity was present but another activity was detected instead. Firstly, accuracy is defined as

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$

The analysis was performed with data acquired by the accelerometer and gyroscope sensors.
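The scores can be derived from a confusion matrix as sketched below. This uses the standard one-vs-rest definitions and is not necessarily the exact averaging scheme used in this study; the function names and the convention that rows are actual classes and columns are predicted classes are assumptions. The F1-Score is the harmonic mean of precision and recall, which is consistent with the values reported in the following sections (for example, a precision of 51.59% and a recall of 51.90% give an F1-Score of about 51.74%).

```python
def class_counts(cm, k):
    """Return (tp, fp, fn, tn) for class k; rows are actual, columns predicted."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                  # class k present, something else predicted
    fp = sum(row[k] for row in cm) - tp   # class k predicted, something else present
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if p + r else 0.0


if __name__ == "__main__":
    # Small made-up 2-class matrix, only to show the calls.
    cm = [[5, 1],
          [2, 2]]
    tp, fp, fn, tn = class_counts(cm, 0)
    print(accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn))
```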

Normalized Data with MaxAbsScaler
Initially, MaxAbsScaler was used to normalize the accelerometer data for the analyzed human activities, namely walking upstairs, walking downstairs, standing, and walking. Table 1 presents the confusion matrix for the experiments performed with the accelerometer data. The most correctly identified activities were walking upstairs, walking downstairs, and walking. With the accelerometer data normalized by MaxAbsScaler, the implemented DNN method reported an accuracy of 53.12%, a precision of 51.59%, a recall of 51.90%, and an F1-Score of 51.74%.
Next, MaxAbsScaler was used to normalize the gyroscope data for the different human activities. Table 2 presents the confusion matrix for the experiments performed with the gyroscope data. Again, the most correctly identified activities were walking upstairs, walking downstairs, and walking. With the gyroscope data normalized by MaxAbsScaler, the implemented DNN method reported an accuracy of 59.71%, a precision of 66.82%, a recall of 56.63%, and an F1-Score of 61.31%.

Normalized Data with MinMaxScaler
The second data normalization algorithm evaluated on the same accelerometer data was MinMaxScaler. Table 3 presents the confusion matrix for the experiments performed with the accelerometer data. The most correctly identified activities were walking downstairs and standing. With the accelerometer data normalized by MinMaxScaler, the implemented DNN method reported an accuracy of 46.25%, a precision of 49.20%, a recall of 49.08%, and an F1-Score of 49.14%.
MinMaxScaler was also used to normalize the gyroscope data for the different human activities. Table 4 presents the confusion matrix for the experiments performed with the gyroscope data. The most correctly identified activities were walking upstairs and walking. With the gyroscope data normalized by MinMaxScaler, the implemented DNN method reported an accuracy of 52.75%, a precision of 49.93%, a recall of 51.06%, and an F1-Score of 50.49%.

Normalized Data with RobustScaler
Thirdly, RobustScaler was used to normalize the accelerometer data. Table 5 presents the confusion matrix for the experiments performed with the accelerometer data. The most correctly identified activities were walking upstairs and standing. With the accelerometer data normalized by RobustScaler, the implemented DNN method reported an accuracy of 49.79%, a precision of 52.16%, a recall of 54.43%, and an F1-Score of 53.27%.
RobustScaler was also used to normalize the gyroscope data for the different human activities. Table 6 presents the confusion matrix for the experiments performed with the gyroscope data. The most correctly identified activity was walking upstairs. With the gyroscope data normalized by RobustScaler, the implemented DNN method reported an accuracy of 50.87%, a precision of 48.22%, a recall of 39.01%, and an F1-Score of 43.13%.

Normalized Data with Z-Score
The last data normalization approach evaluated was the Z-Score normalizer. Table 7 presents the confusion matrix for the experiments performed with the accelerometer data. The most correctly identified activities were walking upstairs and walking.
With the accelerometer data normalized by the Z-Score normalizer, the implemented DNN method reported an accuracy of 52.71%, a precision of 59.04%, a recall of 48.07%, and an F1-Score of 52.99%. Next, the Z-Score normalizer was used to normalize the gyroscope data for the different human activities. Table 8 presents the confusion matrix for the experiments performed with the gyroscope data. The most correctly identified activities were walking upstairs, standing, and walking. With the gyroscope data normalized by the Z-Score normalizer, the implemented DNN method reported an accuracy of 56.81%, a precision of 63.63%, a recall of 53.08%, and an F1-Score of 57.88%.

Non-Normalized Data
Finally, we evaluated the non-normalized accelerometer data for the four human activities. Table 9 presents the confusion matrix for the experiments performed with the accelerometer data. The most correctly identified activities were walking upstairs, walking downstairs, and walking.
With the non-normalized accelerometer data, the implemented DNN method reported an accuracy of 79.11%, a precision of 78.52%, a recall of 67.62%, and an F1-Score of 72.66%.
Next, the non-normalized gyroscope data for the different human activities were analyzed. Table 10 presents the confusion matrix for the experiments performed with the gyroscope data. Again, the most correctly identified activities were walking upstairs, walking downstairs, and walking. With the non-normalized gyroscope data, the implemented DNN method reported an accuracy of 81.46%, a precision of 80.54%, a recall of 72.94%, and an F1-Score of 76.55%.

Overall Results
Based on the results obtained in this study, the best results were achieved with the gyroscope data without applying normalization techniques. Normalized data were expected to yield the best results, but this was not observed, as presented in Figure 2. For the accelerometer data, the best accuracy was reported with non-normalized data (79.11%), and every normalization technique decreased the accuracy. Firstly, Z-Score normalization decreased the accuracy by 26.40 percentage points. Secondly, RobustScaler decreased it by 29.32 percentage points. Thirdly, MinMaxScaler decreased it by 32.86 percentage points. Finally, MaxAbsScaler decreased it by 25.99 percentage points.
For the gyroscope data, the best accuracy was also reported with non-normalized data (81.46%), and every normalization technique again decreased the accuracy. Firstly, Z-Score normalization decreased the accuracy by 24.65 percentage points. Secondly, RobustScaler decreased it by 30.59 percentage points. Thirdly, MinMaxScaler decreased it by 28.71 percentage points. Finally, MaxAbsScaler decreased it by 21.75 percentage points.
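The decreases quoted above are absolute differences between accuracies. They can be reproduced from the accuracy values reported in the previous sections with the short sketch below; the dictionary layout and function name are illustrative only.

```python
# Accuracies (%) as reported in the preceding sections.
ACCURACY = {
    "accelerometer": {
        "none": 79.11, "Z-Score": 52.71, "RobustScaler": 49.79,
        "MinMaxScaler": 46.25, "MaxAbsScaler": 53.12,
    },
    "gyroscope": {
        "none": 81.46, "Z-Score": 56.81, "RobustScaler": 50.87,
        "MinMaxScaler": 52.75, "MaxAbsScaler": 59.71,
    },
}


def accuracy_drops(sensor):
    """Absolute accuracy decrease of each scaler versus non-normalized data."""
    baseline = ACCURACY[sensor]["none"]
    return {name: round(baseline - acc, 2)
            for name, acc in ACCURACY[sensor].items() if name != "none"}


if __name__ == "__main__":
    for sensor in ACCURACY:
        print(sensor, accuracy_drops(sensor))
```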

Discussion and Conclusions
The "Heterogeneity Activity Recognition Data Set" [2] was acquired with different mobile devices, including smartphones, tablets, and smartwatches. Different devices have different frequencies of data acquisition. We experimented with four normalization techniques: MaxAbsScaler, MinMaxScaler, RobustScaler, and Z-Score. Furthermore, the DNN method was implemented for the classification of the different human activities.
This study analyzed the difference between non-normalized and normalized data, verifying that the dataset used in this study revealed the best results with the non-normalized data. However, the previously used dataset revealed the best results with normalized data.
The results showed that the best accuracy (81.46%) was reported with non-normalized gyroscope data to recognize three human activities. Furthermore, 79.11% accuracy was obtained with the use of accelerometer data to recognize three human activities, also with non-normalized data. On the contrary, the previously used dataset revealed the best accuracy with normalized data with or without data fusion techniques.
Based on the comparison with the previous results presented in Table 11, non-normalized data yielded better accuracy with the dataset analyzed in this study than with the previously used dataset. However, normalization techniques yielded poor accuracy with this dataset compared to the previously used dataset, for which five human activities were correctly recognized. With this dataset, a maximum of three human activities were correctly recognized.
As future work, the impact of different techniques for data classification, data imputation, and data normalization should be explored, as well as their impact when processing multi-modal data collected by various sensors. Furthermore, other data normalization techniques should be evaluated, as well as how the subsequently used machine learning algorithms benefit from the normalization. As this research shows, deep learning algorithms can overcome bias in the data without normalization. Furthermore, when processing homogeneous data collected by mobile devices, with completely identical data collection frequencies and different ranges of data, this research shows that data normalization impairs the classification accuracy. Other studies [56,57] show that more classical algorithms, such as SVMs, decision trees, and tree ensembles, benefit considerably from data normalization. These algorithms need to be further evaluated with the proposed approaches. In the future, the impact of the presented data normalization and imputation methods should also be evaluated on other datasets. In particular, when collecting multi-modal data from other sensors, such as microphones [58], pressure sensors, infrared sensors, proximity sensors, and oximeters, we expect the impact of the proposed data normalization algorithms to be even more pronounced.
In conclusion, the benefits of implementing data normalization techniques depend on the dataset. It is unclear whether normalization would improve the data classification, because the number of samples used was smaller than in the previously used dataset. The fact that the dataset is unbalanced may also influence the implementation of artificial intelligence methods for activity recognition.
Author Contributions: Conceptualization, methodology, software, validation, formal analysis, investigation, writing-original draft preparation, writing-review and editing; I.M.P., F.H., N.M.G., P.L. and E.Z. All authors have read and agreed to the published version of the manuscript.
Funding: This work is funded by FCT/MEC through national funds and co-funded by FEDER-PT2020 partnership agreement under the project UIDB/50008/2020. This work is also funded by National Funds through the FCT-Foundation for Science and Technology, I.P., within the scope of the project UIDB/00742/2020. Furthermore, we would like to thank the Politécnico de Viseu for their support.