Human Activity Recognition Using Binary Sensors, BLE Beacons, an Intelligent Floor and Acceleration Data: A Machine Learning Approach

Abstract: Although many studies have addressed Human Activity Recognition, the relationship between what we do and where we do it has been little explored in this field. The objective of this paper is to propose an approach based on machine learning to address the challenge of the 1st UCAmI cup, which is the recognition of 24 activities of daily living using a dataset that allows this relationship to be explored, since it contains data collected from four data sources: binary sensors, an intelligent floor, proximity sensors and acceleration sensors. The CRISP-DM methodology for data mining projects was followed in this work. A Java desktop application was developed to perform the synchronization and classification tasks. The accuracy achieved in the classification of the 24 activities using 10-fold cross-validation on the training dataset was 92.1%, but an accuracy of only 60.1% was obtained on the test dataset. The low accuracy on the test dataset might be caused by the class imbalance of the training dataset; therefore, more labeled data are necessary for training the algorithm. Although we could not obtain an optimal result, it is possible to iterate over the methodology to look for ways to improve the obtained results.


Introduction
Human Activity Recognition (HAR) has been widely studied in recent years; however, few works have taken advantage of the existing relationship between what we do (HAR) and where we do it (Indoor Localization) [1][2][3]. On the one hand, supporting HAR with Indoor Localization (IL) makes it possible to increase the number and complexity of recognized activities, as well as the accuracy of their recognition. On the other hand, supporting IL with HAR increases the accuracy of the location estimate. Initially, few studies took advantage of this relationship because adequate technology for obtaining human movement and location data simultaneously in indoor environments was lacking. Nowadays it is possible to collect this type of data by means of Inertial Measurement Units (IMU), smartphones, wearables and different types of ubiquitous sensors located in indoor environments. For this reason, some datasets containing data related to both HAR and IL have recently been collected and published in order to experiment with different approaches [4,5].
The dataset collected for the 1st UCAmI cup [6] joins that select group of datasets. This paper aims to perform the recognition of the 24 human activities included in the dataset of the 1st UCAmI cup.
For this, an approach based on machine learning has been proposed following the Cross Industry Standard Process for Data Mining methodology (CRISP-DM) [7].

Methods
The proposed challenge is addressed from a machine learning approach. For that reason, CRISP-DM was used. CRISP-DM is a free-use methodology and is currently one of the most widely used in data mining projects. It defines six phases: business understanding, data understanding, data preparation, modeling, evaluation and deployment. Each of these phases is composed of different tasks. It is important to mention that the CRISP-DM life cycle is not linear. In the present work the deployment phase is not carried out, since it would aim to implement the models obtained in the previous phases in a real context, which is out of the scope of this paper. This section describes the most relevant tasks of the first four phases of the methodology.

Business Understanding and Data Understanding
The context and the main requirements of the project are defined during the business understanding phase. As a result of this phase, both the data mining objective and how to measure compliance with it must be defined. It was considered that a classification accuracy greater than 80% would be desirable in order to implement the resulting model in a real context in the future. Therefore, the data mining objective of this work states: "perform the classification of the 24 activities included in the dataset of the 1st UCAmI cup with an accuracy of at least 80%".
The identification and definition of the data that will be used in the project is one of the tasks of the data understanding phase. The dataset of the 1st UCAmI cup was collected in the UJAmI SmartLab of the University of Jaén. It contains data of 24 activities that were performed by one person over 10 days. The data of each day were collected in three different sessions: morning, afternoon and evening. For each session, the dataset contains five comma-separated text files: one contains the labeled activities and the remaining four contain data from the following data sources:
1. Event streams of binary sensors: 30 binary sensors were located in different parts of the SmartLab. They send a binary value such as Open-Close, Movement-No movement or Pressure-No pressure with its respective Timestamp.
2. Spatial data from an intelligent floor: capacitance data of each of the SmartLab's smart floor modules with their respective Timestamp.
3. Proximity data between a smart watch worn by the inhabitant and Bluetooth Low Energy (BLE) beacons: 15 BLE beacons were located in different parts of the SmartLab. Their RSSI was collected at a sampling frequency of 0.25 Hz.
4. Acceleration data from the same smart watch worn by the inhabitant: 3D acceleration collected at a sampling frequency of 50 Hz.
The 24 activities included in the dataset, the 30 binary sensors and the 15 BLE beacons deployed in the SmartLab are listed in Appendix A. For the 1st UCAmI cup, the dataset was divided into two parts: part one contains labeled training data from seven days of recordings and part two contains unlabeled test data from three days of recordings. The objective of the competition is to propose an approach for the classification of the activities using data from part one of the dataset. The approach is evaluated with part two of the dataset; for that reason, part one contains the file with the labeled activities and part two does not.
As described in the introduction section, previous studies have shown that HAR and indoor location are directly related.For that reason, the four data sources were used: data sources 1, 2 and 3 to obtain location information and data source number 4 to obtain information about body movement.

Data Preparation
In this phase, the data must be prepared in such a way as to allow the training and evaluation of classification algorithms that will be used in the next phase, the modeling phase.
The first step in this phase was to generate, for each day, a file with the data from the four data sources synchronized. This was done by matching the Timestamps, sample by sample, of the comma-separated files belonging to data sources 1, 2 and 3 with those of data source number 4, which has the highest sampling frequency. When a sample contained missing data, the missing value was replaced by the average of that value over the 50 previous samples.
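The synchronization and imputation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Java application: the function names `align_to_reference` and `impute_with_trailing_mean` are hypothetical, timestamps are plain numbers, a slower source is aligned by carrying forward its most recent reading (the paper does not specify the matching rule), and a missing value is represented by `None`.

```python
from bisect import bisect_right


def align_to_reference(ref_times, src_times, src_values):
    """For each reference timestamp, take the latest source value at or before it.

    Returns None where the source has no reading yet (assumed alignment rule).
    `src_times` must be sorted in ascending order.
    """
    aligned = []
    for t in ref_times:
        i = bisect_right(src_times, t)  # number of source samples with time <= t
        aligned.append(src_values[i - 1] if i > 0 else None)
    return aligned


def impute_with_trailing_mean(values, window=50):
    """Replace None with the mean of up to `window` previous (already filled) values."""
    filled = []
    for v in values:
        if v is None:
            history = [x for x in filled[-window:] if x is not None]
            # With no usable history the value stays missing.
            v = sum(history) / len(history) if history else None
        filled.append(v)
    return filled


# Toy example: a slow source aligned to a faster reference clock.
ref_times = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]
aligned = align_to_reference(ref_times, [0.0, 1.0], [-70, -60])
imputed = impute_with_trailing_mean([1.0, 3.0, None, 5.0], window=50)
```

In this toy run the slow source's first reading is repeated until its second reading arrives, and the single missing value is filled with the mean of the two values that precede it.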
The next step was feature extraction. The synchronized data were first divided into segments of 5 s, so that each segment contains approximately 200 samples. One example was then obtained from each segment, comprising the following 87 features calculated from the data of that segment:

• Binary sensors: 30 binary features (one for each sensor). If the status of a sensor is Open, Movement or Pressure in any sample belonging to the segment, its corresponding feature is "1"; otherwise it is "0".
• Intelligent floor: 40 binary features (one for each module). If the capacitance of a module is greater than zero in any of the samples of the segment, its corresponding feature is "1"; otherwise it is "0".
• Proximity data: 4 categorical features that correspond to the IDs of the four nearest BLE beacons.
• Acceleration: 13 statistical features: the mean, median, standard deviation and mean absolute deviation of each axis, plus the mean of the acceleration magnitude (the square root of the sum of the squared values of the three axes).
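The per-segment features listed above could be computed along the following lines. This is a sketch, not the authors' implementation: all function names are hypothetical, the population standard deviation is assumed (the paper does not say which variant was used), and the "nearest" beacons are assumed to be those with the strongest mean RSSI within the segment.

```python
import statistics
from math import sqrt


def any_active(flags):
    """Binary feature: 1 if the sensor or floor module was active in any sample."""
    return 1 if any(flags) else 0


def nearest_beacons(rssi_by_beacon, k=4):
    """IDs of the k beacons with the strongest (least negative) mean RSSI (assumed rule)."""
    ranked = sorted(
        rssi_by_beacon,
        key=lambda beacon: statistics.fmean(rssi_by_beacon[beacon]),
        reverse=True,
    )
    return ranked[:k]


def mean_abs_dev(xs):
    """Mean absolute deviation around the mean."""
    m = statistics.fmean(xs)
    return statistics.fmean([abs(x - m) for x in xs])


def acceleration_features(ax, ay, az):
    """13 features: mean, median, std and mean abs. deviation per axis, plus mean magnitude."""
    feats = []
    for axis in (ax, ay, az):
        feats += [
            statistics.fmean(axis),
            statistics.median(axis),
            statistics.pstdev(axis),  # population std: an assumption
            mean_abs_dev(axis),
        ]
    magnitude = [sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]
    feats.append(statistics.fmean(magnitude))
    return feats


# Toy segment with two samples per signal.
features = acceleration_features([3.0, 3.0], [4.0, 4.0], [0.0, 0.0])
floor_feature = any_active(c > 0 for c in [0, 0, 12])
beacons = nearest_beacons(
    {"b1": [-80, -82], "b2": [-60, -61], "b3": [-70], "b4": [-90], "b5": [-65]}
)
```

Concatenating the 30 binary-sensor flags, the 40 floor flags, the 4 beacon IDs and the 13 acceleration values yields the 87 features of one example.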
Once the examples for the complete dataset were obtained, each example belonging to part one of the dataset was labeled with its corresponding activity using the Timestamps included in the files that contain the activity labels. In total, 4997 labeled examples were obtained for part one of the dataset and 535 unlabeled examples for part two.
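Labeling a 5 s example from the activity intervals can be illustrated as below. The sketch assumes that activities are given as `(start, end, label)` tuples with numeric timestamps, that the label with the largest overlap wins, and that segments overlapping no labeled activity are marked 'Idle'. The function name `label_segment` and the overlap rule are assumptions, since the paper only states that the Timestamps were used.

```python
def label_segment(seg_start, seg_end, activities, default="Idle"):
    """Return the label of the activity interval that overlaps the segment the most."""
    best_label, best_overlap = default, 0.0
    for start, end, label in activities:
        overlap = min(seg_end, end) - max(seg_start, start)
        if overlap > best_overlap:
            best_label, best_overlap = label, overlap
    return best_label


# Toy annotation file: two activities with an unlabeled gap between them.
activities = [(0, 60, "Prepare breakfast"), (90, 150, "Breakfast")]
labels = [label_segment(t, t + 5, activities) for t in (0, 58, 70, 90)]
```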

Modeling
In this phase, the modeling techniques are selected and applied and, if necessary, their parameters are calibrated to improve their results. Based on the results obtained in a previous work in which sedentary behaviors were classified using acceleration data and proximity data from BLE beacons [8], the classification algorithms used in this work are: J48, IB1, SVM, Random Forest (RF), AdaBoostM1 (ABM1) and Bagging. The last three are ensemble algorithms, and J48 was set as their base classifier. To train and evaluate the models, the application developed in the context of the work presented in [8] was adapted. That application, written in Java, incorporates the Waikato Environment for Knowledge Analysis (WEKA) library [9]. The algorithms were trained with the 4997 examples belonging to part one of the dataset, and the resulting models were evaluated with 10-fold cross-validation. It is important to note that all 87 features were used for modeling; that is, no feature selection was performed.
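The evaluation procedure can be illustrated with a small, dependency-free sketch. It is not the WEKA pipeline used by the authors: a plain 1-nearest-neighbour classifier stands in for IB1, the folds are shuffled but not stratified, and all names (`k_fold_indices`, `predict_1nn`, `cross_val_accuracy`) and the toy data are hypothetical.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_1nn(train_X, train_y, x):
    """Label of the training example closest to x (squared Euclidean distance)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def cross_val_accuracy(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the correct predictions."""
    correct = 0
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct += sum(predict_1nn(train_X, train_y, X[i]) == y[i] for i in fold)
    return correct / len(X)


# Toy data: two well-separated clusters of 20 examples each.
X = [[0.01 * i, 0.0] for i in range(20)] + [[5.0 + 0.01 * i, 5.0] for i in range(20)]
y = ["activity_a"] * 20 + ["activity_b"] * 20
accuracy = cross_val_accuracy(X, y, k=10)
```

Because every example is tested exactly once, the pooled accuracy is directly comparable across algorithms evaluated on the same folds.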
Taking advantage of the fact that the life cycle of the CRISP-DM methodology is not linear, it was found that merging the activities "prepare breakfast, lunch and dinner" into one activity and "breakfast, lunch and dinner" into another increased the classification accuracy by 13% on average. This makes sense because, for those activities, both the body movement and the location data are similar; therefore, the classification algorithms are not able to distinguish between them. The final classification of each of the six original activities was then decided based on the corresponding recording session (morning, afternoon or evening). In a system to be deployed in a real environment, rules based on the time of day and the sequence of these activities could be incorporated to classify them.
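The merge-then-resolve strategy can be sketched as follows. The merged class names ("Prepare meal", "Eat meal"), the exact spelling of the activity labels and the session-to-meal mapping are assumptions made for illustration; the paper states only that the recording session was used to recover the original activity.

```python
# Hypothetical label spellings; the merged class names are not from the paper.
MERGED = {
    "Prepare breakfast": "Prepare meal",
    "Prepare lunch": "Prepare meal",
    "Prepare dinner": "Prepare meal",
    "Breakfast": "Eat meal",
    "Lunch": "Eat meal",
    "Dinner": "Eat meal",
}

# Assumed mapping from recording session to meal.
MEAL_BY_SESSION = {"morning": "breakfast", "afternoon": "lunch", "evening": "dinner"}


def merge_label(label):
    """Map the six meal-related activities onto two merged classes before training."""
    return MERGED.get(label, label)


def resolve_label(predicted, session):
    """Recover the original activity from a merged prediction and the recording session."""
    meal = MEAL_BY_SESSION[session]
    if predicted == "Prepare meal":
        return f"Prepare {meal}"
    if predicted == "Eat meal":
        return meal.capitalize()
    return predicted


resolved = resolve_label(merge_label("Prepare breakfast"), "morning")
```

Note that under this scheme the session alone decides which of the three meals is reported, so a merged prediction can never be resolved to a meal from another session.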

Evaluation
In this phase, the results obtained during the modeling phase are analyzed and the best classification model is chosen according to the evaluation metrics proposed in the business understanding phase. The results obtained in this phase are detailed in the Results section.

Results
Table 1 shows the accuracy obtained by each algorithm using 10-fold cross-validation on part one of the dataset. The highest accuracy was achieved by AdaBoostM1; therefore, the model obtained with this algorithm was used to classify the examples of part two of the dataset.
Since the original labeling of the activities in the dataset is done in batches of 30 s, and the examples used in our approach last 5 s, a majority voting algorithm was configured in the developed application so that the activity assigned to each 30 s batch was the one that appeared most frequently among its 6 examples.
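Majority voting over consecutive 5 s predictions can be sketched as below. The function name `majority_vote` is hypothetical, and ties are broken in favour of the label seen first in the window, which is an assumption because the paper does not describe tie handling.

```python
from collections import Counter


def majority_vote(predictions, group_size=6):
    """Collapse consecutive 5 s predictions into one label per 30 s window."""
    voted = []
    for i in range(0, len(predictions), group_size):
        window = predictions[i:i + group_size]
        # most_common(1) returns the most frequent label in the window.
        voted.append(Counter(window).most_common(1)[0][0])
    return voted


# Two 30 s windows of six 5 s predictions each (toy labels).
predictions = ["A", "A", "B", "A", "C", "A",
               "B", "B", "C", "B", "B", "C"]
voted = majority_vote(predictions)
```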
The resulting accuracy in the classification of the 535 examples belonging to part two of the dataset was 60.1%, which means that 322 examples were correctly classified. The precision was 56%, the recall was 57.3% and the F-measure was 59.9%. Therefore, the data mining objective proposed in phase 1 was not fulfilled. This implies that it is necessary to iterate over the CRISP-DM life cycle in order to improve that level of accuracy. The confusion matrix obtained in this classification process is presented in Appendix B.
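For readers who want to recompute such metrics from a confusion matrix like the one in Appendix B, a minimal sketch follows. The 2x2 matrix is toy data, not the paper's results, and the function names are hypothetical. How the per-class values are averaged into a single precision, recall or F-measure is not stated in the paper, so only per-class values are shown.

```python
def accuracy(cm):
    """Fraction of examples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def per_class_precision_recall(cm):
    """cm[i][j] = number of examples of true class i predicted as class j."""
    n = len(cm)
    result = []
    for c in range(n):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(n))  # column sum
        actual = sum(cm[c])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        result.append((precision, recall))
    return result


# Toy confusion matrix for two classes with 10 examples each.
cm = [[8, 2],
      [1, 9]]
acc = accuracy(cm)
pr = per_class_precision_recall(cm)
```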

Discussion and Conclusions
Following the guidelines of the CRISP-DM methodology, an approach based on machine learning for the recognition of the activities of daily living included in the 1st UCAmI dataset was proposed. The tasks in each phase of the life cycle of this methodology provided a well-defined workflow during the execution of this work. Considering that the accuracy achieved using 10-fold cross-validation on part one of the dataset was 92.1%, the accuracy of 60.1% achieved by our approach on part two of the dataset was lower than expected. This may be due to the class imbalance present in part one of the dataset (with which the classification model was obtained). For example, only 11 and 78 examples are labeled with the activities 'wash dishes' and 'Playing video game', respectively, while other activities contain more than 100 examples. This hypothesis can be tested initially using an evaluation method such as Leave-One-Day-Out on the complete dataset. This method would consist of training the classification algorithm with the data of 9 days and evaluating it with the data of the remaining day. The results of this experiment would indicate whether more labeled data are necessary for training the classification algorithm.
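The proposed Leave-One-Day-Out evaluation could be organized as in the following sketch. It assumes each example is stored as a `(day, features, label)` tuple; the generator name `leave_one_day_out` and the toy data are hypothetical, and the classifier training step is left out.

```python
from collections import Counter


def leave_one_day_out(examples):
    """Yield (held_out_day, train, test), holding out one recording day per iteration.

    `examples` is a list of (day, features, label) tuples.
    """
    days = sorted({day for day, _, _ in examples})
    for held_out in days:
        train = [e for e in examples if e[0] != held_out]
        test = [e for e in examples if e[0] == held_out]
        yield held_out, train, test


# Toy dataset: 10 days with 3 examples each.
examples = [(day, [float(day)], "activity") for day in range(1, 11) for _ in range(3)]
splits = list(leave_one_day_out(examples))

# Class counts in one training split, useful for inspecting class imbalance.
class_counts = Counter(label for _, _, label in splits[0][1])
```

Each iteration trains on 9 days and tests on the remaining one, so no day contributes examples to both sides of a split.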
The confusion matrix presented in Appendix B shows that three activities accounted for many of the misclassifications: 'Idle', 'Watch TV' and 'Playing video game'. 'Idle' represents periods of time in which there is sensor data but no activity was labeled. This occurs especially at the beginning and end of the recording of each session, just before the first activity starts and after the last one finishes. Therefore, the inclusion of 'Idle' for HAR in a real environment would not be necessary. In that scenario, the accuracy would increase to 62.77%. On the other hand, 76 of the 95 examples corresponding to the activity 'Watch TV' and 17 of the 23 examples corresponding to the activity 'Playing video game' were erroneously classified as 'Relax on the sofa'. Although one might think that the binary sensors would provide enough information to classify these activities correctly, the results obtained suggest the opposite. As future work, this case will be analyzed in depth to determine what happened and to identify any additional feature or rule that allows these activities to be classified correctly.

Table 1. Accuracy of the classification using 10-fold cross-validation on part one of the dataset.