UCAmI Cup. Analyzing the UJA Human Activity Recognition Dataset of Activities of Daily Living

Abstract: Many real-world applications, which are focused on addressing the needs of a human, require information pertaining to the activities being performed. The UCAmI Cup is an event held within the context of the International Conference on Ubiquitous Computing and Ambient Intelligence, where delegates are given the opportunity to use their tools and techniques to analyse a previously unseen human activity recognition dataset and to compare their results with others working in the same domain. In this paper, the human activity recognition dataset used relates to activities of daily living generated in the UJAmI Smart Lab, University of Jaén. The dataset chosen for the first edition of the UCAmI Cup represents 246 activities performed over a period of ten days carried out by a single inhabitant. The dataset includes four data sources: (i) event streams from 30 binary sensors, (ii) intelligent floor location data, (iii) proximity data between a smart watch worn by the inhabitant and 15 Bluetooth Low Energy beacons and (iv) acceleration of the smart watch. In this first edition of the UCAmI Cup, 26 participants from 10 different countries contacted the organizers to obtain the dataset.


Introduction
Activity recognition systems deployed in smart homes are characterized by their ability to detect Activities of Daily Living (ADL) in order to improve assistance. Such solutions have been adopted by smart homes in practice and have delivered promising results for improving the quality of care services for elderly people and responsive assistance in emergency situations [1].
Data driven approaches [2] developed for the purposes of Human Activity Recognition (HAR) of ADLs require large annotated data sets which offer high levels of quality in terms of both the ground truth and generalisation of the underlying data. A limited number of online repositories have supported the notion of providing openly available datasets for research and development purposes.
Two key examples are the UC Irvine Machine Learning repository [2] and Physionet [3]. The former has recently extended its datasets to include a small number of HAR related resources. The European Union funded Project OPPORTUNITY created a common platform whereby researchers working in different organizations could have access to a common data set and therefore were able to compare their results with others [4]. Beyond the aforementioned, efforts to provide high quality, openly available large scale datasets have been largely uncoordinated. There still remains a lack of frameworks where multiple researchers have the ability to compare their results using their tools and techniques to analyse the same HAR problem relating to an ADL dataset. The competition closest to the UCAmI Cup is the recently announced Sussex-Huawei Locomotion Challenge [5], where the Sussex-Huawei Locomotion Dataset is used to recognize 8 modes of locomotion and transportation (Car, Bus, Train, Subway, Walk, Run, Bike and Still) from the inertial sensor data of a smartphone (accelerometer, gyroscope, magnetometer, linear acceleration, gravity, orientation (quaternions) and ambient pressure). Nevertheless, this competition does not aim to develop solutions for smart homes to improve assistance.
Comparing techniques on openly available data is established practice in other domains, such as indoor localization [6] and automatic image classification [7], and in competitions such as Physionet CinC, the IJCAI [8] Competitions [9] and the KDD Cup challenge [10].
In order to address the gap in the domain of HAR for ADL, the UCAmI Cup has been announced. The UCAmI Cup aims to be an annual event within the forum of the International Conference on Ubiquitous Computing and Ambient Intelligence (UCAmI) where delegates will be provided with the opportunity to use their tools and techniques to analyse a HAR dataset and to compare their results with others in the ADL context. Each year, the dataset and the problem to be addressed will be changed to align with the major research topics being considered as state-of-the-art trends. The selected dataset to be used in the 1st UCAmI Cup [11] is the HAR dataset of ADL generated by the University of Jaén (UJA) in its newly created UJAmI Smart Lab. This paper aims to review the details of the infrastructure of the UJAmI Smart Lab and to present in detail the selected dataset used in the 1st UCAmI Cup, together with the details of the competition.
The remainder of the paper is structured as follows: Section 2 presents the UJAmI Smart Lab of the University of Jaén where the HAR dataset of ADL was generated. Section 3 presents a general description of the dataset in addition to its structure and its format. Section 4 presents each kind of data source that is contained in the dataset: binary sensor data, proximity data, acceleration data and, finally, location data of the smart floor. Section 5 presents details of the competition in the 1st UCAmI Cup and the results attained. Finally, Section 6 presents the conclusions and future work.

UJAmI Smart Lab of the University of Jaén
The University of Jaén Ambient Intelligence (UJAmI) Smart Lab [12][13][14] is an innovative space that plays a key role in the implementation of new ground-breaking research within the realms of Ambient Intelligence (AmI) [4], which is a paradigm in information technology aimed at empowering people's capabilities by means of digital environments.
The aim of the creation of the UJAmI Smart Lab [14] in 2014 was to produce a real apartment: sensitive, adaptive and responsive to human needs (habits, gestures and emotions) which subsequently underpinned assistive technology based solutions in the home.
The UJAmI SmartLab measures approximately 25 square meters (5.8 m long and 4.6 m wide). It is divided into five regions: entrance, kitchen, workplace, living room and a bedroom with an integrated bathroom. The layout of the UJAmI SmartLab is presented in Figure 1. A set of multiple and heterogeneous sensors has been deployed in different areas of the environment in order to capture human-environment interactions in addition to inhabitant behaviour. Currently, a web-based system for managing and monitoring smart environments is deployed [15] based on openHAB [16], with an approach for distributing and processing heterogeneous data based on a representation with fuzzy linguistic terms [17]. It is, however, beneficial to utilize a framework that includes a common protocol for data collection, a common format for data exchange, and a data repository and related tools to underpin research within the domain of activity recognition. For this reason, the UJAmI SmartLab is moving towards the deployment of a common middleware platform referred to as SensorCentral [18] that is compatible with an open data format referred to as the Open Data Initiative (ODI) [19].

General Description of UJAmI HAR Dataset
In this Section, the HAR dataset of ADL from the UJAmI SmartLab used in the 1st UCAmI Cup is described.
The UJA dataset from the UJAmI SmartLab is composed of four data sources that have been obtained whilst an inhabitant performed 246 instances of activity classes over a period of 10 days. The dataset is divided into two sets: a fully labelled training set covering 7 days and an unlabelled test set covering 3 days. The four data sources are as follows:
1. Event stream generated by 30 binary sensors.
2. Proximity information between a smart watch worn by an inhabitant and a set of 15 Bluetooth Low Energy (BLE) beacons deployed in the UJAmI SmartLab.
3. Acceleration generated by the smart watch.
4. An intelligent floor with 40 modules that provides location data.
The inhabitant who performed the activities was a 24-year-old male student from the University of Jaén. During data collection, the smart watch "LG Urbane model" [20] was worn on the participant's right hand. For reasons of energy saving, recording of acceleration data and proximity related information ceased when the inhabitant went to bed and when he left the UJAmI SmartLab.
The dataset includes 24 different types of activities, presented in Table 1 together with the frequency of each activity in the training set only.

Table 1. Activities recorded in the UJA dataset.

| Name | Activity | Freq. | Description |
| --- | --- | --- | --- |
| Act01 | Take medication | 7 | This activity involved the inhabitant going to the kitchen, taking some water, removing medication from a box and swallowing the pills. |
| Act02 | Prepare breakfast | 7 | This activity involved the inhabitant going to the kitchen and taking some products for breakfast. This activity can involve (i) making a cup of tea with a kettle or (ii) making a hot chocolate drink with milk in the microwave. This activity involves placing things to eat in the dining room, but not sitting down to eat. |
| Act03 | Prepare lunch | 6 | This activity involved the inhabitant going to the kitchen and taking some products from the refrigerator and pantry. This activity can involve (i) preparing a plate of hot food on the fire, for example pasta, or (ii) heating a precooked dish in the microwave. This activity also involves placing things to eat in the dining room, but not sitting down to eat. |
| Act04 | Prepare dinner | 7 | This activity involved the inhabitant going to the kitchen and taking some products from the refrigerator and pantry. This activity can involve (i) preparing a plate of hot food on the fire, for example pasta, or (ii) heating a precooked dish in the microwave. This activity also involves placing things to eat in the dining room, but not sitting down to eat. |
| Act05 | Breakfast | 7 | This activity involved the inhabitant going to the dining room in the kitchen in the morning and sitting down to eat. When the inhabitant finishes eating, he places the utensils in the sink or in the dishwasher. |
| Act06 | Lunch | 6 | This activity involved the inhabitant going to the dining room in the kitchen in the afternoon and sitting down to eat. When the inhabitant finishes eating, he places the utensils in the sink or in the dishwasher. |
| Act07 | Dinner | 7 | This activity involved the inhabitant going to the dining room in the kitchen in the evening and sitting down to eat. When the inhabitant finishes eating, he places the utensils in the sink or in the dishwasher. |
| Act08 | Eat a snack | 5 | This activity involved the inhabitant going to the kitchen to take fruit or a snack, and eating it in the kitchen or in the living room. This activity can imply that the utensils are placed in the sink or in the dishwasher. |
| Act09 | Watch TV | 6 | This activity involved the inhabitant going to the living room, taking the remote control and sitting down on the sofa. When he was finished, the remote control was left close to the TV. |
| Act10 | Enter the SmartLab | 12 | This activity involved the inhabitant entering the SmartLab through the entrance at the main door and putting the keys into a small basket. |
| Act11 | Play a videogame | 1 | This activity involved the inhabitant going to the living room, taking the remote controls of the TV and XBOX, and sitting on the sofa. When the inhabitant finishes playing, he gets up from the sofa and places the controls near the TV. |
| Act12 | Relax on the sofa | 1 | This activity involved the inhabitant going to the living room, sitting on the sofa and, after several minutes, getting up off the sofa. |
| Act13 | Leave the SmartLab | 9 | This activity involved the inhabitant going to the entrance, opening the main door and leaving the SmartLab, then closing the main door. |
| Act14 | Visit in the SmartLab | 1 | This activity involved the inhabitant going to the entrance, opening the main door, chatting with someone at the main door, and then closing the door. |
| Act15 | Put waste in the bin | 11 | This activity involved the inhabitant going to the kitchen, picking up the waste, then taking the keys from a small basket in the entrance and exiting the SmartLab. Usually, the inhabitant comes back after around 2 min and puts the keys back in the small basket. |
| Act16 | Wash hands | 6 | This activity involved the inhabitant going to the bathroom, opening/closing the tap, lathering his hands, and then rinsing and drying them. |
| Act17 | Brush teeth | 21 | This activity involved the inhabitant going to the bathroom, brushing his teeth and opening/closing the tap. |
| Act18 | Use the toilet | 10 | This activity involved the inhabitant going to the bathroom and using the toilet, opening/closing the toilet lid and flushing the cistern. |
| Act19 | Wash dishes | 2 | This activity involved the inhabitant going to the kitchen and placing the dirty dishes in the dishwasher, and then placing the dishes back in the right place. |
| Act20 | Put washing into the washing machine | 6 | This activity involved the inhabitant going to the bedroom, picking up the laundry basket, going to the kitchen, putting clothes in the washing machine, waiting around 20 min and then taking the clothes out of the washing machine and placing them in the bedroom closet. |
| Act21 | Work at the table | 2 | This activity involved the inhabitant going to the workplace, sitting down, doing work, and finally, getting up. |
| Act22 | Dressing | 15 | This activity involved the inhabitant going to the bedroom, putting dirty clothes in the laundry basket, opening the closet, putting on clean clothes and then closing the closet. |
| Act23 | Go to the bed | 7 | This activity involved the inhabitant going to the bedroom, lying in bed and sleeping. This activity is terminated once the inhabitant stays 1 min in bed. |
| Act24 | Wake up | 7 | This activity involved the inhabitant getting up and out of the bed. |
The activities being undertaken during data collection were annotated by using NFC tags and a smartphone. This process was used to label the beginning and end of each activity.
The root folder of the dataset contains the folders and files illustrated in Figure 2.
• The folder named "Layout" (UCAmI Cup\Layout\) contains:
o A file named "Coordinates.docx", which contains a table with the X and Y coordinates of each binary sensor and each BLE sensor in the UJAmI SmartLab.
• The folder named "Data" (UCAmI Cup\Data\) contains 10 days of recordings divided into the following two folders (refer to Figure 3):
o The folder named Test contains the data for 3 days and is unlabelled.
o The folder named Training contains data for 7 days and is fully labelled.
Each of the 10 sub-folders contains the data for one recording day. The name of each day folder has the format YYYY-MM-DD, with YYYY representing the year, MM the month and DD the day. Each day folder contains three subfolders, one for each time routine of the day. The time routines are represented by T, which can take the following values: A for the morning, B for the afternoon and C for the evening.
In a similar manner, each of the 3 sub-folders is named according to the day of the recording and the time of the routine (YYYY-MM-DD-T). Each routine-folder contains one file for each of the four data sources: Binary Sensors, Proximity (BLE sensors), Acceleration and Floor.
Furthermore, each routine-folder in the training set contains the file YYYY-MM-DD-T-activity.csv with the sequence of activities carried out in that routine and the timestamps of the beginning and the end of each activity (refer to Figure 4). The name of the inhabitant is imaginary and has been included to support the future extension of the AR evaluation for multiple occupancy scenarios. The 1st UCAmI Cup is, however, only concerned with a single inhabitant scenario.
As an example, the folders and files included in the routine-folder named "2017-11-08-A" are listed in Figure 5. This folder is contained in the training set. Similarly, the files included in the routine-folder named "2017-11-09-A" are presented in Figure 6. This day is contained in the test set and therefore does not include any labelling of the data.
• The file named "results.csv" (UCAmI Cup\results.csv) is a csv file with the timeslots for the test set, in which none of the activities have been labelled. This labelling exercise is to be completed by the participants in the UCAmI Cup. An excerpt from this file is presented in Figure 8.
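The folder naming convention described above can be handled programmatically. The following minimal sketch parses a routine-folder name of the form YYYY-MM-DD-T and rebuilds its location under the Data folder. The function names are hypothetical, and the nesting of day folders inside the Training and Test folders is taken from the description above rather than from an inspection of the archive.

```python
from datetime import date, datetime
from pathlib import PureWindowsPath

# Routine codes as described in the text: A = morning, B = afternoon, C = evening.
ROUTINES = {"A": "morning", "B": "afternoon", "C": "evening"}


def parse_routine_folder(name: str) -> tuple[date, str]:
    """Split a routine-folder name (YYYY-MM-DD-T) into (date, routine label)."""
    day_part, _, code = name.rpartition("-")
    if code not in ROUTINES:
        raise ValueError(f"unknown routine code {code!r} in {name!r}")
    day = datetime.strptime(day_part, "%Y-%m-%d").date()
    return day, ROUTINES[code]


def routine_folder_path(root: str, subset: str, day: date, code: str) -> PureWindowsPath:
    """Build <root>\\Data\\<subset>\\YYYY-MM-DD\\YYYY-MM-DD-T (assumed nesting)."""
    day_name = day.isoformat()
    return PureWindowsPath(root, "Data", subset, day_name, f"{day_name}-{code}")


day, routine = parse_routine_folder("2017-11-08-A")
print(day, routine)
print(routine_folder_path("UCAmI Cup", "Training", day, "A"))
```

`PureWindowsPath` is used only so that the printed path uses the backslash separators shown in the text; it does not touch the file system.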

Data Sources of the UJA HAR Dataset
In this Section, the four data sources of the UJAmI dataset are described in detail.

Binary Sensor Data File
In the UJAmI SmartLab, a set of 30 binary sensors was deployed. Each of them transmits a binary value together with a timestamp. The binary sensors are categorised into the following three sensor types, for which the meaning of the values is described:

• Magnetic contact. This is a wireless magnetic sensor [21] that works with the Z-Wave protocol. When the sensor detects that the two pieces of the sensor have been separated, the sensor sends an event with a value that represents "open". When the pieces of the sensor are put back together, the sensor sends an event with a value that represents "close". In our dataset, this kind of sensor is used to track the position of doors (open or closed) and is also attached to objects that have a fixed place when they are not being used, for example a TV remote control, medicine box, or bottle of water. In these instances, the value "close" means that the object is not being used, whereas the value "open" means that the object is being used.
• Motion. This is a wireless PIR sensor that works with the ZigBee protocol and is used to detect whether an inhabitant has moved in or out of the sensor's range. It has a maximum IR detection range of 7 metres with a sample rate of 5 s. When motion is detected, the sensor sends a value that represents movement. When the movement ceases, the sensor sends a value that represents no movement.
• Pressure. This is a wireless sensor that works with the Z-Wave protocol and is connected to a textile layer. When pressure is detected in the textile layer, the sensor sends a value that represents press. When the pressure ceases, the sensor sends a value that represents no press. Usually, this kind of sensor is used in sofas, chairs or beds.
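To illustrate how the semantics of a magnetic contact sensor attached to an object can be exploited, the following minimal sketch pairs each "open" event with the next "close" event of the same sensor to obtain the intervals during which the object was in use. The event tuples, the sensor names and the literal value strings are assumptions made for illustration; they are not taken from the dataset files.

```python
from datetime import datetime

# Hypothetical events: (timestamp, sensor name, value). The names and the
# literal strings "open"/"close" are assumptions, not the dataset's encoding.
events = [
    (datetime(2017, 11, 8, 9, 0, 5), "medicine_box", "open"),
    (datetime(2017, 11, 8, 9, 1, 0), "tv_remote", "open"),
    (datetime(2017, 11, 8, 9, 0, 40), "medicine_box", "close"),
    (datetime(2017, 11, 8, 9, 31, 0), "tv_remote", "close"),
]


def usage_intervals(events):
    """Pair each "open" with the next "close" of the same sensor."""
    opened = {}      # sensor name -> timestamp of the pending "open"
    intervals = []   # (sensor name, start, end), ordered by end time
    for ts, sensor, value in sorted(events, key=lambda e: e[0]):
        if value == "open":
            opened.setdefault(sensor, ts)  # ignore repeated "open" events
        elif value == "close" and sensor in opened:
            intervals.append((sensor, opened.pop(sensor), ts))
    return intervals


for sensor, start, end in usage_intervals(events):
    print(sensor, (end - start).total_seconds(), "s in use")
```

The same pairing applies to pressure sensors (press / no press) and motion sensors (movement / no movement) once the corresponding value strings are substituted.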
The details of the objects/sensors and their locations are presented in Table 2.
Table 2. Details of the binary sensors deployed in the UJAmI Smart Lab.

Proximity Data
The proximity data was collected through an Android application installed on the smart watch of the inhabitant and a set of 15 BLE beacons with a sample frequency of 0.25 Hz. The beacon model used was the Sticker from Estimote [22].
When the smart watch reads the signal from a BLE beacon, it collects a Received Signal Strength Indicator (RSSI) measurement. Each BLE beacon must set a broadcasting power with which it broadcasts its signal. The smart watch has the capability to read the RSSIs from several BLE beacons when they are in range. The proximity between a wearable device and a BLE beacon impacts upon the RSSI. The greater the RSSI received by the smart watch, the smaller the distance between it and the BLE beacon.
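As a minimal sketch of the relationship between RSSI and distance, the beacon closest to the smart watch in a single scan can be estimated as the one with the greatest RSSI. The beacon names and RSSI values below are hypothetical. Because beacons may broadcast at different powers, comparing RSSI values across beacons gives only a rough indication of proximity.

```python
def nearest_beacon(readings):
    """Return the beacon with the greatest RSSI, or None if nothing was read.

    `readings` maps a beacon name to its RSSI for one scan.
    """
    if not readings:
        return None
    return max(readings, key=readings.get)


# Hypothetical scan: RSSI values are negative, so -61 is the greatest.
scan = {"toothbrush": -82, "medicine_box": -61, "tv_remote": -90}
print(nearest_beacon(scan))  # medicine_box
```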
Fifteen BLE beacons were deployed in the UJAmI SmartLab as presented in Table 3. For small items, for example a toothbrush and medicine box, the BLE broadcasting power (measured in decibels) was set to a smaller range in an effort to reduce/avoid false positives. Further information relating to the methods that are used to obtain the proximity and the RSSI from the BLE beacon can be found in the product's SDK [22]. Figure 10 illustrates an excerpt of a file YYYY-MM-DD-T-proximity.csv.
Figure 10. Excerpt from a proximity.csv file.

Acceleration Data
The acceleration data was collected through an Android application installed on the smart watch of the inhabitant, with a sample frequency of 50 Hz. The acceleration data was collected in three axes and is expressed in metres per second squared (m/s²) [23].
The files named YYYY-MM-DD-T-acceleration.csv contain the acceleration data generated by the smart watch while the inhabitant carried out the different activities. These files contain the following fields:
• TIMESTAMP: This indicates when the data is collected.
• X: The acceleration in the x-axis.
• Y: The acceleration in the y-axis.
• Z: The acceleration in the z-axis.
Figure 11 illustrates an excerpt from a file YYYY-MM-DD-T-acceleration.csv.
Figure 11. Excerpt from an acceleration.csv file.
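A minimal sketch of reading the four fields listed above and deriving the magnitude of the acceleration vector is given below. The comma delimiter, the timestamp format and the sample values are assumptions made for illustration; the actual files should be checked before reuse.

```python
import csv
import io
import math

SAMPLE_RATE_HZ = 50  # sample frequency stated in the text

# Hypothetical excerpt: delimiter, timestamp format and values are assumed.
sample = """TIMESTAMP,X,Y,Z
2017-11-08 09:00:00.000,0.0,0.0,9.81
2017-11-08 09:00:00.020,3.0,4.0,0.0
"""


def read_magnitudes(handle):
    """Return sqrt(X^2 + Y^2 + Z^2) in m/s^2 for every row of the file."""
    reader = csv.DictReader(handle)
    return [
        math.sqrt(float(row["X"]) ** 2 + float(row["Y"]) ** 2 + float(row["Z"]) ** 2)
        for row in reader
    ]


magnitudes = read_magnitudes(io.StringIO(sample))
print(magnitudes)

# At 50 Hz, a 2-second analysis window spans 100 consecutive samples.
samples_per_window = 2 * SAMPLE_RATE_HZ
print(samples_per_window)
```

The magnitude is independent of the orientation of the watch, which makes it a common starting point for features computed over fixed-length windows.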

Floor Capacitance Data
The UJAmI SmartLab has a SensFloor® [24], which consists of a suite of capacitive sensors that lie below the floor.
The floor of the UJAmI SmartLab is formed of 40 modules that are distributed in a matrix of 4 rows and 10 columns. Each module is composed of eight sensor fields, and each sensor in a module is associated with an id-number. The layout of the SensFloor in the UJAmI SmartLab is presented in Figure 12.
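The dimensions above imply 40 × 8 = 320 sensor fields in total. The following minimal sketch maps a module index to a (row, column) position in the 4 × 10 matrix. The 0-based, row-major numbering is an assumption made for illustration; the actual numbering of modules is the one given in the layout figure.

```python
ROWS, COLS = 4, 10          # module matrix stated in the text
FIELDS_PER_MODULE = 8       # sensor fields per module stated in the text


def module_position(index):
    """Map a 0-based module index to (row, col), assuming row-major order."""
    if not 0 <= index < ROWS * COLS:
        raise ValueError(f"module index {index} outside 0..{ROWS * COLS - 1}")
    return divmod(index, COLS)


total_fields = ROWS * COLS * FIELDS_PER_MODULE
print(total_fields)          # 320
print(module_position(13))   # (1, 3)
```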

Competition
The dataset presented in Sections 3 and 4 was available for participants to train their methods and tools. The four data sources were available; participants could use one source, several of them or all of them. In order to evaluate participants' approaches and compare results within the community, the unlabelled test set with three days of recordings, which contains 77 instances, was provided.
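As a consistency check on the figures reported in this paper, the training frequencies from Table 1 can be summed and added to the 77 test instances; the total should match the 246 activities stated for the full dataset. The sketch below also reports the share of the most frequent class, which indicates how imbalanced the training set is. The variable names are chosen for illustration only.

```python
# Training-set frequencies per activity, copied from Table 1.
training_freq = {
    "Act01": 7, "Act02": 7, "Act03": 6, "Act04": 7, "Act05": 7, "Act06": 6,
    "Act07": 7, "Act08": 5, "Act09": 6, "Act10": 12, "Act11": 1, "Act12": 1,
    "Act13": 9, "Act14": 1, "Act15": 11, "Act16": 6, "Act17": 21, "Act18": 10,
    "Act19": 2, "Act20": 6, "Act21": 2, "Act22": 15, "Act23": 7, "Act24": 7,
}
TEST_INSTANCES = 77  # stated in the text

training_total = sum(training_freq.values())
print(training_total)                    # 169
print(training_total + TEST_INSTANCES)   # 246

# Share of the most frequent training class (Act17, Brush teeth).
most_frequent = max(training_freq, key=training_freq.get)
print(most_frequent, round(training_freq[most_frequent] / training_total, 3))
```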
Twenty-six participants from 10 countries (Spain, China, U.K., Argentina, Mexico, Ireland, Colombia, Sweden, South Korea and Japan) made contact with the organizers of the 1st UCAmI Cup to obtain the UJA dataset. Participants were required to use their trained methods and tools in order to recognize each activity in the test dataset. Participants were subsequently required to submit their predicted activities from the benchmarking test to the organizers of the UCAmI Cup in the format of the file named results.csv, which was described in Section 3.
By the closing date of the competition, six contributions had been submitted and the organizers computed the results in terms of classification accuracy. Let N be the number of activities from each class A, and TP the number of activities correctly classified; the classification accuracy was then defined by Equation (1):
\[ \mathrm{Accuracy} = \frac{TP}{N} \qquad (1) \]
Once the deadline to participate in the UCAmI Cup had expired (May 10, 2018), an Excel file containing the ground truth of each activity in the test set was included in the shared UJA dataset [25].
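Reading the definitions above as the fraction of activities that are correctly classified, TP / N, the evaluation can be sketched as follows. The function name and the example labels are hypothetical, and the sketch does not reproduce the organizers' actual evaluation script.

```python
def classification_accuracy(predicted, ground_truth):
    """Return TP / N, where TP counts predictions equal to the ground truth."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("at least one activity is required")
    tp = sum(p == g for p, g in zip(predicted, ground_truth))
    return tp / len(ground_truth)


# Hypothetical example: three of the four activities are recognised correctly.
truth = ["Act17", "Act10", "Act23", "Act24"]
guess = ["Act17", "Act13", "Act23", "Act24"]
print(classification_accuracy(guess, truth))  # 0.75
```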

Conclusions
In this paper, the human activity recognition dataset for activities of daily living generated at the University of Jaén has been presented within the context of the first edition of the UCAmI Cup within the International Conference on Ubiquitous Computing and Ambient Intelligence. To do so, the UJAmI Smart Lab, where the dataset was generated, was described. A general description of the dataset, its structure and its format has been presented. Furthermore, the four data sources that are included in the dataset have been presented in detail: (i) event streams from 30 binary sensors, (ii) location data from an intelligent floor, (iii) proximity data between a smart watch worn by the inhabitant and 15 Bluetooth Low Energy beacons and (iv) acceleration of the smart watch. Finally, the initial details of the competition in the UCAmI Cup have been provided. Our future work is focused on gathering all the techniques associated with the analysis of the 1st UCAmI Cup in order to publish for the first time a consolidated report of the performance of HAR on a common dataset. Planning is currently underway for the 2nd UCAmI Cup, which will involve an ADL dataset focused on multi-occupancy.