A Physical Activity Reference Data-Set Recorded from Older Adults Using Body-Worn Inertial Sensors and Video Technology—The ADAPT Study Data-Set

Physical activity monitoring algorithms are often developed using conditions that do not represent real-life activities, not developed using the target population, or not labelled to a high enough resolution to capture the true detail of human movement. We have designed a semi-structured supervised laboratory-based activity protocol and an unsupervised free-living activity protocol and recorded 20 older adults performing both protocols while wearing up to 12 body-worn sensors. Subjects’ movements were recorded using synchronised cameras (≥25 fps), both deployed in a laboratory environment to capture the in-lab portion of the protocol and a body-worn camera for out-of-lab activities. Video labelling of the subjects’ movements was performed by five raters using 11 different category labels. The overall level of agreement was high (percentage of agreement >90.05%, and Cohen’s Kappa, corrected kappa, Krippendorff’s alpha and Fleiss’ kappa >0.86). A total of 43.92 h of activities were recorded, including 9.52 h of in-lab and 34.41 h of out-of-lab activities. A total of 88.37% and 152.01% of planned transitions were recorded during the in-lab and out-of-lab scenarios, respectively. This study has produced the most detailed dataset to date of inertial sensor data, synchronised with high frame-rate (≥25 fps) video labelled data recorded in a free-living environment from older adults living independently. This dataset is suitable for validation of existing activity classification systems and development of new activity classification algorithms.


Introduction
The share of people aged 65 years and over, among the world's dependents, has doubled since the mid-1960s, reaching 20% in 2015. Projections estimate that by 2050, older persons will account for 36% of people in the dependent age group worldwide [1]. With this projected shift in population demographics, increased demand will be placed on national health care services and budgets. The classification and monitoring of human physical activities, using wearable technology, can improve health assessment and monitoring systems and thus promote safer independent living and early detection of health deterioration in this population.
Recent developments in integrated circuit design and specifically Micro Electro Mechanical Systems (MEMS) technology has stimulated the advancement of ubiquitous body-worn inertial sensors, facilitating the accurate measurement of body-segment kinematics. These MEMS-based inertial sensors consist of a seismic mass suspended using supporting springs, etched into the silicon layer of miniature integrated circuits. Movement of the mass is governed by the combination of Hook's Law and Newton's Even with improvements in protocol design, the methods of annotation of recorded datasets for the development and validation of activity classification algorithms can be improved upon. Such methods include self-report labelling [14]; direct human observation of the person's movements labelled in real-time on paper [16]; using a portable device (e.g., touch screen tablet) or laptop [17]; a combination of video recordings and reference inertial sensor [12]; or a previously validated inertial sensor-based reference system [18] with the method of direct observation combined with live annotation, as reported in [17] and employed in [18]. This last method suffers from human error and inaccuracy due to attentive observation and a reported error of 1 to 3 s [17]. Video validation of inertial sensor-based activity monitors has previously been performed by Taylor et al. [19] who used video analysis to allocate four categories (standing, sitting, lying, and locomotion) in 1 s resolution; Capela et al. [20] who used six categories (stand, sit, lie, walk, stairs, and small moves) in 1 s resolution; and Aminian et al. [21] who used five categories and a resolution of 10 s. However, the video resolution of these recordings is insufficient to validate various daily life activity transitions, where typically higher resolutions of tens of frames per second is necessary [22].
The aim of this study is to resolve previous shortcomings by compiling a comprehensive reference dataset of representative activities from an older adult population that is suitable for the validation of existing activity classification algorithms and allows for the development of new activity classification algorithms using the harvested raw sensor data.

Materials and Methods
The aim of this study will be achieved in two steps: (1) develop and describe a comprehensive flexible semi-structured supervised task-based protocol, and a free-living unsupervised task-based protocol, where a wide range of representative activities and postures are included; and (2) compile a representative reference dataset using a population of community dwelling older adults recorded performing the developed protocols, while being monitored using high frame-rate video technology of ≥25 fps (≤0.04 s resolution) and a selection of multiple, synchronised body-worn inertial sensors.

Subjects
A convenience sample of 20 older adult participants was recruited from a senior citizen centre in the Trondheim area in Norway. As inclusion criteria, participants were required to: (1) be over 65 years of age; (2) be able to walk 100 m without walking aids; (3) accept oral instructions; and (4) be living independently. A total of 5 male and 15 female were recruited, ranging in age from 68 to 90 years (76.4 ± 5.6 years), body mass from 56 to 93 kg (73.7 ± 11.4 kg), and height from 1.56 to 1.81 m (1.67 ± 0.072 m). The Regional Committee on Ethics in Medical Research in Central Norway approved the trial protocol and subjects provided written informed consent.

Sensor Set-Up
The choice of sensors and body locations was motivated by the potential for algorithm development from popular activity monitoring device attachment locations [2,3] and existing large datasets recorded from independent living older adults in previous projects, where detection of falls and the assessment of fall risk was the focus (see Table 1 and Figure 1). These projects include the FARSEEING project [23], the Generation 100 project [24], the PreventIT project [25] and the HUNT population-based study [26]. Through developing accurate activity classification algorithms from different body-worn sensor locations, used by the sensors in each project, a common output can be obtained. This harmonises these datasets and allows for the development of fall-risk assessment algorithms through a common physical activity output.
Company/Institution PAL Technologies Ltd., Glasgow, UK Figure 1. The sensors configuration, where * indicates the sensors that will only be used in the semistructured protocol, and ** indicates the camera that will be attached for the out-of-lab activities.

Activity Selection
A list of activities that are commonly performed in everyday life by older adults was compiled using the following procedure (see flowchart Figure 2). First, the Compendium of Physical Activity by Ainsworth et al. [27] was consulted to identify individual postures and behaviours that occur in everyday life; Second, these postures and behaviours were combined to 41 independent categories (e.g., walking, sitting, standing, etc.); Third, activities related to sport and other confounding activities were excluded, resulting in a list of 11 individual posture and behaviours represented in Table 2 that are related to daily physical activity. Fourth, transitions between the 11 general postures and behaviours were defined, as presented in Table 3. Several transition types were not included as part of the protocol as they are either rare events (e.g., lie-to-picking and lie-to-leaning) or will not induce a meaningful transfer (e.g., picking-to-leaning, kneeling-to-picking, and kneeling-to-leaning). Two task-based protocols were then designed to collect a minimum sufficient number of the desired transitions, (1) a supervised semi-structured protocol and (2) a free-living unsupervised protocol. A more detailed breakdown of the desired quantity of general postures, transitions and behaviours for the supervised semi-structured protocol is described in Appendix A, Table A1, and the free-living protocol in Appendix B, Table A2.  Table 2. The sensors configuration, where * indicates the sensors that will only be used in the semi-structured protocol, and ** indicates the camera that will be attached for the out-of-lab activities.

Activity Selection
A list of activities that are commonly performed in everyday life by older adults was compiled using the following procedure (see flowchart Figure 2). First, the Compendium of Physical Activity by Ainsworth et al. [27] was consulted to identify individual postures and behaviours that occur in everyday life; Second, these postures and behaviours were combined to 41 independent categories (e.g., walking, sitting, standing, etc.); Third, activities related to sport and other confounding activities were excluded, resulting in a list of 11 individual posture and behaviours represented in Table 2 that are related to daily physical activity. Fourth, transitions between the 11 general postures and behaviours were defined, as presented in Table 3. Several transition types were not included as part of the protocol as they are either rare events (e.g., lie-to-picking and lie-to-leaning) or will not induce a meaningful transfer (e.g., picking-to-leaning, kneeling-to-picking, and kneeling-to-leaning). Two task-based protocols were then designed to collect a minimum sufficient number of the desired transitions, (1) a supervised semi-structured protocol and (2) a free-living unsupervised protocol. A more detailed breakdown of the desired quantity of general postures, transitions and behaviours for the supervised semi-structured protocol is described in Appendix A, Table A1, and the free-living  protocol in Appendix B, Table A2. The sensors configuration, where * indicates the sensors that will only be used in the semistructured protocol, and ** indicates the camera that will be attached for the out-of-lab activities.

Activity Selection
A list of activities that are commonly performed in everyday life by older adults was compiled using the following procedure (see flowchart Figure 2). First, the Compendium of Physical Activity by Ainsworth et al. [27] was consulted to identify individual postures and behaviours that occur in everyday life; Second, these postures and behaviours were combined to 41 independent categories (e.g., walking, sitting, standing, etc.); Third, activities related to sport and other confounding activities were excluded, resulting in a list of 11 individual posture and behaviours represented in Table 2 that are related to daily physical activity. Fourth, transitions between the 11 general postures and behaviours were defined, as presented in Table 3. Several transition types were not included as part of the protocol as they are either rare events (e.g., lie-to-picking and lie-to-leaning) or will not induce a meaningful transfer (e.g., picking-to-leaning, kneeling-to-picking, and kneeling-to-leaning). Two task-based protocols were then designed to collect a minimum sufficient number of the desired transitions, (1) a supervised semi-structured protocol and (2) a free-living unsupervised protocol. A more detailed breakdown of the desired quantity of general postures, transitions and behaviours for the supervised semi-structured protocol is described in Appendix A, Table A1, and the free-living protocol in Appendix B, Table A2.   Table 2.  Table 3. Matrix of transfers between non-upright static postures and from non-upright static posture to an upright posture or upright activity. Transitions in grey were deemed unnecessary to include in the protocol as they are either rare events or will not induce a meaningful transfer. Stepping stepping-tostand stepping-to-sit steppingto-lie stepping-tokneeling * pick an object-to-stepping stepping-to leaning * indicates that this transfer/posture is only relevant for the in-lab protocol due to its difficult nature.

Supervised Semi-Structured Protocol
The semi-structured protocol was performed in a smart-home environment in the Usability Laboratory at the Faculty of Medicine at the Norwegian University of Science and Technology, Trondheim, Norway. This laboratory consists of three rooms plus an observation room. The three rooms contained different types of furniture and ceiling-mounted cameras, which are monitored and controlled from the observation room (see Figure 3). MultiCam Studio and Camtasia Studio screen software was used to control and capture the camera feeds from the smart-home environment. The resulting video was recorded at 25 fps at a resolution of 768 pixels × 576 pixels in an AVI file format.
The subjects were instructed to perform the task-based protocol described in Table 4, where the instruction set is presented in Appendix C, Table A3. A synchronisation handshake was performed in view of the cameras prior to sensor attachment. The handshake consisted of a series of static and dynamic movements of the sensors which were evident in the root-sum-of-squares accelerometer signal. Through identifying the maximum correlation between the square wave outputs from the static/dynamic video data, synchronisation between the cameras and the raw sensors' signals is achieved. The sensors were then fitted to the participants in the configuration described in Figure 1. Once all sensors were attached, the supervised semi-structured protocol was performed by the participant, guided by one of the study investigators. Prior to completion of the stair climbing tasks, a GoPro 1 Hero3+ camera was attached to the chest of the participant using a GoPro Chesty TM harness (GoPro, Inc., San Mateo, CA, USA). A second synchronisation handshake, consisting of standing, lying and jumping was performed, allowing for synchronisation between the GoPro camera, the raw sensor's signals and the Usability Laboratory cameras. This also constituted the transition to the out-of-lab scenario. The study investigator then instructed the participant on completing the stair climbing task. Following this the sensors attached to the feet were removed and the participants were provided with a taxi and returned home to perform the free-living protocol unsupervised, see Table 5. Once all sensors were attached, the supervised semi-structured protocol was performed by the participant, guided by one of the study investigators. Prior to completion of the stair climbing tasks, a GoPro 1 Hero3+ camera was attached to the chest of the participant using a GoPro Chesty TM harness (GoPro, Inc., San Mateo, CA, USA) A second synchronisation handshake, consisting of standing, lying and jumping was performed, allowing for synchronisation between the GoPro camera, the raw sensor's signals and the Usability Laboratory cameras. This also constituted the transition to the outof-lab scenario. The study investigator then instructed the participant on completing the stair climbing task. Following this the sensors attached to the feet were removed and the participants were provided with a taxi and returned home to perform the free-living protocol unsupervised, see Table  5. Figure 3. View from the in-lab cameras as a subject performs the semi-structured protocol and the floor plan of the four different activity zones used in the instructions to the participants, represented in Appendix C, Table A3. Table 4. The semi-structured supervised task-based protocol.

Semi-Structured Protocol
Stand-to-sit-to-stand at a table Stand-to-sit-to-stand on a soft chair Sit-to-kneel-to-sit Stand-to-lie-to-stand Lying-to-sit-to-lying Stand-to-kneel-to-stand Stand-to-pick an object off the floor-to-stand Stand-to-lean to pick an object off a table forward-to-stand Stand-to-sit, while sitting, pick an object off the floor forward-to-stand Stand-to-sit, while sitting, pick an object off the floor right-to-stand Stand-to-sit, while sitting, pick an object off the floor left-to-stand Stand-to-sit at a table-to-walk-to-pick an object off the floor-to-sit-to-stand Lying-to-walk-to-pick an object off the floor-to-lying Sitting on a soft chair-to-walk-to-pick an object off the floor-to-walk-to-sit Stand-to-move objects from one table to another Stand-to-walk(normal)-to-stand Stand-to-walk(fast)-to-stand Stand-to-walk(slow)-to-stand Stand-to-ascend stairs-to-stand-to-descend stairs  Table A3. Table 4. The semi-structured supervised task-based protocol.

Semi-Structured Protocol
Stand-to-sit-to-stand at a table Stand-to-sit-to-stand on a soft chair Sit-to-kneel-to-sit Stand-to-lie-to-stand Lying-to-sit-to-lying Stand-to-kneel-to-stand Stand-to-pick an object off the floor-to-stand Stand-to-lean to pick an object off a table forward-to-stand Stand-to-sit, while sitting, pick an object off the floor forward-to-stand Stand-to-sit, while sitting, pick an object off the floor right-to-stand Stand-to-sit, while sitting, pick an object off the floor left-to-stand Stand-to-sit at a table-to-walk-to-pick an object off the floor-to-sit-to-stand Lying-to-walk-to-pick an object off the floor-to-lying Sitting on a soft chair-to-walk-to-pick an object off the floor-to-walk-to-sit Stand-to-move objects from one table to another Stand-to-walk(normal)-to-stand Stand-to-walk(fast)-to-stand Stand-to-walk(slow)-to-stand Stand-to-ascend stairs-to-stand-to-descend stairs Table 5. The free-living unsupervised task-based protocol.

Free-Living Protocol
Sit at a table and write a letter/list or read Sit on an armchair watch TV/video, or read a magazine Sit on a low stool or toilet seat (lid down clothes on, simulation only) Lie on a bed, clothes on Get in and out of a car or sit on a bed Prepare and consume a drink or food while standing Set a table for dinner or move from one counter to another many times (up to 10) (shuffling) Simulate unloading a washing machine for 10 s or prepare a fireplace Pick an object off the floor then replace or tie/untie shoe laces Climbing and descending stairs or walking up and down an inclined path Remove clothes from washing machine and hang on clothes rack or remove rubbish from bin and dispose Sit and prepare and eat something Clean mirror or clean a window Wash and dry hands Sit at a table and read

Free-Living Unsupervised Protocol
The participants were instructed to perform the free-living tasks, see Table 5, in their own chosen order in their home environment. The free-living unsupervised tasks were recorded using a body-worn camera, GoPro Hero3+ camera (GoPro, Inc., San Mateo, CA, USA) with a 64 GB SanDisk Ultra XC I micro SD card, worn at the chest, attached using a harness (GoPro Chesty TM ). Video files were recorded at 29.97 fps at 1280 pixels × 720 pixels in an MP4 format in 20-min lengths. The GoPro camera was pointed towards the feet as illustrated in Figure 4. This camera angle was chosen as it provides a view of both the subject's lower extremities and the local environment simultaneously, thus allowing for convenient identification of the type of activity and the orientation of the body relative to the surroundings. The sensors and the GoPro camera were collected in the evening by a project co-worker, after the GoPro camera had stopped recording and the participant had removed the sensors. The camera and sensors' data were downloaded to a computer in their respective raw data formats, using a USB interface, for later off-line data processing and analysis using MATLAB (The MathWorks Inc., Natick, MA, USA).  Table 5. The free-living unsupervised task-based protocol.

Free-Living Protocol
Sit at a table and write a letter/list or read Sit on an armchair watch TV/video, or read a magazine Sit on a low stool or toilet seat (lid down clothes on, simulation only) Lie on a bed, clothes on Get in and out of a car or sit on a bed Prepare and consume a drink or food while standing Set a table for dinner or move from one counter to another many times (up to 10) (shuffling) Simulate unloading a washing machine for 10 s or prepare a fireplace Pick an object off the floor then replace or tie/untie shoe laces Climbing and descending stairs or walking up and down an inclined path Remove clothes from washing machine and hang on clothes rack or remove rubbish from bin and dispose Sit and prepare and eat something Clean mirror or clean a window Wash and dry hands Sit at a table and read

Free-Living Unsupervised Protocol
The participants were instructed to perform the free-living tasks, see Table 5, in their own chosen order in their home environment. The free-living unsupervised tasks were recorded using a bodyworn camera, GoPro Hero3+ camera (GoPro, Inc., San Mateo, CA, USA) with a 64 GB SanDisk Ultra XC I micro SD card, worn at the chest, attached using a harness (GoPro Chesty TM ). Video files were recorded at 29.97 fps at 1280 pixels × 720 pixels in an MP4 format in 20-min lengths. The GoPro camera was pointed towards the feet as illustrated in Figure 4. This camera angle was chosen as it provides a view of both the subject's lower extremities and the local environment simultaneously, thus allowing for convenient identification of the type of activity and the orientation of the body relative to the surroundings. The sensors and the GoPro camera were collected in the evening by a project coworker, after the GoPro camera had stopped recording and the participant had removed the sensors. The camera and sensors' data were downloaded to a computer in their respective raw data formats, using a USB interface, for later off-line data processing and analysis using MATLAB (The MathWorks Inc., Natick, MA, USA).

Pre-Processing and Video Annotation
The video files from the Usability Laboratory were split into files of maximum 20 min in length, using VideoPad by NCH Software (NCH Software, Inc., Greenwood Village, CO, USA) to make them compatible with the video annotation software. The videos obtained by both the wall mounted and the body-worn camera were then converted into an AVI file format with a resolution of 640 pixels × 360 pixels using a the Apple Cinepak codec, maintaining a frame rate of 25 fps and 30 fps, respectively. The videos were annotated using the Anvil software package [28]. It offers multi-layered annotation based on a user-defined coding scheme. An activity track was created where the 11 general postures and behaviours in Table 2 could be inserted (see example in Figure 5).

Pre-Processing and Video Annotation
The video files from the Usability Laboratory were split into files of maximum 20 min in length, using VideoPad by NCH Software (NCH Software, Inc., Greenwood Village, CO, USA) to make them compatible with the video annotation software. The videos obtained by both the wall mounted and the body-worn camera were then converted into an AVI file format with a resolution of 640 pixels × 360 pixels using a the Apple Cinepak codec, maintaining a frame rate of 25 fps and 30 fps, respectively. The videos were annotated using the Anvil software package [28]. It offers multi-layered annotation based on a user-defined coding scheme. An activity track was created where the 11 general postures and behaviours in Table 2 could be inserted (see example in Figure 5). Four raters individually labelled the videos of in-lab activities and five labelled the out-of-lab activity videos. Raters were instructed to label the activities described in Table 2, using a set of definitions, and not allow any space between any elements in the activity track. In addition, an "undefined" category was introduced that occurred when the rater could not determine what activity the person was performing, if the camera view became blocked, or the lighting was poor. The labelling took place in a swipe-card secured PC laboratory at the Faculty of Neuroscience at St. Olav's Hospital. One 20 min in-lab and one out-of-lab video were randomly chosen to test for the inter-rater reliability of the four and five raters, respectively. The statistics for the inter-rater reliability were Category Agreement percentage [29], Cohen's kappa [29], corrected kappa, Krippendorff's alpha [30] and Fleiss's kappa.
For all video labelled data, the following statistics were recorded for both the in-lab and freeliving protocols: the quantity of activities, the maximum bout length, minimum bout length, average bout length, the standard deviation, the total time and the percentage of the overall activity time.

Inter-Rater Reliability
The overall level of agreement was high for the in-lab video coding, with the percentage of agreement at 90.85% and Cohen's kappa, corrected kappa, Krippendorff's alpha and Fleiss's kappa all over 0.86, see Table 6. Four raters individually labelled the videos of in-lab activities and five labelled the out-of-lab activity videos. Raters were instructed to label the activities described in Table 2, using a set of definitions, and not allow any space between any elements in the activity track. In addition, an "undefined" category was introduced that occurred when the rater could not determine what activity the person was performing, if the camera view became blocked, or the lighting was poor. The labelling took place in a swipe-card secured PC laboratory at the Faculty of Neuroscience at St. Olav's Hospital. One 20 min in-lab and one out-of-lab video were randomly chosen to test for the inter-rater reliability of the four and five raters, respectively. The statistics for the inter-rater reliability were Category Agreement percentage [29], Cohen's kappa [29], corrected kappa, Krippendorff's alpha [30] and Fleiss's kappa.
For all video labelled data, the following statistics were recorded for both the in-lab and free-living protocols: the quantity of activities, the maximum bout length, minimum bout length, average bout length, the standard deviation, the total time and the percentage of the overall activity time.

Inter-Rater Reliability
The overall level of agreement was high for the in-lab video coding, with the percentage of agreement at 90.85% and Cohen's kappa, corrected kappa, Krippendorff's alpha and Fleiss's kappa all over 0.86, see Table 6. Table 6. Inter-rater reliability statistics for the in-lab scenario.

Activities
A total of 9.521 h of in-lab activities were recorded using the semi-structured protocol (see Table 7) The activity standing was the most commonly performed activity (34.01%) followed by sitting (23.67%), transition (18.31%), walking (13.02%), shuffling (6.10%) and lying (4.09%). The activities of kneeling, picking and leaning accounted for less than 1% of all activities recorded (0.79%).

Transitions
A total of 2640 transitions were planned for the in-lab scenario, however, 2677 transitions were recorded in total. Of the 2677 transitions recorded, 2333 were part of the protocol, while the remaining 344 were not. Thus 88.37% of planned transitions were recorded ( Table 8). Out of the 22 types of transitions that were part of the protocol, 13 produced fewer transitions (range from −63.33% to −1.58% fewer), while nine produced more (range from 36.67% to 1.11%). A total of 18 transitions that were not part of the protocol were also performed.

Inter-Rater Reliability
The overall level of agreement was high for the out-of-lab video labelling, with the percentage of agreement at 90.05% with Cohen's Kappa, corrected kappa, Krippendorff's alpha and Fleiss's kappa all over 0.86 (see Table 9).

Activities
A total of 34.408 h of out-of-lab activities were recorded using the free-living protocol and the stair-climbing task at the end of the semi-structured protocol (see Table 10). The activity sitting was the most commonly performed activity (48.09%) followed by standing (22.17%), walking (14.22%), transition (5.12%), shuffling (4.67%), leaning (2.32%) and lying (1.33%). The activities of stair climbing, picking and kneeling accounted for 2.09% all activities recorded in the out-of-lab scenario.

Transitions
A total of 1080 transitions were planned as part of the free-living protocol, however 3442 transitions were recorded (see Table 11). In total, 16 transitions were planned as part of the protocol, 10 produced fewer transitions (range from −100% to −4.17% fewer), with one transition not being completed at all, "lying-transition-standing", while six produced more (range from 290% to 70% more). A total of 37 transitions that were not part of the protocol were also performed.

In-Lab and Out-of-Lab Activities
A total of 43.93 h of video annotation activity data were recorded (see Table 12). The activity sitting was the activity performed most often (42.8%), followed by standing (24.73%), walking (13.96%), transitions (7.98%), shuffling (4.98%) lying (1.93%) and leaning (1.84%). The activities of stair climbing, picking and kneeling account for less than 2% of the overall activity (1.78%). Considering the most common activities of sitting, standing, lying and walking (including stair ascending and stair descending) account for 84.61% of all activities recorded. However, the remaining activities of transitions, shuffling, leaning, picking and kneeling do still constitute a relevant proportion of activities (15.39%) which are often overlooked in activity classification systems.

Discussion
We have compiled a comprehensive dataset of representative activities from an independently living, older adult population recorded using two task-based protocols in a laboratory setting and a free-living setting in the participants' home environment. This dataset is suitable for the validation of existing activity classification algorithms and will allow for the development of new activity classification algorithms using the harvested raw inertial sensor data.
A strength of the dataset is that it resulted from two protocols, a semi-structured protocol and a free-living protocol. The semi-structured protocol is designed for a laboratory setting where activities are performed under supervision; a protocol of this nature offers a compromise between achieving the desired number of planned activities and transitions with the trade-off that these are performed under supervised conditions and thus not performed as naturally as possible. The free-living protocol is designed for a person's own home environment, where activities are performed without any supervision; a protocol of this nature prioritises the quality of the activities over the quantity of activities, ensuring that activities are performed as naturally as possible. This design makes it suitable to compare the performance of existing and new algorithms developed in an in-lab setting for an out-of-lab application.
We used video data as the gold standard for classifying activities, with labelling of the subjects' movements performed by five raters. For both the in-lab and out-of-lab video data, the overall level of agreement was high (percentage of agreement at 90.85% and 90.05%, respectively). The Cohen's Kappa, corrected kappa, Krippendorff's alpha and Fleiss' kappa were all over 0.86 for both the in-lab and out-of-lab videos for the chosen activity categories, thus demonstrating that the raters successfully labelled the video data with a high level of agreement.
A total of 43.93 h of activities were recorded, including 9.52 h of in-lab activities and 34.41 h of out-of-lab activities. Standing was the most commonly performed activity in the in-lab scenario (34.01%), ahead of sitting (23.67%), while the opposite was true for the out-of-lab scenario, with sitting performed more often in the out-of-lab scenario (48.09%) than standing (22.17%). In the in-lab scenario, transitions were performed 18.31% of the time, whereas for the out-of-lab scenario, they were performed less often (5.12%). The effect of the semi-structured protocol can be clearly seen in the increased amount of transition time in the in-lab scenario due to the intensive nature of the semi-structured protocol.
The quantity of walking in both scenarios was approximately equal, 13.02% for the in-lab scenario and 14.22% for the out-of-lab scenario. There were more shuffling episodes in the in-lab scenario (6.1%) than in the out-of-lab scenario (4.67%), being 23.44% higher in the in-lab scenario. However, both were relatively low in both scenarios and were the fifth most common activity. Lying was much more frequent in the in-lab scenario (4.09%) than in the out-of-lab scenario (1.33%), however this can be expected as the out-of-lab recording did not include any overnight recording, and any lying activity was motivated by the influence of both protocols. The difference between the percentages of activities performed between the in-lab scenario and the out-of-lab scenario can be attributed to the influence of the semi-structured protocol and the free-living protocol. In the semi-structured protocol, participants were instructed to perform tasks which incorporated specific activities, while in the presence of a study investigator. In the out-of-lab scenario, participants were only requested to incorporate specific tasks as part of a free-living protocol and were not in the presence of a study investigator. They could thus choose to perform these tasks how they wished or not at all. In addition, the semi-structured protocol incorporated 19 different tasks to be completed three times, whereas the free-living protocol incorporated 15 tasks to be completed only once.
The activities performed during the free-living protocol are less susceptible to the Hawthorn effect [4,5] due to the unsupervised nature of the protocol; in addition, since the participants are performing the protocol in their own home and thus a familiar environment, this results in a more natural pattern of distribution of activities and performance quality. Ultimately, the application of physical activity classification algorithms is in a free-living setting. If a high accuracy can be obtained using video-validated data harvested in an in-field setting, more accurate algorithms can be developed.
The difference between the planned transitions and the recorded transitions in both protocols can be attributed to the manner in which participants were able to perform the tasks. For the semi-structured protocol the investigators planned the tasks to include specific activities. Thus, the task "Stand-sit-stand at a table" was planned to consist of stand-transition-sit-transition-stand. However, this task could have required the participant to adjust their body to position themselves to sit on the chair placed at a table. This positioning of the body, for descent into a sitting position, could have required some shuffling, which is also supported by the finding that more shuffling was performed in the in-lab setting. Thus, this task could have consisted of stand-shuffle-transitionsit-transition-shuffle-stand, for example, and thus shuffling-transition-sit and sit-transition-shuffling may have been recorded instead of stand-transition-sit and sit-transition-stand.
The analysis of the difference between the planned transitions and the recorded transitions provides an insight into modifications required that would produce a more balanced dataset of activities, which can be efficiently accumulated as part of a semi-structured protocol.
This dataset is a valuable resource for the development of new physical activity classification algorithms given the level of detail used in the annotation and the fact that a lower limit was placed on the amount of rarely observed activities, e.g., lying and transitions. Thus, if this dataset is used as part of a machine learning approach, transitions from a wide spectrum of physical activity will be used and thus more robust algorithms can be developed. However, a limitation of this study is that the dataset is not balanced. In order to create a dataset ideally suited to the development of an activity classification algorithm, using a machine learning approach, a dataset with an equal amount of each activity is preferred, thus eliminating any classification bias. This is referred to as a balanced dataset in the work by Guiry et al. [31]. Even if the dataset created here can be described as unbalanced, it does more closely reflect the proportion of activities that would occur in a real world setting, given the nature of the free-living protocol, in that the participants were unsupervised for 78.32% of the time, and were only given guidelines on which tasks to perform, not how and when. Future studies could incorporate a higher frequency of certain tasks to increase the number of transitions and activities to achieve a more balanced dataset; other techniques include using synthetic minority over-sampling [32] to artificially increase the minority classes in the dataset, or simply removing data from the dataset to create a balanced sub-set.
A strength of the current dataset is that actual older adults performed the protocols. This has not always been the case for other studies, despite being aimed at developing algorithms for classifying activities and for assessing features of movement behaviour in older adults. We included older home-dwelling adults who were independent in mobility only. Thus, this dataset is likely less suitable for analysis of algorithms for older adults that are dependent in daily life activities.
To the best of our knowledge, this study is the first to generate a dataset of inertial sensor data using a free-living protocol in an unsupervised setting, using high frame-rate video recording to label participants' movements, producing a dataset annotated at 25 frames per second recorded from older adults. This will allow activity classification algorithms with inertial sensor data to be filtered up to 12.5 Hz (Nyquist-Shannon sampling theorem) if a window width of one frame is desired. Since the typical frequency of body motion is often below 10 Hz [33], with 99% of body motion energy contained below 15 Hz, the developed algorithms will almost entirely capture the details of human movement, thus alleviating the measurement error that is often a feature of existing activity classification devices, since these datasets are often labelled with a resolution of approximately 1 s (1 Hz) or even coarser. Thus if the measurement range of an activity classification system is within the same error range as the parameter of interest in the research question, these existing algorithms and systems will not be adequately sensitive.
Given the design of the semi-structured supervised protocol and the free-living unsupervised protocol, a wide variety of transitions, postures and activities has been generated in a way that is as natural as possible due to the task-based nature of the trial protocols. It is difficult to generate the required number of activities that a study requires in order to obtain a completely balanced dataset that consists of an adequately high number of all transitions and activities. However, in order to generate activities that are performed as naturally as possible, a study of this type should include a protocol that is as close as possible to real-life situations.

Conclusions
In conclusion, we have described the development and collection of a dataset that is suitable for validation of existing, and development of new, activity classification algorithms. The strengths of the dataset include that it consists of both a semi-structured and free-living protocol, and that it has involved older adults as participants. Furthermore, this study has produced the most detailed dataset of inertial sensor data to date, synchronised with high frame-rate (>25 fps) video-labelled data and includes a wide variety of activities recorded from older adults living independently. This dataset will be suitable for validation of existing activity classification systems and the development of new activity classification algorithms capable of classification at up to 25 Hz. Researchers are also invited to collaborate with the consortium on specific research questions and get access to the full dataset. The authors will consider each proposal for collaboration. Development and validation of algorithms using the dataset will allow for a better understanding of the accuracy of existing algorithms and has the potential to remove the measurement inaccuracy in existing academic activity classification algorithms caused by low-resolution labelling of the contributing datasets.  Appendix B Table A2. Postures, transitions and behaviours for the free-living protocol for each task on a per subject basis and for a total of 20 subjects, assuming each task is repeated once.