Data Descriptor

A Multi-Sensor Dataset for Human Activity Recognition Using Inertial and Orientation Data

by Jhonathan L. Rivas-Caicedo 1,*, Laura Saldaña-Aristizabal 1, Kevin Niño-Tejada 1 and Juan F. Patarroyo-Montenegro 2,*
1 Department of Electrical and Computer Engineering, University of Puerto Rico, Mayaguez, PR 00680, USA
2 Department of Computer Science and Engineering, University of Puerto Rico, Mayaguez, PR 00680, USA
* Authors to whom correspondence should be addressed.
Data 2025, 10(8), 129; https://doi.org/10.3390/data10080129
Submission received: 8 July 2025 / Revised: 7 August 2025 / Accepted: 12 August 2025 / Published: 14 August 2025

Abstract

Human Activity Recognition (HAR) using wearable sensors is an increasingly relevant area for applications in healthcare, rehabilitation, and human–computer interaction. However, publicly available datasets that provide multi-sensor, synchronized data combining inertial and orientation measurements are still limited. This work introduces a publicly available dataset for Human Activity Recognition, captured using wearable sensors placed on the chest, hands, and knees. Each device recorded inertial and orientation data during controlled activity sessions involving participants aged 20 to 70. A standardized acquisition protocol ensured consistent temporal alignment across all signals. The dataset was preprocessed and segmented using a sliding window approach. An initial baseline classification experiment, employing a combined Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model, demonstrated an average accuracy of 93.5% in classifying activities. The dataset is publicly available in CSV format and includes raw sensor signals, activity labels, and metadata. This dataset offers a valuable resource for evaluating machine learning models, studying distributed HAR approaches, and developing robust activity recognition pipelines utilizing wearable technologies.
Dataset License: CC-BY 4.0

1. Summary

Human Activity Recognition (HAR) using wearable sensors plays a vital role in many emerging applications, such as rehabilitation, elderly care, sport performance monitoring, and human–robot interaction. Although significant progress has been made using data from smartphones and individual sensors [1,2,3], many real-world scenarios require more complex sensing systems that combine data from multiple body locations. However, HAR research continues to face several challenges, including the lack of publicly available datasets with synchronized multi-sensor data, variability in sensor placement across users, limited support for orientation-aware models, and difficulty evaluating robustness to sensor loss or communication delays. These limitations hinder the development of accurate, generalizable, and energy-efficient HAR systems suitable for real-world deployment.
Numerous publicly available datasets have been developed to support research in HAR, as shown in Table 1, particularly those based on wearable inertial sensors. These datasets typically include accelerometers and gyroscopes placed on different body parts or embedded in smartphones and smartwatches. For instance, the UCI HAR [1] and WISDM [2] datasets use smartphone sensors to record six common activities, providing a lightweight solution for everyday activity tracking. However, these datasets are often limited by single-sensor configurations and lack detailed spatial orientation data.
Other datasets, such as PAMAP2 [3] and KU-HAR [4], incorporate multiple sensing modalities, including magnetometers, enabling more comprehensive motion analysis. While these datasets provide a broader range of activities and richer sensor data, they often include a small number of participants or lack consistent multi-sensor synchronization. Opportunity [5], for example, integrates body-worn and ambient sensors to capture diverse activities, but its use is limited by the small sample size (four participants) and the high complexity of its setup.
Industrial-focused datasets like Skoda [6] and REALDISP [7] offer large activity sets or controlled environments but are constrained in participant variability or do not include quaternion orientation information. Most existing datasets also lack real-time streaming configurations and do not address distributed sensing scenarios, which are increasingly relevant for embedded or edge-based HAR systems.
This work presents a comprehensive dataset collected using five MetaMotionRL wearable sensors from MbientLab (San Jose, CA, USA), attached to the chest, hands, and knees. The sensors record tri-axial acceleration, tri-axial angular velocity, and four-component quaternions representing orientation. Data were collected from adult participants (aged 20 to 70) performing a series of daily activities under a structured yet natural protocol. The dataset is intended to support research on centralized and distributed machine learning models for HAR, sensor fusion, and robustness to sensor loss [8,9,10].
This dataset was collected as part of a research project focused on developing distributed neural networks for HAR on embedded edge devices [11,12]. A journal article is currently in preparation based on this dataset. By releasing the dataset publicly, we aim to encourage reproducibility, benchmarking, and collaborative development of robust HAR systems.

2. Data Description

2.1. Data Structure and Format

The dataset is organized as a collection of CSV (Comma-Separated Values) files, each representing a complete data recording session from a single participant. In total, 67 files are included, one per subject, following the naming convention SynchronizedDataSubjectX_etiquetado.csv, where X is an integer from 1 to 67 identifying the subject.
Each file contains multivariate time-series data from wearable inertial sensors placed at standardized body locations:
  • Chest (sternum)
  • Left hand
  • Right hand
  • Left knee
  • Right knee
The signals have been temporally aligned using a post-processing synchronization algorithm, ensuring consistent time correspondence across all sensors. Data were sampled uniformly, enabling precise reconstruction of temporal dynamics.
Each row includes 30 continuous-valued features describing orientation, linear acceleration, and angular velocity from the sensors, along with a final column indicating the corresponding activity label. The columns follow a naming convention that combines the variable type and sensor location, making the dataset self-descriptive and machine-readable. Below is a breakdown of the data columns:
  • Orientation (Quaternions)
  • Linear acceleration (Accelerometer)
  • Angular velocity (Gyroscope)
  • Activity label
The structure of the dataset allows for immediate use in machine learning pipelines and supports a wide range of research applications, such as activity classification, sensor fusion, distributed learning architectures, and robustness analysis. The dataset can be easily loaded into Python (v3.11), MATLAB (R2024b), or any environment capable of parsing standard CSV files.
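As a quick illustration, the following minimal Python sketch (assuming pandas is installed and that the CSV files sit in a local folder, here hypothetically named data/) loads all per-subject files into a single DataFrame while retaining the subject identifier:

```python
import glob
import pandas as pd

# Load every per-subject recording; the "data/" folder name is an assumption.
frames = []
for path in sorted(glob.glob("data/SynchronizedDataSubject*_etiquetado.csv")):
    df = pd.read_csv(path)
    # Recover the subject identifier X from the filename.
    subject_id = int(path.split("SynchronizedDataSubject")[1].split("_")[0])
    df["subject_id"] = subject_id
    frames.append(df)

dataset = pd.concat(frames, ignore_index=True)
print(dataset.shape, dataset["label"].unique())
```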

2.2. Sensor Data Columns

Each file in the dataset contains a total of 30 numerical columns representing sensor signals, plus 1 additional column for the activity label. These columns are grouped by sensor location and signal type and follow a consistent naming convention that indicates the origin and nature of each measurement.
The columns are logically organized into three categories of sensor data per device:
  • Orientation: represented by quaternions (q_w, q_x, q_y, q_z)
  • Linear acceleration: 3-axis accelerometer data (a_x, a_y, a_z)
  • Angular velocity: 3-axis gyroscope data (g_x, g_y, g_z)
Each variable name includes the corresponding body location as a suffix, using the following identifiers:
  • chest: Sensor placed on the sternum
  • left_hand: Sensor placed on the left hand
  • right_hand: Sensor placed on the right hand
  • left_knee: Sensor placed above the left knee
  • right_knee: Sensor placed above the right knee
For example, the column q_x_left_hand refers to the x-component of the quaternion orientation recorded by the sensor on the left hand. Similarly, g_z_chest refers to the z-axis angular velocity from the chest-mounted sensor.
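Because the naming convention is systematic, columns can be gathered programmatically. The sketch below (assuming the combined dataset DataFrame from the loading example above) selects signals either by body-location suffix or by signal-type prefix:

```python
# Select all signals recorded at one body location by matching the suffix.
chest_cols = [c for c in dataset.columns if c.endswith("_chest")]

# Select one signal type across all locations by matching the prefix.
quat_cols = [c for c in dataset.columns if c.startswith("q_")]
gyro_cols = [c for c in dataset.columns if c.startswith("g_")]

chest_signals = dataset[chest_cols].to_numpy()
print(chest_cols, quat_cols, gyro_cols)
```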
Table 2 summarizes the sensing modalities recorded at each sensor location. A selective configuration was intentionally adopted to balance data richness with transmission stability, sensor battery life, and reduced redundancy. For example, quaternion orientation was prioritized at locations where rotational dynamics are more informative (e.g., chest, left hand, right knee). At the same time, accelerometer and gyroscope data were recorded primarily from locations with significant linear and angular motion (e.g., chest, right hand, left knee).
This deliberate distribution enables diverse and representative coverage of full-body motion without overloading the Bluetooth Low-Energy (BLE) communication channels or introducing unnecessary overlap in sensor data, as shown in Figure 1.

2.3. Labels and Metadata

Each row in the dataset includes a column named “label”, which encodes the ground-truth activity being performed by the subject at that specific timestep. This column appears as the final field in every CSV file and contains categorical string values that correspond to human activities performed during the session.
Table 3 presents the list of activities included in the dataset, along with brief descriptions of the movements associated with each activity.
These labels are assigned during data collection based on a predefined protocol and are synchronized with sensor data using timestamps. Each label remains constant over a time interval during which the corresponding activity was being executed. The dataset is suitable for both window-based classification and sequence modeling, as the activity transitions are preserved in the full temporal signal.
In addition to the labeled data, the dataset includes a separate file named participants_info.csv, which provides metadata for each of the 67 participants. This file contains structured demographic and experimental information in tabular form. Each row corresponds to a participant and includes the following fields:
  • subject_id: Unique numeric identifier assigned to each participant (e.g., 1 to 67).
  • gender: Self-reported gender of the participant (male, female, prefer not to say, etc.).
  • age: Participant’s age in years at the time of the recording session.
  • height (m): Self-reported or measured height of the participant, expressed in meters.
  • weight (kg): Self-reported or measured weight of the participant, expressed in kilograms.
This metadata enables researchers to perform stratified analysis (e.g., by age or gender), evaluate inter-subject variability, and design leave-one-subject-out cross-validation protocols, which are commonly used in HAR benchmarks.
Together, the activity labels and participant metadata provide the necessary context for performing both intra- and inter-personal HAR experiments, with the added advantage of demographic traceability for model generalization studies.
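As one illustration of such a protocol, the sketch below (assuming the combined dataset DataFrame with the subject_id column from the earlier loading example, and participants_info.csv in the same hypothetical data/ folder) iterates over leave-one-subject-out splits:

```python
import pandas as pd

info = pd.read_csv("data/participants_info.csv")  # folder name is an assumption

for held_out in sorted(info["subject_id"]):
    train = dataset[dataset["subject_id"] != held_out]
    test = dataset[dataset["subject_id"] == held_out]
    # ...fit a model on `train` and evaluate it on `test` here...
    print(f"Subject {held_out}: {len(train)} training rows, {len(test)} test rows")
```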

2.4. Sampling Rate

All sensors operated with a uniform 50 Hz sampling configuration and transmitted data independently via BLE. A software-based synchronization procedure was applied post-collection to align the signals temporally, compensating for minor transmission delays, clock drift, and differences in sensor startup times. The resulting data maintain a constant timestep, so researchers can safely assume that all measurements across the columns of a given row are temporally consistent, which facilitates inter-sensor analysis, orientation comparisons, and fusion-based learning methods without additional alignment procedures.
No resampling or interpolation was applied during synchronization. Therefore, the raw time-series structure is preserved, and the sampling frequency remains strictly constant at 50 Hz throughout each file.
This configuration provides a good balance between temporal resolution, power efficiency, and modeling accuracy, making the dataset suitable for both offline analysis and potential deployment in real-time HAR systems.
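Since no timestamp column is stored, a relative time axis for one recording can be reconstructed directly from the row index; a minimal sketch (assuming NumPy and a single subject's DataFrame df) is shown below:

```python
import numpy as np

FS = 50.0                      # sampling frequency in Hz (constant across files)
t = np.arange(len(df)) / FS    # relative time of each row, in seconds
duration_min = t[-1] / 60.0    # total duration of the recording, in minutes
```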

3. Methods

3.1. Data Collection Protocol

The dataset was collected as part of a study focused on HAR using synchronized inertial and orientation data from wearable sensors. The participants were recruited to perform a series of predefined daily activities, including sitting, walking, squatting, folding clothes, sweeping, moving boxes, and riding a bicycle, as shown in Figure 2. Each participant was instructed to follow a structured activity protocol in a controlled indoor environment.
Each participant wore MetaMotionRL sensors positioned at five predefined anatomical sites to capture acceleration, angular velocity, and orientation data. The devices streamed signals via BLE to a central computer, supporting real-time acquisition under controlled indoor conditions.

3.2. Recording Environment

All data collection sessions were conducted in a controlled indoor laboratory setting specifically arranged to support the execution of a predefined set of human activities. The environment was designed to minimize external interference and to ensure consistency across participants.
A dedicated area was organized with clearly defined zones for each activity, as illustrated in Figure 3. The layout included the following:
  • A treadmill for the walking activity, allowing continuous and safe movement at a constant pace.
  • A stationary exercise bike designated for the cycling (ride a bike) activity.
  • A chair and table setup for the sitting and folding clothes tasks, where participants sat and performed repetitive folding motions with garments.
  • A set of medium-sized cardboard boxes placed beside the table, used for the box-moving activity.
  • A sweeping area, defined as a square region in the center of the room, where participants were instructed to perform sweeping motions using a standard broom in a continuous pattern.
  • A host station located in front of the participant, where a laptop running the data collection and synchronization software was monitored by the experiment supervisor.
The environment ensured that all activities could be performed naturally and without spatial restrictions, while keeping sensors within Bluetooth range for reliable real-time streaming. The consistent arrangement of the layout across participants supported standardized data collection and reduced variability due to external factors.
Participants were guided by the host and followed the activity sequence under supervision to ensure protocol adherence and timing accuracy.
Figure 3. Layout of the laboratory environment used during data collection. The room was segmented into distinct zones for each predefined activity to ensure consistency and repeatability across participants.

3.3. Sensors

All recordings in this dataset were obtained using MetaMotionRL devices developed by MbientLab Inc. (San Jose, CA, USA) [13]. These are commercially available, BLE-enabled wearable inertial measurement units (IMUs) designed for research-grade motion tracking. Each MetaMotionRL device integrates a multi-sensor system capable of capturing precise motion dynamics and orientation in real time.
The external appearance and internal architecture of the sensor, including its embedded components, are illustrated in Figure 4. Each unit includes the following embedded components:
  • 3-axis accelerometer (±16 g), used to measure linear acceleration.
  • 3-axis gyroscope (±2000°/s), used to measure angular velocity.
  • Bosch BMM150 magnetometer (used internally for sensor fusion).
  • Sensor fusion module based on the Bosch BNO055 or BMI160, providing computed quaternion-based orientation data.
  • BLE interface for real-time streaming at up to 100 Hz.
Figure 4. Overview of the wearable MetaMotionRL sensor used for data acquisition. (a) External view of the enclosed device with LED and charging port; (b) internal architecture highlighting key embedded components, including the 3-axis accelerometer, 3-axis gyroscope, magnetometer, battery, Bluetooth 4.0 module, memory, and micro-USB interface. Images reproduced from the official MbientLab website: https://mbientlab.com/metamotionrl/ (accessed on 8 July 2025).
The MetaMotionRL boards feature onboard computation and sensor fusion firmware capable of providing real-time quaternion estimates through internal Kalman filtering, thus reducing the need for external post-processing. Each device is powered by a rechargeable lithium-polymer battery and is enclosed in a compact plastic casing with a clip or strap mount for body attachment.
The BLE communication and sensor control were managed using the official MbientLab Python SDK (v1.0.8), which provides functionality for device discovery, sensor configuration, streaming control, and data handling. In addition, the MbientLab mobile application was used during initial setup and testing to ensure proper sensor functionality and streaming stability before data collection sessions.
This sensor platform was chosen for its balance between mobility, signal quality, ease of integration, and real-time compatibility. Its support for both the official Python SDK and the MbientLab mobile application simplifies deployment across various platforms and facilitates reliable multi-sensor streaming. These characteristics make it particularly well suited for HAR research involving neural network architectures and edge computing scenarios.

3.4. Synchronization and Validation

A synchronization procedure was applied post-recording to ensure all signals aligned temporally, despite independent transmission from each sensor. The resulting dataset maintains a unified time base across all measurements, enabling reliable fusion and analysis of multimodal signals.
Manual validation was performed by comparing the labeled segments against video recordings to confirm that activities were correctly annotated. Noise inspection and signal integrity analysis were conducted by plotting raw sensor outputs and verifying stability, magnitude, and drift behavior.

3.5. Data Quality and Cleaning

Basic filtering was applied to remove samples with missing or corrupted data, and segments with significant packet loss were excluded from the final dataset. No artificial smoothing, resampling, or feature extraction was applied to preserve the original signal structure. Users are encouraged to apply custom filtering depending on their target application.
Data were stored in CSV format using UTF-8 (Unicode Transformation Format) encoding, with all numeric values expressed in raw sensor units (g for acceleration, deg/s for gyroscope, and unit quaternions for orientation).
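If an application requires SI units instead of the raw units listed above, conversion is a simple scaling. The sketch below (assuming NumPy and the combined dataset DataFrame from the earlier examples) is one way to do it:

```python
import numpy as np

G_TO_MS2 = 9.80665          # m/s^2 per g
DEG_TO_RAD = np.pi / 180.0  # rad/s per deg/s

acc_cols = [c for c in dataset.columns if c.startswith("a_")]
gyr_cols = [c for c in dataset.columns if c.startswith("g_")]

dataset[acc_cols] = dataset[acc_cols] * G_TO_MS2    # acceleration: g -> m/s^2
dataset[gyr_cols] = dataset[gyr_cols] * DEG_TO_RAD  # angular velocity: deg/s -> rad/s
```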

3.6. Ethical Approval

This study involved the voluntary participation of adult individuals, and all procedures were carried out following the Declaration of Helsinki [14]. Written informed consent was obtained from all participants before data collection, and no personally identifiable information is included in the dataset.
The study protocol was reviewed and approved by the University of Puerto Rico’s Institutional Review Board (IRB) (approved on 27 January 2025). All data have been anonymized, and each participant is referenced by a numeric ID only.

4. Usage Notes

4.1. Data Access

The dataset is publicly available and can be accessed through Zenodo. All data files are provided in CSV format, encoded in UTF-8, and organized as one file per participant. Each file contains synchronized multi-sensor inertial and orientation data with labeled human activities. The filenames follow the convention SynchronizedDataSubjectX_etiquetado.csv, where X represents the subject ID.
In addition to the sensor recordings, the repository includes the following:
  • A metadata file (participants_info.csv) containing demographic and biometric information for all 67 participants.
  • A README file describing the dataset structure, contents, and file naming conventions.
  • A Jupyter notebook providing example preprocessing routines for loading, cleaning, and segmenting the data into windows for model training.
The dataset is released under a CC-BY 4.0 license, which permits reuse, redistribution, and adaptation with appropriate attribution.
Researchers can download individual components directly through the Zenodo platform. Versioning is enabled to track future updates and corrections.

4.2. Data Preprocessing and Segmentation Guidelines

To effectively utilize the dataset for HAR tasks, it is recommended to segment the continuous time-series data into fixed-length windows before model training or feature extraction. A commonly adopted strategy, and one that has proven effective in the literature, involves using a window size of 2.56 s (equivalent to 128 samples at 50 Hz) with a 50% overlap between consecutive windows. This overlap improves temporal resolution while maintaining context across transitions, following established practices in HAR datasets [1].
Each windowed segment can be treated as a standalone training sample, labeled using the majority class or center label within the window. This method is suitable for both traditional machine learning algorithms and deep learning architectures, such as CNNs or LSTMs, which process sequential input.
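One possible implementation of this segmentation strategy (a sketch, not the authors' accompanying notebook) splits a single subject's arrays into 128-sample windows with 50% overlap and assigns the majority label to each window:

```python
import numpy as np

def make_windows(signals, labels, window=128, step=64):
    """signals: (N, 30) sensor array; labels: (N,) activity labels from one file."""
    X, y = [], []
    for start in range(0, len(signals) - window + 1, step):
        seg = labels[start:start + window]
        values, counts = np.unique(seg, return_counts=True)
        X.append(signals[start:start + window])
        y.append(values[np.argmax(counts)])   # majority label within the window
    return np.stack(X), np.asarray(y)
```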
A sample preprocessing pipeline is provided in the accompanying repository. This notebook loads multiple CSV files containing synchronized IMU data from wearable sensors, cleans and scales the data, filters invalid entries, and creates sliding windows suitable for training HAR models. It includes routines for handling participant metadata and exporting ready-to-use datasets for classification tasks.
Researchers are encouraged to experiment with different combinations of sensors, feature sets (e.g., using only quaternions or acceleration), and evaluation strategies (e.g., leave-one-subject-out cross-validation) depending on the intended application.

4.3. Limitations

While this dataset offers a rich and well-structured collection of synchronized inertial and orientation data for HAR, several limitations must be acknowledged:
  • Fixed Sensor Placement: Sensors were placed on five specific body locations (chest, hands, knees) using elastic bands or straps. Variations in placement tightness, orientation, or slight shifts during activity execution may introduce variability. However, these conditions were not varied systematically, and sensor misplacement or loose fitting was not explicitly modeled.
  • Single-Session Recordings per Participant: Each subject performed the activity protocol once, under supervision. Therefore, the dataset does not capture intra-subject variability across multiple days, levels of fatigue, or environmental contexts.
  • No Timestamp Column: Although all sensor data are temporally synchronized and uniformly sampled at 50 Hz, the dataset does not include absolute timestamps. This design choice simplifies data structure and is consistent with many HAR datasets. However, it introduces limitations for certain applications that require alignment with external modalities or for performing evaluations that depend on real-time system behavior and latency estimation. Without a global reference clock, precise fusion with other sensor types or time-dependent systems must rely on additional synchronization mechanisms (e.g., simultaneous start triggers or external time markers). We acknowledge this constraint, and in future versions of the dataset, we plan to include optional absolute timestamp fields to facilitate precise alignment with external modalities and improve interoperability in multimodal and real-time experimental setups.
  • Lack of Transitional or Dynamic Activities: The dataset focuses on a well-structured set of common daily activities, but it does not include dynamic or transitional movements. These types of actions are relevant in real-world applications and typically involve more complex motion patterns and inter-class transitions. While this omission was intentional to maintain protocol consistency and recording simplicity, we acknowledge it as a limitation and plan to incorporate such activities in future expansions of the dataset to enhance model generalizability and robustness.
  • Class Imbalance Potential: Some activities, such as sitting, have longer duration segments than other activities. While the labeling protocol was designed for balance, users should assess class distribution when training models.
  • Sensor-Specific Modalities: Not all sensors provide the same type of data. For example, quaternion orientation is only available from the chest, left hand, and right knee. This design reflects practical constraints and avoids redundancy but may require special handling in models that expect uniform feature vectors across sensors.

Author Contributions

Conceptualization, J.L.R.-C. and L.S.-A.; methodology, J.L.R.-C. and L.S.-A.; software, J.L.R.-C. and L.S.-A.; validation, J.L.R.-C., K.N.-T. and L.S.-A.; formal analysis, J.L.R.-C.; investigation, J.L.R.-C., K.N.-T., L.S.-A. and J.F.P.-M.; resources, J.L.R.-C. and L.S.-A.; data curation, J.L.R.-C. and L.S.-A.; writing—original draft preparation, J.L.R.-C.; writing—review and editing, J.L.R.-C., K.N.-T., L.S.-A. and J.F.P.-M.; visualization, J.L.R.-C., K.N.-T., L.S.-A. and J.F.P.-M.; supervision, J.F.P.-M.; project administration, J.F.P.-M.; funding acquisition, J.F.P.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received financial support from the NSF CAREER Award under Grant No. OAC-2439345.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the University of Puerto Rico’s Institutional Review Board (IRB), protocol code 2024120022 (approved on 27 January 2025).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
HAR    Human Activity Recognition
CSV    Comma-Separated Values
CNN    Convolutional Neural Network
LSTM   Long Short-Term Memory
BLE    Bluetooth Low Energy
UTF    Unicode Transformation Format

References

  1. Anguita, D.; Ghio, A.; Oneto, L.; Parra, X.; Reyes-Ortiz, J.L. A Public Domain Dataset for Human Activity Recognition Using Smartphones. In Proceedings of the ESANN 2013: 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, 24–26 April 2013. [Google Scholar]
  2. Kwapisz, J.R.; Weiss, G.M.; Moore, S.A. Activity Recognition Using Cell Phone Accelerometers. SIGKDD Explor. Newsl. 2011, 12, 74–82. [Google Scholar] [CrossRef]
  3. Reiss, A.; Stricker, D. Introducing a new benchmarked dataset for activity monitoring. In Proceedings of the International Symposium on Wearable Computers, ISWC, Newcastle, UK, 18–22 June 2012; pp. 108–109. [Google Scholar] [CrossRef]
  4. Kim, Y.; Toomajian, B. Hand Gesture Recognition Using Micro-Doppler Signatures with Convolutional Neural Network. IEEE Access 2016, 4, 7125–7130. [Google Scholar] [CrossRef]
  5. Roggen, D.; Calatroni, A.; Rossi, M.; Holleczek, T.; Förster, K.; Tröster, G.; Lukowicz, P.; Bannach, D.; Pirkl, G.; Ferscha, A.; et al. Collecting Complex Activity Datasets in Highly Rich Networked Sensor Environments. In Proceedings of the 7th International Conference on Networked Sensing Systems (INSS 2010), Kassel, Germany, 15–18 June 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 233–240. [Google Scholar] [CrossRef]
  6. Zappi, P.; Stiefmeier, T.; Farella, E.; Roggen, D.; Benini, L.; Tröster, G. Activity Recognition from on-Body Sensors by Classifier Fusion: Sensor Scalability and Robustness. In Proceedings of the 3rd International Conference on Intelligent Sensors, Sensor Networks and Information (ISSNIP 2007), Melbourne, VIC, Australia, 3–6 December 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 281–286. [Google Scholar]
  7. Reyes-Ortiz, J.-L.; Oneto, L.; Samà, A.; Parra, X.; Anguita, D. Transition-Aware Human Activity Recognition Using Smartphones. Neurocomputing 2016, 171, 754–767. [Google Scholar] [CrossRef]
  8. Tao, W.; Chen, H.; Moniruzzaman, M.; Leu, M.C.; Yi, Z.; Qin, R. Attention-Based Sensor Fusion for Human Activity Recognition Using IMU Signals. arXiv 2021, arXiv:2112.11224. [Google Scholar] [CrossRef]
  9. Xaviar, S.; Yang, X.; Ardakanian, O. Robust Multimodal Fusion for Human Activity Recognition. arXiv 2023, arXiv:2303.04636. [Google Scholar] [CrossRef]
  10. Chung, S.; Lim, J.; Noh, K.J.; Kim, G.; Jeong, H. Sensor data acquisition and multimodal sensor fusion for human activity recognition using deep learning. Sensors 2019, 19, 1716. [Google Scholar] [CrossRef] [PubMed]
  11. Huang, X.; Yuan, Y.; Chang, C.; Gao, Y.; Zheng, C.; Yan, L. Human Activity Recognition Method Based on Edge Computing-Assisted and GRU Deep Learning Network. Appl. Sci. 2023, 13, 9059. [Google Scholar] [CrossRef]
  12. Teerapittayanon, S.; McDanel, B.; Kung, H.T. Distributed Deep Neural Networks over the Cloud, the Edge and End Devices. arXiv 2017, arXiv:1709.01921. [Google Scholar] [CrossRef]
  13. MbientLab Inc. MetaMotionRL Sensor Platform. 2024. Available online: https://mbientlab.com/metamotionrl/ (accessed on 8 July 2025).
  14. World Medical Association. World Medical Association Declaration of Helsinki: Ethical Principles for Medical Research Involving Human Subjects. 2013. Available online: https://www.wma.net/policies-post/wma-declaration-of-helsinki/ (accessed on 7 July 2025).
Figure 1. Sensor placement and signal modality distribution across the five wearable devices. Each rectangle represents a sensor attached to a specific anatomical location. Color-coded boxes indicate the type of data recorded by each sensor: quaternion orientation (orange), linear acceleration (blue), and angular velocity (red). Signal labels use compact shorthand notation to represent multi-dimensional signals: three components for accelerometer and gyroscope (x, y, z), and four components for quaternion orientation (w, x, y, z).
Figure 2. Timeline of the data collection protocol for each recording session. Participants began in a seated position and transitioned through a sequence of predefined activities, including folding clothes, sweeping, walking, moving boxes, and riding a stationary bicycle. Each activity was interleaved with seated rest periods, and the duration of each activity segment was fixed and supervised to ensure consistency across subjects.
Table 1. Comparison of publicly available Human Activity Recognition (HAR) datasets.
Dataset | Year | Type of Sensor | No. of Activities | No. of Participants
UCI HAR [1] | 2012 | Accelerometer, Gyroscope | 6 | 30
WISDM [2] | 2011 | Accelerometer | 6 | 26
PAMAP2 [3] | 2012 | Accelerometer, Gyroscope, Magnetometer | 18 | 9
KU-HAR [4] | 2017 | Accelerometer, Gyroscope, Magnetometer | 18 | 18
Opportunity [5] | 2010 | Body-worn and Ambient Sensors | 18 | 4
Skoda [6] | 2008 | Accelerometer | 10 | 1
REALDISP [7] | 2013 | Accelerometer, Gyroscope | 33 | 17
Proposed dataset | 2025 | Accelerometer, Gyroscope, Quaternion | 6 | 67
Table 2. Summary of sensor placements and the corresponding signal types recorded. The sensor provides quaternion orientation (4D), linear acceleration (3D), and angular velocity (3D) signals. N/A indicates that the respective modality was intentionally excluded from that sensor in the final configuration to reduce redundancy and optimize data streaming. Column names in the dataset follow the format <signal>_<axis>_<location>.
Sensor Location | Orientation (4D) | Linear Acceleration (3D) | Angular Velocity (3D)
Chest | q_w, q_x, q_y, q_z | a_x, a_y, a_z | g_x, g_y, g_z
Left Hand | q_w, q_x, q_y, q_z | N/A | N/A
Right Hand | N/A | a_x, a_y, a_z | g_x, g_y, g_z
Left Knee | N/A | a_x, a_y, a_z | g_x, g_y, g_z
Right Knee | q_w, q_x, q_y, q_z | N/A | N/A
Table 3. List of activity labels included in the dataset, along with their corresponding descriptions. These labels are used in the label column of each CSV file to annotate the activity being performed at each timestep.
Activity | Label | Description
Sitting | 0 | The subject is seated in a stationary position.
Sweeping | 1 | The subject is performing sweeping movements with a broom.
Folding Clothes | 2 | The subject is performing repetitive folding motions using both arms.
Walking | 3 | The subject is walking at a comfortable, natural pace.
Moving Boxes | 4 | The subject is lifting and moving medium-sized boxes repeatedly.
Riding a Bike | 5 | The subject is pedaling a stationary bicycle.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
