To assess the effectiveness of our proposed 1D-ResNeXt model for HAR in IoT-enabled, uncontrolled settings, we performed comprehensive experiments on three publicly available datasets: mHealth, MotionSense, and Wild-SHARD. This section outlines the experimental setup, provides an overview of the datasets, and shares the findings from our comparative analysis.
4.1. Experimental Setup
In this study, we utilized advanced computing resources and cutting-edge software frameworks to implement and evaluate the 1D-ResNeXt model alongside other deep learning models. The experimental setup was structured to ensure efficient model training and thorough analysis of the results.
4.1.1. Hardware and Software Infrastructure
We employed Google Colab Pro+ with a Tesla V100-SXM2-16GB GPU (NVIDIA Corporation, Santa Clara, CA, USA) to accelerate the training process for our deep learning models. This high-performance computing setup enabled us to efficiently handle the large sensor data volumes from the mHealth, MotionSense, and Wild-SHARD datasets.
The model was implemented in Python 3.6.9, with TensorFlow 2.2.0 serving as the main deep learning framework. To optimize GPU computations, we used CUDA 10.2 as the backend.
4.1.2. Software Libraries
Our methodology utilized a range of Python libraries, each playing a unique role within the data pipeline and model development:
Data management and analysis with NumPy and Pandas: These libraries facilitated the efficient retrieval, processing, and analysis of sensor data.
Visualization with Matplotlib and Seaborn: Used to generate visualizations for presenting the outcomes of data analysis and model evaluation.
Machine learning and data preparation with Scikit-learn (Sklearn): Applied for data preparation tasks such as splitting data into train and test sets, conducting cross-validation, and calculating performance metrics.
Deep learning framework with TensorFlow: Served as the primary library for building and training the 1D-ResNeXt model and the other baseline deep learning models.
4.1.3. Training Process
Our training approach aimed to ensure robust performance and adaptability across the three datasets. A five-fold cross-validation strategy was applied to each dataset, allowing us to evaluate the model’s consistency and minimize overfitting risks.
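The fold construction can be sketched with Scikit-learn's KFold. The array shapes, shuffling, and random seed below are illustrative placeholders, not the datasets' actual dimensions or the authors' exact settings:

```python
import numpy as np
from sklearn.model_selection import KFold

# Illustrative placeholders: 120 segmented windows of 100 time steps x 6 channels.
X = np.random.randn(120, 100, 6)
y = np.random.randint(0, 6, size=120)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(kf.split(X), start=1):
    X_train, X_val = X[train_idx], X[val_idx]
    y_train, y_val = y[train_idx], y[val_idx]
    # A model would be trained on (X_train, y_train) and scored on
    # (X_val, y_val) here; metrics are then averaged over the five folds.
    print(f"fold {fold}: train={len(train_idx)}, val={len(val_idx)}")
```

Each window appears in exactly one validation fold, so the averaged metrics reflect performance on every sample while keeping train and validation data disjoint within each fold.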
Before training, we used a thorough pre-processing pipeline as described in Section 3.2. This included denoising to eliminate signal artifacts, normalization to standardize input data scales, and segmentation with a 2-s sliding window to capture temporal dependencies.
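The segmentation step can be sketched as follows. The 50 Hz sampling rate and 50% overlap are assumptions chosen for illustration; the text specifies only the 2-s window length:

```python
import numpy as np

def sliding_windows(signal, rate_hz=50, window_s=2.0, overlap=0.5):
    """Cut a (time, channels) array into fixed-length overlapping windows."""
    win = int(rate_hz * window_s)            # 100 samples per 2-s window at 50 Hz
    step = max(1, int(win * (1 - overlap)))  # 50-sample hop for 50% overlap
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

# 10 s of synthetic 6-channel sensor data at 50 Hz
data = np.random.randn(500, 6)
windows = sliding_windows(data)
print(windows.shape)  # (9, 100, 6): nine 2-s windows with 50% overlap
```

Overlapping windows increase the number of training segments and help the model see activity transitions that would fall on a window boundary under non-overlapping segmentation.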
For optimization, we employed the Adam algorithm with the hyperparameters specified in Table 2. The loss function combined cross-entropy loss with L2 regularization, effectively reducing classification errors while controlling overfitting.
Each training fold was allowed to run for up to 200 epochs; an early stopping mechanism based on validation performance halted training whenever convergence occurred before the maximum. This strategy ensured efficient use of resources and reduced the risk of overfitting.
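The combined objective and the stopping rule can be expressed schematically in plain NumPy (this is a sketch, not the authors' TensorFlow implementation; the λ value and patience of 10 epochs are illustrative assumptions):

```python
import numpy as np

def combined_loss(probs, labels, weights, l2_lambda=1e-4):
    """Cross-entropy on predicted class probabilities plus an L2 weight penalty."""
    ce = -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))
    reg = l2_lambda * sum(np.sum(w ** 2) for w in weights)
    return ce + reg

def should_stop(val_losses, patience=10):
    """Early stopping: halt once the best validation loss is `patience` epochs old."""
    best = int(np.argmin(val_losses))
    return len(val_losses) - 1 - best >= patience

# Example: validation loss stalls after epoch 3, triggering the stop criterion.
history = [0.9, 0.5, 0.4, 0.35] + [0.36] * 10
print(should_stop(history, patience=10))  # True: no improvement for 10 epochs
```

In a Keras pipeline the same behavior is typically obtained with an `EarlyStopping` callback monitoring validation loss and L2 kernel regularizers on the convolutional layers.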
4.2. Results and Analysis
Table 4, Table 5 and Table 6 provide a comparison of the performance of our proposed 1D-ResNeXt model with other deep learning models on the mHealth, MotionSense, and Wild-SHARD datasets, respectively.
The findings in Table 4, based on the mHealth dataset, highlight the exceptional performance of our 1D-ResNeXt model for HAR. Achieving an impressive accuracy of 99.97% (±0.06%), our model outperforms all other models tested, with BiGRU as the closest competitor at 98.14%. Additionally, the 1D-ResNeXt model achieved the lowest loss at 0.00 (±0.00) and the highest F1-score at 99.95% (±0.11%), indicating both high precision and balanced performance across various activity classes.
Remarkably, these results were obtained using only 26,118 parameters, which is significantly less than other models, such as CNN, which has 799,948 parameters. This combination of superior accuracy and a reduced parameter count makes our model ideal for deployment on IoT devices with limited resources. The low standard deviations across all metrics further indicate the model’s consistent and reliable recognition capabilities. These outcomes clearly demonstrate that our 1D-ResNeXt architecture provides an effective solution for IoT-based HAR applications, offering substantial improvements in both accuracy and computational efficiency compared to other deep learning models.
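Much of this parameter saving comes from ResNeXt-style grouped (multi-branch) 1D convolutions. A back-of-the-envelope comparison, using illustrative channel counts rather than the paper's actual layer sizes, shows the effect:

```python
def conv1d_params(kernel, c_in, c_out, groups=1):
    # Each group convolves c_in/groups input channels to c_out/groups outputs,
    # so total weights = kernel * c_in * c_out / groups, plus c_out biases.
    return kernel * c_in * c_out // groups + c_out

full = conv1d_params(3, 64, 64)                # standard 1D convolution
grouped = conv1d_params(3, 64, 64, groups=32)  # ResNeXt-style, cardinality 32
print(full, grouped)  # 12352 448: roughly 27x fewer parameters per layer
```

Stacking many such layers compounds the saving, which is how the full network stays in the tens of thousands of parameters while the plain CNN baselines reach hundreds of thousands.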
The findings in Table 5, based on the MotionSense dataset, further confirm the effectiveness of our 1D-ResNeXt model for HAR. Our model achieved the highest accuracy at 98.77% (±0.11%) and F1-score at 98.63% (±0.13%), surpassing all other models tested. GRU and BiGRU were the next best performers, both with an accuracy of 98.46%, which emphasizes the clear advantage of our model. Notably, the 1D-ResNeXt model also showed the lowest loss at 0.06 (±0.01), indicating highly reliable predictions.
These superior results were obtained with only 24,130 parameters, a significant reduction compared to other models like CNN, which has 495,558 parameters, and BiLSTM, which has 171,910 parameters. This combination of high accuracy and a low parameter count is particularly valuable for IoT applications, where computational efficiency is as critical as performance. The consistent results in terms of accuracy, F1-score, and loss, along with low standard deviations, suggest that our model is both robust and stable across various activities in the MotionSense dataset.
These findings highlight the adaptability of our 1D-ResNeXt architecture across different HAR scenarios and its strong potential for real-world IoT-based activity recognition applications.
The results in Table 6, based on the Wild-SHARD dataset, further confirm the effectiveness of our 1D-ResNeXt model for HAR in uncontrolled settings. Our model achieved the highest accuracy at 97.59% (±0.53%) and F1-score at 97.53% (±0.55%), outperforming all other deep learning models tested. The closest competitor, BiGRU, reached an accuracy of 96.81% and an F1-score of 96.73%. Notably, our model also had the lowest loss at 0.17 (±0.05), indicating more reliable predictions.
The superior performance of the 1D-ResNeXt model is especially noteworthy given the challenging Wild-SHARD dataset, which reflects real-world, uncontrolled conditions. Furthermore, our model achieved these results with only 24,976 parameters, significantly fewer than models like CNN, with 385,478 parameters, and BiLSTM, with 181,126 parameters. This combination of high accuracy and low parameter count demonstrates the model’s capacity to effectively capture complex activity patterns in diverse, real-life scenarios while remaining computationally efficient. These findings underscore the robustness and adaptability of our approach, making it particularly well suited for IoT-enabled HAR applications in dynamic and uncontrolled environments.
4.3. Comparison Results with State-of-the-Art Models
To demonstrate the importance and effectiveness of the proposed 1D-ResNeXt model for IoT-based HAR in uncontrolled environments, this study evaluated its performance against leading state-of-the-art methods. The evaluation was conducted on the same benchmark HAR datasets employed in this work: mHealth, MotionSense, and Wild-SHARD. Achieving high accuracy together with computational efficiency is essential for deploying HAR systems on IoT devices with limited resources, which often operate in dynamic, real-world conditions with varied sensor data and user behaviors. The results of the comparison are presented in Table 7.
The results summarized in Table 7 highlight the exceptional performance of the proposed 1D-ResNeXt model compared to state-of-the-art methods for HAR using sensor data from the mHealth, MotionSense, and Wild-SHARD datasets.
For the mHealth dataset, the 1D-ResNeXt model achieved an impressive accuracy of 99.97%, surpassing other advanced deep learning methods. These include the ensemble of hybrid DL models (99.34%), the ensemble of autoencoders (94.80%), the deep belief network (93.33%), and the fully convolutional network (92.50%). The dataset comprises multi-modal sensor data, including accelerometer, gyroscope, magnetometer, and ECG signals. The 1D-ResNeXt model’s superior accuracy demonstrates its ability to extract and learn intricate patterns from diverse sensory inputs efficiently.
On the MotionSense dataset, which uses inertial sensor data from smartphones, the 1D-ResNeXt model achieved an accuracy of 97.77%. This performance exceeds that of other models, such as DySan (92.00%), CNN (95.05%), and random forest (96.96%). These results confirm the adaptability of the 1D-ResNeXt architecture to various sensor types and device configurations.
For the Wild-SHARD dataset, representing a more complex and uncontrolled environment, the 1D-ResNeXt model achieved an accuracy of 97.59%. This result outperformed the CNN-LSTM model (97.00%), showcasing the robustness and effectiveness of the 1D-ResNeXt approach in handling real-world, dynamic settings.