1. Introduction
Individuals with spinal cord injuries (SCI) who rely on wheelchairs typically experience associated symptoms such as obesity and low muscular strength. These symptoms may eventually lead to secondary complications, including diabetes and cardiovascular diseases [
1,
2]. Rehabilitation processes, such as in-home strength exercises, play an essential role in preventing such complications and redeveloping the motor skills needed to perform daily activities and promote quality of life [
3,
4]. Currently, therapists rely on patient surveys to measure their adherence to these activities. However, studies indicate wide variability between self-reported and actually performed physical activity, which can undermine rehabilitation progress [
5]. Nevertheless, with rapid technological innovation, physical activity recognition systems are emerging as a more reliable way to detect these activities [
6,
7,
8,
9].
Based on the approach used to collect data, activity recognition can be broadly divided into two categories: vision-based and sensor-based. Although the vision-based approach is information-rich, it often raises ethical and privacy concerns, especially in healthcare applications involving patients. By contrast, the devices used in the sensor-based approach, including wearable sensors, can operate at low cost and power, and they impose no restrictions on the surrounding environment or the location where activities are performed. As a result, activity recognition systems commonly adopt the sensor-based approach [
10].
Several studies have investigated the impact of different sensor positions on overall recognition accuracy. These studies indicate that sensor position should be determined mainly by the type of activity under study. Forms of locomotion, including walking and running, as well as static activities, such as standing and sitting, can be recognized with an accuracy of between 83% and 95% using lower-limb segments (hip, thigh, and ankle) as the sensor positions. To improve accuracy when recognizing upper-limb activities, sensors are placed on the wrist and upper arm [
11]. Within this context, the study in [
12] considered different positions, such as hip, belt, wrist, upper arm, ankle, and thigh, to recognize 20 types of activities, including both upper- and lower-limb activities. The results showed high accuracy when combining different positions. However, the study also demonstrated a slight performance decrease when using only the thighs and wrists. In addition to the impact of sensor placement on accuracy, user preferences should be considered to gain acceptance. To address this problem in the design of wearables, a meta-analysis was undertaken in [
13]. The study concluded that people prefer wearing sensors on their wrist, followed by the trunk, belt, ankle, and, finally, armpit.
Activity recognition systems have a wide variety of applications, including rehabilitation and physical therapy. These systems allow monitoring of patients and the identification of exercises being performed [
14]. In this regard, Pernek et al. [
15] proposed a monitoring system consisting of a network of wearable accelerometers and a smartphone to recognize the intensity of specific physical activities (e.g., strength exercises). The system used two Support Vector Machine (SVM) layers to detect the type of activity being performed and determine its intensity. The study demonstrated that the hierarchical algorithm achieved an accuracy of approximately 85% in recognizing a set of upper-body activities. The study in [
16] presented a methodology to recognize three fundamental arm movements using two different classifiers: Linear Discriminant Analysis (LDA) and SVM. The overall average accuracy was 88% using data collected from accelerometers and 83% using gyroscope data. With the same objective, Panwar et al. [
10] designed a model to recognize three physical activities of the human forearm, relying on data collected from a single wrist-worn accelerometer. Lin et al. [
17] proposed a model for recognizing the physical activities performed to rehabilitate frozen shoulder. Based on wireless sensor networks (WSN), the model could recognize six physical activities with an accuracy ranging from 85 to 95%. The study showed the applicability of using these types of models to recognize the rehabilitation exercises that are ubiquitous in healthcare self-management. In [
18], Cai et al. developed an upper-limb robotic device to rehabilitate stroke patients. The system works by initially recognizing the activity performed by the healthy side of the patient and then provides mirror therapy to the affected side. The method used surface electromyography (sEMG) signals to train and test the model, and SVM was applied to classify the activities. To provide stroke survivors with feedback to maintain a correct posture during rehabilitation, Zambrana et al. [
7] proposed a hierarchical approach using inertial sensors to monitor arm movements. This approach consisted of two levels: the first distinguishes between movements and non-movements of the arm, while the second determines whether the movement was purposeful.
Similar to other pattern recognition problems, continuous raw data should be divided into smaller fragments before feature extraction and subsequent operations. The selection of an efficient segmentation method substantially influences the classification process and, in turn, the accuracy of activity recognition [
19]. The sliding window is the most widely used approach and, to date, it is still considered the best available approach [
19,
20,
21]. In this method, continuous data obtained from sensors are segmented into windows of either static or dynamic size based on time intervals. For the former, two algorithms are available: the fixed-size non-overlapping sliding window and the fixed-size overlapping sliding window. The first is a simple segmentation process in which the number of windows can be calculated exactly, since no overlap exists. The second introduces data overlap between two consecutive windows, where the degree of overlap between consecutive windows is determined by the window shift. Since different activities have different periods of motion, the size of the window depends on the type of activity being evaluated [
22]. However, determining the effective window size is considered a critical issue. A short window size may split an activity’s signal into two or more consecutive windows, whereas a long window size may combine signals for more than one activity. Ultimately, these cases may affect the accuracy of activity classification because information is lost or noise is introduced into the signal, respectively [
23,
24].
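To make the two fixed-size variants concrete, the following minimal sketch (our own illustration; function and parameter names are not taken from the cited works) segments a one-dimensional signal with an optional overlap fraction:

```python
def sliding_windows(signal, window_size, overlap=0.0):
    """Segment a 1-D signal into fixed-size windows.

    `overlap` is the fraction shared by consecutive windows:
    0.0 gives the non-overlapping variant, 0.5 a 50% overlap.
    Trailing samples that do not fill a whole window are dropped.
    """
    # The window shift, in samples, is the non-overlapping part of the window.
    step = max(1, int(window_size * (1.0 - overlap)))
    return [signal[i:i + window_size]
            for i in range(0, len(signal) - window_size + 1, step)]

samples = list(range(12))
print(sliding_windows(samples, 4))        # 3 disjoint windows
print(sliding_windows(samples, 4, 0.5))   # 5 windows, shifted by 2 samples each
```

Note how the overlapping variant trades extra windows (and hence extra classifications) for a lower chance of cutting an activity exactly at a window boundary.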
In dynamic sliding windows, data are segmented into different window sizes according to specific features. One challenge is to optimize window sizes while considering activities of both short and long duration. Numerous studies have sought to address the fixed window size limitation of the sliding window approach. Fida et al. [
22] investigated the impact of using different window sizes on the accuracy of recognizing activities with different durations, reporting that a 1.5-second window size may represent the best trade-off. Other researchers have proposed adaptive window size techniques. In this context, Santos et al. [
25] used entropy feedback to adjust the window size and time continuously, thereby increasing classification accuracy. Nevertheless, the algorithm is computationally complex since shorter time shifts increase the rate of classifications per second. In [
24], Noor et al. presented a segmentation technique that adjusts the window size according to the probability of the signal. Initially, the approach specifies a small window size suitable for separating static and dynamic activities. This size then expands dynamically when a transitional activity is encountered, to accommodate its longer duration. Similarly, using cluster analysis for period extraction, [
21] proposed a technique to differentiate between basic and transitional activities during segmentation. Sheng et al. [
26] designed an adaptive time window by using pitch extraction algorithms to divide the data into periodic and non-periodic activities. The study in [
27] designed and implemented a segmentation method based on the sliding window autocorrelation technique and the Gaussian model. Using a dataset consisting of readings from an accelerometer embedded in a smartphone, the method successfully divided the data into distinct subsets of activities. Based on a change detection algorithm, an activity segmentation method was presented in [
19]. To identify stationary, dynamic, and transitional activities, starting window positions were dynamically detected.
The objective of this research is to propose a novel signal segmentation method for physical activity recognition that can enhance classification performance. Unlike previous studies, this method is concerned with the segmentation of physical activities that belong to the same category (i.e., dynamic activities). To achieve this objective, an experiment was conducted to verify and compare the proposed method with the sliding window approach. The comparison demonstrates the effectiveness of our method, particularly in terms of enhancing recognition accuracy.
The remainder of this paper is organized as follows:
Section 2 presents the set of physical activities applied during the rehabilitation of SCI patients;
Section 3 offers an overview of the system;
Section 4 describes the proposed segmentation method;
Section 5 demonstrates the experimental setup;
Section 6 and
Section 7 present and discuss the results, respectively; and finally,
Section 8 concludes the paper.
3. System Overview
A wireless sensor (Shimmer Research, Dublin, Ireland) was used, consisting of a tri-axial accelerometer, a tri-axial gyroscope, and a tri-axial magnetometer. Given the efficiency of accelerometers in activity recognition, the dataset used in this research was collected using the single tri-axial accelerometer alone [
31,
32,
33]. An accelerometer is a sensing device that measures acceleration in three orthogonal directions simultaneously. The gyroscope and magnetometer were excluded because prior studies indicate that accelerometers alone provide higher overall accuracy [
16]. In addition, the ferromagnetic materials that are commonly available in domestic environments can affect magnetometers. The sensor was configured to collect acceleration data with a sampling frequency of 30 Hz (range ± 2 g), which has been shown to be sufficient for recognizing similar activities [
30,
31]. In addition, a previous study demonstrated that the type and intensity of human activities can be recognized using signals with a sampling rate equal to 10 Hz [
34].
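Reducing a 30 Hz accelerometer stream to the 10 Hz rate cited above amounts to simple decimation. The sketch below is our own illustration (names are hypothetical, and a production pipeline would apply an anti-aliasing low-pass filter before dropping samples):

```python
def decimate(samples, in_rate=30, out_rate=10):
    """Keep every (in_rate // out_rate)-th sample of a uniformly sampled signal.

    Naive decimation for illustration only: real code should low-pass
    filter the signal first to avoid aliasing artifacts.
    """
    if in_rate % out_rate != 0:
        raise ValueError("output rate must evenly divide input rate")
    factor = in_rate // out_rate
    return samples[::factor]

one_second = list(range(30))       # 30 samples, i.e., one second at 30 Hz
print(len(decimate(one_second)))   # 10 samples, i.e., one second at 10 Hz
```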
For upper-limb activities, sensors are typically placed on the wrist and upper arm, and both positions were examined in this research. However, certain activities under study, such as EE, EF, and IR, involve no upper-arm movement, meaning that a sensor placed on the upper arm could not detect any motion. Accordingly, the wrist was chosen as the sensor position.
In terms of axis orientation, the Y-axis was parallel to the forearm, pointing toward the fingers and perpendicular to the X-axis. In addition, the Z-axis pointed away from the back of the wrist, as shown in
Figure 2.
6. Results
Various performance metrics have been used in prior works: accuracy, the ratio of correctly predicted observations to the total observations; recall, the ratio of correctly predicted positive observations to all observations in the actual class; precision, the ratio of correctly predicted positive observations to the total predicted positive observations; and F-measure, which combines precision and recall into a single score.
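These definitions can be expressed directly in code; the following sketch (our own illustration, not part of the evaluation pipeline) computes the four metrics for a single class from raw confusion counts:

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F-measure from
    true/false positive and negative counts for one class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure

acc, prec, rec, f1 = classification_metrics(tp=8, fp=2, fn=2, tn=8)
print(acc, prec, rec, f1)  # all four equal 0.8 for this symmetric example
```

In a multi-class setting such as activity recognition, these per-class values are typically averaged across classes.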
To evaluate the performance improvement of the proposed method, the experiment was conducted in two phases. First, the abovementioned performance metrics were used to determine the recognition performance using both segmentation methods: sliding window and the proposed method. For comparison purposes, only values of similar activities, as in [
15], were presented. This study was chosen because it has the greatest number of shared activities with the ones provided in this work (i.e., the shared activities are abduction, flexion, and EF). In addition, it used the fixed-size sliding window protocol for segmentation. In the second phase, for the purpose of determining the effectiveness of the proposed method using the SVM classifier, other common classifiers, including J48, K-Nearest Neighbors (KNN), and Naïve Bayes (NB), were used for comparison.
Table 2 reports the classification performance of the proposed method in comparison to the fixed-size sliding window approach. It indicates that the proposed method improved not only accuracy but also precision, recall, and F-measure, with statistically significant gains in each.
Additionally, an evaluation of activity type recognition accuracy and prediction error was undertaken for each of the three physical activities. As shown in
Table 3, the algorithm had the greatest difficulties when recognizing abduction and flexion. This was expected because these two activities are similar, especially with regard to the starting and ending points of the movement, as well as the range of motion. However, the algorithm still achieved a recognition accuracy of 96% for these physical activities.
Figure 8 depicts the recall, precision, and F-measure values for each activity obtained by the model using the SVM classifier. Both segmentation methods achieved high classification performance in recognizing EF, so the enhancement achieved by the proposed method was small. However, the enhancement was more pronounced for the more similar activities, abduction and flexion. With the proposed method, recall increased by 5% for abduction and approximately 4% for flexion, while precision increased by 5% and 7%, respectively. In addition, our method increased the F-measure by 7% for abduction and 5% for flexion. These results show that the proposed method not only enhanced performance but also increased model robustness.
To investigate the effectiveness of the proposed method using the SVM classifier, three common machine learning algorithms were further used for the comparison.
Table 4 shows the performance of the proposed method and sliding window using NB, J48, and KNN classifiers.
7. Discussion
In this study, we proposed and verified a machine learning-based method for physical activity segmentation using wearable sensors. Our method enabled the algorithm to classify specific types of physical activity with an accuracy reaching up to 96%. Overall classification performance improved by approximately 5% compared to a commonly used approach, namely the sliding window. Furthermore, the results in
Table 2 clearly indicate that the statistically significant improvement occurred not only in terms of accuracy but also in all performance measures used in this work. This enhancement reflects the effectiveness and applicability of the method on continuous data collected from a single accelerometer.
The algorithm enabled the accurate classification of similar activities, such as abduction and flexion. By contrast, when using sliding window segmentation, the algorithm frequently confused these activities and had difficulty recognizing them. This demonstrates that correct segmentation of raw data affects not only performance but also model robustness.
Table 4 shows that the new segmentation method achieved a recognition rate of more than 91% across four different ML classifiers, with SVM outperforming the others. This is consistent with expectations, since SVM generalizes well on small datasets with few classes. Moreover, these results emphasize the effectiveness of the proposed method, which outperformed the sliding window method across all four classifiers by an average of 5.5%.
The results clearly show that wearable sensors are a promising technology for monitoring and performing automated rehabilitation assessments. Despite the performance enhancement obtained using specific sensor types, affordability and usability are also important factors for determining their applicability. The study in [
18] used sEMG electrodes to recognize different activities performed by stroke patients. Although the results suggested that sEMG signals provide good accuracy for upper-limb activities, attaching these electrodes is a delicate process that requires an expert. This type of sensor is impractical for certain applications, including in-home rehabilitation monitoring, especially if the set of activities must be repeated daily or several times a day. By contrast, the accelerometers used in this research are low-cost and easy to use.
This work can be considered a systematic approach to dynamic signal segmentation that could be applied to other types of physical activity, although slight modifications may be required. Segmenting a wider range of activities may call for additional signal characteristics; one possible solution is to exploit statistical and time series analysis to detect signal variation.
The new method presented in this paper overcomes the limitations of the sliding window approach through the adaptive segmentation of physical activities. However, we acknowledge certain limitations in our work. First, only an accelerometer was used for physical activity recognition. Although studies have demonstrated the effectiveness and efficiency of accelerometers, additional types of sensors, such as gyroscopes and magnetometers, may improve recognition performance. Second, this work focused on the segmentation of physical activities applied during the rehabilitation of SCI patients. Further research should study the effect of this method on other physical activities. Third, the data were collected in a controlled environment. Future work might consider collecting data from real scenarios in which participants perform activities at home. Finally, the selection of the threshold value depends on the training data. In future work, the threshold could be updated periodically according to the incoming signal.
In addition to the abovementioned future work, the impact of the method on the rest of the activities will be investigated. In addition, frequency-domain features and additional time-domain features will be identified to facilitate performance enhancement. Finally, the method will be introduced into hospital-based rehabilitation sessions to examine the performance on SCI individuals.
8. Conclusions
In physical activity recognition using machine learning algorithms, data segmentation is an essential step that may influence accuracy. Nevertheless, studies mostly adopt the sliding window technique and rely on the window size used in previous works. Although this approach is considered simple, it might be ineffective, especially for activities with different durations.
This study proposed a novel segmentation method that can be applied to enhance the recognition of physical activities performed in a rehabilitative context. To adaptively segment the raw data, the algorithm identifies the time boundaries to represent the start- and endpoints of each activity. Peak boundaries and valley boundaries are used depending on the signal characteristics.
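As a rough illustration of this kind of boundary detection (a sketch under our own assumptions, not the authors' implementation; the threshold and all names are hypothetical), local peaks and valleys of a signal can serve as candidate segment boundaries:

```python
def extrema_boundaries(signal, threshold=0.0):
    """Return indices of local peaks and valleys whose magnitude
    exceeds `threshold`; such extrema can serve as candidate
    start/end points when segmenting a quasi-periodic signal.

    Illustrative only: a practical method would also smooth the
    signal and tune the threshold from training data.
    """
    peaks, valleys = [], []
    for i in range(1, len(signal) - 1):
        prev, cur, nxt = signal[i - 1], signal[i], signal[i + 1]
        if prev < cur > nxt and cur > threshold:
            peaks.append(i)
        elif prev > cur < nxt and cur < -threshold:
            valleys.append(i)
    return peaks, valleys

wave = [0, 1, 2, 1, 0, -1, -2, -1, 0, 1, 2, 1, 0]
print(extrema_boundaries(wave, threshold=1.5))  # ([2, 10], [6])
```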
The proposed algorithm was also verified in this paper. The results, generated using data from a single accelerometer located on the wrist, confirmed the effectiveness and applicability of the method on continuous raw data. Moreover, adopting the proposed method generally improved recognition performance, and the improvement was more substantial for similar activities.