A Multi-Layer Classifier Model XR-KS of Human Activity Recognition for the Problem of Similar Human Activity

Sensor-based human activity recognition is now well developed, but many challenges remain, such as insufficient accuracy in the identification of similar activities. To overcome this issue, we collected data during similar human activities using three-axis accelerometer and gyroscope sensors. We developed a model capable of classifying similar human activities and evaluated its effectiveness and generalization capabilities. Building on standardized and normalized data, we account for the inherent similarities of human activity behaviors by introducing a multi-layer classifier model. The first layer of the proposed model is a random forest model based on the XGBoost feature selection algorithm. In the second layer, similar human activities are extracted by applying kernel Fisher discriminant analysis (KFDA) with feature mapping, and a support vector machine (SVM) model is then applied to classify the similar activities. Our model is experimentally evaluated on four benchmark datasets: UCI DSA, UCI HAR, WISDM, and IM-WSHA. The experimental results demonstrate that the proposed approach achieves recognition accuracies of 97.69%, 97.92%, 98.12%, and 90.6%, indicating excellent recognition performance. Additionally, we performed K-fold cross-validation on the random forest model and used ROC curves for the SVM classifier to assess the model's generalization ability. The results indicate that our multi-layer classifier model exhibits robust generalization capabilities.


Introduction and Related Work
Human activity recognition (HAR) involves identifying various human behaviors through a series of observations of individuals and their surrounding environment [1]. HAR has been widely applied in many fields, such as security and surveillance [2], sports and fitness [3], industry and manufacturing [4], and autonomous driving [5].
A novel IoT-perceptive HAR approach based on a multi-head convolutional model was investigated in [6]. A hand-crafted and deep convolutional neural network feature fusion and selection strategy was given in [7]. In [8], the authors considered smart home environments using LSTM networks based on sensor-based smartphone data. In [9], a federated learning system with enhanced feature extraction was applied to HAR. A Bi-LSTM network was developed for multimodal continuous HAR [10]. In the field of industry and manufacturing, time factor analyses in conjunction with HAR have been considered for worker operating times [11]. As pointed out in [12,13], HAR technology improves the accuracy of targeting criminals with deep learning. The recognition of human activities has also been applied to the development of suitable autonomous driving systems [14].
HAR methods can be broadly categorized into two main directions: vision-based HAR and wearable sensor-based HAR. Vision-based HAR has received considerable attention, as has wearable sensor-based HAR [15]. However, vision-based HAR faces several challenges, including privacy concerns related to potential video data leakage and the significant computational power and storage resources required for image processing. Additionally, factors such as the observer's position and angle, the subject's physique, background color, and light intensity can impact the accuracy of vision-based HAR [16]. In contrast, inertial sensor technology is typically cost-effective and offers greater robustness and portability in various environmental conditions [17]. Currently, sensor-based recognition technology has gained widespread attention due to its superior confidentiality and relatively lower computational requirements. The role of sensor placement in the design of HAR systems to optimize their availability has been discussed in [18]. Leveraging these advantages, wearable sensor-based HAR has garnered increasing interest in recent years.
The earliest research on sensor-based recognition of human behavior can be traced back to the 1990s, with studies by researchers such as Foerster [19] and Bouten [20]. Nowadays, research on wearable sensors has resulted in the development of many highly accurate models. In [21], the authors achieved an overall accuracy of 84% by effectively collecting data and using decision tree classification. The Centinela system, developed by Lara and colleagues, achieved an overall accuracy of 95.7% [22]. However, a problem was identified: single-classification models can cause confusion when distinguishing similar activities, such as ascending stairs and descending stairs. In a study conducted by Jansi et al. [23], chaotic mapping was used to compress raw tri-axial accelerometer data, and 38 time-domain and frequency-domain features were extracted, including the mean, standard deviation, root mean square, dominant frequency coefficient, spectral energy, and others. They achieved a recognition accuracy of 83.22% in human activity recognition; however, the results showed significant confusion between activities such as running, ascending stairs, and descending stairs. In the research conducted by Vanrell et al. [24], a 91-dimensional feature vector, including cepstral coefficients, time-domain features, and periodicity features, was extracted from single-axis accelerometer data. They achieved a recognition accuracy of 91.21% in a classification task involving ten different human activities; however, the results also indicated substantial confusion between activities such as cycling on an exercise bike in a horizontal position, cycling on an exercise bike in a vertical position, ascending stairs, and descending stairs. The main reason for the confusion between similar activities is that, within the same individual, different activities may have similar activity cycles or amplitudes, which can cause confusion in activity recognition and result in a decrease in overall accuracy.
The kernel Fisher discriminant analysis (KFDA) is a powerful extension of the Fisher discriminant analysis (FDA) [25]. It has been shown to be highly effective in various pattern recognition and classification tasks. While the traditional FDA is primarily designed for linearly separable data, the KFDA extends its capabilities by allowing the analysis of nonlinearly separable data using kernel functions. The KFDA method serves as a robust nonlinear classifier that is suitable for pattern recognition, classification, and regression analysis tasks [26]. It is capable of capturing the nonlinear relationships between the input and output variables in a dataset and demonstrates good generalization performance in various practical problems.
The motivation for this paper is derived from [27,28], where similar issues were addressed by applying the KFDA method prior to data classification. KFDA has been applied in numerous machine learning applications. To address the issue of confusion between similar activities in single-model human activity recognition and to improve the overall accuracy of recognizing multi-class activities, we took inspiration from the approaches successfully used in [27] and by other researchers for the related problem of lithofacies identification, and we utilize KFDA (kernel Fisher discriminant analysis) to preprocess the similar-activity data before classification. In this paper, we propose a multi-layer classifier model based on the KFDA. This approach involves preprocessing steps, followed by an initial classification using a random forest method. Subsequently, the KFDA is applied to process the data. Finally, an SVM is employed for the detailed classification of ambiguous actions. The end result is a robust multi-layer classification model that effectively tackles the challenge of differentiating between similar activities.
In this paper, we propose the XR-KS design (a detailed description is given in Section 2), aimed at addressing the issue of confusion between similar activities. To address the feature similarity of such activities, we propose an SVM classification approach that utilizes KFDA, which effectively categorizes similar activities. Additionally, we conducted classification experiments on four common benchmark datasets, performed detailed analyses on these datasets, and compared our model to mainstream classification models. The experimental results demonstrate that our model exhibits excellent classification performance.
The remaining sections of this paper are organized as follows: Section 2 provides a brief introduction to the work carried out in this paper, along with details about the dataset used. Section 3 conducts a basic data analysis, employs appropriate data preprocessing techniques, and introduces our proposed approach for human motion recognition, which is based on a multi-layer classifier. Section 4 presents the experimental setup, provides results for our proposed method on multiple datasets, and offers an analysis and discussion of these results. Finally, Section 5 summarizes the insights gathered from these experiments and outlines future directions.

Modeling Framework and Database
Within human activity recognition (HAR) research, various datasets have been previously published. Notably, the UCI (University of California, Irvine, CA, USA) HAR dataset holds a prominent position due to its extensive usage in numerous studies and comparative analyses [29]. Equally noteworthy is the WISDM (wireless sensor data mining) dataset [30], which has also gained significant recognition. Additionally, datasets like UCI DSA [31] and IM-WSHA [32] have been widely employed in research endeavors. Besides these datasets, there are several others that are not explicitly detailed in this article. A qualitative comparison highlighting the strengths and weaknesses of the four primary datasets is presented in Table 1. While the data collected from UCI DSA may appear simpler in comparison to UCI HAR, WISDM, and IM-WSHA, UCI DSA captures a wide range of 19 different human activities. Unlike the other datasets, it better represents complex human activities and serves as a more comprehensive showcase for our model in this paper. We conducted experiments using all four databases, but in the following sections, we focus our narrative on UCI DSA.
The UCI DSA data in this paper were obtained from measurements of human activity by miniature inertial sensors and magnetometers placed on different parts of the body. Sensor data were collected from a total of 8 subjects performing 19 different activities. The total signal duration for each subject for each activity was 5 min. The sensor units were calibrated to acquire data at a 25 Hz sampling frequency. Each 5-min signal was divided into 5-s segments, resulting in 480 (= 60 × 8) signal segments for each activity.
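As a concrete illustration of this segmentation, the following sketch (our own Python reconstruction, not the paper's original code; the array shapes are assumptions based on the dataset description) splits a 25 Hz recording into non-overlapping 5 s windows:

```python
import numpy as np

FS = 25                # sampling frequency reported for UCI DSA (Hz)
WINDOW_SEC = 5         # segment length (s)
WIN = FS * WINDOW_SEC  # 125 samples per segment

def segment(signal: np.ndarray) -> np.ndarray:
    """Split a (n_samples, n_channels) recording into non-overlapping
    5 s windows; trailing samples that do not fill a window are dropped."""
    n_win = signal.shape[0] // WIN
    return signal[: n_win * WIN].reshape(n_win, WIN, signal.shape[1])

# one subject, one activity: 5 min at 25 Hz = 7500 samples, 45 sensor channels
recording = np.random.randn(7500, 45)
print(segment(recording).shape)  # (60, 125, 45): 60 segments per subject
```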
Eight volunteers participated, resulting in a dataset comprising 9120 instances. This dataset intricately describes the data captured from various sensors, measuring activities performed by different subjects within the same time intervals. We consolidated this textual dataset into a CSV file with two columns: subject ID and activity type.
After preprocessing the data and utilizing the filtered features, our team developed a practical algorithm to classify 19 human behaviors. Given the vast amount of data and the inherent similarities among human activities, direct classification using a single machine learning algorithm could lead to confusion and decreased accuracy.
Our approach includes data preprocessing, feature selection, and a two-layer model classification, roughly as shown in Figure 1.
To provide a more detailed overview of our efforts in addressing similar activities and distinguishing them from other actions in confusing scenarios, we created Figure 2 to illustrate our model.
The model proposed in this paper is named XR-KS. First, we define the XR-KS model as follows: the XR-KS model is a two-tiered model. The XR model is a random forest model that utilizes the XGBoost feature selection algorithm. On the other hand, the KS model is a support vector machine model that is based on kernel Fisher discriminant analysis. We will detail these two models below.
The approach in this paper begins by processing data that have been minimized and normalized. Inspired by the success of Zhang et al. in ozone prediction, who utilized the BO-XGBoost-RFE algorithm [33] for feature selection, we decided to employ the advanced XGBoost model for feature selection. The data with the filtered features are then used as the initial dataset, and a random forest classification model is employed for classification, which produces preliminary classification results. This entire process constitutes the first layer, the XR model. However, we observed that certain similar human behaviors are prone to causing confusion in the classification results. Therefore, we introduce kernel Fisher discriminant analysis to preprocess the data before employing a support vector machine for classification. This iterative process continues until no more instances of confusing activities are encountered. The models derived from these processes are called the KS models.
In this section, the preprocessing work is illustrated using the UCI DSA dataset as an example, in order to avoid unnecessary complexity in the article. We downloaded the dataset from the official UCI website [31] and found it to be somewhat disorganized. To streamline the dataset, we consolidated the original files into a CSV file. Additionally, to simplify the lengthy labels in the "Behavior" column of the dataset, we adopted an abbreviated format; for example, "sitting" is replaced with "A1". Since the actual experiments do not require specific information such as IDs and names, we did not include them. This processing aligns with the original dataset. For detailed information, please refer to Table 2.

Standardization and Normalization
To address potential variations in sensor data collection and enhance the classification performance of our proposed model, we conducted a comprehensive examination of data samples. This involved randomly selecting a metric and comparing it across the 19 different activities. The initial dataset, illustrated in Figure 3a, comprised 60,000 sample points across various activities for this metric. To ensure consistency and facilitate improved classification, we applied standardization and normalization methods during data preprocessing.
As shown in Figure 3b, after our preprocessing steps, it becomes evident that all data points now fall within the standardized range of 0 to 1. Despite this transformation, the fundamental characteristics of the data are preserved. This meticulous preprocessing not only mitigates potential discrepancies in sensor readings but also enhances the robustness of the classification framework for the diverse set of activities.
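A minimal sketch of this preprocessing step, assuming scikit-learn (the paper does not name the library it used for scaling):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.random.randn(60000, 45)  # placeholder for the raw sensor samples

# Standardize to zero mean / unit variance, then rescale into [0, 1];
# in practice both scalers should be fit on the training split only.
X_std = StandardScaler().fit_transform(X)
X_norm = MinMaxScaler(feature_range=(0, 1)).fit_transform(X_std)
assert X_norm.min() >= 0.0 and X_norm.max() <= 1.0
```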
In the following subsections, we first introduce the XGBoost feature selection algorithm and the random forest model and combine them into the XGBoost-random forest model. We then introduce the kernel Fisher discriminant analysis model and the SVM model, culminating in the proposal of the KFDA-SVM model by combining these two techniques.

Random Forest Initial Classification Model Based on the XGBoost Feature Selection Algorithm
To achieve a higher initial classification accuracy for subsequent improvement in the second classification stage, we note that feature selection algorithms can effectively enhance the efficiency of machine learning [34]. Therefore, we initially extract relevant metrics using the XGBoost feature selection algorithm to assess the importance of the input metrics. Random forest, an ensemble classifier employing multiple decision trees to train on samples and make predictions, is then utilized. The random forest model based on the XGBoost feature selection algorithm is depicted in Figure 4.

XGBoost Feature Selection Algorithm
XGBoost was first proposed by Chen et al. (2014) [35]. The traditional GBDT objective function predicts the target category by stacking the residual trees from different iterations. XGBoost improves upon the traditional GBDT objective function by incorporating a regularization term into the original function, which helps to mitigate overfitting and accelerates convergence.
The objective function of the model is shown in Equation (1) as follows:

$$Obj = \sum_{i=1}^{n} L(y_i, \hat{y}_i) + \sum_{k} \Omega(\varphi_k), \tag{1}$$

where $L(y_i, \hat{y}_i)$ is the loss function of the squared difference between the true value $y_i$ and the predicted value $\hat{y}_i$, and $\Omega(\varphi)$ is the regularization term:

$$\Omega(\varphi) = \gamma T + \frac{1}{2} \lambda \|w\|^2, \tag{2}$$

where $\gamma$ is the difficulty coefficient of tree splitting that is used to control the generation of the tree, $T$ denotes the number of leaf nodes, and $\lambda$ denotes the L2 regularity coefficient.

The second-order Taylor expansion of the objective function from Equation (1) is as follows:

$$Obj^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^{2}(x_i) \right] + \Omega(f_t), \tag{3}$$

where $g_i$ and $h_i$ are the first- and second-order gradients of the loss with respect to the current prediction.

The definition of a tree is as follows:

$$f_t(x) = w_{q(x)}, \quad w \in \mathbb{R}^{T}, \quad q : \mathbb{R}^{d} \to \{1, \dots, T\}, \tag{4}$$

where $q$ represents the structure of the tree, mapping input samples $x_i \in \mathbb{R}^d$ to leaf nodes; $T$ is the number of leaf nodes; and $w$ is a one-dimensional vector of length $T$ that represents the weights of the leaf nodes.

The objective function can be rewritten as follows:

$$Obj = \sum_{j=1}^{T} \left[ \left( \sum_{i \in I_j} g_i \right) w_j + \frac{1}{2} \left( \sum_{i \in I_j} h_i + \lambda \right) w_j^{2} \right] + \gamma T, \tag{5}$$

where $I_j$ is the sample set of the $j$-th leaf node, and $\gamma$ is the weight factor.
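For completeness, minimizing Equation (5) leaf by leaf gives the closed-form optimal leaf weights and objective value that XGBoost uses to score candidate splits; this standard step is implicit in the derivation above. With $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$, setting $\partial Obj / \partial w_j = 0$ yields

$$w_j^{*} = -\frac{G_j}{H_j + \lambda}, \qquad Obj^{*} = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T.$$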
The main features that can classify human behavior have been extracted in the previous steps. Considering the relationships between these data, and the fact that the training samples are discrete and the data volume is substantial, the random forest algorithm is adopted for training. The general algorithmic flow of the random forest is illustrated in Figure 5. In order to initially identify multi-class activities, the random forest classification algorithm, which has shown excellent performance in supervised learning, is chosen as the layer-1 classifier.

Random Forest Based on the XGBoost Feature Selection Algorithm
Random forest is a composite classification model composed of many decision tree classification models $\{h(X, \Theta_k), k = 1, \dots\}$, where the parameter set $\{\Theta_k\}$ is a collection of independently and identically distributed random vectors. Given the independent variables X, each decision tree classification model selects the optimal classification result through a majority vote. The basic idea is to first use bootstrap sampling to extract k samples from the original training set, with each sample having the same sample size as the original training set. Then, k decision tree models are built for the k samples, resulting in k different classification results. Finally, based on these k classification results, a majority vote is used to determine the final classification result for each record.
The final classification decision in a random forest is made by training through k rounds, obtaining a sequence of classification models $\{h_1(X), h_2(X), \dots, h_k(X)\}$ and using them to create a multi-classification model system. The ultimate classification result of this system is determined using a simple majority voting method:

$$H(x) = \arg\max_{Y} \sum_{i=1}^{k} I(h_i(X) = Y), \tag{6}$$

where H(x) is the multi-classification model, $h_i$ is an individual decision tree classification model, Y represents the output variable, and $I(\cdot)$ is the indicator function.
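As a small illustration of Equation (6), the following sketch (our own, using hypothetical per-tree predictions) implements the majority vote directly:

```python
import numpy as np

def majority_vote(tree_preds: np.ndarray) -> np.ndarray:
    """tree_preds: (k, n) array of integer labels from k trees for n samples.
    Implements H(x) = argmax_Y sum_i I(h_i(x) = Y) from Equation (6)."""
    k, n = tree_preds.shape
    n_classes = int(tree_preds.max()) + 1
    out = np.empty(n, dtype=int)
    for j in range(n):
        out[j] = np.bincount(tree_preds[:, j], minlength=n_classes).argmax()
    return out

# three trees voting over four samples
preds = np.array([[0, 1, 2, 2],
                  [0, 1, 1, 2],
                  [1, 1, 2, 0]])
print(majority_vote(preds))  # [0 1 2 2]
```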
The specific implementation of these ideas, as demonstrated in Algorithm 1, integrates the XGBoost feature selection algorithm with random forest modeling to develop a unique classification model. The advantage of this approach lies in its ability to effectively select features from the classification dataset.

Algorithm 1: Random forest model based on the XGBoost feature selection algorithm
Input: Let D = {(x_1, y_1), …, (x_N, y_N)} denote the training data, with x_i = (x^1, …, x^k); number of trees M > 0.
Output: Prediction of the random forest at x_i and the random forest model.
1: Train the XGBoost model on the data D.
2: For a = 1, …, k do
3: Score the a-th metric according to the XGBoost model to obtain the scoring data.
4: End
5: Output the top m metrics and replace the metrics of the D dataset with them.
6: Randomly divide D into train data R and test data L in a certain ratio.
7: Set the objective function w x^T + b = 0 using train data R and test data L.
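A compact Python sketch of Algorithm 1 under our own assumptions: scikit-learn and the xgboost package stand in for the MATLAB implementation used in the paper, and m = 31 retained features with 50 trees follow the choices reported in Section 4 (the function name xr_model is ours):

```python
import numpy as np
import xgboost as xgb
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def xr_model(X, y, m=31, n_trees=50, test_size=0.1, seed=0):
    """First layer (XR): XGBoost scores the features, the top-m features
    are kept, and a random forest is trained on the reduced data."""
    scorer = xgb.XGBClassifier(n_estimators=100, random_state=seed)
    scorer.fit(X, y)                                            # step 1
    top_m = np.argsort(scorer.feature_importances_)[::-1][:m]   # steps 2-5

    X_sel = X[:, top_m]
    X_tr, X_te, y_tr, y_te = train_test_split(                  # step 6
        X_sel, y, test_size=test_size, random_state=seed, stratify=y)

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_tr, y_tr)                                      # step 7
    print("test accuracy:", forest.score(X_te, y_te))
    return forest, top_m
```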

Second-Layer SVM Classification Based on Kernel Fisher Discriminant Analysis
In Figure 6, we observe challenges in classifying actions such as ascending and descending stairs due to their intricate details. Additionally, Figure 7 reveals that PCA and the small inter-class distances in the original features hinder effective classification. To mitigate confusion between similar actions, we employ two key steps. First, we employ KFDA (kernel Fisher discriminant analysis) for feature dimensionality reduction to improve the discrimination of similar activities; this aims to increase the separation between distinct actions in the data space. Second, the transformed features are passed to an SVM for classification. The workflow is shown in Figure 8.

Principle of Kernel Fisher Discriminant Analysis
KFDA is a pattern recognition and classification method based on kernel techniques and is an extension of Fisher discriminant analysis. KFDA is designed to handle nonlinearly separable data by mapping the data to a high-dimensional feature space, thereby improving classification performance. We describe KFDA following [27]. Kernel Fisher discriminant analysis (KFDA) was first proposed by Schölkopf et al. in 1997 [36] and can be expressed as the maximization of Equation (7):

$$\max_{w} J(w) = \frac{w^{T} S_b w}{w^{T} S_w w}, \tag{7}$$

where $S_w$ represents the within-class scatter matrix, $S_b$ is the between-class scatter matrix, and $w$ denotes the projection vector.
The above problem is equivalent to finding the generalized eigenvectors of the eigenvalue problem:

$$S_b v_i = \lambda_i S_w v_i, \tag{8}$$

where the eigenvalues $\lambda_i$ represent the discriminative power of each projection vector.
Once we obtain the projected vector v, it can be used for classification instead of the original vectors with a linear classifier.
The limitations of the LDA method are primarily due to its inherent linearity, especially when applied to nonlinear problems [37]. In contrast, the KFDA method, an improved version of LDA that uses the kernel trick, overcomes these limitations. KFDA is better suited for the analysis of high-dimensional data and complex systems. It is easy to implement and is characterized by its adaptability and generalizability.

The core concept of KFDA is to map the original input data using a nonlinear mapping function φ into a high-dimensional feature space F, typically a nonlinear space (see Figure 10). Through this transformation, nonlinear relationships within the input data are indirectly transformed into linear relationships. LDA is then applied to extract the most significant discriminating features in this feature space. To overcome the computational challenges of calculating φ, researchers introduce kernel functions to express the functional relationships of the nonlinear mappings.
The goal of KFDA is to find a set of projection vectors that maximize the inter-class distance while minimizing the intra-class distance within the feature space. This is achieved by maximizing the kernel Fisher criterion:

$$\max_{\alpha} J(\alpha) = \frac{\alpha^{T} K_b \alpha}{\alpha^{T} K_w \alpha}, \tag{9}$$

where α represents the projection vector, $K_b$ represents the kernel between-class scatter matrix, and $K_w$ is the kernel within-class scatter matrix in the feature space. The problem in Equation (9) can be reformulated as solving the generalized eigenvalue equation, thus reducing redundancy:

$$K_b \alpha = \lambda K_w \alpha, \tag{10}$$

where λ is the nonzero eigenvalue associated with the projection vector α. Let $\alpha_{opt} = (\alpha_1, \dots, \alpha_M)$ be the optimal projection vectors, corresponding to the leading eigenvalues of Equation (10); $\lambda_1, \dots, \lambda_M$ are the eigenvalues of $\alpha_1, \dots, \alpha_M$, respectively, with $\lambda_1 \ge \dots \ge \lambda_M$. The number of retained vectors m is calculated using the cumulative contribution rate of the eigenvalues. If $\alpha_{opt}$ is known, the nonlinear decision function f(x) of KFDA is as follows:

$$f(x) = \sum_{i=1}^{n} \alpha_i \, k(x_i, x), \tag{11}$$

where $\alpha_i$ is the coefficient of the i-th kernel term, $x_i$ is the i-th input sample, and k is the kernel function.
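The following sketch is a minimal reconstruction of Equations (9) and (10), assuming a Gaussian kernel and a small ridge term to keep $K_w$ invertible; it is not the paper's original code:

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def gaussian_kernel(X, Y, sigma):
    """Gaussian kernel matrix between the rows of X and Y (Equation (12))."""
    return np.exp(-cdist(X, Y, "sqeuclidean") / (2 * sigma ** 2))

def kfda_fit(X, y, sigma=1.0, n_components=3, reg=1e-6):
    """Multi-class kernel Fisher discriminant analysis; returns the dual
    coefficients alpha so new points map via gaussian_kernel(X_new, X) @ alpha."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    m_star = K.mean(axis=1, keepdims=True)  # overall kernel mean
    M = np.zeros((n, n))                    # kernel between-class scatter K_b
    N = np.zeros((n, n))                    # kernel within-class scatter  K_w
    for c in np.unique(y):
        Kc = K[:, y == c]
        nc = Kc.shape[1]
        mc = Kc.mean(axis=1, keepdims=True)
        M += nc * (mc - m_star) @ (mc - m_star).T
        N += Kc @ (np.eye(nc) - np.full((nc, nc), 1.0 / nc)) @ Kc.T
    N += reg * np.eye(n)  # regularize K_w so the eigenproblem is well-posed
    # generalized eigenproblem  K_b alpha = lambda K_w alpha  (Equation (10))
    w, v = eigh(M, N)
    return v[:, ::-1][:, :n_components]  # leading eigenvectors

# usage: Z_train = gaussian_kernel(X_train, X_train, sigma) @ alpha
```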

Kernel Parameter Optimization
Among these kernel functions, the Gaussian kernel stands out due to its strong generalization capability and the fact that it requires fewer parameters to be set. This makes it particularly effective in capturing nonlinear relationships. Therefore, as in [37], the Gaussian kernel function was chosen as the kernel function and is expressed as shown in Equation (12):

$$k(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}} \right), \tag{12}$$

where σ is the width parameter of the Gaussian kernel.
The kernel parameter σ in the KFDA-SVM model is crucial, as it influences the position and distribution of the data in the feature space and significantly impacts the classification efficiency and generalization ability of the SVM model. Figure 11 illustrates the importance of the kernel parameter, showcasing experimental values of 0.5, 1, 2, and 5. Selecting the correct kernel parameter value is a crucial step in attaining optimal results.
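Since the paper selects σ by visually comparing the projections in Figure 11, a quantitative stand-in (our assumption, not the authors' procedure) is to sweep the candidate widths and keep the one with the best cluster separation, e.g., by silhouette score. Here X_conf and y_conf denote the confusing-category samples and labels, and kfda_fit/gaussian_kernel are the sketches above:

```python
import numpy as np
from sklearn.metrics import silhouette_score

best_sigma, best_score = None, -np.inf
for sigma in (0.5, 1.0, 2.0, 5.0):  # values examined in Figure 11
    alpha = kfda_fit(X_conf, y_conf, sigma=sigma)
    Z = gaussian_kernel(X_conf, X_conf, sigma) @ alpha  # projected data
    score = silhouette_score(Z, y_conf)
    if score > best_score:
        best_sigma, best_score = sigma, score
print("selected sigma:", best_sigma)
```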

SVM Model Based on Kernel Fisher Discriminant Analysis
In order to further categorize the confusing actions into specific actions, this paper introduces the support vector machine (SVM) model as a sub-classification model for dividing the confusing actions. The principle of the SVM classifier is to find a hyperplane that maximizes the distance between different categories to achieve effective classification. As shown in Figure 12, increasing the width of the classification interval (i.e., maximizing it) reduces the impact of local interference in the training set; such a classifier can therefore be considered to have the best generalization performance and overall applicability. The SVM model can be formulated as follows:

$$y = \mathrm{sign}(w^{T} x + b), \tag{13}$$

where x is the feature vector, w is the weight vector, y is the marker vector, and sign(·) is the sign function.

As shown in Figure 12, the SVM finds the optimal classification hyperplane by maximizing the classification margin. Assuming that the input of the training set consists of a set of vectors $x^{(i)}$ and the output is the set of labels $y^{(i)}$, the classification interval is twice the minimum distance from the full set of samples to the hyperplane, which is as follows:

$$d = 2 \min_{i=1,\dots,m} \frac{\left| w^{T} x^{(i)} + b \right|}{\|w\|}, \tag{14}$$

where m is the number of samples. We then find suitable w and b such that, when $y_j = +1$, $w^{T} x_j + b \ge +1$, and when $y_j = -1$, $w^{T} x_j + b \le -1$. Mathematically, all sample points that satisfy Equation (15),

$$y_j \left( w^{T} x_j + b \right) = 1, \tag{15}$$

(i.e., sample points with the smallest Euclidean distance to the classification hyperplane) are defined as support vectors. Therefore, the set of samples must satisfy one of two cases: $w^{T} x_j + b \ge +1$ if the samples are positive, and $w^{T} x_j + b \le -1$ if the samples are negative, as shown in Figure 13. The full second-layer procedure is summarized in Algorithm 2.

Algorithm 2: SVM model based on kernel Fisher discriminant analysis
Input: training data A, kernel value σ.
Output: prediction of the SVM at x_i and the SVM model.
…
4: Update the within-class scatter: S_w += X_c^T × K_c × X_c.
5: Calculate the inter-class scatter: S_b += n_c × (mu_c − mu) × (mu_c − mu)^T.
6: End
7: Construct the projection matrix W, where the columns of W are the selected eigenvectors.
8: Return the projection matrix W and data A.
9: Set the objective function w x^T + b = 0 using train data A_1 and test data A_2.
10: For i = 1, …, M do
11: Find suitable w and b that meet:
12: For j = 1, …, M do
13: If y_j = +1 then
14: w^T x_j + b ≥ +1
15: else if y_j = −1 then
16: w^T x_j + b ≤ −1
…
19: Obtain the support vector machine model.
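Putting the pieces together, the following is a sketch of the KS layer built on the kfda_fit/gaussian_kernel helpers above; it is our own reconstruction, and the linear SVM and 9:1 split are assumptions consistent with the experimental setup:

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def ks_model(X_conf, y_conf, sigma=2.0, n_components=3, seed=0):
    """Second layer (KS): project the confusing samples with KFDA,
    then separate them with a maximum-margin classifier."""
    alpha = kfda_fit(X_conf, y_conf, sigma=sigma, n_components=n_components)
    Z = gaussian_kernel(X_conf, X_conf, sigma) @ alpha
    Z_tr, Z_te, y_tr, y_te = train_test_split(
        Z, y_conf, test_size=0.1, random_state=seed, stratify=y_conf)
    svm = SVC(kernel="linear", probability=True).fit(Z_tr, y_tr)
    print("fine-classification accuracy:", svm.score(Z_te, y_te))
    return svm
```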

Experimental Setting
The experiments were conducted in Guilin, China, on an ASUS computer with the following specifications: an AMD Ryzen 7 4800H processor with Radeon graphics operating at 2.90 GHz, 16 GB of RAM, and an NVIDIA GeForce GTX 1660 Ti graphics card. The operating system was Windows 10. We used both MATLAB 2022R and Python 3.9.7 to conduct the experiments and validated the approach on four different datasets: UCI DSA, UCI HAR, WISDM, and IM-WSHA. We also conducted a comprehensive evaluation of our approach. To maintain the conciseness of the paper, the following experiments are illustrated using the UCI DSA dataset as an example.

Extraction of Important Features
By utilizing the XGBoost feature value selection algorithm to analyze the 45 features in the dataset, we can evaluate the relative importance of each feature. As shown in Figure 14b, we obtained different experimental results by using the first n features as input to the first layer of the random forest model. These results effectively demonstrate the accuracy and time consumption in various scenarios. Therefore, we selected the first 31 features from Figure 14a to be used in the subsequent multi-layer classifier based on generalized discriminant analysis. To mitigate potential interference with our classification accuracy, we filter out the features with lower importance.
The histogram of important feature weights as well as feature selection experiments are shown in Figure 14, and some of the results of the experiments are shown in Table 3.

Extraction of Random Forest Based on the XGBoost Feature Selection Algorithm
Firstly, using the random forest model in MATLAB, a plot illustrating the relationship between the number of decision trees and the error and time was generated, as depicted in Figure 15. Specific results can be found in Table 4. Examining Figure 9, we note that, after a comprehensive evaluation considering computer performance, model accuracy, and overall model reliability, the optimal number of decision trees for this dataset is 50. To thoroughly assess the classification capabilities of the random forest model, a dedicated test dataset was created. Experiments were conducted using MATLAB, involving the uniform partitioning of the entire dataset into different ratios based on various human activities and different volunteers. The outcomes, which detail the influence of these ratios on both training and test set accuracy, are summarized in Table 5. To better analyze the random forest identification results mentioned above, a confusion matrix plot was created using MATLAB.
We also conducted a random forest classification on data from the other three databases individually, using a similar experimental setup. The experimental results obtained are presented in Table 6. Observing the four charts above and drawing upon real-world judgment, this study suggests that the main reason for the inconsistency between action recognition results and actual results is the similarity in features among these actions, which makes them easily confused during the algorithmic recognition process. For instance, activities such as climbing stairs, walking, or standing in an elevator demonstrate these similarities. Apart from the mentioned actions, the predictive accuracy for all other actions approaches 100%. This indicates that these actions can be recognized and classified as genuine actions within this layer of the classification model.
The remaining unrecognized actions fall into two main categories: the model classifies A9, A13, and A18 as A19 and confuses A7 and A8 with each other. To facilitate subsequent fine-grained classification models, these similar actions are divided into two main categories, as illustrated in Table 7. Taking the four behavior classes (A9, A13, A18, and A19) as an example of Confusing category I, we first extracted three of the most important features from the dataset and created a scatter plot, as shown on the left side of Figure 7. It can be observed that these four behavior classes have a relatively compact spatial distribution in these three original features, indicating a small inter-class distance and a large intra-class distance. This does not support activity recognition by the classifier.
Subsequently, we applied principal component analysis (PCA) for dimensionality reduction, as illustrated on the right side of Figure 7, which shows three randomly selected nonlinear discriminative features extracted from the original features of these four similar activities. We found that the mapping results of the PCA are not particularly favorable, as the inter-class distance remains small. Therefore, we employed kernel Fisher discriminant analysis for dimensionality reduction, focusing on the points that were previously misclassified by the upper-layer random forest. Kernel Fisher discriminant analysis has a variable parameter denoted as σ, whose range is typically set within [0, 10]. We conducted experiments with different parameter settings and obtained multiple images, as depicted in Figure 11.
Based on the above experimental results, we can see that, in this scenario, the three-dimensional scatter plot shows the data categorized into four classes. In order to obtain a clearer visual representation, we selected the two features that performed best in the three-dimensional space and generated a two-dimensional scatter plot, as shown in Figure 16.
In this paper, we utilize MATLAB to sub-classify the aforementioned model, inputting the indicators produced by the generalized discriminant analysis into the SVM as the original data. Taking Confusing category I as an example: because A9 and A13 are more closely connected, and A18 and A19 are also more closely connected, we first subdivide Confusing category I into two large classes, {A9, A13} and {A18, A19}, and then a second subdivision separates Confusing category I into the individual classes A9, A13, A18, and A19 by a two-layer SVM. The four classes of activities are shown in Figure 17.
Through the above steps, the data from Confusing category I were classified into two major classes, {A9, A13} and {A18, A19}, using the SVM. In order to achieve a more precise classification, this paper further conducts a fine classification of these two major classes into specific activity classes, as shown in Figures 17-19.
Through the steps illustrated in these figures, we are able to classify all the data in Confusing category I into specific activity classes using the fine-grained SVM classification. Although the effect of the SVM fine classification of A9 and A13 is not pronounced, as shown in Figure 19, it is still much better than the initial random forest classification. Similarly, we conducted various experiments, as shown in Table 8. In this study, a similar operation was performed on Confusing category I, which served as input to the second layer of the support vector machine. This process yielded recognition probabilities for four similar activities of human behavior. The final recognition results for these four activities were determined through a weighted average that considered the recognition probabilities from the first-layer classifier. Figure 20 illustrates the confusion matrix, revealing substantial improvement in the actions originally prone to confusion. The overall accuracy rate increased significantly, from 89.07% to an impressive 97.69%. We also compared our approach with those of others on three datasets: UCI HAR, WISDM, and IM-WSHA, as shown in Table 9.
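The weighted average itself can be sketched as follows; the paper does not report the weight, so w is left as a free parameter, and p_rf and p_svm denote the class-probability matrices from the two layers:

```python
import numpy as np

def fuse(p_rf: np.ndarray, p_svm: np.ndarray, w: float = 0.5) -> np.ndarray:
    """Combine layer-1 (random forest) and layer-2 (KFDA-SVM) probabilities
    for the confusing activities and return the final class decisions."""
    return (w * p_rf + (1.0 - w) * p_svm).argmax(axis=1)
```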

Extraction of XR-KS Model Generalization Test
Research [50] shows that the generalizability of a model can be tested using K-fold cross-validation. Below, we validate the model. First, we conducted experiments using K-fold cross-validation with the random forest model, setting K = 5. The experimental results are shown in Figure 21 and Table 10, demonstrating the strong generalization capability of our model. Following the study of ROC curves by Fawcett [51], we assess the effectiveness of the SVM by analyzing the ROC values. The results obtained from the model with a training-to-testing ratio of 9:1 are presented in Figure 22, while the results of models with other ratios can be found in Table 11. The above results show that our model is well trained.
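A sketch of both checks, assuming scikit-learn utilities (our choice) and that forest and the selected feature matrix X_sel come from the first-layer sketch, while svm, Z_te, and y_te come from the second-layer sketch above:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import label_binarize

# 5-fold cross-validation of the layer-1 random forest (K = 5, as in the paper)
scores = cross_val_score(forest, X_sel, y, cv=5)
print("fold accuracies:", scores, "mean:", scores.mean())

# one-vs-rest ROC/AUC of the layer-2 SVM on the held-out confusing samples
classes = np.unique(y_te)
y_bin = label_binarize(y_te, classes=classes)
p_te = svm.predict_proba(Z_te)
for i, c in enumerate(classes):
    fpr, tpr, _ = roc_curve(y_bin[:, i], p_te[:, i])
    print(f"class {c}: AUC = {auc(fpr, tpr):.3f}")
```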

Conclusions
This study proposes an approach to identifying similar activities by introducing a multi-layer model called XR-KS. The approach involves filtering all the features of the data using the XGBoost feature selection algorithm and selecting a subset of features for random forest classification, which improves the efficiency of classification. Based on the classification results, we filtered out the data points that were difficult to identify and utilized the SVM model with kernel Fisher discriminant analysis to identify these points: we first applied kernel Fisher discriminant analysis to separate similar activities in the data and then used the SVM model to classify them, resulting in a satisfactory recognition outcome. Additionally, we employed K-fold cross-validation and ROC curves to validate our model separately; the results confirmed that our model exhibits strong generalization capabilities. Our method can identify similar human activities very well. However, we also observed that the recognition of A9 and A13 is not very good. Therefore, we will investigate a more suitable model and simulation algorithm to achieve higher accuracy in recognizing the human activities A9 and A13. This research, however, is subject to several limitations.

•
The differentiation of similar activities in the first and second tiers of this paper is based on subjective perception and data.
Our future research work can be focused on the following three points:

•

Extend the proposed technique to handle classification tasks involving similar activity data, such as typing and handwriting, and to handle generative tasks in challenging driving conditions, such as datasets with limited features and insufficient data samples.

•
Enabling the models in this article to automatically distinguish between similar human activities and automating the process of discrimination is a future research direction.

•
Starting with data collection, it is important to design and implement a robust and targeted sensor data collection scheme and algorithm. For instance, we should consider removing noise during data collection and focusing on collecting data that facilitate the identification of human activity.

Figure 1. Overview of the system workflow. (The collected human activity data are processed and then classified before utilizing the XR-KS model.)

Figure 2. XR-KS model workflow diagram. (The architecture of XR-KS. The first layer of this model relies on a random forest model based on the XGBoost feature selection algorithm; then, based on the results of the first layer, the second layer uses the support vector machine model based on kernel Fisher discriminant analysis to output the final result, which is very effective in classifying similar human activities.)

Figure 3. Pre- and post-preprocessing data from A1 to A19. (Minimizing and normalizing aim to scale data consistently, ensuring equal influence from variables in the model.) (a) Original data; (b) data after preprocessing.

Figure 4. Presentation of the random forest model based on the XGBoost feature selection algorithm workflow.



Figure 6. Confusion matrix of test set and training set random forest training. (The experimental setup is like the one in Figure 9.) (a) Confusion matrix of random forest model by train set; (b) confusion matrix of random forest model by test set.

Figure 7. Primary feature and PCA feature. (a) Origin data; (b) data after PCA.

Figure 8. SVM model workflow based on kernel Fisher discriminant analysis. (The second-layer classification model ensures close proximity for (a) similar activities. (b) After applying KFDA, (c) 3-axial data and images are generated. (d) The two-dimensional data for the 2-axial image, (e) SVM classification, (f) the final results.)

Figure 9. Comparison chart of random forest model recognition results. (The training and test set ratio was set to 9:1, and the experiment was repeated five times.) (a) Random forest model by train set; (b) random forest model by test set.

Figure 10. Transformation process illustration of a KFD model. A nonlinear mapping function φ(x) converts a nonlinear problem in the original (low-dimensional) input space into a linear problem in a (higher-dimensional) feature space (from [27]).

Figure 11. Results of kernel Fisher discriminant analysis with varying parameters. (Panels (a)–(f) show the results for different parameter settings.)

3.3.3. SVM Model Based on Kernel Fisher Discriminant Analysis

In order to further classify the confused actions into specific actions, this paper introduces the support vector machine (SVM) model as a sub-classification model for separating the confused actions. The principle of the SVM classifier is to find a hyperplane that separates the classes with the maximum margin.
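Continuing the sketch above, the KFDA projection can feed a standard SVM sub-classifier. The helpers kfda_direction and kfda_project from the earlier sketch are assumed to be in scope, and the kernel choice, gamma, and C are illustrative assumptions rather than the paper's tuned settings.

# Sub-classifier sketch: SVM trained on KFDA-projected features.
# Assumes kfda_direction / kfda_project from the earlier sketch are in scope;
# two synthetic blobs stand in for a pair of similar activities.
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_blobs(n_samples=600, centers=2, cluster_std=2.5, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=1)

alpha = kfda_direction(X_tr, y_tr, gamma=0.1)
z_tr = kfda_project(alpha, X_tr, X_tr, gamma=0.1).reshape(-1, 1)
z_te = kfda_project(alpha, X_tr, X_te, gamma=0.1).reshape(-1, 1)

# Maximum-margin separation in the 1-D projected space.
svm = SVC(kernel="rbf", C=1.0)
svm.fit(z_tr, y_tr)
print("sub-classifier accuracy:", accuracy_score(y_te, svm.predict(z_te)))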

Figure 12. Support vector machine classification flow chart. (The figure displays blue and red dots representing distinct classes, with the SVM iteratively seeking the optimal classification hyperplane through continuous refinement.)

Algorithm 2: SVM model based on kernel Fisher discriminant analysis. Input: the training samples and the kernel value. Output: the prediction of the SVM at a query point and the trained SVM model.

Figure 13. Support vector machine classification schematic. (Different colors show positive and negative samples. Hollow circles mark support vectors, the points closest to the separating hyperplane; solid markers represent instances farther away from the hyperplane.) Therefore, the characteristic samples in the sample set should satisfy the discriminant equation when multiplied by the corresponding coefficients.

Figure 14. Weights of important features and the effect of the number of features on the random forest model. (We varied the number of features from 1 to 46 in steps of 5 and, for each value of n, repeated the error experiment to obtain the mean and confidence interval.) (a) Result of the XGBoost feature selection algorithm; (b) effect of different numbers of features on the random forest model.

The histogram of important feature weights and the feature selection experiments are shown in Figure 14, and some of the experimental results are listed in Table 3. An illustrative sketch of this selection step follows.
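The sketch below shows one way to rank features by XGBoost importance and keep the top n for the random forest. The importance type, the value of n, and the synthetic data are illustrative assumptions.

# Feature selection sketch: rank features by XGBoost importance, keep top n,
# then train the random forest on the reduced feature set.
import numpy as np
from xgboost import XGBClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=46, n_informative=20,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)

xgb = XGBClassifier(n_estimators=100, max_depth=4, eval_metric="mlogloss")
xgb.fit(X_tr, y_tr)

n_keep = 20                                   # illustrative choice of n
top = np.argsort(xgb.feature_importances_)[::-1][:n_keep]
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_tr[:, top], y_tr)
print("accuracy with top-%d features:" % n_keep,
      accuracy_score(y_te, forest.predict(X_te[:, top])))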

Figure 15. Plot of the number of decision trees versus error in the random forest model. (We varied the number of decision trees from 1 to 50 in steps of 5 and, for each value of n, repeated the error experiment to obtain the mean and confidence interval.)

Figure 17. SVM preliminary segmentation result graph for A9, A13 and A18, A19. (The solid line is the main decision boundary, and the dashed lines are the margins.) (a) SVM model on the training set; (b) SVM model on the test set.

Figure 18. SVM preliminary segmentation result graph for A9 and A13. (The solid line is the main decision boundary, and the dashed lines are the margins.) (a) SVM model on the training set; (b) SVM model on the test set.

Figure 19. SVM preliminary segmentation result graph for A18 and A19. (The solid line is the main decision boundary, and the dashed lines are the margins.) (a) SVM model on the training set; (b) SVM model on the test set.

Figure 20. Confusion matrix diagram for the two-layer classifier model XR-KS. (a) Test set confusion matrix obtained after correction; (b) test set confusion matrix obtained after correction (%).

Figure 21. Cross-validation results with k = 5. (a) First-split training set confusion matrix; (b) first-split testing set confusion matrix; (c) second-split training set confusion matrix; (d) second-split testing set confusion matrix; (e) third-split training set confusion matrix; (f) third-split testing set confusion matrix; (g) fourth-split training set confusion matrix; (h) fourth-split testing set confusion matrix; (i) fifth-split training set confusion matrix; (j) fifth-split testing set confusion matrix.
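A k = 5 cross-validation of the first-layer model, producing per-split confusion matrices as in Figure 21, can be sketched as follows (synthetic data as a stand-in).

# 5-fold cross-validation sketch with per-split confusion matrices.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=2000, n_features=46, n_informative=20,
                           n_classes=4, random_state=0)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for split, (tr, te) in enumerate(skf.split(X, y), start=1):
    rf = RandomForestClassifier(n_estimators=50, random_state=0)
    rf.fit(X[tr], y[tr])
    print("split %d test confusion matrix:" % split)
    print(confusion_matrix(y[te], rf.predict(X[te])))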

Figure 22. ROC curves and AUC values of the SVM model on UCI DSA with train:test = 9:1. (a) ROC curves and AUC values for the SVM on the confused classes A9, A13 and A18, A19; (b) ROC curves and AUC values for the SVM on the confused classes A9 and A13; (c) ROC curves and AUC values for the SVM on the confused classes A18 and A19.

Table 2. Table of specific actions and their corresponding codes in the text.

Table 3. Number of features and error rate, accuracy rate, training times, and running time (mean ± std).

Table 4. Number of decision trees and error rate, accuracy rate, training times, and running time (mean ± std).

Table 5. Results of the random forest model based on the XGBoost feature selection algorithm on UCI DSA (mean ± std). (The experiments were set up with different ratios of test and validation sets and repeated five times.)

Table 6. Results of the random forest model based on the XGBoost feature selection algorithm across datasets: UCI DSA, UCI HAR, WISDM, and UCI ADL (mean ± std). (The experimental setup is the same as in Table 5.)

Table 7. Confusing action classification table.

Table 9. Comparison of the recognition accuracy of the proposed method with other state-of-the-art methods on the UCI DSA, UCI HAR, WISDM, and IM-WSHA datasets. (Bolding in the table indicates the method of this paper.)

Table 10. Mean (±std) results of the XR model under k = 5 cross-validation on UCI DSA. (Five repetitions of the experiment were conducted to obtain the mean and confidence intervals.)

Table 11. KS model AUC mean (±std) results on UCI DSA. (Five repetitions of the experiment were conducted to obtain the mean and confidence intervals.)
