Article

Continuous Estimation of sEMG-Based Upper-Limb Joint Angles in the Time–Frequency Domain Using a Scale Temporal–Channel Cross-Encoder

by Xu Han 1, Haodong Chen 2, Xinyu Cheng 1 and Ping Zhao 1,*
1 School of Mechanical Engineering, Hefei University of Technology, Hefei 230009, China
2 Seattle Children’s Hospital, Seattle, WA 98144, USA
* Author to whom correspondence should be addressed.
Actuators 2025, 14(8), 378; https://doi.org/10.3390/act14080378
Submission received: 20 June 2025 / Revised: 29 July 2025 / Accepted: 30 July 2025 / Published: 31 July 2025

Abstract

Surface electromyographic (sEMG) signal-driven joint-angle estimation plays a critical role in intelligent rehabilitation systems, as its accuracy directly affects both control performance and rehabilitation efficacy. This study proposes a continuous elbow-joint-angle estimation method based on time–frequency domain analysis. Raw sEMG signals were processed using the Short-Time Fourier Transform (STFT) to extract time–frequency features. A Scale Temporal–Channel Cross-Encoder (STCCE) network was developed, integrating temporal and channel attention mechanisms to enhance feature representation and establish the mapping from sEMG signals to elbow joint angles. The model was trained and evaluated on a dataset comprising approximately 103,000 samples collected from seven subjects. On the single-subject test sets, the proposed STCCE model achieved an average Mean Absolute Error (MAE) of 2.96° ± 0.24°, Root Mean Square Error (RMSE) of 4.41° ± 0.45°, Coefficient of Determination (R²) of 0.9924 ± 0.0020, and Correlation Coefficient (CC) of 0.9963 ± 0.0010. It achieved an MAE of 3.30°, RMSE of 4.75°, R² of 0.9915, and CC of 0.9962 on the multi-subject test set, and an average MAE of 15.53° ± 1.80°, RMSE of 21.72° ± 2.85°, R² of 0.8141 ± 0.0540, and CC of 0.9100 ± 0.0306 on the inter-subject test set. These results demonstrate that the STCCE model enables accurate joint-angle estimation in the time–frequency domain, contributing to better motion-intent perception for upper-limb rehabilitation.

1. Introduction

According to the World Stroke Organization (WSO), stroke is the second leading cause of death and disability worldwide [1]. Upper-limb hemiparesis is one of the most common motor impairments following stroke [2], and regaining upper-limb function is critical for restoring patients’ independence in daily life [3]. Intelligent rehabilitation systems [4,5,6,7,8], particularly upper-limb rehabilitation robots [9,10,11,12], have shown great potential in promoting neuroplasticity and functional recovery by delivering intensive, repetitive, and quantifiable training [13,14,15,16]. Achieving efficient and natural human–robot interactions is central to improving rehabilitation outcomes, and accurately perceiving the user’s motion intention is fundamental to this goal [17].
sEMG, as an electrophysiological representation of muscle activity on the skin surface, contains rich information about limb movement intention [18,19]. Its non-invasive and wearable characteristics, along with its ability to reflect movement intention at an early stage, make it particularly valuable in fields such as rehabilitation robotics, prosthetic control, and human–machine interaction [20,21]. In particular, for upper-limb exoskeletons and rehabilitation robotic systems, sEMG signals are widely regarded as a crucial input source for achieving natural and compliant control [22,23]. Among these applications, continuously estimating joint angles is critical for assisting in the execution of various functional movements [24].
In recent years, significant progress has been made in estimating joint motion information using sEMG. Early studies primarily focused on gesture recognition and discrete state estimation [25,26,27]. With the advancement of research, continuous joint-angle estimation has attracted increasing attention due to its ability to provide more refined motion control information [28]. In terms of feature extraction, researchers have explored various approaches to obtain more expressive and task-relevant features. For example, Xiao et al. extracted multiple time-domain features, including Mean Absolute Value (MAV), Zero Crossing (ZC), Waveform Length (WL), Slope Sign Changes (SSC), and Difference Absolute Standard Deviation Value (DASDV), and demonstrated their effectiveness in joint-angle estimation tasks [29]. Raj et al. used Integrated EMG (IEMG) and ZC as model inputs to estimate elbow joint displacement and velocity [30]. Time–frequency features, by simultaneously analyzing the variations of signals in both the time and frequency domains, can provide a more comprehensive description of the dynamic characteristics of sEMG signals, making them particularly suitable for decoding tasks involving non-stationary and continuous movements [31]. Several comparative studies have demonstrated that the incorporation of time–frequency information significantly improves classification accuracy and robustness in continuous motion estimation tasks [32]. Wen et al. extracted latent motion information from multi-scale time–frequency features using Variational Mode Decomposition (VMD) and Wavelet Packet Transform (WPT) and significantly improved continuous angle estimation performance through a Bidirectional LSTM (BiLSTM) network [33]. Alazrai et al. employed the Discrete Wavelet Transform (DWT) to construct time–frequency representations of sEMG signals and extracted time–frequency features to estimate the joint angles of the wrist and fingers [34]. Jiang et al. combined raw time-domain signals with frequency-domain features to achieve the high-precision continuous estimation of multi-joint angles, providing a promising solution for myoelectric prosthesis control [35]. Overall, the evolution of sEMG feature extraction, from basic time-domain descriptors to time–frequency analysis and multi-domain feature fusion, has greatly enhanced our ability to capture motion-related information embedded in the signals, laying a solid foundation for continuous joint-angle estimation.
In terms of estimation methods, advancements in machine learning and deep learning have significantly improved the accuracy of sEMG-based joint-angle estimation. Nonlinear models have demonstrated stronger fitting capabilities. For example, Zhang et al. employed the Whale Optimization Algorithm (WOA) to optimize a Support Vector Regression (SVR) model, reducing the RMSE of elbow-joint-angle estimation to 10.86° [36]. Artificial Neural Networks (ANNs) have also been widely explored to establish nonlinear mappings between sEMG features and joint angles [37,38]. Subsequently, deep learning models have shown better performance. In particular, Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term Memory (LSTM) [39] and BiLSTM [40], have proven highly effective in modeling temporal dependencies, making them well-suited for processing temporally correlated signals like sEMG. Ruan et al. applied an LSTM-based model using multi-channel time-domain sEMG features to simultaneously estimate elbow and wrist joint angles, outperforming conventional neural networks in both accuracy and stability [41]. To address the frequent synchronization issues between sEMG and joint-angle data during real-world acquisition, Ma et al. employed a BiLSTM model to estimate continuous shoulder and elbow joint movements under weakly synchronized conditions [42]. Convolutional Neural Networks (CNNs) have been used to extract spatial and temporal features from multi-channel sEMG signals. Hajian et al. proposed a Two-stream multi-scale Convolutional Neural Network (TS-CNN) architecture, which directly extracts and fuses hierarchical features from raw high-density EMG signals using convolutional kernels of different scales, enabling the simultaneous estimation of elbow joint angle and velocity [43]. Furthermore, due to variability in sEMG signals across individuals, various strategies have been explored to improve generalization. For example, some studies have explored fusing sEMG with EEG signals for multimodal learning [44] and employed domain adaptation methods to capture subject-invariant features [45]. Generally, deep learning models have gradually become the prevailing method for continuous sEMG-based joint-angle estimation and have shown significant advantages in terms of accuracy.
This study aims to develop a continuous elbow-joint-angle decoding method based on the time–frequency features of sEMG signals. A dataset is constructed using time–frequency features extracted by STFT. A Transformer-based [46] regression model is then built, consisting of two key modules: an independent per-channel temporal attention encoder, responsible for capturing the time–frequency dynamics of each sEMG channel; and a cross-channel attention encoder, designed to model the spatial relationships among multiple channels. Additionally, to accelerate model convergence, a feature scaling module is introduced to amplify the input feature magnitudes. The proposed method is evaluated on a dataset including seven subjects and a total of about 103,000 samples. Evaluations are conducted by training and testing in three settings: single-subject datasets, a mixed multi-subject dataset, and leave-one-subject-out datasets. The main contributions of this work can be summarized as follows:
  • This study constructs an sEMG–elbow-angle dataset consisting of over 100,000 samples collected from seven healthy subjects, providing a valuable data resource for continuous joint-angle estimation research.
  • A fixed Input Scaling operation is applied to amplify the time–frequency features, which accelerates model convergence and improves the accuracy of angle estimation.
  • We propose a novel STCCE, built upon a Transformer architecture, that integrates multi-scale temporal and channel attention mechanisms to effectively model the mapping from time–frequency sEMG features to joint angles.
This paper is organized as follows: Section 2 provides a detailed description of the data acquisition platform, participant information, data collection procedures, and the steps for data trimming and preprocessing. Section 3 describes the construction of the dataset, the proposed STCCE model structure, and the implementation and training details, as well as the evaluation metrics. Section 4 comprehensively shows and analyzes the test results of the model under single-subject, multi-subject, and inter-subject scenarios, and compares these results to those from other related studies. Section 5 concludes this work and discusses its current limitations and future research opportunities.

2. Experiment Setup and Data Collection

2.1. Experiment Platform

Figure 1a,b shows the experimental platform, including the sEMG and joint-angle sensors used in this study. The Myo armband (Thalmic Labs Inc., Kitchener, ON, Canada) contains eight sensors evenly distributed around its circumference. The sEMG signals are wirelessly transmitted to a computer, and real-time data at a sampling rate of 200 Hz can be obtained using the Myo Software Development Kit (SDK) v1.0 (2014). For joint-angle collection, a measurement device was developed using 3D printing, as shown in Figure 1c. The length of the device components can be adjusted according to the subject’s arm length to accommodate different individuals. Furthermore, a single-axis angle sensor (JY-ME02-CAN, WIT Motion, Shenzhen, China; shown in Figure 1b) was mounted at the elbow joint to measure flexion–extension angles. The sampling frequency of this sensor was also set to 200 Hz. In the experiment, the device was fastened to the participant’s arm with an elastic strap. The length of the adjustable components was set by aligning multiple fixed connection holes and sliding slots, ensuring that the angle sensor remained properly aligned with the elbow joint throughout the movement. Once aligned, the length was locked in place with bolts. This adjustment procedure was repeated for each participant.

2.2. Participants

In this study, seven healthy participants (three females and four males, 23–27 years old) were recruited. None of the participants had a history of neurological or muscular disorders. Each participant was assigned a numerical ID from 1 to 7, and their information is summarized in Table 1. Before the experiment, all participants were fully briefed on the experimental procedure and signed written informed consent forms. All experimental procedures were approved by the Institutional Review Board (IRB) of Hefei University of Technology on 19 June 2025, with Protocol No. HFUT20250619001H.

2.3. Data Acquisition

None of the subjects engaged in strenuous exercise during the 6 h before data collection. All subjects were seated and instructed to remain relaxed to avoid muscle tension that could interfere with the sEMG signals. The measurement device was first adjusted to an appropriate length to ensure that the rotational center of the participant’s elbow joint was aligned coaxially with the angle sensor. The Myo armband was worn on the upper arm, with Channel 4 positioned directly over the biceps and Channel 1 aligned with the medial head of the triceps. The specific movement process for each subject is described as follows and is illustrated in Figure 2.
  • Preparation state: The forearm hangs naturally with the palm facing forward, and the elbow flexion angle is approximately 10°–20°.
  • Start mark state: The forearm is extended to 0°.
  • Repeated flexion-extension: Starting from the Start mark state, perform the arm flexion and extension k times repeatedly. The maximum flexion angle is approximately 140°–150°; the minimum angle is 0° (some participants showed a brief hyperextension of the arm, with an actual angle less than 0°; however, we still treated it as 0°, because such a condition does not occur during rehabilitation exercises).
  • End mark state: The last forearm extension to 0° during the repetition process.
  • Restore to preparation state: The subject relaxes, and the elbow flexion angle is maintained at approximately 10°–20°.
Each completion of the above experiment process by a subject is defined as one record, with k = 12; therefore, each record contains 13 (i.e., k + 1) mark states. Each subject was required to perform 20 such records, with a 1 min rest interval between consecutive records to avoid muscle fatigue. Notably, we did not require participants to perform strict speed control; instead, participants were instructed to perform forearm flexion and extension naturally at a comfortable speed, focusing on the feeling of muscle activation. As a result, the amount of data collected from each participant is not strictly the same.

2.4. Data Trimming

Since the sEMG signals and joint-angle data are acquired independently and are not synchronized, trimming is required. For each record, we fit a sum of Gaussian kernel functions to the absolute value of the sEMG signal from one channel (corresponding to the triceps), as shown in Equation (1). Here, $K$ is the number of Gaussian kernel functions, which is set to 13 to match the number of mark states observed in the experiment. Each component is parameterized by its amplitude $A_k$, center $\mu_k$, and standard deviation $\sigma_k$, representing the weight, position, and spread of the $k$-th Gaussian kernel, respectively. The parameters $\{A_k, \mu_k, \sigma_k\}_{k=1}^{K}$ are estimated using a nonlinear least squares method that minimizes the error between the fitted curve $\{(x_i, f(x_i))\}_{i=1}^{N}$ and the observed data $\{(x_i, y_i)\}_{i=1}^{N}$, as shown in Equation (2). Based on the identified start and end mark state indices, all irrelevant data before the first start mark and after the last end mark are deleted.
$$f(x) = \sum_{k=1}^{K} A_k \cdot \exp\left(-\frac{(x - \mu_k)^2}{2\sigma_k^2}\right) \quad (1)$$
$$\min_{\{A_k, \mu_k, \sigma_k\}} \sum_{i=1}^{N} \left( y_i - f(x_i) \right)^2 \quad (2)$$
For angle data processing, we identified the indices of the 13 local minima. Similar to the sEMG processing, data outside the first and last minima are trimmed to match the sEMG signals. It should be noted that what we have achieved is not a strict temporal synchronization, but rather an alignment in terms of effect. That is, when the forearm was fully extended, the activation of the triceps muscle reached its peak.
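For illustration, the sketch below implements this trimming step with scipy.optimize.curve_fit for the nonlinear least squares of Equation (2). The initial guesses, the maxfev limit, and the helper names are our own assumptions rather than the authors’ implementation.

```python
import numpy as np
from scipy.optimize import curve_fit

K = 13  # number of Gaussian kernels, matching the 13 mark states per record

def gaussian_mixture(x, *params):
    # params = [A_1, mu_1, sigma_1, ..., A_K, mu_K, sigma_K], as in Eq. (1)
    y = np.zeros_like(x, dtype=float)
    for k in range(K):
        A, mu, sigma = params[3 * k:3 * k + 3]
        y += A * np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
    return y

def find_mark_centers(triceps_channel):
    """Fit Eq. (1) to the rectified triceps channel; return sorted kernel centers."""
    y = np.abs(np.asarray(triceps_channel, dtype=float))
    x = np.arange(len(y), dtype=float)
    # Initial guesses (illustrative): centers spread evenly, uniform amplitude/width.
    mu0 = np.linspace(0.0, len(y) - 1.0, K)
    p0 = np.ravel([[y.max(), mu, len(y) / (4.0 * K)] for mu in mu0])
    params, _ = curve_fit(gaussian_mixture, x, y, p0=p0, maxfev=20000)
    return np.sort(params[1::3])  # the centers mu_k locate the 13 extension peaks
```

The first and last centers then bound the span of valid data; samples outside them are discarded.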

2.5. Data Preprocessing

The sEMG signals collected from the Myo armband are internally processed and normalized to the range of [−1, 1], with a built-in 50 Hz notch filter to suppress power line interference. Subsequently, to remove low-frequency noise and baseline drift, a 20 Hz high-pass filter (fourth-order Butterworth) is applied to the signal, thereby preserving the high-frequency sEMG features that are more physiologically meaningful. The angle data collected in this study nominally share the same sampling frequency as the sEMG signals, but the two streams are trimmed independently. Therefore, to ensure precise alignment between them, we applied interpolation to the angle data; no additional processing is applied.
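A minimal sketch of this filtering step with scipy.signal follows. Whether the authors used zero-phase (filtfilt) or causal filtering is not stated, so the zero-phase choice here is an assumption.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 200.0  # Myo sampling rate (Hz)

# Fourth-order Butterworth high-pass at 20 Hz (Section 2.5).
b, a = butter(N=4, Wn=20.0, btype="highpass", fs=FS)

def highpass_semg(raw_semg):
    """raw_semg: (num_samples, 8) array in [-1, 1] from the armband."""
    return filtfilt(b, a, raw_semg, axis=0)  # zero-phase filtering (an assumption)
```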

3. Method

3.1. Dataset Construction

Considering the non-stationarity of sEMG signals in the time domain, we do not rely solely on time-domain features. Instead, STFT is applied to each of the 8 sEMG channels individually to capture the time–frequency characteristics of the signals, as shown in Equations (3) and (4).
$$X(m, i) = \sum_{n=0}^{N-1} x[n + mH] \cdot w[n] \cdot e^{-j \frac{2\pi}{N} i n} \quad (3)$$
$$A(m, i) = \left| X(m, i) \right| \quad (4)$$
The parameters are defined as follows: $x[n]$ denotes the raw sEMG signal, $w[n]$ is the window function, $N$ is the window length (the number of sampling points per frame), and $H$ is the frame shift, representing the sliding step size between adjacent frames. $m$ denotes the time frame index, and $i$ is the frequency index. $X(m, i)$ represents the complex spectral coefficient at the $i$-th frequency in the $m$-th frame. In this study, the Hanning window is used as the window function, with a window length of 40 data points. To ensure the spectral continuity of the frequency-domain features, a 75% overlap is applied between frames, resulting in a frame shift of 10 data points. We then extract the magnitude of each frequency component after the STFT as the features, and a sequence of 7 consecutive windows is used as the input. Each input has a shape of $[W, C, A]$, where $W$ is the window dimension, $C$ is the channel dimension, and $A$ is the frequency magnitude dimension. The corresponding output is a single angle value aligned with the end timestamp of the input EMG segment.
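A sketch of this feature extraction under the stated parameters is given below. Note that a 40-point real FFT yields 21 one-sided bins, while Table 2 lists a frequency dimension of 17, so a subset of bins was presumably retained; that selection is not specified in the paper and is omitted here.

```python
import numpy as np

N_WIN, HOP, W_FRAMES = 40, 10, 7  # window length, frame shift (75% overlap), input windows
window = np.hanning(N_WIN)

def stft_magnitude(x):
    """A(m, i) of Eqs. (3)-(4) for one channel; returns (n_frames, N_WIN // 2 + 1)."""
    n_frames = 1 + (len(x) - N_WIN) // HOP
    frames = np.stack([x[m * HOP:m * HOP + N_WIN] * window for m in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def build_samples(emg, angles):
    """emg: (num_samples, 8) filtered sEMG; angles: interpolated, sample-aligned."""
    spec = np.stack([stft_magnitude(emg[:, c]) for c in range(emg.shape[1])], axis=1)
    for m in range(W_FRAMES, spec.shape[0] + 1):
        x = spec[m - W_FRAMES:m]              # input of shape [W, C, A]
        t_end = (m - 1) * HOP + N_WIN - 1     # end timestamp of the EMG segment
        yield x, angles[t_end]
```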
It should be noted that, although the number of actions collected from each subject is the same, the speed of the actions varied. Therefore, the signal lengths of each subject are different, and, consequently, the sizes of their datasets are also different, as shown in Table 2. All datasets are split into training, validation, and test sets in a ratio of 0.7:0.15:0.15. To eliminate temporal dependencies within the data, a random partitioning strategy is adopted. In addition, we construct a mixed dataset containing data from all subjects, as well as leave-one-subject-out (LOSO) datasets for each subject, to evaluate the proposed method’s performance in both multi-subject and inter-subject scenarios.

3.2. Proposed Model

Considering the significant non-stationarity and distributional differences of sEMG signals across both temporal and channel dimensions, this study proposes an STCCE model for the continuous estimation of upper-limb elbow joint angles. The model fully leverages the time–frequency features of sEMG and the inter-channel correlations, integrating the global modeling capability of the Transformer Encoder architecture with the adaptive properties of attention mechanisms. The overall structure is illustrated in Figure 3, and the main modules are described as follows.

3.2.1. Input Scaling

We visualized the distribution of all input features from the constructed dataset after flattening, as shown in Figure 4. In this figure, the horizontal axis is divided into many equal-width bins, each representing a range of amplitude values of the time–frequency features. The vertical axis represents the ratio of the frequency to bin width. The frequency distribution histogram illustrates that the time–frequency amplitudes of the sEMG signals are highly concentrated within a narrow range around zero. Such numerical imbalance may adversely affect gradient propagation and model optimization, especially in the early training stages. To address this issue, a fixed linear scaling operation is applied to the raw time–frequency features extracted by STFT. This operation preserves the original distribution shape of the features while amplifying their magnitude distribution, thereby enhancing the numerical expressiveness of the inputs. As a result, the Input Scaling facilitates more stable gradient propagation and accelerates model convergence during training.
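As a sketch, the operation amounts to a single fixed multiplication (the factor of 50 is given in Section 3.3), applied identically at training and test time; it is not a learned or data-dependent normalization.

```python
SCALE = 50.0  # fixed factor; the value used in this study is given in Section 3.3

def scale_inputs(stft_features):
    # Amplifies the magnitudes while preserving the shape of their distribution.
    return stft_features * SCALE
```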

3.2.2. Per-Channel Temporal Attention Encoder

Considering the relative independence of sEMG signals across different channels, the model first performs separate encoding for each channel. Specifically, for the time-series input of each channel, a linear layer is applied to map the input into a high-dimensional representation space, followed by positional encoding to retain temporal order information. The encoded sequence is then modeled using a stack of Transformer Encoder layers. To more effectively extract information from key frames, an attention-based temporal weighted pooling mechanism is further introduced to adaptively model the importance of different time frames, thereby compressing the temporal sequence into a single channel-level feature vector.

3.2.3. Cross-Channel Attention Encoder

The high-dimensional representations of all channels obtained in the per-channel temporal attention encoder are concatenated to form a parallel channel feature, which is then fed into the second-stage cross-channel attention encoder. This module is designed to model the synergistic relationships among multiple sEMG channels and further extract fused features along the channel dimension. To more precisely control the weight of each channel during the fusion stage, a channel attention pooling mechanism, similar to the temporal one described earlier, is employed. This mechanism adaptively integrates channel features by computing attention-based channel importance weights.

3.2.4. Regression Head

The fused channel features are fed into a regression head, which consists of two fully connected layers for nonlinear transformation, ultimately producing a one-dimensional angle estimation output. A ReLU activation function and a dropout mechanism are applied between the two linear layers to prevent overfitting and enhance the model’s generalization capability.
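To make the architecture of Sections 3.2.1–3.2.4 concrete, the PyTorch sketch below assembles the four modules. The layer counts, feature dimension (512), dropout rate (0.2), and scaling factor (50) follow Section 3.3; the regression-head hidden size (128), the number of attention heads (8), the learned positional encoding, and the default feedforward width are our own assumptions, as the paper does not report them.

```python
import torch
import torch.nn as nn

class AttnPool(nn.Module):
    """Attention-weighted pooling over a sequence axis (time frames or channels)."""
    def __init__(self, d_model):
        super().__init__()
        self.score = nn.Linear(d_model, 1)

    def forward(self, x):                      # x: (batch, seq, d_model)
        w = torch.softmax(self.score(x), dim=1)
        return (w * x).sum(dim=1)              # (batch, d_model)

class STCCE(nn.Module):
    def __init__(self, n_channels=8, n_freq=17, n_frames=7, d_model=512,
                 n_layers=2, n_heads=8, dropout=0.2, scale=50.0):
        super().__init__()
        self.scale = scale                     # fixed Input Scaling factor (Section 3.3)

        def encoder():
            layer = nn.TransformerEncoderLayer(d_model, n_heads, dropout=dropout,
                                               batch_first=True)
            return nn.TransformerEncoder(layer, n_layers)

        # Per-channel temporal encoders with independent parameters (Section 3.3).
        self.proj = nn.ModuleList([nn.Linear(n_freq, d_model) for _ in range(n_channels)])
        # Learned positional encoding; the paper does not specify the type (assumption).
        self.pos = nn.Parameter(torch.zeros(1, n_frames, d_model))
        self.temporal = nn.ModuleList([encoder() for _ in range(n_channels)])
        self.t_pool = nn.ModuleList([AttnPool(d_model) for _ in range(n_channels)])
        # Cross-channel attention encoder and channel attention pooling.
        self.cross = encoder()
        self.c_pool = AttnPool(d_model)
        # Regression head: two linear layers with ReLU and dropout in between.
        self.head = nn.Sequential(nn.Linear(d_model, 128), nn.ReLU(),
                                  nn.Dropout(dropout), nn.Linear(128, 1))

    def forward(self, x):                      # x: (batch, W=7, C=8, A=17)
        x = x * self.scale                     # Input Scaling (Section 3.2.1)
        chans = []
        for c in range(x.shape[2]):
            h = self.proj[c](x[:, :, c, :]) + self.pos   # (batch, W, d_model)
            h = self.temporal[c](h)
            chans.append(self.t_pool[c](h))              # (batch, d_model)
        h = torch.stack(chans, dim=1)                    # (batch, C, d_model)
        h = self.cross(h)
        return self.head(self.c_pool(h)).squeeze(-1)    # (batch,) angle estimates
```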

3.3. Implementation and Training

The proposed STCCE network architecture was implemented in a PyTorch 2.7.1-based environment, with all model training and analysis conducted on an NVIDIA RTX 3080 Ti GPU (NVIDIA, Santa Clara, CA, USA). In terms of model design, the Input Scaling factor is fixed at 50. The Per-Channel Temporal Attention Encoder consists of two stacked Transformer Encoder layers, with independent parameters for each channel. Similarly, the Cross-Channel Attention Encoder employs two stacked Transformer Encoder layers to enhance inter-channel modeling. Each encoder uses a feature dimension of 512 to improve the model’s representational capacity and nonlinear fitting ability. Additionally, dropout operations with a dropout rate of 0.2 are applied throughout all encoder layers and the MLP regression head to prevent overfitting.
During model training, the network weights are updated using the Adaptive Moment Estimation (Adam) optimizer through backpropagation, with a fixed learning rate of 1 × 10⁻⁵ and no learning rate decay strategy applied. The exponential decay rates for the first and second moment estimates in Adam were set to β₁ = 0.9 and β₂ = 0.999, respectively. The batch size is set to 64, and the number of training epochs is set to 500.
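The training loop reduces to a standard supervised regression setup, sketched below. The paper does not name its loss function, so mean squared error is assumed here, and train_loader is a hypothetical DataLoader yielding the [W, C, A] inputs in batches of 64.

```python
import torch

model = STCCE()  # the sketch from Section 3.2
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5, betas=(0.9, 0.999))
criterion = torch.nn.MSELoss()  # loss not named in the paper; MSE assumed

for epoch in range(500):
    model.train()
    for x, y in train_loader:  # hypothetical DataLoader, batch size 64
        optimizer.zero_grad()
        loss = criterion(model(x.float()), y.float())
        loss.backward()
        optimizer.step()
```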
To comprehensively evaluate the performance of the STCCE model, we first construct individualized models for each subject by training separately on their respective datasets in order to assess intra-subject performance. Next, we combine the datasets from all subjects to train a unified model, thereby evaluating the performance in the multi-subject setting. Finally, to evaluate the model’s inter-subject generalization capability, we employ a leave-one-subject-out strategy: all data from one subject are reserved as the validation and test sets (1:1), while the data from the other subjects are combined to form the training set.

3.4. Evaluation Metric

To evaluate the proposed method, 4 commonly used regression performance metrics are adopted: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Coefficient of Determination (R²), and Pearson Correlation Coefficient (CC), as defined in Equations (5)–(8). MAE and RMSE reflect the overall error magnitude; R² measures the goodness of fit of the regression, with a range of [0, 1], where a value closer to 1 indicates stronger explanatory power of the model. CC quantifies the linear correlation between the predicted and true data, ranging from −1 to 1, where 1 indicates positive correlation, 0 indicates no correlation, and −1 indicates negative correlation.
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - x_i \right| \quad (5)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - x_i \right)^2} \quad (6)$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N} (x_i - y_i)^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \quad (7)$$
$$\mathrm{CC} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 \cdot \sum_{i=1}^{N} (y_i - \bar{y})^2}} \quad (8)$$
where $x_i$ denotes the ground-truth angle of the $i$-th sample, $y_i$ denotes the corresponding predicted value, $\bar{x}$ and $\bar{y}$ represent the mean values of all ground-truth and predicted values, respectively, and $N$ denotes the total number of samples in the test set.
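For reference, a direct NumPy transcription of Equations (5)–(8):

```python
import numpy as np

def regression_metrics(x, y):
    """x: ground-truth angles, y: predictions, per Eqs. (5)-(8)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mae = np.mean(np.abs(y - x))
    rmse = np.sqrt(np.mean((y - x) ** 2))
    r2 = 1.0 - np.sum((x - y) ** 2) / np.sum((x - x.mean()) ** 2)
    cc = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
        np.sum((x - x.mean()) ** 2) * np.sum((y - y.mean()) ** 2))
    return mae, rmse, r2, cc
```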

4. Results and Discussion

In this section, we present the results of elbow-joint-angle estimation using the proposed method under three scenarios: single-subject, multi-subject, and inter-subject. In the multi-subject scenario, we further discuss the effect of the Input Scaling module on the training performance of the STCCE. Finally, the results are quantitatively compared and analyzed using the evaluation metrics, and the model’s performance is benchmarked against some existing methods.

4.1. Single-Subject

To evaluate the model’s ability to learn the mapping between sEMG signals and elbow joint angles within individual subjects, experiments were first conducted in the single-subject training and testing scenario. Figure 5 illustrates the training and validation loss curves for the seven individual models. The black line represents the training loss, while the dark blue line represents the validation loss. It can be observed that the models converge rapidly without signs of overfitting. Figure 6 illustrates the trends of the MAE and RMSE on the validation set during training for the individually trained models of the seven subjects. It can be observed that the errors for all subjects decrease rapidly in the early training stages and gradually stabilize. Although there are slight differences in initial error levels and convergence speeds among subjects, both MAE and RMSE remain at low levels.
Table 3 shows the evaluation results under this setting. Across the seven subjects, the model achieved an average MAE of 2.96° ± 0.24°, RMSE of 4.41° ± 0.45°, R² of 0.9924 ± 0.0020, and CC of 0.9963 ± 0.0010 on the test set. These results demonstrate that the proposed STCCE model achieves high estimation accuracy and consistency in the single-subject scenario.

4.2. Multi-Subject

To further evaluate the model’s performance in multi-subject scenarios, a training experiment using data from all subjects was conducted. Similar to the single-subject training process, both the training and validation losses converged rapidly, and the validation error eventually stabilized without signs of overfitting, as shown in Figure 7. Table 4 presents the evaluation metrics under the multi-subject training scenario, with an MAE of 3.30°, RMSE of 4.75°, R² of 0.9915, and CC of 0.9962 on the test set. The results are similar to those obtained in the single-subject scenario, indicating that the proposed method maintains high estimation accuracy even in multi-subject settings. To further visualize the elbow-angle estimation performance, a segment of continuous data is randomly selected for comparison, as shown in Figure 8. It can be observed that the estimated values closely match the ground truth, demonstrating the model’s excellent accuracy and stability in angle estimation.
To evaluate the effectiveness of the Input Scaling module during training, this section compares the training and validation losses with and without the module. As shown in Figure 9, the model incorporating Input Scaling converges more rapidly in the early stages of training, exhibits a more stable validation loss throughout, and ultimately achieves a lower final loss than the model without the module. Notably, in the later stages of training, the model without Input Scaling continues to show a decreasing training loss while its validation loss has already converged, resulting in an increasing gap between the two. In contrast, the model with Input Scaling maintains consistent trends in both training and validation losses, with both converging to lower values. These findings indicate that the Input Scaling module not only accelerates convergence but also improves the model’s regression performance and enhances its generalization capability.

4.3. Inter-Subject

In the inter-subject scenario, the LOSO cross-validation strategy is adopted, where data from one subject are entirely excluded for validation and testing in each round, while the remaining subjects’ data are used for training. This setting is intended to evaluate the model’s generalization ability to unseen individuals. To avoid overfitting, an early stopping mechanism was employed, and the number of training epochs was limited to fewer than 15.
Figure 10 presents the test results of the model corresponding to L3. As shown in Table 5, compared to the single-subject and multi-subject tasks, the estimation accuracy in the inter-subject scenario declined, with an average MAE of 15.53° ± 1.80°, RMSE of 21.72° ± 2.85°, R² of 0.8141 ± 0.0540, and CC of 0.9100 ± 0.0306 on the test set. Specifically, RMSE and MAE increased significantly, and R² and CC exhibited greater fluctuations. These results indicate that substantial differences exist in the distribution of sEMG signals across subjects, and that the current model has not yet fully captured subject-invariant representations within the time–frequency feature space.
The notable decline in inter-subject performance is primarily attributed to the differences in sEMG signals across individuals, including variations in amplitude distribution, time–frequency patterns, and movement execution. These discrepancies cause the model to learn subject-dependent features during training, which limits its generalization ability when tested on unseen individuals. To address this issue, future work may consider incorporating strategies with stronger generalization capabilities to enhance model robustness to unseen subjects. On one hand, normalization methods (e.g., z-score normalization) can help reduce inter-subject amplitude variance. On the other hand, domain adaptation techniques (e.g., Transfer Component Analysis (TCA) [47], Domain-Adversarial Neural Network (DANN) [48]) may be employed to align feature distributions across subjects.

4.4. Compared to Other Methods

To enable a rigorous and consistent comparison, we reimplemented and evaluated the commonly used LSTM and BiLSTM models on our dataset using identical input features and evaluation protocols. Table 6 summarizes the average performance under three experimental scenarios: single-subject, multi-subject, and inter-subject estimation. All methods were trained and tested under the same settings.
As shown in Table 6, the proposed STCCE model achieves the best overall performance across all evaluation metrics, including MAE, RMSE, R 2 , and CC, under the three scenarios. In the single-subject and multi-subject settings, STCCE exhibits significant improvements over LSTM and BiLSTM, with notably lower estimation errors and stronger consistency with the ground truth, reflecting superior accuracy and stability. Although the inter-subject scenario presents greater challenges due to individual variability, STCCE still maintains a consistent advantage, outperforming the baselines across all metrics. However, the margin of improvement is less pronounced in this case, suggesting that, while the model generalizes well across subjects, inter-subject differences remain a limiting factor. These results collectively confirm the robustness and generalization ability of STCCE across diverse experimental conditions.
To further verify the statistical significance of the performance improvement, we conducted paired t-tests between STCCE and the baseline models based on MAE, as shown in Table 7. It is worth noting that, in the single-subject and inter-subject scenarios, one MAE value is computed per model for each subject, and these paired values are used to perform the t-test. In contrast, the multi-subject results are obtained through five-fold cross-validation, and the averaged MAE of each fold is used as a sample, resulting in five paired values per comparison. The results demonstrate that STCCE significantly outperforms both LSTM and BiLSTM across all three evaluation scenarios. In the single-subject and multi-subject settings, very high t-statistics (T > 15) and p-values below 1 × 10⁻⁵ indicate strong statistical significance. Even in the more challenging inter-subject scenario, where individual variability is higher, the improvements of STCCE over LSTM (T = 3.718, p = 0.00494) and BiLSTM (T = 3.73, p = 0.00487) remain statistically significant. These statistical results further confirm the reliability and robustness of the proposed STCCE model and its superiority over the baselines under various testing conditions.
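A minimal sketch of such a paired test with scipy.stats follows. The STCCE per-subject MAEs are taken from Table 3; the LSTM values are placeholders for illustration only, since the paper reports only scenario-level averages for the baselines.

```python
from scipy.stats import ttest_rel

# Per-subject MAE values (degrees): STCCE from Table 3;
# the LSTM list is hypothetical, for illustration of the procedure.
mae_stcce = [2.85, 2.88, 2.99, 3.51, 2.77, 2.94, 2.79]
mae_lstm = [10.2, 10.6, 10.1, 11.0, 10.3, 10.5, 10.1]

t_stat, p_value = ttest_rel(mae_lstm, mae_stcce)
print(f"T = {t_stat:.3f}, p = {p_value:.5f}")
```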

5. Conclusions

This study focuses on the continuous estimation of upper-limb elbow joint angles driven by sEMG signals and proposes a regression method based on time–frequency domain features. To address the non-stationarity and multi-channel distribution variability of sEMG signals, STFT is employed to extract time–frequency features. A multi-scale encoder network, termed STCCE, is designed by integrating temporal and channel attention mechanisms. The model includes a per-channel temporal modeling module and a cross-channel modeling module, enabling effective feature fusion across both temporal and spatial dimensions. Additionally, an input scaling module is introduced to enhance training efficiency and generalization performance.
Extensive experiments are conducted on a dataset comprising over 100,000 samples collected from seven subjects. The proposed method demonstrated strong performance in three representative scenarios: single-subject, multi-subject, and inter-subject. In the single-subject setting, the model achieved an average MAE of 2.96°, RMSE of 4.41°, R² of 0.9924, and CC of 0.9963. In the multi-subject training scenario, the model maintained high estimation accuracy, achieving an MAE of 3.30°, RMSE of 4.75°, R² of 0.9915, and CC of 0.9962. Although the estimation accuracy declined in the inter-subject scenario due to individual variability, the model still outperformed some existing methods, with an average MAE of 15.53°, RMSE of 21.72°, R² of 0.8141, and CC of 0.9100.
Although the proposed model demonstrates high accuracy under controlled conditions, there are some limitations regarding its real-world applicability. Since sEMG signals are sensitive to electrode placement, deviations in electrode positioning may lead to distribution shifts in the input features, thereby affecting the model’s performance. While the elastic design of the armband helps maintain a relatively consistent sensor layout, small variations in orientation or muscle contact may still introduce signal variability. These factors highlight the need for further robustness studies and adaptation mechanisms to ensure stable performance across different wearing conditions.
Moreover, inter-subject generalization remains a significant challenge. Differences in physiological factors, such as muscle geometry and skin impedance, can lead to substantial variability in sEMG patterns across subjects. These factors contribute to the performance gap observed when applying a model trained on some individuals to unseen ones. To mitigate this issue, potential solutions include transfer learning approaches that fine-tune the model using a small amount of data from the target subject and subject normalization techniques that reduce inter-subject variability at the feature level. In addition, lightweight calibration procedures or pretraining on large, diverse subject datasets may help improve robustness and facilitate practical deployment in real-world rehabilitation scenarios.
In conclusion, the proposed method effectively fuses the time–frequency features of sEMG signals and achieves continuous estimation of elbow joint angles. This contributes to more accurate and natural motion-intention perception in upper-limb rehabilitation systems. Future research will aim to address the challenge of inter-subject generalization by exploring a range of learning frameworks, such as transfer learning, meta-learning, and domain generalization. Our approach will consider both data-level and model-level strategies: at the data level, we will investigate normalization and alignment methods to reduce individual differences; at the model level, we will explore techniques that promote the learning of subject-invariant and generalizable representations.

Author Contributions

Conceptualization, X.H. and P.Z.; methodology, X.H. and H.C.; software, X.H.; validation, X.H. and X.C.; investigation, X.H. and X.C.; data curation, X.C.; writing—original draft preparation, X.H.; writing—review and editing, X.H. and P.Z.; visualization, X.H.; supervision, P.Z. and H.C.; funding acquisition, P.Z. All authors have read and agreed to the published version of the manuscript.

Funding

The research was funded by the Fundamental Research Funds for the Central Universities of China (Grant No. PA2025GDSK0060)—Anhui Province Key Laboratory of Digital Design and Manufacturing. All findings and results presented in this paper are those of the authors and do not represent the funding agencies.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of Hefei University of Technology (Protocol No. HFUT20250619001H from 19 June 2025).

Informed Consent Statement

Informed consent was obtained from all subjects involved in this study. Written informed consent was obtained from the participants to publish this paper.

Data Availability Statement

The data are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and publication of this article.

Abbreviations

The following abbreviations are used in this manuscript:
sEMG      Surface electromyographic
STFT      Short-Time Fourier Transform
STCCE     Scale Temporal–Channel Cross-Encoder
ZC        Zero Crossing
MAV       Mean Absolute Value
WL        Waveform Length
SSC       Slope Sign Changes
DASDV     Difference Absolute Standard Deviation Value
IEMG      Integrated EMG
VMD       Variational Mode Decomposition
WPT       Wavelet Packet Transform
DWT       Discrete Wavelet Transform
WOA       Whale Optimization Algorithm
SVR       Support Vector Regression
ANNs      Artificial Neural Networks
LSTM      Long Short-Term Memory
BiLSTM    Bidirectional LSTM
CNNs      Convolutional Neural Networks
TS-CNN    Two-stream multi-scale Convolutional Neural Network
SDK       Software Development Kit
LOSO      Leave-One-Subject-Out
MAE       Mean Absolute Error
RMSE      Root Mean Square Error
R²        Coefficient of Determination
CC        Pearson Correlation Coefficient
TCA       Transfer Component Analysis
DANN      Domain-Adversarial Neural Network

References

  1. Feigin, V.L.; Brainin, M.; Norrving, B.; Martins, S.; Sacco, R.L.; Hacke, W.; Fisher, M.; Pandian, J.; Lindsay, P. World Stroke Organization (WSO): Global Stroke Fact Sheet 2022. Int. J. Stroke 2022, 17, 18–29. [Google Scholar] [CrossRef]
  2. Ersoy, C.; Iyigun, G. Boxing Training in Patients with Stroke Causes Improvement of Upper Extremity, Balance, and Cognitive Functions but Should It Be Applied as Virtual or Real? Top. Stroke Rehabil. 2021, 28, 112–126. [Google Scholar] [CrossRef]
  3. Anwer, S.; Waris, A.; Gilani, S.O.; Iqbal, J.; Shaikh, N.; Pujari, A.N.; Niazi, I.K. Rehabilitation of Upper Limb Motor Impairment in Stroke: A Narrative Review on the Prevalence, Risk Factors, and Economic Statistics of Stroke and State of the Art Therapies. Healthcare 2022, 10, 190. [Google Scholar] [CrossRef] [PubMed]
  4. Zhang, Y.; Zhao, P.; Li, X.; Zhang, L.; Zhou, Y.; Wang, S. Design of MMSD Six-Bar Rehab Device toward the Realization of Multiple Gait Trajectories with One Adjustable Parameter. IEEE/ASME Trans. Mechatron. 2024, 29, 4309–4319. [Google Scholar] [CrossRef]
  5. Song, W.; Zhao, P.; Li, X.; Zhang, Y.; Wang, S. Data-Driven Design of a Six-Bar Lower-Limb Rehabilitation Mechanism Based on Gait Trajectory Prediction. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 31, 109–118. [Google Scholar] [CrossRef] [PubMed]
  6. Zhao, P.; Zhang, Y.; Guan, H.; Li, X.; Wang, S. Design of a Single-Degree-of-Freedom Immersive Rehabilitation Device for Clustered Upper-Limb Motion. J. Mech. Robot. 2021, 13, 031006. [Google Scholar] [CrossRef]
  7. Zhao, P.; Zhu, L.; Zi, B.; Zhang, Y.; Wang, S. Design of Planar 1-DOF Cam-Linkages for Lower-Limb Rehabilitation via Kinematic-Mapping Motion Synthesis Framework. J. Mech. Robot. 2019, 11, 041006. [Google Scholar] [CrossRef]
  8. Chen, H.; Zhu, H.; Teng, Z.; Xie, L.; Song, A. Design of a Robotic Rehabilitation System for Mild Cognitive Impairment Based on Computer Vision. J. Eng. Sci. Med. Diagn. Ther. 2020, 3, 021108. [Google Scholar] [CrossRef]
  9. Inoue, Y.; Kuroda, Y.; Yamanoi, Y.; Okajima, Y.; Tsuji, T. Development of Wrist Separated Exoskeleton Socket of Myoelectric Prosthesis Hand for Symbrachydactyly. Cyborg Bionic Syst. 2024, 5, 0141. [Google Scholar] [CrossRef]
  10. Kuroda, Y.; Yamanoi, Y.; Jiang, H.; Inoue, Y.; Tsuji, T. Toward Cyborg: Exploring Long-Term Clinical Outcomes of a Multi-Degree-of-Freedom Myoelectric Prosthetic Hand. Cyborg Bionic Syst. 2025, 6, 0195. [Google Scholar] [CrossRef]
  11. Hu, K.; Ma, Z.; Zou, S.; Zhu, Y.; Tao, B.; Zhang, D. Impedance Sliding-Mode Control Based on Stiffness Scheduling for Rehabilitation Robot Systems. Cyborg Bionic Syst. 2024, 5, 0099. [Google Scholar] [CrossRef]
  12. Chen, W.; Song, W.; Chen, H.; Xie, L.; Wang, S. Motion Synthesis for Upper-Limb Rehabilitation Motion with Clustering-Based Machine Learning Method. In Proceedings of the ASME International Mechanical Engineering Congress and Exposition, Salt Lake City, UT, USA, 8–14 November 2019; American Society of Mechanical Engineers: New York, NY, USA, 2019; Volume 59407, p. V003T04A066. [Google Scholar]
  13. Qassim, H.M.; Wan Hasan, W.Z. A Review on Upper Limb Rehabilitation Robots. Appl. Sci. 2020, 10, 6976. [Google Scholar] [CrossRef]
  14. Colombo, R.; Pisano, F.; Micera, S.; Mazzone, A.; Delconte, C.; Carrozza, M.C.; Dario, P.; Minuco, G. Robotic Techniques for Upper Limb Evaluation and Rehabilitation of Stroke Patients. IEEE Trans. Neural Syst. Rehabil. Eng. 2005, 13, 311–324. [Google Scholar] [CrossRef] [PubMed]
  15. Ghai, S.; Ghai, I.; Lamontagne, A. Virtual Reality Training Enhances Gait Poststroke: A Systematic Review and Meta-Analysis. Ann. N. Y. Acad. Sci. 2020, 1478, 18–42. [Google Scholar] [CrossRef] [PubMed]
  16. Ghai, S.; Ghai, I. Effects of (Music-Based) Rhythmic Auditory Cueing Training on Gait and Posture Post-Stroke: A Systematic Review & Dose-Response Meta-Analysis. Sci. Rep. 2019, 9, 2183. [Google Scholar]
  17. Zhang, T.; Sun, H.; Zou, Y. An Electromyography Signals-Based Human-Robot Collaboration System for Human Motion Intention Recognition and Realization. Robot. Comput.-Integr. Manuf. 2022, 77, 102359. [Google Scholar] [CrossRef]
  18. Zhang, X.; Qu, Y.; Zhang, G.; Wang, Z.; Chen, C.; Xu, X. Review of sEMG for Exoskeleton Robots: Motion Intention Recognition Techniques and Applications. Sensors 2025, 25, 2448. [Google Scholar] [CrossRef]
  19. Khairuddin, I.M.; Sidek, S.N.; Majeed, A.P.P.A.; Razman, M.A.M.; Puzi, A.A.; Yusof, H.M. The Classification of Movement Intention through Machine Learning Models: The Identification of Significant Time-Domain EMG Features. PeerJ Comput. Sci. 2021, 7, e379. [Google Scholar] [CrossRef]
  20. Li, Z.Y.; Zhao, X.G.; Zhang, B.; Ding, Q.C.; Zhang, D.H.; Han, J.D. Review of sEMG-Based Motion Intent Recognition Methods in Non-Ideal Conditions. Acta Autom. Sin. 2021, 47, 955–969. [Google Scholar]
  21. Li, L.L.; Cao, G.Z.; Liang, H.J.; Zhang, Y.P.; Cui, F. Human Lower Limb Motion Intention Recognition for Exoskeletons: A Review. IEEE Sens. J. 2023, 23, 30007–30036. [Google Scholar] [CrossRef]
  22. Liu, H.; Tao, J.; Lyu, P.; Tian, F. Human-Robot Cooperative Control Based on sEMG for the Upper Limb Exoskeleton Robot. Robot. Auton. Syst. 2020, 125, 103350. [Google Scholar] [CrossRef]
  23. Kiguchi, K.; Hayashi, Y. An EMG-Based Control for an Upper-Limb Power-Assist Exoskeleton Robot. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2012, 42, 1064–1071. [Google Scholar] [CrossRef]
  24. Aung, Y.M.; Al-Jumaily, A. Estimation of Upper Limb Joint Angle Using Surface EMG Signal. Int. J. Adv. Robot. Syst. 2013, 10, 369. [Google Scholar] [CrossRef]
  25. Ding, Z.; Yang, C.; Tian, Z.; Yi, C.; Fu, Y.; Jiang, F. sEMG-Based Gesture Recognition with Convolution Neural Networks. Sustainability 2018, 10, 1865. [Google Scholar] [CrossRef]
  26. Zhang, L.; Liu, G.; Han, B.; Wang, Z.; Zhang, T. sEMG-Based Human Motion Intention Recognition. J. Robot. 2019, 2019, 3679174. [Google Scholar] [CrossRef]
  27. Wei, W.; Wong, Y.; Du, Y.; Hu, Y.; Kankanhalli, M.; Geng, W. A Multi-Stream Convolutional Neural Network for sEMG-Based Gesture Recognition in Muscle-Computer Interface. Pattern Recognit. Lett. 2019, 119, 131–138. [Google Scholar] [CrossRef]
  28. Wei, Z.; Zhang, Z.Q.; Xie, S.Q. Continuous Motion Intention Prediction Using sEMG for Upper-Limb Rehabilitation: A Systematic Review of Model-Based and Model-Free Approaches. IEEE Trans. Neural Syst. Rehabil. Eng. 2024, 32, 1487–1504. [Google Scholar] [CrossRef] [PubMed]
  29. Xiao, F.; Wang, Y.; Gao, Y.; Zhu, Y.; Zhao, J. Continuous Estimation of Joint Angle from Electromyography Using Multiple Time-Delayed Features and Random Forests. Biomed. Signal Process. Control 2018, 39, 303–311. [Google Scholar] [CrossRef]
  30. Raj, R.; Sivanandan, K.S. Comparative Study on Estimation of Elbow Kinematics Based on EMG Time Domain Parameters Using Neural Network and ANFIS NARX Model. J. Intell. Fuzzy Syst. 2017, 32, 791–805. [Google Scholar] [CrossRef]
  31. Karheily, S.; Moukadem, A.; Courbot, J.B.; Abdeslam, D.O. sEMG time–frequency features for hand movements classification. Expert Syst. Appl. 2022, 210, 118282. [Google Scholar] [CrossRef]
  32. Adzkia, M.; Setiawan, A.W.; Arland, F. Comparation Classification of EMG Signals in the Time Domain and Time-Frequency Domain. In Proceedings of the 2023 International Conference on Electrical Engineering and Informatics (ICEEI), Bandung, Indonesia, 10–11 October 2023; IEEE: New York, NY, USA, 2023; pp. 1–5. [Google Scholar]
  33. Wen, L.; Xu, J.; Li, D.; Pei, X.; Wang, J. Continuous Estimation of Upper Limb Joint Angle from sEMG Based on Multiple Decomposition Feature and BiLSTM Network. Biomed. Signal Process. Control 2023, 80, 104303. [Google Scholar] [CrossRef]
  34. Alazrai, R.; Alabed, D.; Alnuman, N.; Khalifeh, A.; Mowafi, Y. Continuous Estimation of Hand’s Joint Angles from sEMG Using Wavelet-Based Features and SVR. In Proceedings of the 4th Workshop on ICTs for Improving Patients Rehabilitation Research Techniques, Lisbon, Portugal, 13–14 October 2016; pp. 65–68. [Google Scholar]
  35. Jiang, H.; Yamanoi, Y.; Chen, P.; Wang, X.; Chen, S.; Xu, Y.; Li, G.; Yokoi, H.; Jing, X. TF2AngleNet: Continuous Finger Joint Angle Estimation Based on Multidimensional Time–Frequency Features of sEMG Signals. Biomed. Signal Process. Control 2025, 107, 107833. [Google Scholar] [CrossRef]
  36. Zhang, L.; Wang, J.; Liu, J.; Chen, W. Estimation of Joint Angle Using sEMG Based on WOA-SVR Algorithm. In Proceedings of the 2023 IEEE 18th Conference on Industrial Electronics and Applications (ICIEA), Ningbo, China, 18–22 August 2023; IEEE: New York, NY, USA, 2023; pp. 1674–1679. [Google Scholar]
  37. Aung, Y.M.; Al-Jumaily, A. sEMG Based ANN for Shoulder Angle Prediction. Procedia Eng. 2012, 41, 1009–1015. [Google Scholar] [CrossRef]
  38. Li, D.; Zhang, Y. Artificial Neural Network Prediction of Angle Based on Surface Electromyography. In Proceedings of the 2011 International Conference on Control, Automation and Systems Engineering (CASE), Singapore, 30–31 July 2011; IEEE: New York, NY, USA, 2011; pp. 1–3. [Google Scholar]
  39. Graves, A. Long Short-Term Memory. In Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012; pp. 37–45. [Google Scholar]
  40. Graves, A.; Schmidhuber, J. Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures. Neural Netw. 2005, 18, 602–610. [Google Scholar] [CrossRef] [PubMed]
  41. Ruan, Z.; Ai, Q.; Chen, K.; Ma, L.; Liu, Q.; Meng, W. Simultaneous and Continuous Motion Estimation of Upper Limb Based on sEMG and LSTM. In Proceedings of the 14th International Conference on Intelligent Robotics and Applications (ICIRA 2021), Yantai, China, 22–25 October 2021; Springer: Cham, Switzerland, 2021. Part I. pp. 313–324. [Google Scholar]
  42. Ma, C.; Lin, C.; Samuel, O.W.; Guo, W.; Zhang, H.; Greenwald, S.; Xu, L.; Li, G. A Bi-Directional LSTM Network for Estimating Continuous Upper Limb Movement from Surface Electromyography. IEEE Robot. Autom. Lett. 2021, 6, 7217–7224. [Google Scholar] [CrossRef]
  43. Hajian, G.; Morin, E. Deep Multi-Scale Fusion of Convolutional Neural Networks for EMG-Based Movement Estimation. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 30, 486–495. [Google Scholar] [CrossRef]
  44. Silva-Acosta, V.C.; Román-Godínez, I.; Torres-Ramos, S.; Salido-Ruiz, R.A. Automatic Estimation of Continuous Elbow Flexion–Extension Movement Based on Electromyographic and Electroencephalographic Signals. Biomed. Signal Process. Control 2021, 70, 102950. [Google Scholar] [CrossRef]
  45. Li, H.; Guo, S.; Wang, H.; Bu, D. Subject-Independent Continuous Estimation of sEMG-Based Joint Angles Using Both Multisource Domain Adaptation and BP Neural Network. IEEE Trans. Instrum. Meas. 2022, 72, 1–10. [Google Scholar] [CrossRef]
  46. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
  47. Pan, S.J.; Tsang, I.W.; Kwok, J.T.; Yang, Q. Domain Adaptation via Transfer Component Analysis. IEEE Trans. Neural Netw. 2010, 22, 199–210. [Google Scholar] [CrossRef]
  48. Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; Lempitsky, V. Domain-Adversarial Training of Neural Networks. J. Mach. Learn. Res. 2016, 17, 1–35. [Google Scholar]
Figure 1. Experiment platform setting: (a) Schematic of data acquisition. (b) Myo armband and single-axis angle sensor. (c) Angle measurement device.
Figure 2. Movement process for acquiring data.
Figure 3. Overview of proposed STCCE model.
Figure 4. Distribution of STFT feature values before scaling.
Figure 5. Single-subject model training and validation loss.
Figure 6. Single-subject model validation MAE and RMSE during training.
Figure 7. Training and validation performance in multi-subject scenarios.
Figure 8. Comparison between ground truth and estimation values.
Figure 9. Effect of the Input Scaling on model training and validation loss. Input Scaling: multiply the input data by a fixed scale.
Figure 10. L3 inter-subject angle estimation results.
Table 1. Information of the subjects.

Subject   Gender   Age   Height (m)   Weight (kg)
1         male     27    1.86         72
2         male     23    1.80         68
3         male     24    1.68         68
4         male     26    1.75         70
5         female   27    1.70         58
6         female   26    1.63         55
7         female   24    1.64         56
Table 2. The size of datasets for each subject.

Subject   Training Set (70%)   Validation Set (15%)   Test Set (15%)
1         11,089               2376                   2377
2         8354                 1790                   1791
3         9696                 2078                   2078
4         10,371               2222                   2223
5         9695                 2078                   2078
6         10,977               2352                   2353
7         12,019               2576                   2576

Sample shape (all subjects): Input [7, 8, 17]; Output [1].
Table 3. Results of the single-subject test set.

Subject   MAE (°)   RMSE (°)   R²       CC
1         2.85      4.55       0.9928   0.9965
2         2.88      4.07       0.9926   0.9964
3         2.99      4.64       0.9910   0.9956
4         3.51      5.33       0.9884   0.9942
5         2.77      3.95       0.9933   0.9968
6         2.94      4.36       0.9933   0.9969
7         2.79      3.99       0.9951   0.9977
Table 4. Results of the multi-subject test set.

Dataset         MAE (°)   RMSE (°)   R²       CC
Multi-subject   3.30      4.75       0.9915   0.9962
Table 5. Results of the inter-subject test set.

Subjects   MAE (°)   RMSE (°)   R²       CC
L1         14.69     21.45      0.8397   0.9178
L2         13.85     18.20      0.8542   0.9314
L3         13.64     19.38      0.8445   0.9408
L4         16.99     24.33      0.7572   0.8909
L5         17.51     25.13      0.7315   0.8569
L6         18.14     24.97      0.7799   0.8861
L7         13.87     18.55      0.8915   0.9464

Li means leaving out the dataset of subject i (i = 1–7) for validation and testing.
Table 6. Comparison of estimation performance with other methods.

Research     Method   Scenario         MAE (°)   RMSE (°)   R²       CC
[44]         LSTM     Single-subject   10.40     14.08      0.9221   0.9600
                      Multi-subject    10.54     14.94      0.9165   0.9574
                      Inter-subject    18.47     23.79      0.8046   0.8951
[33]         BiLSTM   Single-subject   10.41     14.06      0.9219   0.9601
                      Multi-subject    10.64     15.03      0.9154   0.9568
                      Inter-subject    18.48     23.76      0.7797   0.8964
This paper   STCCE    Single-subject   2.96      4.41       0.9924   0.9963
                      Multi-subject    3.30      4.75       0.9915   0.9962
                      Inter-subject    15.53     21.72      0.8141   0.9100
Table 7. Paired t-test results based on MAE.

Methods            Scenario         T-Statistic   p-Value
LSTM vs. STCCE     Single-subject   15.41         2.36 × 10⁻⁶
                   Multi-subject    87.02         5.23 × 10⁻⁸
                   Inter-subject    3.718         0.00494
BiLSTM vs. STCCE   Single-subject   15.88         1.98 × 10⁻⁶
                   Multi-subject    158.93        4.70 × 10⁻⁹
                   Inter-subject    3.73          0.00487
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
