Lower Limb Joint Torque Prediction Using Long Short-Term Memory Network and Gaussian Process Regression

The accurate prediction of joint torque is required in various applications. Some traditional methods, such as the inverse dynamics model and the electromyography (EMG)-driven neuromusculoskeletal (NMS) model, depend on ground reaction force (GRF) measurements and involve complex optimization solution processes, respectively. Recently, machine learning methods have been popularly used to predict joint torque with surface electromyography (sEMG) signals and kinematic information as inputs. This study aims to predict lower limb joint torque in the sagittal plane during walking, using a long short-term memory (LSTM) model and Gaussian process regression (GPR) model, respectively, with seven characteristics extracted from the sEMG signals of five muscles and three joint angles as inputs. The majority of the normalized root mean squared error (NRMSE) values in both models are below 15%, most Pearson correlation coefficient (R) values exceed 0.85, and most decisive factor (R2) values surpass 0.75. These results indicate that the joint prediction of torque is feasible using machine learning methods with sEMG signals and joint angles as inputs.


Introduction
The real-time and accurate prediction of lower limb joint torque has important research significance in many fields.In sports rehabilitation, it serves as a foundation for the understanding of changes in people's muscle strength, and it enables doctors to guide the process of rehabilitation training effectively [1,2].In human-machine interaction systems, it forms the basis for machines to discern human motor intentions and adjust assistance strategies promptly [3][4][5].Unfortunately, the use of mechanical torque sensors to measure subjects' active joint torque is challenging because the sensed torque signals include undesired torque components, such as gravity torque, friction torque, and inertial torque, which need to be eliminated using complicated processing algorithms [6].Currently, three methods are employed to compute joint torque, namely the inverse dynamics model, EMG-driven NMS model, and machine learning model.
The inverse dynamics model [7,8] is considered a standard method that requires a complex experimental environment and equipment.Two common approaches to developing an inverse dynamics model are the Newton-Euler equations and Lagrange equations.Both methods rely on GRF measurements, which are typically only available in laboratory settings.
The EMG-driven NMS model builds upon the muscle-tendon model, which can be traced back to the Hill-type muscle model proposed in 1938 [9].At present, the improved model proposed by Buchanan and Lloyd is the most widely used [10,11].However, the EMG-driven NMS model faces three major challenges.Firstly, measuring individual physiological parameters in vivo is difficult.Secondly, the model requires input from all muscles involved in joint motion.Thirdly, the iterative optimization process for model parameters is tedious.
In recent years, machine learning models have gained popularity in many research fields due to their ability to learn from large amounts of data without relying on explicit equations [12][13][14].As sEMG signals exhibit smaller time delays and a higher signal-to-noise ratio (SNR), it may be an ideal method to use sEMG signals to estimate joint torque [6].Meanwile, user-friendly wireless sEMG sensors make it possible to accurately measure sEMG signals during human movement.Therefore, various non-model-based machine learning regression methods, such as regression trees (RT) [15][16][17], support vector machines (SVM) [18][19][20], neural networks (NN) [21][22][23], and Gaussian process regression (GPR) [24][25][26], have been applied to predict joint torque using sEMG signals and motion information (joint angles, angular velocity, angular acceleration, etc.), which can be measured with portable devices such as sEMG sensors integrated with inertial measurement units (IMUs).
Among neural network models, the long short-term memory (LSTM) model has shown excellent performance in time series prediction tasks, as it effectively captures and remembers long-term dependencies through its gating unit design.The use of an LSTM model to predict joint torque with sEMG signals and motion information is a suitable choice, as all these variables are time series data related to human movement.Previous studies have demonstrated the effectiveness of using LSTM models for torque estimation.Siu et al. [27] found that an LSTM model used to estimate ankle torque with sEMG signals and accelerometry as inputs outperformed other methods, such as dense feedforward neural networks (FNN), convolutional neural networks (CNNs) [28,29], and neural ordinary differential equations (ODEs).Zhang et al. [30] observed that an LSTM model could predict lower limb joint torque during various activities accurately, with a relatively low error, using sEMG signals and joint angles as inputs.Truong et al. [31] extracted several sEMG features to predict joint angles and joint torque using an LSTM model when squatting, picking up an object, and sitting-standing.
Unlike the point prediction in NNs, GPR can not only predict the value of the target variable but also estimate the uncertainty of the prediction.It provides reliability to the prediction by calculating the confidence interval.Yang et al. [32] estimated the joint torque with a GPR model using the GRF and foot motion from wearable smart shoes while walking at three different speeds.Most R 2 values in their experiment were higher than 0.8, indicating good predictive performance.Ullauri et al. [33] compared a GPR model and pneumatic artificial muscle (PAM) model with muscle activation calculated by measuring sEMG signals to predict elbow torque, and they found that GPR provided relatively more favorable predictions.
Based on previous research about joint torque estimation by BPNN learning [34], this study aims to predict the joint torque during walking using an LSTM model and GPR model with sEMG signals and joint angles as inputs.Figure 1 shows the workflow of this work.This study can be referenced for the prediction of joint torque when only sEMG signals and joint angles are available and the GRF cannot be measured.The main contributions of this work are as follows: This article is structured as follows.Section 2 describes the methods of data collection and data processing, and the principles of the LSTM model and GPR model.Section 3 shows the results of the two models.Section 4 discusses the performance and limitations of the two models.Conclusions and future work are given in Section 5.

Data Collection
Four healthy young male subjects (mean ± STD, age ± 0.5 years, height ± 4.30 kg, weight ± 3.0, Table 1) were recruited to participate in the data collection experiment.All subjects were asked to perform 5 walking trials at a speed of 0.8 m/s on a 5-m flat surface with 7 embedded force platforms.sEMG data, force data, and motion capture data were collected simultaneously.The collected data were divided according to the gait cycle (from left foot toe-off to the subsequent left toe-off).Multiple data of 5 complete walking gait cycles for each subject were acquired.2) at 2000 Hz.For each subject, 16 sEMG signals were collected simultaneously from the gluteus maximus (GMX), rectus femoris (RF), vastus medialis (VM), vastus lateralis (VL), biceps femoris (BF), semitendinosus (ST), tibialis anterior (TA), and gastrocnemius (GC) of both the left and right legs. Figure 3 illustrates the locations where the EMG sensors were placed on the subject's body.sEMG signals from GMX, RF, BF, TA, and GC from the left leg were used in this study.

Force and Motion Data Collection
Force and motion data were collected using a 3D motion capture system (Qualisys, Goteborg, Sweden, Figure 4) at 100 Hz.Subjects were asked to perform movements with their legs stepping on different force plates.A total of 51 reflective markers were fixed on the subjects to record motion trajectories (Figure 3).The GRF data were collected by 7 platforms and the marker trajectories were captured by 41 cameras.The sEMG signals, joint angles, and joint torque of the left leg of each subject were studied in this work.

•
MAV: where X represents the EMG signal in a movable window and N represents the window length.• RMS: • ZC: where • SSC: • WL: • MNF: where f j is the jth frequency component, P j is the power spectrum at f j , and M is the total number of frequency components.• MDF: EMG signals were downsampled to 100 Hz after feature extraction to correspond to force and motion data.

Inverse Kinematics
Joint angles were calculated by the inverse kinematics toolbox in OpenSim 4.4 (an open-source software system for biomechanical modeling, simulation, and analysis, SimTK, Stanford, CA, USA).The Gait2392_Simbody model was chosen for the research in this study.First, for each subject, a scaled model was built to match the markers on his body best.Then, the joint angles were obtained by optimizing the error between the virtual marker positions on the scaled model and the real marker positions from experiments.The detailed objective function is as follows: where ω i is the weight of the marker i, q is the joint angle vector being solved for, x exp i is the experimental position of marker i, x i (q) is the virtual position of marker i at given joint angle vector q, and • represents the standard Euclidean norm.

Inverse Dynamics
Joint torque was calculated by the inverse dynamics toolbox in OpenSim 4.4 after joint angles were obtained.The classical equations of motion can be written as follows: where q, q, q are angles, angular velocities, and angular accelerations, respectively; M(q) is the mass matrix; C(q, q) is the vector of Coriolis and centrifugal torque; G(q) is the vector of gravitational torque; and τ is the vector of the unknown generalized torque.The features of sEMG signals, the joint angles, and the joint torque were normalized by the maximum value of each channel.The input signal was 38-dimensional (7 features of 5 sEMG signals and 3 joint angles).

LSTM Neural Network Model
LSTM is a type of recurrent neural network (RNN) that is designed to overcome the limitations of traditional RNNs in capturing long-term dependencies in sequential data.LSTM units, which are the building blocks of an LSTM network, are more complex than traditional RNN units.A typical LSTM unit is composed of a cell state, an input gate, an output gate, and a forget gate [40,41], as Figure 5 shows.These components work together to process sequential data and maintain information over long periods.The specific forward propagation formulas of an LSTM unit are as follows.First, the forget gate decides which information to discard in the last cell state,

  
where f t ∈ (0, 1) h represents the forget gate's activation vector, x t ∈ R d represents the input with d features at time t, h t−1 ∈ R h represents the output at time t − 1, W f ∈ R h×d and U f ∈ R h×h represent the weight matrix, b f ∈ R h represents the bias vector, and σ represents the sigmoid activation function.Second, the input gate determines which information to keep, where i t ∈ (0, 1) h represents the update gate's activation vector, ct ∈ (−1, 1) h stands for the cell input activation vector, W i , W c ∈ R h×d , U i , U c ∈ R h×h represents the weight matrix, b i , b c ∈ R h represents the bias vector, and tanh represents the hyperbolic tangent activation function.Then, the cell state is updated from time t − 1 to time t, where c t ∈ R h is the cell state vector at time t, and c t−1 is the cell state vector at time t − 1.
Finally, the output is obtained, where o t ∈ (0, 1) h represents the output gate's activation vector; h t ∈ (−1, 1) h is the hidden state vector, also known as the output vector of an LSTM unit; W o ∈ R h×d and U o ∈ R h×h represent the weight matrices; b o ∈ R h represents the bias vector; and the operator represents the Hadamard product.The initial values are c 0 = 0 and h 0 = 0, and the parameters W, U, and b need to be learned during the training process.
In this study, a network model with 3 LSTM layers and a fully connected (FC) layer was built using the deep learning toolbox of Matlab 2021b.The architecture of the model is shown in Figure 6.Each LSTM layer consisted of 32, 16, and 8 hidden neuron units, respectively, and the output layer had 1 neuron.The output mode of the first and second LSTM layers was set to 'sequence', and the output mode of the third LSTM layer was set to 'last'.A sigmoid layer was included between the third LSTM layer and the fully connected layer.The solver was set to be 'adam' and the training was performed for 500 epochs.To prevent the gradients from exploding during training, a gradient threshold of 1 was set.The initial learning rate was set to 0.005.After 125 epochs, the learning rate was dropped by multiplying it with a factor of 0.2.The format of the input data was transformed to time slices (sample number × features × time step).The time step was set to be 5, which was equivalent to 50 ms.The model was trained to predict the output at the next time step, which was 10 ms ahead of real time.For each joint of each subject, a model was trained individually.In this study, 12 different LSTM models were obtained.

GPR Model
The Gaussian process (GP) is a non-parametric learning method that has advantages over parametric methods when given a small training set [42].Unlike point prediction in NNs, GP can quantify the uncertainty of the point prediction, which is favorable to decrease risks in decision making.In the regression model, decision makers can forecast the possible outcomes with an explicit probability when provided a 95% confidence interval around the prediction [43].
Any finite subset of a set of random variables { f (x)|x ∈ X} that follows a Gaussian distribution is known as a Gaussian process [44].To specify a GP, the mean function m(x) and the covariance function k(x, x ) should be defined, which contain a series of hyperparameters θ [45,46].Then, the GP can be written as Given a training data set D(X, y) = {(x 1 , y 1 ), (x 2 , y 2 ), . . ., (x N , y N )} of N pairs of vectorial input x i and noisy scalar output y i , y i is obtained by the latent function f (x i ) with Gaussian noise where f (x) ∼ GP (0, k(x, x )), i ∼ N (0, σ 2 ) .We have For a testing data set D = (X * , y * ) of N * points, the joint distribution p(y, f * ) can be obtained as y f * ∼ N (0, where K NN * denotes the N × N * covariance matrix of the training points and the testing points.K NN , K N * N and K N * N * are similar.The Gaussian predictive distribution p( f * |y) can be obtained from the joint distribution p(y, f * ) as where The training process of GP is actually to learn the hyperparameters (θ, σ 2 ) by maximizing the log-marginal likelihood logp(y where K denotes K(X, X) in (9).In this study, the GPR model was implemented using the regression learner in Matlab 2021b.The goal was to predict the torque at the current time point based on input feature data from the previous time point, with a prediction horizon of 10 ms ahead of real time.After evaluating the performance of 4 kinds of kernel functions (rational quadratic, squared exponential, matern5/2, exponential), the exponential kernel function was finally selected.The basis function was set to be constant and the standardization was set to be true.For each joint of each subject, a separate GPR model was trained, resulting in a total of 12 distinct GPR models in this study.

Evaluation Protocol
For each subject, 5 complete gait cycles were acquired.Out of the 5 gait cycles, 4 cycles were used as the training set to train the models, while the remaining cycle was used as the testing set to evaluate the performance of the models.
The NRMSE is as follows: where and the decisive factor R 2 , where , are all used to evaluate the model performance.A lower NRMSE and higher R and R 2 close to 1 indicate better regression performance.

The Result of the LSTM Model
Table 2 shows the performance of the LSTM model in predicting joint torque for all subjects.For the hip joints of four subjects, the NRMSE values are less than 15%, the R values are more than 0.92, and the R 2 values are more than 0.79.For the knee joint, the NRMSE values are more than 17%, the R values are more than 0.83, and the R 2 values are more than 0.62.For the ankle joint, the NRMSE values are less than 8%, the R values are more than 0.95, and the R 2 values are more than 0.90.Certain differences exist between different people.The best hip prediction occurs on subject 4, of which the NRMSE value is 11.2691, the R value is 0.9469, and the R 2 value is 0.8512.The best knee prediction occurs on subject 1, of which the NRMSE value is 12.3561, the R value is 0.9102, and the R 2 value is 0.8207.The best ankle prediction occurs on subject 3, of which the NRMSE value is 7.6790, the R value is 0.9837, and the R 2 value is 0.9646.Figure 7 displays the real torque and the predicted torque of the LSTM model.

The Result of the GPR Model
Table 3 presents the performance of the GPR model in predicting the joint torque for all subjects.For the hip joints of four subjects, the NRMSE values are less than 17%, the R values are more than 0.91, and the R 2 values are more than 0.75.For the knee joint, the NRMSE values are less than 18%, the R values are more than 0.85, and the R 2 values are more than 0.56.For the ankle joint, the NRMSE values are less than 8%, the R values are more than 0.94, and the R 2 values are more than 0.81.The best hip prediction occurs on subject 1, of which the NRMSE value is 8.6833, the R value is 0.9661, and the R 2 value is 0.9070.The best knee prediction occurs on subject 3, of which the NRMSE value is 11.4735, the R value is 0.9328, and the R 2 value is 0.7999.The best ankle prediction occurs on subject 3, of which the NRMSE value is 6.2121, the R value is 0.9801, and the R 2 value is 0.9590.Figure 8 shows the real torque and the predicted torque of the GP model, with shaded areas representing the 95% confidence intervals of the predicted torque.

Discussion
In this study, an LSTM model and GPR model were used to predict joint torque in the sagittal plane during walking, using sEMG signals and joint angles as inputs.The relevant regression performance (NRMSE, R, R 2 ) indicated that the predicted torque was in good agreement with the torque calculated by inverse dynamics.In the LSTM model, the average regression performance for the ankle (NRMSE: 6.5850%, R: 0.9711, R 2 : 0.9361), hip (NRMSE: 12.5728%, R: 0.9324, R 2 : 0.8336), and knee (NRMSE: 14.4465%, R: 0.8763, R 2 : 0.7560) decreased in sequence.In [27], Siu et al. took sEMG signals and accelerations as inputs, using an LSTM model to predict the ankle torque (average R of five subjects: 0.84) accurately.However, only the maximum value and area for each window of the sEMG signals were extracted.In our study, seven features of sEMG signals were extracted, and, in comparison, the average R was improved.In the GPR model, the average regression performance for the ankle (NRMSE: 6.4766%, R: 0.9636, R 2 : 0.9018), hip (NRMSE: 12.3216%, R: 0.9391, R 2 : 0.8371), and knee (NRMSE: 14.3591%, R: 0.8980, R 2 : 0.7335) also decreased in order.In [32], Yang et al. used a GP model to predict joint torque with plantar pressure, orientations, and angular velocities as inputs.In the case of walking at the speed of 0.8 m/s, the prediction of ankle (average R 2 : 0.8905), knee (average R 2 : 0.7250), and hip (average R 2 : 0.7390) torque was precise.However, it is inconvenient to measure plantar pressure with smart shoes.In our study, the measurement of the sEMG signal was more user-friendly.Portable wireless sEMG sensors integrated with IMUs made it possible to predict the joint torque in real time.
Although the low NRMSE and the high R and R 2 in both the LSTM model and GPR model indicate that the joint torque has a relatively strong correlation and high consistency with sEMG signals and joint angles, there still exist some limitations.The performance of the hip and knee torque is not as good as that of the ankle torque.In [30], Zhang et al. also used an LSTM model to predict joint torque, with good performance.Their study indicated that the best prediction performance during walking was for the hip, followed by the ankle and finally the knee.Our result was not exactly consistent with this.We suppose that the prediction performance can be essentially attributed to the complexity of the output curve to be predicted, including the number of data points, number of peaks, and data volatility.The more complex the output curve, the more challenging it is to accurately predict.Therefore, one solution to fundamentally improve the prediction performance is to add more valid EMG signals.
There are certain differences in the physiological and movement characteristics of different subjects, reflected in the data and results.In the LSTM model, the ability to estimate the knee joint torque of subject 1 was better than for subject 4. In the GP model, the ability to estimate the hip joint torque of subject 1 was better than for subject 3.For a specific subject, the regression performance would be improved if the adjustable parameters in the model (number of network layers, number of neurons, etc.) were individually adjusted.
Additionally, the EMG signal itself lacks robustness.In this research, the measurement of sEMG signals posed a challenge.The absolute amplitude of the sEMG signal itself has little reference significance.Factors such as the sensor itself, where the sensor is attached, and the tightness of the sensor attachment all affect the quality of the sEMG signal.To ensure consistency and reliability, all data must be collected in the same experiment, minimizing variations caused by these factors.

Conclusions and Future Work
In this study, the hip, knee, and ankle torque were predicted during walking in four subjects by an LSTM model and GPR model.The results indicate that both the LSTM and GPR models performed well in predicting the joint torque given the sEMG signals and joint angles as inputs, with low NRMSE values and high R and R 2 .For all joints of each model, the average NRMSE values were all less than 15%, the R were all more than 0.87, and the R 2 were all more than 0.73.When using machine learning models like LSTM and GPR, on the one hand, there is no need to build dynamic models or measure the GRF compared to inverse dynamic models.On the other hand, the complex optimization process associated with NMS models is not necessary.The proposed models have potential application in various fields, including exoskeleton rehabilitation systems, exoskeleton assistance systems, and sports training.These models can provide accurate and real-time estimates of joint torque during walking when force plates are limited or unavailable.
In the future, our research group will conduct more in-depth research considering the following four aspects.Firstly, the diversity of the subjects should be increased.A larger data set including more subjects with different body types and different ages should be built.Additionally, studying subjects with lower limb disabilities has broader significance.Secondly, expanding the diversity of movements beyond walking would be beneficial.Including activities such as running, jumping, and cycling in daily life will provide a broader perspective on joint torque prediction across various dynamic movements.Thirdly, investigating inter-movement and inter-subject predictions is a more challenging task.It would involve using the current model and data set to predict joint torque for new movements and individuals.This research direction would contribute to a more comprehensive understanding of the generalizability and adaptability of the proposed model.Lastly, exploring torque prediction over a period of time, rather than at a single point in time, would be a valuable future direction.

Figure 5 .
Figure 5.The structure of a typical LSTM unit.

Figure 6 .
Figure 6.The structure of the LSTM network model.

Figure 7 .
Figure 7.The predicted torque using LSTM network model.

Figure 8 .
Figure 8.The predicted torque using GPR model.

Table 2 .
The NRMSE, R, and R 2 of the joint torque prediction using LSTM model.

Table 3 .
The NRMSE, R, and R 2 of the joint torque prediction using GPR model.