A Method Combining Multi-Feature Fusion and Optimized Deep Belief Network for EMG-Based Human Gait Classification

Abstract: In this paper, a gait classification method based on a deep belief network (DBN) optimized by the sparrow search algorithm (SSA) is proposed. Multiple features obtained from surface electromyography (sEMG) are fused and used to train the model. First, sample features, such as the time domain and frequency domain features of the denoised sEMG, are extracted, and the fused features are obtained by feature combination. Second, the SSA is utilized to optimize the architecture of the DBN and its weight parameters. Finally, the optimized DBN classifier is trained and used for gait recognition. Classification results are obtained by varying different factors, and the recognition rate is compared with that of previous classification algorithms. The results show that the recognition rate of SSA-DBN is higher than that of the other classifiers, and the recognition accuracy is improved by about 2% compared with the unoptimized DBN. This indicates that, for gait recognition, SSA can optimize the network performance of the DBN and thus improve the classification accuracy.


Introduction
Gait is a biological characteristic that describes the manner in which people walk [1]. Walking is one of the important activities that maintain our daily life [2] and physical health. Surface electromyography (sEMG) is a weak bioelectric signal that characterizes, to some extent, the functional state of the human nervous system and muscles [3]. By analyzing the characteristics of sEMG signals obtained from the lower limbs, we can identify the phases of the gait cycle [4]. Gait classification based on sEMG signals has been widely used in the diagnosis of muscle diseases and as guidance for rehabilitation medicine [5].
Gait information includes video images, electromyography, three-dimensional motion capture, and kinematics [6][7][8]. 3D motion capture is an accurate optical system that can collect and record the 3D gait of the human body in real time and support quantitative analysis of gait indicators such as time-distance parameters and kinematic parameters. It is commonly used for the capture and analysis of high-frequency, high-precision motion [9][10][11]. The sEMG signal reflects the activation degree of skeletal muscle and is highly correlated with muscle force [12]. Therefore, sEMG has been widely utilized in the field of gait analysis [13,14]. Gait changes caused by diseases, which are accompanied by neuromuscular changes, have also attracted extensive attention [15][16][17]. With the development of real-time monitoring systems, much research has been conducted to [...] intrinsic modular functions in electricity load demand and to model each function to predict its trend. The final forecasts were derived from a combination of unbiased and weighted summation. Mohammad et al. [45] used DBN to extract deep features from the fused observation of signals for classifying five basic emotions. Compared with a traditional SVM, DBN significantly improved the accuracy of emotion classification and the nonlinear classification of emotions. Qiao et al. [46] combined cognitive computing, DBN, and collaborative robots to build a model. The experiment shows that DBN significantly reduced the error rate through its number of neurons, network structure, and training epochs, and laid a foundation for the future performance improvement of collaborative robots. However, the parameters of these DBNs are often determined by human experience, which not only introduces human judgment errors but also affects the structure of the network. This leads to high computational cost and slow training of the whole model [47]. Deng et al. [48] proposed a differential evolution algorithm based on quantum computation to optimize DBN and applied it to practical engineering problems. The results show that this algorithm has better optimization performance and classification accuracy than the non-optimized DBN. Xu et al. [49] proposed the sparrow search algorithm (SSA), which improves the convergence speed, stability, and convergence accuracy of the model. Li et al. [50] used simulated annealing (SA), PSO, and SSA to develop an improved DBN model by selecting the best model parameters. The results show that SSA-DBN achieves the highest assessment accuracy and is suitable for optimizing the network structure of DBN.
In this study, the TD and FD features are extracted from the sEMG signals, and their fusion features are used as the input of a DBN model for performing gait classification. The SSA with better optimization performance is used to adjust the network architecture of DBN and solve the problem of the empirical selection of DBN parameters.
The major contributions of this work are as follows: (1) The layer-by-layer learning of the DBN can address the distribution differences of feature sets caused by gait differences. (2) To solve the problem of empirical selection of DBN parameters, SSA with good optimization performance is used to prevent the model from falling into local optima due to the traditional low-dimensional features used in gait analysis. (3) The proposed method effectively improves the accuracy of gait classification.
The rest of the manuscript is organized as follows. Section 2 describes the proposed methods. Section 3 presents the experimental results and discussion. Section 4 concludes this work and presents the future work.

Materials and Methods
This experimental protocol is comprised of five parts, namely acquisition of experimental data and its pre-processing, feature extraction from sEMG signals, construction of the deep belief network (DBN), parameter optimization of SSA, and gait classification results. The flowchart of the proposed method is shown in Figure 1.

Lower Limb Muscles and Gait Division
Considering the role and contribution of lower limb muscles during different phases of walking, and the sensitivity of the sEMG signal acquisition device to lower limb muscles, the muscles with distinct performance characteristics are selected as the signal sources [51]. As presented in Figure 2, these include the tensor fascia lata (TF), adductor longus (AL), rectus femoris (RF), vastus medialis (VM), tibialis anterior (TA), semitendinosus (ST), gastrocnemius (GM), and soleus (SO). A complete gait cycle can be divided into stance and swing phases [52]. The stance phase can be further divided into pre-stance, mid-stance, and terminal-stance. The swing phase can be divided into pre-swing and terminal-swing, as presented in Figure 3.

Signal Processing and Analysis
The surface electromyography (sEMG) signal is a complex, weak, and non-stationary electrical signal that contains motion artifacts caused by electrode offset and other noise introduced during the acquisition process. Therefore, it is necessary to remove the noise efficiently. The denoising methods adopted in the experiment include wavelet threshold denoising, wavelet packet threshold denoising, and wavelet modulus maximum denoising [53].
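As a rough illustration of the first of these methods, wavelet threshold denoising, the sketch below applies a single-level Haar transform, soft-thresholds the detail coefficients, and inverts the transform. The Haar wavelet, the single decomposition level, and the universal threshold with a median-based noise estimate are assumptions made for brevity; the text does not specify the wavelet basis, decomposition depth, or threshold rule used in the experiment, and all function names are hypothetical.

```python
import math
import statistics


def haar_forward(x):
    """Single-level Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward and return the reconstructed sequence."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(c, thr):
    """Shrink a coefficient toward zero by thr (soft thresholding)."""
    return math.copysign(max(abs(c) - thr, 0.0), c)


def wavelet_threshold_denoise(x):
    """Denoise x by soft-thresholding its Haar detail coefficients."""
    approx, detail = haar_forward(x)
    # Assumed noise estimate: median absolute detail coefficient / 0.6745.
    sigma = statistics.median(abs(d) for d in detail) / 0.6745
    # Assumed threshold rule: universal threshold sigma * sqrt(2 ln N).
    thr = sigma * math.sqrt(2.0 * math.log(len(x)))
    detail = [soft_threshold(d, thr) for d in detail]
    return haar_inverse(approx, detail)


noisy = [0.0, 0.1, 0.0, -0.1, 5.0, 5.0, 0.0, 0.0]
denoised = wavelet_threshold_denoise(noisy)
```

A practical pipeline would normally use several decomposition levels and a smoother wavelet; the single level here only keeps the transform easy to follow.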

Feature Extraction of sEMG Signals
After denoising, the TD and FD features of each channel of the sEMG signal are extracted. In this work, three representative time domain characteristics, namely the mean absolute value (MAV), variance (VAR), and zero crossing points (ZC), are used as time domain features [54,55].
MAV takes advantage of the property that sEMG signals have large amplitude fluctuations in the time domain, which are linearly related to the level of muscle activation. The higher the value of MAV, the higher the activation level of the muscle.

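MAV is expressed as follows:
$$\mathrm{MAV} = \frac{1}{N}\sum_{k=1}^{N}\left|x_k\right| \tag{1}$$
where x_k (k = 1, 2, ..., N) denotes the sEMG time series with a window length of N. VAR is a measure of the signal power of the sEMG signal and is expressed as follows:
$$\mathrm{VAR} = \frac{1}{N-1}\sum_{k=1}^{N} x_k^{2} \tag{2}$$
ZC refers to the number of times that the sEMG waveform passes through the zero point; it is defined so as to avoid spurious crossing counts caused by low-level noise. It is mathematically expressed as follows:
$$\mathrm{ZC} = \sum_{k=1}^{N-1}\operatorname{sgn}\left(-x_k x_{k+1}\right) \tag{3}$$
where
$$\operatorname{sgn}(x) = \begin{cases} 1, & x > 0 \\ 0, & \text{otherwise} \end{cases}$$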
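The three time domain features can be sketched in a few lines of Python. The sketch assumes the common definitions: MAV as the mean of |x_k|, VAR as the sum of squares divided by N − 1, and ZC as the count of sign changes between consecutive samples. No noise threshold is applied to ZC, and the sample window is made up purely for illustration.

```python
def mav(x):
    """Mean absolute value of one analysis window."""
    return sum(abs(v) for v in x) / len(x)


def var(x):
    """Signal power estimate: sum of squares over N - 1 (assumed definition)."""
    return sum(v * v for v in x) / (len(x) - 1)


def zero_crossings(x):
    """Count consecutive sample pairs whose product is negative."""
    return sum(1 for a, b in zip(x, x[1:]) if -a * b > 0)


# Illustrative window only, not data from the experiment.
window = [0.2, -0.1, 0.4, -0.3, 0.1]
features = (mav(window), var(window), zero_crossings(window))
```

For this window the sign changes at every step, so the zero crossing count is 4.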
We select two representative frequency domain characteristics, namely the average power frequency f_mean and the median frequency f_mf [56], defined as follows:
$$f_{mean} = \frac{\int_{0}^{\infty} f\,P(f)\,df}{\int_{0}^{\infty} P(f)\,df} \tag{4}$$
$$\int_{0}^{f_{mf}} P(f)\,df = \int_{f_{mf}}^{\infty} P(f)\,df = \frac{1}{2}\int_{0}^{\infty} P(f)\,df \tag{5}$$
where P(f) is the power spectral density of the sEMG signal and f is the frequency. Each feature is extracted by setting different window lengths N to form a set of feature vectors. Then, a set of feature matrices is formed based on the selected lower limb muscles, where the number of rows in the matrix represents the number of selected lower limb muscles and the number of columns represents the number of windows into which the signal is divided. This feature matrix is used as the input data of the network in the next section.
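A minimal sketch of the two frequency domain features is given below. It replaces the integrals with sums over a one-sided periodogram computed by a naive DFT, which is an assumption about the spectral estimator (the text does not state how P(f) is obtained). The function names and the test sinusoid are hypothetical; the 1000 Hz sampling frequency matches the acquisition instrument described in this paper.

```python
import cmath
import math


def power_spectrum(x, fs):
    """One-sided periodogram via a naive DFT (adequate for short windows)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coef = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coef) ** 2 / n)
    return freqs, power


def mean_frequency(freqs, power):
    """Power-weighted average frequency (discrete form of f_mean)."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)


def median_frequency(freqs, power):
    """First frequency at which the cumulative power reaches half the total."""
    half = sum(power) / 2.0
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]


# Illustrative input: a pure 100 Hz sinusoid sampled at 1000 Hz.
fs = 1000
signal = [math.sin(2 * math.pi * 100 * t / fs) for t in range(50)]
freqs, power = power_spectrum(signal, fs)
f_mean = mean_frequency(freqs, power)
f_mf = median_frequency(freqs, power)
```

Because all of the power of this test signal sits in a single bin, both features land on 100 Hz; an all-zero window would need a guard against division by zero.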

Deep Belief Network
The deep belief network (DBN) is a probabilistic generative model that is designed by stacking multiple restricted Boltzmann machines (RBMs). Its training process is divided into two parts, i.e., the greedy unsupervised layer-wise pre-training process and the discriminative supervised fine-tuning process. Note that the neurons in the same layer are not connected to each other and connections are only formed between adjacent layers [57].
The basic building block of the DBN is the RBM. One RBM is composed of one visible layer and one hidden layer. During the training process of the DBN, each RBM is usually pre-trained from bottom to top in a layered manner, and the hidden layer of the previous RBM is used as the visible layer of the next RBM. Afterwards, the whole DBN model is fine-tuned by the backpropagation (BP) network set in the last layer. Finally, the output layer performs hypothesis prediction according to the posterior probability distribution obtained in the previous layer.
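The bottom-up, layer-by-layer pre-training can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes Bernoulli units, one step of contrastive divergence (CD-1), inputs scaled to [0, 1], and arbitrary choices for the learning rate, epoch count, and layer sizes. The supervised fine-tuning stage is omitted, and all names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class RBM:
    """Minimal Bernoulli RBM trained with one-step contrastive divergence."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr):
        # Positive phase, one Gibbs step, then negative phase.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def pretrain_stack(data, hidden_sizes, epochs=5, lr=0.1, seed=0):
    """Greedy pre-training: each RBM's hidden output feeds the next RBM."""
    rng = random.Random(seed)
    rbms, layer_input = [], data
    for n_hidden in hidden_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v, lr)
        rbms.append(rbm)
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return rbms


# Toy binary patterns standing in for scaled feature vectors.
data = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
rbms = pretrain_stack(data, hidden_sizes=[3, 2], epochs=3)
```

Passing a sample through `hidden_probs` of each RBM in turn yields the top-level representation that a classifier layer would consume.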
The basic network structure of the DBN model is shown in Figure 4. In this work, we define the learning rate factor controlling the weight update rate as α and the number of fine-tunings as β. Figure 4a-c represents the structure of the DBN model containing 1 RBM, 2 RBMs, and n RBMs, respectively. The first RBM is composed of the feature matrix data obtained in the previous section and the first hidden layer h_1. The parameters of the first RBM are trainable, and include the weights and offset coefficients of h_1. Then, h_1 is treated as the visible vector and h_2 as the hidden vector, and the second RBM is trained. The third RBM is trained in a similar fashion. The black circles in Figure 4 represent the neurons of each layer. The number of neurons is usually determined manually. In this work, the number of neurons is set as Best_pos (q) (where q represents the q-th hidden layer and q ∈ [1, n]).
The architecture of DBN possesses the ability to obtain higher dimensional features based on the layer-by-layer learning feature of this model. The hidden variables in each layer learn how to represent the high-order correlations of the original input data. In order to use DBN for classification, the feature vectors of the data samples are used to set the state of the visible variables in the bottom layer of DBN. This is followed by DBN generating a probability distribution of the possible labels of the data based on the posterior probability distribution of the data samples.
Let us assume that the dataset S = {(c_1, d_1), (c_2, d_2), ..., (c_M, d_M)} contains M data sample pairs, where c_M is the M-th data sample and d_M is the corresponding M-th target label. Given a data sample (c_λ, d_λ) (λ ∈ [1, M]) from the dataset, the DBN with n hidden layers is represented as a complex feature mapping function. After feature conversion, the softmax layer is used as the output layer of the DBN to perform classification, where the o-th neuron is responsible for predicting the probability of the o-th class. The input c^n of the softmax layer is the output of the previous layer and is associated with the weight W_o. The probability obtained by the softmax layer is mathematically expressed as follows:
$$P\left(d = o \mid c^{n}\right) = \frac{\exp\left(W_o c^{n}\right)}{\sum_{k}\exp\left(W_k c^{n}\right)} \tag{6}$$
where c^n denotes the output of the previous layer. Based on probability estimation, the trained DBN classifier provides the following prediction:
$$\hat{d} = \arg\max_{o} P\left(d = o \mid c^{n}\right) \tag{7}$$
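The softmax output stage can be illustrated with a short sketch. It assumes a weight-only softmax layer (no bias term) in which each class has one weight vector applied to the output of the last hidden layer; the weights and the input vector below are arbitrary values chosen for illustration, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(c_n, W):
    """Return (predicted class index, class probabilities).

    c_n is the output of the last hidden layer; W holds one weight
    vector per class (an assumed, bias-free parameterization).
    """
    scores = [sum(w * c for w, c in zip(row, c_n)) for row in W]
    probs = softmax(scores)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


# Illustrative three-class example with a two-dimensional hidden output.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
label, probs = predict([0.2, 0.9], W)
```

The class scores here are 0.2, 0.9, and 1.1, so the third class (index 2) receives the highest probability.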
Mathematics 2022, 10, 4387
The DBN is optimized by stochastic gradient descent with a negative log-likelihood loss on the training set S. The posterior of each layer is approximated by a factorial distribution of independent variables within the layer, whose values are provided by the variables in the previous layer. The purpose of the wake-sleep algorithm [57] is to learn the characteristics of the original data and recover it correctly. It obtains the weights of the top-level undirected connections by fitting an RBM to the posterior distribution of the penultimate layer. The fine-tuning process starts with the state of the top output layer and in turn activates each lower layer by using the top-down connections. Thus, a DBN model can be considered as an RBM that defines the prior over the top layer of hidden variables in a directed belief network, combined with a set of "recognition" weights to perform fast approximate inference.

Sparrow Search Algorithm
The sparrow search algorithm (SSA) is a metaheuristic algorithm that is inspired by the characteristics of birds, i.e., foraging and anti-predatory behavior [49].
Let us suppose that a population of w sparrows conducts a search for food. The positions of the sparrows are represented as follows:
$$E = \begin{bmatrix} E_{1,1} & E_{1,2} & \cdots & E_{1,v} \\ E_{2,1} & E_{2,2} & \cdots & E_{2,v} \\ \vdots & \vdots & \ddots & \vdots \\ E_{w,1} & E_{w,2} & \cdots & E_{w,v} \end{bmatrix} \tag{8}$$
where v denotes the dimension of the problem variable to be optimized, w represents the number of sparrows, and i ∈ [1, w], j ∈ [1, v]. At this point, the fitness value is expressed as follows:
$$R_E = \begin{bmatrix} r\left(\left[E_{1,1}\ E_{1,2}\ \cdots\ E_{1,v}\right]\right) \\ r\left(\left[E_{2,1}\ E_{2,2}\ \cdots\ E_{2,v}\right]\right) \\ \vdots \\ r\left(\left[E_{w,1}\ E_{w,2}\ \cdots\ E_{w,v}\right]\right) \end{bmatrix} \tag{9}$$
where r denotes the fitness value. The sparrows with high fitness value have a larger foraging search range as discoverers as compared to the joiners in the population. Therefore, the location update of the discoverers during each iteration is described as follows:
$$E_{i,j}^{t+1} = \begin{cases} E_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & W_2 < ST \\ E_{i,j}^{t} + Q \cdot L, & W_2 \ge ST \end{cases} \tag{10}$$
where t is the current iteration, iter_max is the maximum number of iterations, and α is a uniformly distributed random number in the range (0, 1]. W_2 ∈ [0, 1] and ST ∈ [0.5, 1] denote the warning value and the safety value, respectively. Q is a random number subject to the normal distribution, and L is a matrix of dimension 1 × v. When W_2 < ST, there is no danger around the population and the discoverer can expand the search range to make the fitness value of other individuals higher. On the other hand, when W_2 ≥ ST, a predator is detected around the population and an alarm is released. As a result, all the sparrows quickly fly to other safe places for feeding. The update of the joiner's position during each iteration is described as follows:
$$E_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{E_{worst}^{t} - E_{i,j}^{t}}{i^{2}}\right), & i > w/2 \\ E_{pbest}^{t+1} + \left|E_{i,j}^{t} - E_{pbest}^{t+1}\right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases} \tag{11}$$
where E^t_worst and E^{t+1}_pbest denote the worst global position in the t-th iteration and the best local position of the joiner in the (t+1)-th iteration, respectively. A is a 1 × v matrix with internal elements of 1 or −1, and A^+ = A^T (A A^T)^{-1}. When i > w/2, the i-th joiner with lower fitness has no gain in foraging and should shift its location to obtain higher energy.
The update regarding the position of the population after it becomes aware of the danger is described as follows:
$$E_{i,j}^{t+1} = \begin{cases} E_{gbest}^{t} + \beta \cdot \left|E_{i,j}^{t} - E_{gbest}^{t}\right|, & r_i > r_g \\ E_{i,j}^{t} + \mu \cdot \left(\dfrac{\left|E_{i,j}^{t} - E_{worst}^{t}\right|}{\left(r_i - r_\omega\right) + \varepsilon}\right), & r_i = r_g \end{cases} \tag{12}$$
where E^t_gbest is the global optimal position of the current population, β is the step control parameter, which is a normally distributed random number with mean 0 and variance 1, and ε is a very small constant used to avoid a zero denominator. µ ∈ [−1, 1] is a random number, r_i is the fitness value of individual i, and r_g and r_ω are the optimal and the worst fitness values of the current population, respectively. When r_i > r_g, the current individual is at the edge of the population and is highly vulnerable to predators. When r_i = r_g, the current individual is in the middle of the population. When it senses danger, it should move closer to other sparrows to reduce the risk of predation.
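The three position updates can be combined into a compact search loop, sketched below under several stated assumptions: lower fitness is treated as better (minimization), L is taken as a row of ones, positions are clipped to a common box [lower, upper], 10% of the sparrows are made aware of danger in each iteration, and the best joiner reference is the best discoverer after its update. The function name, the default values, and the sphere test function are hypothetical and are not taken from the paper.

```python
import math
import random


def ssa_minimize(fitness, dim, lower, upper, n_sparrows=20, max_iter=100,
                 discoverer_ratio=0.2, aware_ratio=0.1, st=0.8, seed=0):
    """Minimal sparrow search sketch; lower fitness is treated as better."""
    rng = random.Random(seed)
    eps = 1e-12

    def clip(vec):
        return [min(max(x, lower), upper) for x in vec]

    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_sparrows)]
    n_disc = max(1, int(discoverer_ratio * n_sparrows))
    n_aware = max(1, int(aware_ratio * n_sparrows))
    best_pos, best_fit = pop[0][:], fitness(pop[0])

    for _ in range(max_iter):
        pop.sort(key=fitness)                      # rank sparrows, best first
        fits = [fitness(p) for p in pop]
        if fits[0] < best_fit:
            best_pos, best_fit = pop[0][:], fits[0]
        worst_pos, worst_fit = pop[-1][:], fits[-1]
        w2 = rng.random()                          # warning value for this iteration

        # Discoverers: contract while safe, take a random jump on alarm.
        for i in range(n_disc):
            if w2 < st:
                alpha = rng.random() + eps
                scale = math.exp(-(i + 1) / (alpha * max_iter))
                pop[i] = clip([x * scale for x in pop[i]])
            else:
                q = rng.gauss(0.0, 1.0)
                pop[i] = clip([x + q for x in pop[i]])
        leader = min(pop[:n_disc], key=fitness)[:]

        # Joiners: poorly ranked ones relocate, the rest follow the leader.
        for i in range(n_disc, n_sparrows):
            rank = i + 1
            if rank > n_sparrows / 2:
                q = rng.gauss(0.0, 1.0)
                new = [q * math.exp((worst_pos[j] - pop[i][j]) / rank ** 2)
                       for j in range(dim)]
            else:
                signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
                # |E - E_pbest| . A+ . L collapses to one scalar step when L is all ones.
                step = sum(s * abs(pop[i][j] - leader[j])
                           for j, s in enumerate(signs)) / dim
                new = [leader[j] + step for j in range(dim)]
            pop[i] = clip(new)

        # Danger-aware sparrows: move toward the best or away from the worst.
        for i in rng.sample(range(n_sparrows), n_aware):
            fit_i = fitness(pop[i])
            if fit_i > best_fit:
                beta = rng.gauss(0.0, 1.0)
                new = [best_pos[j] + beta * abs(pop[i][j] - best_pos[j])
                       for j in range(dim)]
            else:
                mu = rng.uniform(-1.0, 1.0)
                denom = (fit_i - worst_fit) + eps
                new = [pop[i][j] + mu * abs(pop[i][j] - worst_pos[j]) / denom
                       for j in range(dim)]
            pop[i] = clip(new)

    for p in pop:                                  # account for the final updates
        f = fitness(p)
        if f < best_fit:
            best_pos, best_fit = p[:], f
    return best_pos, best_fit


def sphere(p):
    return sum(x * x for x in p)


best_pos, best_fit = ssa_minimize(sphere, dim=3, lower=-5.0, upper=5.0)
```

The sphere function is only a stand-in objective; when tuning a classifier, the fitness would instead be something like the validation error of a model trained with the decoded parameters.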
In this work, the SSA is used to search for the sparrow with the best position among the parameters to be optimized in the DBN, i.e., the sparrow with the best fitness. The parameters include the number of neurons Best_pos (q) per layer, the number of reverse fine-tunings β, and the learning rate α mentioned in the previous section. The optimal network structure of the DBN is set based on the parameters of this sparrow at the end of each iteration.

Training Process of Gait Results
The detailed steps of the proposed algorithm are presented below.
Step 1. We obtain the original sEMG signals dataset.
Step 2. We denoise the original signal dataset by using the wavelet modulus maximum method.
Step 3. The TD and FD features are extracted by using overlapping windows.
Step 4. The dataset is divided into training and test sets.
Step 5. We set the relevant parameters in the DBN model, including the number of RBM layers, the number of neurons in each layer, the number of iterations, the learning rate, and the number of reverse fine-tunings.
Step 6. We set the parameters of SSA, including the number of optimization parameters, the ratio of discoverers to joiners, and the safety threshold of the optimization parameter value.
Step 7. The DBN randomly generates the initial weights based on the safety threshold. The SSA updates the positions of the discoverers, the joiners, and the sparrows aware of danger based on (10), (11), and (12), assigns the updated parameter values to the DBN model, and iteratively updates the value of the fitness function.
Step 8. We determine whether the termination condition is satisfied and whether the fitness value is the current optimum. If not, return to Step 6; otherwise, proceed to Step 9.
Step 9. Finally, we obtain the minimum value of the fitness function and determine the DBN parameters, i.e., the optimal weight parameters of the DBN model.
Step 10. The trained model is evaluated based on the test set. The flowchart of the proposed algorithm is presented in Figure 5.

Acquisition and Pre-Processing of sEMG Signals from Subjects
The dataset included signals recorded from six healthy adults. Mean (±SD) characteristics were as follows: age = 24.0 ± 1.5 years; height = 173 ± 5 cm; mass = 66.3 ± 7.1 kg; body mass index (BMI) = 22.2 ± 1.0 kg/m². None of the subjects presented any pathological condition or had undergone orthopedic surgery that might have affected lower limb mechanics. Subjects with joint pain, neurological pathology, orthopedic surgery, abnormal gait, or a BMI higher than 25 (overweight and obesity) were not recruited. The research was undertaken in compliance with ethical principles, and participants signed informed consent prior to the beginning of the test. The equipment included a DataLink sEMG acquisition instrument (sampling frequency of 1000 Hz), a Vicon three-dimensional motion capture system, a triaxial accelerometer, and a computer. The subjects walked on a treadmill at a uniform speed of 1.3 m/s for 60 s. The signals from the eight muscles presented in Figure 2 were used as the signal acquisition sources, and synchronous camera tracking was carried out to allow comparison and verification of the gait recognition. The spatial-temporal information of the subjects' gait is as follows: the gait cycle is controlled within 1-1.5 s, the step length is 1-1.5 m, and the step speed is 1.3 m/s. Figure 6 shows the 8-channel EMG signal of a complete gait cycle.
The acquired sEMG signals are compared based on three denoising methods, namely, wavelet threshold denoising, wavelet packet threshold denoising, and wavelet modulus maximum denoising. The root mean square error (MSE) and signal to noise ratio (SNR) are used as the evaluation indicators [58]. The following is an example of the vastus medialis (VM) muscle.
Generally, for the two aforementioned evaluation indicators, the smaller the MSE and the larger the SNR, the better the noise elimination results. In Table 1, the SNR of the wavelet modulus maximum method is greater than that of the wavelet packet threshold and wavelet threshold methods, and the average SNR reaches 92.5751. Similarly, the MSE of the wavelet modulus maximum method is as low as 0.0024, which is significantly lower than that of the other two methods. Therefore, in this work, the wavelet modulus maximum method is used to denoise the sEMG signals. The corresponding denoising effect is shown in Figure 7.
Figure 7. The de-noising result of the selected lower limb muscle signal using the wavelet modulus maximum method. Taking the tensor fascia lata (TF) as an example, the red line represents the original signal before de-noising, and the blue line represents the signal after de-noising.
Figure 7 represents the denoising result of the selected lower limb muscle. In each subplot, the red curve in the upper panel denotes the original signal before denoising and the blue curve in the lower panel represents the signal after denoising. It is evident from the figure that the original signal contains more burrs and the signal drifts around the zero baseline in the resting state. After using wavelet modulus maximum denoising, the signal curve becomes smoother, and the signal tends to zero in the resting state.
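The two evaluation indicators can be computed as in the sketch below. The exact formulas used for Table 1 are not given in the text, so the definitions here are assumptions: the error measure is the root mean square error between a reference signal and the denoised signal, and the SNR is the ratio of reference power to residual power in decibels. The numbers are illustrative only and are unrelated to the reported results.

```python
import math


def rmse(reference, estimate):
    """Root mean square error between two equal-length signals."""
    n = len(reference)
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB: reference power over residual power."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / noise_power)


# Illustrative values, not experimental data.
reference = [1.0, 2.0, 3.0, 4.0]
denoised = [1.1, 1.9, 3.1, 3.9]
error = rmse(reference, denoised)
snr = snr_db(reference, denoised)
```

A denoised signal identical to the reference would make the residual power zero, so a real implementation should guard that case before taking the logarithm.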
Finally, TD features and FD features are extracted from the denoised signal separately. In this work, the features are extracted using the data window with overlap. The window length N is 30 ms, and each window increment is 25 ms to obtain the pre-processed dataset.
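The overlapping-window segmentation can be sketched as follows. At the 1000 Hz sampling frequency stated for the acquisition instrument, a 30 ms window with a 25 ms increment corresponds to 30 samples per window and a step of 25 samples. The helper names are hypothetical, and dropping the incomplete trailing window is an assumption.

```python
def sliding_windows(signal, fs_hz=1000, window_ms=30, step_ms=25):
    """Split a signal into overlapping windows; a partial tail is dropped."""
    win = int(round(window_ms * fs_hz / 1000))
    step = int(round(step_ms * fs_hz / 1000))
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, step)]


def feature_matrix(channels, feature, **window_args):
    """Rows are channels (muscles); columns are per-window feature values."""
    return [[feature(w) for w in sliding_windows(ch, **window_args)]
            for ch in channels]


# Eight placeholder channels of 100 samples each, with a simple feature.
channels = [[float(k % 7) for k in range(100)] for _ in range(8)]
mean_abs = lambda w: sum(abs(v) for v in w) / len(w)
matrix = feature_matrix(channels, mean_abs)
```

With these settings a 60 s recording (60,000 samples) yields 2399 windows per channel.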

Classifier Parameter Setting
In this section, SVM, ELM, DBN, and SSA-DBN are used to classify the TD features, the FD features, and the fusion features obtained by combining TD and FD. The unoptimized DBN model contains 4 layers and 3 RBMs. The number of neurons per layer is 10, the number of epochs is 30, the learning rate is 0.01, and the number of reverse fine-tunings is set to either 10 or 100 for comparison in subsequent experiments.
The SSA is introduced to optimize the architecture of the DBN. In this work, the proportion of discoverers is set to 20% and the warning value is set to 0.8. Because different parameter values affect the classification results, and because these parameters are usually chosen from human experience in previous studies, this experiment uses the SSA to optimize them instead. Based on the structure of the network and the size of the dataset, five optimization parameters are set. The search range for each parameter is shown in Table 2, where Best_pos (1), Best_pos (2), and Best_pos (3) limit the number of neurons in the three RBMs, α denotes the learning rate, and β denotes the number of reverse fine-tunings.
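To make the search procedure concrete, the following is a simplified sketch of a sparrow search loop over a five-dimensional position vector (three neuron counts, α, and β). It follows the general discoverer, follower, and danger-aware update rules of the published SSA, but it is not the implementation used in this work. The bounds are placeholders because the ranges of Table 2 are not reproduced here, `danger_ratio` is an assumed setting, and `toy_objective` is a hypothetical stand-in for the validation error of a trained DBN.

```python
import numpy as np


def ssa_minimize(objective, lower, upper, n_sparrows=20, n_iter=50,
                 discoverer_ratio=0.2, warning_value=0.8,
                 danger_ratio=0.1, seed=0):
    """Simplified sparrow search; returns the best position and fitness found."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_sparrows, dim))
    fit = np.array([objective(p) for p in pos])
    n_disc = max(1, int(round(discoverer_ratio * n_sparrows)))
    n_danger = max(1, int(round(danger_ratio * n_sparrows)))
    best_idx = int(np.argmin(fit))
    best_pos, best_fit = pos[best_idx].copy(), float(fit[best_idx])

    for _ in range(n_iter):
        order = np.argsort(fit)  # sort so that index 0 is the current best
        pos, fit = pos[order], fit[order]
        cur_best, cur_worst = pos[0].copy(), pos[-1].copy()

        # Discoverers: search widely while the alarm is below the warning value.
        alarm = rng.random()
        for i in range(n_disc):
            if alarm < warning_value:
                pos[i] = pos[i] * np.exp(-(i + 1) / (rng.random() * n_iter + 1e-12))
            else:
                pos[i] = pos[i] + rng.normal() * np.ones(dim)

        # Followers: the worse half relocates, the rest move around the best discoverer.
        for i in range(n_disc, n_sparrows):
            if i + 1 > n_sparrows / 2:
                pos[i] = rng.normal() * np.exp((cur_worst - pos[i]) / (i + 1) ** 2)
            else:
                signs = rng.choice([-1.0, 1.0], size=dim)
                step = np.sum(np.abs(pos[i] - pos[0]) * signs) / dim
                pos[i] = pos[0] + step * np.ones(dim)

        # Danger-aware sparrows: a random subset moves relative to the best or worst.
        for i in rng.choice(n_sparrows, size=n_danger, replace=False):
            if fit[i] > fit[0]:
                pos[i] = cur_best + rng.normal() * np.abs(pos[i] - cur_best)
            else:
                k = rng.uniform(-1.0, 1.0)
                pos[i] = pos[i] + k * np.abs(pos[i] - cur_worst) / (fit[i] - fit[-1] + 1e-12)

        pos = np.clip(pos, lower, upper)  # keep every candidate inside the search range
        fit = np.array([objective(p) for p in pos])
        idx = int(np.argmin(fit))
        if fit[idx] < best_fit:
            best_pos, best_fit = pos[idx].copy(), float(fit[idx])
    return best_pos, best_fit


# Placeholder bounds for [neurons_1, neurons_2, neurons_3, alpha, beta].
LOWER = np.array([1.0, 1.0, 1.0, 0.001, 1.0])
UPPER = np.array([100.0, 100.0, 100.0, 0.1, 200.0])
TARGET = np.array([40.0, 30.0, 20.0, 0.01, 50.0])  # arbitrary optimum of the toy objective


def toy_objective(position):
    """Hypothetical stand-in for DBN validation error (scaled squared distance)."""
    return float(np.sum(((position - TARGET) / (UPPER - LOWER)) ** 2))


best_pos, best_fit = ssa_minimize(toy_objective, LOWER, UPPER)
print(best_pos, best_fit)
```

In an actual experiment the objective would train a DBN with the decoded hyperparameters (neuron counts and β rounded to integers) and return its classification error, which makes each fitness evaluation far more expensive than in this toy setting.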

The Effect of Feature Type on Classification Results
The datasets are obtained from 20 consecutive experiments conducted by each of six subjects. The classification results in this study are reported as statistics consisting of the mean value and the standard error [59]. The mean value represents the average recognition rate in a gait phase, and the standard error describes the average difference between the mean recognition rates of the individual subjects and the overall mean recognition rate. Variance is another statistical indicator, which represents the degree of deviation between the samples and the mean [60]. Finally, the average recognition rate over the five gait stages is calculated as the arithmetic mean.
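The statistics named above can be computed as in the following sketch. The per-subject recognition rates are made-up placeholder values, not results from this study, and the standard error is taken in its conventional form (sample standard deviation divided by the square root of the number of subjects), which is an assumption about the definition used in [59].

```python
import numpy as np


def summarize(rates):
    """Return mean, sample variance, and standard error of per-subject recognition rates."""
    rates = np.asarray(rates, dtype=float)
    mean = float(np.mean(rates))
    variance = float(np.var(rates, ddof=1))            # sample variance
    std_error = float(np.sqrt(variance / rates.size))  # conventional standard error
    return mean, variance, std_error


# Placeholder recognition rates (%) for six hypothetical subjects in one gait stage.
subject_rates = [90.0, 92.0, 94.0, 96.0, 98.0, 100.0]
mean, variance, std_error = summarize(subject_rates)
print(f"mean={mean:.2f}  variance={variance:.2f}  standard error={std_error:.2f}")
```

A smaller variance across subjects indicates a more stable recognition rate, which is how the variance is interpreted later in the comparison of DBN and SSA-DBN.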

Time Domain Features
In this section, the DBN and SSA-DBN models use the TD feature dataset of gaits as the input for gait classification. The corresponding results are compared with those of the SVM and ELM classifiers. The classification results are shown in Table 3. In Table 3, the rows represent the recognition rates achieved by the classifiers for the five stages of gait classification, i.e., pre-stance, mid-stance, terminal-stance, pre-swing, and terminal-swing, and each column represents a different type of classifier. In order to study the effect of network structure on the DBN model, DBN (10) (a model with 10 reverse fine-tunings) and DBN (100) (a model with 100 reverse fine-tunings) are used. In order to compare the classification results of each classifier more intuitively, a comparison graph is presented in Figure 8.

The analysis of the data presented in Figure 8 shows that the average classification accuracies based on TD features obtained by using the SVM, ELM, DBN, and SSA-DBN classifiers are 92.08%, 93.95%, 95.97%, and 96.24%, respectively. The results show that SSA-DBN has the highest average recognition rate and improves the classification accuracy by about 4% compared with SVM. The recognition rate of SVM in mid-stance is significantly lower than that of the other classifiers. Based on TD features, the classification effect of the DBN with 10 reverse fine-tunings is not much different from that of the SVM and ELM. Notably, the classification result of the DBN with 100 reverse fine-tunings is significantly better than that of the DBN with 10 reverse fine-tunings. This shows that the training efficiency and classification results can be improved by manually adjusting the network structure of the DBN model.
Therefore, we next examine the effect of the number of reverse fine-tunings on the classification results of the TD features.

The Effect of the Number of Reverse Fine-Tunings on TD Feature Classification Results
In this section, the TD features under the pre-stance are considered. First, we set the unoptimized DBN model with 4 layers, 3 RBMs, 10 neurons per layer, 30 training epochs, and a learning rate of 0.01. Then, we manually set the number of reverse fine-tunings to 1, 20, 100, and 500. Second, in order to investigate the effect of the SSA on the network structure of the DBN model, we use the SSA to adaptively select the number of reverse fine-tunings β of the DBN model for comparison. The corresponding results are presented in Table 4. As presented in Table 4, the recognition rate of the DBN model increases as the number of reverse fine-tunings is manually increased. The results demonstrate that the network structure of the DBN model plays a decisive role in the accuracy of the final gait classification. However, increasing the number of reverse fine-tunings also increases the training time. Moreover, this manual approach relies on subjective judgment based on human experience when solving practical classification problems.
The recognition rate of the model optimized by SSA is the highest. Compared with the DBN with 500 reverse fine-tunings, the SSA-optimized model shows no major improvement in the recognition rate, but it greatly reduces the training time, which in turn improves the classification efficiency and avoids human diagnosis errors. It also illustrates the importance of adjusting the network structure to improve the performance of the DBN model. Figure 9 compares the recognition rate curves of the aforementioned models.
We keep the vertical axis range from (a) to (d) in Figure 9 consistent to make the comparison clearer. With the increase of training time, the range of the horizontal axis also increases. It is evident from Figure 9 that when the number of fine-tunings is 1, the curve stability is poor and the recognition rate is extremely low. When the number of fine-tunings is 20, the recognition accuracy is slightly improved, but the curve is still oscillating. After the number of fine-tunings is increased again, the recognition rate curve becomes stable, and the loss function gradually decreases and finally tends to 0. This further illustrates the influence of the network structure on the DBN model. In the subsequent experiments, in order to reduce the invalid workload, reduce the execution time, and facilitate the analysis of the network performance before and after optimization, we fix the number of fine-tunings of the unoptimized DBN model to 100.

Frequency Domain Features
The FD features are used as the input of DBN and SSA-DBN models for performing gait classification and the corresponding results are compared with SVM and ELM classifiers. The classification results are shown in Table 5.
In Table 5, the rows represent the recognition rates of the classifier for the five stages of gait classification, and each column represents different types of classifiers. Please note that the parameters of the network structure are no longer set artificially for the SSA-DBN model, but the number of reverse fine-tunings is chosen by SSA autonomously from the conclusions drawn in the previous section. In order to compare the classification results of each classifier more intuitively, a comparison graph is presented in Figure 10.
According to the data analysis presented in Figure 10, the average recognition rate of SSA-DBN is the highest, reaching 96.42%, followed by DBN, which is still slightly higher than the other two algorithms. The recognition rate of DBN in pre-stance and mid-stance is significantly higher than that of the SVM and ELM. This indicates that the DBN further improves the accuracy of recognizing the gait stance phase through its layer-by-layer learning. However, because few features are extracted for the terminal-stance, and because feature extraction in the swing phase is affected by uncertainty, complexity, and muscle fatigue, the training effect of DBN is not fully reflected. Consequently, the classification result of DBN in this phase is weaker than that of the other two algorithms.
Figure 9. The recognition rate curves for different numbers of fine-tunings. The red line represents the recognition rate and the blue line represents the loss rate. Panels (a-d) correspond to 1, 20, 100, and 500 fine-tunings, respectively.

Fusion Features
In this section, the DBN and SSA-DBN models use the fusion feature dataset as the input for performing gait classification. The corresponding results are compared with SVM and ELM classifiers. The classification results are presented in Table 6.

Figure 10. The frequency domain feature recognition rate.
In Table 6, the rows represent the recognition rates of the classifier for five stages of gait classification, and each column represents different types of classifiers. In order to compare the classification results of each classifier more intuitively, a comparison graph is presented in Figure 11.
Figure 11. The fusion feature recognition rate.
Based on the data analysis presented in Figure 11, it is evident that the fusion features increase the data volume and diversity of the feature samples. In addition, the classification results of each classifier are also improved. In particular, the recognition rate in pre-swing and terminal-swing approaches that of the other three gait stages.
It is evident from the figure that the average recognition rates of SVM, ELM, DBN, and SSA-DBN based on the fusion features are higher than when a single feature is used, with an average improvement of 1.17% compared with the FD features. We note that the recognition rate of DBN in pre-swing and terminal-swing is very close to that of SSA-DBN and even slightly higher, at 98.18% versus 98.08%. This shows that the optimization effect in these two stages is limited. In terms of the average recognition rate, however, the results show that the fusion features enhance the classification ability of the model, and SSA-DBN achieves an excellent average recognition rate of 97.73%.
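The fusion step can be illustrated with a minimal sketch. It assumes that "feature combination" means column-wise concatenation of the TD and FD feature vectors computed on the same windows; the function name `fuse_features` and the feature dimensions are hypothetical, and any scaling or selection applied before fusion is not shown.

```python
import numpy as np


def fuse_features(td, fd):
    """Concatenate TD and FD features column-wise (assumed form of feature fusion)."""
    td = np.atleast_2d(np.asarray(td, dtype=float))
    fd = np.atleast_2d(np.asarray(fd, dtype=float))
    if td.shape[0] != fd.shape[0]:
        raise ValueError("TD and FD features must describe the same windows")
    return np.hstack([td, fd])


# Synthetic example: 39 windows, 4 TD features and 3 FD features per window.
rng = np.random.default_rng(0)
td_feats = rng.standard_normal((39, 4))
fd_feats = rng.standard_normal((39, 3))
fused = fuse_features(td_feats, fd_feats)
print(fused.shape)
```

Each fused row keeps the TD columns first and the FD columns second, so the classifier input dimension is the sum of the two feature dimensions.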

SSA Optimization Performance Analysis with the Variance
In this section, we calculate the variance of the recognition rate for DBN and SSA-DBN. The variance represents the degree of deviation between the sample recognition rates and the average recognition rate; that is, it reflects the stability of recognition [60].
The data in Table 7 show, first, that the variances of DBN and SSA-DBN in pre-swing and terminal-swing are lower than those in the other three stages, which indicates a more stable recognition rate. Second, the overall variance of SSA-DBN is smaller than that of the DBN before optimization, which indicates that SSA can improve the stability of DBN recognition and further verifies that SSA-DBN is an effective method.
In order to study the influence of fusion features on classification performance as compared to using a single feature, the recognition rates of SSA-DBN under TD features, FD features, and fusion features are compared. Figure 12 shows that the classification ability of an algorithm based on fusion features is significantly better than that based on single features. Moreover, the improvement in the classification ability of the swing phase is particularly significant under the fusion features. This indicates that the fusion features capture more gait differences than features from the time domain or frequency domain alone, and thus better distinguish the five gait stages.
A comparison of the DBN before and after optimization on the time domain, frequency domain, and fusion feature datasets shows that the recognition rates of the five gait stages improve to different degrees after optimization. This shows that the SSA improves the classification accuracy of the DBN model by optimizing the weight parameters in the DBN model. It also indicates that the SSA-DBN model is a practical and effective model that can be applied to actual gait classification problems.

Conclusions
In this paper, we optimized the deep belief network (DBN) by using the sparrow search algorithm (SSA) to perform gait classification based on multi-feature fusion of surface EMG signals. The DBN has the ability to find the distributed features of gait based on the underlying features. Using the fused feature combination instead of single time domain or frequency domain features yields higher classification accuracy and a lower loss rate, and avoids the uncertainty caused by traditional feature extraction. When the network structure and the parameters of the DBN are changed, whether manually or autonomously, the classification results change. Blindly increasing the number of fine-tunings may prolong the network training time and reduce the classification efficiency. The SSA is used to optimize the parameters of the DBN model, such as the number of neurons in each layer, the learning rate, and the number of fine-tunings, to avoid human-made interference. The experimental results show that SSA-DBN improves the gait recognition rate and stability. However, the SSA also increases the training time of the DBN network, and the optimization effect in some gait stages is limited. With the development of human gait detection and intelligent safety monitoring, the requirements for real-time classification are increasing, which lays the basis for our future research direction.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy issues.