Estimation of Lower Extremity Muscle Activity in Gait Using the Wearable Inertial Measurement Units and Neural Network

The inertial measurement unit (IMU) has become more prevalent in gait analysis. However, it can only measure the kinematics of the body segment it is attached to. Muscle behaviour is an important part of gait analysis and provides a more comprehensive overview of gait quality. Muscle behaviour can be estimated using musculoskeletal modelling or measured using an electromyogram (EMG). However, both methods can be tasking and resource intensive. A combination of IMU and neural networks (NN) has the potential to overcome this limitation. Therefore, this study proposes using NN and IMU data to estimate nine lower extremity muscle activities. Two NN were developed and investigated, namely feedforward neural network (FNN) and long short-term memory neural network (LSTM). The results show that, although both networks were able to predict muscle activities well, LSTM outperformed the conventional FNN. This study confirms the feasibility of estimating muscle activity using IMU data and NN. It also indicates the possibility of this method enabling the gait analysis to be performed outside the laboratory environment with a limited number of devices.


Introduction
The inertial measurement unit (IMU) has been widely viewed as an economical and practical alternative to the optical motion capture system. A typical optical motion capture session involves placing numerous reflective markers on anatomical landmarks to measure the movements of the lower extremity segments: the pelvis, foot, shank and thigh. With IMUs, the movement of these segments can be collected using a limited number of wearable sensors. These can be placed and aligned on the lateral or anterior side of the leg to obtain the kinematics of the foot, shank and thigh during walking. The sensors are small and light and can capture human motion outside a laboratory environment. Several studies have demonstrated the accuracy and reliability of the IMU for gait analysis [1,2].
The IMU can be used to derive other valuable information, such as the spatial and temporal gait parameters [3,4]. It can also be used to evaluate and diagnose abnormal gait [5,6] and to identify ageing-related physiological changes [6,7]. Other studies show that IMU alone can be used to perform inverse dynamics analysis to estimate joint moment and ground reaction force in gait [8,9].
A more comprehensive gait analysis involves the use of electromyograms (EMG). Measuring muscle activity using EMG is not a trivial task. Two to three electrodes per muscle must be accurately placed around the muscle to characterize its behaviour. This means that a total of 10 to 15 electrodes have to be placed around the thigh to record the dynamics of the thigh muscle. Surface EMG (SEMG) is a widely adopted measurement technique, but it has several drawbacks. Among them is crosstalk, which happens when SEMG detects the myoelectrical activity of the neighbouring muscle. Moreover, patients often have muscle deformities, which makes it more challenging to obtain accurate readings. Regardless of whether a wireless or wired EMG, a large number of electrodes must be

Gait Dataset
This study used an online dataset reported in [18]. It contains walking data collected from 13 healthy male and 9 healthy female individuals (age: 18-35 years, height: 1.5-1.8 m, weight: 52-96 kg) walking at three different speeds for ten trials in four different conditions: level ground, ramp, stairs and treadmill. It has two types of data: IMU data and EMG data. The IMU data contain the 3D angular velocity and acceleration of the foot, shank, thigh and trunk. The EMG data contain the muscle activity of gluteus medius, right external oblique, semitendinosus, biceps femoris, rectus femoris, vastus lateralis, vastus medialis, soleus, tibialis anterior and gastrocnemius. The right external oblique was excluded from this study because it is an abdominal muscle rather than a lower extremity muscle. This work focuses only on level-ground walking, regardless of walking speed; the other data were therefore ignored. The data from 3 subjects were excluded because they contained more than 4 inconsistent muscle activities, possibly due to crosstalk during data collection.

Data Processing
The EMG data were pre-processed following the International Society of Electrophysiology and Kinesiology (ISEK) standards [19]. First, a Fast Fourier transform (FFT) was used to obtain the EMG power spectrum. The primary signal content was found to lie between 20 Hz and 400 Hz. Therefore, a Butterworth bandpass filter covering this range was applied to reduce the noise. Second, the filtered data were rectified. Third, another FFT was performed on the rectified data to determine an appropriate cut-off frequency for smoothing the signal. A Butterworth low-pass filter with a cut-off frequency of 8 Hz was selected and applied to the rectified data. The resulting EMG envelope represents the muscle activation profile.
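The filtering chain described above (band-pass, rectification, low-pass) can be sketched as follows. This is a minimal illustration rather than the authors' code: the sampling rate `fs`, the filter order and the use of zero-phase filtering are assumptions that the text does not specify, and `emg_envelope` is a hypothetical helper name.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def emg_envelope(raw, fs=1000.0, band=(20.0, 400.0), lp_cutoff=8.0, order=4):
    """Return a linear envelope of one raw EMG channel.

    fs, order and the zero-phase (forward-backward) filtering are
    assumptions; only the 20-400 Hz band and the 8 Hz cut-off come
    from the text.
    """
    # 1) Butterworth band-pass to keep the main EMG content.
    sos_bp = butter(order, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos_bp, raw)
    # 2) Full-wave rectification.
    rectified = np.abs(filtered)
    # 3) Butterworth low-pass to smooth the rectified signal.
    sos_lp = butter(order, lp_cutoff, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos_lp, rectified)
```

Note that a 400 Hz upper band edge requires a sampling rate above 800 Hz, so `fs` must be set to the actual rate of the recording before this sketch is used on real data.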
In the subsequent step, the EMG data were segmented on a stride-to-stride basis. The timing of the heel strike given in the dataset was used to define the start and end of one stride (one complete gait cycle). The segmented data were then time-normalized to 101 data points, representing the percentage of the gait cycle. The median filter and min-max normalization were then applied sequentially. The median filter is a nonlinear digital filter and is good at removing impulsive noise [20]. The min-max normalization, as defined in (1), creates an array of 101 data points with values ranging between 0 and 1.
$$ y_{norm,i} = \frac{y_i - y_{min}}{y_{max} - y_{min}} \qquad (1) $$
where $y_{norm}$ is the normalized data, $y$ is the original data, $y_{min}$ is the minimum of the data, $y_{max}$ is the maximum of the data and $i$ is the index of the data point, $i = 0, 1, 2, \ldots, 100$. A sample of processed EMG data is shown in Figure 1a. The amplitude and timing of the peak muscle contraction (Figure 1b) were identified in every gait cycle to evaluate the performance of the NN in estimating the muscle behaviour.
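The stride-level processing described above can be sketched in a few lines. The sketch assumes linear interpolation for the time normalization and a median-filter window of 5 samples, neither of which is stated in the text; `heel_strikes` is assumed to be a list of sample indices, and all function names are hypothetical.

```python
import numpy as np
from scipy.signal import medfilt


def time_normalize(stride, n_points=101):
    """Resample one stride to n_points (0-100% of the gait cycle)."""
    old = np.linspace(0.0, 100.0, num=len(stride))
    new = np.linspace(0.0, 100.0, num=n_points)
    return np.interp(new, old, stride)  # linear interpolation (assumption)


def min_max(y):
    """Min-max normalization of one stride to the range [0, 1]."""
    span = y.max() - y.min()
    if span == 0:
        return np.zeros_like(y, dtype=float)  # flat signal: avoid 0/0
    return (y - y.min()) / span


def process_strides(signal, heel_strikes, kernel=5):
    """Segment a channel at heel strikes, then resample, filter, normalize."""
    cycles = []
    for start, end in zip(heel_strikes[:-1], heel_strikes[1:]):
        cycle = time_normalize(signal[start:end + 1])
        cycle = medfilt(cycle, kernel_size=kernel)  # window size is an assumption
        cycles.append(min_max(cycle))
    return np.stack(cycles)  # shape: (number of strides, 101)
```

The same function can be applied channel by channel to the IMU signals, since they go through the same segmentation and normalization steps.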
The IMU data were processed in the same way as the EMG data: they were segmented on a stride-to-stride basis and time-normalized to 101 data points. The data were then filtered using a median filter and min-max normalized. A sample of the processed IMU data is shown in Figure 2.

Neural Network
Two NN models were developed here. The first model is an FNN with 1 input layer, 1 output layer, 4 hidden (dense fully connected) layers with 256 neurons each and a dropout layer between each layer. The input features are arranged in a 2D array that cascades the normalized 3D acceleration and 3D angular velocity of the trunk, thigh, shank and foot in one gait cycle. The target output is 1D normalized EMG data for each individual muscle. The layout of this model is shown in Figure 3a.
The LSTM has a similar architecture to the FNN. The only difference is that, instead of dense hidden layers, it has LSTM hidden layers, as illustrated in Figure 3b. In a traditional FNN, information flows in one direction only, from input to output, without any feedback [21]. This means that an FNN has no internal memory and cannot carry information from one time step to the next. An LSTM, on the other hand, feeds its output back as an input, so it can learn from the history of the process, classify and predict time-series data and retain values over long periods [22]. Both models were developed using TensorFlow. The proposed methodology is summarized in Figure 3.

FNN and LSTM use the same 'tanh' as the activation function in the hidden layers. They use the same 'sigmoid' as the output layer activation function. Both models used Mean Square Error (MSE) as the loss function and were optimized using Adam optimizer.
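To make the FNN configuration concrete, the forward pass and loss can be sketched in plain NumPy. This is not the authors' TensorFlow model: the weights are random, dropout is omitted (it is inactive at inference time), and flattening the 101 x 24 input (4 segments x 6 IMU channels) into one vector is an assumption about how the 2D array is fed to the dense layers. The names `init_params`, `forward` and `mse` are hypothetical.

```python
import numpy as np

N_POINTS = 101    # samples per gait cycle
N_CHANNELS = 24   # 4 segments x (3D acceleration + 3D angular velocity)
N_HIDDEN = 256    # neurons per hidden layer
LAYER_SIZES = [N_POINTS * N_CHANNELS] + [N_HIDDEN] * 4 + [N_POINTS]


def init_params(rng, sizes):
    """Random weights and zero biases for each dense layer (illustration only)."""
    return [(rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """Forward pass: tanh hidden layers, sigmoid output layer."""
    h = x.reshape(x.shape[0], -1)          # flatten each gait cycle (assumption)
    for w, b in params[:-1]:
        h = np.tanh(h @ w + b)             # hidden activation from the text
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))  # sigmoid keeps outputs in (0, 1)


def mse(y_true, y_pred):
    """Mean square error, the loss function named in the text."""
    return float(np.mean((y_true - y_pred) ** 2))
```

The sigmoid output is a natural match for the targets, because the min-max normalized EMG also lies between 0 and 1.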
The input features and target outputs were split as shown in Table 1. The data of one randomly selected subject were set aside as unseen-subject test data. The remaining data, comprising 5440 gait cycles, were shuffled and divided into three groups (training, validation and testing) with a ratio of 80:15:5, respectively.
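The random 80:15:5 split can be sketched as below. The seed and the rounding rule are assumptions, and `split_indices` is a hypothetical helper; the exact group sizes used by the authors are those reported in their Table 1.

```python
import numpy as np


def split_indices(n_cycles, ratios=(0.80, 0.15, 0.05), seed=0):
    """Shuffle gait-cycle indices and split them into train/validation/test."""
    rng = np.random.default_rng(seed)      # fixed seed for reproducibility (assumption)
    order = rng.permutation(n_cycles)
    n_train = int(round(ratios[0] * n_cycles))
    n_val = int(round(ratios[1] * n_cycles))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]         # remainder goes to the test group
    return train, val, test


train_idx, val_idx, test_idx = split_indices(5440)
```

With 5440 gait cycles, this rule would give 4352 training, 816 validation and 272 test cycles.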

Validation
A series of measures was used to determine the differences between the predicted and actual muscle activities. Among them are nRMSE and r, as defined in (2) and (3), respectively.
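$$ nRMSE = \frac{\sqrt{\frac{1}{n}\sum_{i}(X_i - Y_i)^2}}{X_{max} - X_{min}} \times 100\% \qquad (2) $$
$$ r = \frac{\sum_{i}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i}(X_i - \bar{X})^2}\,\sqrt{\sum_{i}(Y_i - \bar{Y})^2}} \qquad (3) $$
where $X_i$ is the actual EMG at position $i$, $Y_i$ is the predicted EMG at position $i$, $X_{max}$ is the maximum value of the actual EMG, $X_{min}$ is the minimum value of the actual EMG, $\bar{X}$ is the mean of the actual EMG, $\bar{Y}$ is the mean of the predicted EMG and $n$ is the number of data points. Next, the differences in time and amplitude between the actual and predicted peak muscle contractions were evaluated, as indicated in (4) and (5), respectively.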
$$ \Delta T_p = \left| T_{x,p} - T_{y,p} \right| \qquad (4) $$
$$ \Delta E_p = \frac{\left| X_p - Y_p \right|}{X_p} \times 100\% \qquad (5) $$
where $\Delta T_p$ is the time difference between the actual and predicted peak muscle contractions in % of the gait cycle, $T_{x,p}$ and $T_{y,p}$ are the times of the actual and predicted peak muscle contractions, respectively, $\Delta E_p$ is the percentage difference in amplitude between the actual and predicted peak muscle contractions, $X_p$ is the actual peak contraction and $Y_p$ is the predicted peak contraction. Lastly, the predicted and actual muscle activities were plotted together to compare them qualitatively. This involves denormalising the predicted EMG signal and reconstructing it in its original time domain using the heel strike timing. This is intended to give a more comprehensive view of the results, particularly the differences between the predicted and actual muscle behaviours in continuous gait cycles.
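The four measures referred to in (2) to (5) can be computed per gait cycle as in the sketch below. It assumes that the RMSE is normalized by the range of the actual EMG, that the peak is the global maximum within the cycle and that the peak differences are taken as absolute values; the function names are hypothetical.

```python
import numpy as np


def nrmse(x, y):
    """RMSE normalized by the range of the actual EMG x, in percent."""
    rmse = np.sqrt(np.mean((x - y) ** 2))
    return float(100.0 * rmse / (x.max() - x.min()))


def pearson_r(x, y):
    """Pearson correlation coefficient between actual x and predicted y."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))


def peak_errors(x, y):
    """Peak timing difference (% gait cycle) and peak amplitude difference (%)."""
    gait_pct = np.linspace(0.0, 100.0, num=len(x))
    ix, iy = int(np.argmax(x)), int(np.argmax(y))   # global peaks (assumption)
    delta_t = abs(gait_pct[ix] - gait_pct[iy])
    delta_e = 100.0 * abs(x[ix] - y[iy]) / x[ix]
    return float(delta_t), float(delta_e)
```

For example, a prediction whose peak arrives 2% of the gait cycle late and reaches 90% of the actual amplitude yields a timing difference of 2% and an amplitude difference of about 10%.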

Results
The average nRMSE, r, ΔTp and ΔEp of the test data are presented in Table 2. Both FNN and LSTM performed well in estimating muscle activities. For example, the average ΔTp of the tibialis anterior muscle is 0.71% and 0.72% of the gait cycle for FNN and LSTM, respectively. The largest ΔTp was found for the gastrocnemius muscle, with an average difference of 2.59% of the gait cycle for FNN and 2.40% for LSTM. FNN and LSTM performed differently when estimating the amplitude of the peak contraction. LSTM estimated the peak contraction better, with an average ΔEp of less than 20%, whereas FNN had a ΔEp as high as 22%. Despite the discrepancies in peak contraction, both models can estimate muscle activities reasonably well. The estimated EMG waveforms were similar to the actual ones (Figure 4). These results are further corroborated by the small nRMSE values and large r values. The FNN has nRMSE less than 15% and r greater than 75%, while LSTM has nRMSE less than 10% and r greater than 85%.
Next, unseen subject data were used to estimate muscle activity to ensure that the model can predict the gait data of a person outside the training and test data. The results are shown in Table 3. Although the average ΔTp is within an acceptable range, the average ΔEp is larger than 10%. This is deemed reasonable considering that they are unseen data. Nevertheless, looking into the muscle behaviour in continuous gait cycles (Figure 5), these results are comparable with the literature [23,24]. Both FNN and LSTM can estimate six muscles with nRMSE less than 20% and r greater than 70%. Breaking down the LSTM results, there are three muscles (gastrocnemius, soleus and vastus lateralis) with nRMSE less than 10% and r greater than 90%, three muscles (vastus medialis, tibialis anterior and gluteus medius) with nRMSE less than 15% and r greater than 80% and one muscle (rectus femoris) with nRMSE less than 20% and r greater than 75%. The two remaining hamstring muscles (biceps femoris and semitendinosus) performed the worst (nRMSE greater than 20% and r less than 50%). On the other hand, FNN has four muscles with nRMSE less than 15% and r greater than 80% and two muscles with nRMSE less than 20% and r greater than 70%. Other similar results can be found in Appendix A.

Discussion
This study establishes the possibility of using NN and IMU data to estimate muscle activity. The positive outcome of this work suggests that the number of sensing devices can be reduced, which further implies that the time and effort required for gait analysis can be minimized. Instead of the bulky camera systems and EMG, small and light wearable IMUs can be placed on the trunk and limbs to quantify the kinematics of the gait, thereby making out-of-lab gait analysis a reality. This also indicates that future potential research could lead to home-based, inexpensive gait detection and monitoring of people with gait abnormalities, gait deterioration and injuries. The widespread use of IMU in smartphones and wearable devices, such as fitness trackers, means that potentially more health data, such as gait patterns and muscle activities, can be provided to the users.
The application of NN is promising, particularly the LSTM. It produced an nRMSE of less than 15% with r greater than 75% for seven muscles. The estimation results also align well with the literature [23,24]. This could be attributed to the main characteristic of the LSTM, its ability to retain information over a long period of time, which allowed it to handle long-term dependencies and generate an output response that closely resembles the actual dynamic behaviour of the muscle. However, it requires greater computational effort than FNN. FNN does not need to retain information across time steps; it therefore uses fewer computational resources and is generally faster than LSTM.
Trinler et al. used a musculoskeletal model with Static Optimization (SO) and Computed Muscle Control (CMC) to estimate muscle activation of gastrocnemius, tibialis anterior, vastus medialis, vastus lateralis and rectus femoris [10]. Although their results are promising, the current study still performs better, with r as high as 95% on both test data and unseen subject data. One of the main differences between these two studies is that their study used the conventional optical motion capture system, whereas the current study used IMU data. The other difference is that NN relies on the data to learn and estimate muscle activities. A representative dataset is required for NN to produce an accurate and reliable outcome. On the other hand, SO and CMC rely on the anatomical considerations and assumptions made in the musculoskeletal model. SO calculates the muscle activity by considering the muscle tendons to be rigid and ignoring the passive muscle forces [25]. CMC computes muscle activities from joint coordinates using a combination of proportional-derivative (PD) control and SO [26].
The current findings are also in agreement with the study reported by Zabre-Gonzalez et al. [11]. Their study proposed using a NARX neural network and kinematics data derived from the motion capture system to estimate the muscle activity of two muscles. While the NN in the current study focuses on using one generalized model to estimate unseen data, their study focuses on personalized models, therefore two models (one model per muscle) have to be created to estimate the muscle activities.
The actual and predicted muscle activities of gastrocnemius, soleus, vastus lateralis and vastus medialis were similar to those in the literature [23,24], as depicted in Figure 5 and Appendix B. However, for some muscles, minor differences were observed. These were expected, particularly when the unseen data were used. For instance, a typical tibialis anterior muscle (Figure 5b) has two main contractions: one occurs between the pre-swing and mid-swing (between 60-80% gait cycle) and another between terminal swing and opposite toe-off (between 90% of the current gait and 10% of the subsequent gait). In some gaits, the EMG captured small muscle contractions during the stance phase. The NN could not predict these accurately, thus negatively affecting the quantitative results. A similar trend was found in gluteus medius (Figure 5i).
Several muscles, such as rectus femoris (Figure 5f), biceps femoris (Figure 5g) and semitendinosus (Figure 5h), were reported to have speed-dependent features [27-30]. For instance, the hamstring muscles (biceps femoris and semitendinosus) activate at the end of the gait cycle (peak around 90% gait cycle) [23,24]. However, in some gaits, an additional contraction was found at the stance phase (around 30% gait cycle). This contraction is more significant in slow walks and the amplitude of this peak can sometimes be greater than the actual contraction. Although this component has been described in previous literature [27,28], it has not been thoroughly explored. Despite its occasional prominence, the NN gave less precedence to this feature and was able to predict the actual contraction accurately. Rectus femoris (Figure 5f) muscle activity occurs during the pre- and initial swing phase (around 50% of the gait) [23,24]. At slow walks, this activity can be minimal (almost zero) and its amplitude increases with speed [29,30]. Due to its high dependence on speed, the NN could not reliably predict this feature. By providing speed or time difference as the inputs in future work, the accuracy of the NN could be improved. On the other hand, the NN could accurately predict the peak muscle contraction at the start of the gait (around 0-20% gait cycle). Although this peak is the most prominent and consistent in all gait data, it is considered to be the crosstalk from vastus lateralis [29,30]. Crosstalk is a known limitation of the SEMG and is widely reported in the literature [31]. Since NN relies on the data to produce correct output responses, this crosstalk will always be present in the predicted results.
The main difficulty faced in this study is inter-subject gait variance. Although the EMG data are normalized to mitigate this issue, secondary or minor peak contractions can have different amplitudes. These peaks are hard to predict and the source of error is difficult to identify. In addition, these secondary peaks can be higher than the actual peak, especially in slow walks, where muscles behave differently. As these peaks are inconsistent and vary from subject to subject, NN cannot accurately estimate muscle behaviour. This can be observed in the results of the unseen test data (Table 3).
Another limitation of this study is that, although the total number of gait cycles is large, these data come from a population with a narrow age range of 18 to 35 years. This could limit the performance when predicting the muscle activities of the elderly and children. The elderly have different gait characteristics, gait kinematics and kinetics [32,33] and muscle activity [34] compared to healthy young adults. Likewise, children have distinct walking behaviours, as they differ in body mass distribution and proportion [35], gait features [36] and muscle activities [37].
Since this is the first attempt to incorporate IMU and NN to estimate muscle activity, several potential improvements can be explored and investigated in the future. Among them is the use of a larger dataset that includes different types of gaits. Feature extraction in time and frequency domains can be proposed too, such as in [17]. Lastly, different neural network models such as Convolutional Neural Network (CNN) and CNN-LSTM [38] can be developed, trained and compared.

Conclusions
This study demonstrates the potential of using IMU data and NN to estimate muscle activity. LSTM performed better than FNN. It was able to estimate three muscles with r greater than 90% and nRMSE less than 10% and seven muscles with r greater than 70% and nRMSE less than 20% using IMU data as input. This study also shows that a minimal number of modalities/sensors can be used to estimate muscle activity: four IMUs attached to the foot, shank, thigh and trunk can estimate nine lower extremity muscle activities during walking. IMUs offer several advantages over their conventional counterparts. They are portable and inexpensive, thus allowing gait analysis to be performed anywhere outside the laboratory. Studies also show that IMUs can produce measurements equivalent to the gold standard. With the wide availability of IMUs, gait analysis can be performed remotely for diagnosis and patient monitoring, as well as to provide additional health data. The use of NN here also demonstrates the ability of machine learning to handle gait variation, whether inter-subject or inter-stride. However, the success of NN relies heavily on the data. Therefore, the first stage of future study will be the collection of gait data from a wide range of populations, such as the elderly and patients with gait abnormalities, followed by the exploration of different feature extraction methods and neural network models.
Sensors 2023, 23, x FOR PEER REVIEW
Figure A2. A plot of actual vs. predicted muscle activities of the test data (Sample B).
Figure A3. A plot of actual vs. predicted muscle activities of the test data (Sample C).
Figure A5. A plot of actual vs. predicted muscle activities of the unseen data (Sample B).
Figure A6. A plot of actual vs. predicted muscle activities of the unseen data (Sample C).