Task-Oriented Muscle Synergy Extraction Using An Autoencoder-Based Neural Model

The growing interest in wearable robots poses the challenge of developing intuitive and natural control strategies. Among several human–machine interaction approaches, myoelectric control consists of decoding the motor intention from muscular activity (i.e., EMG signals) with the aim of driving prosthetic or assistive robotic devices accordingly, thus establishing an intimate human–machine connection. In this scenario, bio-inspired approaches, e.g., synergy-based controllers, have proven to be the most robust. However, the synergy-based myo-controllers proposed so far in the literature rely on muscle patterns that are computed by considering only the total variance reconstruction rate of the EMG signals, without taking into account the performance of the controller in the task (or application) space. In this work, extending a previous study, the authors present an autoencoder-based neural model able to extract muscle synergies for motion intention detection while optimizing the task performance in terms of force/moment reconstruction. The proposed neural topology has been validated with EMG signals acquired from the main upper limb muscles during planar isometric reaching tasks performed in a virtual environment while wearing an exoskeleton. The presented model has been compared with the non-negative matrix factorization algorithm (i.e., the most used approach in the literature) in terms of muscle synergy extraction quality, and with three techniques already presented in the literature in terms of goodness of the predicted shoulder and elbow moments. The experimental comparisons have shown that the proposed model outperforms the state-of-the-art synergy-based joint moment estimators at the expense of the quality of the EMG signal reconstruction.
These findings demonstrate that a trade-off can be achieved between the capability of the extracted muscle synergies to describe the EMG signal variability and the task performance in terms of force reconstruction. The results of this study might open new horizons for synergy extraction methodologies and optimized synergy-based myo-controllers and, perhaps, reveal useful hints about the origin of muscle synergies.


Introduction
The discovery of human brain capabilities has always gained high interest and expectation among all the disciplines that study and do research on the human body. This trend is closely linked to three that are an initial attempt to link muscle synergies with task variables [19,38–40]. However, as discussed in depth by Barradas et al. [41] and Cristiano et al. [19], functional synergies present some issues and limitations. After an extensive argumentation, Cristiano and colleagues state that a novel technique for muscle synergy extraction is required, one that "... should optimize the reconstruction error of the EMG signals, and constrain a good fit of the task-variables".
In this work, as an extension of a previous study [34], the authors propose a novel autoencoder-based neural model able to extract the muscle synergy patterns while simultaneously considering the performance in the task space, i.e., the estimation of the moments/forces exerted by the human upper limb. Specifically, the novel autoencoder-based model builds its synergy code considering both the EMG signal reconstruction performance and the estimation quality of the upper limb moments, computed as a linear combination of the synergy activation signals, thus allowing for a task-oriented synergy extraction. The authors believe that directly integrating task-space constraints into the algorithm used to extract the synergies could produce a better estimation of the task-space variables, thus leading to a new class of optimized myo-controllers and, perhaps, providing a deeper understanding of the hypothetical modularity of the central nervous system and its relationship with motor learning.

Participants
Nine right-handed healthy subjects (seven males, aged 27.7 ± 4.9 years, weight 74.1 ± 9.1 kg) were involved in the study. All the subjects signed a written consent form before joining the experiments. The experimental procedures were conducted in accordance with the World Medical Association Declaration of Helsinki and approved by the Ethical Review Board of Scuola Superiore Sant'Anna (Approval Number: 1292).

Experimental Setup
The setup was designed to measure the EMG signals of the subject's upper-limb muscles and the forces exerted at the hand level during a set of isometric contractions (see Figure 1). An electromechanical exoskeleton designed for upper-limb rehabilitation, namely the L-Exos, was used to acquire the interaction force between the subject's hand and the exoskeleton's cylindrical handle, which features a triaxial force sensor. The L-Exos has been designed as a wearable haptic interface capable of providing a controllable force at the center of the user's right-hand palm, oriented along any direction in space [42]. The L-Exos has four actuated DOFs supporting elbow and shoulder movements (shoulder adduction/abduction, shoulder flexion/extension, shoulder internal/external rotation and elbow flexion/extension) and one passive DOF used for measuring the wrist pronosupination angle. All the motors of the exoskeleton are located on the fixed frame. For each actuated DOF, the torque is delivered from the motor to the corresponding joint by means of steel cables and a reduction gear integrated at the joint axis. All actuated joints are driven with a proportional-derivative control strategy with gravity compensation. The force sensor readings were then used to estimate the articulation moments. Concerning the EMG acquisition system, two bio-signal amplifiers (g.USBamp, gTec, Austria) were included in the setup to record the activity of 13 muscle heads: biceps short head, biceps long head, brachioradialis, triceps long head, triceps lateral head, deltoid anterior head, deltoid posterior head, trapezius, pectoralis major, teres major, infraspinatus, latissimus dorsi and rhomboid. After skin cleaning, disposable Ag/AgCl surface electrodes were placed following the SENIAM recommendations, and the ground electrode was attached to the right elbow.
All the surface EMG signals were acquired at a sampling frequency of 1200 Hz and filtered by the amplifier with a 5–500 Hz band-pass filter and a 50 Hz notch filter. To make the experimental session more intuitive, the subject was immersed in a virtual environment (VE) by wearing a head-mounted display (Oculus Rift HMD, Oculus) that provided visual feedback. The force sensor measurements, VE signals and EMG data were synchronized on a single PC (Master PC) featuring Microsoft Windows 10 (64 bit), an Intel i7 1.6 GHz CPU, 8 GB of RAM and Matlab [43] (Release 2018b). The Master PC was also used to generate the commands driving the exoskeleton and the VE, according to the acquisition routine.

Data Acquisition Protocol
Before starting the acquisition routine, subjects were invited to sit on a chair and wear the exoskeleton using the flip-off arm bands. The height of the seat was adjusted by stacking hard plastic layers under the chair, in order to align the centers of rotation of the subject's and the exoskeleton's shoulder joints. At the beginning of the experiment, the exoskeleton joint angles were automatically fixed to a pre-defined set: shoulder adduction/abduction angle equal to 0 degrees, shoulder internal rotation angle equal to 0 degrees, shoulder elevation angle equal to 10 degrees and elbow angle equal to 90 degrees. After the surface EMG electrodes were placed on the targeted muscles, elastic bands were used to keep electrodes and wires firmly attached to the body in such a way that the exoskeleton handle was easily reachable. Then, subjects were asked to perform 16 isometric virtual reaching tasks along 8 directions (two trials per direction) on the sagittal plane, equally spaced at 45 degrees and randomly sorted. Isometric contractions were achieved through position control of the exoskeleton end effector, which kept the subject's upper-limb pose fixed. In the virtual environment, the subject's hand position corresponded to a red sphere (cursor) and the task target was represented as a green sphere. The distance between the two spheres was covered by applying the target force of 20 N on the sensor, and the radius difference allowed a maximum positioning error equal to 3 N (1 N = 1 kg·m/s²). Each virtual reaching task consisted of (1) positioning the cursor inside the target, (2) holding it in place for 2 s and then (3) relaxing to move the cursor back to the rest position. The cursor position was driven by a spring model $P_c = K F_{EE}$, where $P_c$ is the 3D cursor position, $F_{EE}$ is the applied isometric force vector and $K$ is the elastic constant of the virtual spring.
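As a rough illustration of the spring model and of the target tolerance described above, the following Python sketch maps an applied force to a cursor position and checks whether the target is reached. The value of the elastic constant, the choice of plane axes and the function names are hypothetical and are not taken from the original setup.

```python
import math

TARGET_FORCE_N = 20.0  # force needed to reach the target (from the protocol)
TOLERANCE_N = 3.0      # maximum positioning error, expressed as a force
HOLD_TIME_S = 2.0      # time the cursor has to stay inside the target


def cursor_position(force_n, k=0.01):
    """Spring model P_c = K * F_EE; k (in m/N) is a hypothetical constant."""
    return tuple(k * f for f in force_n)


def target_force(direction_deg):
    """Target force vector for one direction (plane axes are an assumption)."""
    angle = math.radians(direction_deg)
    return (TARGET_FORCE_N * math.cos(angle),
            TARGET_FORCE_N * math.sin(angle),
            0.0)


def inside_target(force_n, direction_deg):
    """True when the applied force is within TOLERANCE_N of the target force."""
    return math.dist(force_n, target_force(direction_deg)) <= TOLERANCE_N


# The 8 reaching directions, equally spaced at 45 degrees.
directions = [45 * i for i in range(8)]
```

Because the cursor position is proportional to the force, checking the tolerance in force units is equivalent to checking the distance between the two spheres scaled by the elastic constant.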

Autoencoder-Based Neural Model for Muscle Synergy Extraction and Task Optimization
In this work the authors propose a novel neural architecture that learns the muscle synergy patterns leading to an optimized synergy-based movement intention detection. The structure of the presented model (see Figure 2) is a feed-forward neural network composed of two main blocks, discussed in the following sections: an undercomplete autoencoder for muscle synergy extraction and a feed-forward layer for movement estimation based on muscle synergy activations.

Undercomplete Autoencoder for Muscle Synergy Extraction
The first block of the proposed model is an undercomplete autoencoder that has been previously proposed by the authors for muscle synergy extraction [34]. Autoencoders (AEs) belong to the family of unsupervised learning techniques and are models that leverage neural networks for representation learning tasks, e.g., denoising, feature reduction, clustering, image processing [44–49]. Specifically, an autoencoder is a feed-forward neural network that is trained to copy the input data to the output layer. Internally, an autoencoder is based on a symmetric topology that is composed of an encoder and a decoder. The encoder is composed of one or more layers aiming at codifying the input into a code $h$ that is representative of the input $x$, i.e., $h = e(x)$. The decoder has a symmetric structure with respect to the encoder and produces a reconstruction $r = d(h)$. A useful autoencoder is not one that copies the input perfectly, but one that generates an output resembling the training data.
Among the several AE families [50], in this work the authors considered a particular type of autoencoder that is called undercomplete AE. An undercomplete AE has a specific structure that is able to extract the most representative features contained in the input data. Such a property is achieved by constraining the size of the code $h$ to a value that is smaller than the dimension of the input $x$. By introducing such a bottleneck, the AE is forced to learn an internal structure that exists in the input data, e.g., correlation among input signals.
Referring to Figure 2, given the input feature vector $m(t) = [m_1(t), \ldots, m_N(t)]^T$, where $m_i(t)$ indicates the pre-processed activation of the i-th muscle and $N$ is the number of considered muscles, the AE has the objective of extracting the muscle synergy activations $s_i$. A three-step pre-processing routine is executed on each raw electromyographic signal: (1) high-pass filtering (20 Hz, second-order Butterworth); (2) rectification and low-pass filtering (5 Hz, second-order Butterworth); (3) per-channel normalization by the maximum value computed at step 2.
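The three pre-processing steps can be sketched in plain Python as follows. This is a minimal illustration and not the authors' Matlab routine: the filters are applied causally (whether zero-phase filtering was used is not stated), the synthetic test signal is made up, and all function names are hypothetical.

```python
import math

FS = 1200.0  # sampling frequency in Hz, as in the experimental setup


def butter2(cutoff_hz, fs, kind):
    """Second-order Butterworth biquad via bilinear transform ('low' or 'high')."""
    w0 = 2.0 * math.pi * cutoff_hz / fs
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2Q) with Q = 1/sqrt(2)
    if kind == "low":
        b = [(1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2]
    else:
        b = [(1 + cos_w0) / 2, -(1 + cos_w0), (1 + cos_w0) / 2]
    a0 = 1 + alpha
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return [bi / a0 for bi in b], a


def biquad_filter(b, a, x):
    """Causal direct-form I filtering of the sequence x."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def preprocess(raw, fs=FS):
    """High-pass, rectify, low-pass, then normalize one EMG channel."""
    high_passed = biquad_filter(*butter2(20.0, fs, "high"), raw)   # step 1
    rectified = [abs(v) for v in high_passed]                      # step 2a
    envelope = biquad_filter(*butter2(5.0, fs, "low"), rectified)  # step 2b
    peak = max(envelope)
    return [v / peak for v in envelope] if peak > 0 else envelope  # step 3


# Synthetic channel: DC offset plus a 100 Hz carrier that is strong for 1 s.
t = [n / FS for n in range(2400)]
raw = [0.5 + (1.0 if 0.5 <= ti < 1.5 else 0.1) * math.sin(2 * math.pi * 100 * ti)
       for ti in t]
envelope = preprocess(raw)
```

In this toy signal the envelope is close to its maximum during the simulated contraction and drops to a small fraction of it at rest, which is the behaviour expected from a normalized muscle activation.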
The proposed topology has one hidden layer with four positive linear neurons that encode the muscle activations into synergy activations named $s_1(t)$, $s_2(t)$, $s_3(t)$ and $s_4(t)$. Such a configuration has been chosen with the aim of replicating the physiological model of the spatial muscle synergies reported and discussed in depth in the work of Berger and D'Avella [13].
In this work, the authors did not investigate the best number of hidden neurons, i.e., the number of muscle synergies. A code dimension equal to four has been used since some studies in the literature have reported that upper limb muscle activations during planar isometric reaching tasks can be accurately described by four muscle synergies [13,51]. As suggested by Goodfellow et al., the authors have used a simple linear decoder with biases, since a decoder with excessive learning capacity could learn the copying task without extracting useful information [50].

Feed-Forward Layer for Synergy-Based Movement Intention Detection
Differently from the authors' previous work [34], in this paper the synergy-based movement intention detection has been achieved by adding a feed-forward block on top of the encoding hidden layer of the AE. By adding such a block, the proposed neural model is able to compute the muscle synergy patterns that lead to the best trade-off between muscle activation reconstruction and movement intention estimation, i.e., hand force or articulation moment prediction. To the best of the authors' knowledge, this is the first attempt in the literature to build a model that is able to extract muscle synergies while considering the performance in the task space, i.e., movement. Regarding the activation function, the layer uses a linear function and no bias has been added. Such a configuration allows for the computation of the forces/moments as a linear combination of the synergy activations. In detail, the output vector $T(t) = [T_1(t), T_2(t)]$ represents the estimated moments. It is worth mentioning that the moment components $T_1(t)$ and $T_2(t)$ have been normalized to range within the interval [−0.5, 0.5].
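The topology described in the last two sections (13 muscle inputs, four positive linear hidden neurons, a linear decoder with biases and a bias-free linear moment layer) can be sketched with NumPy as a forward pass plus a joint cost. This is only an illustration under stated assumptions: the weights are random rather than trained, the presence of an encoder bias and the way the two errors are combined (a weighted sum of mean squared errors) are not specified in the text, and all variable names are hypothetical.

```python
import numpy as np

N_MUSCLES, N_SYNERGIES, N_MOMENTS = 13, 4, 2
rng = np.random.default_rng(0)

# Hypothetical, randomly initialised parameters (training is not shown here).
W_enc = rng.normal(scale=0.1, size=(N_SYNERGIES, N_MUSCLES))
b_enc = np.zeros(N_SYNERGIES)  # encoder bias: an assumption
W_dec = rng.normal(scale=0.1, size=(N_MUSCLES, N_SYNERGIES))
b_dec = np.zeros(N_MUSCLES)    # the decoder is linear with biases
H_model = rng.normal(scale=0.1, size=(N_MOMENTS, N_SYNERGIES))  # no bias


def forward(m):
    """m: (P, N_MUSCLES) pre-processed activations, one row per time sample."""
    s = np.maximum(0.0, m @ W_enc.T + b_enc)  # positive linear encoder
    m_hat = s @ W_dec.T + b_dec               # reconstructed muscle activations
    t_hat = s @ H_model.T                     # moments as linear combination of s
    return s, m_hat, t_hat


def joint_loss(m, t, weight=1.0):
    """EMG reconstruction MSE plus weighted moment MSE (combination assumed)."""
    _, m_hat, t_hat = forward(m)
    return np.mean((m_hat - m) ** 2) + weight * np.mean((t_hat - t) ** 2)


# Made-up data with the ranges used in the text: activations in [0, 1],
# normalized moments in [-0.5, 0.5].
m = rng.uniform(0.0, 1.0, size=(100, N_MUSCLES))
t = rng.uniform(-0.5, 0.5, size=(100, N_MOMENTS))
s, m_hat, t_hat = forward(m)
loss = joint_loss(m, t)
```

Minimizing a cost of this kind with respect to all the weights is what couples the synergy code to the task space: the same activations `s` have to serve both the decoder and the moment layer.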

Network Training
The proposed network has been implemented using the Neural Network toolbox of Matlab (Release 2018b), and trained using a gradient descent with momentum and adaptive learning rate algorithm for 1000 epochs. Given a single training set, the training of the neural model has been repeated 10 times considering different initial weights [35], then the model featuring the best performance has been considered for the next analysis. Considering a training set composed of about 1000 time points, the training process of the model lasts about 4.5 s. All the training sequences have been run on a PC featuring two Intel XEON E5 2630 v3 CPUs and 64 GB of RAM.

Muscle Synergy Extraction: AE vs. NNMF
The AE block of the proposed model has been compared with the technique most used in the literature for muscle synergy extraction, i.e., Non-Negative Matrix Factorization (NNMF). Given a matrix $M$, the NNMF is a factorization algorithm able to compute the two matrices $W$ and $C$ such that:

$$M \approx W C \qquad (1)$$

with the property that all three matrices have no negative elements. When the NNMF algorithm is applied to pre-processed muscle activation signals, the matrix $M$ (size: $N \times P$) contains the muscle activation observations during a task, where $N$ is the number of recorded muscles and $P$ is the number of time samples, $W$ (size: $N \times Q$) is the synergy matrix, where $Q$ is the number of extracted synergies, and $C$ (size: $Q \times P$) is the matrix that contains the synergy activation signals. Given such nomenclature, $m(t)$ and $c(t)$ indicate a single column (corresponding to a single sample time) of $M$ and $C$, respectively. After the synergy model has been defined by running the NNMF on a training set, i.e., once the synergy matrix $W$ has been computed, the synergy activation vector $c(t)$ related to a test EMG signal vector $m(t)$ can be computed as follows [13]:

$$c(t) = W^{+} m(t) \qquad (2)$$

where $W^{+}$ is the pseudo-inverse matrix of $W$.
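To make the two NNMF steps concrete, the sketch below factorizes a non-negative matrix as the product of a synergy matrix and an activation matrix, and then projects a test sample through the pseudo-inverse of the synergy matrix. It uses the classical multiplicative update rules as a stand-in, since the text does not state which NNMF solver was used; the data are synthetic and the function name is hypothetical.

```python
import numpy as np


def nnmf(M, q, n_iter=500, seed=0, eps=1e-12):
    """Factorize a non-negative M (N x P) as W (N x q) @ C (q x P)."""
    rng = np.random.default_rng(seed)
    n, p = M.shape
    W = rng.uniform(0.1, 1.0, size=(n, q))
    C = rng.uniform(0.1, 1.0, size=(q, p))
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and C non-negative.
        C *= (W.T @ M) / (W.T @ W @ C + eps)
        W *= (M @ C.T) / (W @ C @ C.T + eps)
    return W, C


# Synthetic training data: 13 muscles, 200 samples, 4 underlying synergies.
rng = np.random.default_rng(1)
W_true = rng.uniform(0.0, 1.0, size=(13, 4))
C_true = rng.uniform(0.0, 1.0, size=(4, 200))
M_train = W_true @ C_true

W, C = nnmf(M_train, q=4)
rel_err = np.linalg.norm(M_train - W @ C) / np.linalg.norm(M_train)

# Synergy activations of a test sample via the pseudo-inverse of W.
m_test = W_true @ rng.uniform(0.0, 1.0, size=4)
c_test = np.linalg.pinv(W) @ m_test
```

Note that, unlike the activations returned by the factorization itself, the entries of `c_test` obtained through the pseudo-inverse are not constrained to be non-negative.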

Joint Moment Estimation Based on Muscle Synergies: Comparison with the State-of-the-Art
Muscle synergies have been previously used to detect the motor intention and continuously drive robotic devices such as prosthetic hands and assistive interfaces. A good myoelectric controller should be able to process the activations of the involved muscles and compute an estimation of the intended movement in terms of both direction and amplitude. For a robotic interface driven by an admittance control, such an estimation has to be a force/torque vector. As an example, an upper limb exoskeleton could assist the arm movement if it moves according to the patient's movement intention.
It is well known that, under certain conditions [7], the EMG-based force/moment estimation can be based on a linear combination of the processed EMG signals as follows:

$$T = H f_{EMG} \qquad (3)$$

where $T$ represents the vector of the force/moment components, $f_{EMG}$ is the vector of the instantaneous EMG-based features and $H$ is the matrix relating EMG features to force/moment, estimated using multiple linear regressions of each applied force/moment component. If the movement is constrained to a plane, $T$ is a $2 \times 1$ vector, $f_{EMG}$ is an $M \times 1$ vector and $H$ is a $2 \times M$ matrix, where $M$ is the number of considered EMG-based features. In this study, the performance of a bi-dimensional motion intention estimator based on the model presented in Figure 2 has been compared with other methods already proposed in the literature that are based on the same model described by Equation (3). In detail, the authors have considered four models: (1) the $Hm$ model, $T(t) = H m(t)$, which maps the muscle activations directly to the joint moments; (2) the $HWW^{+}m$ model, $T(t) = H W W^{+} m(t)$; (3) the $\hat{H}c$ model, $T(t) = \hat{H} c(t)$; (4) the AE-based model, $T(t) = H_{model} s(t)$. It is worth noting that the matrix $H$ has dimension equal to $2 \times N$ (two is the number of moment components: shoulder and elbow joint moments), whereas the matrices $\hat{H}$ and $H_{model}$ have size $2 \times 4$ (four is the number of considered muscle synergies). All methods have been tested using the same set of muscle activation recordings.
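The linear mapping between EMG-based features and moments can be estimated by ordinary least squares, which solves one multiple linear regression per moment component. The following sketch uses synthetic data and a hypothetical helper name; it does not include an intercept term, in line with the bias-free linear model described above.

```python
import numpy as np


def fit_linear_map(F, T):
    """Least-squares H such that T is approximately H @ F.

    F: (M x P) feature matrix, one column per time sample.
    T: (2 x P) measured moments, one column per time sample.
    """
    H_transposed, *_ = np.linalg.lstsq(F.T, T.T, rcond=None)
    return H_transposed.T


# Synthetic example: 13 features, 300 samples, small measurement noise.
rng = np.random.default_rng(0)
H_true = rng.normal(size=(2, 13))
F = rng.uniform(0.0, 1.0, size=(13, 300))
T = H_true @ F + 0.01 * rng.normal(size=(2, 300))

H = fit_linear_map(F, T)  # estimated 2 x 13 mapping
T_hat = H @ F             # predicted shoulder and elbow moments
```

The same helper could be applied to synergy activations instead of muscle activations, in which case the estimated matrix would have size 2 × 4.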

Model Calibration and Performance Metrics
Each subject-specific model has been independently trained on 256 ($2^8$) different training sets, where 2 is the number of isometric reaching trials executed for each of the 8 directions (see Section 2.3 for details). Hence, a single training set contains the sEMG and moment data acquired in one trial out of two for each of the eight directions. Given a single training set, all models have then been evaluated on the complementary test set.
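The count of 256 follows from choosing one of the two trials independently for each of the eight directions. A short Python sketch of this enumeration, with hypothetical variable names, is given below.

```python
from itertools import product

N_DIRECTIONS, N_TRIALS = 8, 2

# Each training set picks one trial index (0 or 1) for every direction.
training_sets = list(product(range(N_TRIALS), repeat=N_DIRECTIONS))


def complementary(train_choice):
    """Test set: for every direction, the trial not used for training."""
    return tuple(1 - trial for trial in train_choice)


# Pairs of (training choice, test choice), one pair per training set.
splits = [(train, complementary(train)) for train in training_sets]
```

Each element of `splits` tells, direction by direction, which trial feeds the training set and which one is left for testing.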
The multivariate $R^2$ index has been computed for each test set in order to evaluate the synergy extraction performance of both the NNMF and the AE. The multivariate $R^2$ index represents the fraction of total variation accounted for by the synergy reconstruction and is therefore a global indicator of the goodness of reconstruction. The $R^2$ has been computed as follows [27]:

$$R^2 = 1 - \frac{SSE}{SST} \qquad (4)$$

where SSE is the sum of the squared errors, and SST is the sum of the squared residuals from the mean activation vector $\bar{m}$, i.e., the total variation multiplied by the total number of samples $K = \sum_s k_s$.
The shoulder and elbow articulation moment reconstruction has been evaluated by computing both the root mean square error ($E_{RMS}$) and the multivariate $R^2$ index between the measured and estimated moments using Equation (4).
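Both metrics can be computed in a few lines. The sketch below assumes the usual form of the multivariate index, i.e., one minus the ratio between SSE and SST pooled over all channels, and uses small made-up numbers; the function names are hypothetical.

```python
import math


def multivariate_r2(measured, estimated):
    """measured, estimated: lists of samples, each sample a list of channels."""
    n_samples, n_channels = len(measured), len(measured[0])
    mean = [sum(row[j] for row in measured) / n_samples
            for j in range(n_channels)]
    # SSE: squared errors between measured and estimated values.
    sse = sum((m - e) ** 2
              for row_m, row_e in zip(measured, estimated)
              for m, e in zip(row_m, row_e))
    # SST: squared residuals from the per-channel mean of the measured data.
    sst = sum((m - mu) ** 2 for row_m in measured for m, mu in zip(row_m, mean))
    return 1.0 - sse / sst


def rms_error(measured, estimated):
    """Root mean square error for a single channel."""
    squared = [(m - e) ** 2 for m, e in zip(measured, estimated)]
    return math.sqrt(sum(squared) / len(squared))


measured = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
estimated = [[0.0, 1.0], [1.0, 0.5], [2.0, 2.0]]
r2 = multivariate_r2(measured, estimated)  # SSE = 0.25, SST = 4 -> 0.9375
e_rms = rms_error([row[1] for row in measured], [row[1] for row in estimated])
```

A perfect reconstruction gives an index equal to 1, while an estimate no better than the mean of the measured data gives a value close to 0.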

Statistics
In order to compare the proposed methods, the average values of the $R^2$ and $E_{RMS}$ among the 256 test sets have been computed for each subject. The two synergy extraction methods, i.e., AE-based and NNMF, have been compared with the Wilcoxon test. The four moment estimator models, i.e., $Hm$, $HWW^{+}m$, $\hat{H}c$ and the AE-based model, have been compared by running the Friedman test and Dunn's pairwise post-hoc tests with Bonferroni correction. The significance level has been set to 0.05. Non-parametric tests were adopted since the assumptions underlying parametric tests were found to be violated for all sets of data. All the analyses have been performed using the SPSS software [52] (Version 21).
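For readers unfamiliar with the Friedman test, the sketch below computes its chi-square statistic from a subjects-by-methods table of scores. It is a plain-Python illustration of the textbook formula without tie correction, not a reproduction of the SPSS analysis, and the toy table is made up.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def friedman_chi2(table):
    """table[s][m]: score of method m for subject s (no tie correction)."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for method, rank in enumerate(average_ranks(row)):
            rank_sums[method] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Toy table: 3 subjects, 3 methods, identical ordering for every subject.
toy = [[0.1, 0.2, 0.3], [0.2, 0.4, 0.5], [0.1, 0.3, 0.6]]
chi2 = friedman_chi2(toy)  # perfectly consistent ranks give n * (k - 1) = 6
```

The statistic is compared with a chi-square distribution with k − 1 degrees of freedom, i.e., three degrees of freedom when four methods are compared.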

Results
The proposed neural model has been evaluated both in terms of joint moment estimation and sEMG signal reconstruction. Figure 3 (top-left and bottom-left) and Figure 3 (top-right) report the mean value of the $E_{RMS}$ and the mean $R^2$ values among all test sets for each subject, respectively. Table 1 reports the $E_{RMS}$ and multivariate $R^2$ values averaged among all subjects for each compared methodology. The Friedman test revealed that there is a significant difference among the four investigated techniques in terms of $E_{RMS}$ relative to both shoulder moment prediction ($\chi^2$ = 18.733, p < 0.001) and elbow moment prediction ($\chi^2$ = 15.000, p = 0.002). Dunn's test with Bonferroni correction was then used to perform the post-hoc tests (see Table 2). The results of the post-hoc analysis showed that the shoulder moment $E_{RMS}$ observed with the AE-based model is significantly lower than the errors obtained by both the $HWW^{+}m$ model (Z = 2.556, p < 0.001) and the $\hat{H}c$ model (Z = 1.778, p = 0.021). No significant differences were found between the AE-based model and the $Hm$ model (Z = 1.222, p = 0.268). Similar results were found when analyzing the elbow moment $E_{RMS}$. In detail, the elbow moment $E_{RMS}$ observed with the AE-based model is significantly lower than the errors obtained by both the $HWW^{+}m$ model (Z = 2.111, p = 0.003) and the $\hat{H}c$ model (Z = 1.778, p = 0.021). No significant differences were found between the AE-based model and the $Hm$ model (Z = 0.778, p = 1.000). The Friedman test also revealed that there is a significant difference among the four investigated techniques in terms of the multivariate $R^2$ index between the measured and predicted joint moments ($\chi^2$ = 21.400, p < 0.001).
The post-hoc analysis reported a significant difference between three pairs of models (see Table 2): the $R^2$ index of the AE-based model is higher than the $R^2$ index of both the $HWW^{+}m$ model (Z = −2.667, p < 0.001) and the $\hat{H}c$ model (Z = −1.889, p = 0.011); the $Hm$ model outperforms the $HWW^{+}m$ model (Z = 1.667, p = 0.037); no significant difference was found between the AE-based model and the $Hm$ model (Z = −1.000, p = 0.602).
The difference between the autoencoder and the NNMF algorithm was also assessed in terms of sEMG signal reconstruction quality by comparing the multivariate $R^2$ index between the acquired and reconstructed EMG signals (see Figure 3 (bottom-right) and Table 3). The Wilcoxon test results showed that the NNMF achieved a significantly higher $R^2$ index value than the autoencoder (Z = −2.666, p = 0.008).

Discussion and Conclusions
In this study the authors presented a novel neural model that is able to estimate the movement intention, in terms of upper limb joint moments, by exploiting the muscle synergy concept. In detail, the architecture of the proposed model is composed of an undercomplete autoencoder that performs the muscle synergy extraction and a feed-forward layer employed to estimate articulation moments as a linear combination of the synergy activations. Such a topology is derived from the physiological model of the spatial muscle synergies [13]. The rationale behind the proposed model is the possibility of extracting muscle synergies by considering not only the performance on the EMG signal reconstruction, but also the estimation performance in the task space. Hence, the main goal of the proposed method is to find the muscle synergy patterns that optimize the performance in both the EMG and the task space.
Considering the success that synergy-based myo-controllers are achieving in the wearable robotic field, the authors have investigated whether the synergy activations extracted with the AE could be used in motion intention detection. In detail, the proposed AE-based estimator has been compared with three other methods already proposed in the literature in terms of their capability of estimating shoulder and elbow joint moments generated while performing planar isometric reaching tasks. The comparison has been conducted by analysing the $E_{RMS}$ and $R^2$ between the estimated and measured moments of both the shoulder and the elbow articulations. The clear messages that arise from the statistical analysis and Figure 3 are: (1) the proposed method outperforms the two synergy-based approaches $HWW^{+}m$ and $\hat{H}c$, and such a difference is statistically significant; (2) no statistical difference has been found between the proposed method and the $Hm$ model, which considers a direct mapping between the EMG signals and the joint moments. Such findings seem promising since the proposed method achieves performance comparable to that of the $Hm$ model even though it introduces some loss in the EMG signal information due to the AE bottleneck.
The autoencoder part of the model has then been directly compared, in terms of muscle synergy extraction, with the non-negative matrix factorization algorithm, i.e., the most used method in the literature. Since the experiment consisted of exerting planar isometric forces with the upper limb, the authors have chosen to consider a predefined number of synergies equal to four. The AE and the NNMF have been compared in terms of a multivariate $R^2$ index that measures the quality of the muscle activity reconstruction given the synergy structures (or patterns) and synergy activations. As reported in Figure 3, the proposed AE-based model has shown slightly lower performance than the NNMF (Wilcoxon test, Z = −2.666, p = 0.008). This means that the NNMF generates synergy activations that better reconstruct the original muscle activation signals. Admittedly, this finding is not surprising since, differently from the NNMF, the proposed neural model simultaneously focuses on the reconstruction of both the EMG signals and the joint moments. It is also worth noting that the AE and the NNMF have not been tested on the reconstruction of the same EMG signals used to calibrate the synergy model, but on different muscle activations acquired in the same condition, i.e., the same upper limb pose.
This work does not address the study of the relationship between the model accuracy and the number of considered muscles [11]. The authors have considered all the main superficial upper limb muscles that contribute to the shoulder and elbow moment generation [13]. Clearly, a reduction in the number of considered muscles would lead to a loss of model accuracy, and such loss would be related to the functional contribution of the specific excluded set of muscles. A further study could investigate the role of the considered muscles in moment estimation when using the proposed approach. However, it is important to remark that the authors just want to propose a general methodology. The specific setup, i.e., considered muscles, task-space variables and acquisition procedure, needs to be customized case by case.
In the authors' opinion, this work represents a first attempt to develop a muscle synergy-based myo-controller that is tailored to the specific subject by simultaneously considering the synergy extraction and the mapping between the synergy activations and the variables used in the task space, i.e., forces or moments. Concerning the specific experimental setup used in this study, the obtained results have clearly shown that the proposed model has led to a better moment estimation when compared with other synergy-based models. However, at the same time, the quality of the EMG signal reconstruction was slightly degraded. This finding demonstrates that a trade-off between the capability of the extracted muscle synergies to describe the EMG signal variability and the task performance in terms of force reconstruction might exist and can be exploited to develop more intuitive myo-controllers that are mainly evaluated in the task space [19,41].
The proposed strategy might open new perspectives for muscle synergy extraction techniques and, perhaps, encourage new studies related to the fundamentals of muscle synergies and human motor learning and control. In fact, as has been the case so far, the findings of such basic research might be useful for implementing more intuitive simultaneous and proportional myoelectric controls of prostheses [35,53] and robotic devices [12,33], and for the development of innovative diagnostic tools and rehabilitation approaches [25]. Even VE-based rehabilitation exercises based on synergy control might promote recovery of movement skills in stroke patients [13]. In conclusion, the study of task-oriented synergies and their comparison with standard muscle synergies, i.e., the ones extracted with standard approaches, could reveal interesting information about whether and how those patterns might be used to improve myo-controllers and rehabilitative therapies.

Conflicts of Interest:
The authors declare no conflict of interest.