Improving Tandem Fluency Through Utilization of Deep Learning to Predict Human Motion in Exoskeleton

Koo, Bon Ho; Siu, Ho Chit; Apostolides, Luke; Kim, Sangbae; Petersen, Lonnie G.

doi:10.3390/act14060260


by Bon Ho Koo ¹, Ho Chit Siu ², Luke Apostolides ¹, Sangbae Kim ¹ and Lonnie G. Petersen ³,⁴,*

¹ Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
² MIT Lincoln Laboratory, Lexington, MA 02421, USA
³ Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
⁴ Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
* Author to whom correspondence should be addressed.

Actuators 2025, 14(6), 260; https://doi.org/10.3390/act14060260

Submission received: 8 April 2025 / Revised: 13 May 2025 / Accepted: 20 May 2025 / Published: 23 May 2025

(This article belongs to the Special Issue Recent Advances in Soft Actuators, Robotics and Intelligence)


Abstract

Today’s exoskeletons face challenges with low fluency (a quantifiable alternative to “seamlessness”), hypothesized to be caused by a lag in active control innate in many leader–follower paradigms seen in contemporary systems, leading to inefficiencies and discomfort. Furthermore, tandem fluency, a variation of fluency specific to tandem robot systems such as exoskeletons, is yet to be rigorously tested in practice. This study uses metrics of tandem fluency to demonstrate improved human–robot interaction (HRI) in exoskeletons through human subject testing of a prototype 1 degree of freedom (DoF) exoskeleton controlled by a motion-prediction bidirectional long short-term memory (bi-LSTM) deep learning network. Subjects were recruited to perform various upper body exercises about the elbow joint, and the collected sEMG, goniometer, and gas exchange data were used to design, test, optimize, and assess the performance of the 1 DoF exoskeleton using tandem fluency metrics. We found that the correlation between I-ACT (a metric of tandem fluency), the subjective survey responses, and the metabolic data suggests that using a predictive bi-LSTM network to control a 1 DoF exoskeleton about the elbow results in an overall positive trend, which may correspond to high tandem fluency.

Keywords:

motion prediction; deep learning; surface electromyography; LSTM; fluency; tandem fluency

1. Introduction and Background

1.1. Tandem Systems and Tandem Fluency

Fluency is a metric that combines subjective and objective factors to describe the seamlessness of a human–robot interaction (HRI) system [1,2], offering a comprehensive framework for assessing human–robot collaboration [2]. Subjective metrics focus on human perceptions of the interaction, considering aspects such as trust, naturalness, and the robot’s perceived role in the task. These perceptions are typically gathered through questionnaires and rating scales, providing valuable insights into the user experience. In contrast, objective metrics quantitatively analyze interaction dynamics, incorporating measures such as concurrent activity (C-ACT), human and robot idle times (H-IDLE and R-IDLE), and functional delay (F-DEL). By integrating both subjective and objective evaluations, this framework enables a thorough assessment of fluency in most collaborative human–robot systems. This definition of fluency has been utilized in a number of studies across areas of HRI, for example in general human–machine collaboration in cooperative tasks [3,4,5], multi-agent expansion with human agents [6], gesture-analysis-based interfacing and haptics [7,8,9,10], and certain exoskeleton applications with limited success [11,12]. However, existing fluency metrics lack the granularity needed to characterize interactions within a tandem human–robot system, a specific HRI configuration where human and robotic agents, such as exoskeletons, synchronize their kinematic motion.

To address this gap, tandem fluency extends the traditional definition of fluency to better capture the unique dynamics of wearable and powered robots. Conventional fluency metrics, designed for general human–robot collaboration, do not fully account for the complexities of systems that physically augment or assist human motion. Tandem fluency introduces a revised framework that integrates both objective and subjective metrics tailored to these interactions.

The objective metrics in this expanded framework reflect the actuation characteristics and user behaviors specific to wearable and other tandem systems that are kinematically tied to the user, while the subjective metrics assess key human factors such as trust, comfort, perceived effort, and cognitive fit, all critical considerations for exoskeleton design and performance. By emphasizing both system performance and user experience, tandem fluency provides a more precise measure of seamlessness in human–robot collaboration for this configuration.

Notably, tandem fluency introduces measures of finer temporal detail, complementing the existing metrics, which are generally measured at a task-level time scale. One addition in particular is explored explicitly in this study:

Intended Concurrent Activity (I-ACT)

I-ACT represents the percentage of the total task duration during which the robotic agent is actively engaged (or actuated) in a manner that supports collaboration with the human agent. The measurement method varies depending on the system; for instance, in a hypothetical one-degree-of-freedom exoskeleton operating at the elbow, it can be assessed by the following:

$$I\text{-}ACT = \int_{t_{start}}^{t_{end}} 1 - \sqrt{\left[\left(EMG_{t_{AG}} - EMG_{t_{ANT}}\right) - \frac{\tau_{t}}{\tau_{max}}\right]^{2}} \tag{1}$$

where $EMG_{t_{AG}}$ and $EMG_{t_{ANT}}$ represent the normalized voluntary muscle activation ratios of the agonist and antagonist muscles (e.g., biceps and triceps in this case) at time t, while $\tau_{t}$ and $\tau_{max}$ denote the exoskeleton’s actuator torque at time t and its stall torque, respectively. These values are considered over the entire task duration, from $t_{start}$ to $t_{end}$. This metric is designed to yield a high value when applied torque aligns with muscular torque and to be penalized when there is muscular torque without corresponding applied torque, or vice versa.
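As a minimal illustration of Equation (1), the sketch below approximates the integral with a sum over uniformly sampled signals. It assumes that the sEMG activations are already normalized to [0, 1], that samples are spaced 1 ms apart, and that the result is divided by the task duration so that perfect agreement yields 1. The function name `i_act` and its arguments are hypothetical and are not taken from the study's implementation.

```python
from math import sqrt


def i_act(emg_agonist, emg_antagonist, torque, torque_max, dt=0.001):
    """Discrete approximation of Equation (1), divided by task duration.

    emg_agonist, emg_antagonist: normalized activations in [0, 1], one per sample.
    torque: actuator torque per sample; torque_max: stall torque (same units).
    dt: sample period in seconds (assumption: uniform 1 kHz sampling).
    """
    if not (len(emg_agonist) == len(emg_antagonist) == len(torque)):
        raise ValueError("all signals must have the same length")
    if len(torque) == 0:
        raise ValueError("signals must be non-empty")

    total = 0.0
    for ag, ant, tau in zip(emg_agonist, emg_antagonist, torque):
        # sqrt(x^2) is the absolute mismatch between net activation and torque.
        mismatch = sqrt(((ag - ant) - tau / torque_max) ** 2)
        total += (1.0 - mismatch) * dt

    # Assumption: normalize by duration so the score is a fraction of the task.
    duration = len(torque) * dt
    return total / duration
```

Under these assumptions, torque that exactly tracks the net muscle activation scores 1, while full activation with no applied torque scores 0.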

1.2. Exoskeleton Challenges

Exoskeletons, specifically the powered variants designed for augmentation as opposed to rehabilitation/exercise, are often ineffective due to factors related to low fluency: perceived seamlessness, metabolic costs, user confidence, and safety among many confounding factors. While there are arguably other shortcomings of actuation (motor design, bandwidth, backdriving compatibility, and more) and interfacing (mounting point imprecision, repeatability in interfacing, mounting stiffness, and more) in many designs, the issue of low fluency [2] stands as one of the most fundamental and universal among augmenting exoskeletons.

Low fluency is widely regarded as a key factor contributing to inefficiency, subjective discomfort, and increased injury risk in modern human–robot tandem systems. The control paradigm used in these systems is thought to play a major role in generating low-fluency interactions. A common leader–follower control model for exoskeletons [13] requires actuators to respond with a delay to the operator’s movements, a feature often hypothesized to underlie low fluency. Therefore, improving fluency may be possible by removing this leader–follower delay. One potential solution is to predict the human’s movements, allowing for truly simultaneous actuation in principle.

1.3. Motion Prediction

A number of studies have shown that various classification paradigms can perform well in human motion prediction applications. Prior studies have used classifiers [14,15], kinematics [16], motion capture [17], and video [18] for classification-based prediction. Furthermore, high-resolution regression prediction using deep learning algorithms has been used to predict motion even prior to its initiation [15]. These studies demonstrate that it is possible, to some degree, to predict motion before a human physically moves.

These classification prediction methods, while demonstrating the possibility of deriving information about a motion prior to initiation, do not have sufficient resolution for more direct HRI in the form of tandem robots. For that purpose, a regression method may be preferable; a regression prediction algorithm, if sufficiently accurate, should yield information that can be used to control a tandem robot to either (1) precisely track and move synchronously with a user as a wearable robot or (2) achieve operational tasks with higher perceived reaction speed.

LSTM-Based Motion Prediction

Long short-term memory (LSTM) neural networks are particularly promising for this approach. An LSTM network is a type of recurrent neural network (RNN) designed to model temporal dependencies in sequential data. Unlike standard RNNs, LSTMs incorporate specialized gating mechanisms—namely, the input, forget, and output gates—that regulate the flow of information, mitigating issues of vanishing and exploding gradients [9]. This architecture enables LSTMs to effectively capture both short- and long-term dependencies, making them well-suited for tasks such as time-series forecasting, motion prediction, and sequential decision-making in human–robot interaction.

This architecture has been used in relevant studies in the past. When using accelerometry and surface electromyography (sEMG) inputs from wearable sensors, LSTMs and other RNNs that took the sequential nature of the data into account outperformed typical feedforward architectures from the literature in estimating instantaneous left and right ankle torques [19]. Furthermore, variations such as the bidirectional LSTM network may yield even better performance in the context of motion prediction, as they allow for temporal independence, in which a regression prediction could be trained on partial trajectories as well.

2. Methods

A study of tandem fluency requires a complete closed-loop system, as tandem fluency by definition measures the interaction between a human and a robot by comparing objective and subjective metrics. The following aspects are of note for experiments on fluency effects in the context of a 1 degree of freedom (DoF) exoskeleton:

human subject experimentation and data collection
LSTM prediction network
1 DoF prototype exoskeleton

We discuss methods for each aspect of this study.

2.1. Human Subject Experimentation and Data Collection

We recruited five subjects (4 male, 1 female; ages 22 to 30; able-bodied; no current or past upper body musculoskeletal condition) and conducted human subject testing in accordance with institutional review board approved protocols, detailed in the corresponding “Institutional Review Board Statement” section. Recruitment was conducted without consideration of fitness level, and no preparatory instructions were given before data collection.

Surface electromyography (sEMG) probes were placed on each of the following muscles for a total of four channels: Biceps Brachii, Triceps Brachii, Brachialis, and Brachioradialis. EMG data were collected using the Delsys (Natick, MA, USA) Trigno system.

Gas exchange data were collected using a Metalyzer 3B metabolic gas exchange sensor package from CORTEX Biophysik (Leipzig, Germany), capable of measuring inhaled/exhaled $CO_{2}$ and $O_{2}$ composition, rate, speed, volume, and more. The Hans Rudolph 7450 Series V2 Mask by Hans Rudolph (Shawnee, KS, USA) interfacing with the subjects was cleaned and disinfected between subjects.

Finally, goniometer data was collected via two methods. First, the servo data from the exoskeleton motor was collected during human subject trials. In addition to the servo information, a separate Delsys goniometer sensor was used during experimentation trials.

Subjects were instructed to perform three sets of 10 repetitions of upper-body exercises: unweighted elbow flexions (bicep curls), weighted elbow flexions, bodyweight dips, bodyweight push-ups, unweighted elbow extensions (tricep extensions), and weighted elbow extensions. There were no explicit rest instructions; however, subjects were told to rest as needed to avoid any noticeable muscle fatigue. For the weighted exercises, subjects were instructed to select a weight they could comfortably lift for a total of 30 repetitions while ensuring sufficient exertion throughout the movement. The details of the exercise routine can be seen in Figure A1. Three trials were conducted per subject: one control with no exoskeleton (but with sEMG sensors and goniometer) that also served as the training data collection session, one with the exoskeleton equipped but not activated (with all sensors), and one with the exoskeleton activated and providing external torque (also with all sensors). All sEMG and goniometer channels were sampled at 1 kHz, and the spiroergometry system sampled on a per-breath basis. Subjects did not remove the mask between conditions. The exercise protocol is visualized in Figure 1.

Subjective Metric Questionnaire

In order to gather subjective metrics regarding the use of the exoskeleton, a brief questionnaire was administered between conditions. The questions were a tandem-robot-relevant subset of questions frequently presented in similar human–robot interaction-based fluency studies [2]. Subjects were instructed to respond to the questions on a scale of 1 (completely disagree) to 5 (completely agree). Question categories can be seen in Table 1.

2.2. LSTM Prediction Network

The LSTM prediction network component of this study consists of the design of the network, the training, and sEMG based sensor data collection and processing.

2.2.1. Raw Data Format

The raw data were collected (refer to Section 2.1) and processed into rolling windows between 50 and 300 samples long, corresponding to 50 to 300 ms at the 1 kHz sample rate. Windows are of size 4 (sEMG channels) $\times\ n$, where $50 \leq n \leq 300$. Each trial was processed into approximately 50,000 to 100,000 rolling windows, depending on the window size.
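The windowing step can be sketched as follows. This is an illustrative example only: it assumes the recording is held as four equal-length channel lists, and the function name `rolling_windows` and the `stride` parameter are hypothetical (the text does not state the step between consecutive windows).

```python
def rolling_windows(emg, window_size, stride=1):
    """Yield 4 x window_size windows from a 4 x T sEMG recording.

    emg: list of 4 channel lists, each with T samples (1 kHz in the study).
    window_size: number of samples n per window (50 <= n <= 300 in the text).
    stride: step between window starts (assumption: 1 sample).
    """
    n_samples = len(emg[0])
    if any(len(channel) != n_samples for channel in emg):
        raise ValueError("all channels must have the same length")

    # The last valid window starts at n_samples - window_size.
    for start in range(0, n_samples - window_size + 1, stride):
        yield [channel[start:start + window_size] for channel in emg]
```

With a stride of one sample, a recording of T samples yields T − n + 1 windows, so the number of windows per trial shrinks slightly as the window size grows.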

2.2.2. Network Design

The network designed for sEMG data-based motion prediction is a bidirectional multi-layer LSTM network. The exact structure of the network was variable, as multiple networks were constructed, trained, and compared for the best performance in simulation prior to human subject testing. This included optimizing the input data rolling window size. Figure 2 shows the general structure of the bidirectional LSTM networks evaluated and used in this study.

The cost function was a simple regression function, computing the mean-squared-error loss:

$$loss = \sum_{i = 1}^{R} {(t_{i} - y_{i})}^{2} \tag{2}$$

where R is the number of responses, $t_{i}$ is the target output, and $y_{i}$ is the network’s prediction for response i.
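For concreteness, Equation (2) can be evaluated as in the short sketch below. As written, the equation sums the squared errors over the R responses; dividing by R gives the mean. The `mean` flag is an illustrative addition, and the function name is hypothetical rather than the loss routine used in the study.

```python
def squared_error_loss(targets, predictions, mean=False):
    """Sum of squared errors as in Equation (2); optionally averaged over R."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not targets:
        raise ValueError("at least one response is required")

    total = sum((t - y) ** 2 for t, y in zip(targets, predictions))
    # Assumption: 'mean' divides by the number of responses R.
    return total / len(targets) if mean else total
```

The two forms differ only by the constant factor R, so they share the same minimizer when R is fixed.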

2.2.3. Ablation Analysis

We conducted an ablation analysis using a k-fold cross-validation framework, in which the dataset was split into $k = 10$ subsets, each serving once as a test set while the remainder were used for training. Each fold comprised data from a single experimental trial to ensure chronologically clear and physiologically sound separation between training and testing phases. Because the dataset consisted of continuous time-series data, this seeks to avoid overlap between training and test sets that could arise from random sampling, thereby minimizing data leakage and providing an accurate assessment of network performance.
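A trial-wise split of this kind can be sketched as below. The example assumes each rolling window carries a label identifying the trial it came from; the function name `trial_folds` and the labels are hypothetical.

```python
def trial_folds(trial_ids):
    """Yield (held_out_trial, train_indices, test_indices), one fold per trial.

    trial_ids: one trial label per window, in recording order.
    Every window from the held-out trial goes to the test set, so overlapping
    windows from the same recording never straddle the train/test boundary.
    """
    for held_out in sorted(set(trial_ids)):
        test_idx = [i for i, t in enumerate(trial_ids) if t == held_out]
        train_idx = [i for i, t in enumerate(trial_ids) if t != held_out]
        yield held_out, train_idx, test_idx
```

With ten distinct trial labels this produces the ten folds described above; a random split over windows would instead place near-identical overlapping windows on both sides of the split.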

2.3. Exoskeleton Hardware

The 1 DoF exoskeleton hardware is a novel design specifically built for the experiment discussed in this study. The actuator itself was custom designed, using a high torque density electric motor coupled to a low-ratio transmission to achieve high torque density, backdrivability, and high bandwidth force control [20]. Originally used for the MIT Cheetah robot, the actuator has qualities and specifications well suited to exoskeleton use, among many biomimetic applications. Table 2 contains specifications for the actuator used in the exoskeleton hardware.

The human interface component is a modified off-the-shelf articulating arm brace from Komzer (Las Vegas, NV, USA). By removing the joint component and replacing it with the actuator, the exoskeleton is able to deliver torque to the user’s arm. A custom interface between the actuator and the brace was designed and tested. This custom interface piece was 3D printed (Bambu Lab X1C, Austin, TX, USA) using PETG. The resulting hardware can be seen in isolation in Figure 3, and on a subject’s arm in Figure 4.

There are a number of safety features present in the design of the exoskeleton prototype.

Over-current protection: an inline fuse prevents delivery of excessive amounts of current, which may occur in a stuck motor.
Hardware range of motion enforcement: the custom interface bracket mechanically limits range of motion, and will disengage the motor if torque is applied at the range limit in an undesirable direction.
Voluntary safety mechanisms: both experimenter and subject have ready access to current cutting mechanisms in the form of emergency stops and dead-man switches.

2.4. Computing Hardware

Inference presented in this study was conducted using the same local hardware/software combination. Hardware included an Intel i9-10900 CPU, an NVIDIA RTX 2080 Super 8 GB GPU, and 16 GB of RAM. The software environment was MATLAB 2022b running on the Windows 11 Pro operating system. Training of the network was carried out on a high-performance computing unit at MIT SuperCloud, Lincoln Laboratory Supercomputing Center (Lexington, MA, USA) [21].

3. Results

3.1. Prediction Performance and Optimizations

The predictive performance of the LSTM based motion prediction network depended on the particular setup. Each network was evaluated for its mean error between prediction and target data over the entire trajectory, as well as the standard deviation over the entire sample set. The optimization parameters were:

Number of bi-LSTM layers (between 1 and 3).
Size of each bi-LSTM layer (between 10 and 500 hidden units).
The temporal distance between the rolling window and the target angle of arm (between 50 and 300 samples).
The size of the rolling window (between 50 and 300 samples).
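The relationship between the last two parameters can be made concrete with a small sketch that pairs each sEMG window with a joint angle some number of samples in the future. The exact indexing is an assumption: here the temporal distance is counted from the last sample of the window to the target sample. The function name `window_target_pairs` is hypothetical.

```python
def window_target_pairs(emg, angle, window_size, intersample_distance):
    """Pair each sEMG window with the joint angle a fixed distance ahead.

    emg: list of channel lists, each with T samples.
    angle: list of T joint angle samples aligned with emg.
    intersample_distance: samples between the window's last sample and the
        target (assumed convention; 50 to 300 samples in the text).
    """
    n_samples = len(angle)
    if any(len(channel) != n_samples for channel in emg):
        raise ValueError("emg channels and angle must have the same length")

    pairs = []
    last_start = n_samples - window_size - intersample_distance
    for start in range(0, last_start + 1):
        end = start + window_size  # exclusive end of the window
        window = [channel[start:end] for channel in emg]
        target = angle[end - 1 + intersample_distance]
        pairs.append((window, target))
    return pairs
```

Under this convention, a 100-sample window with a distance of 50 samples asks the network to estimate the elbow angle 50 ms after the most recent sEMG sample at 1 kHz.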

Overall, sweeping the parameter range as defined in this study yielded relatively little variability in network performance aside from a couple of trends. Figure 5 shows the performance of the networks generated in the optimization process. Of the three major parameters (bi-LSTM layer number, layer size, and the distance between the rolling window and the target angle of the arm, referred to as “intersample distance,” which denotes how far into the future the prediction target lies), the layer number showed little correlation with performance, while layer size and intersample distance yielded small trends. Specifically, and as expected, increasing the training and prediction window size generally reduced the error in both magnitude and deviation, while increasing the intersample distance increased both error magnitude and deviation. However, these errors were small across the board, ranging between 1% and 5% of the available trajectory angle range.

3.2. Tandem Fluency Metric Results

Tandem fluency must be evaluated through the correlation of both subjective and objective components of HRI.

3.2.1. Objective Metrics and I-ACT Results

The objective metrics of the subject–exoskeleton interaction were calculated on the basis of the tandem fluency metrics presented in Section 1.1. The I-ACT was calculated separately per trial session.

The theoretical I-ACT of a tandem system with perfect synchronization, where the normalized agonist–antagonist weighted sEMG difference in activation is perfectly and instantaneously reflected as the normalized torque application, is $I\text{-}ACT = 1$, implying that the torque applied by the motor was at all times directly scaled to the torque generated by the muscular activation of the agonist and antagonist of the target joint. The tandem system investigated in this study generated an $I\text{-}ACT = 0.807 \pm 0.079$ across the subjects.

More specifically, Figure 6 plots the two terms of I-ACT across a small section of a trial, showing that they trend together in the trials of this study.

3.2.2. Metabolic Work Results

Across all trials, $VO_{2}$ and $VCO_{2}$, as well as kcal consumed derived from the $VO_{2}$ and $VCO_{2}$ data, indicate that the use of a motion prediction driven exoskeleton is energetically advantageous.

More specifically, there is a trend indicating the use of the 1-DoF exoskeleton hardware results in reduced energetic cost. Figure 7 shows that, over time, the control condition (where subjects do not have the exoskeleton on) expends more energy relative to the condition in which the exoskeleton is used.
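The text does not state how kcal were derived from the gas exchange data. One widely used conversion is the abbreviated Weir equation, sketched below purely as an assumption about what such a derivation can look like; the function names and the per-breath tuple layout are hypothetical.

```python
def weir_kcal_per_min(vo2_l_min, vco2_l_min):
    """Abbreviated Weir equation: energy expenditure rate in kcal/min.

    vo2_l_min, vco2_l_min: oxygen uptake and carbon dioxide output in L/min.
    Assumption: this is one common conversion, not necessarily the study's.
    """
    return 3.941 * vo2_l_min + 1.106 * vco2_l_min


def total_kcal(breaths):
    """Accumulate energy over per-breath samples.

    breaths: iterable of (duration_s, vo2_l_min, vco2_l_min) tuples,
    one per breath (hypothetical layout for per-breath spiroergometry data).
    """
    return sum(
        weir_kcal_per_min(vo2, vco2) * duration_s / 60.0
        for duration_s, vo2, vco2 in breaths
    )
```

Summing per-breath contributions in this way gives a cumulative energy curve per condition, which is the kind of quantity that can be compared over time between conditions.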

3.2.3. Subjective Survey Results

The questionnaire results show that subjects viewed the use of the prediction-driven exoskeleton positively. The survey results were compared pairwise; see Table 3 for Cohen’s d for all three pairwise comparisons. Furthermore, a Wilcoxon signed-rank test was conducted to evaluate the null hypothesis of no significant difference. A visualization of the results of the subjective surveys can be seen in Figure 8.

Also visible in the data is the dip in positive associations for the pre-exo condition, in which the user is wearing the exoskeleton but it is not powered. This is expected, as the unpowered exoskeleton adds mass, inertia, and interference to the arm’s natural motion.
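As an illustration of the effect size computation, the sketch below implements Cohen's d with a pooled standard deviation. Which variant of Cohen's d the study used is not stated, so the pooled form is an assumption, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d between two samples using the pooled standard deviation.

    Assumption: pooled-SD variant; a paired-samples variant would instead
    divide the mean difference by the SD of the pairwise differences.
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    return (mean(a) - mean(b)) / sqrt(pooled_var)
```

By the usual convention, values of d around 0.8 or above are described as large effects, which is the sense in which effect sizes can be informative even when a small sample does not reach significance.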

3.2.4. Subjective and Objective Metrics Correlation

The exploration of both subjective and objective metrics allows for the correlation of the two, potentially validating the metrics used as indicative of a positive or negative relationship. Kendall’s rank correlation was used to observe any correlation between the subjective survey results and the tandem fluency metrics recorded. When comparing the responses before and after exoskeleton use against I-ACT across all subjects, a Kendall’s tau of $\tau = 0.72$ ($p < 0.1$) was found. As it is greater than zero, this indicates that the tandem fluency metric I-ACT is correlated with an improved subjective experience. Combining this with the metabolic observations, there is a positive trend between positive subjective experience, favorable metabolics, and a high measured I-ACT.
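For readers unfamiliar with the statistic, the sketch below computes Kendall's tau from pairwise concordance. It implements the tau-b variant, which corrects for ties; the choice of variant is an assumption, since the text does not specify one, and the function name is hypothetical.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both terms
            elif dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1

    # Pairs untied in x, times pairs untied in y.
    untied_x = concordant + discordant + tied_y_only
    untied_y = concordant + discordant + tied_x_only
    if untied_x == 0 or untied_y == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / sqrt(untied_x * untied_y)
```

A value of +1 means every pair of observations is ordered the same way in both variables, −1 means every pair is reversed, and values near 0 indicate no monotonic association.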

4. Discussion

4.1. Prediction Parameters

In the results, we show that (1) performance deteriorated when increasing intersample distance (and thus the window of prediction), as expected, but (2) the magnitude of performance deterioration in both parameter sweep ranges was small. It is possible that a larger drop-off in performance (and thus larger errors) happens in more extreme parameter ranges than previously anticipated. In particular, based on the results, it is reasonable to conclude that the greater the intersample distance (or how far into the future a particular window is predicting), the greater the error. Alternatively, this behavior could be the result of over-fitting. This is unlikely, as the real-time predictions used new data unseen in the training process and the training set was constructed with as much variability as environmental effects allowed, but some variables carried over from training to inference, such as the consistency of the individual subjects participating in the training/inference sessions.

4.2. Exoskeleton Performance

Furthermore, we show that the use of a prediction-based control method for exoskeletons has beneficial effects in the context of the exercises explored in this study. In particular, the correlation of objective and subjective metrics from Section 3.2.4 shows that the use of an exoskeleton driven by prediction-based controls is advantageous. The subjective responses in particular also suggest that the use of an exoskeleton in its experimental configuration is comparable to the control condition.

4.3. Tandem Fluency

This study does not feature a comparison between a tandem system controlled by prediction mechanisms and one that is not. As such, it is difficult to state the relative effects of prediction-driven exoskeletons on tandem fluency compared to a different paradigm. However, we are able to correlate the tandem fluency metrics calculated in this study to the metabolic and subjective metrics, as demonstrated in Section 3.2.4, in order to establish a baseline for future studies either using similar methods or studying tandem fluency in general. In particular, the results show that changes in the objective metrics are likely correlated with changes in the biometric and subjective metrics. Further, based on the subjects’ biometric data and subjective trends, it is reasonable to conclude that the objective metric figures yielded in this study represent those of a system with high tandem fluency. Particularly promising are the results presented in Table 3: although $p > 0.05$ (raising the possibility of a type-II error and likely indicating an underpowered study), the large effect sizes suggest a trend towards the conditions being meaningfully different, most notably control vs. exo and pre-exo vs. exo.

4.3.1. Validity of Tandem Fluency Metrics

In this work, we reference the concept of tandem fluency, an expanded form of conventional fluency designed to capture the interactive qualities of tandem robotic systems. It is therefore critical to first establish the validity of additional metrics before drawing conclusions based on them. The results provide evidence that I-ACT, as an example of a tandem fluency metric, is a meaningful and justifiable measurement in the context of human–tandem robot interactions. Notably, higher I-ACT scores are correlated with improved subjective reception, reinforcing the idea that I-ACT reflects qualities valued by users. This relationship highlights I-ACT as a relevant and valuable metric to track when assessing and improving fluency in tandem robotic systems.

4.3.2. Future Works on Tandem Fluency

The formulation and presentation of tandem fluency in this study is notably incomplete as a comprehensive and universal metric; while the system observed in this experiment does present evidence of a high tandem fluency system, this claim is contingent upon further studies on similar systems, or any other such HRI systems in general. In order to claim a relative benefit, the following efforts should follow:

The proliferation of the utilization of tandem fluency across multiple HRI paradigms, which requires the observation of fluency metrics (such as I-ACT presented in this work) in light of many different systems ranging from off-body autonomous systems to other wearable systems such as exoskeletons and prosthetics.
The standardization of tandem fluency terms, which can be achieved through statistical observations of many examples of identical or similar paradigms observing the specific tandem fluency metrics given controlled parameters.
The expansion of tandem fluency to include metrics relevant to both conventional HRI as well as tandem system HRI.

4.4. Summary and Relevance

This study is one of the first to (1) attempt to utilize tandem fluency metrics in order to study tandem robot performance and (2) utilize a bi-LSTM based motion prediction algorithm to control an exoskeleton, a notable example of a tandem robot. There are promising trends indicating that such a predictive algorithm can in fact improve user experience for tandem robots, especially exoskeletons as demonstrated in this study. We show that the use of an exoskeleton with predictive controls is metabolically advantageous compared to control conditions, and is subjectively comparable to executing load-bearing tasks in control conditions. This is a positive advancement in the use of deep learning neural networks as predictive controllers, and the approach may be extended to adjacent use cases such as trajectory planning and decision-making support in the application of tandem robots.

4.5. Limitations

There are a number of shortcomings that should be outlined in the methods and results of this study.

First and most importantly, the recruited subject pool does not represent the greater population in age and physical characteristics. We have derived our conclusions based on healthy and relatively young subjects, but in order to claim universality of the tandem fluency effects as well as fundamentals of exoskeleton operation, future studies must include a larger range of subjects in age, athletic ability, physical impairments, and more. As an extension of the subject pool limitation, our subjective metric results suggest an underpowered study. This was due to the subject pool size being determined by a power test conducted on the quantitative data requirements. Thus, here we stop at presenting a trend towards positive effects of the motion-prediction-driven exoskeleton on the subjective metrics. In future studies, more subjects should be recruited to increase the power of the qualitative data as well.

Second, the use of the particular metrics of tandem fluency discussed in this study has no precedent. While the reasoning for this particular quantification is grounded in both prior work and solid hypothesis, alongside the fundamentals of fluency, which itself is a frequently used metric in human–robot interactions, the implementation of tandem fluency is not seen in the previous literature. As of this study, the objective metric of tandem fluency does not carry weight field-wide as a recognizable quality of exoskeleton performance, nor does this study claim to present an HRI method that is universally improved based on these objective metrics. To do so, further studies must explore different systems of tandem robots, employing different control methods, and correlate their tandem fluency metrics to subjective and bioenergetic metrics of their own, a scope beyond the study presented in this work. As a result, this study does not make broad claims about the relative qualities of the prediction-driven control methods of this work compared with conventional means of exoskeleton sensing and control.

Furthermore, there are other experimental, as well as network design, parameters that may has a large effect on exoskeleton performance that is not optimized or dissected in this study. Notably, this study did not optimize motor control parameters that may has an effect on the close loop interactions of the tandem robot system. Future studies should look to design experiments that hold these, in addition to all of the other parameters noted in this work, as independent variables to study their effects.

There is also the consideration of the small n presented in the work of this study. The number of subjects were selected to meet the statistical power requirement for the objective metrics given the number of data-points each trial for each subject would provide. However, this same n did not meet the power requirement for the metabolic and subjective metrics. Therefore, this study presents trends in those metrics, not making strong statistical claims about either. Future studies should incorporate a subject number size that provides statistical clarity in both these approaches.

Finally, the exoskeleton prototype used as the testing framework did not emphasize interface efficiency (from interface stiffness, compliance, and other tissue mechanics) or comfort as a design consideration. As a result, it is possible that some of the observed results are due to the particular design of the exoskeleton, which gave little consideration to factors such as interface energy loss, material selection, and interface comfort. In particular, these factors can potentially affect the subjective component of the measured tandem fluency metrics. We recommend that future studies of the tandem fluency effects of various exoskeleton systems use hardware designed with these factors in mind.

5. Conclusions

In this study, we investigated the effects on tandem fluency (a measure of seamlessness between a human and a tandem robot that is kinematically coupled to the user) of a motion prediction control method, driven by a deep learning network, for a 1 DoF exoskeleton, and presented the results of human subject testing. We constructed an exoskeleton prototype; designed, optimized, and validated a bidirectional LSTM network that takes sEMG data and generates an angular trajectory prediction; and conducted human subject testing to observe any effects that an exoskeleton controlled by such a trajectory prediction may have on tandem fluency.

Based on the findings of this study, we observe the following:

First, using the 1 DoF exoskeleton about the elbow joint results in a generally favorable subjective experience when executing load-bearing tasks, especially relative to the unpowered exoskeleton condition. Second, the metabolic data of the user when using the exoskeleton trend positively; it appears that the exoskeleton provides for energetically beneficial task execution. Third, the tandem fluency metric explored, I-ACT, in conjunction with the subjective and energetic metrics of tandem fluency, may be an experimentally valid metric to observe when characterizing tandem robot systems such as the exoskeleton explored in this study. Finally, based on the collected objective, subjective, and metabolic data, we conclude that a 1 DoF exoskeleton about the elbow controlled by a motion-prediction bi-LSTM network is an example of a high tandem fluency system.

We propose that these results support a number of conclusions:

While conventional means of exoskeleton control demonstrably fall short of practical use in terms of fluency, a control method incorporating motion prediction, specifically through the use of deep learning networks, may yield better results. A more robust evaluation of tandem robot systems, including exoskeletons, can be carried out by measuring system performance with an expanded definition like tandem fluency. As demonstrated, by establishing a metric more closely relating the combination of subjective, energetic, and objective factors unique to tandem robot systems, it should be possible to better inform tandem robot design going forward.

For future studies, we recommend the further validation of the concept of tandem fluency through comparison studies featuring alternative means of control, as well as alternative hardware platforms, to expand the prior-knowledge base of tandem fluency as a universal metric much like conventional fluency is today. Furthermore, it should be beneficial to correlate different system performance behaviors with higher or lower tandem fluency, which in turn should be critical in practical tandem robot design and use. Finally, we encourage the recruitment of a larger range of subjects in sex, age, physical capability, and physical conditions.

Author Contributions

Conceptualization, B.H.K. and L.G.P.; methodology, B.H.K.; software, B.H.K. and L.A.; validation, B.H.K. and L.A.; formal analysis, B.H.K. and L.A.; investigation, B.H.K., L.A., L.G.P., H.C.S. and S.K.; resources, L.G.P. and S.K.; data curation, B.H.K. and L.A.; writing—original draft preparation, B.H.K.; writing—review and editing, L.G.P., S.K. and H.C.S.; visualization, B.H.K., H.C.S. and L.G.P.; supervision, L.G.P. and S.K.; project administration, L.G.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Committee on the Use of Humans as Experimental Subjects (COUHES) of the Massachusetts Institute of Technology (2209000745, approved 28 October 2022).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The datasets presented in this article are not readily available because the data are part of an ongoing study. Requests to access the datasets should be directed to Bon H. Koo (bkoo1104@mit.edu) and Lonnie G. Petersen (lgpeters@mit.edu).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

HRI	Human–Robot Interfacing/interaction
(s)EMG	(Surface) Electromyography
DoF	Degree of Freedom
(bi-)LSTM	(Bidirectional) Long Short-term Memory
I-ACT	Intended Concurrent Activity
RNN	Recurrent Neural Network

Appendix A. Motion Diagrams

Figure A1. Series of diagrams representing the exercises prescribed to subjects. The exact magnitude depended on the range of motion of the subject. Exercises with weighted variants were identical to unweighted variants in posture with an addition of a preferred dumbbell as a mass. Subfigure (a) is a visual representation of a subject performing the elbow flexion motion from a lateral view. Subfigure (b) is a visual representation of a subject performing the elbow extension motion from a lateral view. Subfigure (c) is a visual representation of a subject performing the bodyweight dip motion from a lateral view. Subfigure (d) is a visual representation of a subject performing the pushup motion from a lateral view.


Figure 1. A visualized flowchart of the human subject experiment protocol. Each exercise is a condition with factors of exo condition (control, pre-exo, and exo) and exercise type (pushups, dips, arm flexion, and arm extension), the orders of which are randomized.

Figure 2. Visual representation of the LSTM network structure. Note that the exact number of LSTM layers and the number of nodes in each layer are optimization parameters that depend on the performance of each version of the network. These are evaluated in Section 3.1.

Figure 3. A photo of the resulting hardware prototype as viewed from the frontal point of view supported by a stand; visible is the custom interface piece (in grey) joining the actuator and brace components.

Figure 4. The arm exoskeleton, complete with the modified arm brace interfaced with the custom motor, on a subject. This configuration is to visualize the tandem robot system; in practice, the hardware is mostly obscured due to the sEMG device and accompanying wrap to keep the sensors adhered and in place.

Figure 5. Relevant results of the parameter optimization: (a) Comparison of different window sizes used for training and prediction; the error and its deviation appear to decrease with larger window sizes. (b) Optimization of the inter-sample distance, also noted as the time window of prediction; a larger inter-sample distance correlates with greater prediction error and deviation.
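To make the two parameters in Figure 5 concrete, the sketch below shows one way that a window size and an inter-sample distance could be used to build training pairs from synchronized sEMG and joint-angle recordings. The function name `make_windows`, the single-channel data layout, and the interpretation of the inter-sample distance as a number of samples between the end of the input window and the predicted angle are assumptions for illustration, not the authors' data pipeline.

```python
def make_windows(emg, angle, window_size, horizon):
    """Pair each sEMG window with the joint angle `horizon` samples later.

    emg, angle  : equal-length sequences sampled at the same rate (assumed)
    window_size : number of past sEMG samples fed to the network
    horizon     : inter-sample distance, i.e. how far ahead to predict
    """
    if len(emg) != len(angle):
        raise ValueError("emg and angle must have the same length")
    if window_size < 1 or horizon < 1:
        raise ValueError("window_size and horizon must be positive")
    pairs = []
    # `end` is the exclusive end index of the input window.
    for end in range(window_size, len(emg) - horizon + 1):
        inputs = list(emg[end - window_size:end])
        target = angle[end - 1 + horizon]  # angle `horizon` samples after the window
        pairs.append((inputs, target))
    return pairs


if __name__ == "__main__":
    # Hypothetical six-sample recording.
    emg = [0, 1, 2, 3, 4, 5]
    angle = [0, 10, 20, 30, 40, 50]
    for inputs, target in make_windows(emg, angle, window_size=3, horizon=2):
        print(inputs, "->", target)
```

In this framing, a larger window gives the network more context per prediction, while a larger horizon asks it to predict further into the future, which matches the trends described in the caption.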

Figure 6. A visualization of the terms in I-ACT, specifically the sEMG activation difference between agonist/antagonist and the applied torque (normalized). Note that the two measures trend together, which theoretically and in practice yields an I-ACT value closer to 1.

Figure 7. A plot illustrating the upper and lower bound for energy expenditure of a subject during the trial exercises. The energy, derived from $\mathrm{VO}_2$ and $\mathrm{VCO}_2$, is normalized to peak energy expenditure per subject across both exoskeleton conditions, and the times are normalized to percent time relative to total time of exercise. The areas represent the region between max and min trajectories per condition across all valid subjects.
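The normalization described in the Figure 7 caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' analysis code: the function names, the linear resampling onto a percent-time grid, and the condition labels are assumptions, and the energy traces are taken as already computed from the gas exchange data.

```python
def resample_percent_time(values, n_points=101):
    """Linearly resample a trace onto 0..100 % of its own duration."""
    if len(values) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    last = len(values) - 1
    resampled = []
    for i in range(n_points):
        position = i * last / (n_points - 1)  # fractional index into values
        low = int(position)
        high = min(low + 1, last)
        fraction = position - low
        resampled.append(values[low] * (1 - fraction) + values[high] * fraction)
    return resampled


def normalize_subject(traces_by_condition):
    """Scale all traces of one subject by that subject's peak across conditions."""
    peak = max(max(trace) for trace in traces_by_condition.values())
    return {
        condition: [value / peak for value in trace]
        for condition, trace in traces_by_condition.items()
    }


def envelope(traces):
    """Pointwise minimum and maximum across equal-length traces."""
    lower = [min(column) for column in zip(*traces)]
    upper = [max(column) for column in zip(*traces)]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical energy traces for one subject in two conditions.
    subject = {"pre-exo": [1.0, 2.0, 4.0], "exo": [1.0, 2.0]}
    normalized = normalize_subject(subject)
    print(resample_percent_time(normalized["exo"], n_points=5))
```

The shaded areas in the figure would then correspond to `envelope` applied, per condition, to the resampled and normalized traces of all valid subjects.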

Figure 8. A plot showing the mean and standard deviation of the responses from subjects at three points during their trials: control (with no device), pre-exo (with the exoskeleton worn but unpowered), and exo (where the exoskeleton is acting as designed with prediction-driven controls). Reverse scales are denoted with (R).

Table 1. Subjective fluency metric scale used in the study. All ratings were out of 5, where the higher the number the more the subject is in agreement with the question statement. (R) indicates reverse scale.

Metric	Question Statement
Trust in Robot/System	“The system was trustworthy.”
Effort (R)	“I felt the task was strenuous”
Task Effectiveness	“I was effective at the task”
Comfort	“I was physically comfortable performing the task”

Table 2. Specification for the actuator used in the exoskeleton prototype.

Parameter	Value
Mass	440 g
Dimensions	96 mm O.D., 40 mm shaft length
Max Torque	17 N·m
Continuous Torque	6.9 N·m
Max Speed	40 rad/s
Max Power	+250/−680 W
Current Bandwidth	4.5 kHz @ 4.5 N·m, 1.5 kHz @ 17 N·m
Output Inertia	0.0023 kg·m²

Table 3. Cohen’s d effect sizes for all pairwise condition comparisons across four metrics. Also listed is the p value for each pairwise condition.

Condition Comparison	Trust	Effort	Task Effectiveness	Comfort
Control vs. Pre-exo ($p = 0.08$)	4.08	−0.81	0.82	2.04
Control vs. Exo ($p = 0.09$)	3.27	1.73	−2.04	0.82
Pre-exo vs. Exo ($p = 0.31$)	−0.58	3.27	−4.04	−1.73
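For readers unfamiliar with the effect size reported in Table 3, the sketch below computes Cohen's d using the pooled standard deviation of two groups. The source does not state which variant of Cohen's d or which sign convention was used, so the pooled-variance form and the "first condition minus second condition" convention are assumptions; the function name `cohens_d` and the example ratings are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev


def cohens_d(first, second):
    """Cohen's d with pooled standard deviation (first minus second)."""
    n_first, n_second = len(first), len(second)
    if n_first < 2 or n_second < 2:
        raise ValueError("each group needs at least two observations")
    pooled_variance = (
        (n_first - 1) * stdev(first) ** 2 + (n_second - 1) * stdev(second) ** 2
    ) / (n_first + n_second - 2)
    if pooled_variance == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(first) - mean(second)) / math.sqrt(pooled_variance)


if __name__ == "__main__":
    # Hypothetical 5-point survey ratings for two conditions.
    condition_a = [4, 5, 5]
    condition_b = [2, 3, 3]
    print(round(cohens_d(condition_a, condition_b), 2))
```

With very few subjects and low within-group variance, d can become large in magnitude even for modest differences in mean rating, which is worth keeping in mind when reading the values in the table.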

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

