Article

FusionTCN-Attention: A Causality-Preserving Temporal Model for Unilateral IMU-Based Gait Prediction and Cooperative Exoskeleton Control

1 Department of Mechatronics Engineering, College of Mechanical Engineering, Guangxi University, Nanning 530004, China
2 College of Physical Education, Guangxi University, Nanning 530004, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work (co-first authors).
These authors contributed equally as corresponding authors.
Biomimetics 2026, 11(1), 26; https://doi.org/10.3390/biomimetics11010026
Submission received: 21 November 2025 / Revised: 20 December 2025 / Accepted: 28 December 2025 / Published: 2 January 2026
(This article belongs to the Section Bioinspired Sensorics, Information Processing and Control)

Abstract

Human gait exhibits stable contralateral coupling, making healthy-side motion a viable predictor for affected-limb kinematics. Leveraging this property, this study develops FusionTCN–Attention, a causality-preserving temporal model designed to forecast contralateral hip and knee trajectories from unilateral IMU measurements. The model integrates dilated temporal convolutions with a lightweight attention mechanism to enhance feature representation while maintaining strict real-time causality. Evaluated on twenty-one subjects, the method achieves hip and knee RMSEs of 5.71° and 7.43°, correlation coefficients over 0.9, and a deterministic phase lag of 14.56 ms, consistently outperforming conventional sequence models including Seq2Seq and causal Transformers. These results demonstrate that unilateral IMU sensing supports low-latency, stable prediction, thereby establishing a control-oriented methodological basis for unilateral prediction as a necessary engineering prerequisite for future hemiparetic exoskeleton applications.

1. Introduction

Stroke and hemiparesis remain among the most prevalent neurological disorders leading to chronic gait impairment and long-term functional asymmetry. According to the World Stroke Organization, more than 12 million individuals suffer a stroke annually, and over half experience persistent gait disturbances that limit daily activities and reduce quality of life [1,2,3,4]. Clinical guidelines emphasize the importance of gait recovery in post-stroke rehabilitation [2], and classic biomechanical analyses describe the characteristic asymmetry and compensatory patterns of hemiparetic gait [3]. Because coordinated motion of both lower limbs is essential for stable walking [4], restoring gait symmetry has long been a key objective in post-stroke rehabilitation.
Wearable robotic exoskeletons have become an important tool in this field, providing intensive, repeatable, and task-oriented training without heavy dependence on therapist supervision [5,6,7,8,9]. For instance, Esquenazi et al. [10] demonstrated the clinical safety and efficacy of the ReWalk powered exoskeleton, reporting that the system successfully restored ambulatory function in individuals with severe motor impairments. Dollar and Herr [5] reviewed early lower-extremity assistive devices and summarized core challenges in actuation, control, and human−robot interaction, while follow-up reviews by Louie and Eng [6], Young and Ferris [7], and Warutkar et al. [8] examined the clinical translation of powered exoskeletons and emphasized constraints related to usability and patient selection. Over time, the field has shifted from replaying unimpaired trajectories [5] to more adaptive strategies that align robotic assistance with the user’s ongoing movement intent, often through phase-variable or oscillator-based control formulations [11,12,13]. Achieving this alignment depends fundamentally on the system’s ability to estimate or predict gait timing and joint evolution in real time.
However, for neurologically impaired individuals, sensors placed directly on the paretic limb often provide degraded signals due to spasticity, restricted range of motion, orthotic interventions, or inconsistent foot strike patterns [4,8]. These practical constraints motivate unilateral sensing strategies that rely exclusively on the healthy limb. This concept draws inspiration from the natural contralateral coupling in human locomotion [14], suggesting that the healthy limb carries predictive structures that can be leveraged to forecast the trajectory of the affected side. Recently, Huo et al. [15] conducted a randomized controlled trial validating this clinical direction, showing that unilateral lower-limb exoskeleton assistance significantly improved balance and gait recovery in subacute stroke patients.
Despite these promising clinical findings, translating this strategy into robust control faces quantified hurdles. Regarding physiological asymmetry, standard predictive models suffer performance degradation when gait deviations increase. For instance, De Miguel-Fernández et al. [16] demonstrated that the correlation between IMU-derived kinematic metrics and exoskeleton control parameters can drop from strong levels ( r > 0.9 ) in mild impairment to poor reliability (near 0.0 ) in severe cases, highlighting the inability of simple regression baselines to capture pathological non-linearities. Furthermore, regarding computational feasibility, state-of-the-art sequence models like the Transformer are computationally intensive for embedded exoskeleton controllers. The base Transformer architecture requires over 65 million parameters [17], introducing inference latencies incompatible with the strict latency constraints required for real-time assistance. Consequently, existing approaches often struggle to balance the modeling of long-term dependencies with the dual requirements of causality preservation and high parameter efficiency.
In this study, we focus on the methodological feasibility of unilateral IMU-based contralateral gait prediction under controlled conditions. The proposed FusionTCN−Attention model is evaluated using gait data collected from healthy adult subjects. This choice serves as a necessary engineering prerequisite to rigorously examine causality preservation and prediction stability, leveraging the inherent gait symmetry of healthy individuals [18] to establish a performance benchmark before clinical deployment. While the ultimate application lies in post-stroke rehabilitation, clinical validation on hemiparetic populations involves distinct pathological challenges and is beyond the scope of the present work.
The main contributions of this study are summarized as follows:
  • Biomimetic Architecture: This study introduces FusionTCN–Attention, a causality-preserving model that integrates a dilated TCN backbone with a lightweight temporal-attention module. This design mimics the neural coupling of short-term reflex and long-term rhythm to predict continuous contralateral trajectories using only unilateral IMU inputs.
  • Feasibility Validation: The algorithmic feasibility of using six kinematic channels (angle, velocity, acceleration) from the healthy limb to forecast contralateral motion is validated, establishing a necessary engineering benchmark for future hemiparetic applications.
  • Comprehensive Evaluation: Experiments demonstrate that the proposed model outperforms conventional sequence models, achieving hip/knee RMSE of 5.71°/7.43° and an average phase lag of 14.56 ms. Hardware-in-the-loop tests further confirm its timing stability for real-time exoskeleton control.

2. Related Work

2.1. Gait Event Detection and Continuous Prediction

Most existing exoskeleton controllers implement finite-state or event-triggered structures in which the gait cycle is divided into discrete phases such as heel strike, stance, swing, and toe-off [4,9]. These approaches are attractive for their robustness and conceptual simplicity: transitions are triggered using foot-switch signals, force thresholds, or characteristic angular patterns, often detected from kinematic or kinetic features [19,20,21]. Zeni et al. [19] compared simple kinematic rules for detecting heel-strike and toe-off and showed that the chosen rule can shift event timing substantially, while Ghoussayni et al. [20] assessed automated force-based detection and highlighted its sensitivity to noise and threshold selection. Prasanth et al. [21] surveyed wearable-sensor algorithms and concluded that IMU-based detection is feasible across diverse speeds in healthy gait. Nevertheless, phase-variable controllers often degrade when cadence or terrain changes continuously [11,12]. Baek et al. [11] reported reduced performance when gait deviated from nominal patterns, and Best et al. [12] showed that reliable phase estimation is critical for prosthetic knee−ankle control on varying inclines.
Recent deep-learning approaches have improved event detection—Ashraf et al. applied a temporal convolutional network (TCN) to IMU data [22], and Asogwa et al. predicted minimum-clearance events using neural networks [23]—but these studies also note performance gaps under irregular gait. Such limitations have motivated continuous gait modeling approaches that infer joint trajectories or continuous phase directly from sensor streams [24,25,26]. By replacing discrete transitions with continuous predictions, these approaches support smoother reference generation and better responsiveness to natural variations in gait [4,21].

2.2. Human–Robot Interaction and Sensing Modalities

In the broader context of human–robot interaction, mapping user intention to robotic control is critical. Su et al. [27] developed a bidirectional teleoperation control system, demonstrating that synchronization between a master and slave system can be achieved through precise kinematic constraints. This offers a theoretical parallel to our approach, where the healthy limb acts as the “master” driving the “slave” exoskeleton. Similarly, Zhou et al. [28] highlighted the importance of multi-modal sensor fusion in decoding human intention with high granularity.
However, deploying such multi-modal systems in daily rehabilitation faces significant practical trade-offs. Vision-based gait recognition systems are often constrained by environmental factors, such as sensitivity to lighting conditions, viewing angles, and potential occlusions, which limit their robustness in unconstrained settings [29]. Conversely, while surface electromyography (sEMG) offers direct neural interfacing, it introduces substantial calibration burdens and signal instability due to electrode shifts, skin impedance variations (e.g., sweating), and muscle fatigue during long-term use [30]. In contrast, IMU-based sensing provides a self-contained, lighting-independent solution with minimal setup requirements [25]. Therefore, we prioritize a unilateral IMU configuration to maximize system portability and deployment simplicity, ensuring the exoskeleton remains suitable for compact, personal use outside strict laboratory settings.
Given these advantages, Inertial Measurement Units (IMUs) have become a preferred low-cost, portable solution [25,31]. Early IMU studies focused primarily on event detection [19,20,21]. Later work extended IMU usage to reconstruct joint kinematics and spatiotemporal parameters [24,25,26]. Favre et al. [26] estimated three-dimensional knee angles in ambulatory settings and showed reasonable agreement with laboratory motion capture. Seel et al. [25] demonstrated robust lower-limb joint estimation without external references, and Li et al. [24] used two shank-mounted IMUs to reconstruct spatiotemporal parameters from a minimal sensor set. Hur et al. [32] trained deep neural networks on open-source IMU datasets and reported that lower-limb joint angles can be recovered across diverse walking conditions. Unilateral sensing strategies, which infer affected-side motion from the healthy side, remain less explored but have shown initial promise in metabolic cost reduction [33] and feasibility monitoring [16].

2.3. Temporal Modeling Architectures

To model the predictive structure of gait, recent work has employed deep temporal architectures. Recurrent Neural Networks (RNNs) and their gated variants such as LSTMs and Seq2Seq models have been widely applied to human motion forecasting [34]. For example, Kim and Hargrove [35] proposed an LSTM-based model to predict continuous gait phase for prosthetic legs, achieving high accuracy on benchmark datasets. However, RNNs often suffer from accumulated phase drift over multi-step predictions. Bai et al. [36] showed that temporal convolutional networks (TCNs) can achieve comparable or improved accuracy with more stable training behavior across a range of sequence modeling tasks.
Deep models have also been applied directly to IMU-based gait estimation [22,23,32]. Vaswani et al. [17] introduced the Transformer architecture based on self-attention, which excels at capturing long-range dependencies. However, its standard implementation is ill-suited for wearable robotics due to the quadratic computational complexity of global attention and the lack of strict causal constraints, which complicate real-time inference on resource-limited hardware. As a result, TCNs—which combine dilated causal convolutions with fixed latency—have become increasingly prominent [36,37]. Lea et al. [37] demonstrated the capability of TCNs for structured human activity segmentation, and Ashraf et al. [22] applied a TCN to gait-event detection with robustness to sensor noise. Still, most TCN-based studies focus on classification tasks or assume bilateral sensing, leaving a gap in unilateral trajectory-level gait prediction suitable for exoskeleton control.

3. Materials and Methods

3.1. Problem Definition

This study targets cooperative exoskeleton assistance by predicting the motion of the affected limb from the kinematic patterns of the healthy limb. Human walking exhibits bilateral coupling with phase-locked dynamics; consequently, the task is cast as a temporal sequence modeling problem in which contralateral dynamics are inferred from the observed motion history of the healthy side under causality constraints and a bounded, deterministic latency budget.
Let the healthy-side kinematic stream be sampled at a constant rate; in practice, raw signals were acquired at 200 Hz and uniformly resampled to f s = 100 Hz for model training and deployment. Around any time index t, a history window of length N is defined as
$$\mathbf{X}_{t-N+1:t} = \big[ \mathbf{x}_{t-N+1}, \mathbf{x}_{t-N+2}, \dots, \mathbf{x}_{t} \big] \in \mathbb{R}^{N \times 6},$$
where each frame collects six kinematic features,
$$\mathbf{x}_{\tau} = \big[ \theta_{H}^{\text{hip}}(\tau),\ \dot{\theta}_{H}^{\text{hip}}(\tau),\ \ddot{\theta}_{H}^{\text{hip}}(\tau),\ \theta_{H}^{\text{knee}}(\tau),\ \dot{\theta}_{H}^{\text{knee}}(\tau),\ \ddot{\theta}_{H}^{\text{knee}}(\tau) \big],$$
where θ, θ̇, and θ̈ denote joint angle, angular velocity, and angular acceleration, and the subscript H indicates the healthy limb. The prediction target is the short-horizon trajectory of the contralateral (affected) hip and knee,
$$\mathbf{Y}_{t+1:t+M} = \big[ \mathbf{y}_{t+1}, \mathbf{y}_{t+2}, \dots, \mathbf{y}_{t+M} \big] \in \mathbb{R}^{M \times 2}, \qquad \mathbf{y}_{\tau} = \big[ \theta_{A}^{\text{hip}}(\tau),\ \theta_{A}^{\text{knee}}(\tau) \big],$$
with subscript A indicating the affected limb. The task is a causal sequence-to-sequence mapping,
$$\mathcal{F} : \mathbf{X}_{t-N+1:t} \mapsto \hat{\mathbf{Y}}_{t+1:t+M}.$$
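The windowing above can be illustrated in a few lines; the array names and the synthetic stream length are our own illustrative choices, not taken from the authors' code.

```python
import numpy as np

# History length N and preview horizon M as defined above.
N, M = 250, 40

def make_sample(healthy: np.ndarray, affected: np.ndarray, t: int):
    """Build one (input, target) pair at decision time index t.

    healthy:  (T, 6) hip/knee angle, velocity, acceleration of the healthy limb
    affected: (T, 2) hip/knee angles of the contralateral limb
    """
    X = healthy[t - N + 1 : t + 1]     # causal history window, shape (N, 6)
    Y = affected[t + 1 : t + M + 1]    # future targets, shape (M, 2)
    return X, Y

# Shapes for a synthetic stream of T = 1000 frames:
X, Y = make_sample(np.zeros((1000, 6)), np.zeros((1000, 2)), t=500)
print(X.shape, Y.shape)  # (250, 6) (40, 2)
```

Note that the input slice ends at index t: the pair uses no sample later than the decision time, which is exactly the causality constraint formalized next.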

3.1.1. Causality and Deterministic Latency

For cooperative deployment with wearable exoskeletons, the prediction mapping F is required to be strictly causal, such that future motion estimates depend only on information available up to the decision time and do not access any future samples.
$$\hat{\mathbf{Y}}_{t+1:t+M} = \mathcal{F}\big( \mathbf{X}_{t-N+1:t} \big), \qquad \hat{\mathbf{y}}_{\tau} = \big[ \hat{\mathbf{Y}}_{t+1:t+M} \big]_{\tau}, \quad \tau \in \{ t+1, \dots, t+M \}.$$
This strict causality constraint enables seamless operation on streaming IMU data and supports a fixed, measurable inference delay compatible with controller synchronization.

3.1.2. Control-Oriented Objectives

To maintain biomechanical consistency and closed-loop stability, the predicted trajectories are encouraged to preserve (i) global phase consistency over the gait cycle and (ii) accurate alignment near heel-strike (HS) and toe-off (TO). Let ϕ(·) denote a phase-mapping operator and ∇_t the first-order temporal difference operator. A practical objective is
$$\min_{\mathcal{F}} \; \underbrace{\big\| \hat{\mathbf{Y}}_{t+1:t+M} - \mathbf{Y}_{t+1:t+M} \big\|_{2}^{2}}_{\text{trajectory accuracy}} + \lambda\, \underbrace{\big\| \nabla_{t} \hat{\mathbf{Y}}_{t+1:t+M} - \nabla_{t} \mathbf{Y}_{t+1:t+M} \big\|_{2}^{2}}_{\text{temporal smoothness}} + \beta\, \underbrace{\big\| \phi\big( \hat{\mathbf{Y}}_{t+1:t+M} \big) - \phi\big( \mathbf{Y}_{t+1:t+M} \big) \big\|_{2}^{2}}_{\text{phase consistency}},$$
where λ and β weight smoothness and phase preservation. This formulation serves as a conceptual objective to guide model design, motivating the causal architecture and auxiliary loss terms described in Section 3.3.

3.2. Data Acquisition and Preprocessing

3.2.1. Acquisition Protocol

Unilateral lower-limb kinematics were collected using three wearable IMUs (Figure 1). The thigh and shank units (highlighted in the dashed box) serve as the exclusive prediction inputs. The pelvis unit acts as the trunk reference for ground-truth angle calculation; in the complete exoskeleton system, this reference sensor is placed inside the Backpack Control Module (as shown later in Figure 2) to ensure a self-contained setup. All sensors were physically sampled at 200 Hz under a unified timestamp. Each recording began with a brief upright calibration (T-pose) and included level-ground walking at two paces. All procedures followed institutional guidelines with written informed consent.

3.2.2. Preprocessing Pipeline

The onboard IMU firmware applies proprietary anti-aliasing and orientation filtering, so the 200 Hz streams received by the industrial PC are already temporally smoothed at the sensor level. To align the data with the controller update rate and the prediction horizon design, all joint-angle streams were then uniformly resampled from 200 Hz to 100 Hz on the host side using linear interpolation. Angular velocity and acceleration were obtained through numerical differentiation (first- and second-order gradients) of the resampled angles. Finally, each kinematic channel was standardized using statistics computed from the training split and applied unchanged to the validation and test sets. No subject-specific tuning or additional offline filtering was introduced.
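Under the stated pipeline (linear 200 Hz to 100 Hz resampling, first- and second-order numerical gradients, train-split standardization), a minimal NumPy sketch might look as follows; the function name, the epsilon guard on the standard deviation, and the return convention are our own assumptions.

```python
import numpy as np

def preprocess(angles_200hz: np.ndarray, train_mean=None, train_std=None):
    """Resample one joint-angle stream from 200 Hz to 100 Hz by linear
    interpolation, differentiate twice, and standardize. Statistics are
    fitted on the training split and reused unchanged on other splits."""
    t_src = np.arange(len(angles_200hz)) / 200.0
    t_dst = np.arange(0.0, t_src[-1], 1.0 / 100.0)
    ang = np.interp(t_dst, t_src, angles_200hz)   # 100 Hz angle stream
    vel = np.gradient(ang, 1.0 / 100.0)           # first-order gradient
    acc = np.gradient(vel, 1.0 / 100.0)           # second-order gradient
    feats = np.stack([ang, vel, acc], axis=1)     # (T_100, 3)
    if train_mean is None:                        # fit on training split only
        train_mean, train_std = feats.mean(0), feats.std(0) + 1e-8
    return (feats - train_mean) / train_std, train_mean, train_std
```

At inference time the stored `train_mean`/`train_std` are passed back in, so validation and test data are never used to fit normalization statistics.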

3.2.3. Feature Construction and Windowing

After downsampling to 100 Hz, a rolling buffer of length N formed the input window X t N + 1 : t R N × 6 by stacking healthy-side hip and knee angle, angular velocity, and angular acceleration:
$$\mathbf{X}_{t-N+1:t} = \big[ \mathbf{x}_{t-N+1}, \dots, \mathbf{x}_{t} \big] \in \mathbb{R}^{N \times 6},$$
$$\mathbf{x}_{\tau} = \big[ \theta_{H}^{\text{hip}}(\tau),\ \dot{\theta}_{H}^{\text{hip}}(\tau),\ \ddot{\theta}_{H}^{\text{hip}}(\tau),\ \theta_{H}^{\text{knee}}(\tau),\ \dot{\theta}_{H}^{\text{knee}}(\tau),\ \ddot{\theta}_{H}^{\text{knee}}(\tau) \big], \qquad \tau \in \{ t-N+1, \dots, t \}.$$
The prediction target Y t + 1 : t + M R M × 2 comprised contralateral hip and knee angles. Training samples were generated offline by sliding the window with stride s = 1 , whereas streaming inference updated the same buffer causally in real time. Data were partitioned by subject into training/validation/test sets (e.g., 70/15/15) to assess cross-subject generalization, and all preprocessing parameters (N, M, s, differentiator window/order, and normalization statistics) were fixed a priori and shared across splits.
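The stride-1 offline sample generation described above can be sketched as follows; the function name and the loop-based construction are illustrative, not the authors' implementation.

```python
import numpy as np

def make_dataset(healthy: np.ndarray, affected: np.ndarray,
                 N: int = 250, M: int = 40, s: int = 1):
    """Generate (X, Y) training samples by sliding the history window
    with stride s over synchronized healthy/affected streams. Streaming
    inference instead updates a single causally maintained buffer."""
    X, Y = [], []
    for t in range(N - 1, len(healthy) - M, s):
        X.append(healthy[t - N + 1 : t + 1])    # (N, 6) history window
        Y.append(affected[t + 1 : t + M + 1])   # (M, 2) future targets
    return np.stack(X), np.stack(Y)

# A stream of T = 400 frames yields T - M - (N - 1) = 111 samples:
X, Y = make_dataset(np.zeros((400, 6)), np.zeros((400, 2)))
print(X.shape, Y.shape)  # (111, 250, 6) (111, 40, 2)
```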

3.3. Proposed Model: FusionTCN–Attention

Human gait exhibits rhythmically coupled yet asymmetric dynamics between contralateral limbs. For cooperative assistance, the predictor must be strictly causal, phase-consistent, and numerically stable at short preview horizons. The proposed FusionTCN–Attention addresses these requirements by combining a dilated causal encoder with a lightweight temporal attention fusion and joint-specific lightweight linear heads. Unlike recurrent predictors (LSTM/Seq2Seq) that accumulate phase drift via recursive states, and fully self-attentive models whose quadratic complexity undercuts deployment, this design preserves deterministic timing while adaptively focusing on phase-critical segments of motion.
  • Causal dilated encoder.
Let the healthy-side temporal window be Z⁽⁰⁾ = X_{t−N+1:t} ∈ ℝ^{N×6}, with six kinematic channels per frame. The encoder stacks L residual causal blocks with exponentially increasing dilations d_l = 2^{l−1} and kernel size k. Each block expands the temporal coverage without leaking future information:
$$\mathbf{Z}^{(l)} = \mathrm{BN}\Big( \mathrm{GELU}\big( \mathrm{Conv}_{\text{causal}}\big( \mathbf{Z}^{(l-1)}; k, d_{l} \big) \big) \Big) + \mathbf{Z}^{(l-1)}, \qquad l = 1, \dots, L.$$
During training, dropout is applied within each convolutional block to improve generalization and is disabled during inference. With this design, the encoder has an effective receptive field
$$R = 1 + (k - 1)\big( 2^{L} - 1 \big).$$
With the configuration used in this study (L = 5, k = 3), this expression yields a receptive field of 63 frames, corresponding to approximately 0.63 s at a sampling rate of 100 Hz.
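The receptive-field arithmetic is easy to verify directly; this small helper simply evaluates the closed-form expression above.

```python
def receptive_field(L: int, k: int) -> int:
    """Effective receptive field of L residual causal blocks with kernel
    size k and dilations d_l = 2**(l - 1): R = 1 + (k - 1) * (2**L - 1)."""
    return 1 + (k - 1) * (2 ** L - 1)

print(receptive_field(5, 3))  # 63 frames, i.e. ~0.63 s at 100 Hz
```

A single block (L = 1, k = 3) covers 3 frames; each added block roughly doubles the coverage, which is why 5 blocks already span more than half the 2.5 s input window.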
  • Multi-head temporal attention design and phase-capturing behavior.
In the actual implementation, the temporal attention module adopts four independent causal heads. Let H ∈ ℝ^{N×D} denote the encoder output with D channels. For the i-th head, query, key, and value projections are computed as
$$\mathbf{Q}_{i} = \mathbf{H}\mathbf{W}_{Q,i}, \qquad \mathbf{K}_{i} = \mathbf{H}\mathbf{W}_{K,i}, \qquad \mathbf{V}_{i} = \mathbf{H}\mathbf{W}_{V,i},$$
with per-head dimensionality d_h = D/4. A causal mask M_causal is added to prevent access to future timestamps, yielding
$$\mathbf{A}_{i} = \mathrm{Softmax}\left( \frac{\mathbf{Q}_{i}\mathbf{K}_{i}^{\top}}{\sqrt{d_{h}}} + \mathbf{M}_{\text{causal}} \right) \mathbf{V}_{i},$$
where (M_causal)_{uv} = 0 if v ≤ u and −∞ otherwise. The four heads are then concatenated and linearly projected:
$$\mathbf{A} = \mathrm{Concat}\big( \mathbf{A}_{1}, \dots, \mathbf{A}_{4} \big) \mathbf{W}_{o}.$$
A channel-wise reweighting block (SE layer) is applied to the aggregated attention features, modulating the relative importance of each channel:
$$\mathbf{s} = \sigma\big( \mathbf{W}_{2}\, \mathrm{ReLU}\big( \mathbf{W}_{1}\, \mathrm{AvgPool}(\mathbf{A}) \big) \big), \qquad \tilde{\mathbf{A}} = \mathbf{s} \odot \mathbf{A},$$
where s ∈ ℝ^{D} encodes channel importance and ⊙ denotes elementwise multiplication.
In practice, different heads tend to emphasize distinct temporal patterns: some respond strongly near heel-strike and toe-off (large derivatives), while others capture mid-stance regularities and longer-range periodicity. This multi-head, causally masked design provides phase-aware feature aggregation under real-time constraints and supports stable contralateral trajectory prediction.
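A single masked head from the equations above can be sketched in NumPy as follows; the weights and sizes are illustrative, and the full module (four heads of dimension D/4, output projection, SE gating) is omitted for brevity.

```python
import numpy as np

def causal_attention(H: np.ndarray, W_q: np.ndarray,
                     W_k: np.ndarray, W_v: np.ndarray) -> np.ndarray:
    """One causally masked attention head over encoder features H (N, D_in):
    scaled dot-product scores plus a mask with 0 on/below the diagonal
    and -inf strictly above it."""
    N = H.shape[0]
    Q, K, V = H @ W_q, H @ W_k, H @ W_v
    d_h = Q.shape[1]
    scores = Q @ K.T / np.sqrt(d_h)
    scores += np.triu(np.full((N, N), -np.inf), k=1)  # (M_causal)_{uv}
    # Numerically stable row-wise softmax:
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ V  # each output frame attends only to itself and the past
```

Because the first row of the mask zeroes out every future position, the output at frame 0 is exactly V at frame 0, a quick sanity check that no future information leaks through.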
  • Dual-head decoding with biomechanical differentiation.
Only the most recent M embeddings are required for the preview:
$$\tilde{\mathbf{H}} = \mathrm{Slice}\big( \mathbf{H};\ N - M + 1 : N \big) \in \mathbb{R}^{M \times D}.$$
The resulting feature matrix H̃ is decoded frame-wise by joint-specific lightweight linear heads,
$$\hat{\mathbf{Y}}_{t+\tau} = \begin{bmatrix} \mathbf{W}_{\text{hip}} \\ \mathbf{W}_{\text{knee}} \end{bmatrix} \tilde{\mathbf{H}}_{\tau} + \begin{bmatrix} \mathbf{b}_{\text{hip}} \\ \mathbf{b}_{\text{knee}} \end{bmatrix}, \qquad \tau = 1, \dots, M,$$
where the hip branch is deliberately smoother for torque stability, and the knee branch is assigned a higher auxiliary loss weight to accommodate larger excursions and faster dynamics—consistent with proximal–distal asymmetry.
  • Control-oriented composite objective.
To ensure kinematic fidelity and prevent hazard-inducing artifacts (e.g., toe-stubbing risks), we employ a composite loss function that explicitly balances trajectory tracking, smoothness, and morphological constraints:
$$\mathcal{L}_{\text{total}} = \lambda_{\text{mse}} \mathcal{L}_{\text{MSE}} + \lambda_{\text{vel}} \mathcal{L}_{\text{vel}} + \lambda_{\text{peak}} \mathcal{L}_{\text{peak}} + \lambda_{\text{aux}} \mathcal{L}_{\text{aux}},$$
where L_MSE is the standard mean squared error governing the global trajectory. To suppress high-frequency jitter, which can destabilize exoskeleton controllers, we impose a first-order velocity-consistency penalty:
$$\nabla_{t} \hat{\mathbf{Y}}_{t} = \hat{\mathbf{Y}}_{t} - \hat{\mathbf{Y}}_{t-1}, \qquad \mathcal{L}_{\text{vel}} = \frac{1}{M} \sum_{t=1}^{M} \big\| \nabla_{t} \hat{\mathbf{Y}}_{t} - \nabla_{t} \mathbf{Y}_{t} \big\|_{2}^{2}.$$
Crucially, standard MSE tends to undershoot extremum points (peaks and valleys) due to averaging effects. For gait rehabilitation, accurate peak knee flexion is vital for toe clearance. We therefore introduce a dedicated peak loss term:
$$\mathcal{L}_{\text{peak}} = \frac{1}{|P|} \sum_{t \in P} \big\| \hat{\mathbf{Y}}_{t} - \mathbf{Y}_{t} \big\|_{2}^{2},$$
where P denotes the set of indices corresponding to the local extrema (top-k amplitude frames) within the prediction horizon, and L_aux increases the relative contribution of the knee-joint error. The weighting coefficients λ were determined empirically through sensitivity analysis (see Table 1) to establish an optimal trade-off between tracking precision and morphological preservation.
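A minimal sketch of this composite objective, assuming the top-k amplitude frames of the ground truth define the extrema set P; the auxiliary knee weighting is omitted for brevity, top_k = 4 is an illustrative choice, the other weights follow Table 1, and NumPy stands in for the training framework.

```python
import numpy as np

def composite_loss(Y_hat: np.ndarray, Y: np.ndarray,
                   lam_mse: float = 1.0, lam_vel: float = 0.10,
                   lam_peak: float = 0.05, top_k: int = 4) -> float:
    """Pointwise MSE + first-order velocity consistency + top-k peak term
    for predicted/reference horizons of shape (M, 2)."""
    mse = np.mean((Y_hat - Y) ** 2)
    # First-order differences penalize velocity mismatch (jitter):
    vel = np.mean((np.diff(Y_hat, axis=0) - np.diff(Y, axis=0)) ** 2)
    # P: indices of the top-k amplitude frames of the ground-truth horizon
    P = np.argsort(np.abs(Y).max(axis=1))[-top_k:]
    peak = np.mean((Y_hat[P] - Y[P]) ** 2)
    return float(lam_mse * mse + lam_vel * vel + lam_peak * peak)
```

A constant offset between prediction and target leaves the velocity term at zero but is still penalized by the MSE and peak terms, which is the intended division of labor among the components.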

3.4. Training and Implementation Details

To ensure reproducibility and facilitate fair comparison, all models were trained and evaluated under a unified configuration. Table 1 summarizes the key hyperparameters and implementation settings used throughout this study, including temporal windowing, network depth, optimization strategy, and hardware environment.
Table 1. Reproducibility configuration of the FusionTCN–Attention implementation.
Sampling rate: f_s = 100 Hz (downsampled from 200 Hz)
Input/horizon/stride: N = 250, M = 40, s = 1
Encoder: L = 5, k = 3, dilations [1, 2, 4, 8, 16], width 96
Attention fusion: H = 4 heads, embedding dim D = 128 (head dim 32), multi-head attention + SE gating
Loss weights: λ_mse = 1.0, λ_peak = 0.05, λ_vel = 0.10, λ_aux = 0.05
Optimizer: AdamW (weight decay 10⁻⁴, lr = 10⁻³)
Schedule/stopping: fixed learning rate, early stopping (patience 40 epochs)
Batch size & epochs: batch 192, maximum 300 epochs
Training hardware/software: PyTorch 2.5, CUDA 12.1, NVIDIA RTX 4070 Ti (training only)
These settings were fixed across all experiments to ensure reproducibility during offline training and evaluation. The hardware configuration listed above does not represent the real-time deployment platform of the exoskeleton system.

3.5. System Integration and Closed-Loop Real-Time Validation

Real-time validation of the proposed FusionTCN–Attention model was conducted on a unilateral lower-limb exoskeleton platform to assess system integration feasibility, closed-loop execution characteristics, and timing compatibility under physical hardware constraints. This section focuses on engineering validation rather than clinical performance.
The experimental platform is shown in Figure 2. The backpack control module (visible in Figure 2b) houses the central IMU used as the trunk reference, ensuring a self-contained sensing setup consistent with the protocol in Section 3.2.1. A unilateral lower-limb exoskeleton with two actuated joints (hip and knee) was used as a representative cooperative assistance system. Each joint was equipped with incremental encoders for position feedback and driven by embedded motor controllers. A CAN-based bus connected the joint controllers to a host computer for sensor acquisition, model inference, and reference generation. Real-time deployment was conducted on a CPU-only platform (AMD Ryzen 7 6800H, 16 GB RAM) without discrete GPU acceleration, with all inference executed online. This setup reflects a practical deployment scenario for assistive and rehabilitation systems, where compact industrial PCs or laptop-class processors are commonly used.
Figure 2. Unilateral lower-limb exoskeleton platform: (a) system architecture and (b) prototype implementation.
During cooperative walking trials, IMU data from the healthy limb were streamed to the host computer at 200 Hz and uniformly downsampled to 100 Hz to match the controller update rate. A rolling temporal buffer of length N = 250 frames was continuously maintained. At each control cycle, the buffer was processed by the FusionTCN–Attention model to generate a short-horizon contralateral joint trajectory prediction with a horizon of M = 40 frames. The same temporal configuration and preprocessing pipeline were used during offline training and online inference, ensuring consistency between learning and deployment.
The control architecture adopted for real-time validation is illustrated in Figure 3. The predicted joint trajectories were used exclusively as reference inputs within a feedback-controlled system. At each control cycle, only the next-step prediction Y ^ t + 1 was supplied to the low-level joint controller. Joint actuation was governed by a PD controller with encoder feedback, while subsequent predicted frames were used to shape the feedforward reference profile. As a result, joint motion was continuously regulated by feedback, and prediction errors did not accumulate across control cycles.
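The per-cycle control law described above, in which only the next-step prediction Y_hat[t+1] feeds a PD loop on encoder feedback, can be sketched as follows; the gain values are illustrative placeholders, since the paper does not report its controller parameters.

```python
def pd_step(theta_ref: float, theta_meas: float, theta_dot_meas: float,
            kp: float = 40.0, kd: float = 1.5) -> float:
    """One low-level control cycle: track the next-step predicted reference
    angle with a PD law on encoder feedback. Torque limits and velocity
    saturation are assumed to be enforced downstream by the motor controller.

    theta_ref:      next-step predicted joint angle (rad)
    theta_meas:     encoder-measured joint angle (rad)
    theta_dot_meas: encoder-measured joint velocity (rad/s)
    returns:        commanded torque (N·m)
    """
    return kp * (theta_ref - theta_meas) - kd * theta_dot_meas
```

Because the measured angle re-enters the error term every cycle, prediction errors are corrected by feedback rather than integrated forward, which is why they do not accumulate across control cycles.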
Standard hardware safety constraints were enforced throughout all real-time experiments. These included joint-level torque and velocity saturation at the motor-controller level, watchdog-based controller monitoring, and a hardware emergency-stop circuit. All experiments were conducted on healthy adult subjects under human supervision. Extension of the proposed framework to impaired users would require additional safety certification and clinical validation, which are beyond the scope of this study.

4. Results

4.1. Evaluation Protocol and Metrics

To evaluate the proposed causal prediction framework under conditions that closely resemble real deployment, unilateral gait data were collected from twenty-one adult volunteers walking along a straight indoor corridor at two self-selected paces (Normal and Fast). All inertial signals were physically sampled at 200 Hz and uniformly resampled to 100 Hz, matching the control frequency of the cooperative exoskeleton described in Section 3.2. Three IMUs were attached to the healthy-side pelvis, thigh, and shank. Pelvis measurements were used exclusively to reconstruct reference joint angles, whereas the prediction architectures received only thigh and shank streams, mirroring the real-time configuration of the assistive system.
A subject-exclusive split was adopted (15/3/3 for training/validation/testing) to prevent cross-subject leakage. Unless otherwise stated, the temporal context window was fixed at N = 250 frames (2.5 s) and the prediction horizon at M = 40 frames (0.4 s), with stride s = 1 for frame-wise evaluation. Performance was computed separately for hip and knee and summarized across paces.
Model assessment combined pointwise accuracy with rhythm-preserving fidelity. Trajectory agreement was quantified by root-mean-squared error (RMSE) and Pearson correlation coefficient
$$\mathrm{RMSE} = \sqrt{ \frac{1}{M} \sum_{i=1}^{M} \big( \hat{y}_{i} - y_{i} \big)^{2} }, \qquad r = \mathrm{corr}\big( \hat{y}, y \big).$$
Temporal consistency was assessed by a velocity-domain RMSE (vRMSE) computed on first-order differences Δ y . Stride-scale morphology was summarized by the peak amplitude error (PeakAmpErr, degrees). Phase alignment was characterized by PhaseLag (%), defined as the time shift that maximizes the normalized cross-correlation and normalized by the median gait period. Model size is reported as trainable parameters (millions).

4.2. Quantitative Benchmark Comparison

Four causal predictors—FusionTCN–Attention, Seq2Seq, PlainTCN, and a masked self-attention Transformer—were evaluated under identical splits, windowing, and optimization. Table 2 lists angle errors (°), correlations, absolute lag (ms), peak-amplitude error (°), and parameter counts (M).
Overall, FusionTCN–Attention attains the lowest angle errors and highest correlations with a compact footprint (hip RMSE 5.71°, knee RMSE 7.43°, r_hip ≈ 0.905, r_knee ≈ 0.912, AbsLag 14.56 ms, PeakErr 1.99°, Params 0.366 M). The Transformer narrows the gap in correlation but exhibits larger peak deviations (PeakErr 4.27°). PlainTCN remains competitive on knee RMSE (7.29°) at modest lag (12.45 ms) but shows residual over-/undershoot near extrema. Seq2Seq presents the largest RMSEs (hip 9.50°, knee 10.06°) and substantially higher lag (32.26 ms), accompanied by significant peak attenuation (PeakErr 6.92°), consistent with the multi-frame phase drift observed at faster cadences.
Figure 4 shows long-sequence hip and knee angle predictions under Normal and Fast walking. FusionTCN–Attention exhibits the closest temporal alignment with the ground truth across cycles, with minimal phase drift and limited peak attenuation, especially around heel-strike and toe-off. Seq2Seq shows clear phase lag and peak attenuation over longer horizons, which is particularly detrimental for real-time gait assistance because accumulated temporal misalignment degrades cycle-level synchronization. PlainTCN and the Transformer present moderate overshoot or amplitude bias near extrema, consistent with the quantitative trends in Table 2.

4.3. Ablation and Design Sensitivity

A controlled ablation on FusionTCN–Attention modified exactly one factor per variant while holding all else fixed. The A-series removed structural components (attention, fusion head, residual behavior, or causality); the B-series adjusted loss components. Table 3 and Table 4 report means across test subjects.
Removing temporal attention (A1) increases both angle errors and lag, indicating a role in timing stability during rapid transitions. Disabling fusion (knee-only head, A2) compromises cross-joint consistency by construction and lowers knee-side correlation relative to the full model. Approximating residual removal by feature de-meaning (A3) degrades fit, suggesting residual flows help preserve high-slope information. A non-causal TCN (A4) achieves strong offline scores (e.g., hip RMSE 5.28°) yet violates deployable causality, hence it is excluded from real-time candidates.
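The causality constraint that separates A4 from the deployable variants can be made concrete with a minimal causal dilated convolution, in which left zero-padding guarantees that each output depends only on past and present samples. This NumPy sketch illustrates the mechanism only; kernel values and sizes are not the paper's.

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal dilated convolution: y[t] = sum_j w[j] * x[t - j*dilation].
    Left zero-padding keeps the output aligned with the input with no
    access to future frames (a non-causal variant would pad both sides)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[pad + t - j * dilation] for j in range(k))
                     for t in range(len(x))])
```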
Loss-oriented variants reveal complementary roles of the objective terms (Table 4). A pure MSE (B1) raises errors and lag, showing that dynamics-aware components add value beyond pointwise fit. Disabling the auxiliary knee term (B2) prioritizes knee-side fidelity (best knee RMSE and r_Knee among the B-series) and lowers peak error, indicating a tunable trade-off for joint-specific accuracy. Removing the velocity term (B3) yields the best hip-side fit and a markedly reduced lag, reflecting a shift from trajectory smoothness to pointwise agreement. Eliminating the peak term (B4) markedly inflates PeakErr, confirming its necessity for stride-scale morphology.
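The B-series removals correspond to dropping individual terms of a composite objective. A hedged sketch follows; the weights, term forms, and the hip/knee column layout are illustrative assumptions, not the paper's values.

```python
import numpy as np

def composite_loss(y_true, y_pred, lam_vel=0.1, lam_peak=0.1, lam_knee=0.5):
    """Illustrative composite objective; y_* has shape (T, 2) with assumed
    columns 0 = hip, 1 = knee. Each B-variant removes one term below."""
    mse = np.mean((y_pred - y_true) ** 2)                                    # B1 keeps only this
    vel = np.mean((np.diff(y_pred, axis=0) - np.diff(y_true, axis=0)) ** 2)  # removed in B3
    i = np.argmax(np.abs(y_true), axis=0)                                    # per-joint extrema
    peak = np.mean((y_pred[i, [0, 1]] - y_true[i, [0, 1]]) ** 2)             # removed in B4
    knee = np.mean((y_pred[:, 1] - y_true[:, 1]) ** 2)                       # removed in B2
    return mse + lam_vel * vel + lam_peak * peak + lam_knee * knee
```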

4.4. Temporal Error Profiles and Phase Consistency

Beyond aggregates, error time-courses and state-space geometry were examined to verify adherence to gait dynamics. Figure 5 displays frame-wise residuals (prediction minus ground truth, degrees) over long sequences. FusionTCN–Attention forms a narrow, near-zero envelope: the mean trace stays within ±3° for most segments, with brief troughs around −5° near rapid reversals. The plain causal TCN shows wider excursions (typically ±5°). Seq2Seq exhibits intermittent bursts exceeding 8° around fast flexion/extension, aligning with its larger RMSEs and higher lag; the Transformer narrows the correlation gap yet still shows 4–6° drift at slope extremes, consistent with its higher PeakErr.
Angle-velocity phase portraits (Figure 6) probe limit-cycle fidelity. Ground-truth hip loops span roughly [−20°, 30°] and [−200, 300]°/s; knee loops cover about [−10°, 60°] and [−600, 450]°/s. FusionTCN–Attention preserves loop shape and orientation with minimal distortion: peak angular-velocity loci align within one grid step (≈50–100°/s) without spurious self-intersections; models lacking temporal attention or using recurrent decoding display thinning/ballooning near apices, consistent with larger PeakErr and drift.
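Loop extents of the kind quoted above can be recovered from an angle trace by finite-difference velocity estimation. A minimal sketch, assuming a 100 Hz sampling rate and central differences (the paper does not specify the differentiation scheme):

```python
import numpy as np

def loop_extent(angle_deg, fs=100.0):
    """Angle and angular-velocity ranges of a phase-portrait loop.
    Velocity is estimated by central differences (deg/s)."""
    vel = np.gradient(angle_deg) * fs
    return (angle_deg.min(), angle_deg.max()), (vel.min(), vel.max())
```

Plotting `angle_deg` against `vel` yields the limit-cycle loop; distortion shows up as thinning or ballooning near the velocity apices.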
Bilateral angle-angle loops (Figure 7) assess inter-limb coupling. For the hip, the traversed corridor is about 20° wide; FusionTCN–Attention stays inside this corridor with cycle-wise bias of only a few degrees. For the knee, the narrow loop around mid-stance is particularly sensitive; predictions adhere without systematic hysteresis widening, consistent with the small PeakErr in Table 2.
Cycle-normalized residual ribbons (Figure 8) summarize mean error and ±1σ bands over 0–100% gait. For the hip (a), FusionTCN–Attention maintains a near-zero mean from ∼55–85% with bands largely within ±3°; the largest negative swing (≈−5°) appears around 35–45% near push-off. For the knee (b), the mean residual remains close to zero over mid-stance and late swing, with a brief negative dip of ∼6–8° near ∼25–30% where slope changes are steep. In contrast, Seq2Seq shows a pronounced trough near 75–80% (exceeding 10°), matching its higher RMSE and lag. These phase-wise trends align with Table 2.
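Ribbons of this kind require resampling each cycle's residual onto a common 0–100% phase axis. A sketch assuming cycle boundaries (e.g., heel-strike frame indices) are already available from event detection:

```python
import numpy as np

def cycle_normalize(residual, cycle_bounds, n_points=101):
    """Resample each gait cycle's residual onto a common 0-100% axis and
    return the phase-wise mean and standard deviation across cycles."""
    phases = np.linspace(0.0, 1.0, n_points)
    rows = []
    for s, e in zip(cycle_bounds[:-1], cycle_bounds[1:]):
        seg = residual[s:e]
        src = np.linspace(0.0, 1.0, len(seg))
        rows.append(np.interp(phases, src, seg))  # linear resampling
    rows = np.array(rows)
    return rows.mean(axis=0), rows.std(axis=0)
```

The mean trace gives the ribbon center and the standard deviation the ±1σ band at each phase.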

4.5. Real-Time Validation

Finally, FusionTCN–Attention was deployed on the unilateral exoskeleton platform for hardware-in-the-loop real-time validation. During closed-loop operation at 100 Hz, the end-to-end delay from IMU sampling to actuator command generation, including sensing, communication, inference, and motor control, remained within the 10 ms control cycle budget. No instability, sustained oscillation, or perceptible delay was observed during repeated walking trials. Representative long-sequence overlays recorded under wear conditions (Figure 9) show that the predicted reference trajectories closely follow the executed hip and knee motions over consecutive gait cycles, with accurate timing around heel-strike and toe-off.
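A budget check of this kind can be sketched as timing the full step against the 10 ms cycle. Here `step_fn` is a placeholder for the sensing-inference-command pipeline, not the platform's actual code.

```python
import time
import numpy as np

def check_cycle_budget(step_fn, x, n_iter=200, budget_ms=10.0):
    """Time repeated sense-predict-command steps and compare the worst
    case against the 100 Hz control-cycle budget."""
    times_ms = []
    for _ in range(n_iter):
        t0 = time.perf_counter()
        step_fn(x)
        times_ms.append((time.perf_counter() - t0) * 1e3)
    worst = max(times_ms)
    return worst, worst <= budget_ms
```

Checking the worst case rather than the mean matters here, because a single late command breaks the deterministic 100 Hz schedule.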

5. Discussion

The proposed FusionTCN–Attention demonstrates accurate contralateral prediction suitable for real-time assistance. Table 2 reports hip and knee RMSEs of 5.71° and 7.43° with high correlations (r > 0.9). The mean absolute lag of 14.56 ms falls within a single 100 Hz control cycle, minimizing controller phase-compensation requirements. By contrast, the Transformer model exhibits higher peak errors (4.27°), and PlainTCN shows oscillation near signal peaks. The Seq2Seq baseline yields the highest errors and a 32.26 ms lag, confirming phase drift accumulation in recursive decoding. Long-sequence plots (Figure 4) verify the model's capability to track rapid trajectory changes during toe-off and heel-strike.
Component-wise ablation validates the architectural design choices. Removing temporal attention (A1) degrades all accuracy metrics and increases latency, confirming its role in stabilizing timing during transitions. The knee-only fusion variant (A2) results in the highest system latency (22.67 ms), demonstrating that cross-joint feature integration is critical for minimizing phase lag. Approximating residual removal (A3) yields higher angle errors, suggesting that residual flows are essential for preserving high-slope information. While the non-causal TCN (A4) achieves lower hip error, it degrades knee tracking accuracy compared to the proposed causal model and violates real-time constraints. Regarding loss components, the full composite objective (A0) represents the optimal trade-off. Disabling the auxiliary knee term (B2) or the velocity term (B3) effectively reduces latency and peak error but compromises global tracking accuracy (increasing RMSE relative to the full model). Finally, excluding the peak-oriented term (B4) more than doubles the peak error (4.95° vs. 1.99°), validating its necessity for maintaining stride-scale morphology.
Limitations of this study are noted regarding the dataset. The evaluation utilizes data from unimpaired adults as an engineering benchmark to verify algorithmic causality and stability. The trunk reference frame captures mechanically valid kinematics for both symmetric and asymmetric gait. The use of healthy training data represents a deliberate design choice to generate “normative” reference trajectories that guide the paretic limb toward a corrected pattern, avoiding the replication of pathological asymmetry. Post-stroke hemiparesis involves spasticity and compensatory movements distinct from healthy coupling; consequently, prediction errors in clinical populations may exceed current reports. Future work will use this model as a pre-trained baseline for transfer learning on patient-specific datasets.

5.1. Clinical Translation and Hardware Integration

Clinical deployment requires seamless algorithm–hardware integration. The model’s lightweight architecture (0.366 M parameters) suits embedded platforms in wearable robotics, such as the STM32 microcontroller or NVIDIA Jetson series. Future implementations will integrate confidence-based gating to ensure safety; if predictive uncertainty exceeds a threshold (e.g., during unobserved movements), the controller decays to a transparent mode. Consistent, deterministic prediction facilitates long-term user adaptation and potentially reduces metabolic cost as the user learns to trust the assistance.
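The confidence-based gating described here can be sketched as a first-order decay of the assistance gain toward transparent mode. The threshold, decay rate, and uncertainty units below are illustrative assumptions, not proposed specifications.

```python
def gated_command(pred_angle, uncertainty, threshold=5.0, alpha=0.9, prev_gain=1.0):
    """When predictive uncertainty (deg, illustrative) exceeds the threshold,
    the assistance gain decays toward 0 (transparent mode); otherwise it
    recovers toward full gain. alpha sets the smoothing time constant."""
    target = 0.0 if uncertainty > threshold else 1.0
    gain = alpha * prev_gain + (1.0 - alpha) * target
    return gain * pred_angle, gain
```

The smooth decay avoids abrupt torque removal, which would itself be a disturbance to the wearer.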

5.2. Biomimetic Extensions and Environmental Adaptation

Future work will focus on biomimetic extensions mirroring central nervous system adaptation to varying contexts. For variable cadence, the fixed-window approach can incorporate phase-variable estimation, aligning the TCN receptive field with gait cycle percentage rather than absolute time. For terrain adaptation (e.g., stairs, ramps), a proposed multi-modal fusion framework supplements IMU data with vision-based context or EMG signals. This mimics biological sensorimotor integration, allowing preemptive impedance adjustment for stair ascent or descent to overcome level-ground limitations.

6. Conclusions

This paper presents FusionTCN–Attention, a causal temporal modeling framework designed for real-time contralateral gait prediction. By integrating a dilated causal Convolutional Network with a temporal attention mechanism, the architecture captures long-range gait dependencies while enforcing the strict causality required for closed-loop control. Drawing on the biomechanical principle of contralateral coupling, the proposed method reconstructs hip and knee trajectories using a minimal unilateral IMU setup.
Experimental validation on healthy subjects demonstrates the model’s tracking accuracy and timing stability. The method achieves hip and knee RMSEs of 5.71° and 7.43° ( r > 0.9 ), showing improved waveform fidelity and peak preservation compared to Seq2Seq and Transformer baselines. Furthermore, the model maintains a deterministic phase lag of 14.56 ms within a 100 Hz control cycle, effectively mitigating the phase drift issues observed in recurrent architectures.
From an engineering perspective, the study verifies the feasibility of high-fidelity gait estimation under embedded hardware constraints. With 0.366 M parameters, the model offers a lightweight module suitable for deployment on microcontroller-based exoskeleton platforms without GPU acceleration. Future work will focus on extending this baseline to hemiparetic cohorts via transfer learning and incorporating environmental adaptation strategies.

Author Contributions

Conceptualization, S.Y. and K.Y.; methodology, S.Y. and K.Y.; software, S.Y. and K.Y.; validation, S.Y., K.Y., L.Z., M.P., H.P. and X.G.; formal analysis, S.Y.; investigation, S.Y., K.Y. and X.G.; resources, L.Z., M.P., H.P., L.C. and X.G.; data curation, S.Y. and K.Y.; writing—original draft preparation, S.Y.; writing—review and editing, K.Y., L.Z., M.P., H.P., L.C. and X.G.; visualization, S.Y.; supervision, L.C. and X.G.; project administration, L.C.; funding acquisition, S.Y. and L.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Guangxi Science and Technology Major Program under Grant AA17204017 and the Innovation Project of Guangxi Graduate Education with the project number A3010022002.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Guangxi University (Approval Number: GXU-2025-110).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data is unavailable due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IMU: Inertial Measurement Unit
TCN: Temporal Convolutional Network
RMSE: Root Mean Square Error
Seq2Seq: Sequence-to-Sequence
LSTM: Long Short-Term Memory
HS: Heel-Strike
TO: Toe-Off
SE: Squeeze-and-Excitation
SEMG: Surface Electromyography
FOC: Field-Oriented Control

Figure 1. IMU sensor configuration and coordinate definitions on the healthy limb.
Figure 3. Closed-loop control architecture of the unilateral exoskeleton.
Figure 4. Comparison of predicted joint angles under Normal and Fast walking. Normal gait: hip (a), knee (b); Fast gait: hip (c), knee (d).
Figure 5. Long-sequence residuals for hip and knee joints (degrees). (a) Hip joint prediction error over time. (b) Knee joint prediction error over time. The horizontal line indicates zero error.
Figure 6. Hip and knee joint phase portraits. (a) Hip joint phase portrait: angular velocity versus joint angle. (b) Knee joint phase portrait: angular velocity versus joint angle. Ground truth is shown in black, and FusionTCN–Attention predictions are shown in red.
Figure 7. Bilateral angle-angle loops during gait: (a) hip joint loop (right hip versus left hip angle); (b) knee joint loop (right knee versus left knee angle). Ground-truth trajectories are shown in black, and FusionTCN–Attention predictions in red.
Figure 8. Residual error distributions across normalized gait cycles for (a) hip and (b) knee joints among different models. Shaded regions denote mean ± σ of residuals across test samples at each normalized gait phase.
Figure 9. Real-time tracking of hip and knee joint angles during unilateral exoskeleton assistance. Solid black curves denote exoskeleton encoder measurements, and red dashed curves denote FusionTCN–Attention predicted references; (a) walking trial 1; (b) walking trial 2.
Table 2. Prediction comparison across models on a representative trial at 100 Hz.
Model | Hip RMSE (°) | Knee RMSE (°) | r_Hip | r_Knee | AbsLag (ms) | PeakErr (°) | Params (M)
FusionTCN (ours) | 5.71 | 7.43 | 0.905 | 0.912 | 14.56 | 1.99 | 0.366
Seq2Seq | 9.50 | 10.06 | 0.848 | 0.854 | 32.26 | 6.92 | 0.052
PlainTCN | 7.38 | 7.29 | 0.890 | 0.915 | 12.45 | 3.88 | 0.622
Transformer | 6.56 | 7.25 | 0.883 | 0.911 | 12.43 | 4.27 | 0.558
Table 3. Ablation of FusionTCN–Attention (A-series; mean across test subjects). AbsLag (ms) is the mean absolute lag across hip and knee; A2 uses knee only.
Variant | Hip RMSE (°) | Knee RMSE (°) | r_Hip | r_Knee | AbsLag (ms) | PeakAmpErr (°) | Params (M)
A0 Full Model | 5.71 | 7.43 | 0.905 | 0.912 | 14.56 | 1.99 | 0.366
A1 No Attention | 6.40 | 8.85 | 0.908 | 0.870 | 16.55 | 2.82 | 0.326
A2 No Fusion (knee-only head) | 14.92 | 7.81 | – | 0.896 | 22.67 | 3.08 | 0.366
A3 No Residual (approx.) | 6.89 | 8.20 | 0.904 | 0.889 | 17.58 | 4.53 | 0.366
A4 Non-causal TCN | 5.29 | 7.94 | 0.917 | 0.900 | 14.88 | 1.37 | 0.366
Table 4. Ablation of FusionTCN–Attention (B-series: loss-component removals; mean across test subjects). AbsLag (ms) is the mean absolute lag across hip and knee.
Variant | Hip RMSE (°) | Knee RMSE (°) | r_Hip | r_Knee | AbsLag (ms) | PeakAmpErr (°) | Params (M)
B1 MSE-only | 6.17 | 8.09 | 0.911 | 0.891 | 14.65 | 2.65 | 0.366
B2 No KneeAux | 6.37 | 7.47 | 0.911 | 0.912 | 11.48 | 0.92 | 0.366
B3 No Velocity | 5.78 | 7.93 | 0.915 | 0.899 | 11.74 | 1.67 | 0.366
B4 No Peak | 6.51 | 7.82 | 0.895 | 0.897 | 16.72 | 4.95 | 0.366

Share and Cite

MDPI and ACS Style

Yang, S.; Yu, K.; Zhang, L.; Pan, M.; Pan, H.; Chen, L.; Guo, X. FusionTCN-Attention: A Causality-Preserving Temporal Model for Unilateral IMU-Based Gait Prediction and Cooperative Exoskeleton Control. Biomimetics 2026, 11, 26. https://doi.org/10.3390/biomimetics11010026
