Human-Inspired Force–Motion Imitation Learning with Dynamic Response for Adaptive Robotic Manipulation

Yuchuang Tong; Haotian Liu; Tianbo Yang; Zhengtao Zhang

doi:10.3390/biomimetics10120825

,

and

¹

CAS Engineering Laboratory for Intelligent Industrial Vision, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

²

Beijing Zhongke Huiling Robot Technology Co., Ltd., Beijing 100192, China

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Biomimetics2025, 10(12), 825;https://doi.org/10.3390/biomimetics10120825

Version Notes

Order Reprints

Abstract

Recent advances in bioinspired robotics highlight the growing demand for dexterous, adaptive control strategies that allow robots to interact naturally, safely, and efficiently with dynamic, contact-rich environments. Yet, achieving robust adaptability and reflex-like responsiveness to unpredictable disturbances remains a fundamental challenge. This paper presents a bioinspired imitation learning framework that models human adaptive dynamics to jointly acquire and generalize motion and force skills, enabling compliant and resilient robot behavior. The proposed framework integrates hybrid force–motion learning with dynamic response mechanisms, achieving broad skill generalization without reliance on external sensing modalities. A momentum-based force observer is combined with dynamic movement primitives (DMPs) to enable accurate force estimation and smooth motion coordination, while a broad learning system (BLS) refines the DMP forcing function through style modulation, feature augmentation, and adaptive weight tuning. In addition, an adaptive radial basis function neural network (RBFNN) controller dynamically adjusts control parameters to ensure precise, low-latency skill reproduction, and safe physical interaction. Simulations and real-world experiments confirm that the proposed framework achieves human-like adaptability, robustness, and scalability, attaining a competitive learning time of 5.56 s and a rapid generation time of 0.036 s, thereby demonstrating its efficiency and practicality for real-time applications and offering a lightweight yet powerful solution for bioinspired intelligent control in complex and unstructured environments.

Keywords:

bioinspired robotics; imitation learning; force–motion skill acquisition; adaptive control

1. Introduction

Recent advances in robotic manipulators have enabled transformative applications across manufacturing, healthcare, and service industries. Nevertheless, achieving human-like dexterity and generalizable interaction in complex, unstructured environments remains a persistent challenge [1,2,3,4].

Imitation learning based on dynamic movement primitives (DMPs) offers an efficient and data-driven method to acquire task-specific motion skills [5,6,7,8]. However, unlike humans who can flexibly adapt motion and force responses to external disturbances, conventional DMP-based frameworks exhibit limited adaptability, revealing the need for more generalizable skill learning mechanisms [9,10,11]. To improve adaptability and obstacle avoidance, DMPs have been extended with potential fields, bioinspired acceleration terms, and adaptive feedback control, combined with online planning to facilitate cooperative manipulation [12,13,14,15,16]. Imitation learning has further enhanced skill refinement and transferability [17,18,19]. Nonetheless, current methods face notable limitations: obstacle avoidance DMPs often fail to generalize across humanoid tasks, stylistic DMPs show restricted adaptability to new contexts, and imitation learning rarely accounts for stylistic or adaptive variability [15,20,21,22]. Despite increasing applications of DMPs in cooperative control, studies addressing stylistically adaptive imitation learning that supports robust task execution and broad generalization remain scarce [23,24,25].

Human motor skills inherently couple motion trajectories with muscle stiffness and contact forces—factors critical for transferring dexterous skills to robots [20]. DMPs have therefore been extended to represent motion and stiffness profiles, enhancing the learning of compliant behavior [26]. However, while robotic motion can be readily tracked via position sensors, direct measurement of contact forces is often constrained by sparse sensor deployment [27,28]. Alternative methods, such as stiffness estimation or EMG-based inference, are prone to noise and modeling errors. Momentum-based force observers provide an effective alternative by estimating contact forces accurately without requiring joint acceleration measurements, maintaining robustness under dynamic uncertainties [29].

Recent developments in advanced control have further contributed to adaptive and robust robotic behavior. Techniques such as sliding-mode control, adaptive neural network (NN) control, and robust control have demonstrated strong potential for handling nonlinear and uncertain dynamics [30,31,32,33]. Neural and fuzzy systems have been widely adopted for adaptive control to compensate for unmodeled dynamics, with fuzzy logic improving NN adaptability under system uncertainties [34]. Adaptive fuzzy neural network (FNN) control has shown effectiveness in managing interaction and uncertainty [35,36,37]. However, the presence of large payloads or time-varying dynamics can induce biases that degrade the performance of radial basis function neural networks (RBFNNs), reducing tracking accuracy in position and velocity control [38,39,40].

To address these challenges, this paper presents a human-inspired imitation learning framework that unifies force–motion learning and dynamic response mechanisms, enabling robots to robustly acquire and generalize manipulation skills in dynamic, contact-rich environments. Compared to conventional teaching-by-demonstration methods and previous approaches that separately model motion and force or rely on external sensors such as force/EMG [7,9,10,19,41], our bioinspired imitation learning framework simultaneously acquires motion and force skills, adaptively generalizes across tasks and environments without additional sensors, and dynamically responds to environmental changes, enabling robust, compliant, and human-like manipulation. In contrast to existing SoTA methods [11,26,28,31,33] that depend on data-intensive learning or sensor-heavy hybrid control, our method delivers comparable adaptability and precision while requiring substantially lower sensing and computational resources, thereby preserving the interpretability, modularity, and stability inherent in structured control paradigms. At its core, the framework integrates a momentum-based force observer with dynamic movement primitives (DMPs) for accurate force–motion coupling, augmented by a broad learning system (BLS) that refines the DMP forcing function through style modulation, feature augmentation, and adaptive weight tuning. An adaptive RBFNN controller further modulates control parameters in real time, ensuring precise and low-latency skill reproduction under unanticipated disturbances. This unified, lightweight, and sensor-free architecture significantly enhances robustness, adaptability, and scalability, providing a practical and generalizable solution for adaptive manipulation and human–robot interaction in uncertain, unstructured environments. The main contributions are summarized as follows:

We propose a human-inspired imitation learning framework that jointly acquires and generalizes force–motion skills, enabling robots to perform robust, adaptive, and compliant manipulation in dynamic, contact-rich environments, thereby addressing key challenges in practical robotic applications.
A momentum-based force observer integrated with DMPs and BLS is developed to enhance force–motion coupling, refine skill trajectories through style modulation and feature augmentation, and achieve accurate, low-latency skill reproduction.
An adaptive RBFNN controller is introduced to dynamically tune control parameters in response to unforeseen disturbances, improving system robustness, scalability, and safe physical interaction in unstructured environments.

This paper is organized as follows: Section 2 introduces the DMP-based skill modeling. Section 3 details imitation learning and the adaptive RBFNN controller. Section 4 overviews the framework. Section 5 presents simulation and experimental results. Section 6 concludes this paper.

2. Hybrid Force-Motion Skill Learning

2.1. Sensorless Momentum-Driven Force Observers

To acquire force-related skills from human demonstration maneuvers, a momentum-based force observer is employed to estimate the human contact forces without relying on external force sensors. The robot dynamics are described by

\begin{matrix} D (q) \ddot{q} + S (q, \dot{q}) \dot{q} + G (q) + τ_{e} = τ_{c} \end{matrix}

(1)

where

D (q) \in R^{n \times n}

,

S (q, \dot{q}) \dot{q} \in R^{n}

, and

G (q) \in R^{n}

represent the inertia matrix, Coriolis and centripetal vector, and gravity vector, respectively. Here, n denotes the number of degrees of freedom, and q,

\dot{q}

, and

\ddot{q}

denote the joint position, velocity, and acceleration. The terms

τ_{c}

and

τ_{e}

represent the control torque and the external torque (e.g., human-induced), respectively.

The generalized momentum of the manipulator is defined as

p = D (q) \dot{q}

. Taking its time derivative yields

\dot{p} = \dot{D} (q) \dot{q} + D (q) \ddot{q} = \dot{D} (q) \dot{q} - S (q, \dot{q}) \dot{q} - G (q) + τ_{c} - τ_{e},

(2)

which simplifies to

\dot{p} = S^{T} (q, \dot{q}) \dot{q} - G (q) + τ_{c} - τ_{e},

(3)

using the identity

\dot{D} (q) - 2 S (q, \dot{q}) = - S^{T} (q, \dot{q})

.

To analyze how each component of

\dot{p}

is affected by its corresponding torque, we expand the expression element-wise as

\begin{matrix} {\dot{p}}_{i} = - \frac{1}{2} {\dot{q}}^{T} \frac{\partial D (q)}{\partial q_{i}} \dot{q} - G_{i} (q) + τ_{c, i} - τ_{e, i}, i = 1, \dots, n . \end{matrix}

(4)

Based on this, the momentum observer is constructed as

\begin{matrix} \begin{matrix} \dot{\hat{p}} & = τ_{c} - \hat{Λ} (q, \dot{q}) + η \\ \dot{η} & = K_{e} (\dot{p} - \hat{\hat{p}}) \end{matrix} \end{matrix}

(5)

where

\hat{p}

is the estimated momentum and

η

is the observer output. The compensating term

Λ (q, \dot{q})

simplifies dynamics calculation and is defined as

\begin{matrix} \begin{matrix} Λ (q, \dot{q}) & : = S (q, \dot{q}) \dot{q} - \dot{D} (q) \dot{q} + G (q) \\ = - S^{T} (q, \dot{q}) \dot{q} + G (q) \end{matrix} \end{matrix}

(6)

Here,

K_{e} = diag (k_{e, 1}, \dots, k_{e, n})

is a positive-definite gain matrix. Integrating the observer dynamics yields

\begin{matrix} \begin{matrix} η & = K_{e} (p (t) - \int_{0}^{t} \dot{\hat{p}} (t) d t - p (0)) \\ = K_{e} (p (t) - \int_{0}^{t} (τ_{c} - \hat{Λ} (q, \dot{q}) + η) d t - p (0)) \end{matrix} \end{matrix}

(7)

Assuming ideal model knowledge (i.e.,

\hat{D} (q) = D (q)

,

\hat{Λ} (q, \dot{q}) = Λ (q, \dot{q})

), the error dynamics simplify to

\begin{matrix} \dot{η} = K_{e} (τ_{e} - η) \end{matrix}

(8)

Applying the Laplace transform to this first-order system yields

\begin{matrix} η_{i} = \frac{k_{e, i}}{t + k_{e, i}} τ_{e, i} = \frac{1}{1 + T_{e, i} s} τ_{e, i}, i = 1, \dots, n \end{matrix}

(9)

Thus, a larger gain

k_{e, i}

results in a faster transient response (i.e., smaller time constant

T_{e, i}

). In the limit as

K_{e} \to \infty

, the observer output

η

asymptotically models the actual external torque

τ_{e}

.

Hence, the momentum observer effectively functions as a virtual torque sensor, which allows for estimations of external Cartesian forces

F_{e}

through the robot’s Jacobian matrix. To reduce noise in the estimated force signal, a Kalman filter [42] is employed, with its output modeled as

\begin{matrix} F_{e} (t) = H (t) \hat{λ} (t) \end{matrix}

(10)

where

\hat{λ} (t)

denotes the estimated Wiener filter coefficients, and

H (t)

is the corresponding virtual regressor.

Remark 1.

The generalized momentum-based force observer [42] estimates external torques without inverting the inertia matrix or coupling DOFs, enabling robust, sensorless contact force estimation that reduces noise and enhances data quality and learning efficiency during human demonstrations.

2.2. Skill Encoding with Dynamic Movement Primitives

To encode motion skills observed from human demonstrations, the DMP formulation is employed as follows:

\begin{matrix} τ_{d} \dot{v} = α_{x} (𝒳_{g} - 𝒳) - β_{x} v + α_{f} [f (s) - (𝒳_{g} - 𝒳_{0}) s] \\ τ_{d} \dot{𝒳} = v \end{matrix}

(11)

where

𝒳, v, \dot{v}

denote the position, velocity, and acceleration in Cartesian space, respectively. Subscripts ₀ and _g indicate the initial and goal positions. The constants

α_{x}, β_{x}, α_{f} \in R

are positive scalars, and

τ_{d} > 0

is a temporal scaling factor.

This system (11) can be interpreted as a spring-damper mechanism modulated by a virtual force term

α_{f} [f (s) - (𝒳_{g} - 𝒳_{0}) s]

, where

(𝒳_{g} - 𝒳_{0})

acts as a spatial scaling term. The phase variable s evolves according to a canonical system

\begin{matrix} τ_{d} \dot{s} = - φ s, φ > 0, s_{0} = 1 \end{matrix}

(12)

where

φ

denotes the decay rate.

The nonlinear forcing term

f (s)

is represented as a weighted sum of normalized Gaussian basis functions:

\begin{matrix} f (s) = \sum_{i = 1}^{N_{ψ}} ω_{i} ψ_{i} (s) s \end{matrix}

(13)

with

\begin{matrix} ψ_{j} (s) = \frac{exp [- {(s - b_{j})}^{2} / (2 c_{j})]}{\sum_{j = 1}^{N_{ψ}} exp [- {(s - b_{j})}^{2} / (2 c_{j})]} \end{matrix}

(14)

where

ω_{j}

are the corresponding weights, and

b_{j}

and

c_{j}

denote the mean and variance of the j-th basis function, respectively.

N_{ψ}

is the total number of basis functions. As s monotonically decays to zero from its initial value

s_{0} > 0

, both

f (s)

and the virtual forcing term converge to zero, resulting in

𝒳 \to 𝒳_{g}

.

Assuming the demonstration trajectory is generated according to Model (11), the weights

ω_{j}

can be learned using linear regression. The expected forcing term is derived as

\begin{matrix} f^{*} (s) = & \frac{τ_{d} \ddot{𝒳} (\nabla s) + β_{x} \dot{𝒳} (\nabla s)}{α_{x}} \\ - (𝒳_{g} - 𝒳 (\nabla s)) + (𝒳_{g} - 𝒳_{0}) s \end{matrix}

(15)

where

𝒳 (\cdot)

denotes the demonstration trajectory, and

\nabla s

is the inverse of

s (t) = s_{0} exp (- φ t / τ_{d})

. The parameter vector

Ω = [ω_{1}, ω_{2}, \dots ω_{N_{ψ}}]

can be efficiently identified using the least-squares method based on the data obtained from Equation (14).

To capture force-related skills in the demonstrations, we similarly encode the evolution of external force

F_{e}

as a trajectory using a DMP-based model:

\begin{matrix} \begin{matrix} τ_{d} {\dot{F}}_{v} & = α_{x} (F_{g} - F) - β_{x} F_{v} + α_{f} [f (s) - (F_{g} - F_{0}) s] \\ τ_{d} \dot{F} & = F_{v} \end{matrix} \end{matrix}

(16)

where F and

F_{v}

denote the force profile and its rate of change. While the parameters

α_{x}

,

β_{x}

and

α_{f}

in this model are specific to the force dynamics, the temporal scaling factor

τ_{d}

and the phase variable s are shared with Model (11) to maintain temporal alignment.

Remark 2.

The DMP-based models (11) and (14) use spring-damper systems driven by virtual forces for motion and force, balancing rigidity and flexibility akin to human muscle stiffness modulation. Tuning

x_{g}

and

F_{g}

enables precise motion and force goals, while

τ_{d}

controls task duration, forming a unified, generalizable framework for encoding human motor behavior.

3. Skill Generalization & Reproduction

3.1. BLS-Based Forcing Function Modulation

The BLS [43] enhances training efficiency by mapping inputs into feature nodes and augmenting them with enhancement nodes. Integrated with DMPs, this method improves adaptability to disturbances in dynamic environments. Given input

I

and output

H \in R^{N_{f} \times M}

, the output is modeled as

\begin{matrix} H = [ℱ^{n_{z}} ∣ ℰ^{n_{e}}] 𝒲 = 𝒵 𝒲, \end{matrix}

(17)

where feature nodes

ℱ i = ϕ_{i} (I W f i + γ_{f i})

and enhancement nodes

ℰ j = ξ_{j} (ℱ^{n_{z}} W e j + γ_{e j})

capture nonlinear transformations. Output weights

𝒲

are efficiently computed via pseudo-inverse.

Adding new enhancement nodes

ℰ^{n_e + 1}

allows incremental weight updates:

\begin{matrix} 𝒲^{N_{f} + 1} = [\begin{matrix} 𝒲^{N_{f}} - 𝒵^{+} ℰ^{n_{e} + 1} κ^{T} H \\ κ^{T} H \end{matrix}], \end{matrix}

(18)

with

κ^{T}

defined based on residuals

ϱ

and projections

ρ

. This structure enables fast, adaptive learning with improved feature representation.

To enhance the expressiveness of the original forcing term

f (s)

, we introduce an extended formulation

f^{new} (s)

within the BLS framework. A subset of Gaussian basis functions from the original set is retained, while additional features

χ_{j} (s) = ζ_{j} (Φ (s) W_{j} + γ_{j})

are incorporated. Here,

ζ_{j}

is a nonlinear activation function, and

W_{j}, γ_{j}

are randomly initialized weights and biases, designed to enrich the representational capacity of the state vector

Ψ (s)

. The augmented state components

μ_{k}

are normalized as

\begin{matrix} μ_{k} = \frac{l_{k} (s) ψ_{k} (s) + (1 - l_{k} (s)) χ_{k - n_{z}} (s)}{\sum_{i = 1}^{n_{z}} ψ_{i} (s) + \sum_{j = 1}^{n_{e}} χ_{j} (s)} s, \end{matrix}

(19)

where

l_{k} (s) = 1

if

k \leq n_{z}

, and 0 otherwise. This ensures that

\sum_{k = 1}^{n_{z} + n_{e}} μ_{k} = s

.

Defining

Π \sum_{i = 1}^{n_{z}} ψ_{i} (s) = \sum_{i = 1}^{n_{z}} ψ_{i} (s) + \sum_{j = 1}^{n_{e}} χ_{j} (s)

, the expression simplifies to

\begin{matrix} μ_{k} = \frac{l_{k} (s) ψ_{k} (s)}{Π} + \frac{(Π - 1) (1 - l_{k} (s)) χ_{k - n_{z}} (s) s}{Π \sum_{j = 1}^{n_{e}} χ_{j} (s)} . \end{matrix}

(20)

The enhanced state vector

Ξ (s)

is constructed as

\begin{matrix} Ξ (s) = {[μ_{1}, μ_{2}, \dots, μ_{n_{z} + n_{e}}]}^{T} = {[\frac{Ψ (s)}{Π} ∣ (1 - \frac{1}{Π}) Υ (s)]}^{T}, \end{matrix}

(21)

with

Υ (s)

denoting normalized enhanced components

ς_{j} (s) = \frac{χ_{j} (s) s}{\sum_{j} χ_{j} (s)}

. The new forcing term is then defined by

\begin{matrix} f^{new} (s) = {(Ω_{n_{z} + n_{e}}^{new})}^{T} Ξ (s), \end{matrix}

(22)

where

Ω^{new} = {[Ω ∣ Ω^{𝒰}]}^{T}

combines weights from the original and enhanced components. This yields

\begin{matrix} f^{new} (s) & = \frac{1}{Π} f (s) + (1 - \frac{1}{Π}) f^{𝒰} (s) \\ = f (s) + (\frac{1}{Π} - 1) (f (s) - f^{𝒰} (s)) . \end{matrix}

(23)

Assuming the additional compensation targets mid-segment of the trajectory, the system dynamics are modified as

\begin{matrix} τ_{d} \dot{v} = α_{x} (x_{g} - x) - β_{x} v + α_{f} [f^{new} (s) - (x_{g} - x_{0}) s], \\ τ_{d} \dot{x} = v . \end{matrix}

(24)

This structure supports incremental learning by allowing trajectory adaptation via the residual term:

\begin{matrix} Δ 𝒞 (s) & = f^{new} (s) - f (s) = (\frac{1}{Π} - 1) (f (s) - f^{𝒰} (s)) \\ = {(Ω^{new})}^{T} Ξ^{E} (s), \end{matrix}

(25)

where

Ξ^{E} (s) = {[(1 / Π - 1) Ψ (s) ∣ (1 - 1 / Π) Γ (s)]}^{T}

represents the extended state.

For each DoF, the desired enhanced term is computed by

\begin{matrix} f_{j}^{new} {(s)}^{𝒟} = \frac{τ_{d} {\ddot{x}}_{j}^{new} + β_{x} {\dot{x}}_{j}^{new}}{α_{x}} - (x_{g} - x_{j}^{new}) + (x_{g} - x_{0}) s . \end{matrix}

(26)

The corresponding weights are obtained by minimizing the discrepancy:

\begin{matrix} min \sum_{j = 1}^{n_{z}} {∥ f_{j}^{new} {(s)}^{𝒟} - f_{j} (s) ∥}^{2} . \end{matrix}

(27)

The trajectory compensation term becomes

\begin{matrix} Δ 𝒞_{j}^{new} {(s)}^{𝒟} = f_{j}^{new} {(s)}^{𝒟} - f_{j} {(s)}^{𝒟} \\ = \frac{τ_{d} ({\ddot{x}}_{j}^{new} - {\ddot{x}}_{j}) + β_{x} ({\dot{x}}_{j}^{new} - {\dot{x}}_{j})}{α_{x}} + (x_{j}^{new} - x_{j}), \end{matrix}

(28)

and the final optimization objective is

\begin{matrix} min \sum_{j = 1}^{m} {∥ Δ 𝒞_{j}^{new} {(s)}^{𝒟} - Δ 𝒞_{j} (s) ∥}^{2} . \end{matrix}

(29)

Remark 3.

By introducing enriched features

χ_{j} (s)

, the proposed method expands the state and parameter space, implicitly adapting the normalization factor Π. Guided by BLS theory, this extension enables accurate approximation of complex nonlinear mappings and facilitates rapid, flexible trajectory refinement under dynamic conditions.

While the previous formulation supports adaptation via a single compensation term, it remains limited in representing diverse motion variations. To address this, we propose a generalized formulation of the compensation terms

Δ 𝒞^{k} (s), k = 1, \dots, 𝒦

, each capturing a distinct motion modulation based on a shared baseline trajectory

f (s)

:

Δ 𝒞^{k} (s) = {(Ω_{n_{z} + n_{e}}^{new})}^{k} Ξ^{E} (s),

(30)

where

{(Ω_{n_{z} + n_{e}}^{new})}^{k}

denotes a unique set of weights associated with the k-th modulation mode, and

Ξ^{E} (s)

is the common extended state-vector defined in (21).

To model these variations more compactly, we introduce a set of style coefficients

𝒮_{k} \in R

, such that

Δ 𝒞^{k} (s) = 𝒮_{k} Ω_{n_{z} + n_{e}}^{new} Ξ^{E} (s) = 𝒮_{k} Δ 𝒞^{𝒮} (s),

(31)

with the style modulation matrix defined as

S = [𝒮_{1}, 𝒮_{2}, \dots, 𝒮_{𝒦}],

(32)

which encodes inter-demonstration variability and enables adaptive modulation of the DMP forcing term. Each coefficient

𝒮_{k}

is empirically determined from demonstration data by minimizing the deviation between the baseline DMP forcing term and the observed trajectory via least-squares regression. A BLS is then employed to refine the estimates of

𝒮_{k}

, thereby capturing stylistic variations across multiple demonstrations and ensuring accurate trajectory reproduction and generalization.

To obtain these coefficients, we reformulate the imitation learning model by embedding a style-attractor landscape, replacing the conventional forcing function [44]. Given the target modulation set

Δ ℱ {(s)}^{𝒟} = [Δ 𝒞_{1} {(s)}^{𝒟}, \dots, Δ 𝒞_{𝒦} {(s)}^{𝒟}]

, we apply singular value decomposition:

\begin{matrix} Δ ℱ {(s)}^{𝒟} = U Σ V^{T} \approx S Δ ℱ^{S} {(s)}^{𝒟}, \end{matrix}

(33)

where

Δ ℱ^{S} {(s)}^{𝒟} = [Δ 𝒞_{1}^{S} {(s)}^{𝒟}, \dots, Δ 𝒞_{𝒦}^{S} {(s)}^{𝒟}]

denotes the style-invariant basis.

The optimal weights

{(Ω_{n_{z} + n_{e}}^{new})}^{k}

are then learned by minimizing the reconstruction loss

\begin{matrix} min \sum_{k = 1}^{𝒦} ∥Δ ℱ_{k}^{S} {(s)}^{𝒟} - Δ ℱ_{k}^{S} (s)∥ . \end{matrix}

(34)

Remark 4.

Integrating BLS with DMP in an imitation learning framework yields a unified representation for diverse motions. The normalized shared state vector

Ξ^{E} (s)

enables scalable weight expansion and compact style modulation, supporting flexible, context-aware trajectory adaptation.

3.2. Adaptive Control with RBFNN

This paper utilizes an RBFNN-based controller to accurately track both motion and force profiles corresponding to learned behaviors, encompassing both routine and feedback-driven executions, thereby enabling skill replication. Initially, the trajectories obtained from the baseline behavior (11) and those generalized from interaction-induced variations (24) are assigned as the desired joint positions

Q_{d} \in R^{n}

and velocities

{\dot{Q}}_{d} \in R^{n}

. The tracking errors are defined as

\begin{matrix} E = Q_{d} - q, \dot{E} = {\dot{Q}}_{d} - \dot{q} . \end{matrix}

(35)

where n denotes the number of DOFs in joint space.

To improve both the stability and tracking performance, a virtual control law is designed as

\begin{matrix} 𝒱 = {\dot{Q}}_{d} + 𝒫_{1} Q \end{matrix}

(36)

where

𝒫_{1} = d i a g {p_{q, 1}, \dots, p_{q, n}}

is a diagonal and positive definite gain matrix. The vector

Q

is defined based on L’Hospital’s rule as

\begin{matrix} Q & = {[σ_{1}, \dots, σ_{n}]}^{T} \\ = {[\frac{ι_{1}^{2}}{2 π E_{1}} sin (\frac{π E_{1}^{2}}{ι_{1}^{2}}), \dots, \frac{ι_{n}^{2}}{2 π E_{n}} sin (\frac{π E_{n}^{2}}{ι_{n}^{2}})]}^{T} \end{matrix}

(37)

where

ι

is a constant vector.

The control input is constructed as

\begin{matrix} τ_{c} = 𝒫_{2} 𝒱 + W^{T} \cdot Φ (𝒜) \end{matrix}

(38)

where

𝒫_{2}

is another diagonal gain matrix,

W

denotes the adaptive weight matrix,

𝒜 = {[𝒜_{1}, 𝒜_{2}, \dots 𝒜_{q}]}^{T} \in Θ_{𝒜} \subset R^{q}

is the neural input, and

Φ = [ϕ_{1}, ϕ_{2}, \dots, ϕ_{n}]

represents the basis function set. The generalized inner product ⋄ is defined as

W^{T} ⋄ Φ (𝒜) : = {[W_{1}^{T} ϕ_{1} (𝒜), W_{2}^{T} ϕ_{2} (𝒜), \dots, W_{n}^{T} ϕ_{n} (𝒜)]}^{T} .

This neural approximation term can model any bounded continuous function to arbitrary precision given sufficient nodes, according to

\begin{matrix} ℵ (𝒜) = W_{i}^{T} ϕ_{i} (𝒜) + ϵ_{i} (𝒜), \forall 𝒜 \in Θ_{𝒜} \end{matrix}

(39)

where

i = 0, l, g

denote networks with no bias, local bias, and global bias respectively;

ϵ_{i} (𝒜)

is the approximation error.

The Gaussian basis function used is

\begin{matrix} ϕ_{j} (𝒜) = exp [- \frac{{(𝒜 - h_{j})}^{T} (𝒜 - h_{j})}{r^{2}}], j = 1, 2, \dots, n, \end{matrix}

(40)

where

h_{j}

is the center vector of the j-th hidden node, and r is the width parameter.

The optimal weights are defined by minimizing the worst-case approximation error:

\begin{matrix} W^{*} = arg min_{W} \{sup_{𝒜 \in Θ_{𝒜}} | ℵ (𝒜) - W^{T} Φ (𝒜) |\} . \end{matrix}

(41)

where

ℵ (𝒜)

is the approximate model of the target dynamics.

The neural model approximates the dynamics (1) as

\begin{matrix} \begin{matrix} W^{* T} ⋄ Φ (𝒜) + ϵ (𝒜) = D (q) \ddot{q} + S (q, \dot{q}) \dot{q} + G (q) + τ_{e} \end{matrix} \end{matrix}

(42)

where the input

𝒜 = [q^{T}, {\dot{Q}}_{d}^{T}, {\ddot{Q}}_{d}^{T}]

reduces the dimensionality to

3 n

, thereby decreasing the hidden node complexity from

m^{4 n}

to

m^{3 n}

. Fine-tuned control gains are used to mitigate residual approximation errors.

The online learning law adopts a gradient descent with a discontinuous

δ

-modification:

\begin{matrix} {\dot{W}}_{k} = L_{k} (ϕ_{k} (𝒜) r_{k} - δ_{k} W_{k}) \end{matrix}

(43)

where

L_{k}

is the learning rate, and the switching gain

δ_{k}

is defined as

\begin{matrix} δ_{k} = \{\begin{matrix} 0 & if ∥W_{k}∥ < {\bar{W}}_{k} \\ {\bar{δ}}_{k} & if ∥W_{k}∥ ⩾ {\bar{W}}_{k} \end{matrix} \end{matrix}

(44)

Since the true optimal weights

|W^{*} k|

are unknown,

\bar{W} k

must be empirically tuned. A value too small effectively reduces the algorithm to fixed damping, while a value too large risks weight drift and control instability in high-gain settings [45].

Remark 5.

While δ-modification is commonly used to enhance the robustness of adaptive RBFNN controllers [35,45,46], its fixed damping effect can undermine function approximation. This drawback is mitigated via a discontinuous switching scheme, wherein

δ_{k}

is deactivated (i.e., set to zero) when the weight norm

|W_{k}|

falls below a predefined threshold

{\bar{W}}_{k}

, allowing more accurate adaptation without sacrificing stability.

4. Overview of the Framework

The proposed bioinspired imitation learning framework for motor and interaction skill acquisition consists of three stages (Figure 1), enabling robots to learn, generalize, and execute motion and force behaviors in dynamic, unstructured environments. The framework draws inspiration from human motor control principles, emphasizing adaptive force–motion coupling, style-based skill generalization, and reflex-like responsiveness.

Figure 1. Block diagram of the incremental learning framework for enhancing robotic responsiveness through integrated hybrid force-motion learning and dynamic response mechanisms.

In the demonstration and encoding phase, human demonstrations under unstructured contact conditions are captured using a momentum-based force observer, without relying on external force or electromyography sensors. Essential kinematic and dynamic data are encoded as Dynamic Movement Primitives (DMPs) by extracting parameters such as weight vectors

ω

, basis activations

ψ (s)

, and nonlinear forcing terms

f (s)

via Equations (11) and (12), yielding a compact representation that closely approximates target trajectories (26). This stage enables human-like adaptive coordination between motion and force, allowing reflexive responses to environmental perturbations that extend beyond the capabilities of conventional TbD methods.

For skill generalization, the framework integrates a Broad Learning System (BLS) with DMPs, incrementally refining internal models through adaptive modulation terms

Δ ℱ {(s)}^{𝒟}

computed by Equations (26) and (28) and decomposed into stylistic components

Δ ℱ_{k}^{S} {(s)}^{𝒟}

via singular value decomposition. Using prior weights

Ω

and extended state vectors

Ξ^{E} (s)

, the adapted weights

Ω_{n_{z} + n_{e}}^{new}

are optimized following Equations (19)–(22), (31), and (32), allowing context-sensitive modulation of motion and force. Style coefficients

𝒮_{k}

and BLS integration enable reproduction of demonstration variability, capturing human-inspired trajectory nuances and ensuring robust generalization across tasks and environments. Compared to traditional TbD robots, this approach allows simultaneous learning of motion and force skills, supporting compliant, adaptive interaction in high-contact tasks and enabling skill generalization without requiring additional demonstrations or sensors.

Finally, an RBFNN controller dynamically adjusts parameters to track desired joint-space trajectories

Q_{d} \in R^{n}

and velocities

{\dot{Q}}_{d} \in R^{n}

, reducing input dimensionality from

4 n

to

3 n

to enhance computational efficiency. This adaptive control layer preserves smooth, low-latency motion while maintaining stable force–motion coordination under unforeseen environmental changes, such as obstacle appearance or surface variations. Collectively, these bioinspired elements endow the robot with reflexive adaptability, sensor-free efficiency, human-like skill generalization, and robust responsiveness to dynamic perturbations, demonstrating clear advantages over standard teaching-by-demonstration approaches.

5. Simulation and Physical Experiments

This section presents simulations and physical experiments with the UR5 manipulator to demonstrate the framework’s effectiveness and practicality in diverse tasks.

5.1. Simulation Setup

Simulation experiments were conducted in Matlab and CoppeliaSim to validate the proposed framework using manually collected demonstrations. In the simulated task, the robot autonomously performs whiteboard drawing while avoiding obstacles, with contact forces estimated via a momentum-based observer. Hybrid force-position skill learning ensures precise trajectory tracking, while the incremental learning mechanism facilitates trajectory diversification and enhances generalization.

The physical experimental platform (Figure 2) consists of a UR5 manipulator, a 3D camera, and an operator console. The 3D camera serves solely as an environmental perception module for auxiliary tasks, including target localization, obstacle detection, and workspace boundary monitoring, ensuring safe and reliable task execution. It does not participate in the force–motion control loop or the skill generalization process. Obstacle configuration in simulation adheres to predefined workspace safety and reachability principles: (1) obstacles are placed within the manipulator’s reachable workspace without violating kinematic or joint constraints; (2) sufficient spacing between obstacles is maintained to preserve maneuverability; (3) obstacle positions are randomized across trials to evaluate the generalization capability of the learned skill; and (4) irregular or non-convex obstacles are approximated as convex polyhedra with spherical envelopes to enable computationally tractable collision checking. These setup rules ensure consistent, reproducible conditions for evaluating reactive and compliant motion generation under diverse and realistic scenarios.

Figure 2. Simulation and physical platforms.

The main experimental parameters are configured as follows:

α = 60

,

β = 15

, the number of basis functions is 200, and the initial number of enhancement nodes is 1. The initial and final joint states are defined as

q (0) = {[0.657, - 1.195, 0.392, - 0.768, - 1.571, 0]}^{T} rad

and

\dot{q} (0) = {[0, 0, 0, 0, 0, 0]}^{T} rad / s

. Obstacles are modeled as spheres with a radius of

0.05 m

, located at

(- 0.45, - 0.175, 0.6) m

and

(- 0.425, - 0.275, 0.275) m

, respectively, with a minimum obstacle distance of

d_{min} = 0.06 m

. The total task duration is set to

t = 30 s

.

5.2. Verification of the Momentum-Driven Force Observer

Figure 3 evaluates the proposed momentum-based force observer under conditions of intentionally applied constant external force during the drawing task. The comparison between sensor-measured force, estimated force, and Kalman-filtered output shows that, upon pen–whiteboard contact, the force rises sharply and then stabilizes. The sudden fluctuations observed in the estimated force primarily result from rapid transitions in contact states, including abrupt directional and positional adjustments of the end-effector. Around 14 s, a transient deviation between the filtered and measured forces occurs due to a short-lived, high-frequency contact disturbance. These transient peaks are effectively attenuated by the observer dynamics and the DMP-based motion smoothing mechanism. The momentum observer functions as a virtual torque sensor, estimating external Cartesian forces via the robot’s Jacobian matrix. Kalman filtering mitigates measurement noise and further enhances signal smoothness. Despite temporary peaks, the steady-state force estimation error remains on the order of

10^{- 2} N

, without materially affecting task execution or the stability of adaptive control. This high estimation precision enables consistent reproduction of human-like force modulation, demonstrating the observer’s reliability and effectiveness in contact-rich robotic tasks.

Figure 3. Estimated and measured contact forces for momentum-based force observers.

5.3. Verification of the Incremental Learning Framework

Figure 4 illustrates the effectiveness of the proposed framework in acquiring reactive skills and achieving obstacle-aware motion adaptation. As shown in Figure 4a–c, the generated trajectories progressively converge toward smooth, collision-free motions as the number of augmentation nodes increases, demonstrating enhanced adaptability and refinement of the learned skill. Figure 4d further shows that, with an increasing number of augmentation nodes, the minimum distance between the manipulator and obstacles gradually exceeds the predefined safety threshold, indicating improved compliance and robustness in reactive behavior. This trend verifies that higher augmentation node density enhances the stability and safety margin of the learned skill. Collectively, these results substantiate the framework’s capability to generalize across multiple demonstrations, adapt to dynamic and contact-rich environments, and ensure safe and reliable physical interaction. The observed trajectory evolution further confirms that the integration of style modulation and adaptive learning enables robust, human-like reactive skill acquisition. Figure 4 therefore provides both qualitative and quantitative validation of the framework’s effectiveness in achieving safe, adaptive, and generalizable skill learning.

Figure 4. Trajectory learning process for reactive skills when the number of augmentation nodes increases. (a) Performance of 5 enhancing nodes; (b) performance of 10 enhancing nodes; (c) performance of 15 enhancing nodes; (d) relationship between obstacle avoidance performance and the number of enhanced nodes.

Furthermore, Table 1 quantitatively compares the proposed method with DMP [5], GMM-DMP [12], ProDMP [13], and CDPMM-DMP [16], demonstrating superior performance in trajectory similarity (i.e., the average distance of the generated trajectory to demo), learning time, and generation time. The main parameters of these methods are of the same size. As the number of enhanced nodes increases, our method achieves the trajectory similarity error of 0.020 m, indicating high fidelity in replicating demonstrated skills. Additionally, it maintains a competitive learning time of 5.56 s and a rapid generation time of 0.036 s, showcasing its efficiency and practicality for real-time applications. These results underscore the framework’s effectiveness in skill acquisition and generalization, outperforming existing methods across key performance metrics.

Table 1. Quantitative comparisons among different methods.

An experiment altering the drawing position while keeping the original demonstration validated the framework’s generalization capability (Figure 5). Adjusted DMP trajectories enabled effective robot control using generalized trajectory and force data, demonstrating adaptability to spatial variations without major reconfiguration. This flexibility ensures precise, robust performance across diverse tasks and dynamic environments. By minimizing retraining or manual tuning, the framework enhances robotic efficiency and scalability. Overall, Figure 5 confirms the framework’s effectiveness in extending skill learning to new spatial contexts, supporting versatile and reliable robotic operation.

Figure 5. Analysis of the generalization ability of the incremental learning framework. (a) Trajectory generation performance; (b) force generation performance.

5.4. Verification of the Adaptive RBFNN Controller

Figure 6 validates this imitative learning framework’s precision and robustness in trajectory tracking. As shown in Figure 6a–c, the manipulator achieves smooth joint transitions and maintains tracking errors below

5 \times 10^{- 4}

rad. The comparison in Figure 6b between learned and generalized trajectories confirms accurate path following and effective obstacle avoidance. Force-feedback-induced fluctuations in Figure 6c are well controlled, demonstrating stability in dynamic environments. Figure 6d further highlights the adaptive RBFNN controller’s superior accuracy and resilience compared to the conventional RBFNN, underscoring enhanced adaptability. Collectively, these results confirm the framework’s robustness and practical efficacy in precise, reliable execution of complex tasks under diverse conditions.

Figure 6. Tracking control results for skill reproduction using the adaptive RBFNN controller. (a) joint angle of the manipulator, (b) motion trajectory of the manipulator, (c) tracking error of the proposed adaptive RBFNN controller, and (d) tracking error of the conventional RBFNN controller.

5.5. Implementation & Verification

The results from both simulation and real-world experiments (Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10) comprehensively validate the proposed imitation-learning framework in terms of skill generation, generalization, and robustness.

Figure 7. Simulation and experimental demonstrations. The 1st line: skill learning performance; the 2nd line: skill generalization performance; the 3rd line: physical experiment performance.

Figure 8. Physical experiments for skill learning and generalization in plug-in tasks. (a) Data collection for obstacle-free case; (b) data collection for obstacle avoidance case; (c) skill learning performance; (d) skill generalization performance.

Figure 9. Trajectories of skill learning and generalization generation in physical experiments with plug-in tasks.

Figure 10. Comparison of skill learning in cleaning tasks through physical experiments. (a) Conventional position control; (b) the proposed method.

In simulation, Figure 6b and Figure 7 demonstrate that the learned trajectories successfully achieve collision-free performance across multiple randomly generated spherical obstacle configurations. The proposed augmentation mechanism enables trajectories to adaptively reshape in response to varying obstacle layouts, ensuring reliable obstacle-aware behavior under complex spatial constraints. The simulations further illustrate that reactive skill learning remains smooth and stable, allowing the manipulator to complete the specified end-effector drawing tasks despite the presence of multiple obstacles.

For physical validation, diverse manipulation tasks—including power-plugging and surface cleaning—were performed on the UR5 platform (Figure 8, Figure 9 and Figure 10). During skill acquisition, the robot accurately executes drawing and insertion tasks while maintaining safe distances from surrounding obstacles. When task locations or obstacle arrangements change, the system exhibits strong adaptability, generalizing learned motion and force profiles with minimal retraining. In the power-plugging task (Figure 8a–d), the framework generalizes human demonstrations to achieve collision-free insertion even when socket positions shift, aided by visual calibration from the 3D camera. The trajectories are adaptively generated via force–motion modulated Dynamic Movement Primitives (DMPs), and rather than representing a single globally optimal path, multiple feasible trajectories exist for the same task, each maintaining skill-consistent, compliant, and smooth force–motion execution. This demonstrates that the framework not only reproduces demonstrated skills but also flexibly responds to dynamic environmental changes, a capability often lacking in conventional teaching-by-demonstration robots. The robot consistently completes the insertion task while avoiding obstacles, and trajectory comparisons (Figure 9) confirm that the learned force–motion policy preserves stable, low-latency performance under varying conditions. In the cleaning task, interaction-force modulation (Figure 10b) ensures consistent surface contact and superior performance compared with conventional position control (Figure 10a), which often fails under force uncertainty. Collectively, these results indicate that the proposed framework effectively learns and reproduces force–motion skills, enabling precise, compliant, and adaptive interactive manipulation in tasks requiring both contact accuracy and dynamic environmental responsiveness.

Collectively, these findings substantiate that the proposed framework preserves stable force–motion coordination, smooth trajectory tracking, and compliant interaction in both single- and multi-obstacle environments. The framework demonstrates strong robustness against workspace perturbations and irregular obstacle geometries, verifying its capacity for safe, adaptive, and generalizable manipulation in complex and cluttered settings.

6. Conclusions

This paper presented a bioinspired imitation learning framework that unifies force–motion learning with dynamic response mechanisms, enabling robots to acquire and generalize dexterous manipulation skills in dynamic, contact-rich environments while operating with a lightweight, sensor-free architecture that substantially reduces sensing and computational requirements compared with existing SoTA methods. By integrating a momentum-based force observer with DMPs and a BLS, complemented by an adaptive RBFNN controller, the framework achieves accurate force estimation, low-latency skill reproduction, and robust, compliant performance without relying on external sensors. Simulation and real-world experiments demonstrate human-like adaptability and dexterity, safe physical interactions, and broad task generalization. These results underscore the framework’s potential as a practical, scalable solution for advanced robotic manipulation in unstructured and uncertain environments, providing a foundation for future extensions such as multi-robot cooperation, complex object handling, and vision-guided autonomy.

Author Contributions

Conceptualization, Y.T. and H.L.; methodology, Y.T. and H.L.; software, Y.T. and H.L.; validation, H.L. and T.Y.; formal analysis, Y.T.; investigation, Y.T.; data curation, H.L.; writing—original draft preparation, Y.T. and H.L.; writing—review and editing, Y.T.; visualization, Y.T. and H.L.; supervision, Y.T. and Z.Z.; project administration, Z.Z.; funding acquisition, Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (62303457, U21A20482), Beijing Municipal Natural Science Foundation (4252053).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

Author Zhengtao Zhang was employed by the Institute of Automation, Chinese Academy of Sciences, and Beijing Zhongke Huiling Robot Technology Co. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Liu, W.; Wang, J.; Wang, Y.; Wang, W.; Lu, C. ForceMimic: Force-centric imitation learning with force-motion capture system for contact-rich manipulation. In Proceedings of the 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, GA, USA, 19–23 May 2025. [Google Scholar]
Ablett, T.; Limoyo, O.; Sigal, A.; Jilani, A.; Kelly, J.; Siddiqi, K.; Hogan, F.; Dudek, G. Multimodal and Force-Matched Imitation Learning With a See-Through Visuotactile Sensor. IEEE Trans. Robot. 2025, 41, 946–959. [Google Scholar] [CrossRef]
Zare, M.; Kebria, P.M.; Khosravi, A.; Nahavandi, S. A survey of imitation learning: Algorithms, recent developments, and challenges. IEEE Trans. Cybern. 2024, 54, 7173–7186. [Google Scholar] [CrossRef]
Liu, H.; Tong, Y.; Zhang, Z. Human observation-inspired universal image acquisition paradigm integrating multi-objective motion planning and control for robotics. IEEE/CAA J. Autom. Sin. 2024, 11, 2463–2475. [Google Scholar] [CrossRef]
Ijspeert, A.J.; Nakanishi, J.; Hoffmann, H.; Pastor, P.; Schaal, S. Dynamical movement primitives: Learning attractor models for motor behaviors. Neural Comput. 2013, 25, 328–373. [Google Scholar] [CrossRef] [PubMed]
Xue, W.; Lian, B.; Kartal, Y.; Fan, J.; Chai, T.; Lewis, F.L. Model-free inverse H-infinity control for imitation learning. IEEE Trans. Autom. Sci. Eng. 2024, 22, 5661–5672. [Google Scholar] [CrossRef]
Liu, H.; Tong, Y.; Liu, G.; Ju, Z.; Zhang, Z. IDAGC: Adaptive Generalized Human-Robot Collaboration via Human Intent Estimation and Multimodal Policy Learning. In Proceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hangzhou, China, 19–25 October 2025. [Google Scholar]
Lu, Z.; Wang, N.; Yang, C. A dynamic movement primitives-based tool use skill learning and transfer framework for robot manipulation. IEEE Trans. Autom. Sci. Eng. 2024, 22, 1748–1763. [Google Scholar] [CrossRef]
Lu, Z.; Wang, N.; Li, M.; Yang, C. Incremental Motor Skill Learning and Generalization From Human Dynamic Reactions Based on Dynamic Movement Primitives and Fuzzy Logic System. IEEE Trans. Fuzzy Syst. 2022, 30, 1506–1515. [Google Scholar] [CrossRef]
Zhong, S.; Wu, W. Motion Learning and Generalization of Musculoskeletal Robot Using Gain Primitives. IEEE Trans. Autom. Sci. Eng. 2023, 21, 1580–1591. [Google Scholar] [CrossRef]
Huang, H.H.; Zhang, T.; Yang, C.G.; Chen, C.L.P. Motor Learning and Generalization Using Broad Learning Adaptive Neural Control. IEEE Trans. Ind. Electron. 2020, 67, 8608–8617. [Google Scholar] [CrossRef]
Pervez, A.; Lee, D. Learning task-parameterized dynamic movement primitives using mixture of GMMs. Intell. Serv. Robot. 2018, 11, 61–78. [Google Scholar] [CrossRef]
Li, G.; Jin, Z.; Volpp, M.; Otto, F.; Lioutikov, R.; Neumann, G. ProDMP: A Unified Perspective on Dynamic and Probabilistic Movement Primitives. IEEE Robot. Autom. Lett. 2023, 8, 2325–2332. [Google Scholar] [CrossRef]
Daab, T.; Jaquier, N.; Dreher, C.; Meixner, A.; Krebs, F.; Asfour, T. Incremental Learning of Full-Pose Via-Point Movement Primitives on Riemannian Manifolds. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 13–17 May 2024; pp. 2317–2323. [Google Scholar]
Liu, Z.; Fang, Y. A Novel DMPs Framework for Robot Skill Generalizing With Obstacle Avoidance: Taking Volume and Orientation Into Consideration. IEEE/ASME Trans. Mechatronics 2025, 1–11. [Google Scholar] [CrossRef]
Jiang, H.; He, J.; Duan, X. CDPMM-DMP: Conditional Dirichlet Process Mixture Model-Based Dynamic Movement Primitives. IEEE Trans. Autom. Sci. Eng. 2025, 22, 14201–14217. [Google Scholar] [CrossRef]
Tong, Y.C.; Liu, H.T.; Zhang, Z.T. Advancements in humanoid robots: A comprehensive review and future prospects. IEEE/CAA J. Autom. Sin. 2024, 11, 301–328. [Google Scholar] [CrossRef]
Nah, M.C.; Lachner, J.; Hogan, N. Robot control based on motor primitives: A comparison of two approaches. Int. J. Robot. Res. 2024, 43, 1959–1991. [Google Scholar] [CrossRef]
Liu, H.; Tong, Y.; Zhang, Z. DTRT: Enhancing human intent estimation and role allocation for physical human-robot collaboration. In Proceedings of the 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, GA, USA, 19–23 May 2025; pp. 16312–16318. [Google Scholar] [CrossRef]
Xia, W.; Liao, Z.; Lu, Z.; Yao, L. Bio-Signal-Guided Robot Adaptive Stiffness Learning via Human-Teleoperated Demonstrations. Biomimetics 2025, 10, 399. [Google Scholar] [CrossRef]
Jian, X.; Song, Y.; Liu, D.; Wang, Y.; Guo, X.; Wu, B.; Zhang, N. Motion Planning and Control of Active Robot in Orthopedic Surgery by CDMP-Based Imitation Learning and Constrained Optimization. IEEE Trans. Autom. Sci. Eng. 2025, 22, 12197–12212. [Google Scholar] [CrossRef]
Pupa, A.; Di Vittorio, F.; Secchi, C. A Novel Dynamic Motion Primitives Framework for Safe Human-Robot Collaboration. In Proceedings of the 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, GA, USA, 19–23 May 2025; pp. 16326–16332. [Google Scholar] [CrossRef]
Wu, H.; Zhai, D.H.; Xia, Y. Probabilized Dynamic Movement Primitives and Model Predictive Planning for Enhanced Trajectory Imitation Learning. IEEE Trans. Ind. Electron. 2025, 72, 620–628. [Google Scholar] [CrossRef]
Chen, Z.; Fan, K. An online trajectory guidance framework via imitation learning and interactive feedback in robot-assisted surgery. Neural Netw. 2025, 185, 107197. [Google Scholar] [CrossRef]
Wang, W.; Zeng, C.; Zhan, H.; Yang, C. A Novel Robust Imitation Learning Framework for Complex Skills With Limited Demonstrations. IEEE Trans. Autom. Sci. Eng. 2025, 22, 3947–3959. [Google Scholar] [CrossRef]
Liao, Z.; Tassi, F.; Gong, C.; Leonori, M.; Zhao, F.; Jiang, G.; Ajoudani, A. Simultaneously Learning of Motion, Stiffness, and Force From Human Demonstration Based on Riemannian DMP and QP Optimization. IEEE Trans. Autom. Sci. Eng. 2025, 22, 7773–7785. [Google Scholar] [CrossRef]
Yang, C.G.; Peng, G.Z.; Cheng, L.; Na, J.; Li, Z.J. Force Sensorless Admittance Control for Teleoperation of Uncertain Robot Manipulator Using Neural Networks. IEEE Trans. Syst. Man Cybern. Syst. 2021, 51, 3282–3292. [Google Scholar] [CrossRef]
Lyu, C.; Guo, S.; Yan, Y.; Zhang, Y.; Zhang, Y.; Yang, P.; Liu, J. Deep-Learning-Based Force Sensing Method for a Flexible Endovascular Surgery Robot. IEEE Trans. Instrum. Meas. 2024, 73, 1–10. [Google Scholar] [CrossRef]
Heng, S.; Zang, X.; Liu, Y.; Song, C.; Chen, B.; Zhang, Y.; Zhu, Y.; Zhao, J. A Robust Disturbance Rejection Whole-Body Control Framework for Bipedal Robots Using a Momentum-Based Observer. Biomimetics 2025, 10, 189. [Google Scholar] [CrossRef]
Prigozin, A.; Degani, A. Interacting with Obstacles Using a Bio-Inspired, Flexible, Underactuated Multilink Manipulator. Biomimetics 2024, 9, 86. [Google Scholar] [CrossRef]
Liu, H.; Zhang, Z.; Tong, Y.; Ju, Z. Multistep Intent Estimation Guided Adaptive Passive Control for Safety-Aware Physical Human–Robot Collaboration. IEEE Trans. Cybern. 2025, 1–13. [Google Scholar] [CrossRef]
Liu, Z.; Liu, J.; Zhang, O.; Zhao, Y.; Chen, W.; Gao, Y. Adaptive Disturbance Observer-Based Fixed-Time Tracking Control for Uncertain Robotic Systems. IEEE Trans. Ind. Electron. 2024, 71, 14823–14831. [Google Scholar] [CrossRef]
Tong, Y.; Liu, H.; Yang, T.; Zhang, Z. A game theory perspective for multi-manipulators with unknown kinematics. IEEE Trans. Cogn. Dev. Syst. 2025, 1–18. [Google Scholar] [CrossRef]
Yang, Y.; Li, Z.; Shi, P.; Li, G. Fuzzy-Based Control for Multiple Tasks With Human–Robot Interaction. IEEE Trans. Fuzzy Syst. 2024, 32, 5802–5814. [Google Scholar] [CrossRef]
Liu, Q.; Li, D.; Ge, S.S.; Ji, R.; Ouyang, Z.; Tee, K.P. Adaptive bias RBF neural network control for a robotic manipulator. Neurocomputing 2021, 447, 213–223. [Google Scholar] [CrossRef]
Li, L.J.; Chang, X.; Chao, F.; Lin, C.M.; Huỳnh, T.T.; Yang, L.; Shang, C.; Shen, Q. Self-Organizing Type-2 Fuzzy Double Loop Recurrent Neural Network for Uncertain Nonlinear System Control. IEEE Trans. Neural Netw. Learn. Syst. 2025, 36, 6451–6465. [Google Scholar] [CrossRef]
Luo, X.; Li, Z.; Yue, W.; Li, S. A Calibrator Fuzzy Ensemble for Highly-Accurate Robot Arm Calibration. IEEE Trans. Neural Netw. Learn. Syst. 2025, 36, 2169–2181. [Google Scholar] [CrossRef]
Xu, S.; Wu, Z. Adaptive learning control of robot manipulators via incremental hybrid neural network. Neurocomputing 2024, 568, 127045. [Google Scholar] [CrossRef]
Liu, H.; Tong, Y.; Zhang, Z. Human-inspired adaptive optimal control framework for robot-environment interaction. IEEE Trans. Syst. Man Cybern. Syst. 2025, 55, 6085–6098. [Google Scholar] [CrossRef]
Peng, G.; Li, T.; Guo, Y.; Liu, C.; Yang, C.; Chen, C.L.P. Force Observer-Based Motion Adaptation and Adaptive Neural Control for Robots in Contact With Unknown Environments. IEEE Trans. Cybern. 2025, 55, 2138–2150. [Google Scholar] [CrossRef]
Xu, J.; Xu, L.; Ji, A.; Li, Y.; Cao, K. A DMP-Based Motion Generation Scheme for Robotic Mirror Therapy. IEEE-ASME Trans. Mechatronics 2023, 28, 3120–3131. [Google Scholar] [CrossRef]
Wang, N.; Chen, C.; Di Nuovo, A. A Framework of Hybrid Force/Motion Skills Learning for Robots. IEEE Trans. Cogn. Dev. Syst. 2021, 13, 162–170. [Google Scholar] [CrossRef]
Chen, C.L.P.; Liu, Z. Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 10–24. [Google Scholar] [CrossRef]
Kulic, D.; Ott, C.; Lee, D.H.; Ishikawa, J.; Nakamura, Y. Incremental learning of full body motion primitives and their sequencing through human motion observation. Int. J. Robot. Res. 2012, 31, 330–345. [Google Scholar] [CrossRef]
Ge, S.S.; Wang, C. Adaptive neural control of uncertain MIMO nonlinear systems. IEEE Trans. Neural Netw. 2004, 15, 674–692. [Google Scholar] [CrossRef] [PubMed]
He, W.; Ge, S.S.; Li, Y.; Chew, E.; Ng, Y.S. Neural Network Control of a Rehabilitation Robot by State and Output Feedback. J. Intell. Robot. Syst. 2015, 80, 15–31. [Google Scholar] [CrossRef]

Figure 1. Block diagram of the incremental learning framework for enhancing robotic responsiveness through integrated hybrid force-motion learning and dynamic response mechanisms.

Figure 2. Simulation and physical platforms.

Figure 3. Estimated and measured contact forces for momentum-based force observers.

Figure 4. Trajectory learning process for reactive skills when the number of augmentation nodes increases. (a) Performance of 5 enhancing nodes; (b) performance of 10 enhancing nodes; (c) performance of 15 enhancing nodes; (d) relationship between obstacle avoidance performance and the number of enhanced nodes.

Figure 5. Analysis of the generalization ability of the incremental learning framework. (a) Trajectory generation performance; (b) force generation performance.

Figure 6. Tracking control results for skill reproduction using the adaptive RBFNN controller. (a) joint angle of the manipulator, (b) motion trajectory of the manipulator, (c) tracking error of the proposed adaptive RBFNN controller, and (d) tracking error of the conventional RBFNN controller.

Figure 7. Simulation and experimental demonstrations. The 1st line: skill learning performance; the 2nd line: skill generalization performance; the 3rd line: physical experiment performance.

Figure 8. Physical experiments for skill learning and generalization in plug-in tasks. (a) Data collection for obstacle-free case; (b) data collection for obstacle avoidance case; (c) skill learning performance; (d) skill generalization performance.

Figure 9. Trajectories of skill learning and generalization generation in physical experiments with plug-in tasks.

Figure 10. Comparison of skill learning in cleaning tasks through physical experiments. (a) Conventional position control; (b) the proposed method.

Table 1. Quantitative comparisons among different methods.

Method	Similarity (m)	Learning Time (s)	Generation Time (s)
Ours	0.020	5.56	0.036
DMP [5]	0.043	22.45	0.045
GMM-DMP [12]	0.019	4.10	0.056
ProDMP [13]	0.031	1.72	0.033
CDPMM-DMP [16]	0.027	6.72	0.042

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Human-Inspired Force–Motion Imitation Learning with Dynamic Response for Adaptive Robotic Manipulation

Abstract

1. Introduction

2. Hybrid Force-Motion Skill Learning

2.1. Sensorless Momentum-Driven Force Observers

2.2. Skill Encoding with Dynamic Movement Primitives

3. Skill Generalization & Reproduction

3.1. BLS-Based Forcing Function Modulation

3.2. Adaptive Control with RBFNN

4. Overview of the Framework

5. Simulation and Physical Experiments

5.1. Simulation Setup

5.2. Verification of the Momentum-Driven Force Observer

5.3. Verification of the Incremental Learning Framework

5.4. Verification of the Adaptive RBFNN Controller

5.5. Implementation & Verification

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Data Availability Statement

Conflicts of Interest

References

Article Metrics

Citations

Article Access Statistics