Article

Bio-Signal-Guided Robot Adaptive Stiffness Learning via Human-Teleoperated Demonstrations

by Wei Xia 1,2, Zhiwei Liao 3,*, Zongxin Lu 3,* and Ligang Yao 3

1 School of Mechanical Engineering, Shaanxi Polytechnic Institute, Xianyang 712000, China
2 Engineering Research Center of Composite Movable Robot, Universities of Shaanxi Province, Xianyang 712000, China
3 School of Mechanical Engineering and Automation, Fuzhou University, Fuzhou 350108, China
* Authors to whom correspondence should be addressed.
Biomimetics 2025, 10(6), 399; https://doi.org/10.3390/biomimetics10060399
Submission received: 19 May 2025 / Revised: 10 June 2025 / Accepted: 11 June 2025 / Published: 13 June 2025

Abstract

Robot learning from human demonstration pioneers an effective mapping paradigm for endowing robots with human-like operational capabilities. This paper proposes a bio-signal-guided robot adaptive stiffness learning framework grounded in the conclusion that muscle activation of the human arm is positively correlated with the endpoint stiffness. First, we propose a human-teleoperated demonstration platform enabling real-time modulation of robot end-effector stiffness by human tutors during operational tasks. Second, we develop a dual-stage probabilistic modeling architecture employing the Gaussian mixture model and Gaussian mixture regression to model the temporal–motion correlation and the motion–sEMG relationship, successively. Third, a real-world experiment was conducted to validate the effectiveness of the proposed skill transfer framework, demonstrating that the robot achieves online adaptation of Cartesian impedance characteristics in contact-rich tasks. This paper provides a simple and intuitive way to plan the Cartesian impedance parameters, transcending the classical method that requires complex human arm endpoint stiffness identification before human demonstration or compensation for the difference in human–robot operational effects after human demonstration.

1. Introduction

The impedance-controlled active compliance in collaborative robots is essential for ensuring operational safety and adaptability in disturbance-prone, contact-rich environments such as human–robot collaboration and dynamic assembly scenarios [1]. However, formulating an effective stiffness modulation strategy for the robot end-effector remains challenging due to the dynamic and inherently unpredictable contact conditions in disturbance-prone and contact-rich tasks [2,3,4].
As a promising and efficient paradigm for robots to master complex manipulation skills from humans, learning from human demonstration (LfHD) provides a neuromorphic mapping mechanism that decodes biological motor intelligence into robot control parameters [5]. With the development of neuroscience, which reveals how humans modulate their musculoskeletal system during manipulation [6], stiffness learning has been extensively studied in recent years to endow robots with human-like variable impedance characteristics [7]. Most generally, the approaches in this community can be divided into two categories according to whether an explicit reward mechanism is used, i.e., reinforcement learning and imitation learning.
Reinforcement learning for stiffness modulation can be further divided into model-based and model-free methods. The former leverages learned or analytical contact dynamics models to optimize stiffness profiles [8,9,10,11,12], and the latter learns stiffness modulation policies directly through trial-and-error interactions, bypassing explicit dynamics modeling [13,14,15,16,17,18]. For instance, Ref. [8] integrates iterative linear quadratic Gaussian control and guided policy search to address high-precision robotic assembly tasks, with the learned stiffness modulation policies adapting to environmental variations and mimicking human-like force-guided behaviors. A deep model predictive stiffness modulation method using a probabilistic ensemble neural network is proposed in [12] to learn generalized robot–environment dynamics, and real-time adaptation of stiffness is realized via model predictive control. Among model-free reinforcement learning methods, for instance, Ref. [13] proposes robot Cartesian impedance control as the action space for deep reinforcement learning and uses proximal policy optimization to optimize the stiffness and damping for safe, energy-efficient, and robust manipulation. A deep reinforcement learning method integrating Lyapunov stability analysis is investigated in [18] to optimize the stiffness and damping of the controller in contact-rich tasks for accurate force tracking. In summary, reinforcement learning provides a flexible and effective paradigm for robots to learn stiffness modulation and adapt to dynamic and uncertain contact-rich environments. However, it faces significant challenges, including the need for extensive training data, difficulties in designing robust reward functions, slow convergence due to sparse rewards, limited success rates in transferring learned policies from simulation to real-world applications, etc.
Compared with reinforcement learning, imitation learning exhibits great potential in simplifying the training process, avoiding reward function design and sim-to-real transfer. According to the demonstration patterns of imitation learning, it can be categorized into human-independent and human–robot teleoperation. In the former pattern, the human demonstrates the specific tasks independently, and the demonstrated data are extracted with multi-source signal acquisition devices [19,20,21,22,23,24]. In the latter pattern, the demonstration integrates the robot with the human tutor, the human-in-the-loop robot control is applied to demonstrate the specific tasks, and the demonstrated data are obtained from the robot’s built-in and external sensors [25,26,27,28,29,30,31].
For the human-teleoperated demonstration, the authors in [25] present a human–robot physical interaction interface leveraging an electronic skin integrated onto the robot’s surface. The system enables real-time modulation of end-effector stiffness through tactile interactions, i.e., when external perturbations (e.g., shaking or tapping) are detected via the electronic skin, the stiffness will be reduced, and conversely, sustained grip forces measured by the skin sensors will trigger stiffness enhancement. In [27], the human tutor online monitors the robot execution and modulates the robot end-effector stiffness through a potentiometer button in the handheld device to realize human-in-the-loop control. In [28], the human arm endpoint stiffness is transferred to the robot online based on a tele-impedance control framework, in which the human arm endpoint stiffness is estimated through a reduced-complexity representation model including the muscle activation, human arm configuration, and the musculoskeletal system.
To model the multimodal data from the human demonstration, several methods are proposed which can be divided into three categories: (i) dynamic system-based methods, such as Dynamic Movement Primitives (DMPs) [32,33] and Stable Estimator of Dynamical Systems (SEDS) [34], which encode trajectories through attractor dynamics but are limited to few-shot demonstrations due to their deterministic nature; (ii) probabilistic modeling methods, including GMM/R [35], Task-Parameterized GMMs [36], Probabilistic Movement Primitives (ProMPs) [37], Kernelized Movement Primitives (KMPs) [38], and Hidden (Semi-) Markov Models (H(S)MM) [39,40], which explicitly capture data distribution characteristics and handle uncertainty through Bayesian inference; (iii) data-driven behavioral cloning, typically implemented via deep neural networks (DNNs) [41,42,43] or reinforcement learning (RL) [44,45], which requires large-scale datasets for reliable policy generalization. Compared with dynamic systems that lack distributional awareness and behavioral cloning that demands intensive data collection, probabilistic models provide a principled compromise; they enable uncertainty-aware skill representation from sparse demonstrations while maintaining interpretability through explicit probability density functions.
Notwithstanding the significant advancements in robotic stiffness learning outlined in prior studies, critical limitations persist across two dimensions. (i) Current methods for estimating human arm endpoint stiffness in kinematically redundant upper limbs necessitate subject-specific parameter identification procedures before human demonstration. (ii) The inherent biomechanical differences between humans and robots degrade the operational effects and must be compensated for through optimization methods after human demonstration. These pre- and post-demonstration procedures hinder the wider adoption of skill learning due to their cumbersome stiffness modeling and human–robot difference compensation. To simplify the robot skill learning paradigm, this paper proposes an sEMG-guided robot adaptive stiffness learning framework. The framework comprises a human-teleoperated demonstration platform and a GMM/R-based temporal–motion–sEMG modeling method. A real-world experiment was conducted to validate the effectiveness of the proposed framework.
The rest of this article is organized as follows. Section 2 presents the methodology of the proposed framework, including the human-teleoperated demonstration, the GMM/R modeling algorithm, and the stiffness-learning-based robot Cartesian impedance controller. Section 3 introduces the experimental setup, protocols, and results. Section 4 and Section 5 present the discussion and conclusion, respectively.

2. Methodology

As shown in Figure 1, the overall architecture of the proposed sEMG-driven robot variable stiffness learning framework mainly includes three components: human-teleoperated demonstration, GMM/R modeling, and robot reproduction. The human-teleoperated demonstration integrates kinesthetic teaching with sEMG-based stiffness programming, where the demonstrator’s motion variations and muscle activations are synchronously captured and mapped to robot motion commands and impedance parameters via a teleoperation interface. On this basis, motion and stiffness are extracted from multiple demonstrations and modeled using GMM/R; thereby, the temporal–motion correlation and the motion–sEMG relationship are established successively. The robot Cartesian impedance control law is derived based on its dynamics and impedance model, in which the impedance parameters are determined with the learned stiffness. These components are explained in detail in the following subsections.

2.1. Human-Teleoperated Demonstration

The sEMG preprocessing and feature extraction are first conducted to map the raw sEMG signals to the corresponding muscle activation levels. On this basis, given the inherent discrepancies between human tutors and robots in kinematic parameters (initial endpoint pose and base coordinate system) and dynamic characteristics (stiffness range), systematic calibration of both motion trajectories and impedance profiles must be performed prior to the human demonstration.

2.1.1. sEMG Preprocessing and Feature Extraction

The raw sEMG signal is a voltage signal with both positive and negative excursions and is contaminated by many external interferences. In the sEMG preprocessing stage, this paper first smooths and filters the raw sEMG signals with a 50 Hz notch filter, a 30 Hz high-pass filter, and a 500 Hz low-pass filter, successively, and then rectifies and normalizes the filtered signals through full-wave rectification and normalization w.r.t. their maximum voluntary contraction (MVC) values, respectively.
In the feature extraction stage, the time-domain feature of the processed sEMG signals is obtained using the root mean square (RMS) method and the moving window technique.
$$a_i = \sqrt{\frac{1}{l_w}\sum_{j=1}^{l_w}\hat{v}_{i,j}^{2}}, \tag{1}$$
where $a_i$ is the RMS in the i-th window, serving as the muscle activation within that segment; $l_w$ is the window length; and $\hat{v}_{i,j}$ denotes the j-th normalized amplitude of the raw sEMG signals in the i-th window.
Assuming that the window shift distance and the length of the raw sEMG signals are $l_s$ and $l_p$, respectively, the length of the RMS feature can be calculated as follows:
$$l_a = \left\lfloor \frac{l_p - l_w}{l_s} \right\rfloor + 1, \tag{2}$$
where $l_p$ and $l_a$ represent the lengths of the raw sEMG signals and the RMS feature, respectively, and $\lfloor \ast \rfloor$ denotes the floor operator for $\ast$.
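As an illustrative sketch (not the authors’ implementation), the preprocessing pipeline and the windowed RMS feature of Equations (1) and (2) can be written in Python as follows; the Butterworth filter order, the notch quality factor, and the unit MVC value are our own assumptions:

```python
import numpy as np
from scipy import signal

def preprocess_semg(raw, fs=2000.0, mvc=1.0):
    """Notch, band-limit, rectify, and MVC-normalize one raw sEMG channel."""
    # 50 Hz notch filter to suppress mains interference (Q = 30 is assumed)
    b, a = signal.iirnotch(50.0, Q=30.0, fs=fs)
    x = signal.filtfilt(b, a, raw)
    # 30 Hz high-pass and 500 Hz low-pass (4th-order Butterworth, assumed order)
    b, a = signal.butter(4, 30.0, btype="highpass", fs=fs)
    x = signal.filtfilt(b, a, x)
    b, a = signal.butter(4, 500.0, btype="lowpass", fs=fs)
    x = signal.filtfilt(b, a, x)
    # full-wave rectification and normalization w.r.t. the MVC value
    return np.abs(x) / mvc

def rms_feature(v, l_w, l_s):
    """Moving-window RMS of Eq. (1); the output length follows Eq. (2)."""
    l_p = len(v)
    l_a = (l_p - l_w) // l_s + 1  # floor((l_p - l_w) / l_s) + 1
    return np.array([np.sqrt(np.mean(v[i * l_s : i * l_s + l_w] ** 2))
                     for i in range(l_a)])
```

For a 2 kHz signal, for example, a 200-sample window with a 100-sample shift yields one activation sample every 50 ms.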

2.1.2. Motion Calibration

Assume the initial endpoint poses of the human arm and the robot are defined as $H_{h0} \in SE(3)$ and $H_{r0} \in SE(3)$, respectively, and the base coordinate poses of the two are determined as $H_{\Sigma h} \in SE(3)$ and $H_{\Sigma r} \in SE(3)$, respectively.
$$H_{h0} = \begin{bmatrix} R_{h0} & p_{h0} \\ 0_{1\times3} & 1 \end{bmatrix}, \quad H_{r0} = \begin{bmatrix} R_{r0} & p_{r0} \\ 0_{1\times3} & 1 \end{bmatrix} \tag{3}$$
$$H_{\Sigma h} = \begin{bmatrix} R_{\Sigma h} & 0_{3\times1} \\ 0_{1\times3} & 1 \end{bmatrix}, \quad H_{\Sigma r} = \begin{bmatrix} R_{\Sigma r} & 0_{3\times1} \\ 0_{1\times3} & 1 \end{bmatrix} \tag{4}$$
where $R_{h0} \in SO(3)$ and $R_{r0} \in SO(3)$ denote the initial endpoint postures of the human arm and the robot, respectively; $p_{h0} \in \mathbb{R}^3$ and $p_{r0} \in \mathbb{R}^3$ denote the initial endpoint positions of the human arm and the robot, respectively; and $R_{\Sigma h} \in SO(3)$ and $R_{\Sigma r} \in SO(3)$ denote the base coordinate postures of the human arm and the robot, respectively.
The calibration of motion trajectories can be realized with calibration matrices as follows:
$$R_{\Sigma e} = R_{\Sigma r}\, R_{\Sigma h}^{-1} \tag{5}$$
$$p_e = p_{r0} - R_{\Sigma e}\, p_{h0} \tag{6}$$
$$R_e = R_{r0} \left( R_{\Sigma e}\, R_{h0} \right)^{-1} \tag{7}$$
where $R_{\Sigma e} \in SO(3)$, $p_e \in \mathbb{R}^3$, and $R_e \in SO(3)$ represent the calibration matrices of the base coordinate system, endpoint position, and endpoint posture, respectively.
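The calibration terms of Equations (5)–(7) reduce to a few matrix products. The sketch below (helper names are our own, with hypothetical frames built from elementary z-axis rotations) verifies that the calibrated mapping sends the human initial pose to the robot initial pose:

```python
import numpy as np

def rot_z(theta):
    """Elementary rotation about z, used here only to build example frames."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def calibration_matrices(R_h0, p_h0, R_r0, p_r0, R_Sh, R_Sr):
    """Eqs. (5)-(7): base, position, and posture calibration terms."""
    R_Se = R_Sr @ R_Sh.T                 # base-frame alignment, Eq. (5)
    p_e = p_r0 - R_Se @ p_h0             # endpoint position offset, Eq. (6)
    R_e = R_r0 @ (R_Se @ R_h0).T         # endpoint posture offset, Eq. (7)
    return R_Se, p_e, R_e

def map_human_to_robot(p_h, R_h, R_Se, p_e, R_e):
    """Apply the calibration so a demonstrated pose lands in the robot frame."""
    return R_Se @ p_h + p_e, R_e @ (R_Se @ R_h)
```

Because all rotations lie in SO(3), each matrix inverse is computed as a transpose.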

2.1.3. Stiffness Calibration

Assume the stiffness range of the robot is defined as $[k_{r,\min},\, k_{r,\max}]$. The calibration of impedance profiles can be realized as follows:
$$k_r(t) = k_{r,\min} + a(t)\left( k_{r,\max} - k_{r,\min} \right), \tag{8}$$
$$a(t) = \frac{1}{2}\left( \max(a_1(t) - a_1(0),\, 0) + \max(a_2(t) - a_2(0),\, 0) \right), \tag{9}$$
where $a(t) \in [0, 1]$ denotes the normalized relative muscle activation; $a_1 \in [0, 1]$ and $a_2 \in [0, 1]$ correspond to the sEMG-based muscle activation levels of the targeted antagonist muscles, where $\max(a_i(t) - a_i(0),\, 0)$ quantifies the corresponding increase of activation amplitude during the contact phase, with $a_i(0)$ denoting the pre-contact baseline activation level; and $k_r(t)$ represents a diagonal element of the stiffness eigenvalue matrix of the robot after calibration.
According to Equations (5) and (8), the allowable stiffness of the robot can be calculated as follows:
$$K_{r,V} = R_{\Sigma e}\, K_{h,V}\, R_{\Sigma e}^{T} \tag{10}$$
$$K_{r,D} = \mathrm{diag}(k_{r,1},\, k_{r,2},\, k_{r,3}) \tag{11}$$
$$K_r = K_{r,V}\, K_{r,D}\, K_{r,V}^{T} \tag{12}$$
where $K_r \in \mathbb{R}^{3\times3}$ denotes the allowable stiffness of the robot; $K_{h,V} \in \mathbb{R}^{3\times3}$ denotes the eigenvector matrix of the human arm endpoint stiffness; and $K_{r,D} \in \mathbb{R}^{3\times3}$ and $K_{r,V} \in \mathbb{R}^{3\times3}$ represent the eigenvalue and eigenvector matrices of $K_r$, respectively.
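A minimal sketch of the activation-to-stiffness calibration in Equations (8)–(12); the stiffness bounds match the experimental settings reported in Section 3.2, while the function names are our own:

```python
import numpy as np

def relative_activation(a1_t, a2_t, a1_0, a2_0):
    """Eq. (9): mean increase of two antagonist activations over baseline."""
    return 0.5 * (max(a1_t - a1_0, 0.0) + max(a2_t - a2_0, 0.0))

def calibrated_stiffness(a_t, k_min=300.0, k_max=2500.0):
    """Eq. (8): map normalized activation into the robot stiffness range [N/m]."""
    return k_min + a_t * (k_max - k_min)

def stiffness_matrix(k1, k2, k3, K_rV=np.eye(3)):
    """Eqs. (10)-(12): rebuild K_r from its eigenvalues and eigenvectors."""
    K_rD = np.diag([k1, k2, k3])
    return K_rV @ K_rD @ K_rV.T
```

With the identity eigenvector matrix assumed later in the paper, $K_r$ stays diagonal and the activation only scales its magnitude.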

2.2. Motion/Stiffness Modeling Through GMM/R

GMM/R is employed to probabilistically model the demonstration data (motion trajectories and sEMG-based muscle activation profiles) and establish the temporal–motion correlation and the motion–sEMG relationship, successively. The implementation involves offline training and online execution. The former formulates the GMM input–output structure with temporal–motion and motion–sEMG, successively, and optimizes model parameters through an iterative expectation-maximization (EM) algorithm. The latter uses GMR to compute conditional expectations of motion and stiffness given real-time temporal and motion inputs, respectively.

2.2.1. GMM Modeling

The input $\xi^I \in \mathbb{R}^{n \times l}$ and output $\xi^O \in \mathbb{R}^{m \times l}$ vectors of the GMM can be determined as follows:
$$\xi^I = \left[ \xi_1^I, \xi_2^I, \ldots, \xi_n^I \right]^T, \quad \xi^O = \left[ \xi_1^O, \xi_2^O, \ldots, \xi_m^O \right]^T, \tag{13}$$
where n, m, and l represent the dimensions of the input and output vectors as well as their lengths, respectively.
The probability density function of the GMM is a weighted sum of multiple Gaussian distribution functions [35], as shown in Equation (14):
$$P(\xi) = \sum_{i=1}^{N} \pi_i\, \mathcal{N}(\xi \mid \mu_i, \Sigma_i), \tag{14}$$
$$\mathcal{N}(\xi \mid \mu_i, \Sigma_i) = \left( (2\pi)^d \left| \Sigma_i \right| \right)^{-1/2} \exp\left( -\frac{1}{2} (\xi - \mu_i)^T \Sigma_i^{-1} (\xi - \mu_i) \right), \tag{15}$$
where d denotes the dimension of $\xi$ and $\pi_i$, $\mu_i$, and $\Sigma_i$ represent the prior probability, mean, and covariance matrix of the i-th of the N Gaussian distribution functions, which can be determined by the EM algorithm [46].

2.2.2. EM Optimization

The EM algorithm is an iterative optimization method that alternates between the Expectation step (E-step), where the expected value of the log-likelihood is computed given the current parameters, and the Maximization step (M-step), where the parameters are updated to maximize this expected log-likelihood [46].
E-step: the expected value of the log-likelihood $Q(\Phi, \Phi^{i-1})$ is computed with the (i−1)-th parameter set $\Phi^{i-1} = \{\pi^{i-1}, \mu^{i-1}, \Sigma^{i-1}\}$, as shown in Equation (16):
$$Q(\Phi, \Phi^{i-1}) = E\left[ \log P(\xi^I, \xi^O \mid \Phi) \,\middle|\, \xi^I, \Phi^{i-1} \right], \tag{16}$$
M-step: the parameters in $\Phi^i$ are updated to maximize the expected log-likelihood $Q(\Phi, \Phi^{i-1})$, as follows:
$$\Phi^i = \arg\max_{\Phi} Q(\Phi, \Phi^{i-1}). \tag{17}$$
The parameters $\Phi = \{\hat{\pi}, \hat{\mu}, \hat{\Sigma}\}$ are updated through the EM iteration:
$$\pi_i^{n+1} = \frac{\sum_{j=1}^{l} \gamma_{i,j}^{n}}{l}, \quad \mu_i^{n+1} = \frac{\sum_{j=1}^{l} \gamma_{i,j}^{n}\, \xi_j}{\sum_{j=1}^{l} \gamma_{i,j}^{n}}, \tag{18}$$
$$\Sigma_i^{n+1} = \frac{\sum_{j=1}^{l} \gamma_{i,j}^{n} \left( \xi_j - \mu_i^{n+1} \right) \left( \xi_j - \mu_i^{n+1} \right)^T}{\sum_{j=1}^{l} \gamma_{i,j}^{n}}, \tag{19}$$
where $\gamma_{i,j}^{n}$ denotes the responsibility of the i-th Gaussian component for the j-th data point $\xi_j$ at the n-th iteration and l is the number of data points.
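For illustration, one E-step/M-step pass implementing the updates in Equations (18) and (19) can be sketched as follows; the small diagonal regularization added to each covariance is our own numerical safeguard, not part of the formulation above:

```python
import numpy as np

def em_step(xi, priors, means, covs):
    """One E-step/M-step pass for a GMM over the l columns of xi (shape d x l)."""
    d, l = xi.shape
    N = len(priors)
    gamma = np.zeros((N, l))
    # E-step: responsibilities gamma_{i,j} of component i for data point j
    for i in range(N):
        diff = xi - means[i][:, None]
        det = np.linalg.det(2.0 * np.pi * covs[i])
        quad = np.sum(diff * np.linalg.solve(covs[i], diff), axis=0)
        gamma[i] = priors[i] * np.exp(-0.5 * quad) / np.sqrt(det)
    gamma /= gamma.sum(axis=0, keepdims=True)
    # M-step: priors, means, covariances, Eqs. (18)-(19)
    Nk = gamma.sum(axis=1)
    priors = Nk / l
    means = [(gamma[i] * xi).sum(axis=1) / Nk[i] for i in range(N)]
    covs = []
    for i in range(N):
        diff = xi - means[i][:, None]
        covs.append((gamma[i] * diff) @ diff.T / Nk[i] + 1e-9 * np.eye(d))
    return priors, means, covs
```

Iterating this step to convergence recovers the maximum-likelihood mixture parameters used by GMR below.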

2.2.3. GMR Generation

On this basis, GMR computes the output parameters online as follows [35]:
$$\hat{\mu}_i = \left[ \hat{\mu}_i^I,\, \hat{\mu}_i^O \right]^T, \quad \hat{\Sigma}_i = \begin{bmatrix} \Sigma_i^I & \Sigma_i^{IO} \\ \Sigma_i^{OI} & \Sigma_i^O \end{bmatrix}, \tag{20}$$
$$P(\xi^O \mid \xi^I) = \sum_{i=1}^{N} h_i(\xi^I)\, \mathcal{N}\left( \hat{\mu}_i^O(\xi^I),\, \hat{\Sigma}_i^O \right), \tag{21}$$
$$\hat{\mu}_i^O(\xi^I) = \mu_i^O + \Sigma_i^{OI} \left( \Sigma_i^I \right)^{-1} \left( \xi^I - \mu_i^I \right), \tag{22}$$
$$\hat{\Sigma}_i^O = \Sigma_i^O - \Sigma_i^{OI} \left( \Sigma_i^I \right)^{-1} \Sigma_i^{IO}, \tag{23}$$
$$h_i(\xi^I) = P\left( \mu_i, \Sigma_i \mid \xi^I \right) = \frac{\pi_i\, \mathcal{N}(\xi^I \mid \mu_i^I, \Sigma_i^I)}{\sum_{j=1}^{N} \pi_j\, \mathcal{N}(\xi^I \mid \mu_j^I, \Sigma_j^I)}. \tag{24}$$
The reconstructed data are deduced in Equations (25) and (26):
$$\hat{\mu}^O(\xi^I) = \sum_{i=1}^{N} h_i(\xi^I)\, \hat{\mu}_i^O(\xi^I). \tag{25}$$
$$\hat{\Sigma}^O(\xi^I) = \sum_{i=1}^{N} h_i(\xi^I) \left( \hat{\Sigma}_i^O + \hat{\mu}_i^O(\xi^I)\, \hat{\mu}_i^O(\xi^I)^T \right) - \hat{\mu}^O(\xi^I)\, \hat{\mu}^O(\xi^I)^T. \tag{26}$$
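The GMR conditioning of Equations (20)–(25) can be sketched directly from the mixture parameters. The function below (our own naming) returns the conditional expectation that serves as the online motion or stiffness command:

```python
import numpy as np

def gmr(x_in, priors, means, covs, in_idx, out_idx):
    """Conditional mean of a GMM given input x_in, Eqs. (20)-(25)."""
    N = len(priors)
    h = np.zeros(N)
    mu_cond = []
    for i in range(N):
        mu_I, mu_O = means[i][in_idx], means[i][out_idx]
        S_II = covs[i][np.ix_(in_idx, in_idx)]
        S_OI = covs[i][np.ix_(out_idx, in_idx)]
        d = x_in - mu_I
        # mixture weight h_i(x), Eq. (24) (normalized after the loop)
        det = np.linalg.det(2.0 * np.pi * S_II)
        h[i] = priors[i] * np.exp(-0.5 * d @ np.linalg.solve(S_II, d)) / np.sqrt(det)
        # per-component conditional mean, Eq. (22)
        mu_cond.append(mu_O + S_OI @ np.linalg.solve(S_II, d))
    h /= h.sum()
    # mixture conditional expectation, Eq. (25)
    return sum(h[i] * mu_cond[i] for i in range(N))
```

For a single Gaussian with correlation 0.5 between input and output, conditioning on an input one standard deviation from the mean shifts the output mean by 0.5, matching Equation (22).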

2.3. Robot Cartesian Impedance Control Law

The robot Cartesian impedance control law is derived with the robot dynamics and the classical impedance model in the following subsections.

2.3.1. Robot Dynamics

The robot dynamics in Cartesian space can be written as follows [47]:
$$\Lambda(x)\ddot{x} + u(x, \dot{x})\dot{x} + F_g = F_\tau + F_{ext}, \tag{27}$$
where $x \in \mathbb{R}^6$, $\dot{x} \in \mathbb{R}^6$, and $\ddot{x} \in \mathbb{R}^6$ denote the position, velocity, and acceleration of the robot in Cartesian space, respectively; $\Lambda(x) \in \mathbb{R}^{6\times6}$, $u(x, \dot{x})\dot{x} \in \mathbb{R}^6$, and $F_g \in \mathbb{R}^6$ represent the inertia, Coriolis and centrifugal, and gravitational terms in Cartesian space, respectively; $F_\tau \in \mathbb{R}^6$ is the input wrench of the controller; and $F_{ext} \in \mathbb{R}^6$ is the external wrench exerted on the robot.

2.3.2. Classical Impedance Model

The classical impedance model proposed in [47] establishes a dynamic relationship to exert corrective forces on the robot when its instantaneous position diverges from the predefined equilibrium configuration.
$$\Lambda_d \ddot{\tilde{x}} + \left( D_d + u(x, \dot{x}) \right) \dot{\tilde{x}} + K_d \tilde{x} = F_{ext}, \tag{28}$$
Here, $\tilde{x} = x - x_d$, $\dot{\tilde{x}} = \dot{x} - \dot{x}_d$, and $\ddot{\tilde{x}} = \ddot{x} - \ddot{x}_d$ represent the tracking errors of position, velocity, and acceleration, respectively, and $\Lambda_d \in \mathbb{R}^{6\times6}$, $D_d \in \mathbb{R}^{6\times6}$, and $K_d \in \mathbb{R}^{6\times6}$ represent the desired inertia, damping, and stiffness of the impedance model, respectively.

2.3.3. Cartesian Impedance Control Law

Substituting Equation (28) into Equation (27), the input wrench of the controller can be calculated as follows:
$$F_\tau = \Lambda_d \ddot{x}_d + u(x, \dot{x})\dot{x}_d + \left( \Lambda(x) \Lambda_d^{-1} - I \right) F_{ext} + F_g - \Lambda(x) \Lambda_d^{-1} \left( D_d \dot{\tilde{x}} + K_d \tilde{x} \right). \tag{29}$$
By defining Λ d = Λ ( x ) in Equation (29), the dependency on external wrench F e x t measurements can be eliminated, thereby simplifying the formulation as follows:
$$F_\tau = \Lambda(x) \ddot{x}_d + u(x, \dot{x})\dot{x}_d + F_g - D_d \dot{\tilde{x}} - K_d \tilde{x}. \tag{30}$$
Based on Equations (8)–(12) and (25), the stiffness variation in Equation (30) is governed by two factors: (i) the sEMG-based stiffness mapping derived from human-teleoperated demonstrations and (ii) the online GMR-generated outputs during robot execution. For simplicity, this paper assumes that (i) in Equation (12), K r , V = I ; (ii) in Equation (11), k r , 1 = k r , 2 = k r , 3 ; (iii) the translational stiffness is composed of a constant term and a variable term; and (iv) the rotational stiffness is set as a constant diagonal matrix. Therefore, the desired stiffness in Equation (30) can be determined as follows:
$$K_d = \alpha K_c + \beta K_v, \tag{31}$$
$$K_c = \begin{bmatrix} K_{c,t} & 0_{3\times3} \\ 0_{3\times3} & K_{c,r} \end{bmatrix}, \quad K_v = \begin{bmatrix} K_{v,t} & 0_{3\times3} \\ 0_{3\times3} & K_{c,r} \end{bmatrix}, \tag{32}$$
$$\beta = 1 - \alpha, \tag{33}$$
where $K_{c,t} \in \mathbb{R}^{3\times3}$ and $K_{v,t} \in \mathbb{R}^{3\times3}$, respectively, represent the constant and variable translational stiffness terms, and the subscripts t and r denote the translational and rotational stiffness terms, respectively. The weighting factors $\alpha, \beta \in [0, 1]$ govern the stiffness transition between the free motion ($\alpha = 1$, $\beta = 0$) and contact ($\alpha = 0$, $\beta = 1$) stages. In the free motion stage, the robot maintains a relatively low constant stiffness for better compliance; in the contact stage, the stiffness fully follows the learned variable stiffness profiles.
On this basis, the desired damping matrix can be calculated as follows [47]:
$$D_d = 2\, \mathrm{diag}\left( \zeta_i \sqrt{\lambda_{K,i}} \right), \tag{34}$$
where $\lambda_{K,i}$ denotes the i-th diagonal element of $K_d$ and $\zeta_i$ is a damping factor.
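A compact sketch of Equations (30), (31), and (34); the helper names are ours, and the example assumes the stiffness eigenvector matrix is the identity, as in the simplifications above:

```python
import numpy as np

def desired_stiffness(alpha, K_c, K_v):
    """Eqs. (31)-(33): blend constant and variable stiffness, beta = 1 - alpha."""
    return alpha * K_c + (1.0 - alpha) * K_v

def desired_damping(K_d, zeta=1.0):
    """Eq. (34): damping from the square root of the stiffness diagonal."""
    return 2.0 * np.diag(zeta * np.sqrt(np.diag(K_d)))

def control_wrench(Lambda, u, F_g, xdd_d, xd_d, x_err, xd_err, D_d, K_d):
    """Eq. (30): Cartesian impedance law with Lambda_d = Lambda(x)."""
    return Lambda @ xdd_d + u @ xd_d + F_g - D_d @ xd_err - K_d @ x_err
```

At runtime, `K_v` would be refreshed every control cycle from the GMR stiffness output while `K_c` stays fixed.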

3. Experiment

In this section, the human-teleoperated demonstration platform was built, and a real-world experiment campaign was conducted to verify the effectiveness of the proposed skill transfer framework.

3.1. Experimental Setup and Protocols

3.1.1. Human-Teleoperated Demonstration Platform

During the human-teleoperated demonstration phase, the human tutor wore inertial measurement units (IMUs) and sEMG sensors and teleoperated the robot to execute the demonstration. In Figure 2, the experimental setup of the developed human-teleoperated demonstration platform includes a wearable motion-capture system (Noitom Perception Neuron Studio, Noitom, Beijing, China, 80 Hz); a wireless sEMG acquisition device (Noraxon Ultium-8, Noraxon, Scottsdale, USA, 2 kHz); a Noraxon official analog output module; an NI chassis (National Instruments cDAQ-9174, National Instruments, Austin, USA) equipped with an NI capture card (National Instruments 9215, National Instruments, Austin, USA) for analog-to-digital (A/D) conversion; three computers; and a router for data capture and transmission.
In Figure 2, the motion capture system has 17 IMUs. A Windows-based software platform was used to capture spatial positions, orientations, and joint angles of the human tutor’s full-body kinematics during task execution. The pose of the wrist joint (“RightHand”) was extracted from BVH (Biovision Hierarchy) data broadcast within the software, utilizing UDP (User Datagram Protocol) for low-latency transmission.
To enable cross-platform interoperability, a server application was designed on Visual Studio 2019. The server incorporates a customized data structure that encapsulates the position (x, y, z axes) and quaternion-based orientation parameters, facilitating the transmission of pose information from the Windows environment to the Ubuntu system. Concurrently, a client module was implemented on the Ubuntu system, establishing Ethernet-based communication with the server. The client parses incoming data streams and interfaces with the Robot Operating System (ROS) through a dedicated publisher node, which packages the acquired pose data into standardized ROS messages for downstream utilization.
Similarly, for sEMG signals, since direct digital signal transmission to Ubuntu is constrained by hardware incompatibility, the raw digital signals were first converted to analog voltage outputs via the official analog output module, followed by re-digitization using NI hardware (cDAQ-9174 chassis with NI-9215 analog input module) compatible with Ubuntu. USB connectivity enables seamless analog-to-digital signal transmission to the Ubuntu system, where a dedicated ROS publisher node serializes the sEMG data into ROS messages.
Finally, a centralized ROS subscriber node was defined, and the motion and sEMG signals were transmitted to the remote robot (Franka Emika Panda, 1 kHz) in real time.

3.1.2. Robot Control Scheme

During the human-independent robot reproduction phase, the human-like motion and stiffness profiles were modeled and transferred to the robot, which performed the task with the learned skills automatically. In Figure 3, four ROS nodes, named finite state machine, motion-stiffness, temporal–motion, and Cartesian impedance controller, were defined to control the robot. The finite state machine node serves as a centralized dispatcher to coordinate task sequencing across operational phases, implementing real-time execution monitoring through continuous feedback of robot end-effector status and forces. The motion-stiffness and temporal–motion nodes are encapsulated through GMM/R and work as motion and stiffness planners to generate real-time reference trajectories and stiffness profiles for the robot controller according to the task execution and the robot end-effector status. The Cartesian impedance controller node synthesizes reference trajectories, stiffness profiles, and task execution status feedback from other nodes, and calculates the desired torque for the robot.
On this basis, this paper conducted a cutting task, including free motion and cutting stages, to verify the effectiveness of the proposed skill transfer framework. The purpose is to learn the stiffness modulation strategy during the cutting stage, which completes the cut while avoiding excessive cutting forces. The probability model was built based on GMM/R modeling to capture the statistical distributions in the normalized sequences.
First, the boundary between the free motion and cutting stages depended on the robot feedback status, including the robot end-effector position and force, and suitable thresholds were determined empirically.
Second, temporal normalization was performed as in Equation (35) to eliminate the differences in time scales across multiple demonstrations.
$$s(t) = \frac{t_{end} - t}{t_{end}} \tag{35}$$
Here, $s(t) \in [0, 1]$ denotes the normalized time variable, such that $s(t) = 1$ at the beginning and $s(t) = 0$ at the end.
Third, to learn the stiffness modulation strategy during the cutting stage, instead of directly modeling the motion trajectories and sEMG-based muscle activation profiles, we set the input of the GMM as the normalized distance between the current and target positions:
$$\Delta p = \frac{\left\| p - p_g \right\|}{\left\| p_0 - p_g \right\|} \tag{36}$$
where $p$, $p_0$, and $p_g$ denote the current, initial, and target positions in the cutting stage and $\| p_0 - p_g \|$ denotes the thickness of the object; thus, $\Delta p = 1$ at the beginning and $\Delta p = 0$ at the end, normalizing the cutting process.
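The two normalizations of Equations (35) and (36) can be sketched as follows (function names are ours):

```python
import numpy as np

def normalized_time(t, t_end):
    """Eq. (35): s(t) decays linearly from 1 at t = 0 to 0 at t = t_end."""
    return (t_end - t) / t_end

def normalized_distance(p, p0, pg):
    """Eq. (36): remaining cutting distance, 1 at first contact, 0 at completion."""
    return np.linalg.norm(p - pg) / np.linalg.norm(p0 - pg)
```

Both quantities run from 1 down to 0, so demonstrations of different durations and object thicknesses share one input axis for the GMM.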

3.2. Parameters Settings

The stiffness range of the robot, $k_{r,\min}$ and $k_{r,\max}$ in Equation (8), was set as 300 N/m and 2500 N/m, respectively. The number of Gaussian functions N for the temporal–motion and motion–sEMG relationships was set as 5 and 10, respectively. The constant stiffness terms $K_{c,t}$ and $K_{c,r}$ in Equation (32) were set as 300 N/m and 30 Nm/rad, respectively. The damping factor $\zeta_i$ in Equation (34) was set as 1. In addition, to divide the free motion and contact stages, we set the force threshold as an absolute value of 5 N ($|F_{ext}| \geq 5$ N) and the distance threshold between the current and target positions as 0.05 m ($\| p - p_g \| \leq 0.05$ m). As for the constant impedance controller, $\alpha$ and $\beta$ in Equation (31) were set as 1 and 0, respectively, throughout the whole operation. As a result, the desired stiffness and damping in Equations (31) and (34) were constant and equal to $K_d = \mathrm{diag}(300, 300, 300, 30, 30, 30)$ and $D_d = 2\, \mathrm{diag}(\sqrt{300}, \sqrt{300}, \sqrt{300}, \sqrt{30}, \sqrt{30}, \sqrt{30})$, respectively.

3.3. Experimental Results

In Figure 4, we exhibited the spatiotemporal motion patterns learned through GMM/R on the x-, y-, and z-axis, in which the shaded regions delineate the probabilistic variance bounds of the Gaussian components, and the solid curves represent the GMR outputs parameterized by the normalized temporal variable.
In Figure 5, the normalized motion–sEMG relationship during the cutting stage was built through GMM/R. Similar to Figure 4, the shaded ellipses represent the probabilistic variance bounds of the Gaussian components, and the solid curve denotes the GMR output based on the normalized distance between the current and target positions. As evidenced by the multiple demonstrations, the sEMG-based muscle activation level a exhibited a progressive augmentation during the cutting process, with stabilization within the 0.5–0.7 range throughout the penetration phase. When the cutting task reached 90% completion, i.e., the normalized remaining cutting distance was less than 0.1 ($\Delta p \leq 0.1$), a demonstrated an abrupt decline to baseline levels, indicating rapid neuromuscular activity reduction near task termination.
Based on the proposed human-teleoperated demonstration platform and the modeled temporal–motion and motion–sEMG relationships through GMM/R, the robot can quickly master the stiffness modulation strategy in the cutting task after very few demonstrations. As shown in Figure 6, we compared the cutting results with the constant ( K c , t = 300 ) and the proposed variable stiffness.
As a result, Figure 7 illustrates the robot end-effector force feedback, position tracking error, and stiffness variations. Thanks to the stiffness adaptation during the cutting stage, the stiffness quickly increased from 300 N/m to 2200 N/m during the initial penetration phase; the external force feedback profile reveals that real-time stiffness modulation enabled a sharp escalation of the cutting force from −5 N to −14 N. Concurrently, the position tracking error declined from 4 cm to 0.18 cm, and the stiffness subsequently returned to a low level of 660 N/m, thereby maintaining stable contact force regulation at −2 N during terminal positioning. The stiffness–force–position variations experimentally validate the robot’s capability to automatically adapt to contact situations through context-aware impedance shaping based on the learned temporal–motion and motion–sEMG models. Compared to the constant case, which failed to cut the object with its constant stiffness setting, the proposed method exhibits the same flexibility in the free motion stage and stronger environmental adaptability in the cutting stage.

4. Discussion

The main contribution of this article is to propose a human-teleoperated demonstration platform that enables human tutors to modulate robots’ end-effector stiffness online during the operational demonstration phase. A dual-stage probabilistic modeling architecture based on GMM/R is developed to model the temporal–motion correlation and the motion–sEMG relationship from demonstrations. The human-like variable impedance control is realized in a real-world validation experiment based on the proposed demonstration framework. It should be emphasized that the proposed framework can be generalized to various human tutors thanks to the normalized sEMG processing method. Therefore, sEMG-based muscle activation can accurately reflect human tutors’ preferences for increasing or decreasing robot end-effector stiffness under their subject-specific MVC values.
To simplify the model, this paper sets the eigenvector matrix of the stiffness to the identity matrix, ensuring that the stiffness ellipsoid’s axes remain aligned with the operational space coordinate system. sEMG-based muscle activation levels govern the magnitude of stiffness according to the experimentally validated positive relationship between them. Therefore, compared to the classical method, considering both the magnitude and posture of the human arm endpoint stiffness ellipsoid, the complex identification of human arm endpoint stiffness before human demonstration is avoided. In addition, the robot participates in the demonstration phase through the developed human-teleoperated demonstration platform, allowing the demonstration data used for GMM/R to accurately reflect the actual situation of the robot’s operation. As a result, the model trained on these data can be directly applied to robot control without requiring difference compensation after human–robot transfer.
It should be noted that while a constant high-stiffness setting can technically accomplish the above cutting task, maintaining such a rigid configuration throughout the operation may introduce significant risks in disturbance-prone and contact-rich environments. For instance, in scenarios involving unexpected collisions or uncertainties in tool–workpiece interaction (e.g., surgical robotics or precision component assembly), excessive contact forces resulting from high stiffness can lead to tissue damage, part deformation, or even mechanical failure. This rigidity–performance paradox underscores the importance of variable stiffness control, which dynamically modulates stiffness characteristics in response to real-time interaction requirements. To this end, this paper takes a step towards teaching robots to modulate their end-effector stiffness by learning from human demonstrations, enabling a good balance between task execution accuracy and environmental adaptability. Such stiffness-tunable mechanisms not only enhance operational safety through compliant collision responses but also improve cutting quality during dynamic contact processes.
Notably, this paper normalizes the demonstration time and the cutting distance in temporal–motion and motion–sEMG modeling to eliminate the differences in time scales and object thicknesses. While this paper specifically demonstrated the model’s application in cucumber cutting, thereby limiting its immediate applicability to this particular task, the proposed framework possesses significant potential for generalization. By incorporating additional training demonstrations encompassing a broader range of objects, the approach can be effectively extended to diverse cutting tasks beyond the current scope. To enhance the model’s adaptability for diverse cutting tasks, several methodological advancements warrant consideration: (i) incorporation of multimodal demonstration data, encompassing visual, haptic, and force feedback modalities; (ii) development of a hierarchical processing framework combining preprocessing techniques with probabilistic modeling, potentially involving initial object classification through clustering algorithms followed by task-specific modeling; (iii) implementation of advanced computational architectures, such as deep learning methods, to capture the complex dynamics inherent in various cutting situations.
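Both the temporal–motion and motion–sEMG stages rely on conditioning a fitted Gaussian mixture model on its input dimensions, i.e., Gaussian mixture regression. The sketch below is a generic GMR retrieval step; parameter shapes and names are assumptions, and in practice the GMM parameters would be fitted with EM on the (normalized) demonstration data.

```python
import numpy as np

def gmr(x, priors, means, covs, i_in, i_out):
    """Condition a GMM on input dims i_in (e.g. normalized time or motion)
    to predict the expected output dims i_out (e.g. position or activation).

    priors: (K,), means: (K, D), covs: (K, D, D); x: query input vector.
    """
    K = len(priors)
    h, y = np.zeros(K), []
    for k in range(K):
        mu_i, mu_o = means[k][i_in], means[k][i_out]
        s_ii = covs[k][np.ix_(i_in, i_in)]
        s_oi = covs[k][np.ix_(i_out, i_in)]
        inv = np.linalg.inv(s_ii)
        d = x - mu_i
        # responsibility of component k for the query input
        h[k] = priors[k] * np.exp(-0.5 * d @ inv @ d) / np.sqrt(
            np.linalg.det(2.0 * np.pi * s_ii))
        # conditional mean of component k
        y.append(mu_o + s_oi @ inv @ d)
    h /= h.sum()
    return sum(hk * yk for hk, yk in zip(h, y))
```

Chaining two such models (time → motion, then motion → sEMG activation) reproduces the dual-stage retrieval used at execution time.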
Furthermore, the force feedback depicted in Figure 7 is computationally derived from the robot’s built-in joint torque measurements and its geometric Jacobian matrix rather than measured by a dedicated force/torque sensor. The target cutting force in this paper is realized indirectly by modulating the robot end-effector stiffness based on the classical impedance controller. Future work will enhance the proposed framework by integrating additional sensory modalities, with particular emphasis on force/torque feedback, to address the challenges of contact-rich tasks that demand precise force regulation and haptic perception.
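The wrench estimation from joint torques can be sketched as follows. Names are hypothetical: J is the geometric Jacobian at the current configuration, and tau_ext denotes the externally induced joint torques, assumed already separated from the dynamics model by the robot's built-in estimator.

```python
import numpy as np

def estimate_external_wrench(J, tau_ext):
    # Least-squares external wrench F_ext satisfying tau_ext = J^T @ F_ext,
    # recovered via the Moore-Penrose pseudoinverse of J^T. Near singular
    # configurations the estimate degrades, one reason direct force/torque
    # sensing remains attractive for precise force regulation.
    return np.linalg.pinv(J.T) @ tau_ext
```

This is the standard statics relation between joint torques and end-effector wrenches; its accuracy is bounded by the torque-sensing noise and the dynamics-model quality, motivating the sensing extensions discussed above.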

5. Conclusions

In this work, we proposed a bio-signal-guided robot adaptive stiffness learning framework comprising a human-teleoperated demonstration platform and a GMM/R-based temporal–motion–sEMG modeling method. The demonstration platform provides a simple and intuitive way to synchronize human arm movements with selected muscle sEMG signals, enabling the transmission of real-time motion and stiffness commands to the robot impedance controller. The GMM/R-based modeling method builds the temporal–motion and motion–sEMG relationships, respectively. A real-world experiment verified the effectiveness of the proposed framework: the results show that the robot quickly masters the demonstrated motion and stiffness variations through the above two phases. The proposed framework thus provides an efficient way to plan the robot end-effector stiffness modulation strategy in contact-rich tasks.
A limitation of the proposed framework is that the human-like operational effect cannot be guaranteed when the operational state falls significantly outside the distribution of the demonstration data, and significantly different objects or tasks require additional demonstrations. Future work will integrate multimodal sensory data and investigate advanced modeling methods for the complex dynamics inherent in various contact situations.

Author Contributions

Conceptualization, W.X. and Z.L. (Zhiwei Liao); methodology, W.X.; software, W.X.; validation, W.X. and Z.L. (Zhiwei Liao); data curation, W.X. and Z.L. (Zhiwei Liao); writing—original draft preparation, W.X.; writing—review and editing, Z.L. (Zhiwei Liao), Z.L. (Zongxin Lu), and L.Y.; funding acquisition, W.X.; supervision, Z.L. (Zongxin Lu) and L.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Department of Education Scientific Research Plan of Shaanxi Province under grant number 24JR015.

Institutional Review Board Statement

The study utilized non-invasive wearable devices to collect human motion data and surface electromyography signals, which did not involve medical interventions, biological sampling, or personal privacy risks. Ethical approval for this research was reviewed and exempted by the Ethics Committee of Shaanxi Polytechnic Institute under Application No. SXPI-SMEA-24017 on 19 May 2025.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Overall architecture of the sEMG-driven robot learning variable stiffness framework.
Figure 2. Experimental setup and protocols of the human-teleoperated demonstration platform.
Figure 3. Robot control diagram.
Figure 4. Temporal–motion relationship with GMM/R.
Figure 5. Normalized motion–sEMG relationship with GMM/R.
Figure 6. Cutting results comparison with constant and variable stiffness.
Figure 7. Robot cutting results of the force, position tracking error, and stiffness.