1. Introduction
Human–robot collaboration (HRC) can be defined as the close interaction between a human user and a robot working together to accomplish a specific task. True collaboration, however, is only achieved when humans and robots can share the same workspace simultaneously and can perform tasks concurrently during production operation [
1]. One of the biggest challenges in collaborative robotics is how to achieve full collaboration between humans and robots. Although collaborative robots (cobots) have gained popularity in industrial settings, it is still uncommon to find them working in close interaction with human operators, because the lack of intuitive robot programming prevents non-expert operators from creating and altering robot programs quickly [
2]. To overcome these programming complexities and enable robots to learn new skills and adapt to tasks demonstrated by humans within collaborative settings, Learning from Demonstration (LfD) offers a promising paradigm.
LfD aims to equip robots with the ability to acquire new skills by observing movements performed by an expert to solve the same task. This concept is rooted in neuroscience, serving as a means to emulate the human learning process in robots [
3]. The LfD process encompasses five stages: Instructor Selection, Data Acquisition, Data Modeling, Task Execution, and Learning Refinement. Initially, the expert who will instruct the task resolution is selected. Typically, the human user who possesses the precise knowledge to solve the task assumes this role. Next, the Data Acquisition phase involves determining how task information will be recorded and how task-specific information will be mapped to the robot’s configuration space. The essence of LfD lies in the Data Modeling process, where a suite of algorithms is employed to learn task features for future reproducibility. Subsequently, during the Task Execution stage, the system’s performance is assessed. Many algorithms utilized in the data modeling phase are designed to be applicable in scenarios not explicitly taught during demonstrations, enabling the evaluation of learning effectiveness. Finally, if necessary, adjustments can be made in the Learning Refinement stage. This stage allows for the fine-tuning of learning parameters or the addition of new information to demonstrations to enhance task execution.
Of course, another major concern is safety. Safety is a crucial aspect of autonomous dynamical systems expected to operate in unknown and unstructured environments. Regarding system properties, safety ensures that undesirable states or events do not occur [
4]. Cobots have critical safety requirements because HRC typically occurs in unstructured and dynamic environments. The coexistence of humans and cobots in shared spaces creates hazardous situations that require appropriate mechanisms to prevent uncontrolled physical contact between them [
5]. Avoiding such contact can be viewed as a path-planning problem involving obstacle avoidance. While cobots are designed with inherent safety features, achieving seamless and efficient collaboration in dynamic environments shared with humans remains a significant hurdle. Many existing HRC systems rely on predefined paths or simplistic reactive strategies (e.g., stopping upon collision detection), which can limit fluidity and task efficiency, especially when unexpected obstacles such as moving human co-workers are present. Consequently, there is a pressing need for more advanced techniques that facilitate intuitive robot programming and enable robust, real-time adaptation to dynamic workspaces [
2]. Active obstacle avoidance in the LfD framework can enhance cobots’ ability to complete tasks safely and effectively, significantly expanding the LfD application field [
6].
This work proposes an LfD programming framework for human–robot collaborative scenarios, designed to enhance collaboration speed and safety by enabling adaptive robot movement in the presence of humans. The framework introduces an improved method of volumetric obstacle avoidance based on the Dynamic Movement Primitives (DMPs) formulation [
7,
8,
9], which is a popular LfD algorithm used to represent and generate complex movements, and superquadrics [
10], which are a family of geometric shapes used to model a variety of 3D shapes. This enables a robotic manipulator to perform tasks in dynamic, unconstrained environments. The major contribution of this work is the implementation of a complete robotic system for rapid robot programming and task generalization. This includes perception, hand–eye robot coordination, trajectory learning, task adaptation, and obstacle avoidance of objects modeled with superquadric shapes with varying sizes and poses. The robotic system is tested in a collaborative cell, considering human actions alongside robot movements. This contributes to improving the integration of collaborative robots in settings requiring human intervention and action.
The rest of this work is organized as follows:
Section 2 presents a literature review on obstacle avoidance in the context of the LfD framework. Then, in
Section 3, the basic formulations of DMPs and superquadrics are explained. In
Section 4, the proposed programming framework algorithms and implementation steps are thoroughly described.
Section 5 is dedicated to validating the experimental design, both theoretically and in simulation, before applying the framework to a real human–robot scenario.
Section 6 demonstrates the results obtained via experimental evaluation on a real HRC task. Finally,
Section 7 discusses the overall performance of the proposed programming framework, its advantages over similar methods, and future research directions and opportunities.
2. Literature Review
In the LfD framework, learning occurs during the Data Modeling stage, which provides a way to learn various goal-directed movement skills in high-dimensional continuous state-action spaces directly from human demonstration. As pointed out in [
11], there exists a great variety of algorithms focused on LfD. In the context of primitive movement learning, it is possible to classify the learning algorithms into two categories: Deterministic and Probabilistic methods.
The most representative solution of the deterministic methods in LfD is known as DMPs. The term was first introduced by Ijspeert et al. in [
7], and further improved in their later work [
9]. DMPs have their origin in biological systems, particularly in motor control, and refer to a framework for trajectory learning based on second-order ODEs of the spring-mass-damping type with a forcing term. Furthermore, in the context of the robot obstacle avoidance problem, customized obstacle-avoidance algorithms have been devised utilizing DMPs [
12].
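The spring-mass-damper structure described above can be sketched for a single degree of freedom as follows. This is a minimal illustration in the widely used Ijspeert-style form, not the formulation of this paper: the gains `alpha_z`, `beta_z`, `alpha_x`, the basis-function layout, and the Euler integration step are all assumed values chosen for the example.

```python
import numpy as np

def rollout_dmp(y0, goal, weights, centers, widths,
                alpha_z=25.0, beta_z=6.25, alpha_x=4.0,
                tau=1.0, dt=0.001, duration=1.5):
    """Integrate a one-dimensional discrete DMP with explicit Euler steps.

    All gains and the basis-function layout are illustrative assumptions.
    """
    y, z, x = float(y0), 0.0, 1.0      # position, scaled velocity, phase
    path = [y]
    for _ in range(int(round(duration / dt))):
        # Gaussian basis functions evaluated at the current phase x
        psi = np.exp(-widths * (x - centers) ** 2)
        # Forcing term: vanishes as x -> 0, so the goal stays an attractor
        forcing = (psi @ weights) / (psi.sum() + 1e-10) * x * (goal - y0)
        # Spring-damper transformation system plus forcing term
        z_dot = (alpha_z * (beta_z * (goal - y) - z) + forcing) / tau
        y_dot = z / tau
        x_dot = -alpha_x * x / tau     # canonical system
        z, y, x = z + z_dot * dt, y + y_dot * dt, x + x_dot * dt
        path.append(y)
    return np.array(path)

# Ten basis functions spread along the phase variable (assumed layout)
centers = np.exp(-4.0 * np.linspace(0.0, 1.0, 10))
widths = np.full(10, 50.0)
path = rollout_dmp(0.0, 1.0, np.zeros(10), centers, widths)
```

With zero weights the system reduces to a critically damped spring that converges to the goal; learned weights only reshape the transient, because the forcing term is scaled by the decaying phase variable.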
While DMP-based methods offer strong generalization capabilities from a few demonstrations, other learning paradigms such as Reinforcement Learning (RL) have also been extensively explored for robot path planning in dynamic environments. For instance, RL enables robots to learn complex behaviors and navigation strategies through interaction and trial-and-error, sometimes even without explicit global information [
13]. However, RL approaches often require significant training time and data, and ensuring safety during the learning process in physical HRC scenarios presents a considerable challenge. In contrast, the LfD framework, particularly when augmented with DMPs, aims for faster skill acquisition while providing a structured mathematical basis for incorporating modifications such as the proposed volumetric obstacle avoidance.
To ensure both obstacle avoidance and stability within the DMP framework, a perturbing term is incorporated. This term is typically constructed using potential functions [
14], or the scheme of steering angle presented in [
15], to facilitate efficient obstacle avoidance while preserving the overall stability of the system. The problem with these solutions is that the coupling term considers point-like obstacles, which are impractical in some situations.
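For orientation, the point-obstacle steering-angle term can be sketched as below. The sketch follows the commonly cited form in which the velocity is rotated by 90 degrees about the axis perpendicular to the heading and the obstacle direction, then scaled by the steering angle and an exponential decay. The function name and the gains `gamma` and `beta` are assumptions for illustration and are not taken from this paper.

```python
import numpy as np

def steering_coupling(position, velocity, obstacle,
                      gamma=1000.0, beta=20.0 / np.pi):
    """Steering-angle coupling term for a single point obstacle (sketch)."""
    position = np.asarray(position, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    to_obstacle = np.asarray(obstacle, dtype=float) - position
    speed = np.linalg.norm(velocity)
    dist = np.linalg.norm(to_obstacle)
    if speed < 1e-9 or dist < 1e-9:
        return np.zeros(3)
    # Steering angle between the heading and the direction to the obstacle
    cos_phi = np.clip(to_obstacle @ velocity / (dist * speed), -1.0, 1.0)
    phi = np.arccos(cos_phi)
    # Rotation axis perpendicular to both the heading and the obstacle direction
    axis = np.cross(to_obstacle, velocity)
    axis_norm = np.linalg.norm(axis)
    if axis_norm < 1e-9:
        return np.zeros(3)  # heading exactly toward or away: axis undefined
    axis /= axis_norm
    # Rotating v by pi/2 about a unit axis perpendicular to v equals axis x v
    rotated_velocity = np.cross(axis, velocity)
    return gamma * rotated_velocity * phi * np.exp(-beta * phi)

velocity = np.array([1.0, 0.0, 0.0])
c_front = steering_coupling(np.zeros(3), velocity, np.array([1.0, 0.1, 0.0]))
c_back = steering_coupling(np.zeros(3), velocity, np.array([-1.0, 0.1, 0.0]))
```

In this example the obstacle sits slightly to the left of the heading, so `c_front` pushes the robot to the right, while `c_back` is negligible because the obstacle is already behind. The obstacle enters only through its position, which is the point-like limitation discussed above.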
The literature details several attempts to extend the DMP formulation for volumetric obstacle avoidance. One approach, for instance, represents obstacles as point clouds, utilizing the nearest point to the robot in the steering-angle-based obstacle avoidance formulation [
16]. Other examples of modeling volumes as point clouds can be found in [
17], which addresses issues related to trajectory jittering, ineffective obstacle avoidance in specific scenarios, and better preservation of teaching intentions in dynamic environments for enhanced performance. Meanwhile, ref. [
18] employs a method of hyperparameter optimization based on RL to learn both the profiles of potentials and the shape parameters of a motion.
However, as presented by [
19], some of the drawbacks of modeling volumes with point clouds are the high computational time due to the density of the point cloud and non-smooth behaviors due to the constantly changing nearest point between the robot and the obstacle. To tackle this, ref. [
19] enhanced DMPs to support volumetric obstacle avoidance using superquadric functions in scenarios where obstacles have known and unknown shapes. The superquadrics’ implicit formulation was modified to propose a superquadric potential function based on superellipses that model the shape of the obstacle potential field. This solution was originally developed for static obstacles and was expanded later in [
20], showing promising results in dynamic obstacle avoidance in different robotic scenarios. The main drawback of this solution is that convergence and stability of the DMPs plus the potential field are not guaranteed. To avoid these problems, a solution combining steering angle avoidance and superquadric models was adopted in [
21]. The stability and convergence of the method are proven and obstacle avoidance is guaranteed for multiple volumetric obstacles; still, dynamic obstacle avoidance was not tested.
Thus, the proposed solution in this work will extend the work of Ginesi et al. [
20] and Liu et al. [
21] to:
Leverage LfD to reduce programming time and enable task generalization for complex robot tasks in constrained workspaces.
Include superquadrics in the general pose to model static and dynamic obstacles.
Tackle some of the major drawbacks of the steering angle scheme by addressing limitations like the “dead zone” issue and extending the approach to volumetric obstacles using a modified Mollifier function.
The framework was successfully tested and validated in both simulations and real-world scenarios using a collaborative robot, demonstrating effective trajectory learning, adaptation, and collision avoidance with static and dynamic obstacles (including human arms) in an HRC setting.
4. Implementation
Let us consider (
8),
where
represents a coupling term in the DMP dynamical system formulation. As described in
Section 3, the steering angle approach defines this coupling term as in (
12).
This term can be rewritten as
where
describes the influence of a force generated by an obstacle on the steering angle. This term
will be modeled after the general pose description of a superquadric model. Usually, a superquadric centered in a local coordinate system
can be defined by only five parameters (
and
). When describing the same superquadric in the general pose, it is necessary to include six additional parameters describing the pose vector relative to the center of the world coordinate system
[
22]. Thus, it is required to apply a frame coordinate transformation in the form of
where
with
, with
as the rotation components and
as the translation vector. Since it is needed to express this relation in superquadric-centered coordinates, these coordinates are computed as
with
where
.
By substituting (
18) and (
19) into (
13), the implicit function representation for superquadrics in the general pose is obtained.
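As a concrete illustration of evaluating a superquadric in the general pose, the sketch below uses the standard inside-outside function from the superquadrics literature. The symbol names (`a1`, `a2`, `a3` for the semi-axes, `e1`, `e2` for the shape exponents) and the choice of passing the pose as a rotation matrix plus a translation vector are assumptions for this example; the paper's own notation may differ.

```python
import numpy as np

def superquadric_inside_outside(point_world, axes, eps, rotation, translation):
    """Standard superquadric inside-outside function F in the general pose.

    F < 1 inside the shape, F == 1 on its surface, F > 1 outside.
    """
    a1, a2, a3 = axes
    e1, e2 = eps
    # World frame -> superquadric-centered frame: p_s = R^T (p_w - t)
    offset = np.asarray(point_world, dtype=float) - np.asarray(translation, dtype=float)
    x, y, z = np.asarray(rotation, dtype=float).T @ offset
    xy = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + abs(z / a3) ** (2.0 / e1)

# Ellipsoid with semi-axes 1, 2, 3, rotated 90 degrees about z and shifted
rot_z90 = np.array([[0.0, -1.0, 0.0],
                    [1.0,  0.0, 0.0],
                    [0.0,  0.0, 1.0]])
shift = np.array([1.0, 1.0, 0.0])
on_surface = superquadric_inside_outside([1.0, 2.0, 0.0], (1.0, 2.0, 3.0),
                                         (1.0, 1.0), rot_z90, shift)
```

The world point `[1, 2, 0]` maps to the tip of the first semi-axis in the local frame, so the function evaluates to 1 there. The same value of F is what a surface-proximity term would monitor as the robot approaches the obstacle.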
Then, considering the work of [
21], function
can be formulated as
where
represents the base of the logarithmic function. As a constant,
b determines the magnitude of
. Unfortunately, dealing with the correct selection of a constant
b adds complexity to the formulation. Instead, to reduce the number of extra parameters to tune, Equation (
22) is proposed:
The selection of a logarithmic function to describe (
22) is simple to understand. When the robot is close to the surface of the obstacle, the function
will decrease to 1, and
will increase to
, forcing the robot to move away from the obstacle, which is the desired behavior. While the use of the exponential function will preserve its behavior of being strictly increasing and convex, it will grow extremely fast for values of
. So the new DMP coupling term can be expressed as
This term can be further improved following the directions of [
16,
23]. Both works add an extra term considering the influence of the distance between the robot and the obstacle. If the robot is moving away from the obstacle, the effect of the coupling term should reduce to zero and vice versa; if the robot moves toward the obstacle, the effect of the coupling term should be maximized. Another drawback of the original steering angle formulation is called the “dead zone” in [
23]. As the authors describe it, the dead zone problem involves a “heading range towards the obstacle for which the system becomes incoherently less reactive.” To solve it, the authors proposed an additional term based on a Gaussian function. This can be visualized in
Figure 3.
Therefore, the coupling term can be enhanced by adding two exponential terms to (
15), one
to regulate the coupling term based on the robot-obstacle distance, and the second,
, to tackle the dead zone problem,
where
and
k are constants. However, this solution is meant to be applied to point obstacles where it is possible to ignore an obstacle when
. When working with volumetric obstacles, angles greater than
can produce collisions depending on the shape of the obstacle. Considering convex obstacles only, complete obstacle avoidance can be guaranteed only when
(because the robot will be moving in the opposite direction from the obstacle); it is therefore necessary to modify the Gaussian function to include the interval
. Nonetheless, the coupling term does not reduce to zero when
, as shown in
Figure 4a. For this reason, the proposed solution implements a slight modification to the formulation; the Gaussian function is replaced by a Mollifier function obtaining the final form of the proposed coupling term
where
.
With this modification, the properties of the Gaussian function are preserved and
when
as demonstrated in
Figure 4b.
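The difference between the two bump functions can be checked numerically. The sketch below compares a Gaussian with the standard compactly supported mollifier, rescaled to peak at 1. The parameter names `k` and `half_width`, and the choice of π as the support half-width in the example, are assumptions for illustration; the exact scaling used in the proposed coupling term is not reproduced here.

```python
import numpy as np

def gaussian_bump(angle, k=1.0):
    """Gaussian in the steering angle: positive everywhere, never exactly zero."""
    return np.exp(-k * np.atleast_1d(np.asarray(angle, dtype=float)) ** 2)

def mollifier_bump(angle, half_width):
    """Standard mollifier rescaled to peak at 1 with support (-half_width, half_width).

    Returns a NumPy array; values are exactly zero at and beyond the boundary.
    """
    u = np.atleast_1d(np.asarray(angle, dtype=float)) / half_width
    out = np.zeros_like(u)
    inside = np.abs(u) < 1.0
    out[inside] = np.exp(1.0 - 1.0 / (1.0 - u[inside] ** 2))
    return out

angles = np.array([0.0, np.pi / 2, np.pi])
gauss_values = gaussian_bump(angles)
moll_values = mollifier_bump(angles, half_width=np.pi)
```

Both functions are smooth, symmetric, and equal to 1 at zero angle, but only the mollifier vanishes exactly at the edge of its support, which is the property the text relies on.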
Algorithm 1 shows how obstacle avoidance is implemented in the proposed solution.
| Algorithm 1 Dynamic Movement Primitives with obstacle avoidance |
- Input: Cartesian trajectory vector, velocity vector (optional), acceleration vector (optional), stiffness matrix, canonical system constant, time rescaling factor, T duration of the movement, N number of basis functions, superquadric parameters and pose vector
- 1: […] (canonical system)
- 2: […]
- 3: […]
- 4: […] (canonical system)
- 5: where […], […], and […]
- 6: […]
- 7: […] (desired forcing term)
- 8: […]
- 9: Initialize […], with […] where […] and […]
- 10: Solve […] do
- 11: […] (locally weighted regression on […])
- 12: where […] is the forgetting factor and […] is the transposed row of […]
- 13: […]
- 14: end Solve
- 15: […]
- 16: […] (obstacle position components)
- 17: […]
- 18: […] (obstacle rotation matrix from obstacle orientation components)
- 19: for step […] do
- 20: […] (20) with superquadric parameters […] and […]
- 21: […]
- 22: […]
- 23: […] (distance to obstacle)
- 24: […]
- 25: […]
- 26: […]
- 27: […]
- 28: […]
- 29: end for
- Output: Learned Cartesian trajectory vector, velocity vector, acceleration vector
|
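The weight-learning part of Algorithm 1 can be illustrated with a batch form of locally weighted regression. Note the substitution: the algorithm above uses a recursive update with a forgetting factor, whereas this sketch solves each basis function's weighted least-squares problem in closed form. The function name, the basis-function layout, and the regularization constant are assumptions for the example.

```python
import numpy as np

def fit_forcing_weights(phase, f_target, centers, widths, scale):
    """Batch locally weighted regression for the DMP forcing-term weights.

    Each weight w_i minimizes sum_t psi_i(x_t) * (f_target_t - w_i * xi_t)^2,
    with xi_t = x_t * scale.
    """
    phase = np.asarray(phase, dtype=float)
    f_target = np.asarray(f_target, dtype=float)
    xi = phase * scale
    # Rows: time samples; columns: basis functions
    psi = np.exp(-widths[None, :] * (phase[:, None] - centers[None, :]) ** 2)
    numerator = psi.T @ (xi * f_target)
    denominator = psi.T @ (xi ** 2) + 1e-10  # guard against division by zero
    return numerator / denominator

# Synthetic check: a target that is exactly 3 * xi should give weights of 3
phase = np.exp(-4.0 * np.linspace(0.0, 1.0, 200))
centers = np.exp(-4.0 * np.linspace(0.0, 1.0, 10))
widths = np.full(10, 50.0)
weights = fit_forcing_weights(phase, 3.0 * phase, centers, widths, scale=1.0)
```

Because every basis function is fitted independently, the cost grows linearly with the number of basis functions, which is one reason this regression is popular for DMP learning.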
6. Results
The proposed solution was tested on a UR10e collaborative robot from Universal Robots (Odense, Denmark). The robot has six DoF and a reach of 1300 mm, can lift 12.5 kg, and reaches joint speeds of up to 3.142 rad/s. It is equipped with a wrist camera and a Robotiq 2F-85 two-finger gripper (Robotiq, Lévis, QC, Canada), as shown in
Figure 9.
Likewise, it is important to select the robot’s external sensors for collecting environmental data. As volumetric information about the environment is necessary, the ZED 2 stereo camera from StereoLabs (San Francisco, CA, USA) was selected. It captures at 60 fps in HD resolution or 30 fps in full-HD resolution and comes with built-in object and human tracking features through its SDK, which are useful for deployment.
The workstation used for the implementation has an AMD Ryzen 5 5600G processor with a clock frequency of up to 4.4 GHz, 32 GB of RAM, and an NVIDIA RTX 3060 GPU with 12 GB of dedicated GDDR6 memory.
The experiments will utilize the same basic experimental setup, centered on the UR10e manipulator as shown in
Figure 10. The task involves a pick-and-place operation where the robot will move a red cube of 2.5 cm from an arbitrary point in its workspace to a short pipe segment of 7.5 cm diameter, also arbitrarily positioned within the workspace. This operation can be described by a simple state machine with five actions/states: (1) visually locating the object to grasp and the target placement position; (2) moving to the object; (3) grasping it; (4) moving to the target position; and (5) releasing the object. The critical part for evaluating the algorithm’s effectiveness is the movement performed after grasping the object of interest until reaching the target position. Key performance metrics for the algorithm will include the computation time for each robot movement step and the position error relative to the calculated DMP trajectory in an obstacle-free environment.
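The five-state sequence described above can be written down as a small state machine. This is only an illustrative sketch: the `Stage` names and the handler interface (one callable per stage returning `True` on success) are hypothetical and do not describe the actual implementation.

```python
from enum import Enum, auto

class Stage(Enum):
    LOCATE = auto()          # (1) find the cube and the placement target
    MOVE_TO_OBJECT = auto()  # (2) approach the cube
    GRASP = auto()           # (3) close the gripper
    MOVE_TO_TARGET = auto()  # (4) segment where obstacle avoidance is evaluated
    RELEASE = auto()         # (5) open the gripper at the target

def run_pick_and_place(handlers):
    """Run the stages in order; stop at the first handler that returns False."""
    completed = []
    for stage in Stage:  # Enum members iterate in definition order
        if not handlers[stage]():
            break
        completed.append(stage)
    return completed

# Placeholder handlers that always succeed
handlers = {stage: (lambda: True) for stage in Stage}
done = run_pick_and_place(handlers)
```

Keeping the stages explicit makes it easy to attach the timing and deviation measurements to the `MOVE_TO_TARGET` stage only.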
For testing purposes, we prepared five distinct trajectories to simulate unconstrained point-to-point human hand movements. One trajectory was modeled using a minimum-jerk trajectory (MJT) to represent straight-line segments. Two trajectories were directly recorded from a human user’s right-hand movements via the ZED 2 camera. Each recorded trajectory was then smoothed by fitting a fifth-degree polynomial (e.g.,
Figure 11). The remaining two trajectories were derived by flattening the recorded and smoothed trajectories.
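The two trajectory-preparation steps can be sketched as follows, assuming that the MJT is the standard fifth-order minimum-jerk profile between two points and that smoothing is done per Cartesian axis with a least-squares polynomial fit. The function names and sample values are illustrative, not the ones used in the experiments.

```python
import numpy as np

def minimum_jerk(start, goal, n_samples):
    """Straight-line minimum-jerk trajectory sampled on normalized time [0, 1]."""
    s = np.linspace(0.0, 1.0, n_samples)
    blend = 10 * s**3 - 15 * s**4 + 6 * s**5  # zero velocity and acceleration at both ends
    return start + np.outer(blend, goal - start)

def smooth_quintic(t, samples):
    """Fit a fifth-degree polynomial to each Cartesian axis and resample it at t."""
    t = np.asarray(t, dtype=float)
    columns = [np.polyval(np.polyfit(t, samples[:, i], 5), t)
               for i in range(samples.shape[1])]
    return np.column_stack(columns)

start = np.array([0.0, 0.0, 0.0])
goal = np.array([0.4, 0.2, 0.1])
t = np.linspace(0.0, 1.0, 50)
mjt = minimum_jerk(start, goal, 50)
smoothed = smooth_quintic(t, mjt)
```

Since the minimum-jerk profile is itself a quintic, the fit reproduces it almost exactly; on a noisy recording the same fit acts as a low-pass smoother.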
The Cartesian vector describing these trajectories is defined in the global space frame, with its origin at the robot’s base link. Here,
and
denote the initial and goal positions for the pick-and-place operation. These positions are determined by localizing the object to be grasped (a small red cube) and the target position within the pipe segment, as illustrated in
Figure 12. Cases where obstacles obstruct either the initial or goal positions during execution are excluded from this study. Such scenarios consistently yield erroneous results and pose potential risks to both the user and the robot, thus falling outside the scope of this research. Each experiment utilizes one of these five distinct trajectory shapes.
For each experiment, the trajectory is learned by a DMP with as the number of basis functions, and elastic and damping constants and , respectively, where is the identity matrix and and . The hyperparameters for the coupling term are . To describe the volumetric shapes of the obstacles, the parameters of the superquadrics are in the interval of , and the axes will be determined by the dimensions of the bounding boxes around the detected objects. The work table of the robot is described by a superquadric shape with parameters, volume axes , and a volume center in .
The framework’s performance was validated across 77 real-world trials, achieving an overall success rate of 96.1% (
Table 4). Failures (3.9%) were categorized into two modes: direct collisions were detected in 1.3% of trials, and precautionary stops due to exceeding the acceleration safety limit occurred in 2.6% of scenarios. The latter represents cases where the algorithm computed a valid, aggressive maneuver that was intentionally halted to ensure the physical integrity of the operator and the robotic system.
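The reported percentages are consistent with whole-number trial counts, as the short check below shows. The individual counts (1 collision, 2 precautionary stops) are inferred here from the percentages and the total of 77 trials; they are not stated explicitly in the text.

```python
trials = 77
collisions = 1            # inferred from the reported 1.3%
precautionary_stops = 2   # inferred from the reported 2.6%
failures = collisions + precautionary_stops
successes = trials - failures

def percent(count, total=trials):
    """Share of trials as a percentage, rounded to one decimal place."""
    return round(100.0 * count / total, 1)

summary = {
    "success": percent(successes),
    "failure": percent(failures),
    "collision": percent(collisions),
    "precautionary_stop": percent(precautionary_stops),
}
```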
In successful runs, the mean peak deviation was 167.32 mm, and the mean 99th-percentile computation time was 2.939 ms per step (
Table 5), confirming real-time feasibility. The worst-case computation time observed across all trials was 5.51 ms, indicating robust and predictable performance under the tested conditions.
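One plausible way to obtain such figures is sketched below, assuming that a 99th percentile is computed per trial from the per-step computation times and then averaged across trials, and that the worst case is the maximum over all steps of all trials. This aggregation is an assumption; the input data are made-up numbers for illustration only.

```python
import numpy as np

def latency_summary(step_times_ms_per_trial):
    """Return (mean of per-trial 99th percentiles, worst-case step time) in ms."""
    p99_per_trial = [np.percentile(times, 99) for times in step_times_ms_per_trial]
    worst_case = max(np.max(times) for times in step_times_ms_per_trial)
    return float(np.mean(p99_per_trial)), float(worst_case)

# Two synthetic trials with a handful of step times each
mean_p99, worst_case = latency_summary([[1.0, 2.0, 3.0, 4.0, 5.0],
                                        [2.0, 2.0, 2.0]])
```

Reporting a high percentile alongside the worst case describes the tail of the distribution, which matters more for real-time control than the average step time.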
Performance metrics for unsuccessful trials, including a mean 99th-percentile latency of 3.022 ms, are detailed in
Table 6 for comparison. These results suggest that failures were likely due to physical constraints or trajectory issues rather than computational bottlenecks.
Furthermore, in experiments involving humans, only the right arm of the user will be visually tracked. This restriction is implemented to streamline computational processes and enable the left arm of the user to adjust the goal position dynamically during execution without compromising system performance. Additionally, the cobot maximum speed was limited to 250 mm/s, and precautionary stops due to exceeding the acceleration safety limit were kept in place, as described previously.
In these experiments, the human right arm is treated as an obstacle, and the corresponding obstacle avoidance trajectories are depicted in
Figure 13.
Figure 13a–c illustrate different scenarios where the right arm of the human is represented using various superquadric shapes. Different colors on the geometries indicate the initial and final positions of the right arm tracked by the ZED 2 camera: purple denotes the initial position and green denotes the final position.
As previously mentioned, the user has the capability to adjust the goal position of the trajectory during execution to demonstrate the system’s ability to converge to this new goal position. In each trajectory, a star marker indicates how the goal target position is modified during task execution, while the movements of the human right arm are represented by solid black lines.
Figure 13d illustrates the execution timesteps of the pick-and-place operation, demonstrating successful obstacle avoidance and online adaptation within the shared workspace in the presence of a human.
7. Discussion & Conclusions
This work presents a robot programming framework designed for robot LfD within human–robot collaborative environments. The framework has been developed, implemented, validated, and tested both in simulated environments and on a physical robotic station. It incorporates an enhanced formulation of DMPs that includes volumetric obstacle avoidance capabilities in Cartesian space.
The formulation implemented not only considers the geometric shape of obstacles but also their relative position, orientation, and velocity in relation to the robot. This advancement extends traditional obstacle avoidance techniques from point obstacles to volumetric obstacles, applicable to various shapes using superquadric functions as its fundamental approach. A key advantage of this method is its ability to maintain the stability and convergence properties of the original DMP formulation, even in scenarios involving dynamic obstacles.
The physical robotic station consists of an industrial collaborative manipulator equipped with a visual perception system for object and human detection and tracking. The volumetric obstacle avoidance was tested in a human–robot environment, where the robot must complete a pick-and-place operation while the user moves around in the shared workspace; the solution therefore gives insights into human–robot interaction (HRI) in HRC task scenarios.
The quantitative results presented (
Table 2 and
Table 5) underscore the effectiveness of our proposed formulation. In contrast to traditional DMP obstacle avoidance techniques that often simplify obstacles to points or use potential fields prone to local minima with multiple obstacles, our approach directly models and reacts to volumetric obstacles of varying shapes and poses using superquadrics in a general pose. The demonstrated success in avoiding dynamic obstacles, including human limbs in HRC scenarios (
Table 4), with a mean 99th-percentile computation time of 2.939 ms per adaptation step, highlights its practical applicability. Unlike point-cloud-based avoidance methods, which can be computationally demanding due to high data density and may result in non-smooth trajectories from rapidly changing nearest-point calculations, our superquadric modeling offers a compact and efficient representation that facilitates smoother and more predictable avoidance maneuvers.
The main drawback of the proposed method is the number of necessary hyperparameters for tuning, an activity that is critical depending on the volume shape used for modeling the obstacles. For example, the proposed coupling term is primarily influenced by the parameter , which significantly affects movement generation. Varying results in different performances in obstacle avoidance, and this behavior was qualitatively observed during the tuning of the algorithm for the experiments. Specifically, a smaller may cause movement failure in obstacle-avoidance scenarios, while a larger can lead to poor performance in tracking the desired trajectory. This can result in unrealistic acceleration values that are either unattainable physically or compromise the safety of the robot and its user. This behavior could be minimized with the correct selection of the hyperparameters, but still needs to be proven.
To address this, future research will explore options for automated hyperparameter optimization. Techniques such as RL, Bayesian optimization, or evolutionary algorithms could be investigated to systematically tune these parameters, potentially enhancing performance consistency across different tasks and reducing the manual setup effort.
Another limitation is the orientation of the robot’s end-effector, which has not been taken into account during the robot learning phase. In both path planning methods, the robot end-effector is considered a rigid point-like robot, and its volume and dimensions have been omitted during the experiments. Rather than maintaining a static end-effector configuration, the orientation should be adaptively adjusted in real-time, taking into consideration factors such as the demonstrated trajectory, the type and position of obstacles, and the object being grasped. To address this limitation, options for integrating the volumetric obstacle avoidance in quaternion DMP formulation to include the end-effector movements during the trajectory adaptation will be explored.
It is also important to acknowledge that while superquadrics provide a versatile and computationally efficient method for modeling many common object shapes, representing highly complex, non-convex, or finely articulated geometries, such as the full human body, with a single superquadric might have limitations in terms of geometric fidelity. Future investigations could explore two possible solutions. First, utilizing multiple superquadrics to form more intricate composite object representations could effectively model non-convex geometries, allowing the algorithm to treat these obstacles as chains of simpler superquadric shapes, thereby improving avoidance success rates. Second, hybrid approaches that combine superquadrics for intricate shapes with other techniques (e.g., point clouds or mesh segments) could be employed to capture local detail where high precision is crucial.
Future work will also focus on validating the system’s robustness against a wider array of dynamic obstacle behaviors. This includes testing with non-linear obstacle trajectories, variable and unpredictable speeds, and more complex articulated movements to further assess the algorithm’s performance limits and adaptability in highly dynamic and unstructured HRC environments.
While the presented real-time adaptation times are promising, a complete assessment of safety in HRC requires a detailed evaluation of emergency stop response time and adherence to minimum safety distances. Our measured 99th-percentile computation time of 2.939 ms per step for successful real-world trials is only one component of the overall response. The total emergency stop response time also incorporates factors such as sensor refresh rates, communication latency within the robot control architecture, and the physical braking capabilities of the UR10e manipulator. Future work aims to quantify this end-to-end response time through dedicated experiments, focusing on the total duration from the detection of an unsafe condition (for example, human intrusion into a predefined safety zone) to the robot achieving a complete halt or a verified safe state. Furthermore, while our system exhibited a low collision rate of 1.3% in real-world trials and utilizes “precautionary stops” when acceleration limits are exceeded, explicit definition and verification of the maintained minimum safety distance are essential. This will involve establishing quantifiable safety envelopes based on relevant safety standards (ISO 10218 [
26], ISO/TS 15066 [
1]) and employing precise tracking methods to confirm that the robot’s adapted trajectory consistently respects these distances, even under aggressive maneuvers. The influence of the
parameter in the coupling term on implicitly maintaining these safety distances will also be a subject of further investigation.
Regardless of these limitations and future improvements, the initial objective of developing an LfD programming framework on a collaborative robot to improve HRC in a manufacturing task involving reactive path planning and volumetric obstacle avoidance has been accomplished. The findings of this study will help lay the foundation for further enhancements in the LfD model and better interactions between humans and robots in unconstrained environments.