Article

Low-Code Mixed Reality Programming Framework for Collaborative Robots: From Operator Intent to Executable Trajectories

1 School of Information Engineering, Shenyang University of Chemical Technology, Shenyang 110142, China
2 National Key Laboratory of Robotics and Intelligent Systems, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
3 National Robot Quality Inspection and Testing Center (Liaoning), Shenyang 110016, China
* Author to whom correspondence should be addressed.
Robotics 2026, 15(1), 9; https://doi.org/10.3390/robotics15010009
Submission received: 15 October 2025 / Revised: 27 November 2025 / Accepted: 25 December 2025 / Published: 29 December 2025
(This article belongs to the Section AI in Robotics)

Abstract

Efficient and intuitive programming strategies are essential for enabling robots to adapt to small-batch, high-mix production scenarios. Mixed reality (MR) and programming by demonstration (PbD) have shown great potential to lower the programming barrier and enhance human–robot interaction by leveraging natural human guidance. However, traditional offline programming methods, while capable of generating industrial-grade trajectories, remain time-consuming, costly to debug, and heavily dependent on expert knowledge. Conversely, existing MR-based PbD approaches primarily focus on improving intuitiveness but often suffer from low trajectory quality due to hand jitter and the lack of refinement mechanisms. To address these limitations, this paper introduces a coarse-to-fine human–robot collaborative programming paradigm. In this paradigm, the operator’s role is elevated from a low-level “trajectory drawer” to a high-level “task guider”. By leveraging sparse key points as guidance, the paradigm decouples high-level human task intent from machine-level trajectory planning, enabling their effective integration. The feasibility of the proposed system is validated through two industrial case studies and comparative quantitative experiments against conventional programming methods. The results demonstrate that the coarse-to-fine paradigm significantly improves programming efficiency and usability while reducing operator cognitive load. Crucially, it achieves this without compromising the final output, automatically generating smooth, high-fidelity trajectories from simple user inputs. This work provides an effective pathway toward reconciling programming intuitiveness with final trajectory quality.

1. Introduction

Industrial robots are a cornerstone of modern manufacturing, widely used in automated tasks such as welding, surface treatment, machining, and laser cutting [1]. The motion of the robot’s end-effector in these applications is often abstracted as a path-following or contour-tracking problem, where precise trajectory execution is paramount for product quality and process consistency [2]. However, programming these precise paths remains a significant bottleneck: it is time-consuming, overly reliant on experts, and ill-suited for the frequent changes required in high-mix, low-volume production [3]. Consequently, developing more efficient and intuitive programming strategies is crucial to improve the adaptability of robots to evolving task demands [4].
To address these challenges, mixed reality (MR) has emerged as a promising paradigm to democratize robot programming [5]. By overlaying virtual information onto the real workspace, MR interfaces allow operators to define robot tasks more naturally. The most common approach involves Programming by Demonstration (PbD), where an operator “draws” the desired path in 3D space using gestures or a handheld device [6]. While this “what-you-draw-is-what-you-get” method is highly intuitive, it introduces a fundamental trade-off: it sacrifices machine precision for human ease-of-use. This trade-off is a critical flaw in existing MR programming systems, especially for contour-following tasks [7]. The inherent jitter and inconsistency of human motion make it exceptionally difficult for a non-expert operator to manually trace a path with the smoothness and accuracy required for a perfect weld or a uniform adhesive bead [8]. This approach burdens the operator with achieving machine-level precision, undermining the goal of rapid programming. It fails to leverage the system’s computational power to interpret user intent, instead merely copying flawed manual input. An ideal approach should minimize the physical and cognitive load on the operator while ensuring that the robot understands their guidance accurately [9]. This reveals a distinct gap in the literature for a method that combines sparse, intuitive guidance with automated, high-precision trajectory generation.
To resolve this conflict between intuitiveness and precision, this paper proposes a novel “coarse-to-fine” low-code programming methodology. Our approach separates high-level task specification from low-level quality assurance. Using an MR Head-Mounted Display (HMD), a non-expert operator provides “coarse-grained” guidance by defining only a few critical waypoints or directional cues with natural gestures. The system then takes this sparse input and performs the “fine-grained” planning: it computationally generates a dense, smooth, and kinematically feasible trajectory that precisely matches the specified contour. This synergistic architecture, which integrates an intuitive MR front-end with a robust Robot Operating System (ROS) back-end, allows users to create complex and high-quality paths without the burden of manual tracing, as illustrated in Figure 1. Our primary contributions are summarized as follows.
  • We propose a novel “coarse-to-fine” interaction paradigm for MR robot programming that effectively resolves the prevailing trade-off between intuitiveness and precision.
  • We design and implement a low-code system that realizes this paradigm, synergizing an intuitive MR front-end with a powerful ROS back-end for real-time, high-quality path generation from sparse user inputs.
  • We validate our system’s effectiveness and efficiency through two industrial case studies and a comparative experiment, demonstrating its superiority over traditional programming methods for non-expert users.
The remainder of this paper is organized as follows. Section 2 reviews the related work in traditional and MR-based robot programming. Section 3 describes the materials, methods, and the overall workflow of the proposed approach. Section 4 explains the system implementation details and the underlying algorithms. Section 5 presents the experimental setup, case studies, and evaluation results. Finally, Section 6 concludes the paper.

2. Related Work

2.1. Traditional Robot Programming

The established online and offline programming paradigms, while foundational, exhibit persistent limitations in efficiency, precision, and adaptability—particularly for enabling non-experts to program complex contour-following tasks in dynamic settings, as highlighted in recent reviews on industrial robot programming methods [10,11,12].
Online methods face inherent efficiency bottlenecks. Comparative studies quantify this challenge: Mabong et al. [13] demonstrated that even relatively intuitive Demonstrative-Kinesthetic Teaching (DKT) significantly outperformed conventional teach pendant operation in speed and learnability for path creation, revealing the substantial time cost and expertise dependency ingrained in manual online approaches. This inefficiency becomes critical in high-mix, low-volume production requiring rapid task switching [14]. Precision is another critical weakness, especially for continuous contours. Gonzalez et al. [15] identified the fundamental dependence on sparse waypoints and Point-to-Point (PTP) motion logic as the primary cause of path deviation in tasks such as welding. The resulting interpolated arcs often fail to achieve the smooth, accurate paths demanded by quality-sensitive operations (e.g., precision gluing or seam welding).
Offline Programming (OLP) circumvents production downtime but grapples with the significant complexity of the simulation-to-reality gap. Research efforts highlight the substantial resources required: Sarivan et al. [16] developed an automated system that generates trajectories directly from CAD annotations, reducing manual programming effort by 80% and underscoring how labor-intensive the baseline process is. Slavković et al. [17] further illustrated the intricate measures needed for fidelity, employing complex CNC-format data pipelines to ensure physical execution matched simulated plans. The very existence of such sophisticated solutions demonstrates that reliable OLP is inherently model-dependent and far from a simple ‘plan-and-execute’ workflow, making it brittle in the face of environmental dynamics [18]. Hybrid approaches seeking to merge online intuition with offline planning (e.g., refining CAD paths via expert demonstrations [19]) acknowledge the need for human input. However, they often introduce new complexities in real-time data alignment and system integration, hindering their practicality and accessibility for non-expert users.
Consequently, despite incremental improvements, traditional and hybrid programming methods fundamentally lack the intuitiveness, efficiency, and adaptability required to empower non-experts in rapidly deploying robots for complex contour tasks within dynamic manufacturing environments. This critical gap motivates the exploration of MR as a transformative paradigm for human–robot interaction in programming.

2.2. Mixed Reality-Based Robot Programming

To overcome the limitations of traditional methods, researchers have increasingly turned to MR as a transformative paradigm for Human–Robot Interaction (HRI). MR-based systems merge the physical and digital worlds, allowing operators to visualize and define robot tasks directly within the actual workspace [20]. This approach promises to make programming significantly more intuitive and efficient by overlaying virtual instructions and robot paths onto the real environment, thereby lowering the barrier for non-expert users [6,21,22]. The primary goal is to enable operators to program complex tasks, such as welding or assembly, without relying on complicated teaching pendants or complex offline software [21].
A dominant approach within this domain is PbD, where the operator “shows” the robot the desired trajectory [23,24]. Early and common implementations of this paradigm involve an operator using MR interfaces, such as a HoloLens, to move a virtual gripper or draw a path with their hand [25]. The system captures this demonstrated 3D motion and translates it into executable robot code [7,20]. This concept has evolved to include dedicated handheld “teaching devices” or pointers, which an operator physically moves through space to define waypoints and paths, while the MR display provides real-time feedback on the simulated robot’s movement [26]. More advanced end-to-end frameworks now even integrate machine learning to immediately train movement primitives from these in situ demonstrations, further streamlining the workflow from demonstration to execution [24]. Researchers are also exploring novel interaction modalities, such as haptic gloves that allow users to “paint” trajectories without handheld controllers and even eye-gaze for part selection, to make the interaction feel as natural as possible [27,28].
However, this direct demonstration paradigm—a “what-you-draw-is-what-you-get” approach—introduces a fundamental trade-off between intuitiveness and precision, especially for the complex contour-following tasks that are the focus of this paper. While drawing a path with a handheld device or a gesture is highly intuitive, it places the entire burden of fine-grained precision onto the non-expert operator [23]. Human hand movements are naturally prone to jitter and inconsistency, making it extremely difficult to manually trace the smooth, accurate, and steady paths required for quality-sensitive operations like welding, gluing, or sealing [21]. The cognitive and physical load of meticulously defining every single point along a complex curve can also become overwhelming, undermining the goal of rapid and effortless programming [29]. This direct, 1-to-1 mapping of human motion to robot motion fails to leverage the computational power of the robot to assist the user in achieving high-quality results.
Recognizing this challenge, some researchers are exploring alternative strategies that reduce the burden of manual path creation. For instance, one system automatically generates several candidate paths and uses MR to let the user simply select the most suitable one, minimizing the need for manual waypoint input [30]. While promising, this approach may lack the flexibility required to define custom, arbitrary contours that are not easily generated by automated planners. This reveals a critical gap in the literature: the need for a hybrid approach that combines the intuitive, sparse directional input from a human operator with the automated precision of computational trajectory generation. Such a method would empower non-experts to define complex, custom paths without bearing the full burden of fine-grained manual tracing, directly motivating the “coarse-to-fine” paradigm proposed in this work.

3. Materials and Methods

To facilitate intuitive and efficient robot programming, this study proposes and develops an interactive robot control system founded on MR and hand gesture recognition. The system enables operators to guide a physical robot through complex tasks via direct, immersive interaction.

3.1. MR Low-Code Programming Overview

Our methodology is founded on the principles of low-code programming, a paradigm aimed at minimizing manual coding by using intuitive, high-level interactions to generate complex machine instructions. MR serves as a powerful enabling technology for this paradigm in robotics, as it provides a natural interface to translate human spatial actions and intent into structured digital commands. Among various MR devices, HMDs are particularly advantageous for industrial applications because they offer a hands-free, immersive experience, allowing an operator to interact with virtual content overlaid onto the physical workspace seamlessly.
Following these principles, we designed the system whose composition is illustrated in Figure 2. The primary modules consist of the physical robot, its registered virtual twin for simulation, interactive Graphical User Interfaces (GUIs), and the operator providing spatial guidance while wearing an MR HMD. These modules work in synergy: the operator provides coarse-grained guidance via the MR interface, and the system’s back-end computational planner processes this sparse input to generate a dense, smooth trajectory. This trajectory is then visualized using the virtual robot for confirmation before being executed by the physical robot. This workflow effectively decouples the intuitive human interaction from the complex task of precise path generation.

3.2. System Architecture

To realize the “coarse-to-fine” paradigm, we propose a decoupled, two-part system architecture. As illustrated in Figure 3, the architecture is divided into an Operator Intent Module (MR Front-End) and an Execution Core Module (ROS Back-End). This separation of concerns is key to resolving the inherent conflict between intuitive human interaction and machine-level precision. The Operator Intent Module is responsible for capturing the user’s high-level, coarse-grained guidance, while the Execution Core undertakes the computationally intensive task of transforming this guidance into a dense, smooth, and kinematically feasible robot trajectory.

3.2.1. Operator Intent Module

The Operator Intent Module serves as the high-level interface between the operator and the robot. Its core responsibility is to translate the operator’s abstract task intent into structured, machine-readable data. Within our architecture, this translation is achieved through an intuitive, MR-based interaction paradigm. The operator, freed from concerns about underlying coordinate systems or robotic kinematic constraints, specifies key task nodes (i.e., waypoints) directly in the 3D physical workspace using simple hand gestures. This interaction method simplifies the task definition process, converting it from precise numerical programming into a coarse and intuitive act of spatial outlining. The final output of this module is a sparse, ordered sequence of waypoints. This sequence represents the overall geometric profile and logical order of the task, serving as a digital embodiment of the user’s coarse-grained intent. It is passed as the sole input to the Execution Core, thus achieving a complete decoupling of operational intent from execution details.

3.2.2. Execution Core

The Execution Core Module functions as the system’s computational and planning center. Its primary responsibility is to transform the sparse waypoint sequence received from the front-end into a high-density, high-smoothness trajectory that the robot can execute with precision. Architecturally, this module encapsulates the complexity of low-level motion planning. It shields the user from a series of computationally intensive tasks, including path interpolation, trajectory densification, and kinematic solving and validation. This design ensures that regardless of how simple or coarse the front-end interaction is, the final generated trajectory strictly adheres to the requirements of industrial robots for accuracy, smoothness, and kinematic feasibility. The input to this module is the structured command generated by the Operator Intent Module, and its final output is a stream of executable, dynamically consistent robot trajectory data. To ensure the precision and safety of the final execution, the planned trajectory is sent back to the MR front-end to drive a virtual robot model through a final simulation drill. Only after the operator confirms that the MR simulation is error-free is the trajectory dispatched to the physical robot for execution. This process forms a rigorous “virtual verification, physical execution” closed loop.
It is important to note that while the Execution Core ensures kinematic feasibility, the safety of the physical interaction relies on the hardware capabilities of the collaborative robot. The robot utilized in this system complies with the ISO/TS 15066 standard [31], featuring motor-current-based force estimation. This mechanism serves as a final safety layer: if the operator fails to identify an obstacle during the MR verification phase, the robot’s controller detects the abnormal torque upon contact and triggers a protective stop, thereby mitigating the risk of damage caused by operator errors.

3.3. Hardware and Software Setup

The system is implemented as a distributed architecture comprising three core hardware components: an MR HMD, a central host PC, and a collaborative robot arm. Each component serves a distinct functional role within the system’s information flow. The Microsoft HoloLens 2 acts as the primary human-intent capture and visualization terminal, responsible for interpreting the operator’s gestures and displaying the virtual robot simulation. The host PC functions as the central computation and control hub, where high-level user commands are processed and transformed into low-level robot instructions. Finally, the GCR7 collaborative robot arm serves as the physical execution endpoint, performing the tasks defined by the operator.
The dynamic interaction between these hardware components is enabled by the communication architecture illustrated in Figure 4. This architecture facilitates a bidirectional information flow, with green arrows indicating the forward command stream and orange arrows indicating the backward feedback stream. The command stream originates from the MR HMD, transmitting the operator’s intent as a sequence of sparse waypoints to the host PC. Conversely, the feedback stream flows from the host PC back to the MR HMD, carrying the planned trajectory and real-time joint states. This feedback is crucial for driving the virtual robot simulation and enabling the operator to verify the intended motion before physical execution, thus forming a closed loop for interaction and validation.
Within the host PC, the high-level commands undergo a multi-stage algorithmic transformation to ensure executable and precise motion. Upon receiving the sparse waypoints, the Adaptive Curvature-based Spline Resampling (ACRSR) module first performs path interpolation and densification, refining the coarse user input into a smooth, dense path. Subsequently, this dense path is processed by the Inverse Kinematics (IK) Solver, which translates the task-space coordinates into an executable joint-space trajectory. This final trajectory is then used in the verification loop for simulation on the MR HMD. Following the operator’s confirmation, the commands are dispatched to the robot controller, which drives the physical robot to complete the task. This layered process effectively transforms abstract human guidance into precise, machine-level execution.
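To make the hand-off between the MR front-end and the host PC concrete, the following minimal Python sketch serializes sparse waypoints over a socket. The transport matches the TCP/IP + JSON design described in Section 5.1, but the field names and the newline framing used here are illustrative assumptions, not the system’s actual message schema.

```python
import json
import socket

def send_waypoints(host, port, waypoints):
    """Send sparse MR waypoints (metres, HoloLens frame) to the ROS host.

    The message layout ("type", "frame", "points") is a hypothetical
    schema; the paper only specifies TCP/IP with JSON payloads.
    """
    msg = json.dumps({
        "type": "sparse_waypoints",
        "frame": "holo_world",
        "points": [{"x": x, "y": y, "z": z} for x, y, z in waypoints],
    }).encode("utf-8")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(msg + b"\n")  # newline-delimited JSON framing
```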

3.4. Core Workflow

The operational workflow of the system is a direct embodiment of our “coarse-to-fine” design paradigm. It is divided into two distinct, sequential stages: a human-driven, coarse-grained path definition stage, and a machine-automated, fine-grained trajectory generation and verification stage. This process seamlessly transforms an operator’s high-level spatial intent into a precise, executable robot motion.

3.4.1. Coarse-Grained Path Definition

To define a task, the operator employs natural hand gestures to perform a series of intuitive “point-and-place” actions directly within the workspace. When the operator points to a target location, a yellow virtual sphere appears as a preview feedback cue. This interaction is implemented by leveraging the Mixed Reality Toolkit (MRTK) in conjunction with the HoloLens 2’s Spatial Mapping capability. The device continuously reconstructs the physical environment into a dynamic Spatial Mesh. When the operator performs a pointing gesture, the system projects a virtual ray from the user’s index finger and calculates the exact intersection point between this ray and the Spatial Mesh collider. This mechanism ensures that the virtual waypoints automatically “snap” to the surface of the physical workpiece, effectively resolving the depth ambiguity inherent in 3D interactions. Crucially, this visual feedback is rendered locally on the HoloLens device at approximately 60 FPS, ensuring that the interaction is immediate and immune to network latency. Through this interactive feedback loop, the operator intuitively constructs a sparse, ordered sequence of waypoints that capture the desired path and critical nodes of the task, as illustrated in Figure 5.
The system completely decouples the temporal aspect of human interaction from robot execution. Unlike direct teaching methods, the speed at which the operator selects waypoints has no influence on the final motion. The backend planner applies time-parameterization to the generated path based on pre-set process parameters (e.g., a constant welding velocity), ensuring smooth and consistent execution regardless of the operator’s input speed. It is worth noting that the number of required waypoints is not strictly dependent on the physical length of the trajectory, but rather on its geometric complexity. For instance, a linear segment requires only two points (start and end) regardless of its length, whereas curved segments require additional waypoints placed at inflection points or peaks of curvature. This sparse input strategy minimizes operator workload while allowing the backend ACRSR algorithm to handle the necessary densification.
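To illustrate this temporal decoupling, the short NumPy sketch below assigns timestamps to a dense path for a constant tool-center-point speed (e.g., 0.05 m/s, matching the 50 mm/s welding case in Section 5.3.1). The function and its interface are our own illustration, not the system’s planner; the point is that timing derives solely from path geometry and the process parameter, never from the operator’s input speed.

```python
import numpy as np

def time_parameterize(path, speed=0.05):
    """Timestamps for a dense Cartesian path at a constant TCP speed (m/s)."""
    path = np.asarray(path)
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)  # segment lengths (m)
    t = np.concatenate([[0.0], np.cumsum(seg) / speed])  # arrival time per point
    return t
```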

3.4.2. Fine-Grained Trajectory Generation and Verification

Once the operator finalizes the coarse path, the fine-grained trajectory generation process is automatically initiated within the Execution Core. The sparse sequence of waypoints is transmitted from the MR HMD to the ROS back-end, where it first undergoes a coordinate registration step to align all points with the robot’s base coordinate system. This registration ensures an accurate translation of the operator’s spatial intent into the robot’s operational workspace.
Subsequently, the registered waypoints are processed by the proposed Adaptive Curvature-Based Spline Resampling (ACRSR) algorithm. This algorithm transforms the sparse, operator-defined path into a smooth, dense, and high-fidelity trajectory suitable for precise motion planning. The resulting path is then forwarded to the IK solver, which computes the corresponding executable joint trajectory.
To guarantee safety and correctness, physical execution is deliberately deferred. The generated joint trajectory is first transmitted back to the MR HMD, where it drives the Virtual Robot model to simulate the planned motion within the MR environment. This simulation starts from the robot’s current physical pose, enabling the operator to visualize the entire motion sequence in its actual workspace while the real robot remains stationary (Figure 6). This step provides a crucial opportunity to assess the planned motion for feasibility, collision avoidance, and overall accuracy. Only after the operator explicitly confirms the motion does the system dispatch the verified trajectory to the GCR7 controller for precise and safe physical execution.

4. System Implementation and Algorithms

4.1. HoloLens-to-Robot Coordinate System Registration

The foundational prerequisite for our “coarse-to-fine” paradigm is the establishment of a robust spatial transformation between the user’s perceived space and the robot’s operational space. This spatial mapping enables the translation of waypoints, intuitively specified by a user in MR, into a coordinate system intelligible to the robot for execution. For this task, we employ a one-time, point-based calibration procedure chosen for its high accuracy and operational simplicity, avoiding reliance on visual markers or specific environmental features.
The core of this procedure is to align two primary coordinate systems, as illustrated in Figure 7: the MR world frame, $F_{Holo}$, which in our system is the left-handed Y-up world coordinate system established by the HoloLens 2; and the robot base frame, $F_{Robot}$, a right-handed Z-up system compliant with robotics standards (ROS REP 103) [32].
Our objective is thus to determine the static homogeneous transformation matrix $T_{Holo}^{Robot}$ that maps points from $F_{Holo}$ to $F_{Robot}$. A significant technical challenge, however, arises from the opposing “handedness” of these frames, which introduces a non-rigid reflection and precludes the direct application of standard rigid alignment algorithms such as Kabsch [33]. To resolve this, we implement a two-stage method that decouples the problem into a non-rigid reflection followed by a rigid transformation. The initial step addresses the discrepancy in coordinate conventions by applying a known and constant semantic transformation, denoted $T_{sem}$, to the point set measured by the HoloLens 2:
$$T_{sem} = \begin{bmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$
This operation yields an intermediate point set, $P'_{Holo} = P_{Holo} T_{sem}^{T}$, which is now represented in a right-handed system. With the handedness conflict resolved, the problem is reduced to a classic rigid-body registration task. The remaining unknown rotation matrix $R$ and translation vector $t$ between $P'_{Holo}$ and the corresponding robot measurements $P_{Robot} \in \mathbb{R}^{n \times 3}$ can now be reliably computed using the Kabsch algorithm. This algorithm uses Singular Value Decomposition (SVD) to calculate the optimal rotation matrix that minimizes the Root Mean Squared Deviation (RMSD). As a closed-form solution with linear time complexity $O(N)$, it is highly efficient and robust for real-time calibration, eliminating the need for iterative optimization:
$$(R, t) = \mathrm{Kabsch}(P'_{Holo}, P_{Robot})$$
Finally, the complete transformation $T_{Holo}^{Robot}$ is synthesized by composing the semantic transformation and the computed rigid motion:
$$T_{Holo}^{Robot} = \begin{bmatrix} R \cdot T_{sem} & t \\ \mathbf{0}^{T} & 1 \end{bmatrix}$$
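For concreteness, the two-stage registration can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors’ implementation: the sign convention in `T_SEM` assumes the common Unity-to-ROS axis mapping, and the helper names are our own.

```python
import numpy as np

# Assumed semantic transform from the left-handed Y-up HoloLens frame
# to a right-handed Z-up frame (sign convention is an assumption).
T_SEM = np.array([[0.0, 0.0, 1.0],
                  [-1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])

def kabsch(P, Q):
    """Rigid alignment of point sets P -> Q (each n x 3) via SVD."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)               # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against residual reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t

def register_holo_to_robot(P_holo, P_robot):
    """Two-stage calibration: semantic transform, then rigid Kabsch fit."""
    P_holo_rh = P_holo @ T_SEM.T            # stage 1: resolve handedness
    R, t = kabsch(P_holo_rh, P_robot)       # stage 2: rigid registration
    T = np.eye(4)
    T[:3, :3] = R @ T_SEM                   # compose R with the semantic transform
    T[:3, 3] = t
    return T                                # homogeneous T_Holo->Robot
```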
With this registration accurately established, the system is empowered to act as the intended bridge between human and machine. The sparse, “coarse-grained” waypoints specified by the user’s gestures can be precisely transformed into the robot’s reference frame. These transformed points then serve as the definitive anchors for the “fine-grained” computational planner to generate a smooth, dense, and kinematically feasible trajectory. This robust, two-stage registration is therefore the critical underpinning of our entire interaction paradigm, ensuring that intuitive human intent is faithfully translated into precise robotic execution.

4.2. Adaptive Curvature-Based Spline Resampling (ACRSR)

Conventional local smoothing techniques, such as parabolic blending, achieve $C^1$ continuity at individual junctions but inherently operate in a localized manner, leaving the global curvature profile unoptimized. Furthermore, these methods tend to deviate from the original waypoints, producing a “corner-cutting” effect that undermines path fidelity, an issue of particular concern when following precise operator-defined trajectories.
The proposed ACRSR method addresses these limitations by employing a globally defined Catmull–Rom spline interpolation. This spline is selected for its guaranteed passage through all control points and its $C^1$ continuity, thereby ensuring smooth tangent transitions along the path. Given a sparse sequence of waypoints

$$P = \{P_0, P_1, \ldots, P_n\}, \quad P_i \in \mathbb{R}^3,$$

each spline segment $C_i(t)$ connecting $P_i$ and $P_{i+1}$ for $t \in [0, 1]$ is expressed as:

$$C_i(t) = \frac{1}{2} \begin{bmatrix} 1 & t & t^2 & t^3 \end{bmatrix} \begin{bmatrix} 0 & 2 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ 2 & -5 & 4 & -1 \\ -1 & 3 & -3 & 1 \end{bmatrix} \begin{bmatrix} P_{i-1} \\ P_i \\ P_{i+1} \\ P_{i+2} \end{bmatrix}$$
To handle the path boundaries, virtual control points are introduced via reflection,

$$P_{-1} = 2P_0 - P_1, \qquad P_{n+1} = 2P_n - P_{n-1},$$

ensuring well-defined tangents at both ends.
Beyond interpolation, ACRSR adaptively refines the sampling density of the spline according to its local geometric complexity, which is quantified by the curvature $\kappa(t)$:

$$\kappa(t) = \frac{\left\| C'(t) \times C''(t) \right\|}{\left\| C'(t) \right\|^{3}}$$
When $\kappa(t)$ exceeds a tunable threshold $\kappa_{th}$, the sampling resolution is locally increased. In this study, $\kappa_{th}$ is determined empirically and represents a fundamental trade-off. A lower threshold yields a denser set of points that captures geometric detail with higher fidelity, albeit at an increased computational cost for the subsequent motion planning stage. Conversely, a higher threshold improves efficiency by generating fewer waypoints but may risk over-smoothing fine-grained turns. The selection of $\kappa_{th}$ is therefore guided by the specific precision requirements of the given task. This adaptive approach ensures that rapid directional changes are adequately represented while avoiding excessive oversampling in straight or gently curved regions. In our experiments, $\kappa_{th}$ was set to 0.5 for the sealant application task, to better capture regions of higher curvature, and 0.9 for the butt welding task, where paths are predominantly linear.
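The simplified Python sketch below illustrates the ACRSR idea under stated assumptions: it evaluates Catmull–Rom segments with the basis matrix above, estimates curvature numerically, and switches between two sampling densities per segment. The actual system may refine density more finely within a segment; the two-level scheme, the sampling counts, and all function names here are illustrative.

```python
import numpy as np

# Catmull-Rom basis matrix from the equation above.
CR_BASIS = 0.5 * np.array([[0, 2, 0, 0],
                           [-1, 0, 1, 0],
                           [2, -5, 4, -1],
                           [-1, 3, -3, 1]], dtype=float)

def cr_point(ctrl, t):
    """Evaluate one Catmull-Rom segment at t in [0, 1]; ctrl is 4 x 3."""
    return np.array([1.0, t, t * t, t ** 3]) @ CR_BASIS @ ctrl

def curvature(ctrl, t, h=1e-4):
    """Numerical kappa(t) = |C' x C''| / |C'|^3 on one segment."""
    c0, c1, c2 = cr_point(ctrl, t - h), cr_point(ctrl, t), cr_point(ctrl, t + h)
    d1 = (c2 - c0) / (2 * h)               # first derivative (central difference)
    d2 = (c2 - 2 * c1 + c0) / (h * h)      # second derivative
    return np.linalg.norm(np.cross(d1, d2)) / (np.linalg.norm(d1) ** 3 + 1e-12)

def acrsr(waypoints, kappa_th=0.5, base_n=5, dense_n=15):
    """Adaptive curvature-based resampling of a sparse waypoint list."""
    P = np.asarray(waypoints, dtype=float)
    P = np.vstack([2 * P[0] - P[1], P, 2 * P[-1] - P[-2]])  # virtual endpoints
    out = []
    for i in range(len(P) - 3):
        ctrl = P[i:i + 4]
        # densify segments whose peak sampled curvature exceeds the threshold
        peak = max(curvature(ctrl, t) for t in np.linspace(0.05, 0.95, 9))
        n = dense_n if peak > kappa_th else base_n
        out.extend(cr_point(ctrl, t) for t in np.linspace(0.0, 1.0, n, endpoint=False))
    out.append(P[-2])  # close the path at the final real waypoint
    return np.asarray(out)
```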
A key advantage of ACRSR lies in its computational efficiency compared to traditional uniform sampling methods. To capture sharp turns, a uniform approach must apply a small interpolation step across the entire path, leading to significant oversampling and redundant points in straight or gently curved sections. In contrast, ACRSR intelligently allocates points only where geometrically necessary. This reduction in the total number of waypoints not only preserves path fidelity but also directly decreases the number of subsequent IK computations, significantly enhancing the overall efficiency of the motion planning pipeline. The result of the ACRSR procedure is a dense, curvature-aware waypoint sequence, denoted as $X_{ref}$, which faithfully represents the intended path while providing the smoothness necessary for stable robot control. The entire procedure is visually summarized in Figure 8.

4.3. Robot Motion Planning

Upon completion of the spatial registration, all operator-specified waypoints $p_i^{Holo}$, originally defined in the HoloLens world frame $F_{Holo}$, are transformed into the robot base frame $F_{Robot}$ using a precomputed homogeneous transformation $T_{Holo}^{Robot}$.

The motion planner then processes the dense Cartesian path $X_{ref} = \{x_1, x_2, \ldots, x_N\}$ generated by the ACRSR algorithm to produce a kinematically feasible joint-space trajectory. Each waypoint $x_k \in X_{ref}$ is associated with a desired end-effector orientation, which is interpolated between the initial and target poses to guarantee continuous rotational motion.

Let $T_{init}$ and $T_{target}$ be the initial and target end-effector poses. After transformation, they become $T_{init}^{R}$ and $T_{target}^{R}$ in the robot frame. The planning objective is to generate a time-parameterized joint-space trajectory $Q(t)$ such that the forward kinematics (FK) satisfy:

$$FK(q(t_k)) = T_k = [R_k \mid x_k]$$

where $R_k$ is obtained via Spherical Linear Interpolation (SLERP) between $R_{init}^{R}$ and $R_{target}^{R}$, and $x_k$ is the $k$-th point from $X_{ref}$. To solve the IK at each target pose $T_k$, the joint state from the previous step, $q_{k-1}$, is used as the initial guess for the numerical solver. This strategy ensures that the resulting sequence of joint configurations is continuous in joint space. It effectively mitigates issues arising from multiple IK solutions, such as abrupt joint movements, thereby guaranteeing a smooth final trajectory.
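A minimal sketch of this seeded-IK loop, using SciPy’s SLERP utility, is shown below. The `ik_solve` callable is a placeholder for whatever numerical IK routine the backend provides (e.g., a KDL or TRAC-IK wrapper); workspace validation, collision checking, and time-parameterization are omitted for brevity.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def plan_joint_trajectory(x_ref, R_init, R_target, ik_solve, q_start):
    """Seeded IK along an ACRSR path with SLERP-interpolated orientation.

    x_ref: (N, 3) dense positions; R_init/R_target: scipy Rotation poses;
    ik_solve(pose_4x4, q_seed) -> q or None is a hypothetical solver hook;
    q_start: the robot's current joint state.
    """
    N = len(x_ref)
    slerp = Slerp([0.0, 1.0], Rotation.concatenate([R_init, R_target]))
    q_prev, Q = q_start, []
    for k, x_k in enumerate(x_ref, start=1):
        T_k = np.eye(4)
        T_k[:3, :3] = slerp(k / N).as_matrix()  # interpolated orientation R_k
        T_k[:3, 3] = x_k                        # position from X_ref
        q_k = ik_solve(T_k, q_prev)             # previous solution as seed
        if q_k is None:
            raise RuntimeError(f"IK failed at waypoint {k}/{N}")
        Q.append(q_k)
        q_prev = q_k                            # keeps the joint path continuous
    return np.asarray(Q)
```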
The complete motion planning procedure is summarized in Algorithm 1. Before execution, the target pose $T_{target}^{R}$ is validated against the robot’s reachable workspace to prevent infeasible IK queries. Through this pipeline, operator-guided spatial intent is converted into a precise, smooth, and collision-free joint-space motion plan.
Algorithm 1 Robot Motion Planning from ACRSR-Refined Paths
1: Input: refined path $X_{ref}$, $T_{init}$, $T_{target}$, $T_{Holo}^{Robot}$, time step $\Delta t$
2: Output: joint trajectory $Q(t)$
3: Transform $T_{init}$ and $T_{target}$ into $F_{Robot}$ to obtain $T_{init}^{R}$, $T_{target}^{R}$
4: Validate that $T_{target}^{R}$ lies within the reachable workspace
5: $\Delta R \leftarrow (R_{init}^{R})^{T} R_{target}^{R}$
6: for $k = 1$ to $N$ do
7:  $R_k \leftarrow \mathrm{SLERP}(R_{init}^{R}, R_{target}^{R}, k/N)$
8:  $T_k \leftarrow [R_k \mid x_k]$
9:  $q_k \leftarrow \mathrm{IK}(T_k, q_{k-1})$
10:  Verify $q_k \in C_{free}$ (collision-free) and joint continuity
11:  Append $q_k$ to $Q$
12: end for
13: Apply time-parameterization to the sequence $Q$ to generate $Q(t)$
14: return $Q(t)$

5. Experiments and Results

To rigorously evaluate the proposed “coarse-to-fine” programming methodology, we conducted a comprehensive experimental assessment focusing on its efficiency and usability. The evaluation framework includes a detailed description of the experimental platform and protocol, two representative industrial case studies for qualitative validation, and a quantitative user study comparing our system against a traditional teach pendant baseline.

5.1. Experimental Platform and Protocol

All experiments were performed on the integrated hardware and software platform described in Section 3. The setup comprised a 6-DOF Duco GCR7-910 collaborative robot with a reach of 910 mm and payload of 7 kg, a host workstation (Intel Core i7-9750H, 16 GB RAM, NVIDIA GeForce GTX 1650), and a Microsoft HoloLens 2 MR HMD. The software stack utilized ROS Melodic Morenia for back-end planning and Unity 2022.3 LTS for the MR front-end, with communication handled via TCP/IP using JSON-formatted messages.
The user study involved 9 participants (8 males, 1 female) aged between 22 and 28 (M = 24.5, SD = 2.1). To evaluate the system’s accessibility to non-experts, the participants were selected with diverse backgrounds: 3 had prior experience with industrial robot programming, while the remaining 6 were novices. Regarding AR familiarity, 4 participants had previous experience with AR/VR interfaces.
The experimental protocol was standardized to ensure consistency and reproducibility. To eliminate potential order effects, the assignment of programming methods (MR system vs. teach pendant) was counterbalanced across participants: half of the participants performed the tasks using the MR system first, while the other half started with the teach pendant. Each session began with a 5 min system calibration, including HoloLens-to-robot coordinate registration (Section 4.1). Participants then received a 10 min tutorial on the assigned programming method (our MR system or the baseline teach pendant). For each task, participants performed five trials per programming method to account for learning effects and variability. Each trial consisted of defining the trajectory, verifying it through simulation, and executing it on the physical robot. The rest schedule comprised a 5 min break between tasks within the same condition and a 15 min break between the two experimental conditions, mitigating fatigue while preserving procedural familiarity.

Regarding data analysis, the first trial per participant per method per task was treated as a practice trial and excluded to mitigate initial learning effects (e.g., unfamiliarity with gestures or controls). The remaining four trials were averaged to compute a per-participant mean TCT for each condition. These per-participant means were then used for statistical comparisons (paired samples t-tests). The system automatically recorded the task completion time (TCT). Post-experiment, participants completed the System Usability Scale (SUS) questionnaire and provided qualitative feedback via open-ended questions to assess subjective usability and experience.
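For reproducibility, the analysis pipeline described above can be expressed in a few lines of SciPy. This is an illustrative sketch of the procedure (practice-trial exclusion, per-participant averaging, paired t-test), not the authors’ analysis script.

```python
import numpy as np
from scipy import stats

def compare_conditions(tct_mr, tct_tp):
    """Paired-samples t-test on per-participant mean TCTs.

    tct_mr / tct_tp: arrays of shape (participants, 5 trials); the first
    trial is treated as practice and excluded, matching the protocol.
    """
    mean_mr = np.asarray(tct_mr)[:, 1:].mean(axis=1)  # drop practice trial
    mean_tp = np.asarray(tct_tp)[:, 1:].mean(axis=1)
    t, p = stats.ttest_rel(mean_mr, mean_tp)          # paired comparison
    return mean_mr.mean(), mean_tp.mean(), t, p
```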

5.2. System Accuracy Evaluation

The overall accuracy of the proposed robotic programming framework is governed by two primary error sources: (1) the spatial registration accuracy, which reflects the precision of the coordinate transformation between the HoloLens and the robot base frame, and (2) the trajectory fidelity, which evaluates how well the ACRSR algorithm reconstructs the operator’s intended geometry from sparse inputs. To comprehensively assess these two aspects, we performed a two-stage evaluation as detailed below.

5.2.1. Spatial Registration Accuracy

To quantify the spatial fidelity of the HoloLens-to-robot registration, a post-calibration validation experiment was conducted using eight non-coplanar test points distributed across the robot’s workspace. The operator, wearing a HoloLens 2, designated virtual waypoints via the “Air Tap” gesture. The resulting mixed-reality coordinates $P_{Holo}$ were transmitted to the ROS backend over TCP/IP.

To obtain ground-truth measurements, the robot was commanded to move its Tool Center Point (TCP) to each corresponding waypoint, and the teach pendant readings were recorded as $P_{Robot}$. The mixed-reality coordinates were then transformed into the robot base frame using the calibration matrix $T_{Holo}^{Robot}$, yielding

$$P'_{Holo} = T_{Holo}^{Robot} \cdot P_{Holo}.$$

For each test point, the registration error was computed as the Euclidean distance between the transformed MR coordinates and the robot-measured ground truth:

$$E_i = \left\| P_{Robot,i} - P'_{Holo,i} \right\|_2 = \sqrt{(x_{r,i} - x_{h,i})^2 + (y_{r,i} - y_{h,i})^2 + (z_{r,i} - z_{h,i})^2}.$$
Quantitative results are shown in Figure 9. The system achieved a Mean Absolute Error (MAE) of 1.80 mm and a Root Mean Square Error (RMSE) of 2.03 mm, with a maximum deviation of 3.41 mm. Although sub-millimeter precision remains challenging for current optical see-through HMDs, the obtained 1.80 mm MAE is well within the acceptable range for the intended coarse-to-fine programming workflow.

5.2.2. Trajectory Evaluation

To evaluate the fidelity of the proposed trajectory generation method, a digital-twin experiment was conducted in Unity to isolate algorithmic performance from hardware noise. The ideal contour was extracted from the CAD model of the workpiece and used as the Ground Truth. In Figure 10, a simulated operator provided seven sparse waypoints along the contour, emulating the coarse input typically generated during MR-based programming.
Using the same set of user inputs, two trajectory generation strategies were compared: the standard interpolation method commonly used in robot teach pendants (TP), and the proposed Adaptive Curvature-based Spline Resampling (ACRSR) algorithm.
Trajectory fidelity was assessed by computing the Euclidean distance from sampled points on each generated trajectory to their nearest point on the Ground Truth.
Figure 11 summarizes the results. The TP method achieved a Mean Absolute Error (MAE) of 0.9198 mm and a Maximum Error of 3.1355 mm. In comparison, the ACRSR method produced significantly higher geometric fidelity, with an MAE of 0.4062 mm and a Maximum Error of 1.3914 mm, corresponding to a 55.8% reduction in mean error.
As shown in Figure 11b, ACRSR effectively addresses the limitations of linear interpolation by leveraging curvature information to reconstruct the arc segments. Although the curved region ($x < 6$) is inherently more challenging, ACRSR maintains consistently low error levels and avoids the pronounced error spike observed in the TP trajectory (notably near the high-curvature point at $x \approx 4.8$).
Beyond improving positional accuracy, ACRSR provides a qualitative advantage in trajectory smoothness. Standard TP methods rely on piecewise-linear segments ($C^0$ continuity), which can introduce vibration, or on blending schemes that unavoidably cause corner-cutting. In contrast, ACRSR generates trajectories with native geometric continuity, ensuring smooth curvature transitions and continuous acceleration profiles. This property is essential for maintaining stable process velocities and preventing quality defects in precision tasks such as welding or sealant application.

5.3. System Functionality and Validation

In this section, two representative industrial contour-following tasks are used to demonstrate the implementation framework and evaluate the performance gains of the proposed system.

5.3.1. Butt Joint Welding

This task emulates a standard welding operation on a butt joint, necessitating a smooth linear trajectory with consistent end-effector orientation to achieve high weld quality. As illustrated in Figure 12, the operator employed hand gestures within the MR interface to designate three sparse waypoints along a 150 mm seam on a metallic workpiece. These waypoints delineated the start, midpoint, and end of the joint, effectively conveying the operator’s high-level intent without the need for meticulous manual tracing.
Following confirmation, the backend ACRSR algorithm interpolated these waypoints into a dense trajectory comprising 23 points, with adaptive density augmentation near the endpoints to promote smoothness. The IK solver subsequently generated the corresponding joint trajectory, ensuring a fixed torch orientation perpendicular to the workpiece surface. The MR-based virtual robot simulation enabled the operator to preemptively verify collision-free motion prior to physical execution.
The physical robot executed the trajectory at a constant velocity of 50 mm/s, yielding a uniform weld seam. Across multiple trials, the complete programming-to-execution workflow averaged approximately 27 s.

5.3.2. Sealant Application

To assess a more intricate scenario, sealant was applied along a non-planar contour on an automotive chassis panel, featuring curves with variable curvature. The operator defined seven sparse waypoints at critical inflection points using MR gestures, as shown in Figure 13. This sparse input captured the overall geometry of the 300 mm contour, encompassing transitions between linear and curved segments.
The ACRSR algorithm produced a refined trajectory with 43 points, dynamically increasing point density in regions of high curvature (e.g., corners) to guarantee trajectory smoothness. End-effector orientations were calculated to maintain perpendicularity relative to the surface normals, which were extracted from the panel’s CAD model. The MR simulation facilitated visual validation of the trajectory’s fidelity to the contour.
The physical robot dispensed a uniform sealant bead at a velocity of 30 mm/s. The programming process averaged about 47 s.
These case studies qualitatively validate the system’s capability to convert sparse, intuitive user inputs into precise, executable trajectories suitable for demanding industrial applications.

5.4. Performance Evaluation and Comparison

The butt welding task involved programming a simple linear profile to evaluate the system’s ability to generate a smooth trajectory from a minimal set of waypoints. This task emphasized the system’s efficiency in helping non-expert users quickly define and execute precise paths. In contrast, the sealant application task required a more complex, variable-curvature profile, designed to assess the framework’s capacity to handle adaptive resampling and orientation adjustments, thereby reducing operator workload in complex scenarios. Figure 12 and Figure 13 illustrate the procedural steps for performing each task using the MR system.
A statistical analysis was performed on data from nine participants for both tasks. For the butt welding task, a paired samples t-test revealed a statistically significant reduction in Task Completion Time (TCT) with the MR system (M = 70.2 s, SD = 11.4 s) compared to the teach pendant (M = 127.0 s, SD = 9.4 s). This represents a substantial time savings of approximately 45% ($t(8) = 17.5$, $p < 0.001$).
The efficiency gains were even more pronounced in the sealant application task. Here, a paired samples t-test again showed a statistically significant difference in TCT, with the MR system’s time (M = 109.4 s, SD = 9.3 s) being drastically shorter than the teach pendant’s time (M = 220.3 s, SD = 20.0 s). This corresponds to a remarkable TCT reduction of approximately 50% ($t(8) = 23.4$, $p < 0.001$).
The comparison results are summarized in Table 1 and visualized in Figure 14.
The data unequivocally demonstrate the strong efficiency advantage of the MR-assisted programming system. For the simpler welding task, the MR interface cut the average programming time by nearly half; in the more complex sealant task, the improvement was even greater, with the programming time fully halved. The final TCT showed no statistically significant difference between the expert and novice subgroups. This suggests that the proposed “coarse-to-fine” paradigm effectively abstracts the complexity of the task, allowing non-experts to achieve performance levels comparable to those of domain experts.
To rigorously evaluate subjective usability, participants completed the standard System Usability Scale (SUS) questionnaire consisting of 10 items. The responses were recorded on a 5-point Likert scale (1: Strongly Disagree to 5: Strongly Agree). Table 2 presents the detailed mean scores for each question item, comparing the proposed MR system against the teach pendant baseline. Figure 15 shows the comparative questionnaire scores based on participant feedback for both systems. The feedback indicates that the functions of the proposed MR system are well integrated and that users can learn the system very quickly (as evidenced by high scores on Q5 and Q7). Participants considered the MR system relatively easy to use and felt confident using it. Following standard SUS interpretation, the participants’ raw scores for the MR system convert to a global score of 82.7 (out of 100), placing the system’s usability in the “Excellent” range. In contrast, the teach pendant received significantly less favorable ratings ($p < 0.001$). These experimental results demonstrate that even inexperienced users can effectively utilize the proposed MR system to complete complex robot programming tasks.
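The conversion from raw Likert responses to the 0–100 global score follows the standard SUS scoring rule: odd-numbered items contribute (score − 1), even-numbered items contribute (5 − score), and the per-participant sum is scaled by 2.5. A minimal sketch:

```python
import numpy as np

def sus_score(responses):
    """Mean SUS score from a (participants, 10) array of 1-5 Likert ratings."""
    r = np.asarray(responses, dtype=float)
    odd = r[:, 0::2] - 1          # items 1,3,5,7,9: (score - 1)
    even = 5 - r[:, 1::2]         # items 2,4,6,8,10: (5 - score)
    per_participant = np.concatenate([odd, even], axis=1).sum(axis=1) * 2.5
    return per_participant.mean()
```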
The inefficiency of teach pendant programming often stems from jogging coordinate points, frequent mode switching, and the difficulty of maintaining a smooth path among numerous waypoints. In contrast, the MR system minimizes these issues by allowing for intuitive waypoint placement and abstracting fine-grained trajectory generation to the ACRSR algorithm, which effectively reduces operator workload and redundant inputs.
These results indicate that in both industrial tasks, the proposed system drastically shortened completion times and improved operator efficiency. The more significant TCT improvement in the sealant task is likely due to the higher geometric complexity requiring more waypoints with traditional methods, thus magnifying the benefits of sparse guidance. However, the variance in TCT could be influenced by individual differences in spatial perception and familiarity with gestures.

6. Conclusions

This paper successfully addressed the critical trade-off between intuitive programming and trajectory quality in robotics. We introduced and validated a “coarse-to-fine” synergistic paradigm that leverages MR to capture high-level human intent and computational algorithms to ensure high-quality machine execution. Our experiments demonstrated that this approach fundamentally alters the programming workflow for non-experts: it reduces task completion times by up to 50% and significantly enhances usability compared to traditional methods. Most importantly, it resolves the central conflict by transforming sparse, intuitive user guidance into smooth, dense, production-quality trajectories suitable for complex contour-following tasks. This work offers more than an efficient programming tool; it presents a robust framework for creating more accessible, intelligent, and truly collaborative human–robot systems. By bridging the gap between human guidance and machine precision, our methodology paves the way for broader robotic adoption in dynamic, customized, and high-mix manufacturing environments.
However, the system currently has limitations regarding dynamic environment perception. While the virtual verification step minimizes planning errors, the physical execution logic does not yet integrate real-time, sensor-based collision avoidance within the ROS backend. Consequently, the system relies on the robot’s built-in hardware safety features to handle unexpected obstacles. Future work will focus on integrating real-time depth sensing into the execution loop to enable dynamic replanning, thereby making the system more robust to operator oversight and environmental changes.

Author Contributions

Conceptualization, Z.W. and Z.L.; methodology, Z.W. and Z.L.; software, Z.W.; validation, Z.W., Z.L., H.Y., D.P., S.P. and S.L.; formal analysis, Z.W.; investigation, Z.W.; resources, Z.L.; data curation, Z.W.; writing—original draft preparation, Z.W.; writing—review and editing, Z.W., Z.L., H.Y., D.P., S.P. and S.L.; visualization, Z.W.; supervision, Z.L.; project administration, Z.L.; funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (Grant No. 2023YFB4704600); Project supported by the Natural Science Foundation of Liaoning Province, China (Grant No. 2024-MS-131).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The datasets generated and/or analysed during the current study are not publicly available due to the trade secrets involved but are available from the corresponding author on reasonable request.

Acknowledgments

The authors would like to thank the technical staff at the National Robot Quality Inspection and Testing Center (Liaoning) for their support in conducting the experiments. During the preparation of this manuscript, the authors used OpenAI ChatGPT (version GPT-5.1, 2025) for assistance in drafting text. The authors have reviewed and edited all generated content and take full responsibility for the final scientific content and conclusions of this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ACRSR Adaptive Curvature-based Spline Resampling
DKT Demonstrative-Kinesthetic Teaching
FK Forward Kinematics
GUI Graphical User Interface
HMD Head-Mounted Display
HRI Human–Robot Interaction
IK Inverse Kinematics
MAE Mean Absolute Error
MR Mixed Reality
MRTK Mixed Reality Toolkit
OLP Offline Programming
PbD Programming by Demonstration
PTP Point-to-Point
RMSD Root Mean Squared Deviation
RMSE Root Mean Square Error
ROS Robot Operating System
R&D Research and Development
SLERP Spherical Linear Interpolation
SUS System Usability Scale
SVD Singular Value Decomposition
TCT Task Completion Time
TCP Tool Center Point
TP Teach Pendant

References

  1. Makulavičius, M.; Petkevičius, S.; Rožėnė, J.; Dzedzickis, A.; Bučinskas, V. Industrial robots in mechanical machining: Perspectives and limitations. Robotics 2023, 12, 160.
  2. Wen, Y.; Pagilla, P.R. A novel 3D path following control framework for robots performing surface finishing tasks. Mechatronics 2021, 76, 102540.
  3. Khan, M.M.; Singh, K.P.; Khan, W.U. A critical study on the implementation of operation, control and maintenance techniques for flexible manufacturing systems in small scale industries. Mater. Today Proc. 2023.
  4. Heimann, O.; Guhl, J. Industrial robot programming methods: A scoping review. In Proceedings of the 2020 25th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Vienna, Austria, 8–11 September 2020; Volume 1, pp. 696–703.
  5. Liao, Z.; Cai, Y. AR-enhanced digital twin for human–robot interaction in manufacturing systems. Energy Ecol. Environ. 2024, 9, 530–548.
  6. Fu, J.; Rota, A.; Li, S.; Zhao, J.; Liu, Q.; Iovene, E.; Ferrigno, G.; De Momi, E. Recent advancements in augmented reality for robotic applications: A survey. Actuators 2023, 12, 323.
  7. Soares, I.; Petry, M.; Moreira, A.P. Programming robots by demonstration using augmented reality. Sensors 2021, 21, 5976.
  8. Maric, B.; Zoric, F.; Petric, F.; Orsag, M. Comparative analysis of programming by demonstration methods: Kinesthetic teaching vs. human demonstration. arXiv 2024, arXiv:2403.10140.
  9. Huang, S.; Wang, B.; Li, X.; Zheng, P.; Mourtzis, D.; Wang, L. Industry 5.0 and Society 5.0—Comparison, complementation and co-evolution. J. Manuf. Syst. 2022, 64, 424–428.
  10. Pan, Z.; Polden, J.; Larkin, N.; Van Duin, S.; Norrish, J. Recent progress on programming methods for industrial robots. Robot. Comput.-Integr. Manuf. 2012, 28, 87–94.
  11. Ryalat, M.; Almtireen, N.; Al-refai, G.; Elmoaqet, H.; Rawashdeh, N. Research and education in robotics: A comprehensive review, trends, challenges, and future directions. J. Sens. Actuator Netw. 2025, 14, 76.
  12. Xu, Y.; Yu, H.; Wu, L.; Song, Y.; Liu, C. Contingency planning of visual contamination for wheeled mobile robots with chameleon-inspired visual system. Electronics 2023, 12, 2365.
  13. Mabong, G.P.; Osore, E.A.; Cherop, P.T. Robot manipulator programming via demonstrative-kinesthetic teaching for efficient industrial material handling applications. Afr. Sci. Annu. Rev. 2024, 1, 165–171.
  14. Nieto Bastida, S.; Lin, C.Y. Autonomous trajectory planning for spray painting on complex surfaces based on a point cloud model. Sensors 2023, 23, 9634.
  15. Gonzalez, M.; Rodriguez, A.; de Lacalle, L.N.L. A novel methodology to improve robotic contour following by using radial-compliant pneumatic spindles. Int. J. Adv. Manuf. Technol. 2025, 1–9.
  16. Sarivan, I.M.; Madsen, O.; Wæhrens, B.V. Automatic welding-robot programming based on product-process-resource models. Int. J. Adv. Manuf. Technol. 2024, 132, 1931–1950.
  17. Slavković, N.; Zivanovic, S.; Dimic, Z.; Kokotovic, B. An advanced machining robot flexible programming methodology supported by verification in a virtual environment. Int. J. Comput. Integr. Manuf. 2025, 38, 1424–1442.
  18. Li, G.; Wang, R.; Xu, P.; Ye, Q.; Chen, J. The developments and challenges towards dexterous and embodied robotic manipulation: A survey. arXiv 2025, arXiv:2507.11840.
  19. Babcinschi, M.; Cruz, F.; Duarte, N.; Santos, S.; Alves, S.; Neto, P. Offline robot programming assisted by task demonstration: An AutomationML interoperable solution for glass adhesive application and welding. Int. J. Comput. Integr. Manuf. 2025, 38, 864–875.
  20. Zhang, F.; Lai, C.Y.; Simic, M.; Ding, S. Augmented reality in robot programming. Procedia Comput. Sci. 2020, 176, 1221–1230.
  21. Ong, S.; Nee, A.; Yew, A.; Thanigaivel, N. AR-assisted robot welding programming. Adv. Manuf. 2020, 8, 40–48.
  22. Xu, Y.; Dai, P.; Xin, M.; Wu, L.; Song, Y. Design and analysis of a dual-screw propelled robot for underwater and muddy substrate operations in agricultural ponds. Actuators 2025, 14, 450.
  23. Liu, G.; Sun, W.; Li, P. Motion capture and AR based programming by demonstration for industrial robots using handheld teaching device. Sci. Rep. 2024, 14, 23259.
  24. Dogangun, F.; Bahar, S.; Yildirim, Y.; Temir, B.T.; Ugur, E.; Dogan, M.D. RAMPA: Robotic Augmented Reality for Machine Programming by DemonstrAtion. IEEE Robot. Autom. Lett. 2025, 10, 3795–3802.
  25. Lotsaris, K.; Gkournelos, C.; Fousekis, N.; Kousi, N.; Makris, S. AR based robot programming using teaching by demonstration techniques. Procedia CIRP 2021, 97, 459–463.
  26. Ong, S.K.; Yew, A.; Thanigaivel, N.K.; Nee, A.Y. Augmented reality-assisted robot programming system for industrial applications. Robot. Comput.-Integr. Manuf. 2020, 61, 101820.
  27. Tadeja, S.K.; Zhou, T.; Capponi, M.; Walas, K.; Bohné, T.; Forni, F. Using augmented reality in human–robot assembly: A comparative study of eye-gaze and hand-ray pointing methods. In Proceedings of the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Abu Dhabi, United Arab Emirates, 14–18 October 2024; pp. 8786–8793.
  28. Ikeda, B.; Szafir, D. ProgramAR: Augmented reality end-user robot programming. ACM Trans. Hum.-Robot Interact. 2024, 13, 1–20.
  29. Kim, J.; Lee, S.; Bae, J. An augmented reality-based wearable system for handheld-free and intuitive robot programming. J. Intell. Manuf. 2025, 1–10.
  30. Fang, H.C.; Ong, S.K.; Nee, A.Y.C. Interactive robot trajectory planning and simulation using augmented reality. Robot. Comput.-Integr. Manuf. 2012, 28, 227–237.
  31. ISO/TS 15066:2016; Robots and Robotic Devices—Collaborative Robots. International Organization for Standardization: Geneva, Switzerland, 2016.
  32. REP 103; Standard Units of Measure and Coordinate Conventions. Open Robotics: Mountain View, CA, USA, 2010. Available online: https://www.ros.org/reps/rep-0103.html (accessed on 7 October 2010).
  33. Kabsch, W. A solution for the best rotation to relate two sets of vectors. Acta Crystallogr. Sect. A 1976, 32, 922–923.
Figure 1. Schematic of the MR-based low-code robot programming system.
Figure 2. Components of the MR Low-Code Programming System.
Figure 3. The system architecture and its “coarse-to-fine” data flow, decoupled into the Operator Intent Module (top, blue) and the Execution Core (bottom, orange). The forward path refines sparse user-defined waypoints into an executable joint trajectory, while the feedback loop synchronizes the virtual model with the physical robot’s state.
Figure 4. System Communication Architecture. The architecture is based on the TCP/IP protocol, employing a Publish/Subscribe pattern and JSON data format to enable bidirectional data exchange between the MR front-end and the ROS back-end. The green arrows represent the forward command stream (desired position and orientation) from the front-end, while the orange arrows represent the backward feedback stream (joint state) from the back-end.
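For illustration only, the snippet below sketches what one message on the forward command stream of Figure 4 could look like. It assumes a rosbridge-style endpoint with newline-delimited JSON framing; the host address, port, topic name, and message fields are placeholders, not the system’s actual schema.

```python
import json
import socket

# Hypothetical forward command: the MR front-end publishes a desired TCP pose.
# Topic name and field layout are illustrative, not the paper's actual schema.
command = {
    "op": "publish",
    "topic": "/mr/target_pose",  # assumed topic name
    "msg": {
        "position": {"x": 0.42, "y": -0.10, "z": 0.35},           # metres, per REP 103 [32]
        "orientation": {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0},  # unit quaternion
    },
}

# Assumed ROS back-end address; 9090 is the conventional rosbridge port.
with socket.create_connection(("192.168.1.50", 9090), timeout=5.0) as sock:
    sock.sendall((json.dumps(command) + "\n").encode("utf-8"))
    # The backward feedback stream would arrive on the same link as
    # newline-delimited JSON joint-state messages.
    reply = sock.makefile().readline()
    print(json.loads(reply))
```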
Figure 5. Coarse-grained path definition. The operator uses MR-based hand gestures to rapidly place a sparse sequence of waypoints, defining the initial outline of the task. Yellow dots represent preview points, while green dots represent placed points.
Figure 6. Simulation-based trajectory verification. The virtual robot model executes the fine-grained trajectory generated by the back-end in the MR environment, allowing the operator to visually confirm the motion before physical execution. The four representative frames illustrate: (a) Initiation stage, (b) Stable stage, (c) Corner transition stage, and (d) Termination stage.
Figure 7. Coordinate system alignment between the HoloLens MR world frame ($F_{Holo}$) and the robot base frame ($F_{Robot}$).
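The alignment in Figure 7 is the classical absolute-orientation problem, solved in closed form by the Kabsch algorithm [33] using an SVD. The sketch below is a generic NumPy rendering of that textbook procedure rather than the authors’ exact code; it assumes at least three non-collinear point pairs measured in both frames.

```python
import numpy as np

def kabsch_align(p_holo: np.ndarray, p_robot: np.ndarray):
    """Best-fit rotation R and translation t mapping F_Holo points onto
    F_Robot points (Kabsch [33]); inputs are paired (N, 3) arrays."""
    c_h, c_r = p_holo.mean(axis=0), p_robot.mean(axis=0)
    H = (p_holo - c_h).T @ (p_robot - c_r)   # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_r - R @ c_h
    return R, t

# Usage: transform a HoloLens-frame point into the robot base frame.
# R, t = kabsch_align(holo_pts, robot_pts)
# p_base = R @ p_holo_point + t
```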
Figure 8. Illustration of the Adaptive Curvature-based Spline Resampling (ACRSR) process. (a) A sparse sequence of operator-defined input waypoints ($P_0$ to $P_6$) defining the initial, coarse path. (b) The initial continuous path generated by applying Catmull-Rom spline interpolation, which passes through all control points. (c) Curvature analysis along the spline. The path is color-coded based on the local curvature value $\kappa$, with red indicating high-curvature regions (sharp turns) and green indicating low-curvature regions. (d) The final refined trajectory. The adaptive resampling mechanism generates a dense set of points in high-curvature areas and a sparse set in low-curvature areas, resulting in a smooth, efficient, and high-fidelity path $X_{ref}$.
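As a rough illustration of the resampling idea in Figure 8, the sketch below interpolates sparse waypoints with a uniform Catmull-Rom spline and allocates samples per segment in proportion to a discrete turning-angle proxy for curvature. The `base_n` and `gain` parameters are invented for the example; the ACRSR described in the paper drives the sampling density from the analytic curvature $\kappa$ along the spline rather than this simplification.

```python
import numpy as np

def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom point between p1 and p2 at parameter t in [0, 1]."""
    return 0.5 * (2 * p1
                  + (-p0 + p2) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3)

def resample_by_curvature(waypoints, base_n=8, gain=40.0):
    """Densify a sparse waypoint list, spending more samples where the
    discrete turning angle (a crude curvature proxy) is large."""
    pts = [np.asarray(w, dtype=float) for w in waypoints]
    padded = [pts[0]] + pts + [pts[-1]]          # clamp spline endpoints
    out = [pts[0]]
    for i in range(len(pts) - 1):
        p0, p1, p2, p3 = padded[i], padded[i + 1], padded[i + 2], padded[i + 3]
        # turning angle at the segment's start point
        a, b = p1 - p0, p2 - p1
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        angle = 0.0 if denom < 1e-9 else np.arccos(
            np.clip(np.dot(a, b) / denom, -1.0, 1.0))
        n = base_n + int(gain * angle)           # more samples on sharp turns
        for t in np.linspace(0.0, 1.0, n, endpoint=False)[1:]:
            out.append(catmull_rom(p0, p1, p2, p3, t))
        out.append(pts[i + 1])
    return np.array(out)
```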
Figure 9. Quantitative evaluation of spatial registration accuracy. (a) 3D spatial comparison between the robot’s measured positions ($P_{Robot}$) and the MR-transformed coordinates ($P_{Holo}$). (b) Euclidean error distribution for all samples, indicating an overall MAE of 1.80 mm.
Figure 10. Unity-based digital twin simulation environment.
Figure 11. Trajectory fidelity evaluation in the digital twin environment. (a) 3D comparison of the Ground Truth, user inputs, standard TP trajectory, and the proposed ACRSR trajectory. (b) Error distribution relative to Ground Truth. ACRSR (blue) maintains superior fidelity, especially in the high-curvature region ($x < 6$), whereas the TP method (purple) exhibits a significant spike around $x \approx 4.8$.
Figure 12. Butt joint welding case study. Each subfigure is vertically divided into two parts: the upper portion depicts the physical robot in the real environment, while the lower portion shows the corresponding MR simulation. (a) Initiation stage: Operator placing sparse waypoints via MR gestures. (b) Execution stage: Virtual robot simulating the refined trajectory. (c) Termination stage: Physical robot executing the weld.
Figure 13. Sealant application case study. Each subfigure is vertically divided into two parts: the upper portion depicts the physical robot in the real environment, while the lower portion shows the corresponding MR simulation. (a) Initiation stage: Sparse waypoints defined on the 3D panel. (b) Stable stage: Simulated trajectory with surface-normal orientations. (c) Corner transition stage: Handling of high-curvature regions. (d) Termination stage: Physical robot applying the sealant.
Figure 14. Individual task performance comparison for all nine participants. Each plot shows the task completion time for the Welding and Sealant tasks using both the Teach Pendant and the MR System, with the percentage reduction in time indicated.
Figure 15. Comparison of mean System Usability Scale (SUS) scores for each questionnaire item across the two experimental conditions (MR System vs. Teach Pendant). Error bars represent standard deviations. The MR system shows consistently higher scores on positive items (odd numbers) and lower scores on negative items (even numbers).
Table 1. Comparative evaluation of the MR system and the teach pendant.

Task                  Method          TCT (s)
Butt Welding          MR System       70.2 ± 11.4
                      Teach Pendant   127.0 ± 9.4
Sealant Application   MR System       109.4 ± 9.3
                      Teach Pendant   220.3 ± 20.0
Table 2. Detailed breakdown of SUS questionnaire scores (Mean). Comparison between the MR System and the Teach Pendant (N = 9).

Question                                                                                        MR System   Teach Pendant
1. I think that I would like to use this system frequently                                      4.35        2.80
2. I found the system unnecessarily complex                                                     1.70        3.50
3. I thought the system was easy to use                                                         4.25        3.00
4. I think that I would need the support of a technical person to be able to use this system    1.75        3.20
5. I found the various functions in this system were well integrated                            4.45        3.50
6. I thought there was too much inconsistency in this system                                    1.70        3.00
7. I would imagine that most people would learn to use this system very quickly                 4.35        2.50
8. I found the system very cumbersome to use                                                    1.70        3.50
9. I felt very confident using the system                                                       4.28        3.00
10. I needed to learn a lot of things before I could get going with this system                 1.75        3.50
Global SUS Score (0–100 Scale)                                                                  82.7        45.3
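The global scores are consistent with Brooke’s standard SUS scoring rule applied to the item means: odd-numbered (positively worded) items contribute (score − 1), even-numbered items contribute (5 − score), and the total is scaled by 2.5 onto a 0–100 scale. The check below reproduces 82.7 for the MR system; the Teach Pendant reconstructs to 45.25, matching the reported 45.3 up to rounding (the published value was presumably computed per participant before averaging).

```python
# Worked check of Table 2 using the standard SUS scoring rule.
mr = [4.35, 1.70, 4.25, 1.75, 4.45, 1.70, 4.35, 1.70, 4.28, 1.75]
tp = [2.80, 3.50, 3.00, 3.20, 3.50, 3.00, 2.50, 3.50, 3.00, 3.50]

def sus(items):
    # Index 0 holds item 1 (odd-numbered, positive wording) -> (score - 1);
    # even-numbered items are negatively worded -> (5 - score).
    return 2.5 * sum((s - 1) if i % 2 == 0 else (5 - s)
                     for i, s in enumerate(items))

print(f"MR System:     {sus(mr):.2f}")  # 82.70 -> reported as 82.7
print(f"Teach Pendant: {sus(tp):.2f}")  # 45.25 -> reported as 45.3
```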
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
