Article

A Cyber-Physical Integrated Framework for Developing Smart Operations in Robotic Applications

Department of Industrial and Systems Engineering, Chung Yuan Christian University, Taoyuan City 320314, Taiwan
*
Author to whom correspondence should be addressed.
Electronics 2025, 14(15), 3130; https://doi.org/10.3390/electronics14153130
Submission received: 23 May 2025 / Revised: 24 July 2025 / Accepted: 24 July 2025 / Published: 6 August 2025
(This article belongs to the Topic Innovation, Communication and Engineering)

Abstract

The traditional manufacturing industry is facing the challenge of digital transformation, which involves the enhancement of intelligence and production efficiency. Many robotic applications have been discussed to enable collaborative robots to perform operations smartly rather than merely automatically. This article tackles the issue of equipping robots with cognitive and coordination capabilities by introducing cyber-physical integration technology. The authors propose a system architecture with open-source software and low-cost hardware based on the 5C hierarchy and then conduct experiments to verify the proposed framework. These experiments involve the collection of real-time data using a depth camera, object detection to recognize obstacles, simulation of collision avoidance for a robotic arm, and cyber-physical integration to perform a robotic task. The proposed framework realizes the scheme of the 5C architecture of Industry 4.0 and establishes a digital twin in cyberspace. By utilizing connection, conversion, calculation, simulation, verification, and operation, the robotic arm is capable of making independent judgments and appropriate decisions to successfully complete the assigned task, thereby verifying the proposed framework. Such a cyber-physical integration system is characterized by low cost and good effectiveness.

1. Introduction

In recent years, the robotics industry has experienced significant growth due to market demand. Traditional industrial robots have encountered safety challenges and struggled to collaborate effectively with humans. This led to the emergence of collaborative robots, which have addressed these issues and transformed automation. Previously, robots were primarily used for tasks such as transportation, repetitive actions, and work in hazardous environments. However, with advancements in sensors and peripheral devices supported by software, robots now handle various tasks efficiently. Research on human–robot collaboration (HRC) reveals the growing prevalence of collaborative robots, driven primarily by advances in vision systems, with depth cameras being the most commonly used sensors. The integration of cameras with augmented reality applications is also becoming increasingly common, typically with a focus on improving safety and productivity, for example, in collision avoidance, human–computer interaction, and ergonomics. Sbaragli et al. [1] proposed a task-insensitive cyber-physical architecture designed to monitor human-centric reconfigurable manufacturing systems. This architecture aims to meet mass customization needs while ensuring cost-effective flexibility and performance.
Baheti and Gill [2] mentioned the term cyber-physical system (CPS), which refers to a new generation of systems that integrate computational and physical capabilities, enabling interaction with humans through various new modes. The ability to interact with and extend the functionality of the physical world through computation, communication, and control is crucial for future technological developments. Lou et al. [3] stated the emergence of Industry 5.0 that integrates humans into cyber-physical systems (HCPS) to offset drawbacks on both sides. HCPS applications during the design, production, and service phases can greatly benefit engineers in the field of intelligent manufacturing. This review examines key enabling technologies, including human ability augmentation, human–robot interaction, digital twins, human-cyber-physical data fusion, crowdsourcing, and system modeling.
Although Industry 5.0 is an emerging concept in the manufacturing industry and a further evolution and extension of Industry 4.0, we believe that at the stage of Industry 4.0, the technology of cyber-physical systems has not yet been fully established, and many technologies still need to be developed. Jay Lee et al. [4] proposed the five-layer framework for cyber-physical systems, abbreviated as 5C, which consists of connection, conversion, cyber, cognition, and configuration. The 5C framework has to ensure real-time data collection from the physical world and send transformed information back to the network space through reliable connectivity, and then construct intelligent data management, analysis, and computation in cyberspace. Digital perception is a critical technology in CPS, which aims to extract important information from sensors so that the physical conditions can be understood. This information is then converted, analyzed, calculated, and interacted with through networked modeling. This helps equipment acquire the knowledge needed to handle changing environments.
Digital twin (DT) plays an important role in the CPS framework. Adil Rasheed et al. [5] noted that the term digital twin refers to the virtual representation of physical assets achieved through data and simulators, used for real-time prediction, optimization, monitoring, and decision improvement. Shaaban et al. [6] stated that the importance of digital twins in HRC lies in their ability to offer a comprehensive and accurate representation of real-world systems. This feature enables applications in research fields such as safety, teleoperation, and human intention recognition. Despite advancements in this domain, DTs suffer from scalability issues and are limited by low-performance hardware. Currently, the engineering community is primarily driven by physics-based modeling methods. This approach involves observing the physical phenomena of interest, developing a partial understanding of them, and ultimately expressing and solving them in the form of mathematical equations. However, due to the partial understanding and numerous assumptions made in the process from observing phenomena to solving equations, a significant portion of physical problems is often overlooked.
A robust DT may assist industries in various applications. Rolofs et al. [7] incorporated DTs into energy management systems to improve energy efficiency in manufacturing factories. The advancement of digital twin technologies holds significant potential for creating more sustainable factories. Caiza and Sanz [8] implemented a 3D immersive DT connected to a Manufacturing Execution System (MES) for real-time process monitoring and control, improving factory visibility and decision-making under Industry 4.0. Baniqued et al. [9] presented a ROS–Unity-based immersive digital twin framework for operating and managing robot arm fleets. The authors suggested that the three usability themes arising from their study were the flexibility of the software interface, the individuality of each robot in the fleet, and adaptation through expanding sensor visualization capabilities.
Regarding visualization, object detection models based on convolutional neural networks (CNNs) are widely used in tasks such as facial recognition, autonomous driving, object classification, and quality management. YOLO (You Only Look Once), inspired by GoogLeNet, is a one-stage real-time object detection method: it requires only a single pass of a CNN over the image to determine the positions and categories of the objects in it, thus improving recognition speed. YOLO's simple structure makes it ideal for fast, real-time target detection and suitable for robust applications. The YOLO algorithm has continuously evolved, improving accuracy and speed from generation to generation; Wang and colleagues launched YOLOv4 in 2020 and proposed YOLOv9 in 2024 [10]. Compared to other current object detection algorithms, its speed and accuracy generally outperform the rest. Therefore, this study trains a YOLO model to recognize the locations of obstacles and provide the necessary information for simulations.
Robot simulation can help address many uncertainties in real-world scenarios and reduce the likelihood of errors. As technology matures, the integration of robotic systems becomes increasingly important. Eric Rohmer et al. [11] noted that as the functionalities of robotic systems become more powerful, simulations become more complex. They introduced a universal, scalable robotic simulation framework called V-REP, which is used in many academic and industrial fields and is considered comprehensive simulation software. V-REP has since been advanced into CoppeliaSim. This framework allows the direct integration of various control techniques and simplifies model deployment, making it easier to achieve model simulations.
Belda et al. [12] noted that the full integration of robots into recent cyber-physical factories (CPFs) is a step on the way to the full robotization of production. The gap between theory and the software tools used in practice is still large, owing to the limited availability of theoretical resources and limited development and testing. Their paper discussed hierarchical CPS control architectures for articulated robot arms in smart factories, from kinematics to model-based control and rapid prototyping tools.
Therefore, there are several technical issues to overcome for building a cyber-physical system as follows:
  • Inconsistent communication protocols: Robots of different brands (such as ABB, KUKA, and Fanuc) use different communication protocols and interfaces. It is challenging to standardize data formats and control commands when integrating virtual and physical systems.
  • Limited data access: Much robot arm data (such as joint angles, loads, temperatures, and error codes) cannot be read directly from the controller, or requires a specific SDK/API. Some manufacturers lock data in proprietary software, increasing the difficulty of integration.
  • High real-time requirements: If the virtual system (such as a digital twin) cannot reflect the arm status in real time, it will affect the accuracy of prediction or control. Excessive delay may cause motion deviation or even damage the equipment.
  • Complex digital twin modeling: High-level modeling is required to accurately simulate the kinematics, dynamics, collision detection, etc., of the robot arm. If the simulation is inconsistent with the physical arm, maintenance or process problems cannot be accurately predicted.
This paper tackles some of these issues to propose a feasible and cost-effective approach to a cyber-physical integrated framework for robotic applications. In this framework, the abovementioned concepts and techniques are applied to construct a smart robotic arm with a sensible task environment. It is also challenging to build communication between a virtual system and a physical system in industrial applications, especially from physical space to cyberspace. The depth camera used in this framework can play a significant role in object tracking and spatial mapping, capturing the physical information and then transforming data into cyberspace. In addition, the integrated framework focuses on the development and application of open-source software and general-purpose hardware to achieve feasible goals with lower cost but good effectiveness.
The paper will first introduce the system architecture and hardware setup, followed by the software algorithms that are integrated into the cyber-physical system.

2. Materials and Methods

In this research, virtual and physical integration technology is the key concept to construct the system operation of the robot arm. This section will further explain the software, hardware and experimental scenarios employed in the study.

2.1. System Architecture

The 5C hierarchy comprises five layers, from bottom to top: connection, conversion, cyber, cognition, and configuration, as explained in Figure 1 [4].
This study applies the 5C hierarchy to the design of the cyber-physical integration framework, as shown in Table 1. Such hierarchical layers can be a roadmap to construct a smart manufacturing system. Our proposed framework of a cyber-physical integration system comprises hardware and software components as follows.
  • Hardware: A six-axis robot arm, the Niryo Ned, and a depth camera, the Intel RealSense D435i, are used in this research. The computer acts as the main control unit, wirelessly connected to the Niryo Ned robotic arm via TCP/IP. The depth camera is connected via a USB cable.
  • Software: The system integration framework emphasizes the need for a modular and standardized approach to managing robotic systems. Therefore, a ROS environment is essential. While ROS 1 is being phased out, it still offers advantages such as mature tools and easier setup, making it a suitable choice for prototyping; however, ROS 2 is now the recommended platform for new developments. The framework's workflow is implemented in the ROS environment and comprises several steps. It begins with real-time object recognition using the depth camera and the YOLO object detection model. Next, object positions are simulated, and the Niryo Ned robotic arm model and kinematic calculation method are established. The robot path is planned using the artificial potential field method, and the object information and path planning results are integrated. Feasibility is verified, and collisions are detected in the CoppeliaSim simulator. If there are no collisions, the physical robot executes the planned path through ROS nodes. The RVIZ visualization tool is used to monitor the operation status. The entire architecture of the robotic system is shown in Figure 2, and a minimal sketch of the node-based data flow is given below.
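To make the node-based data flow above concrete, the following minimal ROS 1 (rospy) sketch shows a coordinator node that subscribes to obstacle positions from the vision node and publishes a verified joint trajectory toward the robot driver. The topic names (/obstacle_info, /planned_path) and message types are illustrative assumptions rather than the exact interfaces of our implementation.

```python
#!/usr/bin/env python
# Minimal ROS 1 (rospy) sketch of the cyber-layer data flow described above.
# Topic names (/obstacle_info, /planned_path) and message types are illustrative
# assumptions, not the exact interfaces used in this study.
import rospy
from geometry_msgs.msg import Point
from trajectory_msgs.msg import JointTrajectory, JointTrajectoryPoint

latest_obstacle = None

def obstacle_callback(msg):
    """Store the most recent obstacle position published by the vision node."""
    global latest_obstacle
    latest_obstacle = (msg.x, msg.y, msg.z)

def main():
    rospy.init_node("cps_coordinator")
    rospy.Subscriber("/obstacle_info", Point, obstacle_callback)
    path_pub = rospy.Publisher("/planned_path", JointTrajectory, queue_size=1)
    rate = rospy.Rate(10)  # 10 Hz update loop

    while not rospy.is_shutdown():
        if latest_obstacle is not None:
            # In the real system, the APF planner and the CoppeliaSim collision
            # check run here; this sketch only publishes a placeholder point.
            traj = JointTrajectory()
            traj.joint_names = ["joint_%d" % i for i in range(1, 7)]
            pt = JointTrajectoryPoint(positions=[0.0] * 6,
                                      time_from_start=rospy.Duration(1.0))
            traj.points.append(pt)
            path_pub.publish(traj)
        rate.sleep()

if __name__ == "__main__":
    main()
```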

2.2. Experimental Setup

The experimental setup is composed of six portions, labeled by the numbers shown in Figure 3. Label 1 is a six-axis robotic arm that picks up and places workpieces. Label 2 is a depth camera used for collecting coordinate and depth information. Label 3 is a correction board employed for the initial positioning by the depth camera. Label 4 represents an obstacle, in the form of a box, which the robotic arm must avoid during movement. The starting point where the robotic arm clamps the object is at label 5, while the target point where the robotic arm places the object is at label 6. The experiments were conducted under normal indoor lighting (300–500 lux), which is sufficient for the depth camera to capture images and detect objects.

2.3. Depth Camera Principle

Depth cameras, also known as 3D cameras, can generate depth information along with RGB 2D images. This allows for a more detailed understanding of the environment being captured. There are three main measurement technologies used to gather depth information: stereo vision, structured light, and time of flight. The depth camera used in this platform is very cost-effective and applicable for object tracking, spatial mapping, etc. It also has a rich ecosystem with open-source resources. Although it has lower precision and stability compared to commercial systems such as Photoneo’s and Cognex’s, our proposed system is a good test platform and ideal for prototyping, research, and educational purposes, especially for academic institutions with limited resources.
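As an illustration of how such a camera is accessed in practice, the sketch below uses the pyrealsense2 and OpenCV packages to stream aligned color and depth frames from a D435i and query the metric depth at a pixel; the 640 × 480 / 30 FPS configuration is an illustrative default, not a tuned setting from this study.

```python
# Sketch: acquire aligned RGB and depth frames from an Intel RealSense D435i.
# Stream resolution and frame rate are illustrative defaults, not tuned values.
import numpy as np
import cv2
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)  # align depth pixels to the color image

try:
    while True:
        frames = align.process(pipeline.wait_for_frames())
        depth_frame = frames.get_depth_frame()
        color_frame = frames.get_color_frame()
        if not depth_frame or not color_frame:
            continue
        color = np.asanyarray(color_frame.get_data())
        # Query metric depth (in meters) at the image center.
        z = depth_frame.get_distance(320, 240)
        cv2.putText(color, "depth @ center: %.3f m" % z, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
        cv2.imshow("RealSense", color)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
finally:
    pipeline.stop()
    cv2.destroyAllWindows()
```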
When 3D cameras are applied in the field of robotics, the issue of coordinate transformation needs to be considered. In coordinate transformation, it is necessary to ascertain the positional relationship between the robot and the 3D camera and obtain the corresponding parameters through calibration methods. Tailor et al. [13] described real-time color detection based on image processing, utilizing a six-axis arm for object classification. They suggest that real-time calibration of the robotic arm contributes to the actual configuration of the end effector. As a prerequisite for this real-time calibration, schematic diagrams of the workspace and various two-dimensional work planes have been conceptualized. The two-dimensional area is roughly divided into two parts: active and complementary workspaces. The active workspace refers to the area where the robot system dynamically grasps objects in two dimensions; the complementary workspace consists of nine different points in a 3 × 3 grid pattern drawn on white paper. Additionally, three different planes are included in the complementary workspace, divided into regions A, B, and C. A represents the 2D area of the gripper, directly mapped onto the complementary workspace; B represents the plane area of the camera’s field of view; and C represents the physical workspace of the robotic arm, as shown in Figure 4. Tadic et al. [14] used Intel RealSense depth cameras for visual applications in robotics to test their capabilities. The results showed that the depth sensors were appropriate for applications where obstacle avoidance and robot spatial orientation are required in coexistence with image vision algorithms.

2.4. YOLO Model Training

YOLO (You Only Look Once) is a single-stage, real-time object detection algorithm that frames object detection as a regression problem. It predicts both bounding boxes and class probabilities directly from full images in one evaluation of a convolutional neural network (CNN). Bochkovskiy et al. [15] developed YOLOv4, which has been widely used by researchers; the series has since advanced to YOLOv9 [10]. The general steps of YOLO can be described as follows.
  • Input Preprocessing
    • Resize the input image to a fixed size (e.g., 416 × 416 or 640 × 640 pixels).
    • Normalize pixel values.
    • Pass the image through the CNN backbone.
  • Feature Extraction
    • Use a deep CNN to extract hierarchical feature maps from the image.
    • Different versions use different backbones.
  • Grid Division
    • Divide the image into an S × S grid.
    • Each grid cell is responsible for detecting objects whose center falls within that cell.
  • Bounding Box Prediction
    Each grid cell predicts bounding boxes, where each box includes:
      ■ Center coordinates (x, y) (relative to the grid cell),
      ■ Width (w) and height (h) (relative to the whole image),
      ■ Confidence score = Pr(Object) × IoU(pred, truth).
  • Class Probability Prediction
    • Each grid cell also predicts C class probabilities (conditional class probabilities: Pr (Class_i|Object)).
  • Final Predictions
    • Multiply confidence score with class probabilities to obtain class-specific confidence scores.
    • Final score = Pr (Class_i) × IoU (pred, truth).
  • Post-Processing
    • Thresholding: Discard bounding boxes below a confidence threshold.
    • Non-Maximum Suppression (NMS): Eliminate overlapping boxes to keep only the most confident prediction per object.
  • Output
    • A list of detected objects, each with:
      Class label,
      Bounding box coordinates (x, y, w, h),
      Confidence score.
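To make the class-probability, final-prediction, and post-processing steps concrete, the following simplified sketch shows how class-specific confidence scores are formed, thresholded, and filtered with non-maximum suppression. It is a framework-agnostic NumPy illustration, not the Darknet implementation used in this work, and the threshold values are placeholders.

```python
# Simplified illustration of YOLO post-processing: class-specific scores,
# confidence thresholding, and non-maximum suppression (NMS).
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def postprocess(boxes, objectness, class_probs, conf_thresh=0.9, nms_thresh=0.45):
    """boxes: (N, 4) corner boxes; objectness: (N,); class_probs: (N, C)."""
    scores = objectness[:, None] * class_probs        # class-specific confidence scores
    class_ids = scores.argmax(axis=1)
    best = scores.max(axis=1)
    keep = best >= conf_thresh                        # confidence thresholding
    boxes, best, class_ids = boxes[keep], best[keep], class_ids[keep]

    order = best.argsort()[::-1]                      # highest score first
    selected = []
    while order.size > 0:
        i = order[0]
        selected.append(i)
        rest = order[1:]
        mask = np.array([iou(boxes[i], boxes[j]) < nms_thresh for j in rest], dtype=bool)
        order = rest[mask]                            # drop boxes overlapping the kept box
    return boxes[selected], best[selected], class_ids[selected]
```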
This research uses the YOLO object detection model to detect objects. In our research, we use boxes as obstacles, which can have different properties based on the specific requirements. The model needs to go through training before performing object detection. The training process is shown in Figure 5. The entire training process is briefly described as follows.
The first step involves capturing approximately 100 images of the boxes from various angles, resulting in a total of 850 images. It is important for these images to have minimal background clutter. The second step is to annotate and classify the objects in the images using LabelImg software v1.5.0 (see Figure 6), resulting in a series of XML files. After this, in step three, the dataset is divided into four sets: training, validation, testing, and training+validation. Finally, the annotated XML files are converted into the TXT format used by YOLO, containing the class number, center coordinates (x, y), and the detection box width and height.
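The final conversion step can be sketched as follows: a Pascal-VOC-style XML file produced by LabelImg is parsed and rewritten in the normalized YOLO TXT format (class number, center x, center y, width, height). The file paths and the single-class list are illustrative assumptions.

```python
# Sketch: convert a LabelImg (Pascal VOC) XML annotation to YOLO TXT format.
# Paths and the class list are illustrative.
import xml.etree.ElementTree as ET

CLASSES = ["box"]  # assumed single obstacle class

def voc_to_yolo(xml_path, txt_path):
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find("size/width").text)
    img_h = float(root.find("size/height").text)
    lines = []
    for obj in root.findall("object"):
        cls_id = CLASSES.index(obj.find("name").text)
        b = obj.find("bndbox")
        xmin, ymin = float(b.find("xmin").text), float(b.find("ymin").text)
        xmax, ymax = float(b.find("xmax").text), float(b.find("ymax").text)
        # YOLO expects the box center and size normalized by the image dimensions.
        xc = (xmin + xmax) / 2.0 / img_w
        yc = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append("%d %.6f %.6f %.6f %.6f" % (cls_id, xc, yc, w, h))
    with open(txt_path, "w") as f:
        f.write("\n".join(lines))

voc_to_yolo("images/box_001.xml", "labels/box_001.txt")  # illustrative paths
```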

2.5. Coordinate Conversion

Takayuki Nakamura [16] proposed a method for object tracking based on the integration of 3D depth and color information, employing a setup comprising a depth camera and a color camera. The paper assumes that all intrinsic and extrinsic parameters of the calibrated 3D depth camera and RGB camera are known beforehand, as these parameters are pre-determined. With this, the coordinate transformation between the 3D camera coordinates and RGB coordinates can be obtained. To determine the position of objects in a workspace, object recognition technology can be used to acquire center coordinates and depth information from an image stream. However, since these coordinates are typically in pixel format, they need to be converted into three-dimensional machine coordinates for the robotic arm to be applicable. We employed the nine-point calibration method in the experiment to obtain relevant coordinate positions. The process of transforming the coordinates is depicted in Figure 7 and involves three main steps.
In the first step, a nine-point positioning board is utilized to obtain the two-dimensional pixel coordinates and depth information of the nine points using object recognition. Then, these data are applied to a formula to convert them into camera coordinates. In the second step, based on the nine-point positioning board, the machine coordinates are obtained relative to the nine-point pixel coordinates. The third step involves adding the transformed nine-point camera coordinates to the nine-point machine coordinates. With the help of formulaic conversion, the final rotation translation matrix can be obtained, enabling the calculation of machine coordinates.
Equation (1) can be used as a reference for the coordinate transformation formula.
$$
Z \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}
\tag{1}
$$
The symbols are defined as follows:
$Z$: the depth of the object in the image.
$u, v$: the pixel coordinates of the object in the image.
$f_x, f_y$: the focal lengths in pixels.
$u_0, v_0$: the coordinates of the image center point.
$r_{ij}\ (i, j = 1, 2, 3)$: the elements of the 3 × 3 rotation matrix.
$t_i\ (i = 1, 2, 3)$: the elements of the 3 × 1 translation vector.
$X_w, Y_w, Z_w$: the three-dimensional coordinates of the object in the world coordinate system.
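Using the intrinsic part of Equation (1), a pixel (u, v) with a measured depth Z can be back-projected into camera coordinates before the rotation–translation step. The sketch below shows this with placeholder intrinsic values; in practice the values come from the camera calibration.

```python
# Sketch: back-project a pixel with depth to camera coordinates using the
# intrinsic part of Equation (1). Intrinsic values here are placeholders;
# the real values are read from the RealSense calibration.
import numpy as np

fx, fy = 615.0, 615.0      # focal lengths in pixels (illustrative)
u0, v0 = 320.0, 240.0      # principal point (illustrative)

def pixel_to_camera(u, v, Z):
    """Return (Xc, Yc, Zc) in the camera frame for pixel (u, v) at depth Z."""
    Xc = (u - u0) * Z / fx
    Yc = (v - v0) * Z / fy
    return np.array([Xc, Yc, Z])

print(pixel_to_camera(400, 260, 0.52))  # e.g., a pixel measured at 0.52 m depth
```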

2.6. Robotic Kinematics

Regarding robot motion analysis, complex mathematical methods can be employed to obtain solutions. Jacques Denavit and Richard Hartenberg [17] proposed the Denavit–Hartenberg (DH) method to describe the kinematics of robotic arms. It is a systematic approach that attaches coordinate systems to the links of a robotic arm, simplifying homogeneous transformations. This method only requires four parameters to describe the position and orientation of adjacent reference frames. By providing the desired position and orientation to the end effector of the mechanical arm, the inverse kinematics model can calculate the corresponding angles of each joint.
Joao Almeida et al. [18] proposed an inverse kinematics algorithm for the Niryo One robotic arm, dividing inverse kinematics into translation and rotation. The algorithm considers rotation about the x, y, and z axes, giving the robotic arm three rotational and three translational degrees of freedom. The links near the base are defined as the translational part, consisting of the first three joints, while the rotational part involves the fourth, fifth, and sixth joints. Due to the sixth joint's 5.5 mm offset in the z direction, the last three joint axes do not intersect at a single point, resulting in an infinite number of acceptable solutions for the end-effector. To address this issue, the 5.5 mm offset is neglected, allowing the algorithm to generally produce up to eight different solutions. Finally, considering the physical limitations of the joints, the best solution is selected from the eight possible solutions.
In CoppeliaSim, a virtual model and kinematic calculation method for the Niryo Ned robotic arm are established to simulate motion paths. The study employs the D-H parameter method, as illustrated in Figure 8. First, a D-H table is created based on the robotic arm’s configuration to define spatial relationships, as shown in Table 2. Second, the robot description file is edited to configure parameters and establish joint and link relationships. Third, kinematics are calculated using the D-H table, utilizing MATLAB (v.2018a) Robotics System Toolbox, Python (v.3.8) visual-kinematics package, and ROS’s MoveIt tool. This process facilitates the simulation of various arm poses for path planning.
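As a compact illustration of the forward-kinematics step derived from the D-H table, the NumPy sketch below chains the standard per-joint homogeneous transforms using the parameters of Table 2. It is a simplified sketch: fixed joint-angle offsets of the physical arm are omitted, and it is not the MATLAB/MoveIt pipeline actually used in the study.

```python
# Sketch: forward kinematics of the Niryo Ned from its D-H table (Table 2).
# Each row is (d, a, alpha); theta comes from the joint state. Fixed angle
# offsets of the real arm are omitted for brevity.
import numpy as np

DH = [  # (d, a, alpha) per joint, following Table 2
    (0.175,  0.0,    np.pi / 2),
    (0.0,    0.221,  0.0),
    (0.0,    0.0325, np.pi / 2),
    (0.235,  0.0,   -np.pi / 2),
    (0.0,    0.0,    np.pi / 2),
    (0.04,   0.0,    0.0),
]

def dh_transform(theta, d, a, alpha):
    """Standard D-H homogeneous transform from frame i-1 to frame i."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(joint_angles):
    """Multiply the per-joint transforms; returns the end-effector pose (4x4)."""
    T = np.eye(4)
    for theta, (d, a, alpha) in zip(joint_angles, DH):
        T = T @ dh_transform(theta, d, a, alpha)
    return T

pose = forward_kinematics([0.0] * 6)
print(np.round(pose[:3, 3], 4))  # end-effector position at the zero pose
```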

2.7. Obstacle Avoidance Path Planning

Oussama Khatib [19] proposed the artificial potential field method as a path-planning algorithm used to navigate robots from an initial point to a target point. The virtual potential field consists of both attractive and repulsive fields. The target generates an attraction force, creating an attraction field, while obstacles generate repulsive forces, forming a repulsive field. The potential energy in this artificial field is influenced by both attraction and repulsion, with high potential energy near obstacles and low potential energy near the target. As a result of the physics of potential energy, an object will naturally move from positions of high potential energy to those of lower potential energy. In pathfinding, an object can navigate from its starting position to a target position by following a path that avoids obstacles and minimizes potential energy. The target exerts an attraction force that covers the entire map, allowing the object to move toward the target from any location. However, obstacles exert a repelling force that prevents the object from getting too close to them, thereby ensuring that the object avoids collisions during its approach.
The APF method is characterized by simple principles, smooth paths, and strong real-time performance, making it well suited for real-time path planning. In our research, we utilize modified repulsive force calculations and potential field construction, integrated into the simulation software, to determine whether collisions will occur as the robotic arm moves. The flowchart of this framework is shown in Figure 9. Yao et al. [20] applied the artificial potential field (APF) method as a path-planning technique that constructs a virtual potential field within the robot's environment. Xia et al. [21] improved the algorithm so that collaborative robots can work flexibly and safely, with a view to future medical and surgical scenarios.
As mentioned, the robot’s environment is defined by using potential fields, guiding the robot to avoid obstacles and move towards the target point through the combination of attractive and repulsive forces. The attractive force pulls objects towards the destination, while the repulsive force prevents collisions during movement, as shown in Figure 10. Finally, the gradient descent method is utilized with the artificial potential field to guide the object along the negative gradient direction, resulting in the planning of a collision-free path as shown in Figure 11.
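The attractive/repulsive potential and gradient-descent stepping described above can be sketched in two dimensions as follows. The gains, influence radius, step size, and the planar simplification are illustrative choices, not the parameters of our modified implementation; the classical APF can also get stuck in local minima, so these values may need tuning.

```python
# Sketch: artificial potential field (APF) path planning with gradient descent.
# Gains, the obstacle influence radius, and the 2D setting are illustrative.
import numpy as np

K_ATT = 1.0      # attractive gain
K_REP = 1e-5     # repulsive gain (kept small because of the 1/rho^2 growth)
RHO0 = 0.05      # obstacle influence radius (m)
STEP = 0.01      # gradient-descent step size

def total_grad(q, goal, obstacles):
    """Gradient of the combined attractive + repulsive potential at point q."""
    grad = K_ATT * (q - goal)                     # attractive part pulls toward the goal
    for obs in obstacles:
        diff = q - obs
        rho = np.linalg.norm(diff)
        if 1e-6 < rho < RHO0:
            # Gradient of Khatib's repulsive potential 0.5*K*(1/rho - 1/rho0)^2,
            # which grows steeply as the point approaches the obstacle.
            grad += -K_REP * (1.0 / rho - 1.0 / RHO0) / (rho ** 2) * (diff / rho)
    return grad

def plan_path(start, goal, obstacles, max_iters=2000, tol=0.01):
    """Follow the negative gradient from start toward the goal."""
    q, goal = np.asarray(start, float), np.asarray(goal, float)
    path = [q.copy()]
    for _ in range(max_iters):
        q = q - STEP * total_grad(q, goal, obstacles)
        path.append(q.copy())
        if np.linalg.norm(q - goal) < tol:
            break
    return np.array(path)

# Illustrative 2D example with one obstacle near the straight-line path.
waypoints = plan_path(start=[0.25, -0.06], goal=[0.36, 0.06],
                      obstacles=[np.array([0.31, -0.01])])
print(len(waypoints), "waypoints; endpoint:", np.round(waypoints[-1], 3))
```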

3. Experimental Results

This section will show the results with demonstrated figures and their explanations.

3.1. YOLO Model Training Result

The newer YOLO series methods can be viewed as the current state-of-the-art (SOTA) object detection technology. For example, YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and achieves the highest accuracy (56.8% AP) among all known real-time object detectors running at 30 FPS or higher on a V100 GPU [22]. The related literature on YOLO has reported good results. After 4000 iterations of model training, the final average loss was 0.5112, as shown in Figure 12. Typically, a lower average loss is preferred, ideally ranging from 0.05 to 3.0. Through image inspection of the training outcomes, most of the boxes achieve satisfactory recognition results, as shown in Figure 13. The quality of recognition can be judged by the confidence level, where lower confidence indicates lower relevance to the target object. Additionally, low confidence may result from factors such as low resolution, background interference, or an insufficient number of samples; thus, filtering can be applied through conditional judgments. In the experiment, a confidence threshold of 90 was set for box recognition to enhance accuracy. Upon completion of testing, the model can be integrated into the obstacle detection process. Utilizing the Intel® RealSense™ D435i depth camera, object center pixel coordinates and depth information can be annotated in the stream using OpenCV and pyrealsense2 functions, as shown in Figure 14.
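The annotation step in Figure 14 can be sketched as follows: for each detection box returned by the trained model, the box center pixel is looked up in the aligned depth frame and drawn on the color stream with OpenCV. The detections variable and its (x, y, w, h, label) layout are assumptions for illustration; the acquisition loop is omitted.

```python
# Sketch: overlay each detected obstacle's center pixel and metric depth on
# the color stream. `detections` is assumed to be a list of (x, y, w, h, label)
# boxes produced by the trained YOLO model.
import cv2

def annotate(color_image, depth_frame, detections):
    for (x, y, w, h, label) in detections:
        cx, cy = int(x + w / 2), int(y + h / 2)           # box center in pixels
        z = depth_frame.get_distance(cx, cy)              # depth in meters
        cv2.rectangle(color_image, (int(x), int(y)),
                      (int(x + w), int(y + h)), (0, 255, 0), 2)
        cv2.circle(color_image, (cx, cy), 4, (0, 0, 255), -1)
        cv2.putText(color_image, "%s (%d,%d) %.3f m" % (label, cx, cy, z),
                    (int(x), int(y) - 8), cv2.FONT_HERSHEY_SIMPLEX,
                    0.5, (0, 255, 0), 1)
    return color_image
```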

3.2. Coordinate Conversion Result

According to the coordinate transformation description, the principle is implemented using the least squares method. The known data include the nine-point pixel coordinates, the nine-point machine coordinates, and the intrinsic parameters. Based on these data, the nine-point pixel coordinates and the intrinsic parameters are first used to obtain the nine-point camera coordinates. To verify whether the results are acceptable, the original pixel coordinates are substituted into the formula with the intrinsic and extrinsic parameters to obtain the converted machine coordinates. The results are rounded to three decimal places, and the errors between the original and converted machine coordinates are checked to confirm that they fall within a reasonable range, as shown in Table 3 and Table 4. The results show that the Xw error is between −0.002 and 0.003, and the Yw error is between −0.002 and 0. This performance meets the work requirements.
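One common way to realize this least-squares step is an SVD-based (Kabsch) fit of the rotation and translation mapping the nine camera-frame points onto the corresponding machine coordinates, as sketched below with placeholder arrays in place of the measured calibration data.

```python
# Sketch: least-squares (SVD/Kabsch) fit of the rotation R and translation t
# mapping camera coordinates to robot (machine) coordinates from the nine
# calibration points. The arrays below are placeholders for the measured data.
import numpy as np

def fit_rigid_transform(cam_pts, robot_pts):
    """cam_pts, robot_pts: (N, 3) corresponding points. Returns R (3x3), t (3,)."""
    cc, rc = cam_pts.mean(axis=0), robot_pts.mean(axis=0)
    H = (cam_pts - cc).T @ (robot_pts - rc)        # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                       # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = rc - R @ cc
    return R, t

# Placeholder data standing in for the nine measured point pairs.
cam_pts = np.random.rand(9, 3)
robot_pts = cam_pts + np.array([0.1, -0.05, 0.02])   # synthetic: pure translation
R, t = fit_rigid_transform(cam_pts, robot_pts)
residual = np.linalg.norm(cam_pts @ R.T + t - robot_pts, axis=1)
print("max residual (should be ~0 for this synthetic case):", residual.max())
```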

3.3. Virtual Model Establishment and Kinematic Calculation

To enable motion planning for the robotic arm model, we need to import the robot description file into both CoppeliaSim and RVIZ to create the robotic arm model, as shown in Figure 15. In CoppeliaSim, we implement basic motion planning for the robot through Lua scripts. In RVIZ, we establish a Python listener node responsible for receiving obstacle and collision information, while another publisher node is responsible for sending messages to execute obstacle avoidance path planning results on the real robotic arm. Additionally, the corresponding nodes for the lower-level operations related to the robotic arm also need to be activated, including device connection, joint states, virtual-physical integration, and coordinate information.
After creating the robotic arm model, we input the parameters from the D-H table into the program to perform forward and inverse kinematics calculations. We conduct bidirectional verification using both the MATLAB Robotics System Toolbox and the Python visual-kinematics package. Through visualization windows, we confirm that the computed results match the real robotic arm, as shown in Figure 16. This is achieved by setting several Cartesian coordinate points and computing the corresponding six joint angles of the robotic arm via inverse kinematics. Due to the existence of multiple solutions in inverse kinematics, we need to search for the corresponding solutions based on the relevant data of the robotic arm.

3.4. Artificial Potential Field Path Planning

In this experiment, obstacle avoidance path planning adopts a modified artificial potential field method. The potential field is established through attraction and repulsion forces, and the final path planning is executed using the gradient descent algorithm. The robotic arm and boxes are set as obstacle points, with the pick-up location as the starting point and the drop-off location as the endpoint. Path planning stops upon reaching the endpoint, as shown in Figure 17. Subsequently, all path planning points go through kinematic calculations. Finally, the calculated results are transmitted to CoppeliaSim simulation software to verify whether the robotic arm encounters collisions during movement. In case of collisions, understanding of collision situations can be facilitated through subscribed topics or warning prompts in the simulator, as shown in Figure 18, to facilitate parameter adjustments. If no collisions occur, a signal is sent to the robotic arm to execute the path planning. It is important to note that the coordinate system for path planning needs to align with the robot coordinate system, primarily based on the respective positions of hardware devices in the scenario.

3.5. Resultant Demonstration

Based on the 5C framework, all the workflows as mentioned above are integrated. A snapshot of the computer screen is shown in Figure 19 with annotations, including the live screen, monitoring terminal, data exchange screen, user interface, object detection screen, virtual interaction in RVIZ, and verification window in CoppeliaSim.
As shown in Figure 20, the example demonstrates generating five path planning points using the artificial potential field method, including the start and end points. The path planning results will vary depending on the positions of the obstacles. Collision detection is performed in the simulation to verify the feasibility of the path planning. If a collision occurs during the robotic arm’s simulation test, a warning will be issued, and re-implementation of the obstacle avoidance path planning will be also required. If no collision occurs, the robotic arm will be notified to execute the pick-and-place task based on the obstacle avoidance path.
As shown in Figure 21, the demonstrated example is divided into three parts. First, the physical robotic arm executes the obstacle avoidance planning process, as shown in Figure 21b. Second, Figure 21c shows the start and end points at which the robotic arm picks up and places the object. Third, during the pick-and-place task, the operational status can be monitored remotely and synchronously through RVIZ, as shown in Figure 21a.

3.6. Discussion

In this experiment, the proposed approach is feasible for achieving the intended goal. Nevertheless, it should be noted that the real-time feedback mechanism involves first checking the robotic arm's path using the simulation software, after which the results are passed to the physical robotic arm for execution. The assumption of this case study is that the obstacle is static. If the obstacle position changes, the planned path may no longer be valid. Therefore, the responsiveness of the CPS in a dynamic environment is more complicated. When obstacles in the environment are moving, obstacle avoidance for robotic arms becomes a dynamic problem. Unlike static environments, it requires prediction and motion adaptation. The CPS would then require algorithms such as deep-learning-based trackers to predict the future trajectories of moving obstacles. Based on current and predicted obstacle positions, the robot continuously updates its planned path. For example, rapidly-exploring random trees (RRT) and the dynamic window approach (DWA) can be applied for dynamic path planning.
However, the proposed cyber-physical system (CPS) framework has been tested for feasibility under the current conditions. If necessary, this framework can serve as a basis for accommodating additional algorithms for various situations. Additional algorithms and upgraded equipment are required to deal with more complex situations, and this issue can be an area of future research. For example, ISO 10218-1/2 [23,24] sets the stage for robot safety in general, while ISO/TS 15066 [25] provides the specific guidelines for safe human–robot collaboration. A CPS-based robotic system may support and enhance the application of ISO 10218-1/2 and ISO/TS 15066 by enabling real-time situational awareness, dynamic control, adaptive safety behavior, and data transparency. These capabilities can make CPS an ideal architecture for compliant and intelligent robotic systems in Industry 4.0 and human–robot collaborative environments.
In our proposed framework, there is a vision system to capture the 3D data and assist the robot in performing tasks according to the kinematic analysis. By adding additional sensors (e.g., force/torque), CPS can use the collected data and closed-loop feedback to enforce these physical limits dynamically. The CPS can model and simulate risk zones using digital twins or AI-driven behavior prediction, and then adapt robot behavior (e.g., slow down, reroute, or stop) based on real-time changes in the environment. Such a CPS framework can enhance transparency and support functional safety lifecycle management (e.g., hazard analysis, testing, validation).

4. Conclusions

The aim of this research is to develop a practical framework that utilizes cyber-physical integration for robotic applications, and such a system is characterized by low cost and good effectiveness. Using this CPS-based framework, we performed experiments with a robotic arm for pick-and-place operations and collision avoidance. Digital twin technology was employed to establish interaction, transformation, and computation between the physical and the virtual space. Although the application scenario is simple, the proposed framework was verified to be feasible. In the future, this model will enable the robotic arm to make decisions based on its surroundings to ensure coordination between the human and the robot. This framework can serve as a foundation for future extensions to build smart robotic applications using a CPS-based approach.

Author Contributions

Conceptualization, T.-L.L.; methodology, T.-L.L. and P.-C.C.; software, P.-C.C.; validation, T.-L.L., Y.-H.C. and K.-C.H.; formal analysis, T.-L.L. and P.-C.C.; writing—original draft preparation, T.-L.L. and P.-C.C.; writing—review and editing, T.-L.L., Y.-H.C. and K.-C.H.; project administration, T.-L.L.; funding acquisition, T.-L.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Council, Taiwan, grant number NSTC 113-2221-E-033-049 and CYCU.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets presented in this article are not readily available due to technical limitations.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Sbaragli, A.; Ghafoorpoor, P.Y.; Thiede, S.; Pilati, F. A cyber-physical architecture to monitor human-centric reconfigurable manufacturing systems. J. Intell. Manuf. 2025. [Google Scholar] [CrossRef]
  2. Baheti, R.; Gill, H. Cyber-physical systems. In The Impact of Control Technology, 1st ed.; Samad, T., Annaswamy, A., Eds.; IEEE Control Systems Society: Piscataway, NJ, USA, 2011; pp. 161–166. [Google Scholar]
  3. Lou, S.; Hu, Z.; Zhang, Y.; Feng, Y.; Zhou, M.; Lv, C. Human-Cyber-Physical System for Industry 5.0: A Review From a Human-Centric Perspective. IEEE Trans. Autom. Sci. Eng. 2024, 22, 494–511. [Google Scholar] [CrossRef]
  4. Lee, J.; Bagheri, B.; Kao, H.A. A cyber-physical systems architecture for industry 4.0-based manufacturing systems. Manuf. Lett. 2015, 3, 18–23. [Google Scholar] [CrossRef]
  5. Rasheed, A.; San, O.; Kvamsdal, T. Digital Twin: Values, Challenges and Enablers from a Modeling Perspective. IEEE Access 2020, 8, 21980–22012. [Google Scholar] [CrossRef]
  6. Shaaban, M.; Carfì, A.; Mastrogiovanni, F. Digital Twins for Human-Robot Collaboration: A Future Perspective. arXiv 2023, arXiv:2311.02421. [Google Scholar]
  7. Rolofs, G.; Wilking, F.; Goetz, S.; Wartzack, S. Integrating Digital Twins and Cyber-Physical Systems for Flexible Energy Management in Manufacturing Facilities: A Conceptual Framework. Electronics 2024, 13, 4964. [Google Scholar] [CrossRef]
  8. Caiza, G.; Sanz, R. An Immersive Digital Twin Applied to a Manufacturing Execution System for the Monitoring and Control of Industry 4.0 Processes. Appl. Sci. 2024, 14, 4125. [Google Scholar] [CrossRef]
  9. Baniqued, P.; Bremner, P.; Sandison, M.; Harper, S.; Agrawal, S.; Bolarinwa, J.; Blanche, J.; Jiang, Z.; Johnson, T.; Mitchell, D.; et al. Multimodal immersive digital twin platform for cyber–physical robot fleets in nuclear environments. J. Field Robot. 2024, 41, 1521–1540. [Google Scholar] [CrossRef]
  10. Wang, C.Y.; Yeh, I.H.; Liao, H.Y. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. arXiv 2024, arXiv:2402.13616. [Google Scholar]
  11. Rohmer, E.; Singh, S.P.; Freese, M. V-REP: A versatile and scalable robot simulation framework. In Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013; pp. 1321–1326. [Google Scholar]
  12. Belda, K.; Venkrbec, L.; Jirsa, J. Modelling, Control Design and Inclusion of Articulated Robots in Cyber-Physical Factories. Actuators 2025, 14, 129. [Google Scholar] [CrossRef]
  13. Tailor, P.; Roy, D.; Jagtap, K.; ali bhojani, K.; Vasant Badhan, K.; Prakash, A.; Atpadkar, V. Mono Camera-based Localization of Objects to Guide Real-time Grasp of a Robotic Manipulator. In Proceedings of the Advances in Robotics-5th International Conference of the Robotics Society, Kanpur, India, 30 June–4 July 2021; Association for Computing Machinery: New York, NY, USA, 2021; Volume 42, pp. 1–8. [Google Scholar] [CrossRef]
  14. Tadic, V.; Toth, A.; Vizvari, Z.; Klincsik, M.; Sari, Z.; Sarcevic, P.; Sarosi, J.; Biro, I. Perspectives of RealSense and ZED Depth Sensors for Robotic Vision Applications. Machines 2022, 10, 183. [Google Scholar] [CrossRef]
  15. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  16. Nakamura, T. Real-time 3-D object tracking using Kinect sensor. In Proceedings of the 2011 IEEE International Conference on Robotics and Biomimetics, Phuket, Thailand, 7–11 December 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 784–788. [Google Scholar]
  17. Denavit, J.; Hartenberg, R.S. A kinematic notation for lower-pair mechanisms based on matrices. J. Appl. Mech. 1955, 22, 215–221. [Google Scholar] [CrossRef]
  18. Almeida, J.; Rosa, D.; Viegas, G. Direct and Inverse Kinematics of Serial Manipulators (Nyrio One 6-Axis Robotic Arm). 2021. Available online: https://www.scribd.com/document/731672012/Direct-and-Inverse-Kinematics-of-Serial-Manipulators-Nyrio-One-6-axis-Robotic-Arm (accessed on 15 July 2024).
  19. Khatib, O. Real-Time Obstacle Avoidance for Manipulators and Mobile Robots. Int. J. Robot. Res. 1986, 5, 90–98. [Google Scholar]
  20. Yao, Q.; Zheng, Z.; Qi, L.; Yuan, H.; Guo, X.; Zhao, M.; Liu, Z.; Yang, T. Path planning method with improved artificial potential field—A reinforcement learning perspective. IEEE Access 2020, 8, 135513–135523. [Google Scholar] [CrossRef]
  21. Xia, X.; Li, T.; Sang, S.; Cheng, Y.; Ma, H.; Zhang, Q.; Yang, K. Path Planning for Obstacle Avoidance of Robot Arm Based on Improved Potential Field Method. Sensors 2023, 23, 3754. [Google Scholar] [CrossRef] [PubMed]
  22. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475. [Google Scholar]
  23. ISO 10218-1:2025; Robotics—Safety Requirements, Part 1: Industrial Robots, Edition 3. 2025. Available online: https://www.iso.org/standard/73933.html (accessed on 15 July 2024).
  24. ISO 10218-2:2025; Robotics—Safety Requirements, Part 2: Industrial Robot Applications and Robot Cells, Edition 2. 2025. Available online: https://www.iso.org/standard/73934.html (accessed on 15 July 2024).
  25. ISO/TS 15066:2016; Robots and Robotic Devices—Collaborative Robots, Edition 1. 2016. Available online: https://www.iso.org/standard/62996.html (accessed on 15 July 2024).
Figure 1. 5C architecture for implementation of cyber-physical system [4].
Figure 2. System execution architecture.
Figure 3. The setup of the experimental area.
Figure 4. Schematic diagram of the robot’s real-time calibration work area and various planes [5].
Figure 5. Training process diagram for the YOLO model.
Figure 6. Examples of using LabelImg with green frames marked.
Figure 7. Coordinate conversion process.
Figure 8. Model creation and kinematic calculation process diagram.
Figure 9. The path planning method using artificial potential field.
Figure 10. Attractive (red) and repulsive (green) forces.
Figure 11. The path generated with APF method.
Figure 12. Loss graph.
Figure 13. Image recognition results.
Figure 14. Stream image of YOLO recognition.
Figure 15. Creation of the robotic arm models (left: in CoppeliaSim; right: in RVIZ).
Figure 16. Visualization of kinematic calculation results.
Figure 17. Example of obstacle avoidance path planning.
Figure 18. Warning prompt with red color on the robot due to collision.
Figure 19. A snapshot of the CPS operating environment.
Figure 20. Validation of the generated path in CoppeliaSim.
Figure 21. (a) Synchronized screen. (b) Actual execution screen. (c) Pick and place.
Table 1. Cyber-physical integration framework based on the 5C architecture.
Connection Layer
  • The computer is connected to the RealSense D435i depth camera and Niryo Ned robotic arm via USB and TCP/IP, respectively.
  • The depth camera obtains target objects from the stream using the YOLO object detection method and returns the coordinates and depth information.
  • The robotic arm receives coordinate commands and provides status feedback.
Conversion Layer
  • The raw data collected by the depth camera is processed and applied to calculations such as obstacle simulation, obstacle avoidance path planning, and verification.
Cyber Layer
  • In the ROS environment, the integration of the robotic arm and obstacle information is achieved through interaction between nodes to establish models in simulation software.
Cognitive Layer
  • The depth camera perceives the surrounding environment, determines the presence of obstacles in the workspace, and interacts and computes through the network, providing feedback on the status in the visualization tool.
Configuration Layer
  • The physical robotic arm makes decisions to perform the task from the resultant planning path after the simulation software has checked the feasibility of the path.
Table 2. Niryo Ned D-H table.

Joint i | θi | di (m) | ai (m) | αi | Range of Rotation Angles
1 | θ1 | 0.175 | 0 | 90° | (−170°, 170°)
2 | θ2 | 0 | 0.221 | 0° | (−120°, 35°)
3 | θ3 | 0 | 0.0325 | 90° | (−77°, 90°)
4 | θ4 | 0.235 | 0 | −90° | (−120°, 120°)
5 | θ5 | 0 | 0 | 90° | (−100°, 55°)
6 | θ6 | 0.04 | 0 | 0° | (−145°, 145°)
Table 3. Error between actual value and predicted value of machine coordinate X.

Point i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Actual value | 0.252 | 0.25 | 0.247 | 0.304 | 0.302 | 0.301 | 0.359 | 0.357 | 0.357
Predicted value | 0.252 | 0.249 | 0.247 | 0.303 | 0.304 | 0.301 | 0.358 | 0.36 | 0.355
Error | ~0 | 0.001 | ~0 | 0.001 | −0.002 | ~0 | 0.001 | −0.003 | 0.002
Table 4. Error between actual value and predicted value of machine coordinate Y.

Point i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Actual value | −0.061 | 0 | 0.062 | −0.06 | 0.001 | 0.063 | −0.058 | 0.002 | 0.064
Predicted value | −0.061 | 0 | 0.062 | −0.06 | −0.001 | 0.064 | −0.057 | 0.002 | 0.065
Error | ~0 | ~0 | ~0 | ~0 | −0.002 | −0.001 | −0.001 | ~0 | −0.001
