Virtual, Augmented, and Mixed Reality Robotics-Assisted Deep Reinforcement Learning Towards Smart Manufacturing
Abstract
1. Introduction
2. Approach: System Model Utilizing Virtual, Augmented, and Mixed-Reality
2.1. VAM Network Communication Framework
2.2. Task Offloading Model
2.2.1. Local Processing
2.2.2. Edge Server Execution
2.3. Digital Twin Architecture and Components
- Physical Layer: Includes the UR3e welding robot, welding torch, sensors (e.g., position and force sensors), and auxiliary systems (e.g., a wire-shearing and gun-cleaning station). These generate real-time data such as joint angles, weld current, and workpiece position.
- Data Layer: Aggregates static data (e.g., equipment geometry and workstation layout) and dynamic data (e.g., sensor readings and welding process parameters). Data are transmitted via UDP, TCP, and HTTP, with JSON formatting for interoperability.
- Server Layer: Processes data for motion planning, kinematics, and optimization using algorithms such as the improved probabilistic roadmap (PRM) and the adaptive neuro-fuzzy inference system (ANFIS). It synchronizes the physical and virtual models, updating the simulation in real time.
- Simulation Layer: Built on the VAM platform (Unity-based), it visualizes the welding process, robot trajectories, and environmental interactions. The VAM interface supports AR/VR for operator interaction.
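As a minimal sketch of how a dynamic-data sample from the Physical Layer might be packaged as JSON for the Data Layer, consider the following. The class name `WeldTelemetry`, its field names, and the sample values are hypothetical and chosen only for illustration; the article does not specify a message schema.

```python
import json
import time
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class WeldTelemetry:
    """One dynamic-data sample from the physical layer (hypothetical schema)."""
    timestamp_s: float                  # acquisition time on the robot side
    joint_angles_rad: List[float]       # six joint angles of the arm
    weld_current_a: float               # welding current in amperes
    workpiece_position_mm: List[float]  # x, y, z of the workpiece


def encode(sample: WeldTelemetry) -> bytes:
    """Serialize a sample to compact UTF-8 JSON for a UDP/TCP/HTTP payload."""
    return json.dumps(asdict(sample), separators=(",", ":")).encode("utf-8")


def decode(payload: bytes) -> WeldTelemetry:
    """Rebuild the sample on the server side; raises TypeError on a schema mismatch."""
    return WeldTelemetry(**json.loads(payload.decode("utf-8")))


sample = WeldTelemetry(
    timestamp_s=time.time(),
    joint_angles_rad=[0.0, -1.57, 1.57, 0.0, 1.57, 0.0],
    weld_current_a=120.0,
    workpiece_position_mm=[250.0, 0.0, 40.0],
)
payload = encode(sample)
print(len(payload), "bytes")
```

Keeping the payload transport-agnostic means the same bytes can be sent over any of the three protocols named above; only the framing differs.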
3. Optimization Framework Using Lyapunov Theory
3.1. Objective Formulation
3.2. Queue Stability and Lyapunov Drift
3.3. Digital Twin-Predicted Perturbation
3.4. Integration of Lyapunov Optimization and DRL
4. Deep Reinforcement Learning for Real-Time Optimization
4.1. System State and Action Space
4.2. DRL-Based Policy Optimization
- Actor Network: Proposes actions based on the current state using a parameterized policy.
- Critic Network: Evaluates the proposed action by estimating the state value and computing the advantage function.
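One common form of the advantage, assuming a one-step temporal-difference estimate with a state-value critic, is sketched below. The symbols $\pi_\theta$ (actor), $V_\phi$ (critic), $r_t$ (reward), and $\gamma$ (discount factor) are notation assumed here for illustration, not notation confirmed by the article.

```latex
a_t \sim \pi_\theta(\cdot \mid s_t), \qquad
A(s_t, a_t) = r_t + \gamma\, V_\phi(s_{t+1}) - V_\phi(s_t)
```

A positive $A(s_t, a_t)$ indicates that the action did better than the critic expected from state $s_t$, so the actor update increases its probability.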
4.3. Asynchronous Learning and Experience Sharing
4.4. RL Framework
5. Experiments and Results
5.1. Data Communication
- Synchronization Accuracy: Measured as the latency between physical sensor data updates and their reflection in the virtual model. Experiments achieved an average latency of 50 ms, ensuring real-time synchronization.
- Simulation Fidelity: Assessed by comparing virtual weld trajectories to physical weld outcomes (e.g., seam accuracy within mm). The VAM platform accurately replicated of physical trajectories in simulation.
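The synchronization metric above can be sketched as the mean difference between paired timestamps. This assumes that the physical and virtual sides share a synchronized clock and that each sample can be matched across both; the function name and the example timestamps are hypothetical.

```python
from statistics import mean


def mean_sync_latency_ms(pairs):
    """Average delay in ms between a sensor update and its appearance in the virtual model.

    pairs: iterable of (t_physical_s, t_virtual_s) timestamps for the same sample.
    """
    return mean((t_virtual - t_physical) * 1000.0 for t_physical, t_virtual in pairs)


# Illustrative timestamps only, not measured data.
example = [(0.000, 0.048), (0.100, 0.152), (0.200, 0.250)]
print(f"{mean_sync_latency_ms(example):.1f} ms")
```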
5.2. Welding Robot System Server System
5.3. Inverse Kinematics Experiment
Metric | Gantry (Iterative) | Gantry (ANFIS) | Robot Arm (Iterative) | Robot Arm (ANFIS) |
---|---|---|---|---|
Training time | None | 15 min | None | ≈120 h |
Solving time | 30 ms | 10 ms | 80 ms | 15 ms |
Accuracy | >99% | >99% | 98% | 95% |
5.4. Improved Probabilistic Roadmap Algorithm
- Adaptive Node Sampling: Unlike uniform sampling, our approach adjusts the sampling density based on workspace complexity. Regions near obstacles or welding seams are sampled more densely to ensure sufficient roadmap coverage, while open areas use sparse sampling to reduce the computational overhead. This is achieved by estimating obstacle proximity using a distance transform of the workspace.
- Dynamic Edge Connection: To minimize collision checks, we employ a dynamic k-nearest neighbor strategy. The number of neighbors (k) varies based on local node density, reducing unnecessary connections in crowded regions. Additionally, edges are prioritized based on their length and collision risk, improving roadmap efficiency.
- Cost-Aware Path Optimization: After constructing the roadmap, we apply a path optimization step that minimizes a weighted cost function combining path length and obstacle clearance. This ensures the robot follows shorter, safer trajectories suitable for welding tasks.
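The three ideas above can be sketched in a toy 2D workspace. This is an illustrative approximation, not the authors' implementation: an analytic distance to disc obstacles stands in for the distance transform (a grid-based transform such as SciPy's `distance_transform_edt` could replace it), and the obstacle layout, acceptance rule, density radius, and weight `w_clear` are all assumptions.

```python
import heapq
import math
import random

OBSTACLES = [((0.5, 0.5), 0.15)]  # (centre, radius) discs in a unit square; illustrative


def clearance(p):
    """Signed distance from point p to the nearest obstacle boundary."""
    return min(math.dist(p, c) - r for c, r in OBSTACLES)


def sample_nodes(n, rng, scale=0.1, p_min=0.2):
    """Adaptive sampling: accept free points more often when they lie near obstacles."""
    nodes = []
    while len(nodes) < n:
        p = (rng.random(), rng.random())
        d = clearance(p)
        if d > 0 and rng.random() < max(p_min, math.exp(-d / scale)):
            nodes.append(p)
    return nodes


def edge_clearance(a, b, step=0.01):
    """Smallest clearance over points sampled along the straight edge a-b."""
    steps = max(1, int(math.dist(a, b) / step))
    return min(
        clearance((a[0] + (b[0] - a[0]) * t / steps, a[1] + (b[1] - a[1]) * t / steps))
        for t in range(steps + 1)
    )


def build_roadmap(nodes, k_min=4, k_max=12, radius=0.1, w_clear=0.02):
    """Dynamic k-NN: use fewer neighbours where many nodes already lie within `radius`."""
    graph = {i: {} for i in range(len(nodes))}
    for i, p in enumerate(nodes):
        by_dist = sorted((math.dist(p, q), j) for j, q in enumerate(nodes) if j != i)
        density = sum(1 for d, _ in by_dist if d < radius)
        k = max(k_min, k_max - density)
        for d, j in by_dist[:k]:
            c = edge_clearance(p, nodes[j])
            if c > 0:  # edge is collision-free at the sampled resolution
                graph[i][j] = graph[j][i] = d + w_clear / c  # length plus clearance penalty
    return graph


def shortest_path(graph, start, goal):
    """Dijkstra over the weighted roadmap; returns a list of node indices or None."""
    best, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == goal:
            path = [u]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        if cost > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            if cost + w < best.get(v, math.inf):
                best[v], prev[v] = cost + w, u
                heapq.heappush(heap, (cost + w, v))
    return None


rng = random.Random(0)
nodes = [(0.1, 0.1), (0.9, 0.9)] + sample_nodes(300, rng)  # start, goal, then samples
graph = build_roadmap(nodes)
print(shortest_path(graph, 0, 1))
```

Folding the clearance penalty into the edge weight lets a single shortest-path query trade path length against safety; raising `w_clear` pushes the path further from obstacles. A `None` result means the roadmap did not connect start and goal, in which case more samples or a larger `k_min` would be the usual remedy.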
5.5. VAM Experiment
- Adaptation of Curriculum Learning (CL): The purpose is to improve training efficiency for arc welding motion planning by progressively increasing task difficulty. Training starts with simple welding paths (straight lines with no obstacles) and gradually adds complexity, such as curves, multiple seams, and varying environments. Metrics such as success rates or reward thresholds are used to adjust task difficulty adaptively.
- Domain Randomization (DR): The purpose is to improve the generalization of RL policies for sim-to-real transfer. Environmental factors such as arc temperature, noise levels, and material reflectivity are randomized, with parameters sampled from Gaussian distributions biased towards the current curriculum difficulty level, as described in the paper.
- Reinforcement Learning Framework: The RL model uses Soft Actor–Critic (SAC) or a similar off-policy algorithm for better sample efficiency. The reward function is based on proximity to the welding path, weld quality metrics, and collision avoidance, and it adapts dynamically to the current curriculum level to encourage performance on harder tasks.
- PID Gains Scheduling: In welding, control gains are adjusted dynamically for precision near critical welding points, such as intersections or tight curves, similar to the force-control strategies used in the paper.
- Simulation and Validation: The digital twin setup serves as the simulated environment for training and testing. For real-world validation, learned policies are tested in physical arc welding tasks with adjustable tolerances to mirror the peg-in-hole transfer experiments.
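The curriculum, Gaussian domain randomization, and gain scheduling described above can be sketched as three small helpers. All thresholds, the window size, the standard deviation, and the assumption that gains are raised (rather than lowered) near critical points are illustrative choices, not values taken from the article.

```python
import random


class AdaptiveCurriculum:
    """Move task difficulty in [0, 1] up or down based on a windowed success rate."""

    def __init__(self, step=0.1, up=0.8, down=0.4, window=20):
        self.level, self.step, self.up, self.down = 0.0, step, up, down
        self.window, self.results = window, []

    def record(self, success: bool) -> float:
        """Log one episode outcome; re-evaluate the level once per full window."""
        self.results.append(success)
        if len(self.results) >= self.window:
            rate = sum(self.results) / len(self.results)
            if rate >= self.up:
                self.level = min(1.0, self.level + self.step)
            elif rate <= self.down:
                self.level = max(0.0, self.level - self.step)
            self.results.clear()
        return self.level


def sample_gaussian_dr(level, low, high, rng, rel_std=0.15):
    """Draw a parameter from a Gaussian centred on the current difficulty level."""
    mean = low + level * (high - low)
    value = rng.gauss(mean, rel_std * (high - low))
    return min(high, max(low, value))  # clip to the allowed range


def scheduled_gain(base_gain, dist_to_critical_mm, boost=2.0, radius_mm=10.0):
    """Blend from boost*base_gain at a critical point down to base_gain at radius_mm."""
    closeness = max(0.0, 1.0 - dist_to_critical_mm / radius_mm)
    return base_gain * (1.0 + (boost - 1.0) * closeness)


rng = random.Random(0)
curriculum = AdaptiveCurriculum()
for _ in range(20):  # a window of successful episodes raises the level once
    level = curriculum.record(True)
noise = sample_gaussian_dr(level, low=0.0, high=1.0, rng=rng)
print(level, noise, scheduled_gain(1.5, dist_to_critical_mm=2.0))
```

In a training loop, `record` would be called after each episode, `sample_gaussian_dr` once per randomized parameter at reset, and `scheduled_gain` inside the controller at every step.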
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix A.1. Motion Planning Experiment: Development of the Digital Twin
Joint | (mm) | (rad) | (mm) | (rad) |
---|---|---|---|---|
1 | 3050 | 0 | ||
2 | 0 | |||
3 | 0 |
Joint | (mm) | (rad) | (mm) | (rad) |
---|---|---|---|---|
1 | 0 | |||
2 | 0 | 0 | ||
3 | 0 | 0 | ||
4 | 0 | |||
5 | 0 | |||
6 | 0 | 0 |
Appendix A.1.1. Comparison with Standard PRM
N. of Nodes | Algorithm | Learning Time (s) | Planning Time (ms) | Connect Time (ms) |
---|---|---|---|---|
500 | Standard | 20.5 | 1500 | 1300 |
500 | Improved | 22.3 | 1310 | 1218 |
1000 | Standard | 45.0 | 1700 | 1400 |
1000 | Improved | 47.2 | 1459 | 925.43 |
5000 | Standard | 290.0 | 900 | 450 |
5000 | Improved | 305.2 | 593 | 316.34 |
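To make the comparison easier to read, the relative change of the improved PRM against the standard PRM can be computed directly from the rows above. This is a small worked calculation on the tabulated values only; the dictionary layout and function name are illustrative.

```python
# (nodes, algorithm) -> (learning time s, planning time ms, connect time ms), copied from the table
ROWS = {
    (500, "Standard"): (20.5, 1500, 1300),
    (500, "Improved"): (22.3, 1310, 1218),
    (1000, "Standard"): (45.0, 1700, 1400),
    (1000, "Improved"): (47.2, 1459, 925.43),
    (5000, "Standard"): (290.0, 900, 450),
    (5000, "Improved"): (305.2, 593, 316.34),
}


def relative_change(standard, improved):
    """Percentage change of the improved value relative to the standard one."""
    return 100.0 * (improved - standard) / standard


for n in (500, 1000, 5000):
    std, imp = ROWS[(n, "Standard")], ROWS[(n, "Improved")]
    learn, plan, conn = (relative_change(s, i) for s, i in zip(std, imp))
    print(f"{n:>5} nodes: learning {learn:+.1f}%, planning {plan:+.1f}%, connect {conn:+.1f}%")
```

On these figures, the improved variant costs roughly 5 to 9% more learning time, while planning time drops by about 13%, 14%, and 34% and connect time by about 6%, 34%, and 30% at 500, 1000, and 5000 nodes, respectively.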
Appendix A.1.2. Explanation of Figure A11
Appendix A.2. Solving Inverse Kinematics Using ANFIS
Appendix A.3. Water Tank Welding Process
Appendix A.4. Welding Robot Workstation
- The physical robot is the actual welding production system, consisting of a welding robot and equipment such as a welding torch, a welding system, controllers, and wire-shearing and gun-clearing devices. It also includes welding-specific information such as the welding process, welding current, and robot position.
- Digital twin data consist of three main components: physical unit data, virtual unit data, and service system data. Through data transfer, interaction, and updates between the layers, they deliver analysis, verification, and decision information to the service system, giving it access to updates from both the physical and virtual sides of the system.
- The server system relies on digital twin data, which drive functions such as logic driving and motion control of the digital twin. It analyzes and optimizes the welding process, including the welding process parameters and the welding path of the physical welding robot station. These data are then mapped to the simulation platform, enabling motion simulation of the welding process. In this way, the service system can manage and improve the welding operations while keeping the physical and virtual sides of the system connected.
- Simulation comprises a digital representation of the welding robot workstation (e.g., workstation layout, physical equipment, environment, and other production components). This virtual model facilitates the mapping of the digital space to the physical space of the welding workstation.
Appendix A.5. Data for Welding Robot Workstation
Appendix B. Motion Planning Experiment
Method | Accuracy (mm) | Convergence Time (s) | Stability Index | CPU Time (ms) |
---|---|---|---|---|
ANFIS | 0.52 | 1.8 | 0.96 | 30 |
PID | 1.34 | 3.2 | 0.78 | 25 |
FLC | 0.91 | 2.6 | 0.81 | 28 |
Number of Nodes (N) | Component Connection | Learning Time (s) | Planning Time (ms) | | | | Connect Time (ms) |
---|---|---|---|---|---|---|---|
500 | Connected | 22.3 | 1310 | Failed | 1015 | 1168 | 1218 |
1000 | Connected | 47.2 | 1459 | 2678 | 678 | 778 | 925.43 |
1500 | Connected | 72.4 | 633 | 1367 | 1056 | 1265 | 765.56 |
2000 | Connected | 99.8 | 867 | 956 | 457 | 865 | 557.54 |
2500 | Connected | 129.3 | 923 | 1789 | 686 | 1344 | 468.68 |
3000 | Connected | 163.2 | 454 | 985 | 557 | 984 | 378.76 |
3500 | Connected | 196.5 | 778 | 867 | 586 | 868 | 376.56 |
4000 | Connected | 225.8 | 563 | 786 | 678 | 676 | 354.31 |
4500 | Connected | 262.3 | 680 | 845 | 675 | 545 | 322.23 |
5000 | Connected | 305.2 | 523 | 593 | 876 | 657 | 316.34 |
Method | Trapezoid (SR, Avg Time) | Star (SR, Avg Time) |
---|---|---|
No Curriculum | 0.870, 9.762 s | 0.570, 11.585 s |
Linear Curriculum UDR | 1.000, 9.844 s | 0.750, 9.044 s |
Linear Curriculum GDR | 1.000, 11.735 s | 0.810, 11.717 s |
Adp. Curriculum UDR | 1.000, 6.875 s | 0.700, 6.768 s |
Adp. Curriculum GDR | 1.000, 8.493 s | 0.800, 7.960 s |
Adp. Curriculum UDR DyRe | 1.000, 8.411 s | 0.840, 11.635 s |
Adp. Curriculum GDR DyRe | 1.000, 8.584 s | 0.940, 11.850 s |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Le, T.; Vinh, L.Q.; Pham, V.H. Virtual, Augmented, and Mixed Reality Robotics-Assisted Deep Reinforcement Learning Towards Smart Manufacturing. Sensors 2025, 25, 3349. https://doi.org/10.3390/s25113349