Probing Augmented Intelligent Human–Robot Collaborative Assembly Methods Toward Industry 5.0

Nie, Qingwei; Shen, Yiping; Ma, Ye; Zhang, Shuqi; Zong, Lujie; Zheng, Ze; Zhangwa, Yunbo; Chen, Yu

doi:10.3390/electronics14153125

Open AccessArticle

Probing Augmented Intelligent Human–Robot Collaborative Assembly Methods Toward Industry 5.0

by

Qingwei Nie

¹

,

Yiping Shen

²,

Ye Ma

^3,*,

Shuqi Zhang

⁴,

Lujie Zong

³,

Ze Zheng

³,

Yunbo Zhangwa

³ and

Yu Chen

³

¹

School of Mechanical Engineering, Yangzhou University, Yangzhou 225000, China

²

Shanghai Spaceflight Precision Machinery Institute, Shanghai 201600, China

³

College of Mechanical and Electrical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

⁴

The 28th Research Institute of China Electronics Technology Group Corporation, Nanjing 210016, China

^*

Author to whom correspondence should be addressed.

Electronics 2025, 14(15), 3125; https://doi.org/10.3390/electronics14153125

Submission received: 26 June 2025 / Revised: 4 August 2025 / Accepted: 5 August 2025 / Published: 5 August 2025

(This article belongs to the Special Issue Human–Robot Interaction and Communication Towards Industry 5.0)

Download

Browse Figures

Versions Notes

Abstract

Facing the demands of Human–Robot Collaborative (HRC) assembly for complex products under Industry 5.0, this paper proposes an intelligent assembly method that integrates Large Language Model (LLM) reasoning with Augmented Reality (AR) interaction. To address issues such as poor visibility, difficulty in knowledge acquisition, and strong decision dependency in the assembly of complex aerospace products within confined spaces, an assembly task model and structured process information are constructed. Combined with a retrieval-augmented generation mechanism, the method realizes knowledge reasoning and optimization suggestion generation. An improved ORB-SLAM2 algorithm is applied to achieve virtual–real mapping and component tracking, further supporting the development of an enhanced visual interaction system. The proposed approach is validated through a typical aerospace electronic cabin assembly task, demonstrating significant improvements in assembly efficiency, quality, and human–robot interaction experience, thus providing effective support for intelligent HRC assembly.

Keywords:

human–robot collaborative assembly; augmented reality; knowledge reasoning; virtual–reality mapping; large language model

1. Introduction

Driven by a new wave of technological revolution and industrial transformation, the manufacturing industry is accelerating toward a development stage characterized by high-end, intelligent, and personalized production. The deep integration of next-generation information technologies with advanced manufacturing techniques has given rise to the evolution of intelligent manufacturing—from automation to autonomous intelligence. Represented by the European Union’s “Industry 5.0” initiative, this new wave of industrial transformation emphasizes human-centric manufacturing, advocating for deep collaboration between humans and robots and aiming to achieve a unified integration of technological empowerment and human value [1]. Against this backdrop, Human–Robot Collaboration (HRC) has become a key direction in the evolution of intelligent manufacturing [2]. This trend is particularly evident in high-end equipment manufacturing sectors such as aerospace, where increasing complexity, precision requirements, and customization in assembly processes render traditional assembly methods—reliant on manual experience and paper-based process documentation—insufficient to meet the demands for high efficiency, high quality, and traceability in production [3].

Complex aerospace products—particularly during critical assembly processes in confined spaces such as launcher electronic compartments—are often confronted with challenges such as multi-component integration, high-precision requirements, and spatial constraints [4]. Traditional manual assembly methods, lacking effective task planning and real-time guidance, are prone to errors, inefficiency, and a lack of traceability, thereby falling short of meeting the current demands for high reliability and rapid responsiveness in production. Consequently, there is an urgent need to develop novel human–robot integrated assembly approaches to overcome the cognitive and operational limitations of conventional processes and to reduce the workload on operators [5]. Augmented Reality (AR), as a key interactive technology that integrates virtual information with real-world operational environments, enables the overlay of virtual guidance information onto the physical workspace during assembly, offering a new solution to improve the visibility and accessibility of assembly processes [6]. Meanwhile, with the rapid advancement of Large Language Models (LLMs) in knowledge reasoning and natural language understanding, these models can assist in constructing interpretable assembly knowledge systems and support intelligent reasoning and real-time response for task planning, anomaly detection, and process compliance in assembly operations [7].

LLMs offer innovative approaches to HRC by continuously accumulating and updating experiential knowledge. With ongoing advancements in computational power and data aggregation, the capabilities of these models have been significantly enhanced. The GPT series, for instance, utilizes techniques such as pre-training and fine-tuning to understand and follow human instructions, thereby enabling the accurate interpretation and resolution of complex problems [8]. Given the complexity of HRC assembly scenarios, LLMs are well-positioned to bridge the gap between humans and the physical world, significantly enhancing the flexibility of collaborative robotic assembly processes [9]. Wang et al. [10] proposed a navigation method for human–robot interactive mobile inspection robots based on an LLM, capable of replacing human operators in hazardous industrial environments and performing complex navigation tasks in response to natural language instructions. Chen et al. [11] introduced an autonomous HRC assembly approach driven by LLM and digital twin technologies, in which an LLM is integrated into the control system of a collaborative robotic arm. They developed an intelligent assistant powered by an LLM to facilitate autonomous HRC. Liu et al. [12] proposed a framework combining knowledge graphs and LLMs for fault diagnosis in aerospace assembly. Utilizing subgraph embedding and retrieval-augmented knowledge fusion techniques, this framework significantly enhances reasoning capabilities and enables efficient fault localization and resolution. Bao et al. [13] developed an LLM-assisted AR assembly methodology tailored to complex product assembly, integrating an LLM with AR and constructing a matching process information model to achieve intelligent LLM-guided assembly assistance. Benefiting from powerful autonomous reasoning and computational capabilities, LLMs provide novel approaches for HRC. However, challenges remain, including limited flexibility in task planning, incorrect application of inferred process knowledge, and a lack of professionalism in generated assembly plans. Furthermore, achieving seamless interaction between LLM and HRC systems, as well as enabling more natural and intuitive collaborative assembly processes, still requires extensive exploration.

AR technology integrates virtual information with the real environment by superimposing digital content—such as images, text, or models—into the user’s field of view, allowing users to perceive virtual elements overlaid with physical objects [14]. This capability enables operators to better understand the operational context and utilize mapped virtual information to support the execution of assembly tasks [15]. Liu et al. [16] designed a user-centered AR-enhanced interactive interface, allowing personnel to perform assisted maintenance tasks through visual guidance, thereby mitigating the impact of spatial constraints and human factors. Fang et al. [17] proposed a human-centric, multimodal, context-aware AR method for recognizing stages in on-site assembly. This method can automatically confirm actual assembly outcomes, reduce the cognitive load associated with triggering subsequent instructions, and support real-time validation of the current task state during operation. Yan et al. [18] introduced a mixed reality (MR)-based remote collaborative assembly approach that combines information recommendation and visual enhancement. This method optimizes users’ attention allocation during remote collaboration, improves the visibility of key guidance information, and enhances the efficiency and user experience of MR-assisted assembly. Chan et al. [19] developed an AR-assisted interface for HRC-based assembly, demonstrating that the AR interface effectively reduces physical intervention requests and improves the utilization of cobots. In AR-assisted HRC assembly, tracking and registration technologies are critical components. These technologies enable the accurate integration of virtual 3D models or information with the physical world, thereby supporting precise virtual-to-real alignment [20]. As a core technology of AR systems, virtual–real mapping ensures accurate alignment between virtual 3D models and real components [21]. Among the tracking and registration technologies required for virtual–real mapping, ORB-SLAM2 (Oriented FAST and Rotated BRIEF Simultaneous Localization and Mapping) is one of the most widely adopted visual SLAM methods. Xi et al. [22] improved ORB-SLAM2 by incorporating the Progressive Sample Consensus (PROSAC) algorithm for outlier rejection and added dense point cloud and octree map construction threads to generate maps suitable for robotic navigation and path planning. Zhang et al. [23] employed a Grid-based Motion Statistics (GMS) algorithm to optimize the performance of ORB-SLAM2, thereby reducing processing time and correcting matching errors. Although significant progress has been made in AR-based guidance and tracking registration technologies, limitations persist in complex HRC assembly environments. Existing AR-assisted assembly solutions are mostly focused on guiding predefined procedures, with limited support for intelligent interaction and information integration. Moreover, assembly scenarios often lack prominent texture features, posing challenges for natural feature-based tracking methods and visual SLAM techniques, which struggle under low-texture or textureless conditions. To better meet the demands of complex assembly tasks, the robustness of AR-based HRC systems must be further improved.

The integration of LLMs with AR technologies has further enhanced the real-time performance and intelligence of HRC assembly. By leveraging the advanced data processing and reasoning capabilities of LLMs together with the real-time information visualization of AR, a more seamless fusion between humans and the assembly environment can be achieved [24]. To address the challenges that persist in HRC assembly, this paper proposes an intelligent HRC assembly approach that integrates LLM-driven reasoning with AR-based interaction. This method is motivated by the human-centric development paradigm of Industry 5.0 and aims to support emerging requirements in HRC. The goal is to overcome the rigid human–robot relationships in traditional assembly processes and to promote more organic integration between humans and robots. By employing domain-specific knowledge reasoning via an LLM, information-enhanced visualization, and interactive fusion techniques, the proposed approach enables operators to accurately comprehend task information and assembly context, thereby facilitating more natural and efficient collaboration. This research primarily focuses on addressing key challenges encountered during the assembly of complex aerospace products in confined spaces, such as limited visibility, difficulty in accessing domain knowledge, and heavy reliance on decision-making support. The main contributions of this paper are as follows:

(1): A knowledge reasoning method for complex aerospace product assembly assisted by an LLM is proposed. By parsing assembly tasks into sequential components and integrating them with process requirements, a structured process information model is constructed. This enables LLM-driven intelligent task guidance and optimized decision-making for assembly quality. The method effectively leverages the respective strengths of humans and robots in task execution, thereby improving the efficiency of HRC during the assembly process.
(2): An intelligent AR-based visual interaction system is developed, which achieves a virtual–real fusion of assembly scenarios through virtual scene construction and real-time tracking of assembly components. The system employs AR technology to provide enhanced visual guidance and intelligent interaction throughout the assembly process. This significantly improves the visibility and responsiveness of assembly information, increases operational efficiency, reduces error rates, and enhances the overall experience of HRC assembly.
(3): Validation of the proposed intelligent HRC assembly approach is conducted through a case study involving the assembly of a complex aerospace product. The results demonstrate the feasibility and effectiveness of the proposed system in real-world assembly scenarios.

2. Methodology

The proposed method integrates LLM and AR technologies to construct an enhanced intelligent HRC assembly assistance system. As illustrated in Figure 1, the system framework is divided into two main modules: assembly knowledge reasoning and intelligent visual interaction. Firstly, the process begins with assembly task analysis and process information modeling. The assembly workflow is structured through task sequence decomposition and process template construction. These structured data serve as inputs for LLM-driven knowledge reasoning and generation, including retrieval augmentation and contextual knowledge synthesis, thereby enabling data-based decision support for assembly processes. Simultaneously, virtual–real mapping and tracking registration are achieved via entropy-based image processing and optimized global pose estimation, ensuring precise alignment of the physical scene. Furthermore, AR-based intelligent visual interaction facilitates multimodal enhanced interaction and visual guidance for operators. By integrating these informational functionalities, seamless interaction between the LLM and human operators is realized, allowing knowledge reasoning to be effectively embedded within the assembly process.

2.1. Assembly Task Analysis and Process Information Modeling

To enable intelligent assistance and knowledge reasoning in the HRC assembly of complex products, it is essential to first conduct a systematic analysis of assembly tasks and construct a structured process information model. As shown in Figure 2, the assembly of a specific model of an aerospace electronic cabin is selected as a representative scenario. Based on its assembly craft documentation and operational procedures, the overall assembly task is divided into three primary stages: the preparation stage, the assembly execution stage, and the quality issue resolution stage. The assembly process is further structured into three hierarchical layers, progressing from inside to outside, each of which is decomposed into several subtasks. Each subtask includes key information such as operational objectives, assembly objects, required tools, and assembly constraints. The directed relationships among subtasks represent task dependencies: a subtask can only be executed after the completion of its predecessor, as indicated by the directional arrows. Each layer of assembly tasks must be executed sequentially, with each subsequent layer triggered by the successful completion of the previous one. Furthermore, three typical quality issues and their corresponding solutions are predefined within the model. This structured task representation serves as a knowledge template to support subsequent reasoning processes such as assembly sequence planning and task allocation.

On this basis, the execution roles for each task are assigned by leveraging the respective advantages of human operators and cobots in handling different task types. To further enhance the automation of information processing, the multidimensional data involved in HRC scenarios must be effectively integrated to ensure consistency throughout the complex assembly process. Subsequently, the textual information of each assembly task is encoded into a structured sequence input template, as shown in Figure 3, which includes fields such as Task ID, Process Description, Required Resources, Execution Role, and Expected Outcome. The resulting process information model serves not only as a foundational data source for the AR-based visual interaction system but also as the input for LLM-driven knowledge reasoning in assembly. Through structured representation and semantic embedding, this model provides reliable knowledge support for subsequent intelligent task recommendation and reasoning for quality optimization.

2.2. LLM-Driven Assembly Knowledge Reasoning and Generation

Building upon the structured modeling of assembly tasks and process information, this study further develops an assembly knowledge reasoning mechanism driven by an LLM to support the execution of complex tasks and the generation of optimization suggestions in HRC scenarios. This mechanism leverages a retrieval-augmented generation (RAG) framework to enable intelligent access to structured assembly knowledge and context-aware responses, as illustrated in Figure 4.

Based on the previously constructed process information templates, each subtask is transformed into a structured input comprising elements such as task description, target component, executor role, process specifications, and inspection requirements. These process data are encoded into text and embedded into high-dimensional vectors, forming the foundation of a knowledge base that supports reasoning by the LLM. When an assembly-related query is initiated—either by the operator or the system—the input is first standardized and converted into a query vector. This vector is then compared with all embedded knowledge entries in the database using similarity computation to retrieve the most relevant information. The retrieved content is concatenated with the original query and fed into the LLM as context prompts, enabling a knowledge-enhanced generation process under the RAG architecture.

For complex aerospace electronic cabin assembly tasks, the reasoning module not only provides precise assembly instructions and key considerations but also dynamically analyzes whether the task requires multi-agent collaboration. It can autonomously determine the optimal division of labor between human operators and cobots. Moreover, by integrating quality inspection feedback, the LLM can propose quality optimization suggestions—such as modifying cable routing strategies or adjusting fastener installation sequences. To ensure that the knowledge output aligns with operational conventions and communication norms specific to HRC assembly scenarios, a prompt-based fine-tuning mechanism is designed. This enables the LLM to tailor its response format, linguistic style, and granularity of information to better suit the requirements of aerospace assembly contexts, thereby enhancing both the accuracy and the professionalism of the reasoning outcomes.

2.3. Virtual–Real Mapping and Component Tracking Registration in Assembly Scenarios

To enable AR-assisted HRC assembly, it is essential to ensure precise and stable integration of virtual scenes with real-world environments, as well as accurate virtual–real mapping of assembly components. This requires the use of 3D tracking and registration algorithms. ORB-SLAM2, one of the most widely used algorithms for 3D tracking and registration, offers high real-time performance and accuracy. However, its heavy reliance on feature point extraction and matching makes it less stable in complex, low-texture environments commonly encountered in HRC assembly tasks. To enhance the performance of 3D tracking and registration for AR-assisted HRC assembly, this study introduces improvements to the original ORB-SLAM2 algorithm, aiming to meet the requirements of virtual scene mapping and precise component tracking, as illustrated in Figure 5.

Information entropy reflects the richness of texture and pixel gradient variation within local image regions during feature extraction. It is calculated as follows:

H (x) = - \sum_{0}^{255} p (x_{i}) \log_{2} p (x_{i})

(1)

where p(x_i) denotes the probability of a pixel having grayscale value i in the image. A probability close to 1 implies lower uncertainty in image information.

The entropy value is closely tied to the characteristics of the assembly scene. Since images from different scenes contain varying levels of information richness, their optimal entropy thresholds differ. Therefore, fixed thresholds often fail to yield satisfactory feature extraction and matching results across varied viewing angles in HRC assembly. To address this, an adaptive entropy threshold determination method is proposed, as follows:

E_{0} = \frac{H (i)_{a v e}}{2} + δ

(2)

where H(i)_ave is the average image entropy computed from the first i frames captured during the initial run in the scene, and δ is a correction factor empirically set to 0.5 to achieve effective results. The resulting E₀ serves as the adaptive threshold for feature extraction in the current scene.

To achieve seamless virtual–real fusion in assembly scenes, global pose estimation is required. This estimation problem is transformed into a nonlinear least squares optimization to reduce complexity and enhance precision. Using keyframes and 3D coordinates of target points from the physical environment, the Efficient Perspective-n-Point (EPnP) algorithm with Gauss–Newton optimization is employed for real-time pose estimation of the AR headset. This ensures spatial consistency between virtual models and the physical scene, enabling stable and accurate AR-assisted guidance for HRC assembly.

[\begin{matrix} p_{i}^{N} \\ 1 \end{matrix}] = [\begin{matrix} \begin{matrix} C_{1}^{N} \\ 1 \end{matrix} & \begin{matrix} C_{2}^{N} \\ 1 \end{matrix} & \begin{matrix} C_{3}^{N} \\ 1 \end{matrix} & \begin{matrix} C_{4}^{N} \\ 1 \end{matrix} \end{matrix}] \cdot [\begin{matrix} a_{i 1} \\ a_{i 2} \\ a_{i 3} \\ a_{i 4} \end{matrix}]

(3)

where P_i^N represents the 3D coordinates of a target point in the real world, and C_j^N represents the 3D coordinates of control points. The parameter a_ij defines the barycentric coordinates that linearly relate the target point to its associated control points. The transformation from the real world to the AR reference frame is expressed as

P_{i}^{M} = M [\begin{matrix} P_{i}^{N} \\ 1 \end{matrix}] = [R |t] \cdot [\begin{matrix} P_{i}^{N} \\ 1 \end{matrix}] = [R |t] \cdot [\begin{matrix} \sum_{j = 1}^{4} a_{i j} c_{j}^{N} \\ \sum_{j = 1}^{4} a_{i j} \end{matrix}] = \sum_{j = 1}^{4} a_{i j} c_{j}^{N}

(4)

By solving the camera projection matrix M, real-time localization and tracking can be performed. This enhanced ORB-SLAM2-based registration algorithm thus enables reliable virtual–real mapping and supports visualized assembly guidance in AR-enhanced HRC scenarios.

2.4. Human–Robot Augmented Interaction and Assembly Visualization Guidance

The virtualization of the HRC assembly scenario is constructed using Unity3Dv2019, wherein the digital twin models of cobots and assembly components are integrated. These models can be tracked and mapped in real time through wearable AR devices, enabling seamless fusion of the physical and virtual environments. Based on the virtual–real scene mapping and component tracking registration, an HRC assembly support system is constructed, with further development of augmented interaction and intelligent guidance modules. As shown in Figure 6, the system enables the dynamic presentation of assembly knowledge and interactive control through AR devices, providing intelligent assistance for HRC and realizing visual guidance throughout the assembly process. The system first extracts multimodal information from LLM reasoning results, including assembly procedures, quality inspection standards, and collaborative action prompts, and associates this information with the current task status of the operator. Then, via the AR interface, these data are overlaid onto the actual assembly scene in the form of text prompts, 3D animations, and flow diagrams, enabling visual instruction of task information, dynamic presentation of assembly steps, and real-time feedback on quality issues. To enhance the naturalness and efficiency of interaction, operators can interact with the AR interface using voice commands, gestures, or eye movements without interrupting their ongoing tasks. This allows for quick retrieval of task information, review of past tasks, and control of cobots. Through AR technology, the system achieves seamless interaction between the LLM and the operator, effectively integrating knowledge reasoning into the assembly process. This not only improves operational efficiency and accuracy but also significantly reduces the operator’s cognitive load, enabling a more natural and efficient collaborative assembly experience.

3. Experimental Results and Analysis

3.1. Experimental Platform Setup

To verify the feasibility and effectiveness of the proposed HRC assembly assistance method, an experimental platform was constructed for a representative aerospace electronic cabin assembly scenario, as shown in Figure 7. This platform integrates multi-source sensing devices, cobots, AR devices, and communication control modules, enabling coordinated operation among various units involved in complex product assembly tasks. The core assembly target of the platform is a to-be-assembled aerospace electronic cabin. The assembly tasks include typical operations such as cable plugging, module installation, fastener tightening, and assembly quality inspection. The platform is equipped with two vision-enabled cobots, each configured with an industrial camera and a different functional end-effector—one for grasping parts or tools, and the other equipped with an automatic screw-fastening tool for completing installation tasks. To support a stable material supply for the robots, the platform also includes a bolt storage and feeding system. In addition, to validate visual guidance and augmented interaction in HRC, AR devices are integrated into the platform as the human–robot interface. This allows operators to access process instructions, quality feedback, and assembly guidance via gesture, voice, or gaze control. Communication and command dispatch across all modules are centrally coordinated through the platform controller, ensuring smooth interaction with the cobots and maintaining the continuity and consistency of the assembly process. This experimental platform provides a robust hardware foundation and contextual environment for subsequent evaluations, including the accuracy of knowledge reasoning, the fidelity of virtual–real scene mapping, and the efficiency analysis of typical assembly task execution.

3.2. Experimental Results of Assembly Knowledge Reasoning

This study employs GPT-4o mini as the base model and applies P-tuning for domain-specific optimization, incorporating domain knowledge through knowledge base embedding to enhance knowledge integration. To evaluate the effectiveness of the domain tuning, we extracted 100 question–answer pairs from the constructed assembly knowledge templates of the aerospace electronic cabin. We queried both the optimized model and the base model separately. The benchmark results are shown in Table 1. The accuracy of domain-specific terminology interpretation improved by 25.2%, while the correctness of answers—measured by the similarity between generated responses and standard assembly specifications—increased by 27.8%. To further evaluate the effectiveness of the proposed LLM-driven assembly knowledge reasoning method, a series of multi-round question-answering experiments were conducted based on a typical aerospace electronic cabin assembly scenario. The performance of the proposed method was compared against two leading LLMs, ChatGPT-4o and DeepSeek-V3, concerning task allocation strategy, quality optimization suggestions, and other domain-specific assembly issues. Partial results of the responses are shown in Figure 8. It can be observed that the proposed method demonstrates a higher degree of professionalism when answering questions in the vertical domain of HRC assembly.

To objectively assess the reasoning and generation capabilities of each model, three categories of questions were designed—assembly guidance, task recommendations, and quality optimization—comprising 20 question sets each. These questions were drawn from knowledge points encountered in the electronic cabin assembly process. All models received the same input format via a unified prompt template. Human evaluators, referencing the process manuals, were tasked with scoring the quality and relevance of each model’s responses. As shown in Figure 9, all models achieved an accuracy of over 60% in answering relatively simple questions related to task recommendations. However, in addressing more complex questions concerning quality optimization, the performance of the baseline LLMs was relatively limited. In contrast, the proposed method achieved an average accuracy of 86% across the three categories of questions, significantly outperforming the other two models. These results indicate that the proposed approach exhibits superior reasoning and response generation capabilities for assembly-related issues, particularly in providing assembly guidance and quality optimization suggestions. It demonstrates a clear advantage over DeepSeek-V3 and ChatGPT-4o in delivering more accurate and actionable assistance in complex assembly scenarios.

3.3. Experimental Results of 3D Pose Tracking

To achieve high-precision tracking and visual fusion of components in AR-based assembly environments, this study utilizes an enhanced ORB-SLAM2 algorithm to conduct 3D pose tracking tests in a representative avionics assembly scenario. The objective is to evaluate the algorithm’s stability and real-time performance in multi-type interactive environments. As shown in Figure 10, the pose tracking results under three typical scenarios are presented: (a) tracking of a single component, (b) tracking in a multi-component assembly scene, and (c) tracking in a human-involved assembly scenario. In the figure, red dots represent the feature points identified by the improved ORB-SLAM2 algorithm, blue rectangles indicate the estimated camera pose in 3D space, green lines show the trajectory of camera motion, and green wireframes denote the current field of view. An analysis of the tracking accuracy and stability over consecutive frames reveals that the improved ORB-SLAM2 algorithm maintains robust pose estimation under varying operation postures, viewpoint transitions, and occlusion disturbances. No significant tracking loss or misrecognition was observed during the tests. These findings demonstrate that the proposed algorithm not only meets real-time requirements but also exhibits strong environmental adaptability and tracking stability. This provides a high-precision spatial mapping foundation for subsequent AR information overlay and visualized assembly guidance, thereby contributing to improved efficiency and accuracy in HRC assembly tasks under complex scenarios.

3.4. Application Example of the Intelligent Assembly Assistance System

To validate the applicability and effectiveness of the proposed intelligent HRC assembly assistance system in real-world scenarios, the system was deployed on the previously constructed aerospace electronic cabin assembly task platform. A series of demonstrations was conducted involving representative assembly operations, covering the entire workflow—from process information visualization, enhanced human–robot interaction, and assembly task guidance, to quality inspection feedback. An example of the system in operation is shown in Figure 11. Operators wear AR devices to fully visualize the physical assembly environment and the digital assistance system. When the operator completes the AR device setup and enters the workspace, the system first performs spatial perception and virtual–real mapping of the operational environment to construct an accurately aligned virtual assembly scene (a). Subsequently, the assembly task flow and process information are visualized in the AR interface through a 3D display, assisting the operator in quickly understanding the task objectives and specific actions required (d). The system dynamically highlights and localizes the components to be assembled based on the current operational state, guiding the operator through correct assembly procedures (b, c). In addition, an interactive interface is provided to support step confirmation and feedback on assembly quality (e). Notably, with the integration of LLM, the system can generate task suggestions and operational optimization strategies through natural language interactions during the assembly process (f), thereby enabling more intelligent decision support and human–robot interaction. The overall results demonstrate that the system exhibits strong task adaptability and environmental compatibility, offering advantages in enhancing assembly efficiency and information interaction. These findings validate its potential and practical value in complex HRC assembly scenarios.

4. Discussion

To evaluate the effectiveness of the proposed augmented intelligent HRC assembly assistance system in HRC tasks and to demonstrate its advantages over traditional HRC approaches and conventional AR-based assistance systems, a comparative experiment was conducted using an aerospace electronic cabin assembly task. Twenty participants were randomly assigned to four groups, each performing the assembly using one of the following methods: (1) conventional handicraft, (2) a baseline HRC system, (3) a conventional AR-based assistance system, and (4) the proposed method. The baseline HRC system refers to a setup in which only a cobot is used to assist the assembly process. The robot is equipped with a sensor-integrated end-effector capable of performing programmed material transport and screw-fastening tasks, with manual command input required via a controller. The conventional AR-assisted system, as adopted in existing studies, utilizes MRTK to provide assembly guidance, which is limited to visual overlays of assembly steps and requirements. For each group, we recorded the total assembly time, assembly pass rate, and guidance accuracy. In addition, participants’ stress levels were assessed using a structured questionnaire, with evaluation criteria and corresponding scores shown in Table 2. The average values for each metric across participants in a group were used as the final results, as illustrated in Figure 12. Experimental results indicate that the use of HRC systems significantly improves the assembly efficiency and pass rate compared to manual assembly, but tends to increase operator stress. In contrast, the proposed system not only reduces total assembly time and improves assembly quality but also effectively alleviates operator stress during the collaborative process. Furthermore, compared to the conventional AR assistance system, the improved tracking and registration algorithm in our method enhances guidance accuracy, reducing issues such as misassembly or omissions caused by mapping drift or tracking loss. In summary, the proposed system demonstrates notable improvements in both assembly efficiency and quality while mitigating the cognitive burden on human operators. It also exhibits strong operational stability across multiple trials, highlighting its practical applicability and robustness in complex HRC assembly tasks.

This paper proposes an enhanced intelligent HRC assembly assistance system that integrates an LLM with AR technology, significantly improving collaboration efficiency and assembly quality in complex assembly tasks. In the representative aerospace electronic cabin assembly scenario, the system effectively realizes knowledge-driven task reasoning and suggestion generation, as well as intelligent guidance based on visual interaction. Despite demonstrating strong practical value and deployment potential, the system still faces certain limitations in real-world applications. Firstly, the current approach heavily relies on the completeness and accuracy of predefined structured process templates. In scenarios with insufficient knowledge coverage or inconsistent expression, the stability of reasoning outputs may be affected. Secondly, the visual tracking component used for virtual–real mapping may experience performance degradation under dynamic occlusions or varying lighting conditions, which can impact the robustness of AR-based guidance. Future research will focus on enhancing the adaptability and generalization of large models across diverse industrial tasks. This includes introducing multimodal perception mechanisms to improve understanding of operator intent and behavioral states, as well as exploring dynamic knowledge graphs and interactive knowledge learning to strengthen the system’s self-adaptation and evolutionary capabilities.

5. Conclusions

This paper addresses the need for enhanced intelligent HRC assembly in the context of Industry 5.0 by proposing a novel assembly assistance system that integrates an LLM with AR technology. Through an in-depth analysis of the complex assembly scenario of an aerospace electronic cabin, a comprehensive system framework is constructed—spanning from assembly task modeling and knowledge-driven reasoning to virtual–real interactive fusion and enhanced visual guidance. A complete HRC assembly experimental platform is built to systematically validate the proposed modules and methods. Experimental results demonstrate the system’s advantages in improving assembly efficiency, reducing operator workload, and enhancing assembly quality. The proposed approach provides technical support and a practical path for future high-precision HRC assembly tasks, particularly in intelligent reasoning with LLM and multimodal AR interaction.

Author Contributions

Conceptualization, Q.N.; methodology, Y.M.; software, Y.M.; validation, Y.S.; formal analysis, S.Z.; investigation, Y.S.; resources, S.Z. and Y.Z.; data curation, Y.M. and Y.C.; writing—original draft preparation, Y.M.; writing—review and editing, L.Z.; visualization, Z.Z.; supervision, Q.N.; project administration, Q.N.; funding acquisition, Q.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Postdoctoral Fellowship Program of CPSF (GZC20241434) and the Jiangsu Funding Program for Excellent Postdoctoral Talent (2024ZB660).

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The author Shuqi Zhang was employed by the 28th Research Institute of China Electronics Technology Group Corporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Ivanov, D. The Industry 5.0 framework: Viability-based integration of the resilience, sustainability, and human-centricity perspectives. Int. J. Prod. Res. 2023, 61, 1683–1695. [Google Scholar] [CrossRef]
Chacón, A.; Ponsa, P.; Angulo, C. Cognitive Interaction Analysis in Human-Robot Collaboration Using an Assembly Task. Electronics 2021, 10, 1317. [Google Scholar] [CrossRef]
Liu, L.; Schoen, A.J.; Henrichs, C.; Li, J.S.; Mutlu, B.; Radwin, R.G.; Zhang, Y.J. Human Robot Collaboration for Enhancing Work Activities. Hum. Factors 2024, 66, 158–179. [Google Scholar] [CrossRef] [PubMed]
Zhang, Z.Q.; Ji, Y.C.; Tang, D.B.; Chen, J.; Liu, C.C. Enabling collaborative assembly between humans and robots using a digital twin system. Robot. Comput.-Integr. Manuf. 2024, 86. [Google Scholar] [CrossRef]
Sokolov, O.; Andrusyshyn, V.; Iakovets, A.; Ivanov, V. Intelligent Human-Robot Interaction Assistant for Collaborative Robots. Electronics 2025, 14, 1160. [Google Scholar] [CrossRef]
Liu, C.C.; Zhu, H.H.; Tang, D.B.; Nie, Q.W.; Zhou, T.; Wang, L.P.; Song, Y.J. Probing an intelligent predictive maintenance approach with deep learning and augmented reality for machine tools in IoT-enabled manufacturing. Robot. Comput.-Integr. Manuf. 2022, 77. [Google Scholar] [CrossRef]
Ott, S.; Hebenstreit, K.; Lievin, V.; Hother, C.E.; Moradi, M.; Mayrhauser, M.; Praas, R.; Winther, O.; Samwald, M. ThoughtSource: A central hub for large language model reasoning data. Sci. Data 2023, 10, 1–12. [Google Scholar] [CrossRef] [PubMed]
Wu, T.Y.; He, S.Z.; Liu, J.P.; Sun, S.Q.; Liu, K.; Han, Q.L.; Tang, Y. A Brief Overview of ChatGPT: The History, Status Quo and Potential Future Development. IEEE-CAA J. Autom. Sin. 2023, 10, 1122–1136. [Google Scholar] [CrossRef]
Luan, Z.R.; Lai, Y.J.; Huang, R.D.; Bai, S.H.; Zhang, Y.D.; Zhang, H.R.; Wang, Q. Enhancing Robot Task Planning and Execution through Multi-Layer Large Language Models. Sensors 2024, 24, 1687. [Google Scholar] [CrossRef]
Wang, T.; Fan, J.; Zheng, P. Large language model-based approach for human-mobile inspection robot interactive navigation. Comput. Integr. Manuf. Syst. 2024, 30, 1587–1594. [Google Scholar] [CrossRef]
Chen, J.; Luo, H.; Huang, S.; Zhang, M.; Wang, G.; Yan, Y.; Jing, S. Autonomous Human-Robot Collaborative Assembly Method Driven by the Fusion of Large Language Model and Digital Twin. J. Phys. Conf. Ser. 2024, 2832, 012004. [Google Scholar] [CrossRef]
Liu, P.F.; Qian, L.; Zhao, X.W.; Tao, B. Joint Knowledge Graph and Large Language Model for Fault Diagnosis and Its Application in Aviation Assembly. IEEE Trans. Ind. Inform. 2024, 20, 8160–8169. [Google Scholar] [CrossRef]
Bao, J.; Li, J.; Yuan, Y.; Lu, C.; Wang, S. Augmented Reality Assembly Method Assisted by Large Language Models. Aeronaut. Manuf. Technol. 2024, 67, 107–116. [Google Scholar] [CrossRef]
Xue, Z.J.; Yang, J.; Chen, R.C.; He, Q.; Li, Q.X.; Mei, X.S. AR-Assisted Guidance for Assembly and Maintenance of Avionics Equipment. Appl. Sci. 2024, 14, 1137. [Google Scholar] [CrossRef]
Eswaran, M.; Gulivindala, A.K.; Inkulu, A.K.; Bahubalendruni, M. Augmented reality-based guidance in product assembly and maintenance/repair perspective: A state of the art review on challenges and opportunities. Expert Syst. Appl. 2023, 213, 118983. [Google Scholar] [CrossRef]
Liu, C.C.; Zhang, Z.Q.; Tang, D.B.; Nie, Q.W.; Zhang, L.Q.; Song, J.Y. A mixed perception-based human-robot collaborative maintenance approach driven by augmented reality and online deep reinforcement learning. Robot. Comput.-Integr. Manuf. 2023, 83, 102568. [Google Scholar] [CrossRef]
Fang, W.; Zhang, T.N.; Wang, Z.Y.; Ding, J. A multi-modal context-aware sequence stage validation for human-centric AR assembly. Comput. Ind. Eng. 2024, 194, 110355. [Google Scholar] [CrossRef]
Yan, Y.X.; Bai, X.L.; He, W.P.; Wang, S.X.; Zhang, X.Y.; Liu, L.W.; Yu, Q.; Zhang, B. Less gets more attention: A novel human-centered MR remote collaboration assembly method with information recommendation and visual enhancement. Robot. Comput.-Integr. Manuf. 2025, 92, 102898. [Google Scholar] [CrossRef]
Chan, W.P.; Hanks, G.; Sakr, M.; Zhang, H.; Zuo, T.; van der loos, H.F.M.; Croft, E. Design and Evaluation of an Augmented Reality Head-mounted Display Interface for Human Robot Teams Collaborating in Physically Shared Manufacturing Tasks. ACM Trans. Hum.-Robot Interact. 2022, 11, 1–19. [Google Scholar] [CrossRef]
Lei, X.M.; Lu, W.H.; Yong, J.; Wei, J.G. A Robust AR-DSNet Tracking Registration Method in Complex Scenarios. Electronics 2024, 13, 2807. [Google Scholar] [CrossRef]
Cheng, C.; Cui, H.; Liu, Y.; Wang, J.; Luo, Y.; Li, M.; Li, P. Multi-Point Precision Virtual-Real Registration Method for Aircraft Piping and Cable Assembly. Acta Opt. Sin. 2024, 44, 0412001. [Google Scholar]
Xi, Z.; Wang, H.; Han, S. Fast mismatch elimination algorithm and map-building based on ORB-SLAM2 system. J. Comput. Appl. 2020, 40, 3289–3294. [Google Scholar]
Zhang, D.D.; Zhu, J.L.; Wang, F.S.; Hu, X.Y.; Ye, X.H. GMS-RANSAC: A Fast Algorithm for Removing Mismatches Based on ORB-SLAM2. Symmetry 2022, 14, 849. [Google Scholar] [CrossRef]
Gkournelos, C.; Konstantinou, C.; Makris, S. An LLM-based approach for enabling seamless Human-Robot collaboration in assembly. CIRP Ann.-Manuf. Technol. 2024, 73, 9–12. [Google Scholar] [CrossRef]

Figure 1. Overall framework of the proposed enhanced intelligent HRC assembly method.

Figure 2. Assembly task analysis of a spaceborne avionics bay.

Figure 3. Structured input template for process information.

Figure 4. LLM-driven assembly knowledge reasoning framework.

Figure 5. Three-dimensional tracking and registration algorithm based on improved ORB-SLAM2.

Figure 6. AR-based intelligent visual assistance system for HRC assembly.

Figure 7. Experimental platform of the HRC assembly assistance system.

Figure 8. Examples of assembly question responses by different LLMs.

Figure 9. Evaluation results of different LLMs’ assembly question-answering performance.

Figure 10. Experimental results of 3D pose tracking at different time steps.

Figure 11. Practical application examples of the assembly assistance system.

Figure 12. Comparison of the effectiveness of the assembly assistance system.

Table 1. Benchmark results of the optimized model.

	Terminology Explanation Accuracy	Answer Correctness
Baseline	69.5%	62.4%
Optimized model	97.4%	90.2%

Table 2. Operator stress level questionnaire.

QuestionnaireItem	Not at All Stressed	Slightly Stressed	Stressed	Highly Stressed
Corresponding Score	0%	25%	50%	75%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Nie, Q.; Shen, Y.; Ma, Y.; Zhang, S.; Zong, L.; Zheng, Z.; Zhangwa, Y.; Chen, Y. Probing Augmented Intelligent Human–Robot Collaborative Assembly Methods Toward Industry 5.0. Electronics 2025, 14, 3125. https://doi.org/10.3390/electronics14153125

AMA Style

Nie Q, Shen Y, Ma Y, Zhang S, Zong L, Zheng Z, Zhangwa Y, Chen Y. Probing Augmented Intelligent Human–Robot Collaborative Assembly Methods Toward Industry 5.0. Electronics. 2025; 14(15):3125. https://doi.org/10.3390/electronics14153125

Chicago/Turabian Style

Nie, Qingwei, Yiping Shen, Ye Ma, Shuqi Zhang, Lujie Zong, Ze Zheng, Yunbo Zhangwa, and Yu Chen. 2025. "Probing Augmented Intelligent Human–Robot Collaborative Assembly Methods Toward Industry 5.0" Electronics 14, no. 15: 3125. https://doi.org/10.3390/electronics14153125

APA Style

Nie, Q., Shen, Y., Ma, Y., Zhang, S., Zong, L., Zheng, Z., Zhangwa, Y., & Chen, Y. (2025). Probing Augmented Intelligent Human–Robot Collaborative Assembly Methods Toward Industry 5.0. Electronics, 14(15), 3125. https://doi.org/10.3390/electronics14153125

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Probing Augmented Intelligent Human–Robot Collaborative Assembly Methods Toward Industry 5.0

Abstract

1. Introduction

2. Methodology

2.1. Assembly Task Analysis and Process Information Modeling

2.2. LLM-Driven Assembly Knowledge Reasoning and Generation

2.3. Virtual–Real Mapping and Component Tracking Registration in Assembly Scenarios

2.4. Human–Robot Augmented Interaction and Assembly Visualization Guidance

3. Experimental Results and Analysis

3.1. Experimental Platform Setup

3.2. Experimental Results of Assembly Knowledge Reasoning

3.3. Experimental Results of 3D Pose Tracking

3.4. Application Example of the Intelligent Assembly Assistance System

4. Discussion

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI