1. Introduction
As global energy systems evolve toward low-carbon, intelligent, and high-efficiency operation, improving energy utilization efficiency while ensuring energy security has become a central challenge in energy engineering. Among various energy sources, nuclear energy, with its high energy density, low carbon emissions, and stable output, is widely regarded as a key component of future energy systems [
1,
2,
3]. Nuclear power plants, as critical infrastructure for energy conversion and supply, directly affect power system stability and have a profound impact on the safety, reliability, and efficiency of the overall energy system. Therefore, achieving intelligent and refined operation and maintenance of nuclear power plants is essential for improving the performance of energy systems. However, the internal environment of nuclear power plants is highly complex and hazardous, characterized by dense spatial structures, tightly coupled equipment and pipelines, and extreme conditions such as radiation, high temperature, and high pressure. In such environments, traditional manual inspection not only poses significant safety risks but also fails to achieve high-frequency and high-precision data acquisition, limiting real-time assessment and optimization of system operation. Therefore, deploying mobile robots for environmental exploration, condition monitoring, and data acquisition can reduce human risk while improving operational efficiency and energy performance through continuous sensing and data-driven analysis. In recent years, with the advancement of digital twin and intelligent robotic technologies, researchers have increasingly explored intelligent operation and maintenance methods for energy infrastructure. For example, Qi et al. [
4] summarized key digital twin technologies and their applications in industrial systems; Vairagade et al. [
5] and Do et al. [
6] developed digital twin platforms for nuclear power plants to support robot navigation and intelligent system development; Park et al. [
7] integrated digital twins with reinforcement learning to improve automation in nuclear power plant operations; Będkowski [
8] proposed a robot navigation and mapping system for nuclear facility inspection. These studies have provided a solid foundation for the digitalization and intelligent operation of energy systems. However, most existing work has focused on single-robot systems or local function optimization, and has not yet systematically addressed multi-robot collaborative exploration and environmental modeling from a system-level energy perspective.
In unknown environment exploration, frontier-based methods are widely used in robotic mapping and autonomous navigation because they effectively drive robots to expand into unexplored regions. Yamauchi [
9] first proposed the frontier exploration method, which was later extended to multi-robot systems by Burgard et al. [
10] with the introduction of collaborative optimization mechanisms. Subsequent studies have further improved frontier detection efficiency, task allocation, and path planning [
11,
12,
13,
14,
15,
16,
17,
18]. However, these methods are primarily designed for general environments. Their optimization objectives mainly focus on path length or information gain, lacking comprehensive consideration of energy consumption constraints, safety risks, and complex structural coupling in energy systems. As a result, they are insufficient for critical energy infrastructures such as nuclear power plants. In map construction, Rao–Blackwellized particle filter-based SLAM methods (e.g., Gmapping) have provided a high-precision foundation for environmental modeling [
19,
20,
21]. For multi-robot systems, various map fusion and collaborative SLAM methods have been proposed [
22,
23,
24,
25,
26,
27]. However, in nuclear power plants—a typical energy system environment—factors such as structural repetition, severe occlusion, and unknown initial poses often lead to cumulative errors during map fusion, thereby affecting the reliability of operation and maintenance decisions and the safety of energy system operation.
Recent studies on multi-robot exploration and SLAM have improved frontier management, task allocation, and decision-making under uncertain environments. For example, Ahmed et al. [
28] proposed an efficient multi-robot active SLAM framework with frontier sharing and uncertainty-aware utility optimization. Wang et al. [
29] developed a deep reinforcement learning and knowledge distillation method for communication-constrained multi-robot exploration. Liu et al. [
30] improved frontier costs and real-time map optimization for autonomous exploration. Energy-aware multi-robot exploration has also attracted attention, such as the CERES-Q framework [
31], which combines coverage maximization with energy safety. In the digital twin domain, recent power system and reliability studies have emphasized raw-measurement-driven DT updating [
32] and physics–data fusion for hierarchical reliability assessment [
33], indicating the importance of bidirectional data–model coupling for safety-critical energy systems.
Despite these advances, three gaps remain for NPP operation and maintenance. First, the simultaneous optimization of geometric path length, information gain, battery energy, radiation-risk exposure, and obstacle-sensitive safety cost is rarely formulated within a unified utility model. Second, collaborative SLAM studies often focus on map-fusion accuracy, whereas the closed-loop coupling among exploration, localization, map updating, and a DT service layer remains insufficient. Third, quantitative comparisons among different robot-exploration and SLAM paradigms under NPP-specific constraints remain limited, partly because these methods rely on different sensor, communication, environmental, and optimization assumptions. Therefore, the present work extends frontier-based exploration from a geometric exploration problem to an energy- and safety-oriented DT-assisted mapping problem for nuclear energy infrastructure. Compared with existing frontier-based exploration and multi-robot SLAM studies, the main novelties of this study are summarized as follows. (i) A DT-assisted I-WFD-Gmapping-DT framework is proposed for NPP operation and maintenance. The DT 5+3 architecture couples physical robots, virtual NPP models, twin data, service functions, and communication links with perception, scenario modeling, and task allocation modules, enabling online feedback between exploration and the virtual O&M environment. (ii) An energy- and risk-aware collaborative frontier allocation model is developed. The utility function explicitly combines discounted information gain, obstacle-sensitive cost, estimated energy consumption, angular dispersion, and safety-risk constraints, and the selected parameters are justified through sensitivity analysis. (iii) A multi-robot mapping pipeline is established by integrating EKF-based multi-sensor fusion, adaptive AMCL-based initial relative pose estimation, and multi-scale Gmapping fusion with loop-closure correction.
The remainder of this paper is organized as follows:
Section 1 reviews related work;
Section 2 presents the I-WFD-Gmapping-DT methodology;
Section 3 describes the simulation experiments used for validation;
Section 4 analyzes and compares the results; and
Section 5 concludes the paper.
2. Methodology
2.1. Digital Twin 5+3 Model and Theoretical Framework for Multi-Robot Collaborative Environmental Exploration and Map Construction in Nuclear Power Plants
Digital twins for energy infrastructure can be viewed as bidirectional cyber-physical systems in which physical assets, virtual models, operational data, service functions, and communication links are continuously synchronized to support monitoring, prediction, and decision-making. Based on this concept, the proposed NPP multi-robot digital twin is formulated as a 5+3 model. The number 5 denotes five core DT elements: physical NPP multi-robot entities (NPPM), virtual NPP model (NVPM), nuclear twin data (NDD), nuclear service platform modules (NSPM), and component/communication connections (NCN). The number 3 denotes three NPP-oriented functional modules superimposed on the DT backbone: sensing and perception module (NTPM), multi-scenario modeling module (NMPM), and the control and task allocation module (NKPM). The overall relationship can be expressed as in Equation (1).
The five DT elements provide the generic cyber-physical backbone. NPPM contains the mobile chassis, robotic actuators, onboard computer, LiDAR, IMU, odometry, camera, temperature, pressure, and radiation sensors. NVPM represents the virtual NPP inspection environment, including the reactor auxiliary space, corridor topology, dense pipelines, equipment rooms, and obstacle geometry. NDD stores static data, including plant layout, safety rules, robot parameters, and equipment configuration; dynamic data, including pose, laser scans, radiation values, temperature, and obstacle updates; and task data, including frontier clusters, target assignment, path cost, and map-fusion status. NSPM provides robot-status visualization, risk-map updating, frontier allocation feedback, SLAM monitoring, and maintenance-decision support. NCN defines the communication and synchronization mechanism among robots, ROS topics, TF frames, the virtual model, and the service platform.
The three functional modules instantiate the DT for multi-robot NPP exploration.
NTPM preprocesses and fuses multi-source sensor measurements.
NMPM supports regular inspection, maintenance, and emergency-response scenarios by maintaining multiple virtual scenes and risk maps.
NKPM receives frontier clusters, energy estimates, and safety constraints from the DT service layer, then returns robot-specific exploration targets and map-fusion commands. In the implementation, ROS Melodic and Gazebo were used for robot-environment co-simulation; RViz was used for visualization; the /map, /scan, /odom, /imu, and /radiation topics were synchronized through the TF tree. Pose and occupancy-grid data were updated at 5 Hz, whereas radiation and environmental-state data were updated at 1 Hz. When a virtual blocked corridor or high-risk area was detected, the DT layer updated the cost map and triggered target reallocation. This implementation enables closed-loop coupling among physical robot motion, virtual scenario updating, and data-driven decision-making, as shown in
Figure 1.
2.2. Multi-Robot Collaborative Environmental Exploration Method for Nuclear Power Plants Based on the Improved WFD Algorithm
The internal environment of nuclear power plants is complex and presents potential risks such as high radiation, high temperatures, and high pressure. Manual inspections are not only dangerous but also time-consuming. To ensure the safe operation of nuclear power plants and the continuity of energy supply, this paper proposes a multi-robot collaborative environmental exploration method (I-WFD) based on the improved WFD (Wavefront Frontier Detector) algorithm, used for autonomous and intelligent environmental scanning and data collection in nuclear plant scenarios. Let
G be the grid map inside the nuclear power plant, including known free space
Sfree, obstacle area
Sobs, and unknown region
Sunk. The state of each grid cell (
x,
y) is defined by Equation (2), where boundary cells
B are defined as known free cells adjacent to the unknown region, indicating potential exploration frontiers. To reduce computational complexity and improve exploration efficiency, DBSCAN clustering is applied to neighboring boundary cells. The grid-state criterion in Equation (2) follows the occupancy-grid frontier definition used in frontier-based exploration and WFD. A cell is treated as free when its occupancy probability is lower than
Tfree, occupied when it is higher than
Tocc, and unknown when it has not been observed or lies between the two thresholds. In this study,
Tfree = 0.35 and
Tocc = 0.65. A frontier cell is defined as a known free cell that has at least one unknown cell in its 8-neighborhood. This criterion was selected because it is consistent with occupancy-grid SLAM outputs and allows frontier extraction to be directly connected with Gmapping without additional semantic labeling. The geometric center of each cluster is defined as the centroid point
Ck in Equation (3), where
pi represents the coordinates of the
i-th cell. The centroid point in the nuclear plant environment represents a key exploration area, covering more unknown space while reducing redundant exploration.
In mobile path planning, the obstacle interference between the robot’s current position
R and the centroid point
C is considered, including nuclear plant equipment, pipelines, and possible dynamic work obstacles. The improved movement cost formula is given by Equation (4), where
nobs is the number of obstacles on the path, and
α > 0 is the obstacle penalty coefficient. This cost function considers not only distance but also the safety and energy consumption of the robot in the complex environment inside the nuclear plant, improving the rationality of task planning. The utility value of the centroid point integrates information gain
g(
C) and the improved movement cost
d’ as given in Equation (4). The multi-robot system dynamically selects target points based on the utility value, coordinates exploration tasks, avoids redundant coverage, and achieves efficient information collection. This collaboration mechanism is especially crucial in the nuclear power plant scenario, ensuring robot safety while enhancing the integrity of environmental scanning inside the plant, providing high-precision data for global map construction.
To make the energy-oriented objective explicit, the path cost from robot
i to frontier centroid
Cj is extended as Equation (5), where
Lij is the A*/Dijkstra path length,
nobs,ij is the number of inflated obstacle cells within the path safety buffer,
α is the dimensionless obstacle-penalty coefficient, beta is the energy-weight coefficient, and mu is the safety risk coefficient. The estimated battery energy for this candidate path is Equation (5), where
P0 is the base moving power,
Ps is the sensing/computing power,
Tij is the estimated travel time,
β is the distance-related energy coefficient, and
μ is the turning-related coefficient.
Sij is the normalized safety-risk exposure, calculated from the accumulated radiation-risk or restricted-zone cost along the candidate path. All three terms are normalized before utility calculation to avoid scale dominance.
2.3. Multi-Robot Collaborative Task Allocation Method for Nuclear Power Plants
During the internal environmental exploration of nuclear power plants, multiple robots need to collaborate in task allocation to ensure comprehensive coverage of unknown areas, reduce redundant exploration, and ensure safety. This paper proposes a multi-robot collaborative task allocation method based on the I-WFD algorithm, which includes discounted information gain, improved movement cost, angle constraints, and distance maximization strategies to improve exploration efficiency and safety in multi-robot systems within nuclear power plants. Since multiple robots may simultaneously detect adjacent centroid points, the actual information gain of the centroid points may overlap. In the nuclear power plant scenario, the information gain discount can be expressed as in Equation (6), where
gj is the original information gain for robot
i moving towards centroid point
j,
gjk’ is the overlapping information with the detection range of another robot
k, and
γ ∈ (0,1] is the discount factor. A larger
γ penalizes redundant exploration more strongly. In the simulation experiment,
α = 1.5,
β = 0.35,
μ = 0.25,
γ = 0.6,
λ = 0.2, DBSCAN eps = 0.45 m, and minPts = 5. The improved movement cost, considering obstacles and safety constraints in the nuclear power plant, is given by Equation (7). The final utility value of robot
i is defined by Equation (8).
To avoid the concentration of multiple robots in narrow passages or equipment-dense areas within the nuclear power plant, an angle constraint is added. Let
θij be the angle between the direction of motion of robot
i and centroid point
j and the
x-axis of the current coordinate system. The angle cost can be expressed as in Equation (9), where
λ is the proportional coefficient. The angle cost is incorporated into the utility value as shown in Equation (10).
This strategy ensures that multiple robots are dispersed during exploration inside the nuclear power plant, reducing the risk of collisions and improving safety and information collection efficiency. In the case of centroid point conflicts, a distance maximization strategy is introduced: if two robots point to the same centroid point, one robot is assigned to move towards that point, and the other robot selects the centroid point that is farthest from the already assigned target point, as shown in Equation (11). Here,
C is the set of candidate centroid points, and
A is the set of already assigned target points. This strategy ensures that robots are evenly distributed in the complex environment of the nuclear power plant, avoiding redundant coverage and optimizing the path.
2.4. Initial Relative Pose Determination Method
After completing centroid point exploration and local map construction in the nuclear power plant’s internal environment, it is necessary to determine the initial relative poses between robots to ensure the accurate fusion of multiple robots’ local maps into a global map. In response to the complex environment of nuclear power plants, this study improves the Gmapping algorithm by introducing multi-source sensor data fusion and an Adaptive Monte Carlo Localization (AMCL) mechanism, achieving high-precision, low-computation initial pose determination.
The nuclear power plant environment has multipath effects, equipment occlusions, and unstructured environmental features, making it difficult for a single sensor to ensure accurate localization. This paper uses Extended Kalman Filtering (EKF) to fuse the robot’s odometry, inertial measurement unit (IMU), and LiDAR information, achieving robust localization for the multi-robot system in the complex nuclear power plant environment. The state transition and observation model is given by Equation (12), where
xk is the system state vector at time
k (position and orientation),
uk is the control input, and
wk and
vk represent process noise and observation noise, respectively.
To further enhance the robustness of initial pose estimation, an enhanced AMCL method is introduced, which dynamically adjusts the number of particles to address global localization failure and robot kidnapping problems. The weight update formula for the enhanced particle filter is given by Equation (13), where
wi′ is the normalized particle weight,
Nk dynamically adjusts according to pose uncertainty, and
σk is the variance of particle distribution. This mechanism ensures that in the complex layout of the nuclear power plant, each robot can quickly recover global localization while reducing the computational resource usage.
Each robot maintains independent coordinate systems: (1) local map coordinate system Robot
i/map; (2) odometry coordinate system Robot
i/odom; (3) robot body coordinate system Robot
i/base; (4) laser sensor coordinate system Robot
i/laser. To achieve global map fusion, one robot (e.g., Robot
1) is selected as the reference, with its local map map
1 serving as the global reference coordinate system. Other robots locate within map
1 using AMCL and solve for the relative translation and rotation transformation matrix, as shown in Equation (14), where
R1i is the rotation matrix, and
t1i is the translation vector. Using this transformation matrix, the local maps constructed by each robot can be unified and fused into the global coordinate system, enabling the precise construction of the global map for the nuclear power plant. With this method, the multi-robot system in the nuclear power plant can efficiently complete initial relative pose estimation in a complex environment, providing a reliable foundation for subsequent global map fusion and operation tasks, while ensuring exploration accuracy and the continuity of energy operations and maintenance.
2.5. Global Map Fusion Method
After completing multi-robot environmental exploration and initial relative pose determination inside the nuclear power plant, the local maps constructed by each robot need to be fused into a global map to ensure the integrity of environmental modeling across the entire plant and ensure the safety of energy operation and maintenance. In response to the complex nuclear power plant environment, this paper proposes a global map fusion method based on a multi-level, multi-scale strategy. This method combines data preprocessing, redundant data removal, multi-scale alignment, and loop closure correction to achieve high-precision and high-reliability map construction, as shown in
Figure 2.
First, data preprocessing is applied to the local maps collected by each robot, including filtering and voxel grid down-sampling, to remove noise and reduce computational burden, as shown in Equation (15), where
Mi is the original local map of robot
i,
Mi’ is the filtered map, and
vs is the voxel size. A clustering method is then applied to remove redundant points, ensuring the compactness and consistency of the global map, as shown in Equation (16), where
Pj is the set of points within cluster
j, and
Cj is the cluster center.
To ensure global consistency of the complex internal structures (reactor chamber, auxiliary pipelines, etc.) within the nuclear power plant, multi-scale map fusion is used: (1) Align large-scale structures (walls, pipes, major equipment layout) first on the low-resolution map; (2) Correct local errors (small equipment, pipelines, obstacles) on the high-resolution map. The fusion formula is given by Equation (17), where
Mi(
r) is the local map at different resolutions,
wi is the weight, and
N is the number of robots.
To eliminate accumulated errors and improve accuracy, graph optimization and loop closure correction are introduced. When the robot returns to an already explored area, loop closure is detected by matching the current frame with the historical frame features, and the pose graph is updated, as shown in Equation (18), where
Xi and
Xj are the robot poses,
zij is the measurement constraint, and
h(⋅) is the observation model. Finally, after multi-level, multi-scale fusion and loop closure optimization, the generated global map
Mglobal not only accurately reflects the internal structure of the nuclear power plant but also supports multi-robot autonomous inspection, path planning, and energy operation and maintenance decision-making, ensuring the safety of plant equipment and the continuity of power supply.
3. Simulation Experiment Validation
To validate the effectiveness of the proposed I-WFD-Gmapping algorithm in the nuclear power plant environment and its multi-robot collaboration performance, a key area model of the nuclear power plant was constructed in the simulation environment. The 3D model of the main obstacles was further reduced to a 2D layout, as shown in
Figure 3. The map structure is highly irregular, with dense obstacles, complex topological connectivity, narrow and multi-branch pathways, reflecting the complex environmental features typical of nuclear industry scenarios. The simulation platform uses ROS Melodic and Gazebo, with system configuration including an Intel i7 CPU, 16 GB of memory, and an NVIDIA GPU, for real-time simulation of multi-robot task scheduling and map construction. Three robot quantities were set in the simulation: single robot, dual robots, and a three-robot system, to evaluate exploration efficiency and global map construction accuracy under different collaboration scales. The robots’ starting positions are distributed near the central control area of the nuclear power plant. All robots are equipped with LiDAR, IMUs, and radiation sensors to collect environmental data in real time and perform exploration tasks using the I-WFD-Gmapping algorithm.
Once the robot begins exploration, a set of map data is obtained. The I-WFD-Gmapping algorithm processes this data to guide the robots to the most suitable positions.
Figure 4 shows the map data formed in Rviz, including known nuclear power plant areas, obstacle areas, and unknown nuclear plant areas. Based on this data, the centroid of each adjacent boundary is determined, serving as the target point for multi-robot movement. The multi-robot collaboration environmental exploration and map construction algorithm of I-WFD-Gmapping runs in the background, iterating and repeating until no new boundaries remain for the robots to explore. As shown in the figure, I-WFD-Gmapping starts with the least amount of data and navigates from generated boundaries to new ones. The generated map is the same as the one shown in
Figure 3. The purple objects on the map represent obstacles. In the given scenario, the red spheres represent the centroids of each adjacent pair of boundaries, which are the locations the robots are attempting to reach. Therefore, using the I-WFD-Gmapping algorithm, multi-robot collaboration can explore unknown nuclear power plant environments and construct unknown maps.
Thirty independent runs were performed for each method. Each run used a different random seed for sensor noise, frontier tie-breaking, and initial pose perturbation. The starting positions of robots were randomly perturbed within a radius of 0.4 m around the central control area, while maintaining collision-free initialization. The LiDAR range was set to 10–12 m with a Gaussian range noise of 0.02 m, odometry translational noise was 0.01 m/s, yaw-rate noise was 0.5 deg/s, and the radiation sensor was modeled as a scalar risk field with a Gaussian noise of 3%. The occupancy-grid resolution was 0.05 m. The simulation was simplified to a 2D occupancy-grid representation extracted from the 3D NPP-like model; therefore, vertical structures, stairways, and full radiation transport were not explicitly modeled. The main evaluation metrics include: (1) Coverage Area Percentage (CAP): measures the rate of expansion of the unknown area during exploration; (2) Number of target points: measures the efficiency of multi-robot collaboration in reducing redundant exploration; (3) Alignment Error (AE), Root Mean Square Error (RMSE), and Structural Similarity Index (SSIM): used to assess the matching accuracy between the local maps and the global reference map. AE is the percentage of mismatched cells after map registration. RMSE is computed from the normalized occupancy probability difference between the fused map and reference map. SSIM is calculated on registered binary occupancy images, and a larger SSIM indicates better structural consistency. Runtime was measured as the average planning time per allocation cycle and total algorithm time. Memory usage was recorded as the peak resident set size. Statistical significance was tested using paired t-tests across matched random seeds, with p < 0.05 considered significant. During the simulation, the robots scan the unknown areas sequentially, generate centroid points, assign tasks, and perform dynamic path planning and obstacle avoidance. Multi-robot collaboration ensures that each centroid point is explored by only one robot, reducing redundant coverage and improving exploration efficiency. Simultaneously, the simulation records the robot’s movement paths, the amount of data collected, and the algorithm’s running time, providing accurate inputs for subsequent global map fusion.
4. Results and Discussion
To validate the effectiveness of the proposed I-WFD-Gmapping method in improving environmental exploration efficiency (measured by Coverage Area Percentage, CAP) under different robot scales, this section conducts a comparative analysis of single-robot, dual-robot, and multi-robot systems in the complex nuclear power plant environment.
Figure 5 shows the coverage results of the I-WFD-Gmapping and WFD-Gmapping methods under different robot quantities in the nuclear power plant simulation environment, comparing the coverage performance under various collaboration scales. The above results were obtained through 30 repeated simulation experiments in the complex nuclear power plant environment, where robots dynamically allocate tasks based on boundary centroids and continue exploring until no new boundaries remain. The coverage variation was recorded. From
Figure 5 (Coverage Map Percentage Comparison in Simulation Experiments), it can be seen that the I-WFD-Gmapping method exhibits higher coverage rates in single-robot, dual-robot, and multi-robot scenarios. Specifically, the average coverage rates are 31.4%, 35.3%, and 40.5%, respectively. This is because the improved boundary detection and task allocation mechanisms effectively reduce redundant exploration and enhance collaboration among robots, thus improving effective coverage efficiency. Additionally, from
Figure 5, it can be observed that as the number of robots increases, the improvement in exploration efficiency becomes more significant. Particularly in the multi-robot system, the improvement is 28.98% compared to the single robot and 14.73% compared to the dual robots. The reason is that parallel exploration by multiple robots allows for spatial dispersion and task parallel execution, thereby accelerating the expansion of unknown areas.
To further evaluate the time efficiency of the proposed method, this section analyzes the differences in exploration time under the same coverage conditions.
Figure 6 presents a comparison of the relationship between coverage and time for the I-WFD-Gmapping and WFD-Gmapping methods in multi-robot simulation experiments. This result was obtained by recording the time required to reach the same coverage and averaging over multiple experiments. From
Figure 6 (Coverage Area Percentage vs. Time Comparison), it is clear that the I-WFD-Gmapping method requires significantly less time than the WFD-Gmapping method for the same coverage. Specifically, the exploration time is reduced by approximately 13.35%. This is because the improved algorithm reduces the computation of invalid boundaries and optimizes the task allocation strategy, thereby reducing system computational complexity and improving execution efficiency. It is worth noting that as the coverage increases, the time difference between the two methods gradually expands. This is because the traditional method experiences more redundant exploration and inefficient path selection in the later exploration stages, leading to an increase in cumulative time.
To analyze the optimization effect of the algorithm on computational efficiency, this section compares the number of target points (total and used target points).
Figure 7 shows a comparison of the number of target points in multi-robot simulation experiments between the I-WFD-Gmapping and WFD-Gmapping methods. This result was obtained by counting the number of centroid target points generated and actually used during the exploration. From
Figure 7 (Target Point Reduction Comparison), it can be seen that the I-WFD-Gmapping method significantly reduces both the total number of target points and the used target points compared to the WFD-Gmapping method. Specifically, the total number of target points decreased by an average of 38.9%, and the number of used target points decreased by an average of 23.24%. This is because the improved boundary clustering and centroid filtering strategies effectively remove redundant frontier points, thus reducing unnecessary calculations. Another noteworthy phenomenon is that the difference between the two methods remains stable across different experiments. This is because the proposed method introduces information gain and cost constraints in the frontier filtering stage, ensuring that low-value target points are filtered early, thereby guaranteeing stable efficiency improvements.
To further verify the energy efficiency, computational scalability, localization robustness, and safety-aware performance of the proposed I-WFD-Gmapping-DT method, compact supplementary analyses were conducted in the simulated nuclear power plant environment.
Table 1,
Table 2,
Table 3,
Table 4 and
Table 5 summarize the energy evaluation, parameter sensitivity, runtime and memory usage, initial-pose estimation, and practical interpretation of mapping metrics. These results were obtained from 30 independent ROS–Gazebo simulation runs using the same nuclear power plant grid environment, robot configuration, sensor settings, and evaluation protocol as those used in the main experiments.
To evaluate the performance of the proposed method in map construction accuracy, this section analyzes three metrics: AE, RMSE, and SSIM.
Table 1 summarizes the map quality evaluation results of the I-WFD-Gmapping and WFD-Gmapping methods in the nuclear power plant environment. This result was averaged from multiple simulation experiments and compared with the reference map for computation.
Table 1 shows that the I-WFD-Gmapping method outperforms the WFD-Gmapping method in both error metrics and structural similarity. Specifically, AE decreases by 7.47%, RMSE decreases by 8.53%, and SSIM (increasing from 0.78 to 0.83) increases by 6.02%. This is because multi-sensor fusion and the improved localization mechanism increase the accuracy of pose estimation, reducing cumulative errors and improving map consistency. In summary, the proposed I-WFD-Gmapping method significantly improves exploration efficiency, reduces computational costs, and enhances map construction accuracy in the complex environment of nuclear power plants.
From
Table 2, it can be observed that the proposed method reduces total energy consumption by 13.9% and energy per 1% coverage by 24.3%. This is because redundant frontier targets and unnecessary travel are reduced, confirming the energy-aware exploration capability.
Table 3 summarizes the tested ranges, selected values, and main observations for obstacle penalty, information-overlap discount, and angle cost coefficients. The results were obtained by varying one parameter while keeping the remaining settings unchanged. From
Table 3, it can be seen that α = 1.5, γ = 0.6, and λ = 0.2 provide balanced performance. This is because these values jointly consider obstacle avoidance, information utilization, and robot dispersion, supporting stable collaborative exploration.
Table 4 compares the average planning-cycle time and peak memory usage of WFD-Gmapping and I-WFD-Gmapping-DT in the three-robot simulation scenario. The proposed method reduces the planning-cycle time by 28.0% and peak memory usage by 6.1%. This improvement is mainly attributed to frontier clustering and target filtering, which reduce invalid candidate points and unnecessary planning operations. These results indicate that the improved method enhances computational efficiency without increasing the computational burden.
Table 5 reports the translation error, yaw error, and convergence time for AMCL-only localization and the proposed EKF-enhanced adaptive AMCL method. The results were obtained under the same multi-robot map initialization conditions. Compared with AMCL-only localization, the proposed method reduces the translation error to 0.118 m, yaw error to 2.17°, and convergence time to 8.1 s. These improvements result from multi-source sensor fusion and adaptive particle adjustment, which enhance localization robustness in complex environments.
5. Conclusions
This study proposes an energy-oriented multi-robot collaborative exploration and mapping framework, named I-WFD-Gmapping-DT, for nuclear power plant operation and maintenance. The framework integrates improved frontier detection, energy- and risk-aware task allocation, EKF-enhanced AMCL, multi-scale Gmapping fusion, and a digital twin feedback layer. Simulation results from 30 independent ROS–Gazebo runs show that the proposed method improves both exploration efficiency and mapping reliability compared with conventional WFD-Gmapping.
In the three-robot case, the proposed method increased the coverage area percentage from 35.6% to 40.5%, reduced exploration time by 13.35%, and reduced estimated energy consumption by 13.9%. It also decreased redundant frontier generation, with the total number of target points reduced by 38.9%. In terms of map quality, the alignment error decreased from 12.45% to 11.52%, while the SSIM increased from 0.78 to 0.83. These results indicate that the proposed method can reduce unnecessary robot motion, improve energy efficiency, and enhance global map consistency in complex NPP-like environments.
The main advantage of the proposed framework is the integration of digital twin-enabled closed-loop feedback with multi-robot exploration and mapping. The DT layer synchronizes robot states, frontier information, cost maps, risk maps, and map-fusion status, allowing the system to update task allocation when blocked corridors or high-risk areas appear. Therefore, the DT supports not only visualization but also safer, more energy-efficient, and more adaptive decision-making for NPP operation and maintenance.
However, the present validation still has limitations. First, the quantitative benchmark in this study is mainly based on a controlled comparison with WFD-Gmapping, because the proposed framework is developed by extending WFD-based frontier detection and Gmapping-based map construction under the same ROS–Gazebo NPP simulation setting. This design helps isolate the effects of the proposed energy-, risk-, and DT-enabled modules, but it does not fully represent the performance of other exploration or SLAM families, such as active SLAM, RRT-based exploration, deep reinforcement learning-based multi-robot exploration, graph-based SLAM, or other multi-robot map-merging approaches. Therefore, the generality of the conclusions across different exploration and SLAM paradigms should be further examined. Second, the simulation environment is a 2D occupancy-grid abstraction extracted from an NPP-like 3D model, rather than a complete physical NPP environment. Thus, vertical structures, stairways, metallic occlusions, full radiation transport, sensor degradation, and communication degradation caused by shielding or complex structures are not fully represented. In addition, the energy results are based on simulation-based estimation rather than direct measurements from physical robot hardware.
Future work will first establish a unified benchmark suite in representative NPP-like scenarios and compare I-WFD-Gmapping-DT with active-SLAM, RRT/frontier-hybrid, deep reinforcement learning-based, graph-SLAM/Cartographer-based, and energy-aware exploration methods under identical sensor, communication, energy, and safety-risk settings. Hardware-in-the-loop validation, real multi-robot experiments, and field-scale testing will then be conducted to examine practical deployability. Further work will also extend the current 2D grid abstraction to high-fidelity 3D validation, incorporate more realistic radiation and communication models, and evaluate the robustness of the proposed DT-assisted exploration framework under practical deployment conditions.