Structure-Aware Topological Exploration: A Semantic Seeded Voronoi Approach for Unstructured Environments

Ding, Miao; Wei, Xian; Chen, Shaowen

doi:10.3390/electronics15051033


by Miao Ding ¹, Xian Wei ²,* and Shaowen Chen ²

¹ School of Software, Liaoning Technical University, Huludao 125105, China
² School of Software, East China Normal University, Shanghai 200062, China
* Author to whom correspondence should be addressed.

Electronics 2026, 15(5), 1033; https://doi.org/10.3390/electronics15051033

Submission received: 11 January 2026 / Revised: 14 February 2026 / Accepted: 25 February 2026 / Published: 2 March 2026

(This article belongs to the Section Computer Science & Engineering)


Abstract

In autonomous exploration of unstructured terrain, exploration efficiency and map topology quality are difficult to balance. Existing geometry-based exploration methods focus on exploration efficiency and neglect map quality, which leads to frequent backtracking by the robot and tends to overlook non-geometric risks such as negative obstacles. To address this problem, we propose the Structure-Aware Topological Exploration framework. Unlike purely geometric exploration, we use a U-Net to semantically analyze unmanned aerial vehicle aerial images and anchor the robot’s path to the medial axis of the safe area through the Semantic Seeded Voronoi mechanism. To avoid the backtracking caused by map redundancy, we introduce topological sparsity constraints directly into the decision function, realizing online structural pruning during exploration. Simulation experiments based on real-world aerial imagery demonstrate that the proposed framework effectively overcomes the late-stage exploration plateau: compared with purely geometric baselines (Rapidly exploring Random Tree and Frontier), it reduces average path length to 278.4 m (45% reduction) and improves exploration efficiency by 80%; compared with the semantic frontier-based baseline, it achieves 28.6% higher efficiency and 13% shorter path length, maximizing information gain per unit travel distance.

Keywords:

autonomous exploration; unstructured terrain; structure-aware topology; semantically analyze; topological sparsity; exploration efficiency

1. Introduction

One of the most challenging issues in mobile robotics is the ability to autonomously explore previously unknown spaces [1]. As real-world scenarios become increasingly complex and unstructured, mobile robots are frequently deployed for mission-critical tasks. Through autonomous exploration strategies, these robots utilize sensor data to decide where to navigate and simultaneously construct an environmental map to accomplish specific objectives. Different scenarios demand distinct capabilities. For example, Urban Search and Rescue (USAR) missions often benefit from integrating human prior knowledge to improve efficiency [2], whereas coal mine rescue operations require robust navigation in hazardous underground conditions [3]. Similarly, disaster reconnaissance tasks depend on cost-effective mapping schemes [4] and specialized platforms capable of rapid structural assessment [5]. Sensor data and robust mapping are fundamental to these operations. Binocular vision [3], for instance, is critical for environmental perception in unstructured terrain. Moreover, high-quality mapping is a verified prerequisite for downstream tasks, ranging from autonomous sweeper navigation [6] to complete coverage path planning [7]. Consequently, an effective exploration strategy must ensure complete spatial coverage and build high-quality maps within a limited time. Researchers have addressed this by proposing improved frontier strategies to enhance coverage completeness [8] and introducing multi-criteria decision-making frameworks [9]. Balancing map quality with efficiency is also imperative, often requiring multi-objective path planning algorithms [10] or strict localization constraints [6]. However, localized perception remains a significant bottleneck. Recent surveys on extreme underground environments [11] note that limited field-of-view affects map quality. 
While distributed multi-robot systems [12] attempt to mitigate this through collaboration, single-robot systems face distinct challenges. To address this, Lv et al. [13] proposed a heuristic exploration method that leverages prior information to improve efficiency in indoor confined environments. Furthermore, despite progress in confined environment exploration [14], a critical shortcoming persists in current research. Mainstream paradigms often treat exploration and map quality optimization as independent tasks. Classical geometric approaches, for instance, prioritize rapid boundary expansion but ignore map topology. This “open-loop decoupling” leads to topology bloat, increased computational burden, and inefficient backtracking. Additionally, the lack of semantic understanding creates perceptual blind spots, causing robots to rely solely on geometric feasibility and potentially enter hazardous areas. To overcome the twin challenges of structural inefficiency and semantic disconnect, a unified framework is urgently needed to bridge the gap between perception and planning. Such a scheme should integrate high-level semantic understanding to identify non-geometric risks while utilizing robust topological constraints (e.g., Voronoi diagrams) to ensure map quality.

Motivated by this need, we propose the Structure-Aware Topological Exploration (SATE) framework. Unlike traditional reactive map-building methods, SATE makes topological quality a central goal of active optimization and obtains a priori information about the environment from an Unmanned Aerial Vehicle (UAV) perspective. We integrate a Semantic Seeded Voronoi (SSV) mechanism driven by a lightweight U-Net network. The U-Net converts the a priori information observed from the local UAV aerial view into a dense drivability heatmap that distinguishes safe zones from hazardous terrain (e.g., negative obstacles) based on high-level semantic features. A Voronoi graph then captures the environmental structure, and the decision nodes are actively anchored to the central axis of the drivable space. To proactively suppress the generation of redundant nodes, we introduce a topological sparsity regularization term, which ensures structural efficiency and enables the joint optimization of exploration efficiency and map quality.

Paper Contributions

To address the difficulty that traditional geometric methods have in recognizing non-geometric risks, this paper proposes a Semantic Seeded Voronoi (SSV) mechanism with structure-aware properties. A U-Net semantically analyzes the UAV overhead image to generate a high-confidence traversability heatmap. Anchoring the Voronoi seed points on the medial axis of the semantically safe region actively extracts the environment skeleton in a structured way, generating candidate decision points and enabling the robot to construct robust topological maps that satisfy both geometric connectivity and deep semantic safety.
A closed-loop decision framework is constructed by introducing a topological sparsity term into the decision function, so that map quality directly influences autonomous exploration decisions. The framework actively suppresses the generation of redundant nodes and significantly reduces inefficient backtracking, thus improving exploration efficiency and mitigating the exploration plateau effect that is common in unstructured environments.
Experimental results show that SATE achieves complete environment coverage while maintaining high exploration efficiency in simulation environments constructed from real-world aerial images. Quantitative analysis shows that, compared with purely geometric baselines (RRT and Frontier), SATE reduces exploration path length by approximately 45% and improves exploration efficiency ($η$) by 80%; compared with the semantic frontier-based baseline (SNF), it achieves 28.6% higher efficiency and 13% shorter path length. These results validate the superiority of the system in maximizing information gain per unit travel distance while ensuring robust navigation safety.

The remainder of this paper is organized as follows: Section 2 provides a comprehensive review of related work, covering geometric, topological, and semantic exploration strategies. Section 3 details the proposed SATE framework, with a particular focus on the SSV mechanism for candidate generation and the topological sparsity regularization term for decision optimization. Section 4 presents the implementation details, simulation environment, and rigorous comparative analysis with classic baseline methods. Finally, Section 5 summarizes the work presented in this paper and looks forward to future research directions.

2. Related Work

Research on autonomous exploration is diverse, evolving from early passive mapping to modern active perception. Since our proposed method integrates geometric structures, topological graphs, and semantic awareness, we review these three paradigms separately.

2.1. Geometry-Based Exploration

Geometric exploration typically relies on identifying boundaries between free and unknown spaces. The foundational frontier-based approach, introduced by Yamauchi [15], guides robots to these boundaries to maximize map expansion. However, in large-scale environments, pure boundary enumeration incurs high computational overhead. To address this, optimization strategies have been proposed: Fang et al. [16] utilized breadth-first search (BFS) for wavefront detection to rapidly lock boundaries, while Keidar and Kaminka [17] proposed Fast Frontier Detection (FFD) to avoid repetitive scanning by processing only active contour edges. To further reduce traversal costs, sampling-based methods were introduced. Umari and Mukhopadhyay [18] developed a multi-RRT variant that grows multiple trees in parallel and selects the best frontier candidate using a utility function balancing information gain and path cost, while Zhang et al. [19] incorporated goal-biasing sampling into RRT to accelerate convergence in narrow passages. Additionally, Zhong et al. [20] incorporated the Fast Marching method to accelerate exploration using potential field propagation. Despite these improvements, boundary-based methods often suffer from local minima. To overcome this, Volumetric Information-based (VIB) planning shifts the focus from geometric boundaries to uncertainty reduction. Bircher et al. [21] proposed a receding horizon planner to optimize the Next Best View (NBV), while Julian et al. [22] and Bai et al. [23] utilized mutual information and deep learning, respectively, to explicitly calculate the Expected Information Gain (EIG). Recent hierarchical frameworks have further enhanced efficiency. The TARE planner [24] decouples exploration into local coverage and global path optimization to solve the field-of-view effect, while FUEL [25] utilizes incremental frontier structures with B-spline trajectory optimization for high-speed UAV exploration. 
However, purely geometric paradigms have fundamental limitations. They treat all mapped free space as equally accessible, ignoring non-geometric hazards such as negative obstacles or unstable terrain. Furthermore, greedy heuristics often neglect the global topological configuration [26]. This leads to inefficient movement patterns that fail to address the multi-objective Traveling Salesman Problem (TSP) [27], resulting in excessive energy consumption [28] and redundant revisits to explored areas. These inefficiencies have been further explored in subsequent research, such as volumetric next-best-view planning that accounts for positioning errors during object reconstruction [29], and large-scale exploration frameworks that aim to reduce redundant traversal [30]. However, these approaches still rely on purely geometric criteria and do not actively incorporate topological structure or semantic awareness into the decision-making process, leaving room for the structure-aware framework proposed in this work.

2.2. Topology-Based Exploration

Topological approaches abstract complex environments into low-dimensional graphs that capture connectivity and free-space structure. The Generalized Voronoi Diagram (GVD) is a foundational representation; Aurenhammer [31] established its ability to maintain maximal obstacle clearance, supporting safe navigation in narrow passages. Building on this geometric skeleton, graph-based planners have been successfully deployed in challenging environments. Dang et al. [32] and Kulkarni et al. [33] developed bifurcated local–global strategies for aerial and subterranean robots, combining dense local graphs for thorough coverage with sparse global graphs for rapid relocation. A major challenge in topology generation is the computational cost of maintaining an accurate Euclidean Signed Distance Field (ESDF). Recent works have focused on accelerating skeleton extraction. Chen and Xiao [34] proposed an end-to-end neural network to predict GVDs directly from sensor data, significantly reducing online computation. Wen et al. [35] developed the G²VD planner, which uses grid-based Voronoi diagrams and Voronoi corridors to constrain search space and improve path smoothness. Gao et al. [36] extended Voronoi-based exploration to more general environments via lightweight feature extraction and frontier fusion. More recently, Dong et al. [37] introduced a multi-UAV exploration method based on a dynamic topological graph with graph Voronoi partition, achieving substantial reductions in exploration time and communication volume. Despite these advances in computational efficiency, all the methods above share a fundamental limitation: they derive topology passively from geometric distance fields, treating all free space as equally traversable. The topological graph is either predicted by a network, computed from a grid, or extracted via handcrafted rules—it is always a byproduct of geometric perception, not an active decision variable. 
This “geometry-only” paradigm leads to two critical drawbacks. First, it generates unnecessarily dense graphs: nodes are placed wherever local maxima of the distance field occur, without any consideration of semantic relevance or structural importance. Second, it is blind to non-geometric hazards. A region that is geometrically open but semantically non-traversable (e.g., tall grass, soft soil, negative obstacles) is still incorporated as valid nodes and edges, potentially guiding the planner into unsafe areas.

2.3. Semantic-Aware Navigation and Mapping

Semantic perception provides the foundational capability for robots to understand unstructured environments. Garg et al. [38] offer a systematic taxonomy of how semantic information can be integrated into robotic mapping and perception. In recent years, researchers have applied these semantic perception capabilities to a variety of robotic tasks. GANav [39], for instance, employs group-wise attention mechanisms to reliably distinguish navigable regions from hazardous terrain such as vegetation and negative obstacles in real time, enabling a robot to avoid obstacles safely. Kimera, proposed by Rosinol et al. [40], and Hydra, proposed by Hughes et al. [41], construct dense 3D dynamic scene graphs that encode both geometric structure and semantic hierarchy, generating rich environmental representations for downstream tasks such as object search and long-term autonomy. These works have achieved considerable success within their respective problem domains. However, neither local reactive navigation nor high-fidelity environmental mapping is designed to address the sequential decision-making problem that lies at the heart of autonomous exploration. GANav answers the question of “how to move” given an immediate goal, but it does not answer “where to go next” to maximize long-term information gain. Kimera and Hydra assume that a trajectory is already provided by an external planner and focus on reconstructing the environment from that trajectory. Consequently, even when a robot is equipped with state-of-the-art semantic perception and mapping capabilities, its exploration planner, the module responsible for deciding the next goal, still defaults to purely geometric heuristics such as the nearest frontier. Semantic labels remain confined to post-hoc analysis or short-term obstacle avoidance; they are not actively used to guide the robot toward regions that yield high information per unit travel distance.

Recognizing this gap, a growing body of research has begun to inject semantic cues directly into the goal-selection process. One such approach is S-RRT [42], which biases sampling-based planners by assigning higher sampling probabilities to regions containing target objects, thereby accelerating task-specific search. While effective for goal-directed navigation, this method does not aim to build a globally consistent, topology-aware map of the environment, nor does it address structural redundancy during exploration. The most methodologically relevant precedent to our work, however, is the semantic frontier-based exploration framework proposed by Gomez et al. [43]. Gomez and his colleagues pioneered the idea of classifying frontiers into “free areas” and “transit areas” (e.g., doors) using geometric rules, and selecting the next viewpoint by maximizing a utility function that balances frontier size, semantic importance, and topological hop-count. This framework shares with SATE the core philosophy that semantic information should influence not only perception but also goal selection, and that topological structure can serve as a guide for exploration efficiency. Nevertheless, the approach of Gomez et al. exhibits three inherent limitations. First, its semantic classification relies on hand-crafted geometric detectors (e.g., gap analysis for doors), which fail in terrains that lack explicit door-like features and instead contain hazards such as negative obstacles or dense vegetation. Second, its topological cost is a discrete hop-count that cannot capture the continuous, terrain-dependent effort required to navigate between nodes. Third, it lacks any mechanism to actively prune redundant nodes from the topological graph, leading to node proliferation and increased backtracking in large open spaces. It is worth noting that another line of research leverages Bird’s-Eye-View (BEV) representations to provide global spatial context for navigation. 
Dual-BEV Nav [44], for example, employs top-down traversability heatmaps to plan long-range paths in outdoor terrains. This work is particularly relevant to our perception module, as it demonstrates the effectiveness of using overhead imagery to generate dense traversability predictions. However, it does not address the complete coverage exploration problem that we target in this paper.

2.4. Summary

Taken together, the above observations reveal a critical fragmentation of contemporary methodologies: geometric planners pursue maximum coverage at the expense of semantic safety; traditional Generalized Voronoi Diagram (GVD) methods provide structural abstraction, yet their passive extraction approach is computationally expensive; and semantic methods, despite their perceptual capabilities, often lack integration with global topology optimization. To fill these gaps, we propose the Structure-Aware Topological Exploration (SATE) framework, which combines the Semantic Seeded Voronoi (SSV) mechanism (used to anchor the skeleton in safe regions) with the topological sparsity regularization term. Together they transform the map from a passive record into an active navigation tool that prunes structural redundancy, thus maximizing exploration efficiency while guaranteeing navigation safety.

3. Methodology

The proposed Structure-Aware Topological Exploration (SATE) framework aims to enable a mobile robot to autonomously explore an unknown environment $E$ under strict computational and kinematic constraints, while simultaneously constructing a topologically consistent map $M$. The schematic diagrams in this section were constructed using Microsoft PowerPoint 2016 to visualize the system architecture and algorithmic procedures. Figure 1 employs a hybrid visualization approach: it integrates the logical framework with real-time algorithmic outputs. The embedded sub-images, which encompass both raw sensory inputs and processed intermediate representations, are actual results captured from the simulation platform, demonstrating the authentic data processing pipeline. Figure 2 (U-Net architecture) and Figure 3 (SSV mechanism) are conceptual illustrations created using vector graphics. These figures are designed to abstractly represent the network structure and geometric logic, focusing on the theoretical workflow rather than empirical data instantiation.

3.1. Problem Formulation

Consider a bounded but initially unknown workspace $W \subset \mathbb{R}^{2}$. A mobile robot equipped with a top-down visual sensor conducts exploration, with its state at time $t$ denoted by $p_{t} \in SE(2)$. To address the complexity of the exploration task, this paper adopts a dual-layer mapping strategy, maintaining two coupled representations of the environment:

Semantic Traversability Field ( $H$ ): Breaking away from the rigid binary description of “occupied/free” in traditional occupancy grids, we model the environment as a continuous scalar field $H : W \to [0, 1]$. For any spatial coordinate $x \in W$, the value $H(x)$ represents the degree of “traversability” inferred from visual observations. Based on this, the safe navigation manifold $F$ is no longer defined as a set of discrete grid cells, but as a naturally extending superlevel set $F = \{x \in W \mid H(x) > \tau_{safe}\}$, where $\tau_{safe}$ is a designated safety baseline threshold.
Topological Graph ( $G_{t}$ ): To characterize the “skeleton structure” of the environment, we construct and maintain a dynamic topological graph $G_{t} = (V_{t}, E_{t})$. The vertex set $V_{t} = \{v_{1}, \dots, v_{n}\} \subset F$ consists of sparse waypoints anchored to the medial axis of the safe manifold; the edge set $E_{t} \subseteq V_{t} \times V_{t}$ corresponds to physically existing navigable paths: an edge $e_{ij} = (v_{i}, v_{j})$ is established if and only if a collision-free path $\sigma : [0, 1] \to F$ exists between the two vertices.
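
As a rough illustration of these two representations, the sketch below thresholds a toy heatmap into the safe set $F$ and tests whether two cells could be joined by an edge. It is not the authors' implementation: the grid discretization, the 4-connected breadth-first search standing in for the continuous path $\sigma$, the helper names `safe_mask` and `path_exists`, and the value `tau_safe = 0.5` are all assumptions made for the example.

```python
from collections import deque


def safe_mask(heatmap, tau_safe):
    """Discretized superlevel set F: True where H(x) > tau_safe."""
    return [[h > tau_safe for h in row] for row in heatmap]


def path_exists(mask, start, goal):
    """Breadth-first search over 4-connected safe cells; cells are (row, col)."""
    if not (mask[start[0]][start[1]] and mask[goal[0]][goal[1]]):
        return False
    rows, cols = len(mask), len(mask[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and mask[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


# Toy 3x3 heatmap; the values and tau_safe = 0.5 are illustrative only.
H = [
    [0.90, 0.80, 0.10],
    [0.70, 0.20, 0.00],
    [0.95, 0.90, 0.60],
]
F = safe_mask(H, tau_safe=0.5)
edge_ok = path_exists(F, (0, 1), (2, 2))       # connected through the left column
edge_blocked = path_exists(F, (0, 1), (0, 2))  # (0, 2) lies outside F
```

Because the threshold is a strict inequality, a cell whose value equals `tau_safe` is treated as unsafe, which matches the definition of $F$ above.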

The core idea of the SATE framework is to formulate the autonomous exploration problem as a constrained optimization problem. Specifically, it seeks an optimal target viewpoint $v^{*}$ that possesses a “multi-tasking attribute”: maximizing the reduction in environmental uncertainty while maintaining the compactness of the topological graph. Mathematically, this decision-making process is defined as:

$$v^{*} = \underset{v \in F}{\arg\max} \left( \lambda_{e} \cdot I_{\exp}(v) - \lambda_{t} \cdot S_{topo}(v) - \lambda_{c} \cdot C_{nav}(v) \right), \tag{1}$$

where $\lambda_{e}$, $\lambda_{t}$, and $\lambda_{c}$ are weighting coefficients that regulate the relative importance of exploration gain, topological maintenance, and navigation cost, respectively. The functional terms have clear physical meanings: $I_{\exp}(v)$ measures the expected information gain of the target viewpoint; $S_{topo}(v)$ represents the topological sparsity regularization term proposed in this paper, which serves to suppress the generation of redundant nodes; and $C_{nav}(v)$ signifies the navigation cost from the current pose to the target viewpoint. It is important to emphasize that the constraint $v \in F$ remains explicit: while the exploration task is important, safety is always the primary prerequisite.

3.2. System Overview

The autonomous exploration process of the robot is a continuous closed-loop system. As shown in Figure 1, this framework adopts a hierarchical structure, decoupling environmental cognition from decision planning to achieve a modular design. The interaction logic between these modules is visualized through specific connections where solid arrows represent the primary execution pipeline carrying both control signals and sequential data flow, while dashed arrows denote auxiliary data references accessing the shared global map resource. The upper layer is the Perception Backbone, which serves as the data source for the entire system; based on a lightweight U-Net architecture, it processes local top-down visual data to output a dense environmental traversability heatmap that supports safety evaluation in downstream modules. The lower layer is the Navigation Control Loop, responsible for driving the robot to execute specific exploration actions in a sequential order. Specifically, the Sampling and Memory Module first receives the semantic map stream directly from the perception backbone via the solid arrow connection to extract the topological skeleton using the SSV mechanism and maintain a pool of candidate goals. Subsequently, the Decision Core Module acts as the planner’s brain, evaluating these candidates by querying the stored global traversability map via the dashed arrow input to calculate exploration gain and the proposed topological sparsity for optimal target selection. Finally, the Hierarchical Action Execution Module receives the decision target via the solid arrow and references the map data via the dashed arrow connection to generate a collision-free trajectory; simultaneously, the updated topological graph is fed back to the Decision Core through the recursive solid arrow to enforce sparsity constraints in the subsequent cycle, thus effectively closing the exploration loop.

3.3. Semantic Skeleton Sampling (SSV) Mechanism

Traditional geometric skeleton extraction algorithms often ignore the semantic information of the environment. To address this issue, this paper designs the SSV mechanism and adopts a dual-layer coupling strategy—building the environmental skeleton based on geometric constraints and optimizing the skeleton structure using semantic information.

3.3.1. Semantic Segmentation Network

To extract global traversability features from aerial images, we employ the U-Net architecture as the backbone of our semantic perception module. As illustrated in Figure 2, the network follows a classic encoder–decoder structure augmented with skip connections. The encoder path consists of repeated blocks of convolutional layers and max-pooling operations to progressively capture high-level semantic context, while the decoder path symmetrically recovers spatial resolution through up-sampling. Crucially, the skip connections concatenate high-resolution feature maps from the contracting path with the up-sampled output of the expanding path. This mechanism is essential for unstructured terrain analysis as it preserves the fine-grained boundary information of irregular obstacles, such as rock edges and vegetation lines, which are often lost in pure down-sampling architectures. However, applying a standard U-Net to unstructured environments presents a specific challenge: severe class imbalance, where safe traversable regions (background) often dominate the field of view compared to sparse obstacles (foreground). Standard training with Cross-Entropy loss often leads to model degeneracy, where the network biases towards predicting “safe” everywhere to minimize global error. To address this, we introduce a targeted improvement to the training strategy by adopting the Binary Focal Loss [44] instead of the standard cross-entropy loss. The loss function is defined as:

$$L_{focal} = -\alpha (1 - p)^{\gamma} \log(p), \tag{2}$$

where $p$ is the predicted probability of the ground truth class. We tune the balancing factor $\alpha$ to address class imbalance and the focusing parameter $\gamma$ to down-weight the loss contribution of easy examples (large open fields). This adaptation forces the model to focus on learning hard, sparse examples (irregular obstacles), ensuring accurate boundary extraction even in highly unstructured terrains.
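
To make the effect of the focusing term concrete, the following per-pixel sketch evaluates Equation (2) directly. It is an illustration rather than the training code used in the paper: the function name `binary_focal_loss`, the convention that label 1 marks the obstacle class, the clamping constant `eps`, and the defaults `alpha = 0.25` and `gamma = 2.0` (values commonly used with focal loss, not values reported here) are assumptions.

```python
import math


def binary_focal_loss(prob_pos, label, alpha=0.25, gamma=2.0, eps=1e-7):
    """Equation (2) for one pixel; prob_pos is the predicted P(label == 1)."""
    # p is the probability the network assigns to the ground-truth class.
    p = prob_pos if label == 1 else 1.0 - prob_pos
    # Clamp to keep log(p) finite when the prediction saturates.
    p = min(max(p, eps), 1.0 - eps)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


# Both pixels are true obstacles (label 1); alpha and gamma are assumed defaults.
easy = binary_focal_loss(0.95, 1)  # confident and correct: heavily down-weighted
hard = binary_focal_loss(0.10, 1)  # confident and wrong: dominates the loss
```

With `gamma = 0` and `alpha = 1` the expression reduces to the standard cross-entropy term $-\log(p)$, which is a convenient sanity check when tuning the two parameters.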

3.3.2. Medial-Axis Aligned Sampling

To ensure that the generated candidate nodes naturally align with the environmental medial axis (i.e., the Generalized Voronoi Diagram), this paper introduces the Euclidean Distance Transform (EDT) algorithm. First, a distance field $D(x)$ is constructed, where the value represents the Euclidean distance from any pixel in space to the nearest obstacle; subsequently, local maxima in the distance field are identified as candidate Voronoi seed points $V_{cand}$. Simply put, if a point lies within the safe region and its distance field value is higher than that of its surrounding neighbors, it is selected as a candidate node. This method ensures that all candidate nodes are anchored to the “skeleton” of the environment.
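
The two steps described above can be sketched on a small grid as follows. This is a brute-force illustration under stated assumptions, not the paper's implementation: the helper names `distance_field` and `local_maxima`, the 8-neighbour window, and the strict comparison are choices made for the example, and a practical system would more likely call an optimized routine such as `scipy.ndimage.distance_transform_edt`.

```python
import math


def distance_field(safe):
    """Brute-force EDT: distance from each safe cell to the nearest unsafe cell."""
    rows, cols = len(safe), len(safe[0])
    obstacles = [(r, c) for r in range(rows) for c in range(cols) if not safe[r][c]]
    field = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if safe[r][c]:
                field[r][c] = min(
                    (math.hypot(r - orow, c - ocol) for orow, ocol in obstacles),
                    default=math.inf,  # no obstacle anywhere on the grid
                )
    return field


def local_maxima(field, safe):
    """Safe cells whose distance value is strictly higher than all 8 neighbours."""
    rows, cols = len(field), len(field[0])
    seeds = []
    for r in range(rows):
        for c in range(cols):
            if not safe[r][c]:
                continue
            neighbours = [
                field[nr][nc]
                for nr in range(max(r - 1, 0), min(r + 2, rows))
                for nc in range(max(c - 1, 0), min(c + 2, cols))
                if (nr, nc) != (r, c)
            ]
            if all(field[r][c] > value for value in neighbours):
                seeds.append((r, c))
    return seeds


# Toy 5x5 room: a one-cell obstacle border around a 3x3 safe interior.
safe = [[0 < r < 4 and 0 < c < 4 for c in range(5)] for r in range(5)]
D = distance_field(safe)
V_cand = local_maxima(D, safe)  # only the room centre survives
```

A strict comparison discards flat ridges where neighbouring cells share the same distance value, so corridors of even width may need a non-strict test followed by de-duplication; the source text does not say how such ties are handled.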

3.3.3. Candidate Node Scoring and Pruning

Extracting local maxima alone is insufficient, as raw seed points are often too dense. In this phase, semantic information plays a key role. Additionally, to address boundary effect issues during the Voronoi partitioning process, this paper introduces dummy points outside the boundaries of the perception range. These dummy points serve as a constraint mechanism, ensuring that the partitioning of Voronoi cells focuses on the internal structure of the environment rather than invalid regions outside the boundaries. For each candidate seed point $v_{i}$, its corresponding Voronoi cell $R_{i}$ is first calculated, and then the traversability field within this cell is integrated to obtain the structural importance score $S_{imp}(v_{i})$ for the seed point. This score reflects the size of the effective traversable space covered by the candidate node. Finally, a Non-Maximum Suppression (NMS) algorithm is performed on all candidate nodes to eliminate redundant nodes, resulting in the final set of candidate nodes. These nodes correspond to core regions in the environment with extensive traversable space, high safety, and significant strategic importance (processing flow shown in Figure 3).
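
A minimal sketch of the scoring and pruning stage is given below. Several details are assumptions because the text does not specify them: Voronoi cells are approximated by assigning each grid cell to its nearest seed, the integral of the traversability field becomes a sum of heatmap values, NMS suppresses any seed lying within a fixed `radius` of a higher-scoring one, and the dummy boundary points are omitted for brevity. The names `importance_scores` and `non_max_suppression` are hypothetical.

```python
import math


def importance_scores(heatmap, seeds):
    """Sum H over each seed's discrete Voronoi cell (nearest-seed assignment)."""
    scores = {seed: 0.0 for seed in seeds}
    for r, row in enumerate(heatmap):
        for c, h in enumerate(row):
            nearest = min(seeds, key=lambda s: math.hypot(r - s[0], c - s[1]))
            scores[nearest] += h
    return scores


def non_max_suppression(scores, radius):
    """Keep seeds in descending score order, dropping any within radius of a kept one."""
    kept = []
    for seed in sorted(scores, key=scores.get, reverse=True):
        if all(math.hypot(seed[0] - k[0], seed[1] - k[1]) > radius for k in kept):
            kept.append(seed)
    return kept


# Toy 3x6 heatmap: fully traversable on the left, half-traversable on the right.
heatmap = [[1.0, 1.0, 1.0, 0.5, 0.5, 0.5] for _ in range(3)]
seeds = [(1, 1), (1, 2), (1, 5)]
S_imp = importance_scores(heatmap, seeds)
final_nodes = non_max_suppression(S_imp, radius=1.5)
```

In this toy case the two neighbouring seeds `(1, 1)` and `(1, 2)` compete, and the one covering more traversable area is kept, while the distant seed `(1, 5)` survives despite its lower score.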

3.4. Structure-Aware Decision Planning

After obtaining the candidate node set $V_{cand}$, the next step is to solve the decision problem of “where to explore next.” This process requires a precise balance between exploration gain, navigation cost, and map quality.

3.4.1. Multi-Objective Utility Function

To guide the exploration decision, this paper adapts the theoretical objective defined in Equation (1) to the discrete candidate set $V_{cand}$. Specifically, the optimal target node is obtained by evaluating Equation (1) over all $v \in V_{cand}$. Here, $\lambda_{e}$, $\lambda_{c}$, and $\lambda_{t}$ serve as tunable hyperparameters used to normalize different metrics and adjust the trade-off between efficiency and map quality. The three terms retain their definitions from Equation (1):

Exploration Gain ( $I_{\exp}$ ): Given the sparse nature of the candidate node set, a ray-casting algorithm is used to efficiently estimate the range of unknown areas visible to the target node. The physical meaning of this metric is the potential reduction in map entropy achieved by the target node.
Navigation Cost ( $C_{nav}$ ): It is particularly worth noting that this paper does not use Euclidean distance to calculate navigation cost, but instead solves for the shortest path distance on the traversable manifold $F$ based on the A* algorithm. This design effectively avoids selecting target points that appear geometrically close but are blocked by obstacles (such as walls).
Topological Sparsity ( $S_{topo}$ ): This is the core innovation of the SATE framework, serving as an online constraint mechanism to suppress the excessive growth of redundant nodes in the topological graph.
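To illustrate why the navigation cost is measured along the traversable manifold rather than in a straight line, the following minimal sketch runs A* on a small occupancy grid. The grid, the 4-connected unit-cost motion model, and the helper name `astar_length` are illustrative assumptions, not the authors' implementation.

```python
import heapq
import math

def astar_length(grid, start, goal):
    """Shortest 4-connected path length in cells, or None if the goal is unreachable."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    best = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best[cell]:
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + heuristic((nr, nc)), ng, (nr, nc)))
    return None

# 0 = traversable, 1 = blocked; a wall separates the start from the goal
GRID = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
start, goal = (0, 0), (0, 2)
euclidean = math.dist(start, goal)          # straight-line distance: 2 cells
geodesic = astar_length(GRID, start, goal)  # path around the wall: 8 cells
```

The goal is only two cells away in a straight line, but the robot must travel eight cells to reach it, so a Euclidean cost would rank this target far too favorably.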

3.4.2. Topological Sparsity Regularization

Existing exploration algorithms often ignore the topological structure of the map, leading to the generation of a large number of redundant nodes in open areas. To address this problem, this paper designs a potential energy field model based on Kernel Density Estimation (KDE). From a theoretical perspective, topological density should strictly be calculated based on the geodesic distance on the topological graph to account for non-Euclidean geometries. However, in the context of real-time robotic exploration, calculating all-pairs geodesic distances incurs a prohibitive computational cost, rendering it impractical for high-frequency onboard decision loops. Therefore, we adopt a local Euclidean approximation strategy. While we acknowledge that the Euclidean metric may deviate from the true topological distance in non-convex scenarios (e.g., thin walls), it serves as a sufficiently accurate first-order approximation within a small local neighborhood of radius R. Consequently, we restrict the repulsive potential calculation to this local cutoff radius, defining the redundancy cost $S_{topo} (v)$ as:

S_{topo} (v) \approx \sum_{u \in N_{R} (v)} K_{σ} (∥ v - u ∥_{2}),

(3)

where $N_{R} (v) = \{u \in V_{global} ∣ ∥ v - u ∥_{2} \leq R\}$ denotes the set of existing nodes within the local cutoff distance R. The function $K_{σ} (\cdot)$ is a Gaussian kernel defined as $K_{σ} (d) = \exp (- d^{2} / (2 σ^{2}))$, where $σ$ controls the decay rate of the repulsive field. Physically, this model implies that each existing node generates a local repulsive field; if a candidate node attempts to “squeeze” into a node-dense region, $S_{topo} (v)$ increases sharply, pushing the exploration frontier toward sparse areas.
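Equation (3) can be sketched in a few lines. The function name `topo_sparsity`, the toy node coordinates, and the cutoff radius of 9 m are assumptions for illustration; only the kernel form and the value $σ = 3.0$ m come from the text.

```python
import math

def topo_sparsity(v, existing_nodes, sigma, radius):
    """Sum of Gaussian kernel responses from existing nodes within the cutoff radius."""
    total = 0.0
    for u in existing_nodes:
        d = math.dist(v, u)
        if d <= radius:  # nodes beyond the cutoff contribute nothing
            total += math.exp(-d * d / (2.0 * sigma * sigma))
    return total

# Hypothetical existing graph nodes clustered near the origin (metres)
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dense = topo_sparsity((0.5, 0.5), nodes, sigma=3.0, radius=9.0)
sparse = topo_sparsity((8.0, 8.0), nodes, sigma=3.0, radius=9.0)
```

A candidate placed inside the cluster receives a penalty close to 3 (nearly one full unit per neighbor), whereas a candidate far from all existing nodes receives no penalty at all.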

As formally outlined in Algorithm 1, the decision-making process iteratively evaluates the generated candidate set $V_{cand}$ to identify the optimal exploration goal. The execution flow begins with a strict kinematic feasibility verification (Lines 3–7), where the function PathFinding attempts to generate a collision-free path $P$ from the robot’s current pose $p_{t}$ to the candidate node $v$ on the traversable manifold $F$. Although computationally more intensive than simple Euclidean metrics, this explicit path planning acts as a crucial topological check to ensure that all selected goals are physically reachable, thereby compensating for the local approximation of the KDE model and avoiding the “unreachable frontier” problem common in traditional geometric methods. Once a candidate is validated as feasible, the algorithm proceeds to metric evaluation and sparsity checking (Lines 9–12). The exploration gain $I_{\exp}$ is computed using ray-casting, while the topological sparsity is assessed by querying the global graph $V_{global}$. Finally, in the utility aggregation phase (Lines 14–19), the composite utility $U (v)$ is derived by weighting the exploration gain, navigation cost, and sparsity terms according to Equation (1). By maintaining a running maximum $U_{\max}$, the algorithm updates the optimal target $v^{*}$ whenever a candidate with higher utility is identified. This greedy strategy ensures that the robot consistently moves towards the frontier that offers the best trade-off between information gain and map structural quality.

Algorithm 1 Structure-Aware Decision Process
Require: Candidate Set $V_{cand}$ , Robot Pose $p_{t}$ , Global Graph $V_{global}$
Ensure: Optimal Target $v^{*}$
1:	Initialize $v^{*} \leftarrow \emptyset$ , $U_{\max} \leftarrow - \infty$
2:	for all $v \in V_{cand}$ do
3:	{// Step 1: Kinematic Feasibility Check (Safety First)}
4:	$P \leftarrow PathFinding (p_{t}, v, F)$ {e.g., A* Search}
5:	if $P = \emptyset$ or $P$ is in collision then
6:	continue
7:	end if
8:	$C_{nav} \leftarrow Length (P)$
9:	{// Step 2: Metric Evaluation}
10:	$I_{\exp} \leftarrow RayCastGain (v)$ {Exploration Gain}
11:	Find local neighbors $N_{R} (v) \leftarrow \{u \in V_{global} ∣ ∥ v - u ∥_{2} \leq R\}$
12:	$S_{topo} \leftarrow \sum_{u \in N_{R} (v)} K_{σ} (∥ v - u ∥_{2})$ {Local Sparsity Regularization}
13:	{// Step 3: Utility Calculation (Equation (1))}
14:	$U (v) \leftarrow λ_{e} I_{\exp} - λ_{c} C_{nav} - λ_{t} S_{topo}$
15:	{// Step 4: Optimization}
16:	if $U (v) > U_{\max}$ then
17:	$U_{\max} \leftarrow U (v)$
18:	$v^{*} \leftarrow v$
19:	end if
20:	end for
21:	return $v^{*}$
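The control flow of Algorithm 1 can be sketched in Python as follows. The path planner, ray-casting gain, and sparsity term are passed in as callables because their implementations are not given here; the lookup tables standing in for them and the candidate names are hypothetical. The weights follow the values reported in the sensitivity analysis ($λ_{e} = 0.3$, $λ_{c} = 0.2$, $λ_{t} = 0.5$).

```python
import math

def select_target(candidates, path_length, ray_cast_gain, sparsity, weights):
    """Return the feasible candidate with the highest utility, or None if none is reachable.

    path_length(v) is assumed to return None when no collision-free path exists.
    """
    lam_e, lam_c, lam_t = weights
    best_v, best_u = None, -math.inf
    for v in candidates:
        c_nav = path_length(v)
        if c_nav is None:
            continue  # feasibility check: skip unreachable candidates
        utility = lam_e * ray_cast_gain(v) - lam_c * c_nav - lam_t * sparsity(v)
        if utility > best_u:
            best_v, best_u = v, utility
    return best_v

# Hypothetical per-candidate values standing in for A*, ray casting, and Equation (3)
PATHS = {"a": 10.0, "b": None, "c": 12.0}
GAINS = {"a": 40.0, "b": 90.0, "c": 50.0}
SPARSITY = {"a": 2.5, "b": 0.0, "c": 0.1}

best = select_target(["a", "b", "c"], PATHS.get, GAINS.get, SPARSITY.get, (0.3, 0.2, 0.5))
```

Candidate "b" has the largest gain but is rejected as unreachable; "c" then wins over "a" because it offers more gain and lies in a sparser part of the graph, even though its path is slightly longer.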

3.5. Closed-Loop Control: From Decision to Execution

Determining the optimal target node $v^{*}$ is only half the task; reaching that node safely is the other core issue. Rather than introducing a new motion planner, this paper augments the conventional one: the costmap of the low-level motion planner explicitly incorporates the semantic traversability field $H$ output by the upstream module, rather than relying solely on geometric obstacle maps generated by sensors like LiDAR. As a result, when the A* algorithm searches for a path, spaces marked as “negative obstacles” or “dangerous areas” by the semantic model are assigned extremely high traversal costs, even if their geometric features appear flat and traversable. Through this design, the “intelligence” of the global decision is carried over to the motion execution layer, so that the robot both plans and executes safe motion.
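One possible way to fuse the two sources of information in a costmap cell is sketched below. The linear penalty, the `semantic_penalty` value, and the convention that the traversability score lies in [0, 1] with 1 meaning fully safe are all assumptions; the exact fusion rule is not stated in the text.

```python
import math

def fused_cell_cost(geometric_occupied, traversability, semantic_penalty=100.0):
    """Traversal cost of one costmap cell (illustrative fusion rule).

    traversability is assumed to lie in [0, 1], with 1 meaning fully safe.
    """
    if geometric_occupied:
        return math.inf  # geometric obstacles stay impassable
    # Geometrically free cells are penalized in proportion to semantic risk
    return 1.0 + semantic_penalty * (1.0 - traversability)

flat_safe = fused_cell_cost(False, 1.0)       # ordinary free cell
flat_but_risky = fused_cell_cost(False, 0.0)  # e.g. a negative obstacle that looks flat
```

Under this rule a geometrically flat cell flagged as unsafe costs about a hundred times more than a safe one, so A* routes around it whenever an alternative exists.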

4. Experiments

To systematically evaluate the performance and robustness of the proposed SATE framework, this study constructed a simulation test environment based on real-world aerial imagery and conducted a series of comprehensive comparative experiments. The core objective of this experimental design is to specifically address the following three key research questions:

RQ1 (System Efficiency): Compared with classical geometric exploration baselines, does SATE achieve better performance in terms of exploration efficiency and the compactness of the generated topological graph?
RQ2 (Topological Quality): Is the topological skeleton generated by the SSV mechanism better than those obtained through standard spatial discretization methods?
RQ3 (Component Necessity): What specific roles do semantic perception and topological sparsity regularization play in ensuring navigation safety and decision robustness?

To clarify the figure production process and technical details: all experimental logs were recorded via ROS (Robot Operating System) bags. Figure 4 presents UAV aerial imagery and corresponding semantic heatmaps generated by the U-Net perception module. Figure 5 was generated using Python (version 3.8.18, with Matplotlib 3.7.3) to plot statistical data extracted from ROS logs. Figure 6, Figure 7, Figure 8 and Figure 9 were produced using Python (version 3.8.18) by projecting and overlaying recorded ROS trajectory data, candidate points, and A* planned paths onto rectified UAV aerial maps.

4.1. Experimental Setup

4.1.1. Simulation Environment and Test Maps

To thoroughly test the generalization capability of SATE, we utilized three distinct simulation environments constructed from real-world aerial imagery. The key specifications and landmarks of each scenario are listed in Table 1. To visually validate the perception performance, Figure 4 presents both the aerial RGB views and the corresponding semantic traversability heatmaps generated by our model. As shown in the visualization, the model effectively assigns low traversal costs (blue regions) to navigable areas while highlighting obstacles (yellow/red regions) across diverse terrains.

To ensure rigorous validation, all primary comparative experiments and ablation studies were carried out in Map C, which represents the most complex scenario containing high-risk negative obstacles.

4.1.2. Perception Model Training Setup

To enable the semantic perception module to precisely identify traversable regions in aerial imagery, we adopted the self-supervised learning paradigm inspired by the Dual-BEV Nav framework [44]. The training process utilizes the RECON dataset [45] which is specifically collected for off-road navigation tasks and contains over 5000 trajectories from 9 distinct real-world unstructured scenarios. This dataset covers diverse complex terrains including grasslands, woodlands, and varying lighting conditions, ensuring that the model is exposed to representative samples of the challenges encountered in real-world exploration. We utilize the robot’s historical trajectories as supervision ground truth, labeling pixels covered by historical paths as foreground traversable regions while remaining areas are marked as background. To address the severe class imbalance caused by sparse positive labels in this trajectory-based supervision, we implemented the Binary Focal Loss strategy detailed in Section 3.3.1 during the optimization process. By leveraging the focusing parameter defined in Equation (2), the model is forced to prioritize learning from the hard and sparse examples provided by the historical trajectories. This configuration enables the network to effectively capture the textural features of traversable terrain and generalize this understanding to unvisited regions with similar visual characteristics rather than overfitting to specific path geometries. Detailed training hyperparameters and configuration settings are listed in Table 2.

4.1.3. Implementation Details

The simulation framework is built on ROS Noetic. To explicitly evaluate exploration logic, we assume reliable state estimation, abstracting away low-level sensor noise. To validate the system’s efficiency, the complete software stack is deployed on a single embedded NVIDIA Jetson AGX Orin module (32 GB). During real-world deployment tests, the system maintains a stable operating frequency of 20 Hz. The average inference time for the semantic perception module is approximately 15 ms, and the decision-making process consumes 6 ms. Key system parameters are summarized in Table 3. The decision weights ($λ_{e}, λ_{c}, λ_{t}$) were determined through extensive empirical testing to balance the trade-off between exploration speed and map compactness. Notably, $λ_{t}$ is set to a relatively high value (0.5). This configuration is deliberately chosen to enforce strict topological sparsity constraints from the early stages of exploration, preventing the structural redundancy common in complex unstructured environments.

4.1.4. Evaluation Metrics

To quantitatively evaluate the proposed framework, we adopted three performance metrics: Explored Region Rate ($R_{exp}$), Average Path Length ($L_{avg}$), and Exploration Efficiency ($η$). Their specific definitions are detailed as follows:

Explored Region Rate $R_{exp}$ : This metric gauges how complete the map built by the robot is. Its calculation formula is:

$R_{exp} = \frac{N_{exp}}{N_{total}},$

(4)

where $N_{exp}$ stands for the number of explored free space cells, and $N_{total}$ refers to the total number of free space cells in the ground truth map.
Average Path Length $L_{avg}$ : This metric reflects the average travel distance across all experimental trials, defined as:

$L_{avg} = \frac{1}{M} \sum_{i = 1}^{M} L_{i},$

(5)

where M denotes the total number of trials, and $L_{i}$ is the path length traveled in the i-th trial.
Exploration Efficiency $η$ : This metric measures the average information gain (i.e., entropy reduction) per unit distance traveled across all trials. Mathematically, it is expressed as:

$η = \frac{1}{M} \sum_{i = 1}^{M} \frac{H (M_{0}) - H (M_{i})}{\sum_{t = 1}^{T_{i}} d_{t}},$

(6)

where $H (M)$ represents the entropy of the occupancy map M. $T_{i}$ and $d_{t}$ respectively correspond to the total number of steps and the distance of the path segment at step t in the i-th trial.
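The three metrics can be computed as sketched below. The map entropy is taken here to be the sum of per-cell binary Shannon entropies (in bits) over the occupancy probabilities, which is a common convention but an assumption with respect to the text; the toy maps and distances are likewise hypothetical.

```python
import math

def cell_entropy(p):
    """Binary Shannon entropy (bits) of one occupancy probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully known cell carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def map_entropy(occupancy):
    return sum(cell_entropy(p) for p in occupancy)

def explored_rate(n_explored, n_total):
    """Equation (4): fraction of ground-truth free cells that were explored."""
    return n_explored / n_total

def average_path_length(lengths):
    """Equation (5): mean travelled distance over all trials."""
    return sum(lengths) / len(lengths)

def exploration_efficiency(trials):
    """Equation (6): mean entropy reduction per unit distance.

    trials is a list of (initial_map, final_map, step_distances) tuples.
    """
    ratios = [
        (map_entropy(m0) - map_entropy(mi)) / sum(steps)
        for m0, mi, steps in trials
    ]
    return sum(ratios) / len(ratios)

unknown = [0.5, 0.5, 0.5, 0.5]  # four unknown cells: 4 bits of entropy
trials = [
    (unknown, [0.0, 1.0, 0.0, 0.5], [1.0, 2.0]),  # 3 bits gained over 3 m
    (unknown, [0.0, 0.0, 1.0, 1.0], [2.0, 2.0]),  # 4 bits gained over 4 m
]
eta = exploration_efficiency(trials)
```

Both toy trials gain one bit per metre, so the averaged efficiency is exactly 1.0 in this example.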

4.2. System-Level Comparative Evaluation

Hierarchical planners such as TARE [24] and graph-based methods such as GBPlanner [32] are at the forefront of geometric exploration, but they mostly rely on high-precision 3D LiDAR point cloud data, and their core objective is to maximize geometric surface coverage. Conversely, the SATE framework proposed in this paper is designed for visual semantic exploration tasks, with a key emphasis on identifying non-geometric hazards that LiDAR-based methods easily miss, such as negative obstacles. To ensure a fair comparison under the same sensor modality, and to highlight the technical contribution of the proposed topological regularization method, we selected the classical Nearest Frontier algorithm (NF) and the sampling-based Rapidly exploring Random Tree algorithm (RRT) as benchmarks; these two families of algorithms are the technical roots of the vast majority of autonomous exploration strategies. In addition, we introduced the Utility-Greedy algorithm (MI) as an ablation baseline to explicitly verify the effect of the topological sparsity regularization term $S_{topo}$. Furthermore, to compare against a representative semantic-aware exploration method, we implemented a semantic frontier-based exploration baseline (denoted as SNF) following the method of Gomez et al. [43]. This baseline classifies frontiers into free areas and transit areas using geometric heuristics, and selects the next viewpoint by maximizing a utility function $f (p) = A (p) \cdot S (p) \cdot e^{1 / C (p)}$, where $A (p)$ is the frontier size, $S (p)$ is a semantic weight (transit areas receive higher weight), and $C (p)$ is the topological hop-count distance. SNF shares the high-level philosophy of SATE, namely using semantics and topology to guide exploration, but differs fundamentally in its technical realization, making it a suitable comparator for isolating the contributions of our learning-based traversability field and continuous topological optimization. In summary, the comparative experiments use the following baseline strategies:

Frontier-Based Exploration (NF): A typical geometry-driven strategy [15] used as a standard benchmark. It prioritizes the nearest boundary between known free space and unknown regions to guide exploration.
RRT-Exploration (RRT): A classic sampling-based method that leverages Rapidly exploring Random Trees. It randomly generates exploration goals within unknown regions, providing a baseline for probabilistic planning efficiency.
Utility-Greedy (MI): An ablated variant of the proposed SATE framework. This method retains the SSV frontend for candidate generation but selects targets solely based on immediate information gain and travel cost, excluding the $S_{topo}$ term. It is designed to isolate and verify the specific contribution of the proposed topological feedback mechanism.
Semantic Frontier-based Exploration (SNF): A representative semantic topological exploration baseline implemented based on Gomez et al. [43]. It incorporates hand-crafted semantic rules (door detection) and discrete topological cost (hop-count), yet lacks continuous traversability learning and online sparsity control. This baseline directly represents the “semantic-aware but non-topology-optimized” paradigm.

All autonomous exploration tasks were carried out in Scenario (c) (Unstructured Training Field), which features a complex mix of open spaces and hazardous negative obstacles. To evaluate the robustness of the system against varying initial conditions, we adopted a multi-start validation strategy: the robot was spawned in three strategically distinct zones (bottom-center, bottom-right, and center). This setup prevents results from being biased by a single favorable spawn position and ensures the robot faces diverse topological challenges from different entry points. Each trial continued until the explored region rate ($R_{exp}$) reached saturation (around 99% coverage) or a timeout occurred. The quantitative metrics (Table 4) present the averaged performance across these three distinct trials.

4.2.1. Quantitative Efficiency Analysis

As shown in Table 4, SATE demonstrates a consistent performance advantage in this complex unstructured environment. While baseline methods (MI and RRT) require over 500 m to complete exploration, SATE attains full coverage with an average path length of only 278.4 m, a reduction of roughly 45%. The semantic topological baseline SNF, while outperforming pure geometric methods (NF, RRT) with an average path length of 320.4 m and an efficiency $η$ of 0.28, still falls considerably behind SATE. This performance gap is instructive: although SNF incorporates semantic priors and topological cost into its decision function, its semantic information is derived from rigid geometric rules (door detection based on gap size and symmetry), which are less robust in unstructured outdoor terrains containing negative obstacles and irregular vegetation. Moreover, its topological cost is a discrete hop-count that fails to reflect the true continuous navigation effort, leading to suboptimal path selection around narrow bottlenecks. In contrast, SATE’s learned traversability field provides dense, probabilistic semantic cues, and its A*-based path length on the safe manifold captures the exact navigation cost. This comparison empirically validates that the specific technical innovations of SATE, and not merely the high-level idea of “semantics + topology,” are responsible for its superior efficiency.

Figure 5 further presents the exploration process averaged across the three sets of experiments. The horizontal axis represents the accumulated path length, while the vertical axis corresponds to the percentage of the explored area relative to the total environmental area. Both the Utility-Greedy algorithm (MI, yellow curve) and SATE (blue curve) increase coverage rapidly in the initial stage. However, as exploration advances, the growth of the greedy algorithm slows markedly, exhibiting a significant “plateau effect.” This stagnation stems from the lack of topological foresight: in unstructured environments with negative obstacles and narrow corridors, a pure greedy strategy makes the robot susceptible to local dead zones, often requiring long backtracking paths to cover the remaining areas. The SNF baseline (magenta dash–dot curve) shows improved efficiency over purely geometric methods, yet its curve still lags behind SATE throughout the exploration process, confirming that hand-crafted semantics and discrete topological cost are insufficient for achieving near-optimal performance. In contrast, the coverage curve of SATE maintains a relatively high growth rate throughout the entire exploration cycle, achieving near-complete coverage (>98%) with the shortest accumulated path length. This result verifies that the proposed topological sparsity regularization guides the robot to prioritize critical structural nodes, reducing redundant movements regardless of the starting position. The RRT-Exploration algorithm converges most slowly among all methods, and the tail of its coverage curve is exceptionally long. This inefficiency stems from the inherent randomness of the sampling mechanism, which struggles to consistently find feasible paths through narrow bottlenecks, thereby dragging down the overall exploration process.

4.2.2. Qualitative Trajectory Analysis

To investigate the behavior behind these quantitative differences, Figure 6 presents the exploration trajectories generated by the various methods in Scenario (c). The trajectories of the baseline methods universally suffer from obvious path redundancy and disordered motion, which are particularly prominent in the central region of the map.

RRT-Exploration frequently oscillated back and forth near narrow passages and was unable to traverse the critical passages between obstacles efficiently. Utility-Greedy (MI) fell into a severe local optimum, forming dense clusters of redundant trajectories in the bottom-left corner of the map before gradually expanding to surrounding areas. The global planning efficiency of the Frontier-Based method (NF) is extremely low: its path exhibits an obvious “zig-zag” pattern, shuttling back and forth across the entire width of the map, which explains the high average path length (423.1 m) of this method. The trajectory of SNF (magenta dash–dot line) exhibits a more structured pattern than the purely geometric baselines, successfully identifying and traversing several door-like passages. However, it still suffers from local oscillation near narrow entrances and occasionally bypasses the optimal medial axis due to its discrete topological cost and coarse semantic modeling, leading to noticeable detours. This behavior stands in stark contrast to SATE, which, guided by the continuous distance field and sparsity regularization, threads through bottlenecks decisively and with minimal detour. The trajectories generated by the SATE framework are smoother and more coherent, presenting a structured, clearly goal-oriented exploration mode. SATE can precisely identify and traverse critical bottleneck regions that the baseline methods find difficult to cope with, such as the narrow pathway between two negative obstacles and the connection gap between the eastern pit and the corridor. To guarantee effective sensor coverage, SATE inevitably passes along the edges of negative obstacles, but the entire exploration process is always guided by clear topological goals, which minimizes backtracking, efficiently covers the narrow intersections, and avoids the massive path redundancy observed in MI, NF, and SNF.

4.2.3. Generalization and Stability Across Diverse Scenarios

To further verify the applicability of the SATE framework, additional exploration experiments were conducted in Map A (Open Field) and Map B (Campus Road). These environments provide distinct structural characteristics: Map A represents a large-scale, feature-sparse open terrain, while Map B features semi-structured corridors with paved roads and vegetation belts. In these trials, the robot was deployed from three strategically different starting positions: the center, the bottom-left, and the bottom-right of the maps. This multi-start strategy aims to evaluate whether the algorithm can consistently maintain efficient exploration logic regardless of the initial configuration and environmental topography. The visual results in Figure 7 demonstrate that SATE exhibits high stability across varied scenarios. In Map A, despite the lack of clear geometric constraints, the SSV mechanism effectively guides the robot to maintain uniform coverage, preventing redundant wandering in open spaces. In Map B, the robot successfully identifies the paved road as a high-confidence traversable corridor, anchoring its path to the medial axis of the safe manifold. The consistent performance across different starting points confirms that the integration of semantic seed points and topological sparsity provides a robust driving force. This ensures that the exploration process remains goal-oriented and structure-aware, without being contingent on specific environmental layouts or favorable initializations.

4.3. Parameter Sensitivity and Robustness Analysis

To justify the hyperparameter selection and evaluate the robustness of the SATE framework, we conducted systematic sensitivity analyses in the most challenging environment (Map C). Our analysis focuses on two decisive parameters: the topology regularization weight $λ_{t}$ and the Gaussian kernel width $σ$.

Impact of Topology Weight $λ_{t}$: As illustrated in Figure 8a, we varied $λ_{t}$ from $0.1$ to $0.9$ while fixing other weights ($λ_{e} = 0.3, λ_{c} = 0.2$). When $λ_{t}$ is low ($0.1$), the system lacks sufficient structural constraints and degenerates toward a greedy frontier-based strategy, resulting in a significantly longer path (avg. 495.5 m) due to severe backtracking. Conversely, an excessively high $λ_{t}$ ($0.9$) imposes over-strict sparsity, causing the robot to “hesitate” at narrow transitions and take long detours. The optimal performance is achieved at $λ_{t} \in [0.3, 0.7]$, with $λ_{t} = 0.5$ yielding the shortest path and minimum variance, validating our balanced design.

Impact of Kernel Width $σ$: Figure 8b evaluates the effect of the Gaussian kernel width $σ \in \{1.0, 3.0, 5.0\}$ m. A small width (1.0 m) fails to suppress redundant nodes effectively, leading to a cluttered graph that hinders planning efficiency. A larger width (5.0 m) tends to over-simplify the topology, potentially merging distinct narrow passages and causing connectivity issues. The selected value, $σ = 3.0$ m, aligns with the geometric scale of unstructured corridors, achieving the best balance between structural abstraction and path optimality.
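The effect of the kernel width can be made concrete by evaluating the Gaussian kernel of Equation (3) for the three tested values of $σ$. The 2 m spacing between a candidate and an existing node is a hypothetical value chosen for illustration.

```python
import math

def gaussian_kernel(distance, sigma):
    """Repulsive contribution of one existing node at the given distance."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2))

spacing = 2.0  # hypothetical distance (m) between a candidate and an existing node
penalties = {
    sigma: round(gaussian_kernel(spacing, sigma), 3)
    for sigma in (1.0, 3.0, 5.0)
}
# penalties == {1.0: 0.135, 3.0: 0.801, 5.0: 0.923}
```

With $σ = 1.0$ m, a node only 2 m away contributes a penalty of about 0.14, so closely spaced nodes are barely discouraged. With $σ = 5.0$ m the penalty is about 0.92 and decays slowly with distance, so nodes in neighboring passages also repel each other, which is consistent with the over-simplification described above.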

4.4. Downstream Task Validation: Global Navigation

To further substantiate the practical value of the constructed topological graph $G_{t}$, we evaluate its performance in global point-to-point navigation tasks, an essential downstream capability beyond exploration.

Experimental Setup: After completing the exploration of Map C (the most challenging unstructured environment), we randomly selected 20 start-goal pairs distributed across all major regions of the environment. For each pair, we performed A* search directly on the topological graph $G_{t}$ to generate a feasible path. This tests whether the graph, built solely during exploration, retains sufficient connectivity and structural fidelity for post-exploration navigation.

Results and Analysis: All 20 queries successfully yielded a valid path, confirming that the topological skeleton captures every critical passage, intersection, and corridor required for global reachability. Due to the sparse nature of $G_{t}$, path computation completes in under 5 ms on an embedded platform (Jetson AGX Orin), demonstrating its readiness for real-time onboard applications. Figure 9 illustrates three representative navigation queries targeting distinct map regions. The generated paths (green lines) confirm that the topological skeleton maintains global reachability, allowing the robot to execute rapid return-to-home or relocation missions to any explored location.

Implication: These results directly refute the concern that $G_{t}$ is merely an ephemeral by-product of exploration. Instead, they show that our structure-aware exploration strategy yields a topologically complete and navigation-ready map with minimal traversal cost, fulfilling the dual promise of efficient exploration and high-quality map construction.

4.5. Component Analysis and Ablation Studies

4.5.1. Evaluation of Topological Skeleton Generation (SSV)

Since the quality of a topological map is closely tied to node placement, we benchmark the proposed SSV mechanism against three standard candidate generation strategies. These strategies represent three typical paradigms: discretization-based, peak-based, and density-based methods, as detailed below:

Uniform Grid (Baseline for Discretization): A straightforward approach that divides the local map into a fixed-resolution grid (2 m × 2 m) and selects the center of safe grid cells. It serves as a benchmark for coverage uniformity but lacks the ability to adapt to structural variations.
NMS (Baseline for Local Maxima): Represents peak extraction methods. It identifies pixels with the highest traversability scores within a local sliding window ( $5 \times 5$ ). This setup tests whether simple local filtering alone is sufficient for generating valid topological structures.
DBSCAN Clustering (Baseline for Regional Density): A density-based method chosen for its capability to handle regions of arbitrary shapes without requiring a pre-defined number of clusters. It groups pixels with high traversability probability and uses cluster centroids as nodes to represent regional connectivity.
Ours (SSV): The proposed structure-aware method. It applies the Euclidean Distance Transform (EDT) to the semantic safe region, anchoring nodes strictly to the medial axis (GVD) to ensure geometric centrality.
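The idea behind the SSV entry can be sketched on a toy grid: compute the clearance of every safe cell with a Euclidean distance transform, then keep the cells whose clearance is a local maximum. This is a simplified stand-in for a true GVD extraction; the brute-force transform, the 8-neighbor maximum test, and the corridor layout are assumptions made for illustration.

```python
import math

def distance_transform(safe):
    """Brute-force Euclidean distance from each safe cell to the nearest unsafe cell.

    Assumes the grid contains at least one unsafe cell.
    """
    rows, cols = len(safe), len(safe[0])
    unsafe = [(r, c) for r in range(rows) for c in range(cols) if not safe[r][c]]
    dist = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if safe[r][c]:
                dist[r][c] = min(math.dist((r, c), u) for u in unsafe)
    return dist

def ridge_cells(dist):
    """Safe cells whose clearance is not exceeded by any of their 8 neighbors."""
    rows, cols = len(dist), len(dist[0])
    ridge = []
    for r in range(rows):
        for c in range(cols):
            d = dist[r][c]
            if d <= 0.0:
                continue  # unsafe cells cannot host nodes
            window = [
                dist[nr][nc]
                for nr in range(max(r - 1, 0), min(r + 2, rows))
                for nc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            if d >= max(window):
                ridge.append((r, c))
    return ridge

# A horizontal corridor three cells wide, bounded above and below by unsafe rows
SAFE = [
    [False] * 7,
    [True] * 7,
    [True] * 7,
    [True] * 7,
    [False] * 7,
]
ridge = ridge_cells(distance_transform(SAFE))
```

Every ridge cell lies on the middle row of the corridor, which is the placement that maximizes the safety margin to both sides; a uniform grid or an intensity peak detector has no such guarantee.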

Figure 10 visualizes the performance differences across three representative scenarios, highlighting the superior structural awareness of the proposed SSV mechanism. In wide-open environments (Map A), discretization and peak extraction methods (Uniform Grid, NMS) produce an excessive number of redundant nodes, while DBSCAN tends to shift toward the boundaries of high-probability regions. In contrast, SSV maintains optimal sparsity by distributing a minimal set of nodes across key quadrants of the open area. In hazardous narrow passages (Map B) sandwiched between “negative obstacles,” baseline methods fail to ensure navigation safety. Uniform Grid often misses the precise center of the passage due to resolution limitations, while NMS and DBSCAN rely solely on pixel intensity for node placement. As a result, if the traversability heatmap contains noise or uneven gradients near the edges of the pits, these methods place nodes off-center, increasing the risk of collisions. On the other hand, SSV leverages the medial-axis constraint to anchor a single, precise node exactly at the geometric center of the chokepoint, maximizing the safety margin. In structured scenarios featuring critical shortcuts (Map C), baseline methods often lose focus. NMS and DBSCAN are prone to being drawn to high-probability noise in the corners of the map, overlooking the spatially small but topologically critical gap. In contrast, SSV identifies the topological ridge formed by this gap in the distance field, allowing the robot to “break through” the boundary and discover the shortcut that other methods miss.

4.5.2. Ablation Studies on Perception and Topology

To further verify the contributions of the semantic perception module and the topological sparsity regularization term, we conducted controlled ablation experiments, specifically focusing on the robot’s ability to break through critical topological bottlenecks (e.g., narrow passages formed by negative obstacles and environmental boundaries) to access the next exploration zone. As shown in Figure 11, both ablated variants failed to traverse this critical chokepoint efficiently, yet the underlying reasons for their failure were fundamentally different. In the variant utilizing a standard binary occupancy map, the robot fell into a mode of conservative avoidance. Lacking the “high-confidence” semantic signal to offset the high traversal cost caused by geometric obstacle inflation, the planner treated the bottleneck as a low-priority frontier, leading the robot to hesitate and wander in the open starting zone. This confirms that the traversability heatmap acts as a critical semantic affordance to encourage the robot to break through structural bottlenecks. Conversely, in the variant without the topological sparsity term, the robot exhibited severe exploration stagnation. Although the heatmap correctly identified the safe passage, the greedy utility function prioritized local map refinement over global expansion, causing the robot to linger in the starting region and add excessive redundant nodes.

This demonstrates that $S_{topo}$ serves as a necessary global driving force to prevent local optima traps. By integrating both modules, the full SATE framework effectively balances navigation safety and exploration drive, as illustrated in Figure 11c. The semantic heatmap reduces the perceived cost of the narrow corridor, thereby overcoming geometric conservatism, while the topological sparsity regularization actively suppresses local redundancy to overcome exploration stagnation. The synergy of these two mechanisms results in a decisive and efficient trajectory that swiftly traverses the bottleneck, validating the necessity of the proposed dual-optimization strategy.
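
The stagnation effect in the variant without the sparsity term can be sketched with a toy decision rule. The additive form of the utility below is an assumption made for illustration (the paper's exact function is defined in Section 3.4.1 and is not reproduced here), and the candidate names and scores are hypothetical. Only the three weights are taken from Table 3.

```python
# Decision weights from Table 3.
LAMBDA_E, LAMBDA_C, LAMBDA_T = 0.3, 0.2, 0.5

# Hypothetical candidates: (information gain, travel cost, sparsity score), all in [0, 1].
CANDIDATES = {
    "refine_near_start": (0.50, 0.10, 0.10),  # cheap, but close to existing nodes
    "cross_bottleneck": (0.45, 0.60, 0.90),   # costly, but opens a new region
}


def utility(gain, cost, s_topo, use_topology=True):
    """Assumed additive utility; the sparsity term can be switched off for ablation."""
    score = LAMBDA_E * gain - LAMBDA_C * cost
    if use_topology:
        score += LAMBDA_T * s_topo
    return score


def best_candidate(use_topology):
    """Name of the candidate with the highest utility."""
    return max(
        CANDIDATES,
        key=lambda name: utility(*CANDIDATES[name], use_topology=use_topology),
    )


print("without S_topo:", best_candidate(False))
print("with S_topo:   ", best_candidate(True))
```

Under these illustrative numbers the greedy rule without the sparsity term keeps choosing the cheap local candidate, while adding the term tips the decision toward the bottleneck crossing, which mirrors the qualitative behavior reported for Figure 11.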

5. Conclusions

In this study, we propose the Structure-Aware Topological Exploration (SATE) framework to address the balance between navigation safety and map quality in autonomous exploration of unstructured environments. SATE fuses the Semantic Seeded Voronoi (SSV) mechanism with a topological sparsity regularization strategy, bridging the gap between geometric feasibility and semantic safety and forming a closed-loop exploration strategy that explicitly supervises map quality. Addressing the three research questions (RQ1–RQ3), this study draws the following conclusions.

Regarding system exploration efficiency (RQ1), the experimental data confirm that SATE outperforms both the traditional geometric and the semantic baseline methods. This advantage is attributed to the synergy between the SSV mechanism and the topological sparsity strategy: the former uses semantic perception to anchor candidate nodes precisely on the central axis of the safe region, thus guaranteeing geometric optimality and path smoothness; the latter introduces the topological sparsity regularization term, which curbs the generation of redundant nodes and avoids the repetitive backtracking that greedy strategies are prone to in open areas. Quantitative results show that SATE achieves full-area coverage in complex unstructured terrain: compared with the purely geometric baselines (RRT and Frontier), it reduces exploration path length by up to approximately 45% and improves exploration efficiency ($η$) by up to 80%; compared with the semantic frontier-based baseline (SNF), it achieves 28.6% higher efficiency and 13% shorter path length. These results demonstrate that the system maximizes environmental information gain with minimal mobility cost while ensuring navigation safety.

Regarding topological mapping quality (RQ2), the study verifies the superiority of the SSV mechanism. Compared with traditional uniform grid sampling or density-based clustering methods, the SSV mechanism uses the Euclidean Distance Transform (EDT) to anchor candidate nodes precisely on the central axis of the safe region. This mechanism not only maintains topological sparsity in open areas but also generates high-quality nodes at the geometric centers of critical bottlenecks, such as narrow passages, and ultimately builds a topological graph that is both compact and consistent with the environmental skeleton.

Regarding the deep coupling between topology and semantics and the necessity of each component (RQ3), the ablation experiments show that the two modules are indispensable to each other. Specifically, semantic perception is the fundamental support, in which the traversability heatmap plays the key semantic guidance role: it provides high-confidence signals that offset the additional traversal cost caused by geometric obstacle inflation, and thus motivates the robot to break through a structural bottleneck that a geometric planner would treat as a low-priority frontier. Meanwhile, the topological constraints play a guiding role: the sparsity regularization term provides global structural tension that prevents the robot from stagnating in a local region. The integration of these two components drives the robot to move from passive perception to active structure optimization.

In terms of methodological contributions, this work establishes a novel paradigm for structure-aware exploration. Unlike mainstream geometry-driven frontier approaches, SATE demonstrates that integrating high-level semantic constraints with topology maintenance mechanisms is crucial for the stable operation of robots in complex, unstructured terrain. While the proposed framework exhibits robust performance in simulation, we explicitly discuss its limitations and potential areas for optimization regarding scalability, perception reliability, and dynamic adaptability. First, regarding computational complexity, the system currently maintains a real-time frequency of 20 Hz on embedded platforms, which is sufficient for local exploration tasks. However, we note that the global Voronoi partition and Euclidean Distance Transform calculations scale with environment size. In extremely large-scale scenarios, centralized graph maintenance could potentially impact update rates. Second, concerning environmental perception and robustness, while real-world deployment introduces challenges such as sensor noise and SLAM drift, the proposed framework exhibits intrinsic tolerance to these uncertainties. Specifically, the integration-based scoring mechanism of the SSV functions as a spatial low-pass filter, effectively mitigating pixel-level semantic segmentation errors caused by ambiguous terrain textures. Furthermore, the topological skeleton, anchored to the geometric medial axis, provides a substantial safety margin that buffers against state estimation drift compared to precise boundary-following methods. However, we acknowledge that catastrophic state estimation failures and severe perception degradation in extreme weather remain challenges for global consistency. Third, regarding dynamic adaptability, the reliance on static aerial priors limits responsiveness to transient obstacles appearing after map acquisition.
In this context, we acknowledge that learning-based paradigms, such as the cross-platform deep reinforcement learning model proposed by Cheng et al. [46], offer distinct advantages. Unlike our global planning approach, such mapless methods excel at reactive maneuvering and dynamic obstacle avoidance without global information. Consequently, future research will focus on narrowing the gap between simulation and reality to achieve robust physical deployment. Specifically, we plan to adopt hierarchical sub-map architectures or sliding-window mechanisms to decouple global topological updates from local real-time requirements, thereby resolving computational bottlenecks in large-scale environments. To further enhance reliability under extreme conditions, we aim to integrate uncertainty-aware planning modules that explicitly account for semantic perception risks. Furthermore, to address dynamic adaptability, we will explore hybrid architectures that integrate a deep reinforcement learning-based local controller [46] into the SATE framework. This would combine our global structural guidance with the robust reactive capabilities of reinforcement learning to handle transient obstacles effectively. Ultimately, all these improvements will be integrated and rigorously tested on physical robotic platforms in diverse real-world unstructured environments to verify the system’s robustness against sensor noise and dynamic changes.
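
The low-pass filtering effect attributed to the integration-based scoring in the limitations discussion can be sketched as follows. This is an illustrative stand-in, not the paper's scoring function (which is defined in Section 3.3.3): the score is assumed to be a Gaussian-weighted mean of the heatmap around a candidate, the grid is hypothetical, and the kernel width uses $σ = 3.0$ m from Table 3 under the assumption of 1 m per cell.

```python
import math

SIGMA_CELLS = 3.0  # sigma = 3.0 m from Table 3, assuming 1 m per cell


def gaussian_score(heatmap, center, sigma):
    """Gaussian-weighted mean of heatmap values around `center` given as (row, col)."""
    cr, cc = center
    weighted_sum = 0.0
    weight_total = 0.0
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            w = math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma ** 2))
            weighted_sum += w * value
            weight_total += w
    return weighted_sum / weight_total


SIZE = 11
clean = [[0.9] * SIZE for _ in range(SIZE)]  # uniformly traversable patch
noisy = [row[:] for row in clean]
center = (SIZE // 2, SIZE // 2)
noisy[center[0]][center[1]] = 0.1  # one misclassified pixel at the candidate itself

raw_drop = clean[center[0]][center[1]] - noisy[center[0]][center[1]]
score_drop = (
    gaussian_score(clean, center, SIGMA_CELLS)
    - gaussian_score(noisy, center, SIGMA_CELLS)
)
print(f"raw pixel drop: {raw_drop:.3f}, integrated score drop: {score_drop:.3f}")
```

A single flipped pixel changes the raw value at the candidate by 0.8, but the integrated score moves by only a few hundredths, because the error is averaged over the whole kernel support. Larger, spatially coherent segmentation errors would not be suppressed in the same way.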

Author Contributions

Conceptualization, M.D. and X.W.; methodology, M.D.; software, M.D.; validation, M.D. and X.W.; formal analysis, M.D.; investigation, M.D. and S.C.; resources, X.W. and S.C.; data curation, M.D.; writing—original draft preparation, M.D.; writing—review and editing, X.W. and S.C.; visualization, M.D.; supervision, X.W. and S.C.; project administration, X.W.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by the Ministry of Industry and Information Technology of China, General Program of Shanghai Natural Science Foundation (No. 24ZR1419800, No. 23ZR1419300), the National Natural Science Foundation of China (No. 42130112), Science and Technology Commission of Shanghai Municipality (No. 22DZ2229004), and Shanghai Frontiers Science Center of Molecule Intelligent Syntheses. The APC was funded by the authors.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments on the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SATE	Structure-Aware Topological Exploration
SSV	Semantic Seeded Voronoi
GVD	Generalized Voronoi Diagram
RRT	Rapidly exploring Random Tree
NBV	Next Best View
EDT	Euclidean Distance Transform
NMS	Non-Maximum Suppression
NF	Nearest Frontier
MI	Mutual Information (Utility-Greedy)
SLAM	Simultaneous Localization and Mapping
ROS	Robot Operating System
U-Net	U-shaped Convolutional Neural Network
KDE	Kernel Density Estimation

References

1. Burgard, W.; Moors, M.; Stachniss, C.; Schneider, F.E. Coordinated multi-robot exploration. IEEE Trans. Robot. 2005, 21, 376–386.
2. Krzysiak, R.; Butail, S. Information-based control of robots in search-and-rescue missions with human prior knowledge. IEEE Trans. Hum. Mach. Syst. 2021, 52, 52–63.
3. Zhai, G.; Zhang, W.; Hu, W.; Ji, Z. Coal mine rescue robots based on binocular vision: A review of the state of the art. IEEE Access 2020, 8, 130561–130575.
4. Seenu, N.; Manohar, L.; Stephen, N.M.; Ramanathan, K.C.; Ramya, M. Autonomous cost-effective robotic exploration and mapping for disaster reconnaissance. In Proceedings of the 10th International Conference on Emerging Trends in Engineering and Technology-Signal and Information Processing (ICETET-SIP-22), Nagpur, India, 29–30 April 2022; pp. 1–6.
5. Narayan, S.; Aquif, M.; Kalim, A.R.; Chagarlamudi, D.; Harshith Vignesh, M. Search and reconnaissance robot for disaster management. In Machines, Mechanism and Robotics; Springer: Singapore, 2022; pp. 187–201.
6. Zhang, J. Localization, mapping and navigation for autonomous sweeper robots. In Proceedings of the International Conference on Machine Learning and Intelligent Systems Engineering (MLISE), Guangzhou, China, 5–7 August 2022; pp. 195–200.
7. Luo, B.; Huang, Y.; Deng, F.; Li, W.; Yan, Y. Complete coverage path planning for intelligent sweeping robot. In Proceedings of the IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), Dalian, China, 14–16 April 2021; pp. 316–321.
8. Perkasa, D.A.; Santoso, J. Improved frontier exploration strategy for active mapping with mobile robot. In Proceedings of the 7th International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA), Tokoname, Japan, 8 September 2020; pp. 1–6.
9. Zagradjanin, N.; Pamucar, D.; Jovanovic, K.; Knezevic, N.; Pavkovic, B. Autonomous exploration based on multi-criteria decision-making and using D* Lite algorithm. Intell. Autom. Soft Comput. 2022, 32, 1369–1386.
10. Duan, P.; Yu, Z.; Gao, K.; Meng, L.; Han, Y.; Ye, F. Solving the multi-objective path planning problem for mobile robot using an improved NSGA-II algorithm. Swarm Evol. Comput. 2024, 87, 101576.
11. Ebadi, K.; Bernreiter, L.; Biggie, H.; Catt, G.; Chang, Y.; Chatterjee, A.; Denniston, C.; Deschamps, S.-P.; Harlow, K.; Khattak, S.; et al. Present and future of SLAM in extreme underground environments. IEEE Trans. Robot. 2023, 40, 622–643.
12. Lajoie, P.-Y.; Ramtoula, B.; Chang, Y.; Carlone, L.; Beltrame, G. DOOR-SLAM: Distributed, online, and outlier resilient SLAM for robotic teams. IEEE Robot. Autom. Lett. 2020, 5, 1658–1665.
13. Liu, J.; Lv, Y.; Yuan, Y.; Chi, W.; Chen, G.; Sun, L. A prior information heuristic based robot exploration method in indoor environment. In Proceedings of the IEEE International Conference on Real-time Computing and Robotics (RCAR), Xining, China, 15–19 July 2021; pp. 129–134.
14. Lyu, Z.; Yin, Y.; Liu, Q.; Yang, T. Autonomous exploration algorithm for mobile robots in unknown confined environment. In Proceedings of the 16th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), Hangzhou, China, 24–25 August 2024; pp. 188–191.
15. Yamauchi, B. A frontier-based approach for autonomous exploration. In Proceedings of the IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), Monterey, CA, USA, 10–11 July 1997; pp. 146–151.
16. Fang, B.; Ding, J.; Wang, Z. Autonomous robotic exploration based on frontier point optimization and multistep path planning. IEEE Access 2019, 7, 46104–46113.
17. Keidar, M.; Kaminka, G.A. Robot exploration with fast frontier detection: Theory and experiments. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Richland, SC, USA, 4–8 June 2012; Volume 1, pp. 113–120.
18. Umari, H.; Mukhopadhyay, S. Autonomous robotic exploration based on multiple rapidly-exploring randomized trees. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 1466–1472.
19. Zhang, X.; Zhang, J.; Wang, L. An improved RRT path planning algorithm for mobile robots. In Artificial Intelligence and Autonomous Transportation; Jia, L., Ou, D., Liu, H., Zong, F., Wang, P., Zhang, M., Eds.; Lecture Notes in Electrical Engineering; Springer: Singapore, 2025.
20. Zhong, P.; Chen, B.; Lu, S.; Meng, X.; Liang, Y. Information-driven fast marching autonomous exploration with aerial robots. IEEE Robot. Autom. Lett. 2021, 7, 810–817.
21. Bircher, A.; Kamel, M.; Alexis, K.; Burri, M.; Siegwart, R. Receding horizon “next-best-view” planner for 3D exploration. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 1462–1468.
22. Julian, B.J.; Karaman, S.; Rus, D. On mutual information-based control of range sensing robots for mapping applications. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Tokyo, Japan, 3–7 November 2013; pp. 5156–5163.
23. Bai, S.; Chen, F.; Englot, B. Toward autonomous mapping and exploration for mobile robots through deep supervised learning. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 2379–2384.
24. Cao, C.; Zhu, H.; Choset, H.; Zhang, J. TARE: A hierarchical framework for efficiently exploring complex 3D environments. In Proceedings of the Robotics: Science and Systems (RSS), Virtual, 12–16 July 2021.
25. Zhou, B.; Zhang, Y.; Chen, X.; Shen, S. FUEL: Fast UAV exploration using incremental frontier structure and hierarchical planning. IEEE Robot. Autom. Lett. 2021, 6, 779–786.
26. González-Baños, H.H.; Latombe, J.C. Navigation strategies for exploring indoor environments. Int. J. Robot. Res. 2002, 21, 829–848.
27. Faigl, J.; Kulich, M.; Přeučil, L. Goal assignment using distance cost in multi-robot exploration. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vilamoura, Portugal, 7–12 October 2012; pp. 3741–3746.
28. Mei, Y.; Lu, Y.H.; Hu, Y.C.; Lee, C.S.G. Energy-efficient mobile robot exploration. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Orlando, FL, USA, 15–19 May 2006; pp. 505–511.
29. Vasquez-Gomez, J.I.; Sucar, L.E.; Murrieta-Cid, R. Volumetric next-best-view planning for 3D object reconstruction with positioning error. Int. J. Adv. Robot. Syst. 2014, 11, 159.
30. Selin, M.; Tiger, M.; Duberg, D.; Heintz, F.; Jensfelt, P. Efficient autonomous exploration planning of large-scale 3-D environments. IEEE Robot. Autom. Lett. 2019, 4, 1699–1706.
31. Aurenhammer, F. Voronoi diagrams—A survey of a fundamental geometric data structure. ACM Comput. Surv. 1991, 23, 345–405.
32. Dang, T.; Tranzatto, M.; Khattak, S.; Mascaro, F.; Alexis, K.; Hutter, M. Graph-based subterranean exploration path planning using aerial and legged robots. J. Field Robot. 2020, 37, 1363–1388.
33. Kulkarni, M.; Dharmadhikari, M.; Tranzatto, M.; Zimmermann, S.; Reijgwart, V.; De Petris, P.; Nguyen, H.; Khattak, S.; Hutter, M.; Alexis, K. Autonomous subterranean exploration using graph-based path planning. IEEE Robot. Autom. Lett. 2022, 7, 10454–10461.
34. Chen, D.; Xiao, N. GVD-exploration: An efficient autonomous robot exploration framework based on fast generalized Voronoi diagram extraction. IEEE Robot. Autom. Lett. 2023, 8, 5321–5328.
35. Wen, J.; Zhang, X.; Bi, Q.; Liu, H.; Yuan, J.; Fang, Y. G²VD planner: Efficient motion planning with grid-based generalized Voronoi diagrams. IEEE Trans. Autom. Sci. Eng. 2025, 22, 3743–3755.
36. Gao, Y.; Wang, Z.; Zhao, Y. A generalized Voronoi diagram based robot exploration method for mobile robots. Machines 2022, 10, 84.
37. Dong, Q.; Xi, H.; Zhang, S.; Bi, Q.; Li, T.; Wang, Z.; Zhang, X. Fast and communication-efficient multi-UAV exploration via Voronoi partition on dynamic topological graph. In Proceedings of the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Abu Dhabi, United Arab Emirates, 14–18 October 2024; pp. 14063–14070.
38. Garg, S.; Sünderhauf, N.; Dayoub, F.; Morrison, D.; Cosgun, A.; Carneiro, G.; Wu, Q.; Chin, T.; Reid, I.; Gould, S.; et al. Semantics for robotic mapping, perception and interaction: A survey. Found. Trends Robot. 2020, 8, 1–224.
39. Guan, T.; Kothandaraman, D.; Chandra, R.; Manocha, D. GANav: Group-wise attention for classifying navigable regions in unstructured outdoor environments. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 8380–8386.
40. Rosinol, A.; Violette, A.; Abate, M.; Hughes, N.; Chang, Y.; Shi, J.; Gupta, A.; Carlone, L. Kimera: From SLAM to spatial perception with 3D dynamic scene graphs. arXiv 2021, arXiv:2101.06894.
41. Hughes, N.; Chang, Y.; Carlone, L. Hydra: A real-time spatial perception system for 3D scene graph construction and optimization. In Proceedings of the Robotics: Science and Systems (RSS), New York, NY, USA, 27 June–1 July 2022.
42. Nguyen, V.H.; Pham, V.M.; Nguyen, V.T.; Truong, X.-T. S-RRT: A semantic-driven extension of the rapidly-exploring random tree algorithm. In Proceedings of the 2025 International Conference on Advanced Technologies for Communications (ATC), Hanoi, Vietnam, 16–18 October 2025; pp. 1–7.
43. Gomez, C.; Hernandez, A.C.; Barber, R. Topological frontier-based exploration and map-building using semantic information. Sensors 2019, 19, 4595.
44. Zhang, J.; Dong, H.; Yang, J.; Liu, J.; Huang, S.; Li, K.; Tang, X.; Wei, X.; You, X. Dual-BEV Nav: Dual-layer BEV-based heuristic path planning for robotic navigation in unstructured outdoor environments. In Proceedings of the 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, GA, USA, 19–23 May 2025.
45. Shah, D.; Eysenbach, B.; Kahn, G.; Rhinehart, N.; Levine, S. Rapid exploration for open-world navigation with latent goal models. arXiv 2021, arXiv:2104.05859.
46. Cheng, C.; Zhang, H.; Sun, Y.; Tao, H.; Chen, Y. A cross-platform deep reinforcement learning model for autonomous navigation without global information in different scenes. Control Eng. Pract. 2024, 150, 105991.

Figure 1. The architecture of the proposed SATE framework. The system tightly couples semantic perception with topological construction to achieve structure-aware exploration. Solid and dashed arrows indicate the primary execution pipeline and auxiliary data references, respectively.

Figure 2. Structure of the perception module. The raw aerial image is processed by a U-Net-based architecture to predict the traversability heatmap.

Figure 3. The pipeline of the SSV mechanism. (1) Semantic perception generates a traversability heatmap; (2) Distance-transform-based skeletonization extracts raw medial-axis seeds; (3) Dummy points are introduced to enforce boundary closure before Voronoi partitioning and candidate scoring. Blue solid boxes represent processing modules, teal dashed boxes represent data, and orange highlighted boxes indicate key steps.

Figure 4. Visual overview of the three test maps and their semantic analysis: (a–c) Original RGB aerial images of Map A (open area), Map B (campus road), and Map C (unstructured area); (d–f) Corresponding semantic accessibility heatmaps generated by the SATE perception module. Blue circles represent the current location. Blue areas represent low access costs (safety), while yellow/red areas represent high costs (obstacles). Note that in the heatmap of Map B (e), although the blue hue of the paved road is lighter than in the off-road training area (Map A) due to the area transformation, the model successfully generalizes it as a separate access corridor, effectively distinguishing it from the surrounding vegetation.

Figure 5. Statistical comparison of exploration efficiency over 30 trials. The curves represent the average explored region rate against path length. SATE (purple curve) achieves the fastest convergence to full coverage, outperforming RRT, Frontier, and SNF baselines significantly in terms of path efficiency.

Figure 6. Exploration trajectories generated in Map C from three different starting positions. The green circles, red triangles, and solid blue lines denote the starting positions, ending positions, and exploration trajectories, respectively. Baseline behaviors: NF exhibits chaotic global jumping; MI creates dense redundancy in the bottom-left; RRT oscillates at narrow entries; SNF shows improved structure but still oscillates near bottlenecks. SATE behavior: The proposed method effectively threads through narrow passages and covers edge cases with minimal backtracking, resulting in the shortest overall path.

Figure 7. Exploration trajectories of SATE in Map A and Map B starting from three different positions. The green circles, red triangles, and solid blue lines denote the starting positions, ending positions, and exploration trajectories, respectively. Top: Map A (Open Field); Bottom: Map B (Campus Road).

Figure 8. Parameter sensitivity analysis of the SATE system in Map C. (a) Impact of the topology regularization weight $λ_{t}$ on total path length. The U-shaped curve justifies the necessity of topological constraints to avoid greedy backtracking. (b) Impact of the Gaussian kernel width $σ$. $σ = 3.0$ m provides the optimal balance between graph sparsity and environmental fidelity. The red stars denote the default parameters used in our proposed system. Error bars represent the standard deviation over 10 independent trials.

Figure 9. Demonstration of downstream navigation tasks on the efficiently constructed map. The blue circles and green lines represent the topological nodes and edges, respectively. Even though the map was built with a minimal exploration path (278.4 m), it fully supports complex global navigation tasks with a 100% success rate. Sub-figures (a–c) show valid paths to different regions, confirming global connectivity.

Figure 10. Qualitative comparison of candidate generation strategies across three distinct scenarios. The rows correspond to Map A, Map B, and Map C from top to bottom. Legend: The solid red dot indicates the robot’s current position, and the green circles represent the sampled candidate targets generated by each method. Uniform Grid produces excessive redundancy. NMS is sensitive to local heatmap noise and lacks geometric centering. DBSCAN effectively groups regions but often drifts towards high-gradient edges or noise. Ours (SSV) strictly aligns nodes with the safe medial axis, effectively handling open fields, hazardous narrow passages, and semantic gaps.

Figure 11. Trajectory comparison of failure modes at a critical bottleneck. Green circles, red triangles, black circles, and black lines denote starts, goals, topological nodes, and edges, respectively. (a) w/o Heatmap: Robot exhibits conservative avoidance, treating the narrow gap as high-risk. (b) w/o $S_{topo}$: Robot exhibits stagnation due to the lack of global repulsive force. (c) Full Framework: SATE effectively traverses the narrow gap guided by semantic confidence and topological pressure.

Table 1. Key Landmarks and Specifications of the Test Maps.

Map	Scenario	Size (m)	Key Landmarks
A	Open Field	$40 \times 40$	None
B	Campus Road	$40 \times 40$	paved road, vegetation belts
C	Unstructured Field	$80 \times 40$	negative obstacles (pits)

Table 2. Hyperparameters and Configuration of the Perception Model Training.

Parameter	Value/Setting
Network Architecture	U-Net (4-level encoder–decoder)
Input Resolution	$128 \times 96$ pixels
Dataset	RECON Dataset [45]
Supervision Signal	Historical Trajectories (Self-supervised)
Loss Function	Binary Focal Loss [44]
Optimizer	Adam
Learning Rate	$5 \times 10^{- 4}$
Batch Size	32
Training Epochs	30 (Converged)
Inference Platform	NVIDIA Jetson AGX Xavier (NVIDIA, Santa Clara, CA, USA)

Table 3. Detailed Parameter Configuration of the SATE System.

Category	Parameter	Value
System Env.	Middleware	ROS Noetic
	Computing Platform	Jetson AGX Orin
	Avg. Loop Rate	20 Hz
Decision Weights	$λ_{e}$ (Exploration)	0.3
	$λ_{c}$ (Cost)	0.2
	$λ_{t}$ (Topology)	0.5
Algorithm	$τ_{s a f e}$	0.6
	$σ$	3.0 m
	$K_{s i z e}$	$3 \times 3$
	$R_{s e n s o r}$	20.0 m

Table 4. Quantitative Comparison of Exploration Performance (Averaged over 3 trials with distinct start positions). ↑ and ↓ indicate that higher and lower values are preferred, respectively. SATE achieves the highest exploration efficiency with the shortest path length, validating its structural superiority.

Method	Exp. Efficiency ( $η$ ) ↑	Path Length (m) ↓
NF	0.24	423.1
RRT	0.20	509.8
MI	0.20	505.2
SNF	0.28	320.4
SATE	0.36	278.4
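
The relative improvements quoted in the Abstract and Conclusions can be re-derived from the rows of Table 4. The short script below does only that arithmetic; the dictionary and function names are illustrative, and the values are copied from the table.

```python
# (exploration efficiency eta, path length in m), copied from Table 4.
TABLE_4 = {
    "NF": (0.24, 423.1),
    "RRT": (0.20, 509.8),
    "MI": (0.20, 505.2),
    "SNF": (0.28, 320.4),
    "SATE": (0.36, 278.4),
}


def relative_gains(baseline, method="SATE"):
    """Return (efficiency gain %, path-length reduction %) of `method` over `baseline`."""
    eta_b, length_b = TABLE_4[baseline]
    eta_m, length_m = TABLE_4[method]
    efficiency_gain = (eta_m / eta_b - 1.0) * 100.0
    path_reduction = (1.0 - length_m / length_b) * 100.0
    return round(efficiency_gain, 1), round(path_reduction, 1)


for name in ("NF", "RRT", "MI", "SNF"):
    gain, reduction = relative_gains(name)
    print(f"SATE vs {name}: +{gain}% efficiency, -{reduction}% path length")
```

Against RRT this gives roughly +80% efficiency and a 45.4% shorter path, and against SNF roughly +28.6% and 13.1%, which matches the headline figures. Against NF the same formulas give about +50% and 34.2%, so the 45% and 80% values correspond to the RRT row rather than to every geometric baseline.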

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
