Branch-Priority Exploration for Mobile Robots in Restricted Industrial Corridors

Yu, Wenjie; Du, Wangzhe

doi:10.3390/sym18050806

Open AccessArticle

Branch-Priority Exploration for Mobile Robots in Restricted Industrial Corridors

by

Wenjie Yu

¹

and

Wangzhe Du

^2,3,4,*

¹

College of Robotics Science and Engineering, Taiyuan University of Technology, Taiyuan 030024, China

²

College of Mechanical Engineering, Taiyuan University of Technology, Taiyuan 030024, China

³

Engineering Research Center of Advanced Metal Composites Forming Technology and Equipment of Ministry of Education, Taiyuan University of Technology, Taiyuan 030024, China

⁴

National Key Laboratory of Metal Forming Technology and Heavy Equipment, Xi’an 710018, China

^*

Author to whom correspondence should be addressed.

Symmetry 2026, 18(5), 806; https://doi.org/10.3390/sym18050806

Submission received: 13 April 2026 / Revised: 27 April 2026 / Accepted: 5 May 2026 / Published: 8 May 2026

(This article belongs to the Topic Artificial Intelligence in Intelligent Control, Robotics, and Manufacturing)

Download

Browse Figures

Versions Notes

Abstract

This paper proposes the Branch-Priority Exploration (BPE) framework for autonomous coverage in confined industrial corridor environments. BPE integrates three components: (1) a symmetry-aware LiDAR branch detector; (2) a hierarchical BFS/DFS mode-switching policy; and (3) a barrier-based branch memory. Frontier-based methods often struggle in industrial corridors where branches split off from the main corridor. The symmetric layout of such environments, featuring T-shaped junctions and L-shaped turns, creates recurring geometric patterns that conventional frontier scoring fails to exploit. When the robot reaches a junction, nearby frontier candidates often receive similar scores, causing repeated target switching as the local map changes. Meanwhile, frontier cells inside a branch tend to score lower than those along the main corridor; so, the robot often bypasses the branch and continues forward, which leads to additional backtracking later. Even when the robot eventually returns, residual frontier cells near the entrance may attract the planner repeatedly, causing redundant re-entry into already-covered branches. To address these issues, a branch-priority exploration framework is developed. A symmetry-aware branch detection module uses LiDAR range measurements from multiple directions to identify T-shaped junctions and lateral openings, applying identical geometric criteria to lateral openings on either side of the robot. This allows branch entry to be triggered by explicit geometric evidence, rather than frontier score comparisons that tend to be unreliable near intersections. When a branch is detected, the robot transitions from BFS mode to DFS mode for systematic branch coverage. Entry and post-return locks prevent mode reversal before the robot commits to the new heading. Once a branch is completed, a permanent virtual barrier is placed at its entrance; so, the planner no longer routes the robot back into that branch. The framework is formalized as a constrained coverage problem on occupancy grids, and monotonic coverage progress and finite branch completion under barrier memory are established theoretically. A fully reproducible ROS implementation on a wheeled platform with differential drive is validated. Experiments span several corridor environments of increasing topological complexity. Compared to a nearest-frontier baseline, the proposed method substantially reduces both exploration time and goal cancellations while achieving complete coverage across all trials. The cancellation count matches the number of T-branches per environment, with near-zero variance across repeated runs.

Keywords:

mobile robots; occupancy grid mapping; frontier-based exploration; branch detection; barrier memory

1. Introduction

Autonomous mobile robots are increasingly deployed for inspection and surveillance in hazardous or geometrically constrained environments, such as industrial pipelines, underground tunnels, and other confined corridor-like facilities [1,2,3,4,5,6]. These scenarios are typically GPS-denied, cluttered, and geometrically narrow, imposing strict requirements on exploration robustness and navigation reliability over long missions. Effective path planning under spatial constraints is a prerequisite for reliable long-duration missions [7].

These challenges are further amplified in naturally restricted subterranean settings. Aerial robots have been deployed in cave environments requiring full onboard autonomy, including mapping, planning, and multi-robot cooperation [8]. Legged and wheeled platforms have also demonstrated inspection capabilities in sewers, mines, and industrial facilities [9,10,11]. Heterogeneous teams combining aerial and legged robots were further used to tackle large-scale multi-branched tunnel-like environments, where the combined mobility and sensing coverage of different platforms improves mission efficiency [12,13,14]. Across all these deployments, confined corridor geometry fundamentally constrains sensor coverage and forces robots to plan exploration at the level of individual passages and branches. In this work, a branching corridor layout refers specifically to a 2D passage composed of long narrow main corridors connected by T-shaped intersections and L-shaped turns, with short lateral side branches emanating perpendicularly from the junctions.

The growing demand for regular inspection of aging infrastructure, coupled with stringent safety regulations, drives the development of robotic platforms capable of autonomous navigation in confined spaces. A recent systematic mapping study provides a comprehensive taxonomy of autonomous exploration strategies and identifies restricted corridor-like environments as a persistently underserved problem space [15]. Simultaneous localization and mapping (SLAM) is fundamental to enabling robots to build environmental maps and estimate their poses in GPS-denied environments [16,17]. In multi-robot settings, efficient loop-closure prioritization is critical to maintaining map consistency during long missions without incurring prohibitive communication overhead [18,19]. Two-dimensional LiDAR-based SLAM is widely adopted for corridor inspection, because LiDAR provides comparatively reliable range measurements largely independent of ambient lighting conditions [20,21,22]. The occupancy grid representation, which discretizes the environment into free, occupied, or unknown regions, naturally supports both path planning and exploration decision-making in such settings.

Frontier-based exploration is widely studied for autonomous coverage in unknown environments, where frontiers represent the boundaries between explored free space and unexplored regions in occupancy grid maps [23,24]. Recent work incorporates real-time map optimization and improved frontier cost functions to enhance robustness, especially in structured scenes such as corridors. Learning-based extensions replace hand-crafted frontier scoring with neural policies. Deep reinforcement learning and hierarchical planners have been applied to select exploration targets more efficiently [25,26,27,28,29]. Learning-based methods require environment-specific pretraining and cannot be deployed in new corridor layouts without retraining, whereas BPE requires no pretraining and achieves zero-shot deployment. However, frontier-driven decision making becomes unstable in branching corridor layouts, where multiple candidate targets appear around intersections and small pose changes induce frequent target switching and oscillatory behaviors [30]. The underlying instability stems from the fact that frontier geometry (e.g., centroid or best-score point) changes discontinuously with small pose or map updates near intersections; so, the selected target jumps between competing branches. This effect is exacerbated in narrow corridors, because the observable frontier set changes abruptly with limited lateral visibility, making frontier centroids and costs highly non-smooth around intersections. Existing solutions introduce smoothing and hysteresis in goal selection or incorporate loop-aware and topological representations, but frequent re-selection still occurs when frontiers appear or disappear under limited sensing and narrow corridor constraints. Compared to frontier stabilization techniques [30], BPE replaces score-based decisions with explicit geometric evidence, fundamentally eliminating score-tie oscillation rather than suppressing it.

Path planning in branching corridor layouts, commonly found in industrial environments, presents unique challenges. At intersections, the relative ranking of competing frontier candidates is highly sensitive to small changes in the robot’s pose or the local map, which leads to repeated target switching. Additionally, as the robot moves along a main corridor, lateral branches may fall outside the current sensor range or become temporarily occluded, causing important side passages to be missed. A compounding issue is the limited observability of lateral entrances: once the robot passes an intersection, the branch mouth falls outside the effective raytracing range and is no longer represented as an active frontier. Purely frontier-driven policies therefore lose the incentive to return. Prior work introduces topological memories or revisit heuristics to alleviate this issue. However, robust branch-level completion marking in continuously updated maps remains non-trivial [31,32]. Unlike graph-based topological exploration [31,32] that requires explicit node extraction and edge maintenance, BPE operates directly on occupancy grids without constructing a topological representation. These issues motivate (i) explicit intersection and branch recognition beyond frontier cues and (ii) persistent branch-level memory to suppress redundant revisits.

To address these limitations, this paper proposes a branch-priority exploration framework that targets two persistent failure modes of conventional frontier-based methods—unstable goal selection at branching junctions and redundant re-entry into completed branches. The framework is implemented in ROS on a standard differential-drive robot and tested in simulated corridor environments with increasing topological complexity. Table 1 compares BPE against representative existing approaches. BPE provides geometric branch detection, systematic branch coverage, and permanent memory within a unified framework deployable without prior training, distinguishing it from topological, learning-based, and stabilization-only methods.

The main contributions of the BPE framework are as follows.

A symmetry-aware branch detection module that uses multi-directional LiDAR range measurements to identify T-shaped junctions and lateral entrances, applying identical thresholds to both lateral directions so that branch entry is determined by direct geometric evidence rather than frontier score comparisons near intersections.
A hierarchical BFS/DFS mode-switching policy that separates main-corridor progression from branch exploration, with entry and post-return locks to prevent the robot from switching back too early.
A barrier-based branch memory that places a permanent virtual barrier at each confirmed branch entrance, preventing the planner from routing the robot back into branches that have already been explored.

2. Problem Formulation

With robot motion modeled by differential-drive kinematics and the environment represented as an occupancy grid, the exploration task is posed as a constrained coverage problem. A barrier constraint prevents revisits to confirmed branch entrances. Two assumptions are introduced to keep the formulation well-posed.

2.1. Notation and Symbol Summary

Table 2 lists the main symbols and core operators used throughout the paper. Scalars such as

v_{max}

represent bounds or thresholds, vectors such as

x

denote system states, and sets such as

Γ

and

Λ

refer to map regions or frontier clusters.

2.2. Robot Motion and Perception Model

A planar differential-drive kinematic model is adopted. The system state is defined as

x = {[x, y, ψ]}^{⊤},

(1)

where

(x, y)

is the robot position in the global map frame, and

ψ

is the yaw heading. The control input is

u = {[v, ω]}^{⊤},

(2)

where v and

ω

denote the commanded linear and angular velocities, respectively. A standard unicycle-form kinematic model is adopted

\dot{x} = v cos ψ, \dot{y} = v sin ψ, \dot{ψ} = ω .

(3)

A 2D LiDAR sensor provides planar range measurements. Let

Z

denote the set of these measurements. The observation model is

Z = G (Ω, x),

(4)

where

G (\cdot)

maps the environment and robot state to expected measurements. The occupancy grid map

Ω

discretizes the workspace into cells with states

Ω : R^{2} \to {0, 1, - 1},

(5)

where 0 indicates free space, 1 indicates obstacles, and

- 1

denotes unknown space.

2.3. Coverage and Barrier Constraint

Given the occupancy grid

Ω

, the free space available for robot traversal is

Γ = {p \in R^{2} ∣ Ω (p) = 0},

(6)

where

Ω (p) = 0

indicates a free cell. The exploration frontier

Λ

separates known free space from unexplored regions

Λ = {p \in Γ ∣ \exists q, Ω (q) = - 1, ∥ p - q ∥ < ε},

(7)

where

ε > 0

is a small threshold. The frontier set

Λ

is then partitioned into connected components,

{B_{i}}_{i = 1}^{K} = Ξ (Λ),

(8)

where

Ξ (\cdot)

denotes the 8-neighborhood connected-component operator, and K is the number of resulting clusters. Each

B_{i}

is a frontier cluster. Its priority is defined by

w_{i} = \frac{Σ (B_{i})}{D (B_{i}, x_{0})},

(9)

where

Σ (\cdot)

measures the expected information gain of the cluster, and

D (\cdot)

is the navigation effort from the initial state

x_{0}

.

Barrier-based branch memory. The set of completed branches at time t is denoted by

K_{explored} (t) \subseteq {1, \dots, K}

. For each

j \in K_{explored} (t)

, a virtual barrier region

H_{j} \subset Γ

is placed near the corresponding branch entrance and excluded from subsequent planning. This barrier prevents re-entry into completed branches.

The exploration objective is to find a feasible trajectory

x (t)

,

t \in [0, T]

, such that

\begin{matrix} Coverage : & ⋃_{t = 0}^{T} Φ (x (t)) \supseteq ⋃_{i = 1}^{K} B_{i}, \end{matrix}

(10)

\begin{matrix} Collision-free : & x (t) \in Γ, \forall t \in [0, T], \end{matrix}

(11)

\begin{matrix} Barrier constraint : & x (t) \notin H_{j}, \forall j \in K_{explored} (t), \forall t \in [0, T], \end{matrix}

(12)

\begin{matrix} Control limits : & 0 \leq v (t) \leq v_{max}, | ω (t) | \leq ω_{max}, \forall t \in [0, T] . \end{matrix}

(13)

Two assumptions underlie the formulation.

Assumption 1.

The robot moves on approximately planar ground, and its motion is modeled by differential-drive kinematics in (3).

Assumption 2.

The environment changes slowly enough over the exploration horizon that the occupancy grid Ω remains reliable for frontier extraction and planning.

The first assumption is reasonable for corridor-like facilities where floors are designed for wheeled access. If strong slopes or discontinuous terrain are present, the feasibility set

Γ

derived from a 2D grid may overestimate traversability; conservative speed bounds (13) and costmap inflation reduce this risk in practice. The second assumption holds in typical industrial corridors where fixed structures dominate and dynamic objects are sparse relative to the planning horizon; transient moving obstacles are handled by local costmap replanning in real time, without affecting the global barrier memory.

3. Methodology

This section presents the Branch-Priority Exploration (BPE) framework as a decision policy defined on the occupancy grid. The method is specified through frontier extraction and clustering, geometric corridor-feature detection, hierarchical BFS/DFS mode switching, and a barrier-based branch memory. A formal convergence analysis is also provided to characterize the theoretical properties of the framework. The framework addresses corridor exploration through three integrated components: geometric branch detection, hierarchical mode switching, and barrier-based memory.

3.1. Decision Policy on Occupancy Grids

Given the occupancy grid

Ω

and free space

Γ

defined in Section 2, the exploration process maintains two evolving sets: the explored set

E (t) \subseteq Γ

and the active frontier set

Λ (t) \subseteq Γ

. At each decision step t, the robot selects a navigation target

g (t) \in Γ

and executes a motion controller to produce a feasible trajectory segment. The explored set grows monotonically as the robot moves, updated at each step by the sensor field-of-view operator

Φ (\cdot)

E (t + 1) = E (t) \cup Φ (x (t)) .

(14)

The frontier set is extracted from the boundary between explored and unknown cells

Λ (t) = \{p \in Γ ∣ \exists q, Ω (q) = - 1, ∥ p - q ∥ < ε, p \in E (t)\} .

(15)

Frontier cells are clustered into connected components

{B_{i} (t)}_{i = 1}^{K (t)}

using an 8-neighborhood operator

Ξ (\cdot)

, where

K (t)

is the time-varying cluster count

{B_{i} (t)}_{i = 1}^{K (t)} = Ξ (Λ (t)) .

(16)

A free cell p qualifies as a frontier cell if at least one of its eight neighbours is unknown (8-neighborhood criterion). Figure 1 illustrates the 8-neighborhood structure: cell p is classified as a frontier cell, because at least one of its eight surrounding cells belongs to the unknown region, while it itself lies in the explored free space.

For each cluster

B_{i} (t)

, a direction-weighted utility score is computed as

U_{i} (t) = \frac{Σ (B_{i} (t))}{D (B_{i} (t), x (t)) + ϵ_{0}} + λ cos (∠ (c_{i} (t) - p (t), v_{main})),

(17)

where

Σ (\cdot)

is an information proxy (cluster size),

D (\cdot)

is a path-cost estimate,

c_{i} (t)

is the cluster centroid,

v_{main}

is the estimated corridor direction,

λ

is an alignment weight, and

ϵ_{0} > 0

prevents numerical singularity. This directional term favors frontiers along the main corridor axis; so, lateral branch frontiers are downweighted during BFS automatically.

3.2. Branch-Priority Policy with Memory

A branch-priority policy integrates geometric branch detection, frontier candidate generation, barrier-based branch memory, and a hierarchical BFS/DFS goal selector. At each decision step t, the policy outputs a goal

g (t) \in Γ

feasible for the ROS navigation stack and consistent with the barrier constraint in Section 2. Figure 2 illustrates the three operating modes on a T-shaped corridor.

Geometric branch detection. Let

Z (t) = {(d_{ℓ}, ϑ_{ℓ})}_{ℓ = 1}^{N}

be the planar LiDAR scan at time t. Eight probing directions relative to the robot heading are defined as

\begin{matrix} Θ = {left, right, forward, backward, \\ forward_left, forward_right, backward_left, backward_right} . \end{matrix}

(18)

For each

θ \in Θ

, the maximum free range is obtained by ray casting

Y (θ; t) = max {d ∣ cells in [0, d] free along θ} .

(19)

Direction-specific passability thresholds

Ψ (θ)

are defined to account for the directional anisotropy of corridor geometry: lateral directions (left, right) use

Ψ_{side} = 1.5

m, forward and backward directions use

Ψ_{fwd} = 1.8

m and

Ψ_{back} = 3.2

m, and diagonal directions use

Ψ_{diag} = 2.0

m. The term “symmetry-aware” refers specifically to the identical thresholds applied to the left and right lateral directions (

Ψ_{side} = 1.5

m for both), ensuring that branches on either side of the robot are detected under equal geometric criteria. The directional asymmetry between forward and backward thresholds is designed to account for the geometric anisotropy of corridor traversal, reflecting the different observability conditions in the direction of travel versus the direction already covered. A direction

θ

is considered passable if

Y (θ; t) \geq Ψ (θ)

. The geometric classifier outputs

Δ (Z (t)) = \{\begin{matrix} BRANCH & lateral or diagonal direction is passable, \\ DEAD_END & all directions are blocked, \\ CONTINUE & otherwise . \end{matrix}

(20)

L-shaped turns are not classified explicitly. When the forward passage closes, the robot stays in BFS mode, and the direction-weighted frontier selector steers it toward the open lateral direction. To prevent spurious repeated detections at the same junction,

Δ

is subject to two throttles. First, a temporal cooldown of 3.0 s is enforced. Second, a spatial displacement of at least 15 cells from the previous detection position is required. After completing a branch and returning to the main corridor, a 30-cell post-return lock suppresses re-detection until the robot has traveled sufficiently far from the junction.

Frontier candidates and barrier filtering. Frontier cluster centroids form the initial candidate set

G (t) = {centroid (B_{i} (t))}_{i = 1}^{K (t)} .

(21)

The policy maintains the completed-branch index set

K_{explored} (t)

. For each

j \in K_{explored} (t)

, a virtual barrier region

H_{j} \subset Γ

is activated near the branch entrance. Each

H_{j}

is a square region of half-side

r_{b} = 5

cells centered at offset

δ_{b} = 20

cells behind the detection point along

v_{main}

. Candidates inside barriers are rejected:

G_{valid} (t) = G (t) ∖ ⋃_{j \in K_{explored} (t)} H_{j} .

(22)

If

G_{valid} (t) = \emptyset

but

η_{explore} < η_{stop}

, a constrained recovery mode is activated: barrier filtering is temporarily relaxed only to the extent of searching for residual frontier cells within a narrow boundary strip of the existing barriers. The barriers themselves are never deactivated or crossed; the recovery merely allows the planner to reach unexplored cells that were occluded at the barrier edge. Once such cells are covered or confirmed unreachable, normal barrier-filtered operation resumes.

Hierarchical BFS/DFS goal selection. The goal selector operates in one of three modes: BFS (main-corridor progression), DFS (branch exhaustion), or RETURN (recovery to main corridor). Figure 3 illustrates the transitions among these modes.

In BFS mode, the goal is chosen from

G_{valid} (t)

by maximising the direction-weighted utility score defined in Section 3.1. The strong alignment weight

λ

effectively suppresses lateral branch frontiers, keeping the robot moving along the main axis until a branch is explicitly detected. Upon

Δ (Z (t)) = BRANCH

, a permanent barrier

H_{j}

is placed at offset

δ_{b}

behind the detection point, a green return waypoint

p_{green}

is placed at

δ_{b}

ahead, and the mode switches to DFS. A branch-entry lock of

d_{entry} = 30

cells prevents re-evaluation of the junction during the entry manoeuvre.

In DFS mode, the candidate frontier set is restricted to a neighbourhood

N (b^{*})

around the branch entrance

b^{*}

. Frontiers whose direction from

b^{*}

aligns with the main corridor axis are rejected by an angular criterion with a tolerance of 30°. The remaining candidates are scored by

score (g) = w_{d} \cdot D {(g, x (t))}^{- 1} + w_{s} \cdot Σ (g) + w_{c} \cdot cos (∠ (g - p (t), v_{move})),

(23)

where

v_{move}

is the recent movement direction, and

w_{d}, w_{s}, w_{c}

represent the balance proximity, information, and directional continuity. The DFS goal is

g (t) = arg max_{g \in N (b^{*})} score (g) .

(24)

DFS completion is triggered when no valid frontier exists in

N (b^{*})

for

N_{thresh}

consecutive cycles, or when move_base returns aborted for

N_{thresh}

consecutive goals. In either case, the branch interior is considered unreachable. Branch j is then added to

K_{explored} (t)

, and

H_{j}

is activated.

In RETURN mode, the goal is fixed to

p_{green}

. Once the robot reaches

p_{green}

(within

ϵ_{reach}

), the saved main-corridor direction is restored, the 30-cell post-return lock is engaged, and the mode switches back to BFS.

3.3. Convergence Analysis

The framework satisfies two basic convergence properties.

Theorem 1 (Monotonic Coverage Progress).

Under the explored-set update

E (t + 1) = E (t) \cup Φ (x (t))

, the unexplored potential

V (t) = | Γ ∖ E (t) |

satisfies

V (t + 1) \leq V (t)

, with a strict decrease whenever

Φ (x (t))

contains previously unexplored free cells.

Proof.

By definition,

E (t + 1) = E (t) \cup Φ (x (t))

. Since

E (t) \subseteq E (t + 1)

, it follows that

V (t + 1) = | Γ ∖ E (t + 1) | = | Γ ∖ (E (t) \cup Φ (x (t))) | \leq | Γ ∖ E (t) | = V (t) .

Strict decrease holds whenever

Φ (x (t)) ⊈ E (t)

, i.e., when the sensor footprint covers at least one previously unexplored cell. □

Theorem 1 means that the explored set never shrinks, but it says nothing about whether the robot revisits a completed branch. Theorem 2 shows that the barrier memory blocks re-entry after branch completion.

Theorem 2 (Finite Branch Completion under Barrier Memory).

Assume Γ contains finitely many free cells, and each navigation goal terminates in finite time. Then, every detected branch entrance reaches a completed status after finitely many decision steps. Moreover, once

j \in K_{explored} (t)

, the barrier constraint

x (t) \notin H_{j}

prevents redundant re-entry into the completed branch for all subsequent time.

Proof.

See Appendix A. □

Together, Theorems 1 and 2 jointly guarantee that the exploration process makes monotonic progress and that every detected branch is completed exactly once. Coverage never regresses, each branch is visited and closed in finite time, and the barrier memory structurally eliminates redundant revisits, ensuring the mission terminates with full coverage under the barrier-constrained objective.

4. Implementation

This section presents the ROS implementation of the branch-priority exploration framework on a standard differential-drive wheeled robot platform. The system is realized as an online decision node that interfaces with SLAM and the ROS navigation stack, maintains barrier-based branch memory, and dispatches goals to move_base until the termination criterion is satisfied. Figure 4 provides an overview of the complete system pipeline, from raw sensor inputs through the BPE decision node to motor commands.

4.1. ROS Deployment and Configuration

The implementation uses a 2D LiDAR and gmapping as the SLAM back-end to produce an occupancy grid map

M \in {- 1, 0, 1}^{H \times W}

. Navigation is executed by move_base with a global planner and a local planner. Frames follow the standard ROS convention: the global costmap is maintained in map, the local rolling costmap is maintained in odom, and the robot base frame is base_footprint. The decision node runs at

f_{cycle} = 2

Hz, subscribes to the occupancy grid and LiDAR scans, and publishes goal poses to move_base.

The runtime variables include the explored mask

E \in {0, 1}^{H \times W}

, a trajectory buffer

T

for corridor direction estimation, and a barrier memory list

B

. Each barrier record stores an entrance center, a half-side length, a status flag, the corridor direction at entry, and a return reference point. For each completed branch, a forbidden region

H_{j}

is placed near its entrance. Goal selection then rejects any candidate inside the aggregate barrier set:

A = ⋃_{r \in K_{explored}} H_{r}

.

All algorithmic parameters are listed in Table 3. Detection thresholds

(Ψ_{side}, Ψ_{diag}, Ψ_{back}, Ψ_{fwd})

reflect the nominal corridor width (1.0 m) and the effective LiDAR range

R_{s}

; smaller values cause missed detections, and larger values produce false positives in wider sections. In practice,

Ψ_{side}

should be calibrated to the nominal corridor width, which is typically known in advance in industrial deployment scenarios where facility layouts are available. For environments with varying corridor widths, one potential approach could be to estimate the local corridor width online from lateral LiDAR readings and adjust

Ψ_{side}

dynamically. Such an adaptive scheme may reduce missed and false detections in non-uniform corridor geometries, and is identified as a direction for future work. The barrier placement offset

δ_{b} = 20

cells keeps the barrier center behind the detection point, and the half-side

r_{b} = 5

cells provides a margin for localisation noise. Together,

δ_{b}

and

r_{b}

are sized so that

H_{j}

covers the entrance without encroaching on the main corridor. Lock distances

d_{entry}

,

d_{lock}

, and

d_{turn}

are set to multiples of the robot footprint so that heading recovery completes before the next detection event. The termination threshold

η_{stop} = 0.99

tolerates a small residual from cells unreachable due to costmap inflation.

4.2. Algorithm Description

Algorithm 1 presents the complete online execution pipeline. The main loop runs at

f_{cycle} = 2

Hz and terminates when

η_{explore} \geq η_{stop}

or no reachable frontier remains, at which point the robot returns to

p_{0}

and outputs a completion flag. Two auxiliary procedures implement the core logic of the decision node.

BarrierUpd manages all state transitions and barrier construction. It takes the current mode m, barrier list

B

, detection result s, corridor direction

θ_{corridor}

, and robot position

p_{r}

as inputs and returns the updated mode and barrier list. On a Branch detection in BFS mode, it places a permanent barrier

H_{j}

at offset

δ_{b}

behind the detection point, records a green return waypoint

p_{green}

at

δ_{b}

ahead, and switches to DFS mode. In DFS mode, it monitors frontier availability and move_base feedback; once

N_{thresh}

consecutive cycles yield no valid frontier or

N_{thresh}

goals are aborted, the branch is marked completed and the mode switches to RETURN. In RETURN mode, it waits for arrival at

p_{green}

within

ϵ_{reach}

, then restores the corridor direction, engages the post-return lock, and reverts to BFS.

Algorithm 1: Online execution pipeline for branch-priority exploration with barrier memory

GoalSel translates the current mode and barrier state into a single navigation goal. It takes the same inputs as BarrierUpd, together with the frontier cluster set

{C_{i}}

and the full map M, and returns a goal pose g. Frontier centroids inside the aggregate barrier set

A

are first discarded. In BFS mode, the highest direction-weighted scored candidate outside

A

is returned. In DFS mode, candidates are restricted to the branch neighbourhood

N (b^{*})

with a 30° angular filter, and the highest-scored goal in the restricted set is returned. In RETURN mode,

p_{green}

is returned directly. If no valid candidate exists but coverage is incomplete, a constrained recovery search is activated: barrier filtering is temporarily relaxed, and frontier cells are searched across the full reachable map, allowing the planner to reach unexplored cells occluded behind barrier regions. The barriers are re-activated immediately after the recovery frontiers are exhausted or confirmed unreachable; so, no permanent re-entry into completed branches occurs. Across all 72 trials, the recovery search was never triggered, demonstrating that the primary barrier-filtered frontier selection was sufficient to achieve complete coverage in all cases.

The system operates as a closed loop: sensor data flow through SLAM into the mode manager, which dispatches navigation goals to the planner; runtime feedback from the navigation stack closes the loop.

5. Test in ROS Environment

5.1. ROS Environment Setup

All experiments are conducted on a dedicated physical machine with ROS and Gazebo. SLAM is provided by gmapping and navigation by move_base with the DWA local planner. The decision node runs at 2 Hz with velocity bounds

v_{max} = 0.10

m/s and

ω_{max} = 0.30

rad/s. All algorithmic parameters follow Table 3 in Section 4. Figure 5 shows the simulation environment used in all experiments.

To assess the simulation reliability, four points are noted. First, the LiDAR sensor is modelled with Gaussian noise (mean

= 0

, stddev

= 0.01

m) using Gazebo’s built-in noise model, consistent with the measurement accuracy of the LDS-01 sensor (±15 mm). Second, since the proposed method relies exclusively on 2D LiDAR range measurements, it is inherently unaffected by ambient lighting variations. Third, the AMCL particle filter continuously corrects pose estimates using LiDAR–map matching in the structured corridor geometry, which provides abundant distinctive features for reliable localisation. The barrier placement margin (

δ_{b} = 20

cells,

r_{b} = 5

cells) provides additional tolerance for residual localisation uncertainty. Fourth, Theorems 1 and 2 establish monotonic coverage progress and finite branch completion without relying on simulation-specific assumptions; so, their guarantees transfer directly to physical deployment.

The linear velocity

v_{max} = 0.10

m/s is chosen so that the robot travels approximately 5 cm between consecutive decision cycles, providing sufficient spatial resolution for branch detection in corridors of 1.0 m width. Higher speeds may be used in physical deployment after retuning the local planner parameters. Regarding the computational cost, the decision cycle comprises frontier extraction

O (N)

, DBSCAN clustering

O (N log N)

, and goal selection

O (K)

, where N is the number of map cells, and K is the number of frontier clusters. The 2 Hz decision frequency provides a 500 ms budget per cycle. Timing measurements collected across all 72 trials yield a mean decision time of 573.7 ms and a maximum of 851.2 ms (budget utilisation 114.7%). Although the decision cycle time occasionally exceeds this budget in larger environments, this has a negligible impact on exploration performance, since move_base executes navigation goals asynchronously, and the robot continues moving toward the current goal during any decision overrun.

Three corridor environments of increasing topological complexity are evaluated, as summarised in Table 4. Each layout models a section of an industrial pipeline gallery at 0.05 m/cell resolution. T-shaped intersections represent secondary pipe branches or maintenance access passages, and L-shaped turns arise at structural corners. Free-space areas range from 60.2 m² to 123.2 m², and branch counts range from 2 to 5. The reachable free-space area roughly doubles from Env-1 to Env-3. Figure 6 and Figure 7 show a qualitative comparison of all four methods on a representative Env-2 trial.

The trajectory lengths in Table 4 come from a representative BP (1.5 m) trial. The path length scales approximately linearly with the free-space area (78.19 m for 60.2 m²; 168.40 m for 123.2 m²). Once a branch is confirmed explored, it is not revisited; so, the total travel is governed by the corridor topology rather than by the number of decision cycles.

Each trial starts from a fixed pose

p_{0}

at one end of the main corridor with no prior map. The mission terminates when (i)

η_{explore} \geq 0.99

, (ii) the wall-clock time exceeds 1800 s, or (iii) no reachable frontier exists for five consecutive cycles. All four methods are run six times per environment under identical initial conditions. All methods achieve 100% final coverage across all trials; so,

η_{final}

is not used for comparison. Each trial starts from a pose perturbed by

\pm 0.1

m uniformly in x and y to introduce the initial-condition variation; this produces non-zero standard deviation in

T_{\exp}

, while

N_{switch}

for BP (1.5 m) remains deterministic, because the branch detector fires based on geometric thresholds insensitive to small pose shifts. Evaluation uses three metrics:

$T_{\exp}$ : Total mission time (s).
$N_{switch}$ : Total goal cancellations dispatched to move_base by the exploration node, counting every call to cancel_all_goals() regardless of trigger type. For the proposed method, a cancellation is issued only when a branch is detected; so, $N_{switch}$ equals the number of branch-detection events across the mission.
$S_{success}$ : Fraction of trials reaching $η_{explore} \geq 0.99$ before timeout (%).

5.2. Baselines and Main Results

The proposed method is tested in two configurations: BP (1.5 m) with

Ψ_{side} = 1.5

m, exceeding the nominal corridor width of 1.0 m, and BP (0.8 m) with

Ψ_{side} = 0.8

m, a tighter threshold. Both are compared against a nearest-frontier (NF) baseline with identical frontier extraction, DBSCAN clustering, and decision frequency. The baseline scoring function is

U_{i}^{NF} (t) = \frac{Σ (B_{i} (t))}{D (B_{i} (t), x (t)) + ϵ_{0}},

(25)

with

λ = 0

, no mode switching, and no barrier memory. At each 2 Hz decision cycle, if the top-ranked frontier changes, the node calls cancel_all_goals() and dispatches a new goal; so, every rank inversion produces one counted cancellation. All four configurations use identical SLAM, costmap, and move_base settings. The NF baseline omits hysteresis and goal-hold mechanisms and serves as a lower-bound reference. An improved baseline (NF+GoalHold) is additionally evaluated, which augments the nearest-frontier selector with a goal-hold mechanism: once a navigation goal is dispatched, it is maintained for at least

T_{hold} = 5

s before switching to a new target. This suppresses oscillation without introducing geometric branch detection or barrier memory, providing a fairer comparison that isolates the contribution of the BPE components. Learning-based exploration methods require environment-specific pretraining incompatible with the zero-shot deployment objective of this work; graph-based topological methods target large-scale 3D multi-robot settings that differ fundamentally from the 2D single-robot corridor setting studied here.

Table 5 reports the mean and standard deviation across six trials. All four configurations achieve

S_{success} = 100 %

. BP (1.5 m) reduces

T_{\exp}

by 40.9%, 39.7%, and 33.3% on Env-1, Env-2, and Env-3, relative to the NF baseline. BP (0.8 m) reduces

T_{\exp}

by 34.7%, 32.0%, and 29.2%, with a slightly higher time than BP (1.5 m), because the tighter threshold occasionally triggers extra DFS-and-return cycles. NF+GoalHold reduces

T_{\exp}

by 15.7%, 8.2%, and 3.5%, demonstrating that goal-hold alone provides limited improvement compared to the full BPE framework. The reduction is consistent across all three environments despite the two-fold increase in free-space area and branch count from Env-1 to Env-3.

BP (1.5 m) issues an average

N_{switch}

of 2.3, 3.3, and 5.7 in Env-1, Env-2, and Env-3, closely matching the T-branch count in each environment (Table 4), with very low variance across all trials. This pattern is consistent with approximately one branch-triggered cancellation per T-branch under the 1.5 m threshold, with no missed detections observed. BP (0.8 m) produces average counts of 2.7, 3.8, and 6.8 with small non-zero variance. The extra cancellations likely arise at passages whose lateral clearance falls between 0.8 m and 1.5 m; each adds one DFS-and-return cycle and a small time overhead. Nevertheless, both proposed variants remain substantially below the NF baseline, which issues 138.2, 231.7, and 270.5 cancellations on average, substantially below NF+GoalHold (72.3, 156.3, and 163.2 on average)—reductions of 59-fold, 70-fold, and 48-fold, respectively, relative to BP (1.5 m).

Figure 8a,b visualise the raw trial data across all four configurations. In Figure 8a, the boxplots show that both proposed-method distributions lie entirely below the NF baseline across all three environments with no overlap, demonstrating consistent efficiency gains independent of random variation. In Figure 8b, the logarithmic scale is used to display all four methods on the same axis, clearly conveying the consistent gap between the proposed variants and the baseline across all environments and configurations.

5.3. Discussion and Qualitative Analysis

Additional scenarios. Figure 9 presents the trajectory results on two additional test scenarios. In the oblique-branch environment, the branch detector successfully triggers DFS entry at a non-perpendicular junction, and the robot achieves complete coverage, demonstrating that the BPE framework generalises beyond perpendicular T-shaped junctions to corridor layouts with non-orthogonal branching geometry. It is noted that, at non-perpendicular junctions, the detector may trigger twice due to diagonal ray directions overlapping with the lateral threshold, producing an additional DFS-and-return cycle; final coverage is unaffected, but eliminating this redundant detection is identified as a direction for future refinement. In the static-obstacle environment, the robot navigates around fixed square obstacles placed in the main corridor and completes full coverage, confirming robustness under Assumption 2.

Critical analysis. The BPE framework is designed specifically for 2D wheeled robots operating in structured corridor environments with known nominal widths and predominantly static obstacles. Its applicability is limited to settings where the corridor geometry provides reliable LiDAR range measurements for branch detection; highly cluttered or geometrically irregular environments may require adaptive threshold schemes. Regarding comparison fairness, learning-based and topological baselines are excluded from quantitative evaluation because they require environment-specific pretraining or explicit map construction incompatible with the zero-shot deployment setting of this work. The NF+GoalHold baseline isolates the contribution of BPE’s geometric detection and barrier memory by providing oscillation suppression without these components. The performance improvements stem from three distinct mechanisms: (1) the geometric classifier

Δ (Z)

replaces score comparison with explicit topological commitment, eliminating score-tie indecision; (2) the entry and post-return locks prevent premature mode reversal; and (3) barrier memory

{H_{j}}

makes redundant re-entry structurally impossible. The NF+GoalHold results confirm that oscillation suppression alone is insufficient—geometric detection and barrier memory are the decisive contributors.

Root cause of baseline oscillation. At each T-junction, the main-corridor frontier and the branch-entrance frontier have near-equal utility scores; so, any incremental map update or minor localisation shift can invert the ranking. The NF node then calls cancel_all_goals() and dispatches a new goal toward the newly top-ranked frontier. With the decision loop running at 2 Hz, this can continue for several cycles, while the two frontiers remain competitive, leading to repeated cancellations and extra delay. Without barrier memory, the NF selector also cannot distinguish a previously completed branch from an active one; so, the robot re-enters already-covered passages when the remaining frontier cells there are still ranked above the main-corridor option.

N_{switch}

increases with environment complexity (138.2, 231.7, and 270.5 for Env-1 to Env-3 for the NF baseline; 72.3, 156.3, and 163.2 for NF+GoalHold). More T-branches and a larger spatial extent create more opportunities for score-tie oscillation near junctions.

Root cause of baseline time overhead. The longer exploration time of the NF baseline stems from a systematic scoring bias against branch frontiers. Because branch clusters are shorter than main-corridor clusters, their information-gain proxy

Σ (B_{i})

is smaller, while their path cost

D (B_{i}, x)

is higher owing to the detour from the main corridor. Consequently, the NF selector consistently ranks main-corridor frontiers above branch frontiers and defers branch entry until the main corridor is exhausted. By that point the robot must travel back from the far end of the corridor to reach each branch entrance, incurring substantial redundant traversal. Without barrier memory, residual frontier cells near branch entrances can re-attract the planner after a branch has been physically covered, adding further unnecessary re-entry cycles. These two effects together account for the 33–41% longer

T_{\exp}

of the NF baseline observed across all three environments.

Effect of detection threshold. The two proposed variants differ only in

Ψ_{side}

. The 1.5 m threshold exceeds the nominal corridor width of 1.0 m and yields exactly one detection per T-branch, producing low-variance

N_{switch}

results that correspond one-to-one with the number of T-branches per environment. At 0.8 m, the threshold falls below the lateral clearance of certain junction geometries; so, the classifier is triggered more than once at passages physically wider than 0.8 m but narrower than 1.5 m (i.e., between the 0.8 m and 1.5 m thresholds). Each extra detection adds one DFS-and-return cycle and one cancellation, which explains the fractional

N_{switch}

means and non-zero variance for BP (0.8 m). Barrier memory prevents passage revisiting under both threshold settings, and both variants reach

S_{success} = 100 %

. In practice,

Ψ_{side}

should be calibrated to the corridor geometry: a well-chosen threshold produces one-to-one branch detection with low variance, while a tighter threshold increases sensitivity at the cost of occasional redundant DFS cycles.

Reasons for faster and more stable exploration. Three mechanisms collectively eliminate both failure modes. The geometric classifier

Δ (Z (t))

replaces score comparison with an explicit topological commitment, removing score-tie indecision at the source. The branch-entry lock (

d_{entry} = 30

cells) and post-return lock (

d_{lock} = 30

cells) prevent the main loop from overriding directional corrections before the new heading is established, while the turn-protection lock (

d_{turn} = 50

cells) stabilises the corridor direction after L-shaped turns. Barrier memory

{H_{j}}

makes redundant re-entry structurally impossible. The result is zero collisions, zero stuck events, and zero recovery behaviours across all trials for both proposed variants. As visualised in Figure 7a–f, both proposed variants achieve substantially more uniform visit density across the corridor layout compared to the baseline, indicating efficient spatial coverage without repeated concentration at contested junctions.

Limitations. Local planner stalling in corners narrower than 0.7 m (after costmap inflation) is the primary failure mode; the fallback frontier search recovers the mission in all cases. Early dead-end detection occasionally leaves a narrow unexplored strip behind an activated barrier, causing a marginal gap below

η_{final} = 1.0

. Both are handled gracefully but motivate future refinements: adaptive threshold tuning and post-barrier frontier checking. The fixed detection thresholds

(Ψ_{side}, Ψ_{diag})

are calibrated to the nominal corridor width and may cause missed or false detections when the corridor width varies significantly; an adaptive threshold scheme based on real-time corridor width estimation is identified as future work. The barrier memory is permanent once activated: if a branch entrance is temporarily blocked by a dynamic obstacle, the robot may prematurely mark the branch as unreachable after

N_{thresh} = 3

consecutive navigation failures; a fully adaptive barrier memory supporting re-evaluation after obstacle removal would better handle highly dynamic environments. The frontier extraction step scales linearly with map size

O (N)

, which may become a bottleneck in very large environments; sparse frontier representations or incremental update strategies could reduce this overhead. All experiments are conducted in simulation; physical validation on the JetHexa hexapod platform is planned as immediate future work.

6. Conclusions and Future Work

This paper has proposed the Branch-Priority Exploration (BPE) framework for autonomous coverage in confined industrial corridor layouts, targeting two common failure modes of conventional frontier-based methods: repeated goal switching at junctions and redundant re-entry into explored branches. The method has combined a LiDAR-based branch detector

Δ (Z (t))

, a hierarchical BFS/DFS mode-switching policy, and a barrier-based branch memory

{H_{j}}

to enforce branch completion before returning to the main corridor. Section 3 established monotonic coverage progress and finite branch completion under the barrier constraint. In ROS simulation on a standard differential-drive platform, the method achieved

η_{explore} \geq 0.99

in all trials, cut the exploration time by 33–41%, and reduced

N_{switch}

substantially against the nearest-frontier baseline.

Three aspects of the framework warrant further attention. Barrier sizing is currently fixed; an adaptive scheme will better handle corridors of varying width. A post-barrier verification step will also address small residual regions near dead ends that inflation occasionally leaves uncovered. Deployment on the JetHexa hexapod platform will raise additional challenges, as body sway, foot–ground interaction, and a larger footprint will require retuning of the localisation and navigation stack. Finally, a multi-robot extension with shared barrier memory and cooperative map fusion will be investigated to improve the efficiency in large-scale industrial corridor layouts.

Author Contributions

Conceptualization, W.Y. and W.D.; Methodology, W.Y.; Software, W.Y.; Validation, W.Y.; Formal analysis, W.Y.; Investigation, W.Y.; Resources, W.D.; Data curation, W.Y.; Writing—original draft, W.Y.; Writing—review & editing, W.D.; Visualization, W.Y.; Supervision, W.D.; Project administration, W.D.; Funding acquisition, W.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Shanxi Province Basic Research Program Youth Science Research Project (Grant No. 202303021212054), the National Key Laboratory of Metal Forming Technology and Heavy Equipment Open Fund (Grant No. S2308100.W17), and the Hai’an Taiyuan University of Technology Advanced Manufacturing and Intelligent Equipment Industry Research Institute Open R&D Project (Grant No. 2024HATYUTKFYF002).

Data Availability Statement

Data are available upon reasonable request to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Proof of Theorem 2

Let branch j be detected at time

t_{0}

, with barrier region

H_{j}

placed at position

p_{t_{0}} - δ_{b} {\hat{v}}_{main}

. Define the frontier set restricted to the branch neighbourhood as

F (t) = Λ (t) \cap N (b^{*}), t \geq t_{0},

and define the branch completion time as

t_{j}^{*} = inf \{t \geq t_{0} : F (t^{'}) = \emptyset \forall t^{'} \in [t - N_{thresh} + 1, t]\} .

Case 1: $F (t)$ is eventually exhausted. By Theorem 1,

V (t) \in {0, 1, \dots, | Γ |}

is non-increasing and bounded below by 0. Each successful goal execution inside

N (b^{*})

strictly decreases

V (t)

V (t + 1) \leq V (t) - 1,

whenever

g (t) \in N (b^{*})

and

Φ (x (t)) ⊈ E (t)

. Since

| Γ | < \infty

, the number of such strict-decrease events is bounded by

| Γ |

; so,

F (t) = \emptyset

is reached after at most

| Γ |

successful goal executions. As each goal terminates in finite time by assumption,

t_{j}^{*} < \infty

.

Case 2: Goal execution repeatedly fails. If move_base returns aborted for

N_{thresh}

consecutive goals, the branch is marked completed. This mechanism explicitly handles local minima and deadlock scenarios: even if the robot cannot reach any frontier due to navigation failures, the branch is guaranteed to terminate after at most

N_{thresh} = 3

consecutive aborts, bounding the worst-case DFS duration to

N_{thresh} \cdot T_{max} = 3 \times 60 = 180

s. Since each goal attempt terminates in finite time,

N_{thresh}

failures accumulate after at most

N_{thresh} \cdot T_{max}

time units, where

T_{max}

is the per-goal timeout. Hence,

t_{j}^{*} < \infty

in this case as well.

Barrier enforcement after completion. In both cases, branch j is appended to

K_{explored}

and

H_{j}

is activated at time

t_{j}^{*}

. For all

t > t_{j}^{*}

, the valid candidate set satisfies

G_{valid} (t) = G (t) ∖ ⋃_{j^{'} \in K_{explored} (t)} H_{j^{'}} \subseteq Γ ∖ H_{j} .

Since

H_{j} \subset Γ

is a fixed spatial region independent of future map updates, every subsequent navigation goal satisfies

g (t) \notin H_{j}

. The local planner tracks

g (t)

without crossing the forbidden region encoded in the global costmap; so, the robot trajectory satisfies

x (t) \notin H_{j}, \forall t > t_{j}^{*},

enforcing the barrier constraint (12) for branch j permanently.

References

Chen, C.; Wang, Z.; Gong, Z.; Cai, P.; Zhang, C.; Li, Y. Autonomous Navigation and Obstacle Avoidance for Small VTOL UAV in Unknown Environments. Symmetry 2022, 14, 2608. [Google Scholar] [CrossRef]
Azpúrua, H.; Saboia, M.; Freitas, G.M.; Clark, L.; Agha-mohammadi, A.a.; Pessin, G.; Campos, M.F.; Macharet, D.G. A Survey on the autonomous exploration of confined subterranean spaces: Perspectives from real-word and industrial robotic deployments. Robot. Auton. Syst. 2023, 160, 104304. [Google Scholar] [CrossRef]
Kozhubaev, Y.; Novak, D.; Ershov, R.; Xu, W.; Cheng, H. Research on Navigation and Dynamic Symmetrical Path Planning Methods for Automated Rescue Robots in Coal Mines. Symmetry 2025, 17, 875. [Google Scholar] [CrossRef]
Chen, P.; Huang, B.; Yuan, S.; Jin, Y.; Hou, J. Automated mobile robot-based detection of in-process tunnel lining deformation during construction. Autom. Constr. 2025, 180, 106560. [Google Scholar] [CrossRef]
Shafaei, S.M.; Mousazadeh, H. Development of a mobile robot for safe mechanical evacuation of hazardous bulk materials in industrial confined spaces. J. Field Robot. 2021, 39, 218–231. [Google Scholar] [CrossRef]
Xiao, X.; Xu, Z.; Wang, Z.; Song, Y.; Warnell, G.; Stone, P.; Zhang, T.; Ravi, S.; Wang, G.; Karnan, H.; et al. Autonomous Ground Navigation in Highly Constrained Spaces: Lessons Learned From the Benchmark Autonomous Robot Navigation Challenge at ICRA 2022. IEEE Robot. Autom. Mag. 2022, 29, 148–156. [Google Scholar] [CrossRef]
Wang, D.; Liu, Q.; Yang, J.; Huang, D. Research on Path Planning for Intelligent Mobile Robots Based on Improved A* Algorithm. Symmetry 2024, 16, 1311. [Google Scholar] [CrossRef]
Petracek, P.; Kratky, V.; Petrlik, M.; Baca, T.; Kratochvil, R.; Saska, M. Large-Scale Exploration of Cave Environments by Unmanned Aerial Vehicles. IEEE Robot. Autom. Lett. 2021, 6, 7596–7603. [Google Scholar] [CrossRef]
Kolvenbach, H.; Wisth, D.; Buchanan, R.; Valsecchi, G.; Grandia, R.; Fallon, M.; Hutter, M. Towards autonomous inspection of concrete deterioration in sewers with legged robots. J. Field Robot. 2020, 37, 1314–1327. [Google Scholar] [CrossRef]
Miller, I.D.; Cladera, F.; Cowley, A.; Shivakumar, S.S.; Lee, E.S.; Jarin-Lipschitz, L.; Bhat, A.; Rodrigues, N.; Zhou, A.; Cohen, A.; et al. Mine Tunnel Exploration Using Multiple Quadrupedal Robots. IEEE Robot. Autom. Lett. 2020, 5, 2840–2847. [Google Scholar] [CrossRef]
Azpúrua, H.; Rezende, A.; Potje, G.; Júnior, G.P.d.C.; Fernandes, R.; Miranda, V.; Filho, L.W.d.R.; Domingues, J.; Rocha, F.; de Sousa, F.L.M.; et al. Towards Semi-autonomous Robotic Inspection and Mapping in Confined Spaces with the EspeleoRobô. J. Intell. Robot. Syst. 2021, 101, 69. [Google Scholar] [CrossRef]
Dang, T.; Tranzatto, M.; Khattak, S.; Mascarich, F.; Alexis, K.; Hutter, M. Graph-based subterranean exploration path planning using aerial and legged robots. J. Field Robot. 2020, 37, 1363–1388. [Google Scholar] [CrossRef]
Kulkarni, M.; Dharmadhikari, M.; Tranzatto, M.; Zimmermann, S.; Reijgwart, V.; De Petris, P.; Nguyen, H.; Khedekar, N.; Papachristos, C.; Ott, L.; et al. Autonomous Teamed Exploration of Subterranean Environments using Legged and Aerial Robots. In 2022 International Conference on Robotics and Automation (ICRA); IEEE: Piscataway, NJ, USA, 2022; pp. 3306–3313. [Google Scholar] [CrossRef]
Ebadi, K.; Bernreiter, L.; Biggie, H.; Catt, G.; Chang, Y.; Chatterjee, A.; Denniston, C.E.; Deschênes, S.P.; Harlow, K.; Khattak, S.; et al. Present and Future of SLAM in Extreme Environments: The DARPA SubT Challenge. IEEE Trans. Robot. 2024, 40, 936–959. [Google Scholar] [CrossRef]
Brugali, D.; Muratore, L.; De Luca, A. Mobile robots exploration strategies and requirements: A systematic mapping study. Int. J. Robot. Res. 2025, 44, 1461–1506. [Google Scholar] [CrossRef]
Sousa, R.B.; Sobreira, H.M.; Moreira, A.P. A systematic literature review on long-term localization and mapping for mobile robots. J. Field Robot. 2023, 40, 1245–1322. [Google Scholar] [CrossRef]
Placed, J.A.; Strader, J.; Carrillo, H.; Atanasov, N.; Indelman, V.; Carlone, L.; Castellanos, J.A. A Survey on Active Simultaneous Localization and Mapping: State of the Art and New Frontiers. IEEE Trans. Robot. 2023, 39, 1686–1705. [Google Scholar] [CrossRef]
Denniston, C.E.; Chang, Y.; Reinke, A.; Ebadi, K.; Sukhatme, G.S.; Carlone, L.; Morrell, B.; Agha-mohammadi, A.a. Loop Closure Prioritization for Efficient and Scalable Multi-Robot SLAM. IEEE Robot. Autom. Lett. 2022, 7, 9651–9658. [Google Scholar] [CrossRef]
Tran, V.P.; Garratt, M.A.; Kasmarik, K.; Anavatti, S.G. Dynamic Frontier-Led Swarming: Multi-Robot Repeated Coverage in Dynamic Environments. IEEE/CAA J. Autom. Sin. 2023, 10, 646–661. [Google Scholar] [CrossRef]
Lee, D.; Jung, M.; Yang, W.; Kim, A. LiDAR odometry survey: Recent advancements and remaining challenges. Intell. Serv. Robot. 2024, 17, 95–118. [Google Scholar] [CrossRef]
Belkin, I.; Abramenko, A.; Yudin, D. Real-Time Lidar-based Localization of Mobile Ground Robot. Procedia Comput. Sci. 2021, 186, 440–448. [Google Scholar] [CrossRef]
Xu, W.; Cai, Y.; He, D.; Lin, J.; Zhang, F. FAST-LIO2: Fast Direct LiDAR-Inertial Odometry. IEEE Trans. Robot. 2022, 38, 2053–2073. [Google Scholar] [CrossRef]
Semenas, R.; Bausys, R. Adaptive Autonomous Robot Navigation by Neutrosophic WASPAS Extensions. Symmetry 2022, 14, 179. [Google Scholar] [CrossRef]
Dai, A.; Papatheodorou, S.; Funk, N.; Tzoumanikas, D.; Leutenegger, S. Fast Frontier-based Information-driven Autonomous Exploration with an MAV. In 2020 IEEE International Conference on Robotics and Automation (ICRA); IEEE: Piscataway, NJ, USA, 2020; pp. 9570–9576. [Google Scholar] [CrossRef]
Zheng, J.; Mao, S.; Wu, Z.; Kong, P.; Qiang, H. Improved Path Planning for Indoor Patrol Robot Based on Deep Reinforcement Learning. Symmetry 2022, 14, 132. [Google Scholar] [CrossRef]
Fan, J.; Zhang, X.; Zou, Y. Hierarchical path planner for unknown space exploration using reinforcement learning-based intelligent frontier selection. Expert Syst. Appl. 2023, 230, 120630. [Google Scholar] [CrossRef]
Teng, Y.; Feng, T.; Li, J.; Chen, S.; Tang, X. A Dual-Layer Symmetric Multi-Robot Path Planning System Based on an Improved Neural Network-DWA Algorithm. Symmetry 2025, 17, 85. [Google Scholar] [CrossRef]
Cheng, C.; Zhang, H.; Sun, Y.; Tao, H.; Chen, Y. A cross-platform deep reinforcement learning model for autonomous navigation without global information in different scenes. Control Eng. Pract. 2024, 150, 105991. [Google Scholar] [CrossRef]
Cimurs, R.; Suh, I.H.; Lee, J.H. Goal-Driven Autonomous Exploration Through Deep Reinforcement Learning. IEEE Robot. Autom. Lett. 2022, 7, 730–737. [Google Scholar] [CrossRef]
Pittol, D.; Mantelli, M.; Maffei, R.; Kolberg, M.; Prestes, E. Loop-Aware Exploration Graph: A concise representation of environments for exploration and active loop-closure. Robot. Auton. Syst. 2022, 155, 104179. [Google Scholar] [CrossRef]
Yang, F.; Lee, D.H.; Keller, J.; Scherer, S. Graph-based Topological Exploration Planning in Large-scale 3D Environments. In 2021 IEEE International Conference on Robotics and Automation (ICRA); IEEE: Piscataway, NJ, USA, 2021; pp. 12730–12736. [Google Scholar] [CrossRef]
Cui, X.; Liu, Q.; Liu, Z.; Wang, H. Frontier-Enhanced Topological Memory with Improved Exploration Awareness for Embodied Visual Navigation. In Computer Vision–ECCV 2024; Springer Nature: Cham, Switzerland, 2024; pp. 296–313. [Google Scholar] [CrossRef]

Figure 1. Illustration of the 8-neighborhood frontier criterion. Cell p (frontier cell) lies in the explored free space and has at least one unknown neighbour among its eight surrounding cells.

Figure 2. Illustration of the three operating modes on a T-shaped corridor. (a) BFS mode: no branch detected, robot advances along the main corridor. (b) DFS mode: branch detected, robot enters the lateral branch and exhausts frontier cells within

N (b^{*})

. (c) RETURN mode: branch frontier exhausted, robot navigates back to

p_{green}

to resume BFS. The sequence (a)→(b)→(c) corresponds to one complete BFS→DFS→RETURN cycle at a single T-branch.

Figure 2. Illustration of the three operating modes on a T-shaped corridor. (a) BFS mode: no branch detected, robot advances along the main corridor. (b) DFS mode: branch detected, robot enters the lateral branch and exhausts frontier cells within

N (b^{*})

. (c) RETURN mode: branch frontier exhausted, robot navigates back to

p_{green}

to resume BFS. The sequence (a)→(b)→(c) corresponds to one complete BFS→DFS→RETURN cycle at a single T-branch.

Figure 3. Mode-transition diagram of the hierarchical goal selector. BFS switches to DFS upon branch detection; DFS switches to RETURN when no frontier remains in the branch neighbourhood; RETURN switches back to BFS upon arrival at

p_{green}

. The main loop terminates when coverage reaches

η_{stop} = 0.99

. A visual illustration of each mode is provided in Figure 2.

Figure 3. Mode-transition diagram of the hierarchical goal selector. BFS switches to DFS upon branch detection; DFS switches to RETURN when no frontier remains in the branch neighbourhood; RETURN switches back to BFS upon arrival at

p_{green}

. The main loop terminates when coverage reaches

η_{stop} = 0.99

. A visual illustration of each mode is provided in Figure 2.

Figure 4. System architecture of the BPE framework. Raw 2D LiDAR scans and odometry are consumed by the SLAM back-end (gmapping) to maintain an occupancy grid. The BPE decision node operates at 2 Hz: it reads the current map and scan, invokes the branch detector

Δ (Z)

, extracts and scores frontier clusters, applies barrier filtering, and dispatches a goal pose

g (t)

to move_base. The navigation stack executes the goal via the DWA local planner and returns status feedback that closes the decision loop.

Figure 4. System architecture of the BPE framework. Raw 2D LiDAR scans and odometry are consumed by the SLAM back-end (gmapping) to maintain an occupancy grid. The BPE decision node operates at 2 Hz: it reads the current map and scan, invokes the branch detector

Δ (Z)

, extracts and scores frontier clusters, applies barrier filtering, and dispatches a goal pose

g (t)

to move_base. The navigation stack executes the goal via the DWA local planner and returns status feedback that closes the decision loop.

Figure 5. Simulation environment visualisations. (a) Full Env-2 layout overview showing the complete corridor network with three T-branches and three L-turns; (b) Gazebo 3D corridor perspective with the robot at a T-shaped junction; (c) robot’s onboard view approaching a T-junction; (d) live gmapping occupancy grid during exploration, with the onboard LiDAR point cloud shown in red.

Figure 6. Trajectory comparison on a representative Env-2 trial. (a): Proposed method under

Ψ_{side} = 1.5

m, with three barrier zones (B1–B3) and green return waypoints marked. (b): Proposed method under

Ψ_{side} = 0.8

m, with four barrier zones (B1–B4) due to earlier detection at the tighter threshold. (c): NF+GoalHold baseline, showing reduced but persistent oscillation near junctions compared to the BPE variants. (d): Frontier-only baseline, showing oscillatory motion concentrated near junctions.

Figure 6. Trajectory comparison on a representative Env-2 trial. (a): Proposed method under

Ψ_{side} = 1.5

m, with three barrier zones (B1–B3) and green return waypoints marked. (b): Proposed method under

Ψ_{side} = 0.8

m, with four barrier zones (B1–B4) due to earlier detection at the tighter threshold. (c): NF+GoalHold baseline, showing reduced but persistent oscillation near junctions compared to the BPE variants. (d): Frontier-only baseline, showing oscillatory motion concentrated near junctions.

Figure 7. Quantitative visualisations on a representative Env-2 trial. Each row corresponds to one method: BP (1.5 m), BP (0.8 m), NF+GoalHold, and Frontier-only. Columns show visit density heatmap, coverage curve

η (t)

, and cumulative cancellation curve

N_{switch} (t)

. The two BPE variants distribute revisits more evenly, whereas both baselines concentrate them near junctions. Six-trial means: 491.8 s, 555.2 s, 749.5 s, and 816.2 s (Table 5). The proposed method issues 3–6 cancellations versus 72–163 for NF+GoalHold and 138–271 for the NF baseline.

Figure 7. Quantitative visualisations on a representative Env-2 trial. Each row corresponds to one method: BP (1.5 m), BP (0.8 m), NF+GoalHold, and Frontier-only. Columns show visit density heatmap, coverage curve

η (t)

, and cumulative cancellation curve

N_{switch} (t)

. The two BPE variants distribute revisits more evenly, whereas both baselines concentrate them near junctions. Six-trial means: 491.8 s, 555.2 s, 749.5 s, and 816.2 s (Table 5). The proposed method issues 3–6 cancellations versus 72–163 for NF+GoalHold and 138–271 for the NF baseline.

Figure 8. Quantitative results across all environments and trials. (a) Exploration times across three environments (72 trials total: 4 methods × 6 runs × 3 environments). Boxplots show the distribution over six trials per method. Both proposed-method boxes lie entirely below the NF baseline with no overlap, demonstrating consistent and reliable efficiency gains independent of random variation. (b) Goal cancellations

N_{switch}

on a logarithmic scale (mean ± std, 6 trials). The logarithmic scale is necessary to display all four methods simultaneously, itself reflecting the magnitude of the gap. BP (1.5 m) issues on average 5.7 cancellations per mission with low variance; BP (0.8 m) issues on average 6.8 due to occasional over-detection at narrow passages; both are one to two orders of magnitude below the NF baseline (138–271).

Figure 8. Quantitative results across all environments and trials. (a) Exploration times across three environments (72 trials total: 4 methods × 6 runs × 3 environments). Boxplots show the distribution over six trials per method. Both proposed-method boxes lie entirely below the NF baseline with no overlap, demonstrating consistent and reliable efficiency gains independent of random variation. (b) Goal cancellations

N_{switch}

on a logarithmic scale (mean ± std, 6 trials). The logarithmic scale is necessary to display all four methods simultaneously, itself reflecting the magnitude of the gap. BP (1.5 m) issues on average 5.7 cancellations per mission with low variance; BP (0.8 m) issues on average 6.8 due to occasional over-detection at narrow passages; both are one to two orders of magnitude below the NF baseline (138–271).

Figure 9. Trajectory results on additional test scenarios. (a) Oblique-branch environment: the branch detector successfully identifies and enters a non-perpendicular branch, demonstrating generalisability beyond perpendicular T-shaped junctions. However, at non-perpendicular junctions the detector may trigger twice at the same junction due to diagonal ray directions overlapping with the lateral threshold, causing an additional DFS-and-return cycle; this does not affect final coverage but is identified as a direction for future refinement. (b) Static-obstacle environment: the robot navigates around fixed square obstacles and achieves complete corridor coverage, consistent with Assumption 2.

Table 1. Comparison of BPE with representative existing methods.

Method	Scenario	Sensor	Pretrain	Explicit Map	Branch Memory
BPE (ours)	Corridor	2D LiDAR	No	No	Yes
Topological [31]	3D tunnel	3D LiDAR	No	Yes	No
Topological [32]	Indoor	Camera	No	Yes	No
DRL-based [26]	Indoor	Camera	Yes	No	No
Frontier+hysteresis [30]	General	Any	No	No	No
NF baseline	General	Any	No	No	No

Table 2. Symbols and operators used in this paper.

Symbol	Description
$x$	Robot state (planar pose)
$u$	Control input
$v, ω$	Linear and angular velocity
$v_{max}, ω_{max}$	Maximum linear and angular velocity
$Z$	LiDAR measurements
$Ω$	Occupancy grid map
$Γ$	Free (admissible) space
$Λ$	Frontier set
$B_{i}$	Frontier cluster (i-th)
$w_{i}$	Cluster priority score
$K_{explored}$	Set of completed branches
$H_{j}$	Barrier region for branch j
$R_{s}$	Effective sensing range
$Φ (x)$	Observable cell set from state $x$
$Ξ (Λ)$	8-neighborhood frontier clustering
$Δ (Z)$	Geometric branch detector
$Σ (B)$	Information gain proxy (cluster size)
$D (B, x)$	Navigation path-cost estimate
$G (Ω, x)$	Observation model
$Υ (θ; t)$	Maximum free range along direction $θ$
$Ψ (θ)$	Passability threshold for direction $θ$
$Π (\cdot)$	Hierarchical target selection policy

Table 3. Implementation parameters used in simulation.

Category	Parameter	Value
Map	Resolution $ρ$	0.05 m/cell
Sensing	Effective range $R_{s}$	3.0 m
Control	Decision frequency $f_{cycle}$	2 Hz
Detection	$(Ψ_{side}, Ψ_{diag}, Ψ_{back}, Ψ_{fwd})$	1.5, 2.0, 3.2, 1.8 m
Detection	Temporal cooldown	3.0 s
Detection	Spatial cooldown	15 cells
Clustering	$(ϵ_{DBSCAN}, minPts, τ_{cluster})$	0.3 m, 5, 10
Barrier	Placement offset $δ_{b}$	20 cells
Barrier	Half-side $r_{b}$	5 cells
Lock	Turn-protection $d_{turn}$	50 cells
Lock	Branch-entry $d_{entry}$	30 cells
Lock	Post-return $d_{lock}$	30 cells
DFS completion	$N_{thresh}$	3
Termination	$(η_{stop}, ϵ_{reach})$	0.99, 0.3 m

Table 4. Properties of the three simulated corridor environments.

Env	T-Branches	L-Turns	Grid Size (cells)	Free Space (m²)	Traj. Length (m)
Env-1	2	2	$720 \times 800$	60.2	78.19
Env-2	3	3	$480 \times 940$	73.0	94.46
Env-3	5	4	$660 \times 1960$	123.2	168.40

Table 5. Quantitative comparison (mean ± std, 6 trials). Bold values indicate the best result per environment.

Env	Method	$T_{\exp}$ (s)	$N_{switch}$	$S_{success}$ (%)
Env-1	BP (1.5 m)	$411.0 \pm 12.9$	$2.3 \pm 0.5$	100
	BP (0.8 m)	$453.7 \pm 30.6$	$2.7 \pm 0.5$	100
	NF+GoalHold	$585.8 \pm 38.3$	$72.3 \pm 20.0$	100
	NF baseline	$695.2 \pm 46.4$	$138.2 \pm 25.1$	100
Env-2	BP (1.5 m)	$491.8 \pm 27.3$	$3.3 \pm 0.5$	100
	BP (0.8 m)	$555.2 \pm 22.6$	$3.8 \pm 0.8$	100
	NF+GoalHold	$749.5 \pm 49.9$	$156.3 \pm 23.6$	100
	NF baseline	$816.2 \pm 48.4$	$231.7 \pm 25.2$	100
Env-3	BP (1.5 m)	$921.5 \pm 35.7$	$5.7 \pm 0.8$	100
	BP (0.8 m)	$978.0 \pm 53.0$	$6.8 \pm 0.8$	100
	NF+GoalHold	$1333.8 \pm 83.2$	$163.2 \pm 35.2$	100
	NF baseline	$1382.0 \pm 87.3$	$270.5 \pm 44.0$	100

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yu, W.; Du, W. Branch-Priority Exploration for Mobile Robots in Restricted Industrial Corridors. Symmetry 2026, 18, 806. https://doi.org/10.3390/sym18050806

AMA Style

Yu W, Du W. Branch-Priority Exploration for Mobile Robots in Restricted Industrial Corridors. Symmetry. 2026; 18(5):806. https://doi.org/10.3390/sym18050806

Chicago/Turabian Style

Yu, Wenjie, and Wangzhe Du. 2026. "Branch-Priority Exploration for Mobile Robots in Restricted Industrial Corridors" Symmetry 18, no. 5: 806. https://doi.org/10.3390/sym18050806

APA Style

Yu, W., & Du, W. (2026). Branch-Priority Exploration for Mobile Robots in Restricted Industrial Corridors. Symmetry, 18(5), 806. https://doi.org/10.3390/sym18050806

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Branch-Priority Exploration for Mobile Robots in Restricted Industrial Corridors

Abstract

1. Introduction

2. Problem Formulation

2.1. Notation and Symbol Summary

2.2. Robot Motion and Perception Model

2.3. Coverage and Barrier Constraint

3. Methodology

3.1. Decision Policy on Occupancy Grids

3.2. Branch-Priority Policy with Memory

3.3. Convergence Analysis

4. Implementation

4.1. ROS Deployment and Configuration

4.2. Algorithm Description

5. Test in ROS Environment

5.1. ROS Environment Setup

5.2. Baselines and Main Results

5.3. Discussion and Qualitative Analysis

6. Conclusions and Future Work

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Appendix A. Proof of Theorem 2

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI