Topological Progress Potential-Enhanced Continuous-Space Ant Colony Algorithm for Robot Path Planning

Guikun Dong; Feixiong Zhao; Jiaxiong Zhuo; Lei Zhou; Qiaoling Liu; Xiangjun Yang

doi:10.3390/s26041264

Abstract

To address the issues of traditional grid-based Ant Colony Optimization path planning in discretized continuous space—including limited direction freedom, lack of global topological guidance, and difficulty in balancing path smoothness and safety margin—a topological progress potential-enhanced continuous-space ant colony path planning algorithm (TPP-CSACO) is proposed. TPP-CSACO discards grid-based expansion; instead, a perception circle centered on each ant is defined, movement is executed via a sector-based perception framework with probabilistic direction selection, and band-shaped decaying pheromones are deposited along the path. By coupling the global topological progress potential derived from the simplified probabilistic roadmap (PRM) with pheromones, a dual-field guidance mechanism is established to prevent local congestion. Combined with the explicit safety constraints of the signed distance field (SDF), an adaptive step size strategy that integrates elastic step size and frustration-induced temperature rise is introduced to enhance obstacle avoidance and search stability. Results from repeated experiments on multiscale constrained maps (conducted against six typical algorithms and the traditional ACO) show that compared with ACO, TPP-CSACO reduces the path length by up to 50.6% in the same environment, while achieving faster convergence and maintaining good search diversity. Although the path length increases slightly (by a maximum of 5.9%) compared with the shortest heuristic algorithms, the maximum turning angle is reduced by 75% to 93%, and a 100% success rate and zero safety violations are realized. This indicates that TPP-CSACO has achieved a relatively stable balance among safety, smoothness, and global search capability.

Keywords:

ant colony optimization algorithm; global path planning; continuous space; sector-based perception; elastic step size

1. Introduction

Path planning algorithms are a core technology in robotics [1]. They are designed to find an optimal or feasible path for a robot from the starting point to the destination within its operational environment, while avoiding obstacles and satisfying specific constraints (e.g., minimum planning time, lowest energy consumption, and shortest path) [2]. Path planning algorithms have been applied across various industries and domains, including agriculture [3], domestic services [4], military applications [5], and special equipment [6]. They have been studied extensively and are mainly classified into three categories: classical algorithms, intelligent learning algorithms, and bio-inspired algorithms.

Classical path planning algorithms were first proposed and applied to deterministic environments, yielding results that are easily observable [7]. They mainly include cell decomposition (CD) [8], Rapidly-exploring Random Tree (RRT) [9], Dijkstra’s algorithm [10], A* algorithm [11], artificial potential field (APF) [12], and probabilistic roadmap (PRM) [13], among others. PRM was proposed by Overmars et al. in the early 1990s [14]. Its core principle is to perform random sampling in the configuration space and connect collision-free configurations to construct a roadmap, converting the search in complex continuous spaces into a discrete graph search. It is particularly suitable for path planning of high-degree-of-freedom robots and possesses strong probabilistic completeness and high multi-query efficiency [15,16]. Both PRM and RRT fall into the category of sampling-based methods: the former first constructs a static probabilistic roadmap before executing queries, while the latter gradually expands the path via a random tree structure [17]. A hierarchical path planner was proposed by Fan et al.: the upper layer utilizes PRM to rapidly generate a global guiding path, while the lower layer optimizes the path in segments through the Deep Deterministic Policy Gradient, which shortens the path length and enhances smoothness while satisfying the nonholonomic constraints of vehicles [18]. An incremental sampling strategy was adopted by Xu et al. to continuously construct and maintain PRM, enabling efficient and safe exploration of unmanned aerial vehicles (UAVs) in dynamic environments, which significantly reduces the total exploration time and computational cost [19]. To address the demand for online collision-free planning of 15-degree-of-freedom gantry welding robots in ship manufacturing, Zhou et al. proposed an improved Lazy-PRM algorithm that integrates rule guidance and a repulsive field (RR-Lazy-PRM). 
This algorithm notably reduces the collision risk between the robot, workpieces, and its own joints; additionally, it enables the collaborative operation of dual manipulators, precisely meeting the dual requirements of path accuracy and operational efficiency in welding scenarios [20].

Intelligent learning algorithms are generally defined as a category of artificial intelligence (AI) [21] methods that are characterized by data-driven or interaction-based learning and that, supported by modern computational resources, automatically acquire knowledge, decision-making strategies, or mapping relationships. Representative approaches include artificial neural network (ANN) [22], fuzzy logic (FL) [23], and reinforcement learning (RL) [24], among others. A CBNNP algorithm integrating bio-inspired neural networks and an ocean current compensation mechanism was proposed by Zhu et al., which allows unmanned underwater vehicles (UUVs) to stably track the optimal path in environments with different current directions, flow velocities, and obstacles [25]. The Artificial Bee Colony (ABC) algorithm was adopted by Aliskan et al. to optimize the output membership function of the fuzzy logic controller (FLC). Based on the Integral of Absolute Error (IAE) and Integral of Time-weighted Absolute Error (ITAE) indices, the FLC-IAE and FLC-ITAE controllers were designed, and the effectiveness of the metaheuristic algorithm in parameter tuning for automated guided vehicle (AGV) steering control was verified [26]. The potential field was encoded as a neural field by Fareh et al. to realize fast path planning for the leader, significantly improving the execution speed and stability of the system [27].

Bio-inspired algorithms are inspired by the behaviors of individual organisms, colony behaviors, and specific physiological functions (e.g., foraging, social interaction, escape, etc.) [28]. They mainly include the genetic algorithm (GA) [29], particle swarm optimization (PSO) [30], Grey Wolf Optimizer (GWO) [31], and Ant Colony Optimization (ACO) [32], among others. ACO simulates the foraging behavior of ants, searching for the optimal path via the positive feedback mechanism of pheromones [33]. It exhibits strong robustness and excellent parallelism, yet has inherent drawbacks such as slow convergence and a tendency to get trapped in local optima [34,35]. A multi-strategy adaptive ant colony algorithm was proposed by Cui et al., which integrates four improvements: direction guidance, adaptive heuristic function, deterministic state transition, and non-uniform pheromone initialization. This algorithm significantly enhances convergence speed and path quality and outperforms the traditional A* and ACO algorithms in various complex environments [36]. For orchard mowers, a BL-ACO path planning and GO-SMC tracking control scheme was proposed by Liu et al. The BL-ACO optimizes the operation path by improving the pheromone update mechanism and adopting a two-layer optimization strategy, while the GO-SMC designs a control law based on the kinematic model to ensure tracking stability [37]. A goose optimization-based ant colony algorithm was proposed by Sheng et al. In traveling salesman problem (TSP) and AGV scenarios, this algorithm achieves a more optimal path length and significantly improved convergence efficiency and demonstrates stronger adaptability in dynamic environments such as tobacco workshops [38]. A multi-strategy genetic ant colony algorithm was proposed by Li et al. It can stably obtain the optimal path in multiscale grid environments and can efficiently handle obstacle avoidance and path planning in dynamic obstacle environments [39]. 
An improved Q-evaluation Ant Colony Optimization algorithm was proposed by Li et al. This algorithm significantly improves path quality and convergence speed in small- and medium-scale maps; even in large-scale environments, it can maintain a fast solution speed, effectively enhancing the stability and real-time performance of path planning [40]. In 2008, Socha and Dorigo extended the Ant Colony Optimization framework to continuous domains and proposed the Ant Colony Optimization for Continuous Domains (ACOR) algorithm, which employs a solution archive to dynamically generate the Gaussian kernel probability density function and guide ants to perform sampling searches in the continuous space [41]. Based on ACOR, Niu et al. proposed an Ant Colony Optimization with an improved state transition probability, a random-walk strategy, and an adaptive waypoints-repair method, which is integrated with the dynamic window approach to enhance the global search efficiency and dynamic obstacle avoidance capabilities for UAV 3D path planning [42]. Wang et al. proposed a reinforcement-learning-driven multi-strategy continuous Ant Colony Optimization algorithm that demonstrates superior performance in addressing the multi-UAV path planning problem in complex three-dimensional continuous environments [43].

Current research on ACO-based path planning algorithms can be broadly classified into two categories: grid-based discretization methods and continuous-domain methods. Although grid-based discretization methods feature straightforward implementation, the discretized environmental model inherently restricts the motion flexibility and safety of robots, and the generated path morphology is constrained by the grid structure with fixed step sizes and orientations. Such discretization is usually accompanied by simplifying robot dimensions and binarizing obstacles, resulting in a lack of fine-grained characterization of continuous distance fields and safety margins; thus, an optimal trade-off between path smoothness and wall-proximity risk cannot be achieved. Continuous-domain methods are free of the limitations of grid resolution but cannot accurately model the geometric constraints of obstacles. In addition, neither category has sufficient awareness of the environment’s global topological connectivity. Therefore, in multi-channel, maze-like scenarios, the ant colony algorithm is prone to becoming trapped in local optima that are geometrically adjacent but topologically infeasible.

To address the above limitations, this paper proposes a topological progress potential-enhanced continuous-space ant colony algorithm for path planning (TPP-CSACO), whose core designs are as follows:

(1) A sector-based perception and probabilistic direction selection framework in continuous space is proposed. By adopting sector-based perception and multidirectional probabilistic selection in the continuous coordinate space to replace the traditional grid neighborhood expansion mode, the directional freedom of ants and the potential for smooth paths in static environments are improved.

(2) An adaptive search strategy coupling elastic step size and frustration-driven temperature rise is proposed. By integrating adaptive step size scaling and frustration-driven temperature adjustment, the search granularity and exploration intensity are automatically adjusted according to the degree of local blockage, thereby enhancing the algorithm’s path-finding capability in complex environments.

(3) An explicit safety margin modeling and dual constraint mechanism oriented to robot size is constructed, which effectively reduces wall-hugging paths and achieves a more reasonable trade-off between path length and path safety.

(4) A planning framework based on PRM topological progress potential and nucleated pheromone corridors is proposed. By combining the topological progress potential constructed on the simplified PRM graph with nucleated pheromone deposition in continuous space, the global topological prior and local swarm experience are unified into the mobile decision-making process.

The remainder of this paper is structured as follows. Section 2 presents the scheme for constructing the environment model. Section 3 describes the design of the TPP-CSACO. Section 4 elaborates on the process of the TPP-CSACO path planning algorithm. Section 5 conducts simulation experiments and corresponding analysis. Conclusions are drawn in Section 6.

2. Environment Model Construction

In mobile robot path planning, environment modeling serves as the foundation for safe navigation and simulation. In this paper, obstacles

O_{k}

are represented by closed polygons composed of ordered vertices

\{V_{1}, V_{2}, \dots, V_{n}\}

. The set of

K

obstacles is expressed as Equation (1):

\tilde{O} = \bigcup_{k=1}^{K} O_{k}

(1)

This method can accurately depict the contours of complex obstacles, facilitating the subsequent calculation of the signed distance field (SDF). To quantify the collision state and safety margin of any point in the continuous space, the SDF is introduced: the SDF of an arbitrary point

P = (x, y)

in the space is defined as the signed distance from

P

to the nearest obstacle boundary

\partial \tilde{O}

, as given in Equation (2):

SDF(P) = \begin{cases} - d(P, \partial \tilde{O}), & P \in \operatorname{int}(\tilde{O}) \\ + d(P, \partial \tilde{O}), & \text{otherwise} \end{cases}

(2)

where

d(P, \partial \tilde{O})

denotes the Euclidean distance from the point

P

to the obstacle boundary. Figure 1a illustrates the construction of closed obstacles. As shown in Figure 1b, the interior of obstacles takes negative values, while the SDF of free space increases as the distance to the obstacles increases.

Figure 1. Construction of the operating environment. (a) Construction of obstacles. (b) Construction of the signed distance function (SDF) field.
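As a rough illustration of Equations (1) and (2), the following Python sketch evaluates the signed distance of a point to a set of polygonal obstacles. The function names and the even-odd ray-casting inside test are assumptions made for this example; the paper does not state how the inside test or the distance query is implemented.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment ab."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_in_polygon(p, poly):
    """Even-odd ray casting test (assumed inside test for this sketch)."""
    px, py = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def signed_distance(p, polygons):
    """Equation (2): negative inside any obstacle, positive otherwise."""
    d = min(
        point_segment_distance(p, poly[i], poly[(i + 1) % len(poly)])
        for poly in polygons
        for i in range(len(poly))
    )
    inside = any(point_in_polygon(p, poly) for poly in polygons)
    return -d if inside else d
```

For a single square obstacle with corners (0, 0) and (2, 2), the center of the square evaluates to a distance of -1 and a point one unit to the right of the square evaluates to +1.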

To ensure the spatial smoothness of SDF queries, bilinear interpolation is employed as the continuous query interface. For any query point, its corresponding interpolation grid cell is first determined. Let the coordinates of the four corner points of this grid cell be

Q_{11} = (x_{1}, y_{1})

,

Q_{21} = (x_{2}, y_{1})

,

Q_{12} = (x_{1}, y_{2})

, and

Q_{22} = (x_{2}, y_{2})

, with their respective SDF values denoted as

F_{11}

,

F_{21}

,

F_{12}

, and

F_{22}

. Then, the SDF value of the point

P

can be obtained via Equation (3):

SDF(P) = \frac{1}{(x_{2} - x_{1})(y_{2} - y_{1})} \begin{bmatrix} x_{2} - x & x - x_{1} \end{bmatrix} \begin{bmatrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{bmatrix} \begin{bmatrix} y_{2} - y \\ y - y_{1} \end{bmatrix}

(3)
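Expanding the matrix product in Equation (3) gives a weighted sum of the four corner values. The sketch below writes that sum out directly; the function name `bilinear_sdf` and its argument order are assumptions for illustration only.

```python
def bilinear_sdf(x, y, x1, x2, y1, y2, f11, f21, f12, f22):
    """Bilinear interpolation of Equation (3).

    f11, f21, f12, f22 are the SDF values at
    Q11 = (x1, y1), Q21 = (x2, y1), Q12 = (x1, y2), Q22 = (x2, y2).
    """
    wx1, wx2 = x2 - x, x - x1  # row vector [x2 - x, x - x1]
    wy1, wy2 = y2 - y, y - y1  # column vector [y2 - y, y - y1]
    area = (x2 - x1) * (y2 - y1)
    return (
        f11 * wx1 * wy1
        + f12 * wx1 * wy2
        + f21 * wx2 * wy1
        + f22 * wx2 * wy2
    ) / area
```

At a corner of the cell the interpolated value reduces to the stored corner value, which is a quick sanity check on the index convention.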

Based on the continuous representation of the SDF, the robot’s size can be further considered to obtain a safe and feasible region. The robot is simplified as a disk with a radius

δ

, and a hierarchical safety margin strategy is adopted. Specifically, during the sampling phase of PRM, the safety margin is set to

δ

. In the ant exploration phase, the safety margin is adjusted to 2

δ

to provide sufficient geometric space for subsequent trajectory optimization. Finally, during the trajectory optimization phase, the safety margin is restored to

δ

.

During each phase of the algorithm, the safety and feasibility of the path segments must be determined. For a continuous path segment

\overline{PQ}

, its feasibility is defined by Equation (4):

\forall t \in [0, 1], \quad SDF((1 - t) \cdot P + t \cdot Q) \geq δ

(4)

in which

(1 - t) \cdot P + t \cdot Q

represents the point corresponding to parameter

t

on the line segment. When

t

varies from 0 to 1, the entire line segment can be traversed.
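Equation (4) quantifies over every value of the parameter in [0, 1]; a practical check typically approximates it with a finite number of samples along the segment. The sketch below is one such approximation. The sampling resolution `step` and the `sdf` callable are assumptions of this example, not values or interfaces given in the paper.

```python
import math


def segment_is_feasible(p, q, sdf, delta, step=0.05):
    """Sampled approximation of Equation (4).

    sdf   : callable mapping a point (x, y) to its signed distance
    delta : required safety margin
    step  : assumed spacing between consecutive samples
    """
    length = math.hypot(q[0] - p[0], q[1] - p[1])
    n = max(1, math.ceil(length / step))
    for k in range(n + 1):
        t = k / n
        point = ((1 - t) * p[0] + t * q[0], (1 - t) * p[1] + t * q[1])
        if sdf(point) < delta:
            return False
    return True
```

Because only sampled points are tested, a thin obstacle narrower than `step` could be missed, so the spacing would need to be chosen with the smallest obstacle feature in mind.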

3. Design of the TPP-CSACO

3.1. Construction of Topology and Progress Potential

3.1.1. Construction, Simplification, and Connectivity Enhancement of PRM

The nodes generated by RRT* are concentrated near the path from the start to the target point, making it difficult to characterize the topology of regions far from this path. Consequently, alternative path information may be lost in complex multi-corridor environments. In contrast, the PRM’s sampling strategy distributes nodes throughout the free space rather than clustering them around the path. This allows the constructed topological graph to present more feasible paths and capture the complete connectivity structure of the environment. A robust and information-explicit PRM topological graph is the key to constructing the potential for progress. Therefore, systematic enhancements have been implemented for the sampling strategy, edge connection mechanism, and graph structure of the traditional PRM algorithm.

In this paper, a hybrid sampling strategy combining four types of sampling methods is adopted: uniform sampling, obstacle-boundary-biased sampling, Gaussian sampling, and bridge test sampling. Let

N_{sam}

denote the total number of sampling nodes; the number of samples for each type is allocated according to the integer proportion

λ = (λ_{U}, λ_{B}, λ_{G}, λ_{R})

, as given in Equation (5):

n_{k} = \left[ N_{sam} \cdot \frac{λ_{k}}{\sum_{i \in \{U, B, G, R\}} λ_{i}} \right], \quad k \in \{U, B, G, R\}

(5)

in which

n_{U}

,

n_{B}

,

n_{G}

, and

n_{R}

represent the number of uniform sampling nodes, boundary sampling nodes, Gaussian sampling nodes, and bridge test sampling nodes, respectively. Subsequent sampling in this paper is performed with a ratio of

2 : 2 : 1 : 1

. Meanwhile, a sampling refill mechanism is set up: if sampling point failure occurs for a particular sampling type, the insufficient number of samples will be supplemented by uniform sampling, ensuring the sampling density in the feasible region of the initial sampling.
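A minimal sketch of the allocation in Equation (5) and of the refill rule follows. It assumes that the bracket in Equation (5) rounds down and that any rounding remainder is assigned to uniform sampling; both choices, along with the function names, are assumptions of this example rather than details confirmed by the paper.

```python
def allocate_samples(n_total, ratios):
    """Split n_total samples by integer ratios, e.g. U:B:G:R = 2:2:1:1."""
    total = sum(ratios.values())
    counts = {k: (n_total * r) // total for k, r in ratios.items()}
    # Assumption: the rounding remainder is given to uniform sampling.
    counts["U"] += n_total - sum(counts.values())
    return counts


def uniform_refill(target, produced):
    """Number of extra uniform samples needed to cover failed samples."""
    return sum(
        max(0, target[k] - produced.get(k, 0)) for k in target if k != "U"
    )
```

With 600 nodes and the 2:2:1:1 ratio, the split is 200, 200, 100, and 100; if boundary and bridge sampling then return fewer valid nodes than requested, `uniform_refill` gives the number of uniform samples to add.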

After node sampling, the node set

V_{prel} = \{P_{i} \in R^{2} \mid i = 1, 2, \dots, N\}

is obtained. Edge connection is then performed: for each node

P_{u}

, its K-nearest neighbors are first queried to form candidate edges

e = (P_{u}, P_{v})

, and each candidate edge is then checked for collisions at equidistant sampling points along its length. Let

K_{seg}

denote the number of segments into which the edge is divided; the sampling points are given in Equation (6):

\begin{cases} q_{k} = P_{u} + t_{k} \cdot (P_{v} - P_{u}) \\ t_{k} = k / K_{seg}, \quad k = 0, 1, \dots, K_{seg} \end{cases}

(6)

A candidate edge

e

is deemed feasible if and only if

\forall q_{k} \in e : SDF(q_{k}) > 0

. To ensure the local connectivity of the graph, a minimum degree constraint

\deg_{min} = 5

is introduced. If the feasible neighbor count

\deg(P_{u})

of the node

P_{u}

is less than

\deg_{min}

, the search radius is increased incrementally until

\deg(P_{u}) \geq \deg_{min}

.

To address the isotropic defect of traditional KNN edge connection, an oriented expansion strategy is introduced. The

2 π

angular range is divided equally into eight sectors. For each node

P_{i}

, the candidate node with the closest projection distance is selected in each sector

s

, and the selection rule is given in Equation (7):

P^{s} = \operatorname{argmin}_{P_{j} \in V_{prel}, \, θ_{j} \in ϑ_{s}} \left( (P_{j} - P_{i}) \cdot u_{s} \right)

(7)

in which

u_{s}

denotes the unit vector in the central direction of sector

s

and

ϑ_{s}

represents the angular range of sector

s

. When the edge

\overline{P_{i} P^{s}}

is feasible, it is added to the edge set

E

to ensure that the node is connected by short edges in all principal directions.
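The selection rule of Equation (7) can be sketched as follows. The sketch assumes that sector s covers the bearing interval from s times the sector width up to (s + 1) times the sector width, with its center direction in the middle; the paper does not specify where the first sector starts. The function name `sector_neighbors` is hypothetical, and the feasibility check of the selected edges is left out.

```python
import math


def sector_neighbors(i, nodes, n_sectors=8):
    """For node i, pick per sector the node with the smallest projection
    onto the sector's central direction (Equation (7))."""
    xi, yi = nodes[i]
    width = 2.0 * math.pi / n_sectors
    best = {}  # sector index -> (projection, node index)
    for j, (xj, yj) in enumerate(nodes):
        if j == i:
            continue
        dx, dy = xj - xi, yj - yi
        theta = math.atan2(dy, dx) % (2.0 * math.pi)
        s = int(theta // width) % n_sectors
        center = (s + 0.5) * width  # assumed central direction u_s
        proj = dx * math.cos(center) + dy * math.sin(center)
        if s not in best or proj < best[s][0]:
            best[s] = (proj, j)
    return {s: j for s, (_, j) in best.items()}
```

Sectors that contain no node are simply absent from the returned dictionary, so a node near the map border ends up with fewer than eight oriented candidates.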

The above strategy can construct the topological structure of the feasible region, but the large number of nodes and edges will increase the subsequent computational cost. Therefore, on the premise of ensuring the reachability between the start point and the end point, we simplify the graph according to the following steps to strengthen the backbone structure and retain the narrow channel characteristics:

(1) Divide the workspace into grid cells

\{C_{j}\}

with an edge length of

l_{c} = 0.5

. The start point

i_{S}

and end point

i_{G}

are retained compulsorily. For each cell, only the node with the maximum SDF value is retained as the representative point

P_{j}^{*}

, while the remaining nodes in the cell are removed. Subsequently, the adjacency relationships are reconstructed on the set of representative points, and infeasible edges are filtered out.

(2) For a node

P_{u}

satisfying

\deg (P_{u}) = 2

(connected only to

P_{v}

and

P_{w}

), if the included angle of the segment formed by

P_{v}

and

P_{w}

is smaller than 25° and the edge

(P_{v}, P_{w})

passes the feasibility check, then

(P_{v}, P_{w})

is used to replace the edges

(P_{v}, P_{u})

and

(P_{u}, P_{w})

, while

P_{u}

is removed. This operation is executed iteratively until no nodes can be contracted.

(3) After each stage, depth-first search is used to compute the connected component function

\operatorname{comp} : V \to \{1, 2, \dots, K\}

, so as to check the connectivity between the start point

i_{S}

and the end point

i_{G}

. If

\operatorname{comp}(i_{S}) = \operatorname{comp}(i_{G})

, the simplified result is accepted; if disconnection occurs, the graph structure is rolled back to the previous stage step by step to ensure the start and end points remain connected at all times.

(4) If the start and end points are still disconnected after the rollback in the previous stage, the target component

V_{S}

(the component containing the start point) and all components

V_{\neg S}

that do not contain the start point are marked. A monotonically increasing search radius set

R = \{r_{1} < r_{2} < \dots < r_{T}\}

is defined. For each

r \in R

, node pairs where the distance between

P_{u} \in V_{S}

and

P_{v} \in V_{\neg S}

is less than

r

are selected to construct the candidate pair set

P (r)

.

P (r)

is traversed in ascending order of distance, the feasibility of candidate edges is verified, and the connected components are updated until the connectivity between the start and end points is restored.
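The connectivity check and rollback in step (3) can be sketched as below. The helper names `connected_components` and `accept_or_rollback`, and the representation of the graph as a node count plus an edge list, are assumptions for illustration; an iterative stack-based depth-first search is used to avoid recursion limits.

```python
def connected_components(n_nodes, edges):
    """Label each node with its component index (1-based) using DFS."""
    adj = [[] for _ in range(n_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    comp = [0] * n_nodes  # 0 means "not visited yet"
    label = 0
    for start in range(n_nodes):
        if comp[start]:
            continue
        label += 1
        comp[start] = label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if not comp[v]:
                    comp[v] = label
                    stack.append(v)
    return comp


def accept_or_rollback(n_nodes, previous_edges, simplified_edges, i_s, i_g):
    """Keep the simplified edge set only if start and goal stay connected."""
    comp = connected_components(n_nodes, simplified_edges)
    if comp[i_s] == comp[i_g]:
        return simplified_edges
    return previous_edges
```

The same component labels can also be reused in step (4) to separate the nodes of the start component from all other nodes before candidate bridging pairs are enumerated.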

This subsection presents the complete construction process of the PRM topological graph, which maintains good reachability, expressiveness, and numerical stability in complex environments. It provides a high-quality topological foundation for subsequent operations.

3.1.2. Topological Progress Potential

Based on the simplified PRM topological graph, a progress potential is constructed on it to provide smooth guidance information between the start and end points. The PRM topological graph is abstracted as an undirected graph

G = (V, E)

, where the node set is

V = \{P_{i} \in R^{2} | i = 1,2, \dots, N\}

and the edge set is

E = \{e = (u, v) | u, v \in V\}

. When constructing the progress potential, the potential of the start point

i_{S} \in V

is set as

φ(i_{S}) = 1

and that of the end point

i_{G} \in V

is set as

φ (i_{G}) = 0

. The potentials of the remaining nodes satisfy the discrete harmonic equation, so as to ensure the smoothness and monotonicity of the potential.

To balance the geometric length of edges and the characteristics of the surrounding environment, for any edge

e = (u, v)

, its geometric length

L_{e}

is taken, and the geometric length weight

W_{L} (e)

is defined as given in Equation (8):

W_{L} (e) = 1 / (e^{- 6} + L_{e})

(8)

Meanwhile, the minimum obstacle distance

c_{min}

of the edge is estimated using the SDF and bilinear interpolation, and the environmental compensation modulation factor

W_{c}(e)

is given in Equation (9):

W_{c}(e) = \begin{cases} 1 + \min\left( \max\left( \frac{c_{ref} - c_{min}(e)}{c_{ref}}, 0 \right), 1 \right), & \text{if } c_{min}(e) < c_{ref} \\ 1, & \text{otherwise} \end{cases}

(9)

in which

c_{ref}

is the clearance reference value (matching the size of the safety margin). By combining the geometric length weight and the environmental compensation modulation factor, the edge weight

W_{Lc}(e)

is given in Equation (10):

W_{Lc}(e) = W_{L}(e) \cdot W_{c}(e)

(10)
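Equations (8) to (10) translate into a few lines of Python. In this sketch the small regularizing constant in the denominator of Equation (8) is exposed as the parameter `eps`; its default value here is an assumption, as are the function names.

```python
def length_weight(length, eps=1e-6):
    """Equation (8): inverse-length weight with a small regularizer.

    eps is an assumed default; set it to match the constant in Equation (8).
    """
    return 1.0 / (eps + length)


def clearance_factor(c_min, c_ref):
    """Equation (9): factor in [1, 2] that grows as clearance shrinks."""
    if c_min < c_ref:
        return 1.0 + min(max((c_ref - c_min) / c_ref, 0.0), 1.0)
    return 1.0


def edge_weight(length, c_min, c_ref, eps=1e-6):
    """Equation (10): product of the two factors."""
    return length_weight(length, eps) * clearance_factor(c_min, c_ref)
```

For example, an edge whose minimum clearance is half of the reference clearance receives a factor of 1.5, and an edge that touches or penetrates an obstacle saturates at 2.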

Based on the edge weights, the graph Laplacian matrix is constructed: first, the weighted adjacency matrix

A_{i j}

is obtained; then, the degree matrix

D_{i i}

, which represents the sum of weights of all edges connecting a node to other nodes, is derived; subsequently, the weighted Laplacian matrix

L

is obtained. Its formulas are given in Equations (11)–(13):

A_{ij} = \begin{cases} W_{Lc}(e), & \text{if } e = (i, j) \in E \\ 0, & \text{otherwise} \end{cases}

(11)

D_{i i} = \sum_{j = 1}^{N} A_{i j}

(12)

L = D - A

(13)

Nodes are reordered into boundary nodes

B = \{i_{S}, i_{G}\}

and internal nodes

I

, and the potential function vector

ϕ

is correspondingly partitioned into

ϕ = (ϕ_{I}, ϕ_{B})

(where

ϕ_{I}

denotes the unknown potential values of internal nodes and

ϕ_{B}

denotes the known potential values of boundary nodes). The weighted Laplacian matrix can be block-partitioned, as given in Equation (14):

L = \begin{bmatrix} L_{II} & L_{IB} \\ L_{BI} & L_{BB} \end{bmatrix}

(14)

in which

L_{I I}

is the Laplacian submatrix corresponding to internal nodes and

L_{I B}

is the incidence submatrix between internal nodes and boundary nodes. Based on the discrete harmonicity constraint, the core linear equation system is given in Equation (15):

L_{I I} ϕ_{I} = - L_{I B} ϕ_{B}

(15)

Within the subgraph containing

i_{S}

and

i_{G}

,

L_{I I}

is symmetric and positive definite. Thus, the above system of equations has a unique solution, which satisfies the weighted average property and the discrete maximum-minimum principle. For any internal node

u \in I

, its potential value equals the weighted average of the potentials of its neighboring nodes, as given in Equation (16):

ϕ(u) = \frac{1}{\sum_{v \in N(u)} W_{Lc}(e_{uv})} \sum_{v \in N(u)} W_{Lc}(e_{uv}) \, ϕ(v)

(16)

in which

N (u)

denotes the neighbor set of node

u

. Based on this, the progress potential values of all nodes in the map can be obtained. This design provides a globally monotonic progress scale consistent with the topology, facilitating the formation of reliable global guidance in complex scenarios.
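The following sketch computes the potential on a small graph. Instead of solving Equation (15) directly, it repeatedly applies the weighted-average property of Equation (16) to every internal node (a Gauss-Seidel relaxation), which converges to the same harmonic solution on a connected graph. This swapped-in solver, the function name, and the tolerance values are assumptions of the example, not the paper's stated procedure.

```python
def progress_potential(n_nodes, weighted_edges, i_s, i_g,
                       tol=1e-12, max_iter=10000):
    """Harmonic potential with phi(i_s) = 1 and phi(i_g) = 0.

    weighted_edges: iterable of (u, v, w) with w the edge weight W_Lc.
    """
    nbrs = [[] for _ in range(n_nodes)]
    for u, v, w in weighted_edges:
        nbrs[u].append((v, w))
        nbrs[v].append((u, w))
    phi = [0.5] * n_nodes  # arbitrary initial guess for internal nodes
    phi[i_s], phi[i_g] = 1.0, 0.0
    for _ in range(max_iter):
        change = 0.0
        for u in range(n_nodes):
            if u in (i_s, i_g) or not nbrs[u]:
                continue
            total = sum(w for _, w in nbrs[u])
            new = sum(w * phi[v] for v, w in nbrs[u]) / total
            change = max(change, abs(new - phi[u]))
            phi[u] = new
        if change < tol:
            break
    return phi
```

On a chain of four nodes with equal weights the potential falls linearly from 1 to 0, which illustrates the monotonic progress scale. For large roadmaps a sparse direct or conjugate-gradient solve of Equation (15) would normally be preferable to this relaxation.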

Figure 2 illustrates the construction of the simplified PRM and the topological progress potential on the operating map.

Figure 2. Construction of the simplified PRM and topological progress potential. (a) Sampling on the map. (b) Node connection. (c) Node simplification. (d) Topological progress potential.

3.2. Sector Partition and Decision-Making Mechanism

3.2.1. Sector Partition of the Perception Circle and Internal Equal-Area Discretization

TPP-CSACO needs to select directions from the angular space. A dual-layer direction evaluation model based on sector partition and equal-area discretization is introduced as the geometric foundation for TPP-CSACO decision-making: the first layer involves sector partition for the ant’s moving directions; the second layer uses polar coordinate equal-area partition to balance the computation load and area-weighted averaging.

First, the number of moving directions to be refined (i.e., the number of sectors to be discretized,

N_{ϕ}

) is determined. The central angle of each sector is denoted as

ϕ_{j}

(

j \in \{1, \dots, N_{ϕ}\}

), and each

ϕ_{j}

is obtained by uniformly shifting the polar angle from the current position

x

to the target point

x_{goal}

. The central angles of adjacent sectors are separated by a fixed angular distance. A proportional coefficient

η

is introduced to dynamically adjust the angular domain of the sectors, and its expression is given in Equation (17):

\begin{cases} Δϕ = 2π / N_{ϕ} \\ θ_{j} = [ϕ_{j} - η \cdot π / N_{ϕ}, \; ϕ_{j} + η \cdot π / N_{ϕ}] \end{cases}

(17)

in which

Δ ϕ

represents the fixed angular distance between two adjacent central angles. When

η < 1

, a moderate angular gap exists between sectors; when

η = 1

, sectors are connected end to end without overlap or gaps; when

η > 1

, sectors partially overlap, and each individual sector covers a larger area. In this paper,

η = 1

is adopted by default to fully cover the

2 π

angular range without overlap. Within the perception circle of radius

R

, each sector is divided radially into

N_{r}

equal-area rings, and the division formula is given in Equation (18):

\begin{cases} r_{i} = R \sqrt{i / N_{r}} \\ A_{i}^{r} = π \cdot R^{2} / N_{r} \end{cases}

(18)

in which

r_{i}

denotes the outer radius of ring

i

and

A_{i}^{r}

denotes the area of ring

i

. The angular domain

θ_{j}

of each sector is evenly divided into

N_{θ}

columns. Let

ω_{j}

represent the angular width of sector

j

; then, the angular step length, the central angle of column

k

, and the center of each subunit can be expressed as given in Equation (19):

\begin{cases} Δθ = ω_{j} / N_{θ} = 2 \cdot η \cdot π / (N_{ϕ} \cdot N_{θ}) \\ θ_{j,k} = ϕ_{j} - (η \cdot π / N_{ϕ}) + (k - 0.5) \, Δθ \\ P_{j,i,k} = x + \bar{r}_{i} \cdot \begin{bmatrix} \cos θ_{j,k} \\ \sin θ_{j,k} \end{bmatrix} \end{cases}

(19)

in which

Δ θ

denotes the angular step length,

θ_{j, k}

is the central angle of column

k

,

P_{j,i,k}

represents the center coordinate of the subunit located in ring

i

and column

k

of sector

j

, and

\bar{r}_{i} = (r_{i-1} + r_{i}) / 2

is the radial center of ring

i

. The value

SDF(P_{j,i,k})

is used to determine whether each subunit in the sector lies in the safe region, and the passability indicator of each subunit is defined as given in Equation (20):

allow(j, i, k) = \begin{cases} 1, & \text{if } SDF(P_{j,i,k}) > 2δ \\ 0, & \text{otherwise} \end{cases}

(20)

For column

k

of sector

j

, the first blocked ring is defined as the innermost ring whose subunit is blocked, and its index is given in Equation (21):

f(j, k) = \min \{ i : allow(j, i, k) = 0 \}

(21)

When column

k

is fully feasible,

f(j, k) = N_{r} + 1

, and the effective passable radius is

R_{j,k} = r_{f(j,k) - 1}

. The set of feasible subunits within sector

j

is defined in Equation (22):

M_{j} = \{ (i, k) \mid allow(j, i, k) = 1 \}

(22)
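Equations (18) to (21) can be combined into a short routine that discretizes one sector and finds the passable radius of each column. In the sketch below the function names, the dictionary keyed by (ring, column), and the `sdf` callable are assumptions made for illustration; ring and column indices start at 1 as in the text.

```python
import math


def ring_radii(R, n_r):
    """Equation (18): r_0 = 0, ..., r_{N_r} = R with equal-area rings."""
    return [R * math.sqrt(i / n_r) for i in range(n_r + 1)]


def sector_cells(x, phi_j, R, n_phi, n_r, n_theta, eta=1.0):
    """Equation (19): centers P_{j,i,k} of the subunits of one sector."""
    r = ring_radii(R, n_r)
    half = eta * math.pi / n_phi      # half of the sector's angular width
    d_theta = 2.0 * half / n_theta    # angular step length
    centers = {}
    for i in range(1, n_r + 1):
        r_mid = 0.5 * (r[i - 1] + r[i])
        for k in range(1, n_theta + 1):
            theta = phi_j - half + (k - 0.5) * d_theta
            centers[(i, k)] = (x[0] + r_mid * math.cos(theta),
                               x[1] + r_mid * math.sin(theta))
    return centers


def passable_radii(centers, sdf, delta, R, n_r, n_theta):
    """Equations (20) and (21): passable radius R_{j,k} per column."""
    r = ring_radii(R, n_r)
    radii = {}
    for k in range(1, n_theta + 1):
        first_blocked = n_r + 1       # value used for a fully free column
        for i in range(1, n_r + 1):
            if not sdf(centers[(i, k)]) > 2.0 * delta:
                first_blocked = i
                break
        radii[k] = r[first_blocked - 1]
    return radii
```

With a perception radius of 2 and four rings, the ring boundaries are 0, 1, about 1.414, about 1.732, and 2, so the squared radii increase in equal steps and every ring has the same area.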

In the subsequent pre-screening and detailed scoring process, multiple quantities can be sampled synchronously at the center points of subunits. To aggregate the unit information into sector-level evaluations, a cosine-based angular weighting strategy is adopted, assigning higher weights to the central column. The weight formula is given in Equation (23):

ω_{k} = \cos\left( \frac{π \cdot |k - k_{c}|}{3 \cdot k_{max}} \right)

(23)

in which

k_{c} = (N_{θ} + 1) / 2

serves as the central column index and

k_{max} = N_{θ} / 2

denotes the maximum distance. The weighted effective radius of sector

j

is given in Equation (24):

\bar{R}_{j} = \frac{\sum_{k=1}^{N_{θ}} ω_{k} \cdot R_{j,k}}{\sum_{k=1}^{N_{θ}} ω_{k}}

(24)

By constructing a polar coordinate grid within each sector, subsequent continuous integration can be converted into a local summation over each sector. The equal-area partition method can avoid systematic area-weighting biases in the average calculation of subsequent candidate sampling points, significantly reducing the subsequent computational load.
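The cosine weighting of Equation (23) and the weighted effective radius of Equation (24) can be sketched as follows. The sketch reads the argument of the cosine as π|k - k_c| divided by 3·k_max, which is an interpretation of the printed formula; the function names are hypothetical.

```python
import math


def column_weights(n_theta):
    """Equation (23), with the denominator read as 3 * k_max (assumption)."""
    k_c = (n_theta + 1) / 2.0   # central column index
    k_max = n_theta / 2.0       # maximum distance from the center
    return [
        math.cos(math.pi * abs(k - k_c) / (3.0 * k_max))
        for k in range(1, n_theta + 1)
    ]


def weighted_effective_radius(radii):
    """Equation (24): weighted mean of the per-column passable radii."""
    weights = column_weights(len(radii))
    return sum(w * r for w, r in zip(weights, radii)) / sum(weights)
```

Under this reading the argument of the cosine stays below π/3, so every column keeps a weight above 0.5 and the central column has the largest weight.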

The sector division and the division of its internal subunits are illustrated in Figure 3. Specifically, the sector division scenario for

η = 1

is presented in the top-left quadrant; the sector division scenario where

η < 1

is depicted in the top-right quadrant; the sector corresponding to

η > 1

is shown in the bottom-left quadrant; the internal division scenario of a single sector is displayed in the bottom-right quadrant.

Figure 3. Sector division and division of internal subunits.

3.2.2. Sector Pre-Screening

Conducting a complete evaluation for all

N_{ϕ}

directions at each ant step would lead to significant computational overhead. Therefore, it is necessary to eliminate poor directions using relatively lightweight information, retaining optimal directions for subsequent detailed scoring. The pre-screening evaluation formula is defined in Equation (25):

H_{pre}(j) = B(j) \cdot G_{geo}(j) \cdot G_{τ}(j)

(25)

in which

B (j)

is the basic direction score of the

j

sector,

G_{g e o} (j)

is the feasible region factor of the

j

sector, and

G_{τ} (j)

is the pheromone factor of the

j

sector. To calculate the basic direction score, the progress potential statistical measure over the sector units must be obtained first, as given in Equation (26):

\begin{cases} S_{ϕ}(j) = \frac{1}{|M_{j}|} \sum_{P \in M_{j}} \max(0, φ(\mathrm{pos}) - φ(P)) \\ \tilde{S}_{ϕ}(j) = \min\left(1, \max\left(\frac{S_{ϕ}(j) - \min_{j'} S_{ϕ}(j')}{\max_{j'} S_{ϕ}(j') - \min_{j'} S_{ϕ}(j')}, 0\right)\right) \end{cases}

(26)

in which

S_{ϕ} (j)

is the average value of progress potential decline. When the number of feasible units in the sector

|M_{j}| = 0

,

S_{ϕ} (j) = 0

.

{\tilde{S}}_{ϕ} (j)

is the normalized progress potential decline index. Then, the alignment degree between the sector’s central direction and the target direction is calculated, and its formula is given in Equation (27):

g (j) = \max (0, \cos (ϕ_{j} - ϕ_{g o a l}))

(27)

in which

ϕ_{g o a l}

is the direction angle from the current position to the target. This term can increase the score of sectors oriented toward the target and suppress those oriented away from it. The final basic direction score

B (j)

adopts a dynamic weight fusion strategy, combining

{\tilde{S}}_{ϕ} (j)

and

g (j)

. When the potential information is reliable, its weight is increased; otherwise, greater reliance is placed on the target direction. The formula for this is given in Equation (28):

\begin{cases} B(j) = ω_{φ}(j) \cdot \tilde{S}_{ϕ}(j)^{k} + (1 - ω_{φ}(j)) \cdot g(j)^{β(j)} \\ ω_{φ}(j) = 0.4 + (4 \cdot \tilde{S}_{ϕ}(j) - 0.6) / (1 + \exp(-j)) \\ β(j) = β_{min} + (β_{max} - β_{min}) \cdot \tilde{S}_{ϕ}(j) \end{cases}

(28)

in which

k

denotes the trend power exponent,

ω_{φ} (j) \in [0.4, 0.8]

is the dynamic weight factor controlled by

{\tilde{S}}_{ϕ} (j)

, and

β(j) \in [β_{min}, β_{max}]

is the amplification/suppression coefficient. When a sector shows a significant drop in the progress potential, even if its directional alignment

g (j)

is small,

{\tilde{S}}_{ϕ} (j)

can still yield a definite basic directional score in this direction. This helps in navigating around trap regions and avoids falling into local optima due to overreliance on

g (j)

. For the feasible region factor

G_{g e o} (j)

, it is necessary to calculate the coverage rate inside the sector, radial feasible depth, and obstacle avoidance quality. The formulas for these indicators are given in Equations (29)–(31):

\mathrm{cov}_{j} = \frac{1}{N_{r} \cdot N_{θ}} \sum_{i = 1}^{N_{r}} \sum_{k = 1}^{N_{θ}} \mathrm{allow}(j, i, k)

(29)

\mathrm{dep}_{j} = \operatorname{median}_{k} \left( \sqrt{(f(j, k) - 1) / N_{r}} \right)

(30)

\mathrm{qual}_{j} = \min\left(1, \max\left(\frac{1}{|M_{j}|} \sum_{P \in M_{j}} \mathrm{SDF}(P_{j, i, k}^{pre}), 0\right)\right)

(31)

in which

{c o v}_{j}

denotes the coverage rate,

d e p_{j}

is the radial feasible depth, and

q u a l_{j}

represents the obstacle avoidance quality. By taking the geometric mean of these three indicators and mapping the result to the interval

[0.2, 1]

, the feasible region factor

G_{g e o} (j)

can be obtained, with its formula given in Equation (32):

\begin{cases} g_{geo} = (\mathrm{cov}_{j} \cdot \mathrm{dep}_{j} \cdot \mathrm{qual}_{j})^{1/3} \\ G_{geo}(j) = 0.2 + 0.8 \cdot \min(1, \max(g_{geo}, 0)) \end{cases}

(32)

This design can retain a base score of 0.2, avoiding the complete occlusion of potential feasible directions. For the pheromone factor

G_{τ} (j)

, a pheromone trigger mechanism is adopted: first, pheromone sampling is performed on the sector direction to obtain the directional pheromone

τ_{p r e} (j)

; then, gain triggering is determined using the initial pheromone

τ_{0}

and the trigger threshold coefficient

μ_{p r e}

, with the formulas given in Equations (33) and (34):

z_{pre}(j) = \begin{cases} 1, & τ_{pre}(j) \geq τ_{0} \cdot μ_{pre} \\ 0, & \text{otherwise} \end{cases}

(33)

G_{τ} (j) = 1 + z_{p r e} (j)

(34)

After the rough scores of each sector are obtained via the pre-screening evaluation formula and sorted, the number of sectors to be retained is determined in accordance with Equation (35):

K_{pre} = \max(K_{min}, [η_{pre} \cdot N_{ϕ}])

(35)

in which

K_{m i n}

denotes the minimum number of sectors to retain and

η_{p r e} = 0.55

is the retention ratio. By truncating the sorted sectors, the set of sectors that proceed to subsequent evaluation, denoted as

K = \{ j_{1}, \dots, j_{K_{pre}} \}

, is obtained.
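The pre-screening steps in Equations (29)–(35) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and the nested-list layout of the allow flags and SDF samples are hypothetical, an empty feasible set is given a quality of 0, and the bracket in Equation (35) is interpreted as a ceiling.

```python
import math
import statistics

def feasible_region_factor(allow, sdf):
    """Eqs. (29)-(32) for one sector.

    allow[k][i] and sdf[k][i] are the allow flag and the SDF value of the
    unit in column k, ring i (0-based lists; layout assumed for this sketch).
    """
    n_theta, n_r = len(allow), len(allow[0])
    cov = sum(map(sum, allow)) / (n_r * n_theta)
    depths = []
    for col in allow:
        blocked = [i for i, a in enumerate(col) if not a]
        free_rings = blocked[0] if blocked else n_r     # equals f(j, k) - 1
        depths.append(math.sqrt(free_rings / n_r))
    dep = statistics.median(depths)
    safe = [sdf[k][i] for k in range(n_theta) for i in range(n_r) if allow[k][i]]
    # Assumption: a sector without feasible units gets quality 0.
    qual = min(1.0, max(sum(safe) / len(safe), 0.0)) if safe else 0.0
    g_geo = (cov * dep * qual) ** (1.0 / 3.0)
    return 0.2 + 0.8 * min(1.0, max(g_geo, 0.0))

def pheromone_factor(tau_pre, tau_0, mu_pre):
    """Eqs. (33)-(34): gain of 2 once the sampled pheromone passes the trigger."""
    return 2.0 if tau_pre >= tau_0 * mu_pre else 1.0

def retained_sectors(scores, k_min, eta_pre=0.55):
    """Eq. (35): indices of the K_pre best sectors, best first.

    The bracket in Eq. (35) is taken as a ceiling here (assumption).
    """
    n_phi = len(scores)
    k_pre = min(n_phi, max(k_min, math.ceil(eta_pre * n_phi)))
    order = sorted(range(n_phi), key=lambda j: scores[j], reverse=True)
    return order[:k_pre]
```

With eight sectors and the default retention ratio, for example, `retained_sectors` keeps the five best-scoring directions, provided the minimum count does not require more.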

The pre-screening uses a sector-based retention strategy: by default, 55% of the candidate directions are retained, subject to a specified minimum number of retained sectors. Such a lenient screening criterion leaves sufficient tolerance for deviations in the scoring process. The pre-screening score integrates two complementary indicators, namely the global progress potential decline index and the target alignment degree, which reflect the global accessibility of the direction and the geometric relationship between the direction and the end point, respectively. Even if one indicator suffers from local computational errors, the other can still keep the overall score of the direction high, so that high-quality directions remain in the candidate set. The lower bound of the geometric term is set to 0.2, which ensures that no direction is eliminated solely because of obstacle occlusion; such directions can still enter the candidate set on the strength of the other terms. The base value of the pheromone term is set to 1, so directions that lack prior experience but score well in the other terms are not penalized.

3.2.3. Detailed Scoring Model

After pre-screening, the candidate sector set is reduced from

N_{ϕ}

to

K_{p r e}

. Compared with the low-resolution fast pre-screening, the detailed scoring has two distinct features: it samples the potential field with denser subunits, improving the accuracy of environmental information; and it introduces a fine boundary interpolation mechanism to precisely locate the safe passage boundary, providing a geometric basis for the elastic step-size constraints.

The first blocked ring index

f (j, k)

defined by Equation (21) can only provide a ring-level blocking judgment and cannot precisely locate the radial position of the safe boundary (the

2 δ

equipotential line). Therefore, a five-point boundary sampling mechanism is introduced. For the first blocked ring

f_{b} = f (j, k)

, additional SDF sampling is performed at the following five key positions, as specified in Equation (36):

\begin{cases} d_{safe} = \mathrm{SDF}(\mathrm{pos} + \bar{r}_{f_{b} - 1} \cdot e_{j, k}) \\ d_{entry} = \mathrm{SDF}(\mathrm{pos} + r_{f_{b} - 1} \cdot e_{j, k}) \\ d_{block} = \mathrm{SDF}(\mathrm{pos} + \bar{r}_{f_{b}} \cdot e_{j, k}) \\ d_{left} = \mathrm{SDF}(\mathrm{pos} + \bar{r}_{f_{b} - 1} \cdot e_{j, k - 0.5}) \\ d_{right} = \mathrm{SDF}(\mathrm{pos} + \bar{r}_{f_{b} - 1} \cdot e_{j, k + 0.5}) \end{cases}

(36)

in which

e_{j, k} = {[\cos θ_{j, k}, \sin θ_{j, k}]}^{T}

is the direction unit vector and

e_{j, k \pm 0.5}

denotes the direction of the angular column boundary. The safe ring center

{\bar{r}}_{f_{b} - 1}

is located in the unblocked area; the entrance of the first blocked ring

r_{f_{b} - 1}

(i.e., the outer boundary of the safe ring) is the boundary between the safe and dangerous areas; the center of the first blocked ring

{\bar{r}}_{f_{b}}

is located inside the blocked area. The geometric relationship is illustrated in Figure 4.

Figure 4. Process for obtaining equipotential lines. (a) First blocked ring. (b) Five-point sampling. (c) Prediction of equipotential lines.

Assume that the SDF varies approximately linearly in the radial direction. The radial position of the

2 δ

equipotential line is estimated via linear interpolation. The processing is divided into two cases based on the safety state of the entrance point: when the entrance is in a dangerous state (

d_{e n t r y} < 2 δ

), the

2 δ

equipotential line lies between the safe ring center and the entrance, and the interpolation formula is given in Equation (37):

\begin{cases} t_{1} = (d_{safe} - 2δ) / (d_{safe} - d_{entry}) \\ r_{interp}(j, k) = \bar{r}_{f_{b} - 1} + t_{1} \cdot (r_{f_{b} - 1} - \bar{r}_{f_{b} - 1}) \end{cases}

(37)

When the entrance is in a safe state (

d_{e n t r y} \geq 2 δ

), the

2 δ

equipotential line lies between the entrance and the center of the first blocked ring, and the interpolation formula is given in Equation (38):

\begin{cases} t_{2} = (d_{entry} - 2δ) / (d_{entry} - d_{block}) \\ r_{interp}(j, k) = r_{f_{b} - 1} + t_{2} \cdot (\bar{r}_{f_{b}} - r_{f_{b} - 1}) \end{cases}

(38)

The above radial interpolation can reasonably predict the radial position of the equipotential line, but it cannot eliminate the interference of lateral obstacles. When

\min (d_{l e f t}, d_{r i g h t}) < 2 δ

, the interpolation result is contracted to the safety ring center for lateral obstacle intrusion, with the formula given in Equation (39):

r_{interp}(j, k) = \min(r_{interp}(j, k), \bar{r}_{f_{b} - 1})

(39)

To enhance conservatism, the minimum value of three adjacent columns is adopted to avoid excessively large step estimations caused by single-column errors, as given in Equation (40):

r_{final}(j, k) = \min \{ r_{interp}(j, k), r_{interp}(j, k - 1), r_{interp}(j, k + 1) \}

(40)

After the above processing, the safe passage radius for each angular column in the sector is given by Equation (41):

r_{safe}(j, k) = \max(0, r_{final}(j, k))

(41)
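The boundary refinement in Equations (36)–(41) can be illustrated with the short Python sketch below. The function and argument names are hypothetical. Following the textual description, the first case interpolates between the safe ring center and the entrance radius. Clamping the interpolation parameter to [0, 1], handling a non-positive denominator by staying at the inner point, and shrinking the three-column window at the sector edges are conservative assumptions added for the example.

```python
def interpolate_boundary(d_safe, d_entry, d_block, d_left, d_right,
                         r_safe_center, r_entry, r_block_center, delta):
    """Estimate the radius of the 2*delta equipotential line for one column."""
    threshold = 2.0 * delta
    if d_entry < threshold:
        # Eq. (37): boundary lies between the safe ring center and the entrance.
        denom = d_safe - d_entry
        t = (d_safe - threshold) / denom if denom > 0 else 0.0
        t = min(1.0, max(0.0, t))          # clamp added for this sketch
        r = r_safe_center + t * (r_entry - r_safe_center)
    else:
        # Eq. (38): boundary lies between the entrance and the blocked ring center.
        denom = d_entry - d_block
        t = (d_entry - threshold) / denom if denom > 0 else 0.0
        t = min(1.0, max(0.0, t))          # clamp added for this sketch
        r = r_entry + t * (r_block_center - r_entry)
    if min(d_left, d_right) < threshold:
        # Eq. (39): lateral intrusion, contract to the safe ring center.
        r = min(r, r_safe_center)
    return r

def safe_radii(r_interp):
    """Eqs. (40)-(41): minimum over adjacent columns, floored at zero.

    Columns at the sector edge use only the neighbours that exist (assumption).
    """
    n = len(r_interp)
    return [max(0.0, min(r_interp[max(0, k - 1):min(n, k + 2)]))
            for k in range(n)]

# Hypothetical samples: entrance already unsafe, no lateral intrusion.
print(interpolate_boundary(1.0, 0.2, 0.0, 1.0, 1.0, 1.0, 1.5, 2.0, delta=0.3))
print(safe_radii([1.0, 0.5, 2.0, 3.0]))
```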

By applying the angular-column cosine weighting from Equations (23) and (24), the weighted safe passage radius

{\bar{R}}_{j}^{s a f e}

of sector j is obtained, and the normalized passage depth factor is defined as Equation (42):

ξ_{j} = \left( \bar{R}_{j}^{safe} / R \right)^{α_{Geo}}

(42)

in which

α_{G e o}

denotes the obstacle penalty intensity. This factor reflects the radial passage capability of the sector direction and is used for depth correction of the geometric term. Based on geometric analysis and multi-field quantity sampling, the detailed scoring formula for the candidate sector is shown in Equation (43):

H_{coa}(j) = G_{dir}(j) \cdot T_{τ}(j) \cdot G_{geom}(j)

(43)

in which

G_{d i r} (j)

denotes the directional guidance factor,

T_{τ} (j)

represents the pheromone main term, and

G_{g e o m} (j)

refers to the geometric term. These three terms are fused in a multiplicative manner, functioning independently yet collaboratively. To achieve reasonable normalization of on-site progress and avoid interference from local extreme values, a global normalization scheme based on gradient statistics of PRM edges is adopted. By taking the 98th percentile

k_{98}

of the gradient

\nabla_{e_{ϕ}}

of the progress potential on each PRM edge as the typical maximum gradient, the global potential drop normalization scale is determined as

Δφ_{max} = k_{98} \cdot R

; this scale is based on the “maximum per-step progress potential drop value” while matching the perception circle

R

. The potential progress of the subunit is defined as Equation (44):

e_{φ}(i, k) = \min\left(1, \max\left(\frac{φ(\mathrm{pos}) - φ(i, k)}{Δφ_{max}}, 0\right)\right)

(44)

in which

φ (p o s)

denotes the progress potential value at the current position. The direction orientation factor

G_{d i r} (j)

measures the alignment degree between the central direction of the sector and the target direction and dynamically adjusts the response intensity in combination with potential progress. First, the average potential progress of the allowed units within the sector is calculated, with the formula given in Equation (45):

S'_{ϕ}(j) = \frac{1}{|M_{j}|} \sum_{(i, k) \in M_{j}} e_{ϕ}(i, k)

(45)

Since the

g (j)

obtained from pre-screening after direction refinement cannot effectively distinguish adjacent sectors, the sector division model is instead used to linearly map the target contribution degree of each direction, whose expression is given in Equation (46):

g'(j) = \begin{cases} 1 - \frac{λ_{lin} \cdot N_{ϕ} \cdot |ϕ_{j} - ϕ_{goal}|}{2π}, & \text{if } |ϕ_{j} - ϕ_{goal}| \leq \frac{π}{2} \\ 0, & \text{otherwise} \end{cases}

(46)

in which

λ_{l i n}

is the linear gain slope. By substituting

{S^{'}}_{ϕ} (j)

into Equation (28), the amplification coefficient

β (j)

is obtained, which forms the direction orientation factor together with

g^{'} (j)

. The formula is given in Equation (47):

G_{d i r} (j) = (1 + g^{'} (j))^{β (j)}

(47)

The pheromone main term

T_{τ} (j)

reflects the accumulation intensity of the swarm’s historical exploration experience in the sector direction. First, the average pheromone value of the allowed units within the sector is calculated, with the formula given in Equation (48):

\bar{τ} (j) = \frac{1}{|M_{j}|} \sum_{(i, k) \in M_{j}} τ (j, i, k)

(48)

To avoid extreme values dominating the score, a saturation normalization method is adopted to map the pheromone to a bounded interval, and the pheromone main term is obtained via exponential amplification. The formula is given in Equation (49):

\begin{cases} \hat{τ}(j) = \bar{τ}(j) / (\bar{τ}(j) + τ_{sat}) \\ T_{τ}(j) = \exp(ω_{τ} \cdot \hat{τ}(j)) \end{cases}

(49)

in which

τ_{s a t}

is the saturation constant and

ω_{τ}

is the pheromone exponential weight. This design ensures that the pheromone remains non-negative and provides exponential-level benefits to regions with high pheromone. The geometric term

G_{g e o m} (j)

is used to comprehensively evaluate the unit-level geometric quality and radial passage depth of the sector. The geometric quality is obtained, and the unit-level geometric quality is aggregated by angular columns; the formulas are given in Equations (50) and (51):

q (i, k) = e_{ϕ} (i, k) \cdot c (i, k)

(50)

\bar{q}_{k} = \frac{1}{n_{k}} \sum_{i : (i, k) \in M_{j}} q(i, k)

(51)

in which

n_{k} = |\{ i : (i, k) \in M_{j} \}|

. Unit-level geometric quality is aggregated into sector-level geometric direction quality

\tilde{q} (j)

by angular columns via Equation (23). By combining the passage depth factor in Equation (42), the correction formula for the sector’s geometric direction quality is given in Equation (52):

G_{g e o m} (j) = \tilde{q} (j) \cdot ξ_{j}

(52)

Through integration via Equation (43), the scoring set of candidate sectors is obtained as

K' = \{ (j, H_{coa}(j)) \mid j \in K \}

. This scoring formula incorporates three core dimensions: directional guidance, pheromone, and geometric feasibility. Its multiplicative fusion design not only embodies the guiding role of the global potential field, but also accommodates the detailed structure of local geometry and pheromone.
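A compact Python sketch of the multiplicative fusion in Equation (43) is given below, covering the direction orientation factor of Equations (46) and (47) and the pheromone main term of Equation (49). The function names and parameter values are hypothetical. Wrapping the angular difference into [0, π] and clamping the linear contribution at zero are assumptions added so that the example is well defined for any input.

```python
import math

def direction_factor(phi_j, phi_goal, s_prog, n_phi, lam_lin, beta_min, beta_max):
    """Eqs. (46)-(47), with beta(j) taken from the third line of Eq. (28)."""
    # Smallest absolute angle between the two directions, in [0, pi].
    diff = abs((phi_j - phi_goal + math.pi) % (2.0 * math.pi) - math.pi)
    if diff <= math.pi / 2.0:
        g_lin = 1.0 - lam_lin * n_phi * diff / (2.0 * math.pi)
        g_lin = max(0.0, g_lin)      # clamp added for this sketch
    else:
        g_lin = 0.0
    beta = beta_min + (beta_max - beta_min) * s_prog
    return (1.0 + g_lin) ** beta

def pheromone_term(tau_mean, tau_sat, w_tau):
    """Eq. (49): saturating normalization followed by exponential gain."""
    tau_hat = tau_mean / (tau_mean + tau_sat)
    return math.exp(w_tau * tau_hat)

def detailed_score(g_dir, t_tau, g_geom):
    """Eq. (43): multiplicative fusion of the three terms."""
    return g_dir * t_tau * g_geom

# Hypothetical sector aligned with the goal, with moderate pheromone.
g_dir = direction_factor(0.0, 0.0, s_prog=0.5, n_phi=16, lam_lin=0.5,
                         beta_min=1.0, beta_max=3.0)
t_tau = pheromone_term(tau_mean=1.0, tau_sat=1.0, w_tau=2.0)
print(detailed_score(g_dir, t_tau, g_geom=0.6))
```

Because the normalized pheromone lies in [0, 1), the pheromone term is bounded between 1 and exp(ω_τ), so a single heavily reinforced sector cannot dominate the product without limit.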

3.2.4. State Transition Probability

In the classical ACO algorithm, the next discrete node is selected by ants based on the product of pheromone concentration and heuristic information. Since no predefined node set exists in continuous-space path planning, ants need to select the movement direction from candidate sectors. To balance convergence and exploration robustness, as well as to match the sector scoring model, the state transition probability of the traditional ACO is adjusted. A temperature-regulated softmax mechanism is introduced to achieve the balance between exploration and exploitation, and its formulas are presented in Equations (53)–(55):

l (j) = \ln (\max (H_{c o a} (j), ε)) / T

(53)

l'(j) = l(j) - \max_{k} l(k)

(54)

P(j) = \exp(l'(j)) / \sum_{i}^{N_{ϕ}} \exp(l'(i))

(55)

in which

T

is the temperature parameter,

ε = 1 0^{- 12}

denotes the numerical protection constant,

l (j)

represents the logarithmic value converted from the original score

H_{c o a} (j)

,

l^{'} (j)

is the value after numerical stabilization, and

P (j)

refers to the final state transition probability. When

T \to 0^{+}

, the differentiation of

l^{'} (j)

is amplified, causing the current algorithm to be more inclined to select the direction with the highest score; when

T \to \infty

,

l'(j) \to 0^{-}

; at this point, the score differences across all directions are minimized, and the ant is more prone to random exploration. Finally, the selected sector

j

and the corresponding movement direction

ϕ_{j}

are determined via the roulette wheel selection method.
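Equations (53)–(55) and the roulette wheel step can be sketched in Python as follows. The function names are hypothetical, and the normalization runs over whatever list of candidate scores is passed in.

```python
import math
import random

def transition_probabilities(scores, temperature, eps=1e-12):
    """Temperature-regulated softmax over log-scores, Eqs. (53)-(55)."""
    logits = [math.log(max(h, eps)) / temperature for h in scores]
    top = max(logits)                       # Eq. (54): shift for stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def roulette_select(probs, rng=random):
    """Roulette wheel selection: return index j with probability probs[j]."""
    u = rng.random()
    acc = 0.0
    for j, p in enumerate(probs):
        acc += p
        if u < acc:
            return j
    return len(probs) - 1                   # guard against rounding

scores = [1.0, 3.0]
print(transition_probabilities(scores, temperature=1.0))    # proportional to scores
print(transition_probabilities(scores, temperature=0.01))   # nearly greedy
print(roulette_select(transition_probabilities(scores, 1.0), random.Random(0)))
```

At T = 1 the softmax of the log-scores reduces to probabilities proportional to the raw scores, which makes the role of the temperature easy to see: lower values sharpen the distribution and higher values flatten it.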

The main advantages of the temperature-based logarithmic softmax state transition probability are as follows. First, the probability model is decoupled from the specific physical, geometric, and pheromone designs, so the hierarchical components can be combined in any reasonable manner while a higher score always corresponds to a greater probability, resulting in a clear and scalable structure. Second, the temperature T provides a simple and controllable “exploration-exploitation” adjustment, so the model can be applied in scenarios with large score differences or significant scale changes and can balance global search and local exploitation. Third, the scheme can be combined with the subsequent elastic step-size adjustment to break out of traps through the “stagnation-temperature rise” mechanism.

3.3. Ant Movement and Pheromone Update

3.3.1. Elastic Step Size and Direct Connection Scheme

After the moving direction

ϕ_{j}

is determined, the actual moving step size must be determined. Due to the geometric complexity of obstacles in continuous space, a fixed step size cannot balance efficient traversal in open areas and safe navigation through narrow channels. Thus, an elastic step size mechanism is introduced, combined with the “frustration-induced temperature rise” mechanism, which enables adaptive step size adjustment and enhanced exploration. The primary constraint of the elastic step size is geometric safety, i.e., the step size does not exceed the passable distance in the current direction. The nominal step size is calculated by integrating two geometric upper bounds: the maximum expected step size is set equal to the sensing circle radius, both denoted as

R

; the safe passage radius

R_{s a f e}

is obtained from the angular column closest to the selected direction

ϕ_{j}

. The nominal step size is defined as the minimum of these two values, and the corresponding formula is given by Equation (56):

R_{n o m} = \min (R, R_{s a f e})

(56)

This ensures that the nominal step size does not violate any geometric constraints. To enhance the breakthrough capability of the ant when it is trapped, a frustration-induced temperature rise mechanism based on the degree of frustration is introduced. Let

f (t) \in [0, 1]

denote the frustration degree at the

t

step, whose formulas are given by Equations (57) and (58):

f (t + 1) = λ^{'} \cdot f (t) + (1 - λ^{'}) \cdot I_{t}

(57)

α_{t + 1} = \begin{cases} 0.5, & \text{if } I_{t} = 1 \\ 1, & \text{otherwise} \end{cases}

(58)

in which

λ^{'}

is the frustration memory coefficient and

I_{t}

denotes the movement flag. The movement indicator

I_{t}

is set to 0 upon successful movement and 1 upon failure. The degree of frustration is also used to dynamically adjust the temperature parameter in the state transition probability. When

f (t) > 0.3

, a power function mapping is employed to calculate the dynamic temperature, and the corresponding formula is given by Equation (59):

T (f) = T_{0} + (T_{m a x} - T_{0}) f^{q}

(59)

in which

T_{0}

serves as the temperature baseline,

T_{m a x}

acts as the maximum temperature upper limit, and

q

denotes the temperature rise rate. When

f (t) \approx 0

,

T (f) \approx T_{0}

, with heuristic guidance being dominant. When

f (t) \approx 1

,

T (f) \approx T_{m a x}

; the probability differences across all feasible directions are narrowed, thereby enhancing the breakthrough capability. By integrating geometric safety and frustration scaling, the formula for calculating the actual moving step size is given by Equation (60):

R_{use} = \max(R_{min}, α_{t} \cdot R_{nom})

(60)

in which

R_{m i n}

acts as the minimum step size constraint, which prevents stagnation caused by an excessively small step size. To prevent the ant from looping indefinitely in an unsolvable region, an obstruction counter is introduced to record the ant’s consecutive obstruction status, and the corresponding formula is given by Equation (61):

H(t + 1) = \begin{cases} H(t) + 1, & \text{if } f(t + 1) > f_{high} \\ 0, & \text{otherwise} \end{cases}

(61)

in which

f_{h i g h}

is the obstruction threshold. While the frustration degree exceeds this threshold, the counter is incremented; otherwise, it is reset to zero. When

H (t) = 6

, the ant self-destructs. When the ant approaches the target, a direct connection strategy is introduced to avoid lingering near the end point. When

\| \mathrm{goal} - \mathrm{pos} \| < R

, the formula for the SDF values of the key sampling points in the subunits of the angular column corresponding to the target direction is given by Equation (62):

d'_{center} = \mathrm{SDF}(\mathrm{pos} + \bar{r}_{i} \cdot e_{goal}), \quad i = 1, 2, \dots, i_{goal}

(62)

in which

e_{g o a l} = {[\cos ϕ_{g o a l}, \sin ϕ_{g o a l}]}^{T}

is the unit vector pointing to the target,

{\bar{r}}_{i}

denotes the central radius of the

i

ring, and

i_{g o a l}

refers to the radial ring index where the target is located. Along the direction from the current position to the target, if the SDF values of all central points are no less than the safety margin

2 δ

and

\mathrm{SDF}(\mathrm{goal}) \geq δ

, the ant is allowed to move directly to the target end point.

In this section, a complete movement execution mechanism is established: in regular scenarios, the elastic step size is adjusted adaptively; in trapped scenarios, the frustration-induced temperature rise mechanism is employed to enhance exploration and breakthrough capability; when the ant approaches the target, the direct connection strategy is prioritized; and self-destruction is triggered upon excessive obstruction. Together, these measures balance travel efficiency and robustness.
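The per-step bookkeeping of Equations (56)–(61) can be sketched in Python as follows. The function names and numeric values are hypothetical. The sketch assumes that the temperature stays at its baseline while the frustration degree is at or below 0.3, and that the step size is the larger of the minimum step and the frustration-scaled nominal step.

```python
def update_frustration(f, failed, lam):
    """Eq. (57): exponential memory of recent movement failures (I_t = 1 on failure)."""
    return lam * f + (1.0 - lam) * (1.0 if failed else 0.0)

def step_scale(failed):
    """Eq. (58): halve the next step after a failed move."""
    return 0.5 if failed else 1.0

def temperature(f, t_0, t_max, q, f_trigger=0.3):
    """Eq. (59); baseline temperature below the trigger is an assumption."""
    if f <= f_trigger:
        return t_0
    return t_0 + (t_max - t_0) * f ** q

def step_size(r, r_safe, r_min, alpha):
    """Eqs. (56) and (60), read as max(R_min, alpha * min(R, R_safe))."""
    r_nom = min(r, r_safe)
    return max(r_min, alpha * r_nom)

def update_obstruction(h, f_next, f_high):
    """Eq. (61): count consecutive steps with high frustration."""
    return h + 1 if f_next > f_high else 0

# Hypothetical ant that fails three moves in a row.
f, h = 0.0, 0
for _ in range(3):
    f = update_frustration(f, failed=True, lam=0.5)
    h = update_obstruction(h, f, f_high=0.6)
    print(round(f, 3), h, temperature(f, t_0=1.0, t_max=5.0, q=2.0))
```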

3.3.2. Pheromone Update

Since the movement of ants does not occur between grid nodes, there is no grid to carry the pheromones. Compared with the Gaussian kernel, the raised-cosine kernel has both zero value and zero derivative at the boundaries, requiring no truncation processing, and the cosine function offers higher computational efficiency than the exponential function. Thus, the raised cosine kernel deposition is introduced. Consider a vertex sequence set

V = \{v_{0}, v_{1}, \dots, v_{m}\}

for a path. The length of the

k

segment is

l_{k} = | v_{k + 1} - v_{k} |

, and the cumulative arc length sequence is

\{c_{0}, c_{1}, \dots, c_{m}\}

(where

c_{0} = 0

,

c_{k + 1} = c_{k} + l_{k}

, and

c_{m} = L

denotes the total length of the path). The polyline segment is represented as a function

γ (s)

parameterized by the arc length

s

, which is given by Equation (63):

γ(s) = v_{k} + \frac{s - c_{k}}{l_{k}} \cdot (v_{k + 1} - v_{k}), \quad s \in [c_{k}, c_{k + 1}]

(63)

The path is resampled equidistantly with a step size of

d s = 0.15 R

, yielding the coordinates of the sampling points

γ (s_{i})

and the cumulative arc lengths

s_{i} = i \cdot d s, i = 0,1, \dots, N,

and

N = L / d s

. The raised cosine kernel is introduced to establish the single-point deposition formula, which is given by Equation (64):

K(r; R) = \begin{cases} (1 + \cos(π r / R)) / 2, & 0 \leq r \leq R \\ 0, & r > R \end{cases}

(64)

Let

T_{line}

denote the peak path deposition intensity. The pheromone deposition model, obtained by discretizing the integral along the path as a Riemann sum, is given by Equation (65):

τ_{p}(x, T_{line}, R) = T_{line} \cdot \frac{ds}{R} \sum_{i = 0}^{N} K(\| x - γ(s_{i}) \|; R)

(65)

In this model, the deposition peak is controlled by adjusting

T_{l i n e}

, while the pheromone diffusion width is regulated by tuning

R

. Its deposition effect is illustrated in Figure 5.

Figure 5. Pheromone deposition. (a) Single-point deposition when

R = 1

. (b) Pheromone deposition along the path.
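The band-shaped deposition of Equations (63)–(65) can be illustrated with the Python sketch below. The function names are hypothetical, the number of samples is taken as the floor of L / ds, and the path is assumed to contain at least two vertices.

```python
import math

def raised_cosine(r, radius):
    """Eq. (64): kernel equal to 1 at r = 0 and 0 at r = radius."""
    if 0.0 <= r <= radius:
        return 0.5 * (1.0 + math.cos(math.pi * r / radius))
    return 0.0

def resample_polyline(vertices, ds):
    """Eq. (63): points at arc lengths 0, ds, 2*ds, ... along the polyline."""
    cum = [0.0]
    for a, b in zip(vertices, vertices[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    n = int(cum[-1] // ds)                  # floor(L / ds), assumed
    points, k = [], 0
    for i in range(n + 1):
        s = i * ds
        while k < len(cum) - 2 and s > cum[k + 1]:
            k += 1                          # advance to the segment containing s
        seg = cum[k + 1] - cum[k]
        t = (s - cum[k]) / seg if seg > 0 else 0.0
        a, b = vertices[k], vertices[k + 1]
        points.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return points

def deposit(x, vertices, t_line, radius, ds=None):
    """Eq. (65): Riemann-sum deposition at point x for one path."""
    ds = 0.15 * radius if ds is None else ds
    samples = resample_polyline(vertices, ds)
    return t_line * (ds / radius) * sum(
        raised_cosine(math.dist(x, p), radius) for p in samples)

path = [(0.0, 0.0), (2.0, 0.0)]
print(deposit((1.0, 0.0), path, t_line=1.0, radius=1.0))   # on the path
print(deposit((1.0, 0.5), path, t_line=1.0, radius=1.0))   # half a radius away
print(deposit((1.0, 5.0), path, t_line=1.0, radius=1.0))   # outside the band
```

Since the kernel vanishes at r = R, points farther than R from every sample receive no pheromone, which yields the band of finite width described above without any explicit truncation.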

Regarding the pheromone update rule, let

U_{k}

denote the set of points on the map that are within a distance

R

from the

k

path. Local pheromone update is executed immediately after each ant reaches the target, acting on the neighborhood of that ant’s path; its formula is given by Equation (66):

τ_{new}(x) = \begin{cases} (1 - ρ_{loc}) τ(x) + ρ_{loc} τ_{0}, & x \in U_{k} \\ τ(x), & \text{otherwise} \end{cases}

(66)

Global pheromone update is executed uniformly at the end of each generation. Batch deposition is performed based on the paths of all target-reaching ants, and its formulas are given by Equations (67)–(69):

τ_{n e w} (x) = (1 - ρ) τ (x) + ρ Δ τ (x)

(67)

Δ τ (x) = \sum_{k = 1}^{M_{a r}} Δ τ_{k} (x)

(68)

Δτ_{k}(x) = \begin{cases} T_{line} = Q / L_{k}, & \text{if ant } k \text{ passes through path } p_{i} \text{ in this iteration} \\ 0, & \text{otherwise} \end{cases}

(69)

in which

ρ_{l o c}

denotes the local pheromone evaporation coefficient,

ρ

is the global pheromone evaporation coefficient, and

τ_{0}

represents the baseline pheromone concentration.

Q

is the deposition constant,

L_{k}

stands for the geometric length of the

k

path, and

M_{a r}

is the number of ants that reach the target in the current generation.
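For a single map point, the two update rules in Equations (66)–(68) and the length-dependent peak intensity in Equation (69) reduce to a few lines of Python. The sketch below uses hypothetical function names and values, and treats the pheromone at one point x as a scalar.

```python
def local_update(tau, tau_0, rho_loc, in_band):
    """Eq. (66): pull the pheromone toward tau_0 inside the band U_k."""
    if in_band:
        return (1.0 - rho_loc) * tau + rho_loc * tau_0
    return tau

def global_update(tau, deposits, rho):
    """Eqs. (67)-(68): evaporate, then add the summed deposits of arrived ants."""
    return (1.0 - rho) * tau + rho * sum(deposits)

def peak_intensity(q, path_length):
    """T_line = Q / L_k from Eq. (69): shorter paths deposit more."""
    return q / path_length

tau = 2.0
tau = local_update(tau, tau_0=1.0, rho_loc=0.5, in_band=True)
tau = global_update(tau, deposits=[0.4, 0.6], rho=0.1)
print(tau, peak_intensity(q=10.0, path_length=5.0))
```

The local rule moves the pheromone near a just-completed path back toward the baseline, which discourages the following ants of the same generation from repeating that path, while the global rule reinforces all successful paths at the end of the generation.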

3.4. Trajectory Optimization

Ants that successfully reach the target generate a collision-free polygonal trajectory

V = \{v_{0}, v_{1}, \dots, v_{m}\}

. However, due to the exploration characteristic of direction refinement, the path tends to exhibit jitter, which is unfavorable for trajectory tracking and velocity planning of the robot. Therefore, continuous spatial path smoothing is performed in two stages: local low-pass smoothing and Bézier curve smoothing. Several rounds of local low-pass iterations are conducted on the inner points

v_{i} (i = 1,2, \dots, m - 1)

of the path. In a single iteration, the formula for constructing the candidate position of the

i

point is given by Equation (70):

{\hat{v}}_{i} = α \cdot v_{i - 1} + (1 - 2 α) \cdot v_{i} + α \cdot v_{i + 1}

(70)

in which

α

denotes the smoothing weight. To prevent points from being overstretched, a constraint is imposed on the point displacement, and its formula is given by Equation (71):

v_{i}^{cand} = v_{i} + \min\left(1, \frac{δ \cdot \bar{l}}{\| \hat{v}_{i} - v_{i} \|}\right) \cdot (\hat{v}_{i} - v_{i})

(71)

in which

\bar{l}

represents the average segment length of the path. The candidate point

v_{i}^{c a n d}

must satisfy the safety constraint

\mathrm{SDF}(v_{i}^{cand}) \geq δ

, and no collision occurs between the candidate segments

l_{v_{i - 1}, v_{i}^{c a n d}}

and

l_{v_{i}^{c a n d}, v_{i + 1}}

. After a limited number of iterations, the jitter in the path can be significantly reduced, resulting in a pre-smoothed path. Quadratic Bézier corner smoothing is then applied to the corners of the pre-smoothed path. Take three consecutive points

A = v_{i - 1}^{cand}, B = v_{i}^{cand}, C = v_{i + 1}^{cand}

and define the edge vectors

l'_{1} = B - A, l'_{2} = C - B

. The formula for constructing the tangent points is given by Equation (72):

\begin{cases} T_{1} = B - t \cdot (l'_{1} / |l'_{1}|) \\ T_{2} = B + t \cdot (l'_{2} / |l'_{2}|) \end{cases}

(72)

in which

t = t_{f} \cdot \min(|l'_{1}|, |l'_{2}|)

, where

t_{f} = 0.33

denotes the tangent point distance factor. By taking

T_{1}

,

B

, and

T_{2}

as control points, the formula for constructing the quadratic Bézier curve is given by Equation (73):

B (u) = (1 - u)^{2} T_{1} + 2 (1 - u) u \cdot B + u^{2} T_{2}

(73)

A chamfer is deemed valid if and only if

\forall u_{k} : \mathrm{SDF}(B(u_{k})) \geq δ

and all connecting segments are collision-free. If the above condition is not satisfied, the chamfer is discarded, and the original inflection point is retained. Through the two-stage processing, path backtracking and jitter can be eliminated as much as possible while ensuring path safety, thereby making the trajectory more consistent with the dynamic constraints of the robot.
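The two smoothing stages can be sketched in Python as follows. The function names are hypothetical, `max_shift` stands for the product δ·l̄ in Equation (71), the SDF is passed in as a callable, and the three corner points are assumed to be distinct. Segment collision checks are omitted for brevity.

```python
import math

def lowpass_candidate(prev, cur, nxt, alpha, max_shift):
    """Eqs. (70)-(71): weighted average with a cap on the displacement."""
    hat = tuple(alpha * p + (1.0 - 2.0 * alpha) * c + alpha * n
                for p, c, n in zip(prev, cur, nxt))
    move = [h - c for h, c in zip(hat, cur)]
    norm = math.hypot(*move)
    scale = 1.0 if norm == 0.0 else min(1.0, max_shift / norm)
    return tuple(c + scale * m for c, m in zip(cur, move))

def bezier_corner(a, b, c, t_f=0.33, samples=10):
    """Eqs. (72)-(73): quadratic Bezier chamfer around corner b."""
    l1 = (b[0] - a[0], b[1] - a[1])
    l2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*l1), math.hypot(*l2)
    t = t_f * min(n1, n2)
    t1 = (b[0] - t * l1[0] / n1, b[1] - t * l1[1] / n1)
    t2 = (b[0] + t * l2[0] / n2, b[1] + t * l2[1] / n2)
    curve = []
    for i in range(samples + 1):
        u = i / samples
        curve.append(tuple((1.0 - u) ** 2 * p1 + 2.0 * (1.0 - u) * u * pb + u ** 2 * p2
                           for p1, pb, p2 in zip(t1, b, t2)))
    return curve

def chamfer_is_safe(curve, sdf, delta):
    """Accept the chamfer only if every sampled point keeps the margin delta."""
    return all(sdf(p) >= delta for p in curve)

# Hypothetical right-angle corner in free space (constant SDF of 1.0).
curve = bezier_corner((0.0, 0.0), (1.0, 0.0), (1.0, 1.0))
print(lowpass_candidate((0.0, 0.0), (1.0, 1.0), (2.0, 0.0), alpha=0.25, max_shift=0.1))
print(chamfer_is_safe(curve, lambda p: 1.0, delta=0.5))
```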

4. TPP-CSACO Path Planning Algorithm Process

The overall workflow of the proposed TPP-CSACO path planning algorithm is presented in this section, as illustrated in Figure 6. First, a PRM topological graph is constructed on the map, and the topological potential is calculated at its nodes to characterize the global guidance trend from the start point to the end point. Subsequently, the pheromone field is initialized on the map, and the number of ants

M

, iteration count

n,

and current ant index

m

are set. In the

n

iteration, for each ant

m

, the first step is to detect whether the target lies within the current sensing range and whether the direct path satisfies the SDF safety constraint. If both conditions are satisfied, the ant is directly connected to the target, and a strip-shaped local pheromone update is performed along this path; otherwise, pre-screening and fine scoring are conducted for the sector directions, and direction selection is carried out according to the probability rules. Subsequently, the ant’s position and temperature parameters are updated based on the elastic step size and frustration mechanism, and the process returns to re-detect whether the target is visible and safe. After the current ant reaches the end point or meets the termination condition, the ant index is updated, and the above process is repeated until all M ants in this iteration are completed. After all ants in the current iteration complete the search, a global pheromone update is performed based on the pheromone regions deposited by successful paths. The optimal path of the current iteration is selected, and trajectory optimization is conducted to obtain a smooth and feasible intra-iteration optimal path. When the number of iterations reaches the upper limit

N

, the currently globally optimal path is output as the final planning result. The detailed pseudocode is provided in Supplementary Materials Figures S1–S4.

Figure 6. The flow diagram of TPP-CSACO.

5. Simulation Experiments and Analysis

5.1. Algorithm Parameter Analysis

In the improved TPP-CSACO, core modules including pre-screening, detailed scoring, and elastic step size contain a large number of adjustable parameters. However, it is difficult to explain the respective action mechanisms of each parameter merely by repeatedly adjusting the parameters and running the algorithm. Moreover, coupled parameters are not conducive to system adjustment and may cause overfitting on specific maps. To address this issue, an analysis framework based on analytical models and real samples is adopted: three typical test scenarios (open area, U-shaped obstacle, and triangular obstacle) are selected. In each scenario, a sector scoring database containing information such as the forward-looking potential

{\tilde{S}}_{ϕ} (j)

, direction vector

g (j)

, and normalized geometric depth

ξ_{j}

is constructed in a static manner. For the positions corresponding to the parameters, the influence curves of parameter variations are plotted respectively. Other structural or weakly sensitive parameters are fixed to their default values. This approach not only enables intuitive observation of the influence pattern of the parameters but also allows for the assessment of the rationality of the default parameter configuration.

Table 1 presents partial parameter values adopted by the current algorithm. Only in this parameter analysis is the perception radius

R

adjusted to 2. Figure 7 illustrates the scoring results for the three maps with the ant held static. Figure 7b, Figure 7d and Figure 7f each show, for the corresponding environment, the sectors retained at the current position, a comparison of the pre-screening scores of the sectors, and a comparison of their detailed evaluation scores.

Table 1. Related parameters involved in TPP-CSACO (part one).

Figure 7. Pre-screening and scoring results of three environments under static ant conditions. (a) Open area environment. (b) Scoring results of the open area. (c) Triangular obstacle environment. (d) Scoring results of the triangular obstacle environment. (e) U-shaped obstacle environment. (f) Scoring results of the U-shaped obstacle environment.

Figure 8a illustrates the influence of variations in

k

on

{\tilde{S}}_{ϕ} (j)

, where the gray scatter points represent the actual sector distributions of the three scenarios under the default value. As can be seen from Figure 8a, the actual operating points are mainly concentrated in the moderate potential drop interval rather than at the two ends of extremely optimal

{\tilde{S}}_{ϕ} (j) \to 1

or extremely poor

{\tilde{S}}_{ϕ} (j) \to 0

. For

k < 1

, the curve bulges upward (it is concave) in the moderate potential drop segment, which raises the relative score of non-optimal directions and helps maintain diversity in the pre-screening phase; for

k = 1

, a linear influence is exhibited, and the physical gradient of the potential itself is completely retained. For

k > 1

, the curve bows downward (it is convex), which significantly reduces the scores of moderate potential drop directions and concentrates the algorithm on high-quality trend directions. The term

\partial {\tilde{S}}_{ϕ} (j)^{k} / \partial k

indicates that the parameter sensitivity approaches 0 at both

{\tilde{S}}_{ϕ} (j) \to 1

and

{\tilde{S}}_{ϕ} (j) \to 0

and is only significant in the moderate segment. This implies that the adjustment of

k

mainly acts on the “sub-optimal candidate directions” with dense actual distributions, while its impact on the ranking of extremely optimal or poor directions is limited.
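The sensitivity argument can be checked numerically. The sketch below evaluates the power term and its analytic derivative with respect to k; the sample values of the normalized potential and of k are illustrative choices, not the defaults of Table 1.

```python
import math


def power_score(s: float, k: float) -> float:
    """Pre-screening potential term: normalized potential s in [0, 1] raised to k."""
    return s ** k


def sensitivity_to_k(s: float, k: float) -> float:
    """Analytic derivative d(s**k)/dk = s**k * ln(s); the limit at s = 0 is 0."""
    if s <= 0.0:
        return 0.0
    return (s ** k) * math.log(s)


# Illustrative sweep: the derivative is small near both ends and largest in between.
for s in (0.01, 0.2, 0.5, 0.8, 0.99):
    print(
        f"s={s:4.2f}  s^0.5={power_score(s, 0.5):.3f}  "
        f"s^2={power_score(s, 2.0):.3f}  "
        f"d/dk at k=1: {sensitivity_to_k(s, 1.0):+.3f}"
    )
```

At k = 1 the magnitude of the derivative peaks at s = 1/e, which lies inside the moderate potential drop segment where the real samples concentrate.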

Figure 8. Parameter-sweep response curves for

k

and

α_{G e o}

(dots: real samples). (a)

{({\tilde{S}}_{ϕ} (j))}^{k}

versus

{\tilde{S}}_{ϕ} (j)

under different

k

. (b)

ξ_{j}^{α_{G e o}}

versus

ξ_{j}

under different

α_{G e o}

.

Figure 8b clearly demonstrates the influence of different

α_{G e o}

values on the scores of near-obstacle regions. The scatter distribution in the figure confirms that the operating points significantly affected by

α_{G e o}

are mainly those close to obstacle edges, while the impact on open regions is negligible. When

α_{G e o} < 1

, the score attenuation in near-obstacle regions becomes gentle. In narrow gap traversal scenarios, this setting reduces the algorithm’s repulsion against wall-hugging behavior, significantly improving the passage probability in narrow spaces. When

α_{G e o} > 1

, the curve drops sharply at small

ξ_{j}

values, which means that only sectors far from the obstacle center can obtain valid scores; increasing

α_{G e o}

enables the establishment of a “harder” safety boundary around obstacles.

As shown in Figure 9, under the current

β_{m i n}

and

β_{m a x}

, different potential progress indices

{\tilde{S}}_{ϕ} (j)

and average potential progress

S_{ϕ}^{'}

are employed to plot the curve of

g (j)^{β (j)}

(in pre-screening) varying with the direction alignment

g (j)

, as well as the curve of the direction pointing factor

G_{d i r} (j)

(in detailed scoring) varying with

(1 + g^{'} (j))

; real sector samples are overlaid on these curves. On the premise that

β_{m i n} = 1

is kept constant, a proper increase in

β_{m a x}

will cause the curve corresponding to high-potential progress sectors to be further depressed (for pre-screening) or elevated (for detailed scoring) in the medium-to-high independent variable interval.

Figure 9. Parameter-sweep response curves of the directional terms with (

β_{m i n}

,

β_{m a x}

) fixed (dots: real sector samples). (a)

g (j)^{β (j)}

versus

g (j)

under different

{\tilde{S}}_{ϕ}

. (b)

G_{d i r} (j)

versus

(1 + g^{'} (j))

under different

S_{ϕ}^{'}

.

As illustrated in Figure 9a, when the potential drop

{\tilde{S}}_{ϕ} (j)

increases from 0.2 to 0.8, the descending amplitude of the curve is enlarged, and the curve becomes steeper in shape. Since the base number satisfies

g (j) \in [0, 1]

, an increase in the exponent induces a monotonic decrease in the function value. This implies that when the local potential drop is significant, the algorithm narrows its directional focus and imposes stricter nonlinear suppression on directions deviating from the target; when the potential drop is weak, the curve tends to be linear, and the algorithm relaxes its tolerance to permit a wider range of trials. The scatter points in the figure indicate that the real data cover the entire dynamic adjustment range from “tolerance” to “focus.” In pre-screening, the parameter

β_{m a x}

essentially defines the “focusing intensity” for the high potential drop region. Under the current parameters, it ensures that the algorithm converges its focus rapidly in favorable scenarios, while maintaining the necessary exploration sector in adverse scenarios—thus effectively avoiding the rigidity or blind divergence that may arise from a fixed exponent. A smaller

β_{m a x}

makes the curve in the high-

{\tilde{S}}_{ϕ} (j)

region approximate linearity, remaining relatively tolerant of deviated directions, which is beneficial for retaining more candidates in multi-objective or complex terrain; a larger

β_{m a x}

, by contrast, significantly reduces the scores of these candidates, causing the pre-screening to favor a small number of sectors that are highly aligned with the target direction.
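The focusing effect in pre-screening can be illustrated with a few lines of code. The sketch assumes that the exponent is interpolated linearly between its minimum and maximum by the normalized potential drop; this interpolation and the value 3.0 used for the maximum exponent are assumptions made for illustration, since the exact definition is given earlier in the paper. The minimum exponent of 1 follows the text above.

```python
def dynamic_exponent(s_tilde: float, beta_min: float = 1.0, beta_max: float = 3.0) -> float:
    """Assumed linear interpolation of the exponent between beta_min and beta_max."""
    return beta_min + (beta_max - beta_min) * s_tilde


def prescreen_direction_term(g: float, s_tilde: float, beta_max: float = 3.0) -> float:
    """Directional pre-screening term g ** beta for an alignment g in [0, 1]."""
    return g ** dynamic_exponent(s_tilde, beta_max=beta_max)


# For a base in (0, 1), a larger exponent gives a smaller value, so sectors with a
# strong potential drop penalize misaligned directions more heavily.
for g in (0.25, 0.5, 0.75, 1.0):
    weak = prescreen_direction_term(g, 0.2)
    strong = prescreen_direction_term(g, 0.8)
    print(f"g={g:.2f}  S=0.2 -> {weak:.3f}   S=0.8 -> {strong:.3f}")
```

Under this assumed form, lowering the maximum exponent moves the high-potential curve back toward the linear case, which matches the tolerance described for a smaller maximum exponent.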

As illustrated in Figure 9b, contrary to the inhibition logic of pre-screening, high-quality directions are enhanced through detailed scoring. As

S_{ϕ}^{'}

increases, the slope of the gain curve rises significantly. This implies that, under the condition of identical potential progress, nonlinear rewards are granted by the algorithm to candidate sectors approaching the target direction. The real samples in the figure confirm that the core function of this parameter is to directionally amplify high-quality candidates that conform to both the local direction and the regional trend during the final decision-making phase. In detailed scoring, the parameter

β_{m a x}

acts as the “amplification gain” for advantageous signals. It does not interfere with the ranking of low-scoring candidates; instead, it is specifically used to widen the gap between “good” and “best” candidates, thereby significantly enhancing their relative advantages. If

β_{m a x}

is set to a relatively low value, the final scores can be kept smoother; if

β_{m a x}

is moderately increased, the score gap between advantageous sectors and ordinary candidates will be widened, which strengthens the decisiveness of the final decision.

A similar analysis is adopted for the parts involving pheromones, but with slight differences: instead of relying on real environmental data, the influence of pheromone intensity

τ

on the scoring gain

T_{τ} (τ)

is directly examined through analytical formulas.

The pre-screening gating parameter

μ_{p r e}

is essentially a threshold-based binary classifier. Instead of analyzing the influence curve of the pre-screening gating independently, it is treated as a background mechanism: the pre-screening layer only provides an eligibility assessment (to verify whether the minimum criteria are met) in the dimension of pheromone, and it does not further subdivide grades within the high

τ

interval where the criteria have already been satisfied.

Figure 10a presents the influence curves of different intensity factors

ω_{τ}

on the detailed scoring gain

T_{τ} (τ)

, under the condition that

τ_{s a t} = 2.0

(held constant). Two observation perspectives are provided by the dual horizontal axes in the figure: the lower axis corresponds to the actual pheromone concentration

τ

, while the upper axis represents the corresponding dimensionless intensity

X = τ / τ_{s a t}

. From the overall morphology of the curves,

ω_{τ}

determines both the gain upper limit of the pheromone term and the local discrimination capability. On one hand, each curve converges to its respective theoretical upper limit (marked by the dashed line in the figure) on the right side, which directly quantifies the maximum amplification factor achievable by mature paths. On the other hand, for the same

τ

or

X

, the larger

ω_{τ}

is, the higher the slope of the curve

\partial T_{τ} (τ) / \partial τ

becomes, and the stronger the capability to resolve differences in pheromone intensity. The point where the vertical gray dashed line in the figure intersects each curve is the half-saturation point. At this location, the normalized pheromone satisfies

X / (X + 1) = 0.5

. This point divides each curve into two segments: in the left region

X < 1

, the curve is relatively steep with an approximately linear growth; in the right region

X > 1

, the curve gradually flattens and tends toward saturation. In the early stage of pheromone accumulation, most sectors remain in the

X < 1

region. It can be observed that the curves corresponding to different

ω_{τ}

values are close to each other at this stage, indicating that even if

ω_{τ}

is increased, the value of

T_{τ} (τ)

will only rise slowly, and the dependence of the score on

ω_{τ}

is automatically suppressed. When

X = 0.5

,

\partial \ln T_{τ} (τ) / \partial ω_{τ} \approx 0.33

, which suggests that adjusting

ω_{τ}

at this stage only induces a small change in gain; when

X \geq 2

,

\partial \ln T_{τ} (τ) / \partial ω_{τ} > 0.6

, so varying

ω_{τ}

can significantly alter

\ln T_{τ} (τ)

, thereby effectively controlling the amplification factor of the pheromone term in the total score. The “converging on the left, diverging on the right” morphology reveals the mechanism of

ω_{τ}

: the algorithm leverages the nonlinear property of the saturation function to inherently suppress the parameter’s effect during the cold-start phase, avoiding the over-amplification of random influences in the low-pheromone-intensity region; whereas in the mature phase of the high-pheromone region,

ω_{τ}

dominates the control of path utilization intensity.
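The quoted sensitivities can be reproduced with a small script. The definition of the pheromone gain is given earlier in the paper and is not repeated in this section; the exponential saturation form used below is an assumption, chosen because it yields the values cited above (about 0.33 at X = 0.5 and more than 0.6 for X ≥ 2). The function names are hypothetical.

```python
import math


def pheromone_gain(tau: float, omega: float, tau_sat: float) -> float:
    """Assumed gain T = exp(omega * X / (1 + X)) with X = tau / tau_sat."""
    x = tau / tau_sat
    return math.exp(omega * x / (1.0 + x))


def log_gain_sensitivity(x: float) -> float:
    """d ln(T) / d omega under the assumed form; it depends only on X."""
    return x / (1.0 + x)


# Sensitivity of ln(T) to omega at several dimensionless intensities X.
for x in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(f"X={x:3.1f}  dlnT/domega={log_gain_sensitivity(x):.3f}")

# Upper limit of the gain for each omega (reached as tau grows without bound).
for omega in (0.5, 1.0, 1.5, 2.0):
    print(f"omega={omega:.1f}  T_max={math.exp(omega):.3f}")
```

Under this form the gain equals 1 when no pheromone is present, so all curves start from the same point and separate only as X grows, which is the “converging on the left, diverging on the right” pattern.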

Figure 10. Parameter-sweep response curves of the pheromone term

T_{τ} (τ)

. (a) With

τ_{s a t}

fixed,

T_{τ} (τ)

versus

τ

under different

ω_{τ}

. (b) With

ω_{τ}

fixed,

T_{τ} (τ)

versus

τ

under different

τ_{s a t}

.

Figure 10b plots the influence of

τ_{s a t}

on

T_{τ} (τ)

under the condition that

ω_{τ} = 1.5

is held constant. The solid circles on each curve denote the half-saturation position at

τ = τ_{s a t}

. The black dashed line in the figure represents

τ = τ_{t y p i c a l} = 3.0

; the intersection points of this dashed line with each curve are taken as the characteristic operating points of the system in the mature stage for different

τ_{s a t}

configurations, and the inset in the top-right corner quantifies

T_{τ} (τ_{t y p i c a l})

at these operating points. If the limit value of each curve as

τ \to \infty

is regarded as “full saturation,” the positions of

τ_{t y p i c a l}

under different

τ_{s a t}

values can be clearly compared: When

τ_{s a t} = 0.5

,

T_{τ} (3)

already reaches 80% of the maximum value, with the operating point lying in the latter half of the curve. Although the absolute gain is the highest, the curve has flattened significantly; subsequent increases in pheromone will only induce a negligible rise in the curve. When

τ_{s a t} = 5

,

T_{τ} (3)

only reaches approximately 40% of the final value, leading to a relatively weak overall amplification effect. When

τ_{s a t} = 1

or

τ_{s a t} = 2

,

T_{τ} (3)

accounts for about 55% to 70% of the maximum value: this configuration not only achieves a considerable amplification effect but also allows the curve to rise noticeably for

τ > 3

, thus retaining a good capability to distinguish small differences in pheromone intensity.
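The saturation percentages above can be recomputed directly. The sketch assumes an exponential saturation form for the gain, ln T = ω X / (1 + X) with X = τ / τ_sat; this form is inferred from the quoted percentages and is not taken from the paper's definition. Under it, the ratio of the gain at the typical intensity to its limiting value depends only on ω and X.

```python
import math


def saturation_fraction(tau: float, tau_sat: float, omega: float = 1.5) -> float:
    """Ratio T(tau) / T(infinity) under the assumed form ln T = omega * X / (1 + X)."""
    x = tau / tau_sat
    return math.exp(omega * (x / (1.0 + x) - 1.0))


TAU_TYPICAL = 3.0  # characteristic mature-stage intensity quoted in the text

for tau_sat in (0.5, 1.0, 2.0, 5.0):
    frac = saturation_fraction(TAU_TYPICAL, tau_sat)
    print(f"tau_sat={tau_sat:3.1f}  T(3)/T_max={frac:.3f}")
```

The four ratios come out near 0.81, 0.69, 0.55, and 0.39, in line with the 80%, 55% to 70%, and 40% figures discussed above.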

In the elastic step size, the actual step size is strictly constrained by the geometric trust region and the half-step state machine and has been decoupled from the exploration logic. At this point, the frustration memory coefficient

λ^{'}

and the temperature rise exponent

q

act as the key adjustment knobs for controlling the dynamic loop of “continuous frustration → enhanced exploration → self-termination.”

Figure 11a demonstrates the regulatory effect of

λ^{'}

on the accumulation rate of the frustration degree

f (t)

under the limit condition of continuous frustration. The curves reveal that

λ^{'}

essentially defines the “sensitive time window” for the ant’s congestion state: at low

λ^{'}

values, merely two to three consecutive rejections suffice to drive

f (t)

into the high-frustration region. While this configuration offers rapid responsiveness, ants are more prone to false triggering of self-termination in narrow or crowded environments.

Figure 11. Parameter-sweep response curves of key mappings in the elastic-step mechanism. (a)

f (t^{'})

versus the consecutive-failure steps

t^{'}

under different

λ^{'}

(with

f_{h i g h} = 0.72

indicated). (b) With (

T_{0}

,

T_{m a x}

) fixed,

T (f)

versus

f (t^{'})

under different

q

.

Figure 11b illustrates how different exponents

q

reshape the mapping relationship of

f (t^{'}) \to T (f)

. When

T (f) = T_{m a x}

, the probability difference across all directions is reduced to one-third of its original value, which weakens directional guidance; thus, the rate and timing of temperature rise must be strictly controlled. When

q > 1

, the temperature rise curve exhibits a distinctly concave shape. As

q

increases, the temperature in the low-to-moderate frustration interval is more strongly suppressed near

T_{0}

. If

q = 1.0

is adopted, even minor fluctuations in frustration will directly induce a temperature increase; whereas when

q = 2.5

is used, ants maintain

T (f) < 1.5

when

f (t^{'}) < 0.6

(corresponding to the first two consecutive frustrations), preserving strong directional guidance. Only when

f (t^{'})

approaches 0.9 does the temperature surge sharply to above 2.5. This “conservative in the early stage, aggressive in the later stage” response characteristic effectively balances the convergence efficiency under normal conditions and the escape capability in extreme predicaments.
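One way to read the “one-third” statement is through a temperature-scaled Softmax: if scores enter the transition probability as s/T, every log-odds gap between two directions is divided by T. The sketch below assumes a base temperature of 1 and a maximum temperature of 3, and uses hypothetical sector scores; none of these values are taken from the parameter tables.

```python
import math
from typing import List


def softmax(scores: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax over scores divided by the temperature (numerically stabilized)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [3.0, 1.5, 0.0]       # hypothetical sector scores
cold = softmax(scores, 1.0)    # assumed base temperature
hot = softmax(scores, 3.0)     # assumed maximum temperature

# The log-odds between the best and worst sector shrink by the factor 1/T.
log_odds_cold = math.log(cold[0] / cold[2])
log_odds_hot = math.log(hot[0] / hot[2])
print(f"log-odds at T=1: {log_odds_cold:.3f}, at T=3: {log_odds_hot:.3f}")
```

The flatter distribution at high temperature is what weakens directional guidance, which is why the timing of the temperature rise matters.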

λ^{'} = 0.7

and

q = 2.5

constitute a set of robust parameters: they jointly establish a fault-tolerance window on the order of several consecutive rejections (about 4–5 steps before entering the high-frustration regime and around 10 steps before self-termination under continuous congestion). Within this window, the algorithm is guaranteed to possess the capability of local detouring; if continuous congestion exceeds this window, the capability to break out of predicaments is enhanced via the surging temperature.
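The fault-tolerance window can be illustrated numerically. The sketch assumes that under continuous rejection the frustration follows f(t′) = 1 − λ′^t′ and that the temperature follows T(f) = T_0 + (T_max − T_0) f^q with T_0 = 1 and T_max = 3. These closed forms and temperature bounds are assumptions that agree with the figures quoted in this section; the paper's own definitions appear in the method description.

```python
from typing import Optional


def frustration(t: int, lam: float) -> float:
    """Assumed frustration after t consecutive rejections, starting from zero."""
    return 1.0 - lam ** t


def temperature(f: float, q: float, t0: float = 1.0, t_max: float = 3.0) -> float:
    """Assumed temperature mapping T(f) = t0 + (t_max - t0) * f**q."""
    return t0 + (t_max - t0) * f ** q


def steps_to_threshold(lam: float, f_high: float = 0.72, max_steps: int = 100) -> Optional[int]:
    """First number of consecutive rejections at which f reaches f_high."""
    for t in range(1, max_steps + 1):
        if frustration(t, lam) >= f_high:
            return t
    return None


for lam in (0.5, 0.7, 0.9):
    print(f"lambda'={lam}: high-frustration regime after {steps_to_threshold(lam)} rejections")

for t in range(1, 7):
    f = frustration(t, 0.7)
    print(f"t'={t}  f={f:.3f}  T(q=1.0)={temperature(f, 1.0):.2f}  T(q=2.5)={temperature(f, 2.5):.2f}")
```

With λ′ = 0.7 the threshold of 0.72 is crossed at the fourth rejection, while q = 2.5 keeps the temperature close to its base value during the first two rejections.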

5.2. Comparative Experiments and Analysis

All experiments were conducted under the same hardware and software environment to ensure the fairness of comparisons. The experimental system utilized an Intel® Core i7-12700 CPU @ 2.0 GHz, and the experimental environment was the MATLAB 2024b platform.

To fully evaluate the performance of the algorithms, eight representative methods in the continuous-space path planning domain were selected as baselines in this section: RRT*, PRM, PSO, GWO, SSA, ABC, Q-learning (QL), and ACO. The TPP-CSACO algorithm proposed in this paper is the main subject of evaluation. The key parameters of each algorithm are listed in detail in Table 2 and Table 3. All parameters of ACO are set identically to those of TPP-CSACO, as listed in Table 4. Under the current safety margin, the minimum feasible traversal distance in the ant colony execution phase is set to four times the safety margin. The direction space is divided into 15 sectors with a central angle of 24° each, which meets the requirements for direction discrimination and prevents scoring sectors from being excessively affected by obstacles on both sides of narrow passages. For detailed scoring, 10 rings are used, with an outermost ring width of 0.051

R

, accounting for 43% of the safety margin. This allows accurate determination of the positions of distant obstacles and improves the accuracy of direction evaluation. For the pre-screening phase, six rings are selected, with the outermost ring width set to 0.087

R

, which meets the detection requirements while reducing computational load.
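The two ring widths quoted above coincide with the values produced by an equal-area partition of the perception circle. Whether the rings are actually constructed this way is an assumption here; the sketch only shows that the numbers agree.

```python
import math


def outer_ring_width(n_rings: int, radius: float = 1.0) -> float:
    """Width of the outermost of n equal-area concentric rings in a circle."""
    return radius * (1.0 - math.sqrt((n_rings - 1) / n_rings))


def sector_angle_deg(n_sectors: int) -> float:
    """Central angle of each sector when the full circle is divided evenly."""
    return 360.0 / n_sectors


print(f"detailed scoring, 10 rings: {outer_ring_width(10):.3f} R")
print(f"pre-screening, 6 rings:     {outer_ring_width(6):.3f} R")
print(f"15 sectors: {sector_angle_deg(15):.0f} degrees each")
```

Fewer rings give a wider outer band and fewer sample points per sector, which is the trade-off the pre-screening stage accepts for lower computational load.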

Table 2. Parameters of PSO, GWO, SSA, and ABC algorithms.

Table 3. Parameters of RRT*, PRM, and Q-learning algorithms.

Table 4. Related parameters involved in TPP-CSACO (part two).

As shown in Figure 12, to comprehensively evaluate the path planning performance of the algorithm under varying environmental complexities, three 2D continuous-space maps with distinct characteristics were selected in this section for testing. These three maps have dimensions of 10 × 10, 20 × 20, and 50 × 50, with obstacle densities of 38.00%, 26.75%, and 23.44%, respectively, and their topological complexity increases sequentially. They correspond to the following scenarios: a locally constrained narrow-channel and deep-trap scenario, which tests the algorithm’s ability to escape local optima; a medium-density multi-obstacle discrete scenario, which evaluates the algorithm’s global optimization capability in multi-solution topologies; and a high-density multi-scale hybrid scenario, which is designed to conduct an extreme stress test on the algorithm’s safety and smoothness under large-scale, long-distance planning.

Figure 12. Three 2D test environments. (a) 10 × 10. (b) 20 × 20. (c) 50 × 50.

To address path quality requirements in static environments, an evaluation system consisting of three core dimensions, covering four metrics, is established: geometric efficiency, where a shorter path length

L

is preferable; smoothness, for which smaller values of both the total turning angle

θ_{t o t a l}

and the maximum turning angle

θ_{m a x}

are preferable; and safety, where the hazard zone proportion

Viol

is defined as the percentage of the total path length occupied by hazardous segments, and this metric directly reflects the extent to which the path violates safety constraints. To conduct an unbiased, comprehensive evaluation across these dimensions, an equal-weighted composite score

J_{e q}

is employed: the four aforementioned metrics are first normalized via the Min-Max method, and their normalized values are summed as

J_{eq} = L^{'} + θ_{total}^{'} + θ_{max}^{'} + Viol^{'}

, with a lower

J_{e q}

indicating that the algorithm has achieved a better balance among path length, smoothness, and safety.
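The composite score can be computed in a few lines. The sketch below applies plain Min-Max normalization, (v − min) / (max − min), to each metric across algorithms and sums the four normalized values; the exact normalization variant used in the experiments is not specified here, so this convention is an assumption, and the numbers in the example are synthetic placeholders rather than results from the tables.

```python
from typing import Dict, List


def min_max(values: List[float]) -> List[float]:
    """Plain Min-Max normalization; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(metrics: Dict[str, List[float]]) -> List[float]:
    """Equal-weighted sum of the normalized metrics, one score per algorithm."""
    normalized = [min_max(column) for column in metrics.values()]
    return [sum(row) for row in zip(*normalized)]


# Synthetic example with three algorithms (placeholder values only).
example = {
    "L": [10.0, 12.0, 14.0],
    "theta_total": [100.0, 50.0, 200.0],
    "theta_max": [30.0, 10.0, 90.0],
    "viol": [0.0, 0.0, 5.0],
}
print(composite_scores(example))
```

With this convention, 0 is the best attainable value for each normalized metric, so the composite ranges from 0 to 4.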

To eliminate the impact of randomness on algorithm performance evaluation, each algorithm is run 50 times on each map. The random seed is reinitialized for each run to ensure the independence of sampling points, initial populations, or graph construction processes. If an algorithm fails to find a collision-free path within the specified number of iterations, it is recorded as a failure. All metrics are calculated based solely on sample data from successful runs. Failed runs are excluded from metric statistics to prevent data distortion.

Table 5, Table 6 and Table 7 present the detailed results. TPP-CSACO demonstrates stable and strong performance across maps of varying complexity. In terms of path length, TPP-CSACO is slightly inferior to the shortest-path algorithm in each scenario: compared to GWO, PSO, and SSA, its average path length increases by up to approximately 5.9% (e.g., rising from 75.708 to 80.180), while the increase in the other scenarios is between 0.3% and 2.9%; compared to the traditional ACO algorithm, however, the path length of TPP-CSACO is reduced by up to approximately 50.6%. Meanwhile, the standard deviations of the path length for TPP-CSACO are 0.024, 0.219, and 1.145 on the three maps, respectively, which are the lowest among all algorithms in each scenario and significantly lower than the fluctuation levels of algorithms such as SSA, ABC, PRM, and ACO. In terms of path smoothness, the maximum turning angles of TPP-CSACO in the three scenarios are only 16.823°, 5.932°, and 8.720°, respectively, a reduction of approximately 75% to 93% compared to the typical comparison algorithms in each scenario. In terms of safety and stability, TPP-CSACO achieves a 0% hazard zone proportion and a 100% success rate in all scenarios, while the maximum hazard zone proportion of the comparison algorithms reaches 16.54%, and the success rate of some algorithms is only 0% to 44%. In terms of comprehensive performance, TPP-CSACO ranks first in all tested environments. To verify the statistical reliability of the experimental results, the Friedman test is adopted to analyze the performance differences among the algorithms. On all three map scales (10 × 10, 20 × 20, and 50 × 50), the p-values of all evaluation metrics are less than 0.01. Further details can be found in Supplementary Materials Table S1.

Table 5. Results of various algorithms in the 10 × 10 map.

Table 6. Results of various algorithms in the 20 × 20 map.

Table 7. Results of various algorithms in the 50 × 50 map.

Figure 13, Figure 14 and Figure 15 present one randomly selected run for each environment. Compared to the comparison algorithms, TPP-CSACO not only maintains a relatively short path but also exhibits a smooth path morphology; it also converges faster than the ACO algorithm.

Figure 13. Comparison results in the 10 × 10 environment. (a) Final planned paths of different algorithms. (b) Convergence curves of the best path length versus iteration.

Figure 14. Comparison results in the 20 × 20 environment. (a) Final planned paths of different algorithms. (b) Convergence curves of the best path length versus iteration.

Figure 15. Comparison results in the 50 × 50 environment. (a) Final planned paths of different algorithms. (b) Convergence curves of the best path length versus iteration.

5.3. Diversity Analysis and Ablation Experiments

The convergence rate of TPP-CSACO is notably fast across all types of environments. However, the convergence curves and the final path alone cannot rule out the possibility of fast convergence to a locally optimal corridor. To analyze more intuitively whether TPP-CSACO maintains sufficient path diversity during the search process, a single run of TPP-CSACO is performed on the 20 × 20 and 50 × 50 environments, with parameter settings consistent with those specified earlier. The optimal path within each generation is recorded every other generation, and these paths are overlaid on the corresponding obstacle environments, as illustrated in Figure 16. The distribution of the optimal paths from different generations shows that TPP-CSACO explores multiple corridors during the optimization process and that its coverage of the feasible corridors in the environment is relatively comprehensive.

Figure 16. Best-path snapshots of TPP-CSACO sampled at fixed iteration intervals (single run). (a) 20 × 20. (b) 50 × 50.

To verify the contributions of each key component in the TPP-CSACO algorithm, three ablated variants are constructed by removing, in turn, the topological progress potential, the frustration-induced temperature rise mechanism, and the elastic step length mechanism. Each variant is tested 10 times independently in the 20 × 20 environment, and three quantitative indicators, namely the movement rejection rate, the ant arrival rate, and the average number of movement steps, are recorded to distinguish the performance differences among the variants. As shown in Table 8, removing the topological progress potential results in a severe degradation in the arrival rate, with a 77% increase in the average number of steps, indicating that the global progress potential provides effective guidance during ant movement. Removing the frustration-induced temperature rise mechanism results in a 13.7% decrease in the arrival rate and a 25.7% increase in the number of steps; this mechanism facilitates ants’ escape from local traps in deep-trap regions. After eliminating the elastic step-length mechanism, the average number of steps decreases; however, the movement rejection rate increases 26.9-fold, and the ant arrival rate decreases by 13.4%, demonstrating that the elastic step length can accurately adjust the ant step size across different environments.

Table 8. Ablation study results for key algorithm modules.

To verify the optimization performance and selection accuracy of the pre-screening strategy, 10 static verification experiments are conducted in the three scenarios illustrated in Figure 7. With the experimental settings unchanged across scenarios, pre-screening followed by detailed scoring of the retained sectors is performed first, and full-range detailed scoring is then performed for comparison. The running time of each stage is collected to evaluate the computational efficiency, and the accuracy of pre-screening is assessed through the retention rate of the top-ranked sectors and the ranking consistency between full-range detailed scoring and the pre-screened retained sectors. As shown in Table 9, the pre-screening strategy achieves a 1.3× speedup while retaining all optimal movement directions. Although some false rejections occur in complex regions, the erroneously rejected directions are all low-ranked non-optimal directions, which do not compromise the quality of the final solutions.

Table 9. Efficiency and accuracy evaluation of sector pre-screening.

6. Conclusions

This study addresses the path planning issues existing in current ACO algorithms by proposing a topological progress potential-enhanced continuous-space ant colony path planning algorithm (TPP-CSACO), which adopts a hierarchical framework of “topological prior and continuous-space ACO.” The front end constructs a roadmap topology based on the PRM. After the PRM graph is simplified, a Laplacian solution on it yields a topological progress potential that integrates connectivity structure and clearance information, providing global search guidance. The back end operates a continuous-space ant colony under SDF constraints: it abandons the grid eight-neighborhood movement mode, employs sector division of the perception circle to refine movement directions, and adopts a pre-screening and multidimensional scoring model. By integrating the SDF safety margin, topological progress potential, and pheromone intensity distribution, a dual-field guidance mechanism that combines static prior and dynamic group experience is established. Probabilistic steering is realized through a Softmax state transition probability. By combining an elastic step size, the frustration-induced temperature rise mechanism, and the direct connection strategy, operational efficiency and the capability to break out of predicaments are balanced.

A simulation platform is then constructed. For the key parameters of the algorithm, stable intervals are determined via static parameter formula analysis across three typical test scenarios. Comparative verification is then conducted against algorithms including PRM, RRT*, PSO, and ACO. TPP-CSACO outperforms the other algorithms in comprehensive scoring, avoiding an excessive sacrifice of path length while ensuring no entry into hazardous areas. Compared with the traditional ACO, TPP-CSACO endows ants with stronger local perception capabilities and achieves a faster convergence rate without falling into local optima. However, TPP-CSACO does not hold an advantage in terms of computational efficiency; the single planning runtime of the current implementation is relatively long, making it difficult to compete with the comparative algorithms under strict real-time constraints. This stems primarily from two aspects: first, sector-based perception and multi-point interpolation sampling in continuous space introduce substantial constant computational overhead into single-step decision-making; second, the ant colony algorithm itself is a population-based iterative optimization approach, which requires multiple generations of iteration and pheromone updates to maintain diversity and convergence. Therefore, the current version of TPP-CSACO is more suitable as a planner for scenarios that demand high path quality (smoothness and safety margin) and topological adaptability but have relatively relaxed real-time requirements, such as offline planning or strategy generation in medium-scale scenarios. Future work will focus on further optimizing computational efficiency while ensuring path quality to enhance the competitiveness of TPP-CSACO.

The sector-based perception in TPP-CSACO enables local perception, the elastic step-length strategy can adaptively regulate the movement range, and the frustration-driven temperature rise mechanism supports deadlock recovery. These mechanisms lay a solid foundation for extending the algorithm to dynamic environments. However, in dynamic environments the global progress potential requires frequent recomputation, which currently prevents the algorithm from meeting real-time response requirements. Developing local update strategies that cater to the demands of dynamic scenarios is one of the future research directions.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s26041264/s1. Figure S1: Pseudocode of the TPP-CSACO path planning algorithm. Figure S2: Pseudocode of PRM construction and topological progress potential computation. Figure S3: Pseudocode of sector-based direction scoring and selection. Figure S4: Pseudocode of elastic step length with frustration-induced temperature adjustment. Table S1: Friedman test results.

Author Contributions

Methodology, G.D.; Validation, F.Z.; Investigation, J.Z. and L.Z.; writing—original draft preparation, G.D.; Writing—review & editing, Q.L.; Resources, X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the “National College Students’ Entrepreneurship Training Program”—“Multi-functional Biodegradable Mulch Film” (202511079004S), which was managed by the Entrepreneurship College of Chengdu University, and by the “Natural Science Foundation of Sichuan Province”—“Research, Development and Application of Extracorporeal Shock Wave Rehabilitation Robot for Osteonecrosis of the Femoral Head” (2025ZNSFSC0642).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT (GPT-5.1) for the purposes of assisting with summarizing and drafting the narrative description of the statistical summary tables exported from MATLAB 2024b. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Construction of the operating environment. (a) Construction of obstacles. (b) Construction of the signed distance function (SDF) field.

Figure 2. Construction of the simplified PRM and topological progress potential. (a) Sampling on the map. (b) Node connection. (c) Node simplification. (d) Topological progress potential.

Figure 3. Sector division and division of internal subunits.

Figure 4. Process for obtaining equipotential lines. (a) First blocked ring. (b) Five-point sampling. (c) Prediction of equipotential lines.

Figure 5. Pheromone deposition. (a) Single-point deposition when $R = 1$. (b) Pheromone deposition along the path.

Figure 6. Flow diagram of TPP-CSACO.

Figure 7. Pre-screening and scoring results of three environments under static ant conditions. (a) Open area environment. (b) Scoring results of the open area. (c) Triangular obstacle environment. (d) Scoring results of the triangular obstacle environment. (e) U-shaped obstacle environment. (f) Scoring results of the U-shaped obstacle environment.

Figure 8. Parameter-sweep response curves for $k$ and $α_{Geo}$ (dots: real samples). (a) $(\tilde{S}_{ϕ}(j))^{k}$ versus $\tilde{S}_{ϕ}(j)$ under different $k$. (b) $ξ_{j}^{α_{Geo}}$ versus $ξ_{j}$ under different $α_{Geo}$.

Figure 9. Parameter-sweep response curves of the directional terms with ($β_{min}$, $β_{max}$) fixed (dots: real sector samples). (a) $g(j)^{β(j)}$ versus $g(j)$ under different $\tilde{S}_{ϕ}$. (b) $G_{dir}(j)$ versus $(1 + g'(j))$ under different $S_{ϕ}'$.

Figure 10. Parameter-sweep response curves of the pheromone term $T_{τ}(τ)$. (a) With $ω_{τ}$ fixed, $T_{τ}(τ)$ versus $τ$ under different $τ_{sat}$. (b) With $τ_{sat}$ fixed, $T_{τ}(τ)$ versus $τ$ under different $ω_{τ}$.

Figure 11. Parameter-sweep response curves of key mappings in the elastic-step mechanism. (a) $f(t')$ versus the consecutive-failure steps $t'$ under different $λ'$ (with $f_{high} = 0.72$ indicated). (b) With ($T_{0}$, $T_{max}$) fixed, $T(f)$ versus $f(t')$ under different $q$.

Figure 12. Three 2D test environments. (a) 10 × 10. (b) 20 × 20. (c) 50 × 50.

Figure 13. Comparison results in the 10 × 10 environment. (a) Final planned paths of different algorithms. (b) Convergence curves of the best path length versus iteration.

Figure 14. Comparison results in the 20 × 20 environment. (a) Final planned paths of different algorithms. (b) Convergence curves of the best path length versus iteration.

Figure 15. Comparison results in the 50 × 50 environment. (a) Final planned paths of different algorithms. (b) Convergence curves of the best path length versus iteration.

Figure 16. Best-path snapshots of TPP-CSACO sampled at fixed iteration intervals (single run). (a) 20 × 20. (b) 50 × 50.

Table 1. Related parameters involved in TPP-CSACO (part one).

Parameter	$c_{ref}$	$k$	$β_{min}$	$β_{max}$	$α_{Geo}$	$λ_{lin}$	$δ$	$τ_{sat}$	$ω_{τ}$	$λ'$	$q$	$f_{high}$	$T_{0}$	$T_{max}$	$α$
Value	0.08	1	1	1.6	1	0.22	0.06	2	1	0.7	2.5	0.72	1	3	0.25

Table 2. Parameters of PSO, GWO, SSA, and ABC algorithms.

Algorithm	Parameter	Value
PSO	Population size	100
	Number of generations	300
	Inertia weight	0.85
	Cognitive coefficient	1
	Social coefficient	1
GWO	Population size	50
	Number of generations	200
	$α$ wolf weight	0.33
	$β$ wolf weight	0.33
	$δ$ wolf weight	0.33
SSA	Population size	50
	Number of generations	200
	Producer ratio	0.2
	Safety threshold	0.8
ABC	Population size	50
	Number of generations	200
	Abandonment limit	20

Table 3. Parameters of RRT*, PRM, and Q-learning algorithms.

Algorithm	Parameter	Value
RRT*	Maximum samples	20,000
	Maximum step size	0.8
	Goal sampling rate	0.1
	Goal connection radius	0.6
	Rewire radius	2.5
PRM	Number of samples	600
	Max neighbors (k)	15
	Connection radius	3
Q-learning	Learning rate ($α$)	0.1
	Discount factor ($γ$)	0.95
	Initial exploration rate ($ε_{0}$)	1.0
	Final exploration rate ($ε_{min}$)	0.01
	Exploration decay factor	0.998
	Number of training episodes	3000

Table 4. Related parameters involved in TPP-CSACO (part two).

Parameter	$N$	$M$	$ρ_{loc}$	$ρ$	$Q$	$τ_{0}$	$R$	$N_{ϕ}$	$N_{r}^{pre}$	$N_{r}$	$N_{θ}^{pre}$	$N_{θ}$
Value	50	50	0.1	0.3	1	8	1	23	6	10	5	7

Table 5. Results of various algorithms in the 10 × 10 map.

Algorithm	Best Path Length (m)	Mean Path Length (m)	Std. of Path Length (m)	Total Turn (°)	Max Turn (°)	$Viol$ Percent (%)	$J_{eq}$	Success Rate
ABC	17.441	17.990	0.345	114.457	69.414	0.398	3.290	100.0%
ACO	19.243	21.115	0.873	1468.800	126.900	0.603	1.249	100.0%
GWO	17.056	17.209	0.215	99.694	72.974	6.024	2.775	100.0%
PRM	17.100	17.241	0.075	144.424	69.951	2.548	3.249	100.0%
PSO	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.0%
RRT*	17.105	17.307	0.130	115.532	71.435	2.507	3.263	100.0%
SSA	17.038	18.056	0.912	75.566	70.505	16.535	2.454	68.0%
QL	19.837	21.165	0.632	1517.835	117.900	2.388	1.065	100.0%
TPP-CSACO	17.227	17.296	0.024	152.202	16.823	0.000	3.899	100.0%

Table 6. Results of various algorithms in the 20 × 20 map.

Algorithm	Best Path Length (m)	Mean Path Length (m)	Std. of Path Length (m)	Total Turn (°)	Max Turn (°)	$Viol$ Percent (%)	$J_{eq}$	Success Rate
ABC	29.956	32.782	1.593	249.252	90.236	0.512	2.144	100.0%
ACO	29.799	31.675	0.905	681.300	101.700	2.565	0.914	100.0%
GWO	28.324	33.948	3.416	209.678	86.628	1.029	2.078	72.0%
PRM	28.472	29.145	0.336	275.798	60.090	0.870	2.983	100.0%
PSO	28.163	28.574	0.618	139.382	53.615	4.281	2.493	44.0%
RRT*	28.650	30.061	0.835	315.494	65.236	0.867	2.679	100.0%
SSA	28.812	30.105	1.379	183.918	69.700	1.334	2.652	10.0%
QL	29.799	31.307	0.704	625.500	87.300	3.132	1.023	100.0%
TPP-CSACO	28.993	29.405	0.219	390.010	5.932	0.000	3.470	100.0%

Table 7. Results of various algorithms in the 50 × 50 map.

Algorithm	Best Path Length (m)	Mean Path Length (m)	Std. of Path Length (m)	Total Turn (°)	Max Turn (°)	$Viol$ Percent (%)	$J_{eq}$	Success Rate
ABC	79.585	89.503	4.899	305.092	98.420	0.164	3.054	82.0%
ACO	133.054	162.449	12.326	6390.000	135.000	0.381	0.805	100.0%
GWO	74.679	85.986	6.198	156.334	65.407	0.585	3.188	82.0%
PRM	86.867	92.677	6.461	1208.633	96.077	0.384	2.847	18.0%
PSO	74.213	75.708	0.731	161.084	56.051	2.190	2.768	26.0%
RRT*	76.087	82.004	3.849	742.210	70.302	1.395	2.873	100.0%
SSA	79.844	93.638	6.354	105.298	72.713	2.473	2.695	40.0%
QL	79.154	83.089	1.387	1597.500	93.600	0.976	2.542	100.0%
TPP-CSACO	78.157	80.180	1.145	1076.741	8.720	0.000	3.854	100.0%
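
As a worked check, the relative changes on the 50 × 50 map can be recomputed from the Table 7 entries. The sketch below assumes the comparison is made on mean path length and on maximum turning angle; the helper `percent_change` is a hypothetical name, and only the ACO, PSO, and TPP-CSACO rows are used.

```python
def percent_change(new, ref):
    """Signed relative change of `new` with respect to `ref`, in percent."""
    return 100.0 * (new - ref) / ref


# Values copied from Table 7 (50 x 50 map).
mean_length = {"ACO": 162.449, "PSO": 75.708, "TPP-CSACO": 80.180}
max_turn = {"ACO": 135.000, "PSO": 56.051, "TPP-CSACO": 8.720}

length_vs_aco = percent_change(mean_length["TPP-CSACO"], mean_length["ACO"])
length_vs_pso = percent_change(mean_length["TPP-CSACO"], mean_length["PSO"])
turn_vs_pso = percent_change(max_turn["TPP-CSACO"], max_turn["PSO"])

print(f"mean length vs ACO: {length_vs_aco:+.1f}%")
print(f"mean length vs PSO: {length_vs_pso:+.1f}%")
print(f"max turn vs PSO:    {turn_vs_pso:+.1f}%")
```

Under this assumption the mean path length is about 50.6% shorter than that of ACO and about 5.9% longer than that of PSO, which agrees with the figures quoted in the abstract, while the maximum turning angle is about 84.4% smaller than that of PSO.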

Table 8. Ablation study results for key algorithm modules.

Variant	Move Rejection Rate	Ant Arrival Rate	Average Steps
TPP-CSACO	0.375%	97.417%	42.899
w/o Topological Progress Potential	2.354%	59.125%	75.911
w/o Frustration-Induced Temperature Rise	2.087%	84.083%	53.924
w/o Elastic Step Length	10.081%	84.417%	39.901

Table 9. Efficiency and accuracy evaluation of sector pre-screening.

Environment	Pre-Screening	Pre-Screening and Detailed Scoring	Full-Directional Detailed Scoring	Speedup	Top-5 Retention	False Rejection Rate
Open area	4.196 ms	13.471 ms	18.118 ms	1.345	100%	0%
Triangular obstacle	4.123 ms	13.598 ms	18.267 ms	1.342	100%	5%
U-shaped obstacle	4.534 ms	14.296 ms	18.780 ms	1.316	100%	25%
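
The Speedup column can be related to the timing columns with a short calculation. The sketch below assumes that speedup is the ratio of the full-directional detailed scoring time to the combined pre-screening and detailed scoring time; the variable names are illustrative, and small deviations from the reported values are expected if the published ratios were computed from unrounded timings.

```python
# (pre-screening + detailed scoring [ms], full-directional scoring [ms], reported speedup)
# Values copied from Table 9.
timings = {
    "Open area": (13.471, 18.118, 1.345),
    "Triangular obstacle": (13.598, 18.267, 1.342),
    "U-shaped obstacle": (14.296, 18.780, 1.316),
}

# Assumed definition: speedup = full-directional time / pre-screened time.
ratios = {env: full / screened for env, (screened, full, _) in timings.items()}

for env, (_, _, reported) in timings.items():
    diff = ratios[env] - reported
    print(f"{env}: recomputed {ratios[env]:.3f}, reported {reported:.3f}, diff {diff:+.3f}")
```

The recomputed ratios stay within a few thousandths of the reported values in all three environments, which is consistent with the assumed definition.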

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
