Abstract
This paper proposes the Quantum-behaved Loser Reverse-learning Differential Evolution (QLRDE) algorithm to address the inherent limitations of the standard Differential Evolution (DE) algorithm, including slow convergence and premature stagnation in local optima. QLRDE incorporates three innovations: quantum-behaved mutation strategies that suppress premature convergence by leveraging quantum-mechanical position updates, a Loser Reverse-Learning Mechanism that enhances diversity by reconstructing inferior individuals through opposition-based learning, and an adaptive parameter adjustment mechanism that balances exploration and exploitation to improve robustness and convergence efficiency. Experimental evaluations on twelve benchmark functions confirm that QLRDE outperforms existing algorithms in search capability and convergence speed. Furthermore, QLRDE is applied to the 3D UAV path planning problem, generating B-Spline-based smooth flight paths and incorporating real-world constraints into the cost function. Simulation results confirm that QLRDE outperforms several competing algorithms with respect to path quality, computational efficiency, and robustness.
1. Introduction
Unmanned aerial vehicles (UAVs) are increasingly utilized across various military and civilian applications, driven by rapid technological advancement [1]. Path planning is a critical technology for enabling autonomous UAV navigation [2,3]. In complex 3D environments, a UAV path must satisfy physical and dynamic constraints, such as maximum turning angle, maximum climb angle, path length, and flight cost, while simultaneously avoiding terrain obstacles and threat areas [1]. Currently, path planning algorithms can be generally divided into four classes: heuristic algorithms, sampling-based methods, learning-based methods, and swarm intelligence-based methods. Heuristic algorithms, such as A* [4], guide the search process using designed heuristic functions, though designing effective heuristics can be challenging. Sampling-based methods, e.g., the rapidly exploring random tree (RRT) [5], can ensure path feasibility but often cannot guarantee optimality. Learning-based methods, such as reinforcement learning (RL) [6], offer adaptive optimization and generalization capabilities but typically require substantial computational resources and data. Concurrently, the rise of large AI models for end-to-end navigation [7] presents a complementary frontier where efficient optimizers could be integrated to refine AI-generated paths with guaranteed constraint satisfaction. Swarm intelligence-based algorithms usually possess strong global search capabilities and achieve significant improvements in path planning performance, such as the Differential Evolution (DE) algorithm [8], Ant Colony Optimization (ACO) algorithm [9], Grey Wolf Optimizer (GWO) algorithm [10], Fireworks Algorithm (FWA) [11], and Particle Swarm Optimization (PSO) algorithm [11].
DE, recognized for its structural simplicity and robustness [12], has been effectively applied in many engineering fields [13,14], including UAV path planning and multi-UAV coordination [15]. Researchers have extensively studied and improved the standard DE to enhance its performance. Adaptive DE (JADE) [16] incorporates an optional external archive and adaptive mechanisms to improve diversity and global search ability. Multi-strategy adaptive DE variants (MLSHADE-SPA and SaUSDE) [17,18] dynamically select a suitable mutation strategy based on problem characteristics and dynamic changes during evolution, thereby enhancing performance and convergence speed. The iterative local search adaptive DE (SHADE-ILS) [19] combines adaptive mechanisms with an iterative local search. Multiple-Objective DE (MODE) [20] incorporates multiple differential mutation strategies and crossover operators to handle multi-objective optimization problems. Hybrid DE (HDE) [21] enhances robustness against noise and disturbances. However, existing DE variants still suffer from premature convergence and insufficient search efficiency on complex problems.
In recent years, several studies have examined the application of DE to the UAV path planning problem. Zhou et al. [22] presented a three-dimensional trajectory planning algorithm for UAVs based on an improved DE algorithm. Moreover, since path planning is inherently a multi-objective optimization problem in which the conflicting goals of minimizing path length and maximizing the safety margin are simultaneously important, Mittal et al. [23] used a hybrid multi-objective evolutionary algorithm to optimize flight distance and risk factor simultaneously, generating a set of Pareto-optimal paths. However, in complex 3D environments, these methods often converge slowly, become trapped in local optima, and struggle to produce smooth, feasible paths.
Opposition-Based Learning (OBL) [24] has recently been incorporated into population-based optimization techniques to enhance their search capabilities. OBL, as a relatively simple learning mechanism, can analyze known outputs, derive optimal input parameters, and make adjustments to enhance system performance or improve operational processes. OBL does not require complex data processing, making it simple to implement and computationally cost-effective. Meanwhile, quantum theory has provided new ideas for improving optimization algorithms [25]. When individuals follow the rules of quantum-behaved movement, their solutions exhibit higher diversity, thereby expanding the search range. Quantum-behaved Particle Swarm Optimization (QPSO) utilizes quantum mechanical rules for particle motion to help particles escape local optima [26,27]. The Quantum Firefly Algorithm (QFA) employs quantum movement to maintain population diversity [28].
This paper integrates quantum theory and the OBL mechanism to overcome the limitations of standard DE, and proposes a novel Quantum-behaved Loser Reverse-learning Differential Evolution (QLRDE) algorithm. First, a Loser Reverse-Learning Mechanism (LRLM) is introduced to reconstruct the positions of inferior individuals, thereby increasing population diversity. Second, quantum-behaved mutation strategies are adopted, leveraging the probabilistic nature of wave functions to suppress premature convergence. Third, an adaptive parameter adjustment strategy based on the hyperbolic tangent function is designed to dynamically balance the exploration and exploitation of solution space. Simulation results demonstrate that QLRDE outperforms other algorithms on benchmark test functions. Furthermore, this paper presents a significant application of QLRDE to 3D path planning for UAV. The QLRDE encodes B-Spline-based control points as optimization individuals and employs a comprehensive cost function of UAV path. Experimental evaluations confirm that QLRDE can generate shorter, lower-altitude, and smoother flight paths compared to other algorithms.
This paper makes three main contributions. First, we propose the Quantum-behaved Loser Reverse-learning Differential Evolution (QLRDE) algorithm, which incorporates three innovations: a quantum-behaved mutation strategy to suppress premature convergence, the Loser Reverse-Learning Mechanism (LRLM) to enhance population diversity, and an adaptive parameter adjustment mechanism to balance exploration and exploitation. Second, experimental evaluations on twelve benchmark functions confirm that QLRDE demonstrates better performance than existing algorithms in terms of search capability and convergence speed, achieving solutions closer to the true global optimum. Third, the proposed QLRDE is effectively applied to the 3D UAV path planning problem. By encoding B-Spline-based control points as optimization individuals and incorporating real-world constraints into a comprehensive cost function, QLRDE generates shorter, lower-altitude, and smoother flight paths, outperforming other algorithms with respect to path quality and robustness.
This paper proceeds as follows. Section 2 reviews the standard DE algorithm. Section 3 details the proposed QLRDE algorithm. Section 4 compares QLRDE with other algorithms on test functions. Section 5 presents the application of QLRDE to UAV path planning. Section 6 discusses the experimental results of QLRDE-based UAV path planning. Finally, Section 7 concludes the paper.
2. Standard DE Algorithm
Operating as a parallel direct search method, DE follows an iterative, population-based stochastic process consisting of mutation, crossover, and selection. Algorithm 1 presents the general structure of the DE algorithm [12].
| Algorithm 1. Structure of the basic DE algorithm | |
| 1. | Set the generation index t = 0 |
| 2. | /*Initialization*/ |
| 3. | An initial population of Mpop individuals is randomly generated. The fitness of each individual is then evaluated |
| 4. | While stopping criteria are unsatisfied, do |
| 5. | generation index t = t + 1 |
| 6. | For i = 1: Mpop do |
| 7. | /*Mutation*/ |
| 8. | generate the mutant vector vi via a random selection of distinct individuals |
| 9. | /*Crossover*/ |
| 10. | apply the crossover scheme to vi and xi to obtain ui |
| 11. | /*Selection*/ |
| 12. | evaluate ui, then perform greedy selection between ui and xi |
| 13. | End for |
| 14. | End while |
The standard DE algorithm consists of the following four steps:
(1) Initialization: Generate the random initial solution population, in which the i-th individual is denoted as $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,D})$, and

$$x_{i,j} = x_{j,\min} + \text{rand} \cdot (x_{j,\max} - x_{j,\min}), \quad j = 1, 2, \ldots, D$$

where D denotes the dimensionality of the search space, i = 1, 2, …, Mpop, xj,min and xj,max are the predefined search bounds, and rand is a uniformly distributed random number within [0, 1].
(2) Mutation: In the g-th generation, the mutation vector vi is generated by adding a scaled difference vector to a randomly chosen base individual:

$$v_i = x_{r1} + F \cdot (x_{r2} - x_{r3})$$

where xr1, xr2, and xr3 are three distinct individuals randomly selected from the population with i ∉ {r1, r2, r3}. The amplification factor F > 0 scales the difference vector.
The mutant vector should also satisfy the constraints of the search space, here enforced by clipping each component to its bounds:

$$v_{i,j} = \min\left(\max\left(v_{i,j}, x_{j,\min}\right), x_{j,\max}\right), \quad j = 1, 2, \ldots, D$$
(3) Crossover: Perform a binomial crossover on vi and xi to generate the trial vector ui:

$$u_{i,j} = \begin{cases} v_{i,j}, & \text{if rand} \le CR \text{ or } j = j_{\text{rand}} \\ x_{i,j}, & \text{otherwise} \end{cases}$$

where CR is the crossover rate and jrand is a randomly selected integer from 1 to D, which guarantees that ui inherits at least one component from vi.
(4) Selection: By comparing ui and xi, the vector with the lower cost survives into the next generation:

$$x_i^{g+1} = \begin{cases} u_i, & \text{if } f(u_i) \le f(x_i) \\ x_i^{g}, & \text{otherwise} \end{cases}$$

where f(·) denotes the cost function.
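The four steps above can be sketched as a compact DE/rand/1/bin loop. This is a minimal illustration under the definitions in this section, not the paper's own implementation (which is in MATLAB); the function name `de` and its defaults are chosen for this sketch only.

```python
import numpy as np

def de(cost, bounds, m_pop=40, g_max=400, F=0.5, CR=0.7, seed=0):
    """Minimal DE/rand/1/bin sketch following the four steps above."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    D = lo.size
    # (1) Initialization: x_{i,j} = x_{j,min} + rand * (x_{j,max} - x_{j,min})
    X = lo + rng.random((m_pop, D)) * (hi - lo)
    f = np.array([cost(x) for x in X])
    for _ in range(g_max):
        for i in range(m_pop):
            # (2) Mutation: three distinct individuals, none equal to i
            r1, r2, r3 = rng.choice([k for k in range(m_pop) if k != i], 3, replace=False)
            v = np.clip(X[r1] + F * (X[r2] - X[r3]), lo, hi)
            # (3) Binomial crossover with a guaranteed component j_rand
            j_rand = rng.integers(D)
            mask = rng.random(D) <= CR
            mask[j_rand] = True
            u = np.where(mask, v, X[i])
            # (4) Greedy selection
            fu = cost(u)
            if fu <= f[i]:
                X[i], f[i] = u, fu
    best = np.argmin(f)
    return X[best], f[best]
```

On a low-dimensional sphere function, this sketch converges to near-zero cost within a few hundred generations.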
3. QLRDE
In the standard DE algorithm, the mutation vector updates rely primarily on the differences between individuals and lack clear directionality. Moreover, the population tends to cluster around its current best regions, further reducing diversity and weakening global search ability. As a result, DE often becomes stuck in local optima, especially on high-dimensional complex problems. To address these issues, this paper proposes the QLRDE algorithm, which combines quantum-behaved mutation strategies with the Loser Reverse-Learning Mechanism.
3.1. Quantum-Behaved Mutation Strategies
According to the principles of quantum mechanics [25], the population of QLRDE is viewed as a quantum system where each individual exhibits quantum behavior. Additionally, local attractors serve as centers of an attractive potential field, and individuals within this field are able to explore different positions in the feasible region.
Assuming a D-dimensional search space and a population of Mpop individuals, let $x_i = (x_{i,1}, x_{i,2}, \ldots, x_{i,D})$ denote the position vector of individual i, i = 1, 2, …, Mpop, during the g-th iteration. The local attractor qi and mutation vector vi are then generated by

where rp, rx, and rand are mutually independent random numbers uniformly distributed in (0, 1). The amplification parameter k controls the intensity of the quantum fluctuations acting on the mutation vector vi: a larger k yields a broader exploration range around the local attractor qi. The nonlinear parameter x shapes the transition from exploration to exploitation, thereby balancing the two capabilities.
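The displayed update equations are not reproduced above. As an illustration of the quantum-behaved idea, the following is a minimal sketch assuming a QPSO-style local attractor and delta-potential-well sampling; the paper's exact update, including the nonlinear shape parameter x, is not reproduced, and the function name and defaults are this sketch's own.

```python
import numpy as np

def quantum_mutation(x, pbest, gbest, k=0.75, rng=None):
    """One QPSO-style quantum-behaved mutation step (a sketch; the paper's
    exact update and its nonlinear shape parameter are not reproduced here).
    x: this individual's position; pbest/gbest: personal and global bests."""
    rng = rng or np.random.default_rng()
    D = x.size
    rp = rng.random(D)
    # Local attractor: random convex combination of personal and global bests
    q = rp * pbest + (1.0 - rp) * gbest
    # Delta-potential-well sampling around the attractor: k scales the
    # quantum fluctuation, ln(1/u) gives the heavy-tailed step length
    u = rng.random(D)
    step = k * np.abs(gbest - x) * np.log(1.0 / u)
    sign = np.where(rng.random(D) < 0.5, 1.0, -1.0)
    return q + sign * step
```

With k = 0 the sample collapses onto the attractor itself; increasing k broadens the exploration range, mirroring the role of the amplification parameter described above.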
3.2. Loser Reverse-Learning Mechanism
The LRLM allows individuals with poor performance in the search process to improve through reverse-learning, explore a wider solution space, enhance global optimization ability, and effectively mitigate premature convergence. The implementation and application of this mechanism are detailed below.
3.2.1. Identifying the Losers
The improvement extent δi of xi after the mutation and crossover operations is defined by

$$\delta_i = f(x_i) - f(u_i)$$

where a larger value of δi indicates that the individual is improving rapidly.
Then, linear prediction [29] is utilized to estimate the cost of ui at the final generation gmax:

$$\hat{f}_i(g_{\max}) = f(u_i) - \delta_i \cdot (g_{\max} - g)$$

If this predicted final cost is still inferior to the current best cost min f(xi), i.e., δi·(gmax − g) < f(ui) − min f(xi), individual ui is identified as a “loser”, denoted uL. In this case, reverse-learning is applied to the loser crossover vector.
3.2.2. Reverse-Learning
OBL expands the search space and increases population diversity by simultaneously exploring both the current solution and its antithetical (opposite) solution, thereby preventing premature convergence to local optima and enhancing global search capability [28,29,30].
Suppose a point $x = (x_1, x_2, \ldots, x_D)$ in the D-dimensional space; its opposite point $\bar{x} = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_D)$ can be represented as

$$\bar{x}_j = a_j + b_j - x_j, \quad j = 1, 2, \ldots, D$$

where each variable xj is confined to the interval [aj, bj].
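Assuming the standard OBL definition above, the opposite point is a one-line computation:

```python
import numpy as np

def opposite(x, a, b):
    """Standard OBL opposite point: x_bar_j = a_j + b_j - x_j."""
    return a + b - x
```

For example, on the interval [0, 10] the opposite of 2 is 8.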
To enhance the population’s probability of locating the global optimum, the reverse-learning operation in this paper is modified to update the loser as follows:
where p1 and p2 are uniformly random numbers within (0, 1). The reverse-learning intensity parameter o primarily influences the magnitude of the reverse-learning perturbation. A larger o increases the adjustment strength applied to the loser individual uL, encouraging its movement toward better regions of the solution space. The limitation parameter w acts as a weighting coefficient, jointly with the random number p1, determining the distance between the reverse-learning direction and the original opposite point. Its role is to maintain diversity and stability in the learning process. In practice, o is commonly chosen from [0.4, 1.0] and w from [0.1, 1.0]. The selection criterion involves balancing exploration intensity (controlled by o) with convergence stability (influenced by w), often determined through sensitivity analysis as demonstrated in Section 6.2.
The procedure of the LRLM is specifically described in Algorithm 2.
| Algorithm 2. Loser Reverse-Learning Mechanism | |
| 1. | Require: maximal generation number gmax, population size Mpop |
| 2. | For i = 1: Mpop do |
| 3. | If f(ui) < f(xi) then |
| 4. | δi = f(xi) − f(ui) |
| 5. | If δi∙(gmax − g) < f(ui) − min f(xi) then |
| 6. | The crossover vector ui is denoted as uL. |
| 7. | Set adjustment parameters o, w. |
| 8. | For j = 1: D |
| 9. | |
| 10. | End for |
| 11. | Re-evaluate and update the cost f(ui) |
| 12. | End if |
| 13. | End if |
| 14. | End for |
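Algorithm 2 can be sketched as follows. The loser-identification test follows the linear prediction above exactly; the per-dimension update of Equation (11), however, is not reproduced in this text, so an assumed opposition-style stand-in using the parameters o and w is substituted and clearly marked as such.

```python
import numpy as np

def lrlm(U, fU, X, fX, a, b, g, g_max, cost, o=0.7, w=0.5, rng=None):
    """Loser Reverse-Learning Mechanism sketch following Algorithm 2.
    The exact per-dimension update (the paper's Eq. (11)) is not given here,
    so an ASSUMED opposition-style stand-in with parameters o and w is used.
    U, X: trial and target populations; fU, fX: their costs; a, b: bounds."""
    rng = rng or np.random.default_rng()
    f_best = fX.min()
    for i in range(len(U)):
        if fU[i] < fX[i]:
            delta = fX[i] - fU[i]                # improvement extent
            # Linear prediction: if even the predicted cost at g_max cannot
            # beat the current best, the trial vector u_i is a "loser"
            if delta * (g_max - g) < fU[i] - f_best:
                p1, p2 = rng.random(2)
                opp = w * p1 * (a + b) - U[i]    # assumed opposite-style point
                U[i] = np.clip(U[i] + o * p2 * (opp - U[i]), a, b)
                fU[i] = cost(U[i])               # re-evaluate the loser
    return U, fU
```

The reconstructed losers stay within the search bounds, and their costs are re-evaluated before the selection step, exactly as line 11 of Algorithm 2 requires.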
3.3. Adaptive Parameter Adjustment Mechanism
In the standard DE algorithm, the two control parameters, the amplification factor F and the crossover rate CR, are typically fixed throughout the entire evolutionary process. This static configuration tends to cause an imbalance between search diversity and convergence precision across distinct phases, especially on complex high-dimensional problems. To better balance search diversity and convergence precision throughout the search, an adaptive parameter adjustment mechanism is proposed for F and CR. This mechanism dynamically adjusts the parameters of each individual based on successful mutation events, thereby enhancing the algorithm's robustness and convergence efficiency.
For each individual i in the population, the amplification factor Fi and crossover rate CRi are updated as follows.
3.3.1. Parameter Initialization
Initially, Fi and CRi for each individual are randomly initialized within predefined ranges:

$$F_i = F_{\min} + \text{rand} \cdot (F_{\max} - F_{\min}), \qquad CR_i = CR_{\min} + \text{rand} \cdot (CR_{\max} - CR_{\min})$$

where rand ∈ [0, 1] is a uniformly distributed random number.
3.3.2. Parameter Retention
If a trial vector ui succeeds in replacing the target vector xi (i.e., f(ui) ≤ f(xi)), the parameters that generated this successful mutation are considered favorable. The parameters for individual i in the next generation are updated as follows:
where rand ∈ [0, 1] is a uniformly distributed random number and a ∈ {1, 2, …, Mpop} is a randomly chosen index distinct from i. This update introduces a stochastic element that helps preserve population diversity.
3.3.3. Parameter Updating
If ui fails to outperform xi, the parameters remain unchanged for the next generation, allowing the individual to temporarily maintain its current search direction:

$$F_i^{g+1} = F_i^{g}, \qquad CR_i^{g+1} = CR_i^{g}$$
3.3.4. Boundary Violation Handling
After each update, the values of Fi and CRi are constrained to their respective allowable ranges to ensure stability:

$$F_i = \min\left(\max\left(F_i, F_{\min}\right), F_{\max}\right), \qquad CR_i = \min\left(\max\left(CR_i, CR_{\min}\right), CR_{\max}\right)$$
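The retention, updating, and boundary-handling steps can be sketched as follows. The paper's exact retention rule (mixing in a random peer's parameters, Equations (13) and (14)) is not reproduced in this text, so a jDE-style stand-in is used and labeled as an assumption.

```python
import numpy as np

def adapt_params(F, CR, success, F_rng=(0.1, 0.9), CR_rng=(0.2, 0.9), rng=None):
    """Per-individual F/CR adaptation sketch. The paper's exact retention rule
    (Eqs. (13)-(14)) is not given here, so an ASSUMED jDE-style stand-in is
    used: on success, occasionally resample to keep a stochastic element;
    on failure, keep the current values unchanged."""
    rng = rng or np.random.default_rng()
    F, CR = F.copy(), CR.copy()
    for i in range(len(F)):
        if success[i] and rng.random() < 0.1:
            F[i] = F_rng[0] + rng.random() * (F_rng[1] - F_rng[0])
            CR[i] = CR_rng[0] + rng.random() * (CR_rng[1] - CR_rng[0])
    # Boundary violation handling: clip back into the allowable ranges
    return np.clip(F, *F_rng), np.clip(CR, *CR_rng)
```

Individuals whose trial vectors failed keep their current parameters, matching the parameter-updating rule above.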
This adaptive mechanism enables the algorithm to automatically tailor its search parameters to the landscape of the optimization problem. By promoting effective parameter sets and discouraging ineffective ones, it yields a more efficient trade-off between search diversity and convergence precision. Consequently, the adaptive parameter adjustment mechanism significantly enhances the convergence performance and solution optimality of QLRDE.
During exploratory phases, the adaptive mechanism promotes higher Fi and CRi, which strengthens the quantum-behaved mutation for global exploration while allowing the Loser Reverse-Learning Mechanism to actively reconstruct inferior individuals. During exploitation phases, the adaptive mechanism reduces parameter variation through its success-based retention strategy, enabling the quantum-behaved mutation to perform local refinement while the loser reverse-learning focuses on fine-tuning near optimal regions.
By integrating the adaptive mechanism with the quantum-behaved mutation strategies and the Loser Reverse-Learning Mechanism, we obtain a comprehensive and powerful optimization framework, which is detailed in the complete procedure of the QLRDE algorithm provided in Algorithm 3.
3.4. Process of the QLRDE Algorithm
The specific procedure of QLRDE is shown in Algorithm 3.
| Algorithm 3. QLRDE | |
| /*Parameter setting*/ | |
| 1. | Set algorithm parameters gmax, Mpop, Fmax, Fmin, CRmax, CRmin. |
| /*Initialization*/ | |
| 2. | Randomly initialize xi and calculate the costs J(xi). |
| 3. | Initialize amplification factors Fi and CRi. |
| 4. | While g < gmax do |
| 5. | Randomly select a, b ∈ [1, Mpop]. |
| 6. | If rand < 0.5 then |
| 7. | |
| 8. | Else |
| 9. | |
| 10. | End if |
| 11. | For i = 1: Mpop do |
| /*Quantum behavior mutation*/ | |
| 12. | Set adjustment parameters k, x. |
| 13. | If rand < 0.5 then |
| 14. | |
| 15. | Else |
| 16. | |
| 17. | End if |
| /*Crossover*/ | |
| 18. | For j = 1: D do |
| 19. | |
| 20. | End for |
| 21. | End for |
| 22. | Apply the Loser Reverse-Learning Mechanism as described in Algorithm 2. |
| 23. | For i = 1: Mpop do |
| /*Selection*/ | |
| 24. | If J(ui) ≤ J(xi) then |
| 25. | |
| 26. | |
| 27. | |
| 28. | |
| 29. | Else |
| 30. | |
| 31. | |
| 32. | |
| 33. | |
| 34. | End if |
| 35. | End for |
| 36. | g = g + 1 |
| 37. | End while |
| 38. | Return the optimal solution. |
3.5. Computational Complexity

Let Mpop denote the population size, D the problem dimension, Gmax the maximum number of generations, and C the complexity of one cost-function evaluation. The QLRDE algorithm consists of two main phases: initialization and iterative optimization. The computational complexity of each part is detailed below.
3.5.1. Initialization
The initialization is executed only once at the beginning of the algorithm. It involves generating Mpop random individuals in the D-dimensional search space and evaluating their costs. The computational complexity of this phase is O(Mpop·D + Mpop·C).
3.5.2. Optimization
The optimization is executed in each generation and comprises four components: quantum-behaved mutation, crossover, loser reverse-learning, and selection with adaptive parameter adjustment. The complexity of each component is analyzed below under worst-case assumptions.
(1) Quantum-behaved mutation: In each generation, all Mpop individuals undergo mutation according to Equation (7). This operation involves arithmetic computations for each of the D dimensions per individual. Therefore, the computational complexity of the mutation step is O(Mpop·D).
(2) Crossover: Following mutation, the binomial crossover in Equation (4) is applied to each individual. For each of the D dimensions, a random number is generated and compared with the crossover rate CRnew. The resulting complexity is O(Mpop·D).
(3) Loser reverse-learning: The LRLM (Algorithm 2) is invoked after crossover and involves three steps. First, loser identification computes the improvement (Equation (8)) and the predicted cost (Equation (9)) for each trial vector, requiring O(Mpop) operations. Second, reverse-learning application, assuming in the worst case that all Mpop trial vectors are identified as losers, applies Equation (11) to every dimension of each loser, resulting in a complexity of O(Mpop·D). Third, re-evaluation re-computes the cost of the updated losers, contributing O(Mpop·C) operations. Thus, the overall worst-case complexity of LRLM is O(Mpop·D + Mpop·C).
(4) Selection with adaptive parameter adjustment: First, greedy selection compares each trial vector with its corresponding target vector based on cost, requiring O(Mpop) operations. Second, the adaptive mechanism adjusts the parameters Fi and CRi for every individual according to Equations (13) and (14), contributing O(Mpop) operations.
Summing the dominant terms from the above components, the computational complexity per generation is O(Mpop·D + Mpop·C). The scalar operations are omitted as they are dominated by the higher-order terms when D is large. Over Gmax generations, the total computational complexity of QLRDE is O(Gmax·Mpop·(D + C)).
The standard DE algorithm possesses the same asymptotic complexity of O(Gmax·Mpop·(D + C)). Therefore, QLRDE remains in the identical asymptotic complexity class as standard DE, demonstrating that the introduced quantum-behaved mutation, Loser Reverse-Learning Mechanism, and adaptive parameter adjustment do not increase the algorithm's asymptotic computational burden.
In practical implementation, QLRDE introduces a modest constant-factor overhead due to the additional operations in LRLM. However, this overhead is justified by the algorithm’s significantly enhanced search capability and convergence performance, as evidenced by the experimental results in the following sections. The improved convergence characteristics often enable QLRDE to attain high-quality solutions with fewer generations (Gmax), thereby potentially reducing the total computational time in practice.
In summary, the complexity analysis confirms that QLRDE achieves substantial performance improvements while preserving the computational efficiency of the DE.
4. Test Results and Discussion on Benchmark Functions
This section presents a comparative analysis of QLRDE against the standard DE, PSO, the Loser Reverse-learning Differential Evolution (LRDE), and the Global Optimal Learning Differential Evolution (GOBLDE) algorithm [13]. We conducted 50 independent runs of each algorithm on 12 benchmark functions. All comparisons were conducted on a Windows 11 computer. The algorithms were implemented in MATLAB R2024a, without the use of any commercial algorithmic toolboxes.
4.1. Parameters and Test Functions Setting
All tested algorithms shared a common configuration, where Mpop = 40 and gmax = 400. Other parameters are set as follows: for LRDE, DE, and GOBLDE, F is a random number in [0.1, 0.9], and CR = 0.7; for PSO, c1 = c2 = 1.4 and w = 0.9; for QLRDE, both the amplification factor F ∈ [0.1, 0.9] and the crossover operator CR ∈ [0.2, 0.9] are dynamically adjusted during the optimization process. Twelve representative benchmark functions were considered, including seven unimodal functions (F1, F2, F3, F4, F8, F9, F11) to assess convergence speed and exploitation capability and five multimodal functions (F5, F6, F7, F10, F12) to examine the algorithm’s ability to avoid local optima and maintain population diversity, which are provided in Table 1.
Table 1.
Test functions.
4.2. Results and Discussion
This section presents data analysis, iterative curve analysis, and one-way analysis of variance (ANOVA) for different algorithms on 12 benchmark functions.
4.2.1. Analysis of Statistical Results and Evolution Curves
For D = 20 and D = 30, the best, worst, median, mean, and standard deviation (std) values for each benchmark function and algorithm are recorded in Table 2 and Table 3, respectively, with the optimal values highlighted in bold. According to Table 2, for D = 20, LRDE performs best on F2 and F9, while QLRDE achieves the best results on the other ten functions. Except for F1, F2, F3, F9, and F12, QLRDE also has the smallest standard deviations. Table 3 indicates that, for D = 30, LRDE performs best on F1, F3, and F9, whereas QLRDE performs outstandingly on the other nine functions, in particular converging to 0 on F4, F6, F7, and F10. QLRDE also achieves consistently lower average best function values in both the 20-dimensional and 30-dimensional cases, demonstrating stable performance in approaching optimal solutions. Analysis of the tables reveals that QLRDE finds solutions closer to the true global optimum on most functions and exhibits higher stability and reliability across runs.
Table 2.
Comparison results on test functions with D = 20.
Table 3.
Comparison results on test functions with D = 30.
The evolution curves in Figure 1 and Figure 2 show that QLRDE demonstrates rapid convergence in the early stages of the search and achieves the optimal solution in the later stages. Compared to other algorithms, QLRDE has the better search performance and faster convergence rate.
Figure 1.
Comparison of evolution curves on test functions with D = 20. (a) Sphere function F1; (b) Shifted Schwefel’s problem F2; (c) Shifted rotated high conditioned elliptic function F3; (d) Shifted step function F4; (e) Shifted rotated Ackley’s function with global optimum on bounds F5; (f) Shifted Rastrigin’s function F6; (g) Shifted rotated Griewank’s function without bounds F7; (h) Axis parallel hyper ellipsoid function F8; (i) Rotated hyper-ellipsoid function F9; (j) Schaffer function F10; (k) Two-dimensional de Jong function F11; (l) Multimodal Rastrigin-Sankoff hybrid function F12.
Figure 2.
Comparison of evolution curves on test functions with D = 30. (a) Sphere function F1; (b) Shifted Schwefel’s problem F2; (c) Shifted rotated high conditioned elliptic function F3; (d) Shifted step function F4; (e) Shifted rotated Ackley’s function with global optimum on bounds F5; (f) Shifted Rastrigin’s function F6; (g) Shifted rotated Griewank’s function without bounds F7; (h) Axis parallel hyper ellipsoid function F8; (i) Rotated hyper-ellipsoid function F9; (j) Schaffer function F10; (k) Two-dimensional de Jong function F11; (l) Multimodal Rastrigin-Sankoff hybrid function F12.
The data from tables and iterative curves indicate that LRDE outperforms standard DE, PSO, and GOBLDE in all metrics, validating the effectiveness of the Loser Reverse-Learning Mechanism combined with DE. The LRLM significantly improves algorithm efficiency and the quality of optimal solutions, enabling LRDE to effectively escape local optima and approach the global optimum more closely. While GOBLDE performs well in some aspects, QLRDE generally exhibits superior global search ability and stability.
In conclusion, the analysis of the data presented in tables and figures indicates that QLRDE outperforms four other algorithms across twelve benchmark functions in both D = 20 and D = 30 dimensions. QLRDE demonstrates outstanding global search ability and consistently achieves excellent results in finding the minimum values. It effectively mitigates the issues of local optima, premature convergence, and slow convergence speed in solving continuous function optimization problems, thereby significantly improving algorithm performance. Moreover, QLRDE displays commendable convergence behavior, stability, and reliability throughout the optimization process. Therefore, QLRDE stands as a highly recommended choice for solving high-dimensional continuous optimization problems.
4.2.2. Analysis of ANOVA for Values
Table 4 shows the one-way ANOVA of the best values for the five algorithms across the 12 test functions with D = 20. The between-group sum of squares (SS) is 672.04 with 4 degrees of freedom (df), giving a mean square (MS) of 168.0097; the within-group SS is 11,752.00 with df = 55 and MS = 213.6758; the total SS is 12,424.00 with df = 59. The F-value is 0.7863 and the p-value is 0.5390, above the 0.05 significance level, so no statistically significant difference among the groups can be concluded. Figure 3a illustrates the differences between the DE and PSO algorithms and the others, showing considerable fluctuations in most cases, while the remaining algorithms are more stable on certain test functions.
Table 4.
The one-way ANOVA for the best values with D = 20.
Figure 3.
(a) One-way ANOVA box plots on the best values for different test functions with D = 20; (b) one-way ANOVA box plots on the best values for different test functions with D = 30.
Table 5 presents the one-way ANOVA of the best values for the five algorithms across the 12 test functions with D = 30. The F-value is 0.9636 and the p-value is 0.4349, above the 0.05 significance level. Figure 3b shows that QLRDE is the most stable algorithm, with the least fluctuation and the best performance, indicating greater reliability and effectiveness in most cases.
Table 5.
The one-way ANOVA for the best values with D = 30.
In summary, the analysis of these data concludes that QLRDE exhibits fast convergence, stability during operation, and a lower likelihood of becoming trapped in local optima. In most cases, it finds solutions that are closer to the true global optimum.
5. QLRDE for UAV Path Planning
This section presents a path planning solution based on the QLRDE algorithm. The solution first employs B-Spline curves to transform the continuous path planning problem into an optimization problem over a finite number of control points, thereby reducing the dimensionality of the decision variables. A comprehensive cost function incorporating multiple objectives is then established, considering path length, threat minimization, flying height, turning angle, and terrain constraints, so that path planning is modeled as an optimization problem. Finally, the QLRDE algorithm optimizes the control points, generating a smooth flight path that satisfies dynamic constraints while minimizing the total cost.
5.1. Path Representation Based on B-Spline Curve
The UAV path comprises N discrete waypoints (excluding the fixed start and target); each waypoint pk, k = 1, …, N, is defined by three coordinates (xpk, ypk, zpk), resulting in a 3N-dimensional decision space. For dimensionality reduction and smooth path generation, a B-Spline strategy [31] is employed to derive the flight trajectory from control points, which ultimately represent the target UAV path.
The UAV path, defined by discrete waypoints {p0, p1, …, pN, pN+1} with coordinates (xpk, ypk, zpk), corresponds to a set of control points {w0, w1, …, wn, wn+1} with coordinates (xci, yci, zci), i = 1, …, n. The start (p0, w0) and goal (pN+1, wn+1) are prescribed. Path planning is thus equivalent to solving for the n free control points that, together with the fixed endpoints, produce the desired B-Spline curve-based path.
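Sampling the flight path from control points can be sketched with de Boor's algorithm. The paper does not specify the spline degree or knot placement, so a clamped uniform knot vector with a cubic (k = 3) spline is assumed here; no external spline library is used.

```python
import numpy as np

def bspline_path(ctrl, n_samples=100, k=3):
    """Sample a clamped degree-k B-Spline path from 3-D control points.
    A sketch of the Section 5.1 representation using de Boor's algorithm;
    the clamped knots pin the curve to the first and last control points."""
    ctrl = np.asarray(ctrl, dtype=float)
    n = len(ctrl)
    # Clamped uniform knot vector of length n + k + 1
    t = np.concatenate([np.zeros(k), np.linspace(0.0, 1.0, n - k + 1), np.ones(k)])

    def de_boor(u):
        # Knot span index j with t[j] <= u < t[j+1], clamped to valid spans
        j = int(np.searchsorted(t, u, side="right")) - 1
        j = min(max(j, k), n - 1)
        d = [ctrl[i].copy() for i in range(j - k, j + 1)]
        for r in range(1, k + 1):
            for i in range(k, r - 1, -1):
                lo, hi = t[j - k + i], t[j + 1 + i - r]
                alpha = 0.0 if hi == lo else (u - lo) / (hi - lo)
                d[i] = (1.0 - alpha) * d[i - 1] + alpha * d[i]
        return d[k]

    us = np.linspace(0.0, 1.0, n_samples)
    return np.array([de_boor(u) for u in us])
```

Because the knots are clamped, the sampled path starts exactly at the prescribed start point and ends exactly at the goal, matching the prescribed (p0, w0) and (pN+1, wn+1) pairs above.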
5.2. Model of Cost Function
Path planning is formulated as an optimization task that minimizes a comprehensive cost function J, the core criterion for evaluating path quality [27]. As detailed in Section 5.1, the UAV path is represented by waypoints {p0, p1, …, pN, pN+1} with coordinates (xpk, ypk, zpk), k = 0, …, N + 1. The cost function is as follows:
where the total cost J aggregates six terms: length cost fLC, threat cost fTC, no-fly zone cost fNFC, altitude cost fAC, lateral maneuvering cost fLMC, and vertical maneuvering cost fVMC.
The length cost fLC is given by the accumulated sum of all consecutive segment lengths along the path, i.e., fLC = Σk ‖pk+1 − pk‖, summed over all consecutive waypoint pairs.
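A direct implementation of this accumulated-length term (the function name `length_cost` is illustrative):

```python
import numpy as np

def length_cost(waypoints):
    """f_LC: sum of Euclidean lengths of consecutive path segments."""
    p = np.asarray(waypoints, dtype=float)
    return float(np.linalg.norm(np.diff(p, axis=0), axis=1).sum())
```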
Designed to penalize proximity to threats, the threat cost fTC accounts for the cumulative threat along the path, factoring in segment lengths and localized threat probabilities as follows
where Pj,k, Rmax,j, and dj,k denote, respectively, the threat probability for segment pkpk+1 from the j-th threat, the threat’s maximum effective radius, and the segment’s distance to the threat center.
The UAV should avoid entering no-fly zones, such as harsh-climate zones and unknown zones. The no-fly zone cost function fNFC is calculated as follows:
where N is the number of no-fly zones and Lin,k is the length of the UAV path inside the k-th no-fly zone.
The altitude cost fAC is designed to incentivize low-altitude flight, thereby exploiting terrain masking, and to penalize any path colliding with the terrain, as follows:
where Hmap (xpk, ypk) is the terrain elevation at coordinates (xpk, ypk), Hmin is the minimum allowable flight altitude, and C is the penalty coefficient.
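Since the paper's equation is not reproduced here, the sketch below shows one plausible form consistent with this description: waypoints below the terrain elevation plus the minimum clearance Hmin incur the penalty C, while feasible waypoints accumulate their clearance above terrain, rewarding low-altitude, terrain-hugging flight. The exact functional form in the paper may differ:

```python
import numpy as np

def altitude_cost(waypoints, h_map, h_min=20.0, C=1e3):
    """Illustrative f_AC: penalize terrain collision / insufficient
    clearance with C, otherwise accumulate the clearance itself."""
    cost = 0.0
    for x, y, z in np.asarray(waypoints, dtype=float):
        clearance = z - h_map(x, y)        # height above terrain
        cost += C if clearance < h_min else clearance
    return cost
```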
The lateral maneuvering cost fLMC penalizes excessive turning by summing penalties at waypoints where the turning angle exceeds the allowable limit, as follows:
where the turning angle ϕk at pk and the penalty C are as defined above. The allowable turning is constrained by a maximum angle, which is determined by the UAV's dynamic limits
where V is the velocity and nmax is the maximum allowable lateral load.
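The lateral maneuvering penalty can be sketched as follows, treating the maximum turning angle as a given parameter and penalizing each waypoint whose horizontal turn exceeds it (the exact penalty form in the paper's equation may differ):

```python
import numpy as np

def lateral_cost(waypoints, phi_max_deg, C=1e3):
    """Illustrative f_LMC: add penalty C at each waypoint whose turning
    angle between incoming and outgoing horizontal segments exceeds
    the allowable maximum phi_max_deg."""
    p = np.asarray(waypoints, dtype=float)[:, :2]  # horizontal projection
    seg = np.diff(p, axis=0)                       # segment vectors
    cost = 0.0
    for a, b in zip(seg[:-1], seg[1:]):
        cosang = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        phi = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
        if phi > phi_max_deg:
            cost += C
    return cost
```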
The vertical maneuvering cost fVMC can be computed as the sum of penalties imposed at waypoints that violate the allowable climb/glide slope, as follows:
where the maximum climbing slope αk, minimum gliding slope βk, and the instantaneous path slope sk at waypoint pk are calculated according to the aerodynamic model [27].
5.3. Path Planning Using QLRDE
The core of applying the QLRDE algorithm to UAV path planning lies in effectively mapping the path optimization problem onto the evolutionary search framework of QLRDE. This section details this mapping process, including the encoding of B-Spline control points into QLRDE individuals, the design of the cost function as the fitness evaluation criterion, and the specific procedural integration.
In the B-Spline-based path representation described in Section 5.1, a smooth flight path is determined by a sequence of n free-to-move control points, w1, …, wn, where each control point has 3D coordinates. Together with the fixed start point w0 and target point wn+1, these points fully define the path.
In the QLRDE algorithm, each individual in the population represents a candidate solution. Specifically, an individual x in the QLRDE population is encoded as the concatenation of all coordinates of the n free control points, x = (xc1, yc1, zc1, …, xcn, ycn, zcn).
Thus, the problem dimension is D = 3n. Each individual x corresponds to a unique UAV path generated by the B-Spline curve construction formulas. The population of QLRDE, therefore, explores a space of potential paths by evolving these control points.
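The mapping from an individual back to the full control-point sequence can be sketched as follows (the function name `decode` is illustrative):

```python
import numpy as np

def decode(x, w0, w_end):
    """Map a QLRDE individual of length D = 3n to the full control-point
    sequence {w0, w1, ..., wn, wn+1} used to build the B-Spline path."""
    free_pts = np.asarray(x, dtype=float).reshape(-1, 3)  # n x 3
    return np.vstack([w0, free_pts, w_end])
```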
The quality of a path, represented by an individual x, is evaluated by the cost function J(x) defined in Equation (17). This function J serves as the fitness function in the QLRDE algorithm, guiding the evolutionary search towards safer and more efficient paths.
During greedy selection, each trial path is directly compared with its parent based on the total cost J(ui) versus J(xi), which establishes a steady selection pressure toward lower-cost paths. The penalty mechanism embedded in the cost function plays a crucial role in steering the search direction: paths that violate hard constraints (e.g., exceedance of turning or climb-angle limits) receive a heavy penalty of order C = 10³, raising their total cost far above that of any feasible path and ensuring their rapid elimination. For violations of soft constraints (e.g., proximity to threats), a moderate penalty on the order of 10¹ to 10² is applied, allowing some exploration near constraint boundaries while gradually guiding the population toward feasible regions. This differentiated penalty strategy enables the algorithm to explore a broad space that includes slightly infeasible solutions in early generations, yet converge strictly to fully feasible and optimized paths in later stages.
For the adaptive parameter adjustment mechanism, the update of an individual’s control parameters Fi and CRi is directly coupled to cost improvement: only when J(ui) < J(xi) are the parameters used in that trial considered successful and retained for the next generation via Equation (13); otherwise, the original parameters are kept. This mechanism ensures that search regions in the control-point space that consistently yield cost reductions acquire enhanced local-search capability, while regions that fail to improve are gradually assigned lower search intensity. In particular, individuals that successfully avoid threat zones or terrain obstacles tend to preserve and propagate their parameters, thereby establishing a search bias toward feasibility and low cost within the population.
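A simplified stand-in for this success-based update (the paper's actual Equation (13) may use a different resampling rule; the function name and uniform resampling here are assumptions):

```python
import random

def update_params(F_i, CR_i, improved,
                  F_range=(0.1, 0.9), CR_range=(0.2, 0.9)):
    """Keep (F_i, CR_i) only if the trial vector improved the cost
    (J(u) < J(x)); otherwise resample within the allowed ranges."""
    if improved:
        return F_i, CR_i
    return random.uniform(*F_range), random.uniform(*CR_range)
```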
The process of integrating the QLRDE algorithm for 3D UAV path planning follows a structured iterative procedure, as illustrated in Figure 4. Algorithm 4 summarizes the detailed procedure, highlighting the integration points between path planning and the QLRDE optimizer.
| Algorithm 4. QLRDE path planning | |
| /*Parameter setting*/ | |
| 1. | Set Start point w0, target point wn+1, terrain data Hmap, UAV parameters, nmax, V, Hmin, QLRDE parameters D, gmax, Mpop, Fmax, Fmin, CRmax, CRmin. |
| /*Initialization*/ | |
| 2. | Generation g = 0 |
| 3. | Randomly initialize a population of Mpop individuals xi, i = 1, …, Mpop, within the mission boundaries; each individual xi represents n control points. Calculate the costs J(xi). |
| 4. | Initialize amplification factors Fi and CRi. |
| 5. | While g < gmax do |
| 6. | Randomly select a, b ∈ [1, Mpop]. |
| 7. | If rand < 0.5 then |
| 8. | |
| 9. | Else |
| 10. | |
| 11. | End if |
| 12. | For i = 1: Mpop do |
| /*Quantum behavior mutation*/ | |
| 13. | Set adjustment parameters k, x. |
| 14. | If rand < 0.5 then |
| 15. | |
| 16. | Else |
| 17. | |
| 18. | End if |
| /*Crossover*/ | |
| 19. | For j = 1: D do |
| 20. | |
| 21. | End for |
| 22. | End for |
| 23. | Apply the Loser Reverse-Learning Mechanism as described in Algorithm 2. |
| 24. | For i = 1: Mpop do |
| /*Selection*/ | |
| 25. | If J(ui) ≤ J(xi) then |
| 26. | |
| 27. | |
| 28. | |
| 29. | |
| 30. | |
| 31. | Else |
| 32. | |
| 33. | |
| 34. | |
| 35. | |
| 36. | |
| 37. | End if |
| 38. | End for |
| 39. | g = g + 1 |
| 40. | End while |
| 41. | Return the Jbest and optimal path defined by the control points of xbest. |
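Stripped of the quantum-behaved mutation, the LRLM, and the adaptive parameters (whose update equations appear in the steps above), the backbone of Algorithm 4 is a standard DE/rand/1/bin loop with greedy selection. A minimal self-contained sketch of that backbone, under assumed fixed F and CR:

```python
import numpy as np

def de_minimal(J, bounds, Mpop=40, gmax=150, F=0.5, CR=0.7, seed=0):
    """Plain DE/rand/1/bin with greedy selection -- the skeleton that
    QLRDE extends with quantum mutation, LRLM, and adaptive F/CR."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    D = len(lo)
    X = rng.uniform(lo, hi, (Mpop, D))                 # initial population
    cost = np.array([J(x) for x in X])
    for _ in range(gmax):
        for i in range(Mpop):
            a, b, c = rng.choice([k for k in range(Mpop) if k != i],
                                 3, replace=False)
            v = np.clip(X[a] + F * (X[b] - X[c]), lo, hi)  # mutation
            mask = rng.random(D) < CR                      # binomial crossover
            mask[rng.integers(D)] = True                   # at least one gene
            u = np.where(mask, v, X[i])
            Ju = J(u)
            if Ju <= cost[i]:                 # greedy selection, lower cost wins
                X[i], cost[i] = u, Ju
    k = cost.argmin()
    return X[k], cost[k]
```

In the planner, J would be the path cost of Section 5.2 evaluated on the decoded control points, and the per-individual F and CR would follow the adaptive rule of Equation (13).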
Figure 4.
Flowchart of QLRDE-based UAV path planning.
5.4. Synergistic Advantages of QLRDE in Path Planning
The unique mechanisms of the QLRDE algorithm confer distinct synergistic advantages for addressing the UAV path planning problem. Primarily, the quantum-behaved mutation strategy, by virtue of its inherent uncertainty, drives individuals to explore distant regions of the search space. This is crucial for escaping local optima frequently induced by complex obstacle fields, thereby enhancing the probability of discovering a globally competitive path. Furthermore, the LRLM actively revitalizes stagnant individuals, effectively preventing premature convergence of the population to a suboptimal path and promoting the exploration of diverse path alternatives around threats and terrain features, thus maintaining critical population diversity. Ultimately, the adaptive parameter adjustment mechanism dynamically fine-tunes the search intensity: encouraging broad exploration during the early evolutionary stages and subsequently focusing on the local refinement of promising paths, achieving a balance between convergence efficiency and robustness.
6. Evaluation and Comparison in Solving Path Planning
6.1. Simulation Results and Comparison
To evaluate the designed QLRDE-based path planner, several comparative simulation experiments are conducted. This section compares the QLRDE algorithm with several other algorithms, including the LRDE algorithm, the GOBLDE algorithm, the standard DE algorithm, and the PSO algorithm, under two mission cases. A fair comparison was ensured by adopting identical settings for all algorithms, with the maximum iteration count as the termination criterion and a population size of Mpop = 40. The simulation environment consisted of MATLAB R2024a running on a standard personal computer, on which all algorithms were implemented.
Given the inherent stochasticity of swarm intelligence algorithms, performance was evaluated statistically. For each test case, every algorithm was executed independently for 50 runs. The experiments were conducted in a known rectangular mission environment containing predefined terrain and threats. As shown in Table 6, the positions of the start point, target point, threats, and no-fly zones are all represented by 2D planar coordinates. In the test cases, the UAV mission area is defined as a square airspace with a side length of 90 km. Threats from anti-aircraft guns, radars, and missiles are all modeled as cylinders of infinite height. For example, {[40, 25], 11} indicates that the center of a ground-based threat is located at coordinates [40, 25] km, with a threat radius of 11 km. No-fly zones are represented as rectangular cuboids of infinite height. For instance, {[9, 43], [23, 43], [23, 27], [9, 27]} denotes that the four vertices of the rectangular no-fly zone are situated at [9, 43] km, [23, 43] km, [23, 27] km, and [9, 27] km, respectively. The UAV path was parameterized using n = 5 B-Spline control points, resulting in an optimization problem of dimension D = 3n = 15. The UAV was configured with the following operational parameters: nmax = 5, Hmin = 20 m, and V = 200 m/s. All tested algorithms shared a common configuration, where Mpop = 40 and gmax = 400. Other parameters are set as follows: for LRDE, DE, and GOBLDE, F is a random number in [0.1, 0.9] and CR = 0.7; for PSO, c1 = c2 = 1.4 and w = 0.9; for QLRDE, both the amplification factor F ∈ [0.1, 0.9] and the crossover operator CR ∈ [0.2, 0.9] are dynamically adjusted during the optimization process.
Table 6.
Parameters of the threats in test cases.
Figure 5 shows the best 3D UAV paths in the digital terrain environment obtained by the five algorithms after 50 independent runs for Case 1 and Case 2, respectively, where white cylinders represent threat areas from missiles, radars, and anti-aircraft guns, and blue cubes represent no-fly zones.
Figure 5.
(a) Best UAV paths of the five algorithms in 3D environments in Case 1; (b) best UAV paths of the five algorithms in 3D environments in Case 2.
Figure 6 shows the horizontal projections of the best UAV paths from Figure 5 on the contour map, where circular areas represent threats and rectangular areas represent no-fly zones. All algorithms successfully planned safe paths that completely avoid threats.
Figure 6.
(a) Comparison of the horizontal projections in 3D environments for best UAV paths obtained by the five algorithms in Case 1; (b) comparison of the horizontal projections in 3D environments for best UAV paths obtained by the five algorithms in Case 2.
Figure 7 displays the altitude profiles of the best-performing paths for Case 1 and Case 2. The results show that after 50 runs, all five optimization algorithms successfully generated feasible paths avoiding all danger zones. However, the paths generated by the QLRDE algorithm have the shortest length while also exhibiting smaller variations in flight altitude. The QLRDE algorithm effectively combines threat avoidance and path optimization, significantly enhancing the survivability and mission effectiveness of the aircraft. Compared to the paths obtained by LRDE, GOBLDE, DE, and PSO, the QLRDE-generated paths yield superior cost metrics and smaller standard deviations, demonstrating its stronger search capability.
Figure 7.
(a) Comparison of altitude curves for the best UAV paths of the five algorithms in Case 1; (b) comparison of altitude curves for the best UAV paths of the five algorithms in Case 2.
Algorithm performance can be evaluated using the mean cost and standard deviation (std), which indicate search capability and stability, respectively. The data in Table 7 indicate that QLRDE achieves the minimum values in all five metrics: best, mean, median, worst, and std, with each optimal value highlighted in bold. This shows that QLRDE possesses the strongest optimization capability in a statistical sense. Execution efficiency can be evaluated using the average time (AT), the average runtime of a single run. The data in Table 7 indicate that QLRDE achieves a shorter average time in both test cases, outperforming LRDE and GOBLDE while remaining competitive with DE and PSO. Figure 8 presents the convergence curves, showing that QLRDE converges faster and achieves lower cost values than its counterparts.
Table 7.
Statistical results of the tested algorithms.
Figure 8.
(a) Comparison of convergence curves of the average best cost values of the five algorithms in Case 1; (b) comparison of convergence curves of the average best cost values of the five algorithms in Case 2.
To study the distribution characteristics of the solution set, Figure 9 plots the cumulative frequency against the cost value J for the solution set obtained from independent runs, where the cumulative frequency at any cost threshold J is CF(J) = N(Jmin ≤ J)/Ntotal,
in which the numerator N(Jmin ≤ J) counts the runs whose obtained minimum cost is at most J and the denominator Ntotal = 50 is the total number of independent runs. Figure 9 clearly shows that the QLRDE algorithm delivers better performance than the comparison algorithms. Taking Case 1 as an example, 95% of the solutions found by QLRDE have a cost below the threshold of 1500. In contrast, GOBLDE and LRDE reach this threshold in only about 75% and 80% of the trials, respectively. In Case 2, 100% of the cost values obtained by QLRDE are below 1000, whereas the GOBLDE, LRDE, and DE algorithms achieve this in only about 80% of cases. These results show that QLRDE has a higher capability than the other algorithms in solving UAV path planning. In summary, the proposed QLRDE algorithm outperforms GOBLDE, LRDE, DE, and PSO.
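The cumulative frequency curve can be computed directly from the per-run minimum costs (function name illustrative):

```python
import numpy as np

def cumulative_frequency(run_costs, thresholds):
    """Fraction of independent runs whose minimum cost is at most each
    threshold J: CF(J) = N(J_min <= J) / N_total."""
    c = np.asarray(run_costs, dtype=float)
    return np.array([(c <= J).mean() for J in thresholds])
```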
Figure 9.
(a) Cumulative frequency curves of the five algorithms in Case 1; (b) cumulative frequency curves of the five algorithms in Case 2.
6.2. Further Discussion of Algorithm Parameters
In the QLRDE algorithm, the parameters k, x, o, and w significantly influence search performance. To ensure the scientific rigor of the experiments, this study first established reasonable ranges for these parameters. Parameter k controls the amplification factor of differential mutation: if its value is too small, the population may lack exploration capability, whereas an excessively large value may cause oscillations in the solution space. Therefore, this study selected k = 3 and k = 4 within the commonly used range for comparison. Parameters x and o primarily affect the shape of the nonlinear adjustment function, influencing the step-size distribution during the search and the ability to escape local optima; their effective intervals are concentrated within [0.5, 0.9], so several representative points were selected for experimentation. The parameter w balances the ratio of mutation and retention: an overly small value may weaken exploration capability, while an overly large value may lead to algorithmic instability. Hence, typical values within [0.4, 0.8] were chosen for testing. Although this approach does not exhaustively traverse all possible combinations, it adequately covers the main effective intervals, ensuring the reliability of the analytical conclusions.
Under the above settings, this study designed 16 parameter combinations, each independently run 50 times in both Case 1 and Case 2, and recorded the best, worst, median, and mean values, as well as std in Table 8 and Table 9, with the top four values per row marked in bold. The experimental results indicate that, regarding parameter k, the overall mean value for k = 4 is significantly lower than that for k = 3, demonstrating stronger convergence performance. For the parameter w, although w = 0.4 achieved favorable mean values in some combinations, nearly all the best and second-best results corresponded to w = 0.8, indicating that w = 0.8 exhibits greater stability when combined with k = 4. In terms of the nonlinear coefficients, the combination x = 0.7 and o = 0.9 achieved the smallest mean value in both Case 1 and Case 2, representing the global optimum. It also demonstrated greater stability and better robustness across different experiments.
Table 8.
The statistical results of different parameter combinations in Case 1.
Table 9.
The statistical results of different parameter combinations in Case 2.
The performance metrics of each parameter combination in Case 1 and Case 2 are reported in Table 8 and Table 9, with the best values highlighted in bold. In both cases, k = 4, x = 0.7, o = 0.9, and w = 0.8 achieved the smallest best, median, mean, and standard deviation values.
In summary, three main conclusions can be drawn: First, k = 4 is superior to k = 3; second, w = 0.8 frequently appears in high-quality solutions and ensures stable performance; and third, x = 0.7 and o = 0.9 provides a good balance between mean performance and stability. Therefore, the optimal parameter combination is determined as k = 4, x = 0.7, o = 0.9, and w = 0.8, which achieves low mean cost while maintaining robust performance and general applicability, demonstrating the superiority of the selected parameters.
The experiments in the two test cases demonstrate the efficacy of the QLRDE algorithm for UAV path planning. In the two test cases, QLRDE achieved the shortest path lengths with best costs of 269 (with an average execution time of 93.27 s per run) and 253 (with an average execution time of 80.57 s per run), respectively. The altitude profiles show that paths generated by QLRDE exhibit smaller variations in flight altitude compared to other algorithms. Moreover, cumulative frequency analysis shows that 95% of the solutions found by QLRDE have a cost below the threshold of 1500, demonstrating faster convergence than other algorithms. Additionally, the analysis of 16 parameter combinations identified the optimal set as k = 4, x = 0.7, o = 0.9, and w = 0.8, which achieved the lowest mean cost and the most robust performance across both test cases. Experimental results demonstrate that QLRDE outperforms other existing algorithms in terms of solution quality, convergence speed, and computational efficiency.
7. Conclusions
This paper contributes an improved QLRDE algorithm to overcome local optima and premature convergence in standard DE, together with its application to UAV path planning. The proposed QLRDE integrates three innovations: a quantum-behaved mutation strategy to suppress premature convergence, the LRLM to enhance population diversity, and an adaptive parameter adjustment mechanism to improve robustness and convergence efficiency. These improvements strengthen global exploration and convergence capabilities. Experimental results on twelve benchmark functions demonstrated that QLRDE achieves better convergence speed, solution quality, and stability than several state-of-the-art algorithms. Applied to 3D UAV path planning, QLRDE generates short, low-altitude paths while satisfying realistic constraints including maximum turning angle, terrain avoidance, and threat zones. Results from the path planning experiments demonstrate QLRDE’s advantages over other algorithms in achieving higher-quality solutions, faster convergence, and more efficient computation.
QLRDE is suitable for medium-dimensional optimization problems (e.g., 20–30 dimensions), where it effectively escapes local optima. It also performs well in engineering applications such as UAV 3D path planning and robotic trajectory optimization, which require smooth and feasible solutions under multiple physical and environmental constraints. Furthermore, in scenarios demanding high robustness, QLRDE maintains consistent performance and stable solution quality across multiple independent runs. However, the current QLRDE is limited to single-UAV path planning and does not support cooperative multi-UAV coordination, which involves challenges such as inter-agent collision avoidance and communication constraints. Furthermore, the algorithm is an offline planning method assuming a static environment, and thus does not handle dynamic threats or moving obstacles that would require real-time replanning capabilities.
Future work will focus on two aspects. First, experimental validation will be conducted on real drone platforms or using public UAV path planning datasets to assess the physical feasibility and performance of the planned paths under real-world conditions. Second, we will extend the research to the multi-UAV cooperative path planning problem, incorporating more practical constraints.
Author Contributions
Methodology, Z.C. and Y.L.; formal analysis, Y.L. and Z.C.; investigation, X.Z. and Y.L.; writing—original draft preparation, Z.C. and Y.L.; writing—review and editing, X.Z.; funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This work is supported by National Natural Science Foundation of China (No. 62373015).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are contained within the article.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Zhang, Z.; Jiang, J.; Ling, K.V.; Wang, X.; Zhang, W.A. Cooperative Path Planning for Heterogeneous UAV Swarms. IEEE Trans. Autom. Sci. Eng. 2025, 22, 18531–18548. [Google Scholar] [CrossRef]
- Mac, T.T.; Copot, C.; Tran, D.T.; Keyser, R.D. Heuristic Approaches in Robot Path Planning: A Survey. Rob. Auton. Syst. 2016, 86, 13–28. [Google Scholar] [CrossRef]
- Yang, P.; Tang, K.; Lozano, J.A.; Cao, X.B. Path Planning for Single Unmanned Aerial Vehicle by Separately Evolving Waypoints. IEEE Trans. Robot. 2015, 31, 1130–1146. [Google Scholar] [CrossRef]
- Lin, Z.; Wu, K.; Shen, R.; Yu, X.; Huang, S. An Efficient and Accurate A-Star Algorithm for Autonomous Vehicle Path Planning. IEEE Trans. Veh. Technol. 2024, 73, 9003–9008. [Google Scholar] [CrossRef]
- Zheng, Z.-A.; Xie, S.; Ye, Z.; Zheng, X.; Yu, Z. Research on Path Smoothing Optimization Based on Improved RRT-Connect Algorithm and Third-Order Bézier Curve. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2025, 239, 2544–2561. [Google Scholar] [CrossRef]
- Xu, F.-X.; Wang, Y.-C.; Cheng, D.-Q.; An, W.-G.; Zhou, C.; Kou, Q.-Q. Reinforcement Learning-Driven Heuristic Path Planning Method for Automated Special Vehicles in Unstructured Environment. Robot. Auton. Syst. 2026, 195, 105231. [Google Scholar] [CrossRef]
- Lyu, Z.; Gao, Y.; Chen, J.; Du, H.; Xu, J.; Huang, K.; Kim, D.I. Empowering Intelligent Low-altitude Economy with Large AI Model Deployment. arXiv 2025, arXiv:2505.22343. [Google Scholar] [CrossRef]
- Zhu, C.; Bouteraa, Y.; Khishe, M.; Martín, D.; Hernando-Gallego, F.; Vaiyapuri, T. Enhancing Unmanned Marine Vehicle Path Planning: A Fractal-Enhanced Chaotic Grey Wolf and Differential Evolution Approach. Knowl.-Based Syst. 2025, 317, 113481. [Google Scholar] [CrossRef]
- Wu, Y.; Low, K.H.; Pang, B.; Tan, Q. Swarm-Based 4D Path Planning for Drone Operations in Urban Environments. IEEE Trans. Veh. Technol. 2021, 70, 7464–7479. [Google Scholar] [CrossRef]
- Yu, X.; Jiang, N.; Wang, X.; Li, M. A hybrid algorithm based on grey wolf optimizer and differential evolution for UAV path planning. Expert Syst. Appl. 2023, 215, 119327. [Google Scholar] [CrossRef]
- Zhang, X.Y.; Xia, S.; Zhang, T.; Li, X.Z. Hybrid FWPS Cooperation Algorithm Based Unmanned Aerial Vehicle Constrained Path Planning. Aerosp. Sci. Technol. 2021, 118, 107004. [Google Scholar] [CrossRef]
- Storn, R.; Price, K. Differential Evolution: A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. J. Glob. Optim. 1997, 11, 341–359. [Google Scholar] [CrossRef]
- Neri, F.; Tirronen, V. Recent Advances in Differential Evolution: A Review and Experimental Analysis. Artif. Intell. Rev. 2010, 33, 61–106. [Google Scholar] [CrossRef]
- Das, S.; Suganthan, P.N. Differential Evolution: A Survey of the State-of-the-Art. IEEE Trans. Evol. Comput. 2011, 9, 4–31. [Google Scholar] [CrossRef]
- Brintaki, A.N.; Nikolos, I.K. Coordinated UAV Path Planning Using Differential Evolution. Oper. Res. 2005, 5, 487–502. [Google Scholar] [CrossRef]
- Yang, J.Q.; Yan, F.; Zhang, J.; Peng, C.J. Hybrid Chaos Game and Grey Wolf Optimization Algorithms for UAV Path Planning. Appl. Math. Model. 2025, 142, 115979. [Google Scholar]
- Zhu, L.; Zhou, Y.; Zhou, G.; Luo, Q.; Wei, Y. Solving Spherical Multi-Aircraft Path Planning Problem Using Self-Adaptive Update Strategy Differential Evolution Algorithm. Swarm Evol. Comput. 2025, 96, 102004. [Google Scholar] [CrossRef]
- Xu, P.L.; Luo, W.J.; Lin, X.; Zhang, J.J.; Wang, X. A Large-Scale Continuous Optimization Benchmark Suite with Versatile Coupled Heterogeneous Modules. Swarm Evol. Comput. 2023, 78, 101280. [Google Scholar] [CrossRef]
- Zhang, H.; Sun, J.; Tan, K.C.; Xu, Z. Learning Adaptive Differential Evolution by Natural Evolution Strategies. IEEE Trans. Emerg. Top. Comput. Intell. 2023, 7, 872–886. [Google Scholar] [CrossRef]
- Wang, Y.; Cai, Z. Combining Multiobjective Optimization with Differential Evolution to Solve Constrained Optimization Problems. IEEE Trans. Evol. Comput. 2012, 16, 117–134. [Google Scholar] [CrossRef]
- Chiou, J.P.; Chang, C.F.; Su, C.T. Variable Scaling Hybrid Differential Evolution for Solving Network Reconfiguration of Distribution Systems. IEEE Trans. Power Syst. 2005, 20, 668–674. [Google Scholar] [CrossRef]
- Zhang, X.Y.; Xia, S.; Li, X.Z.; Zhang, T. Multi-Objective Particle Swarm Optimization with Multi-Mode Collaboration Based on Reinforcement Learning for Path Planning of Unmanned Air Vehicles. Knowl.-Based Syst. 2022, 250, 109075. [Google Scholar] [CrossRef]
- Gyongyosi, L.; Imre, S. A Survey on Quantum Computing Technology. Comput. Sci. Rev. 2019, 31, 51–71. [Google Scholar] [CrossRef]
- Zhang, X.Y.; Xia, S.; Li, X.Z. Quantum Behavior-Based Enhanced Fruit Fly Optimization Algorithm with Application to UAV Path Planning. Int. J. Comput. Intell. Syst. 2020, 13, 1315–1331. [Google Scholar] [CrossRef]
- Liu, T.Y.; Jiao, L.C.; Ma, W.P.; Shang, R.H. Quantum-Behaved Particle Swarm Optimization with Collaborative Attractors for Nonlinear Numerical Problems. Commun. Nonlinear Sci. Numer. Simul. 2017, 44, 167–183. [Google Scholar] [CrossRef]
- Ozsoydan, F.B.; Baykasoglu, A. Quantum Firefly Swarms for Multimodal Dynamic Optimization Problems. Expert Syst. Appl. 2019, 115, 189–199. [Google Scholar] [CrossRef]
- Zhang, X.Y.; Lu, X.Y.; Jia, S.M.; Li, X.Z. A Novel Phase Angle-Encoded Fruit Fly Optimization Algorithm with Mutation Adaptation Mechanism Applied to UAV Path Planning. Appl. Soft Comput. 2018, 70, 371–388. [Google Scholar] [CrossRef]
- Yu, X.; Xu, W.Y.; Li, C.L. Opposition-Based Learning Grey Wolf Optimizer for Global Optimization. Knowl.-Based Syst. 2021, 226, 107139. [Google Scholar] [CrossRef]
- Guo, H.; Li, S.; Qi, K.; Guo, Y.; Xu, Z. Learning Automata Based Competition Scheme to Train Deep Neural Networks. IEEE Trans. Emerg. Top. Comput. Intell. 2020, 4, 151–158. [Google Scholar] [CrossRef]
- Nasser, A.B.; Zamli, K.Z.; Hujainah, F.; Ghanem, W.; Alduais, N. An Adaptive Opposition-Based Learning Selection: The Case for Jaya Algorithm. IEEE Access 2021, 9, 55581–55594. [Google Scholar] [CrossRef]
- Besada-Portas, E.; de la Torre, L.; Moreno, A.; Risco-Martín, J.L. On the Performance Comparison of Multi-Objective Evolutionary UAV Path Planners. Inf. Sci. 2013, 238, 111–125. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.