1. Introduction
In recent years, the popularity of Unmanned Aerial Vehicles (UAVs) has increased because of their affordability and versatility [1,2,3]. These characteristics have facilitated the use of UAVs across multiple domains and applications, including infrastructure inspection [4], environmental monitoring [5], rescue missions and research [6], mapping [7], surveillance [8], and remote sensing [9]. In these sensing applications, target localization plays a critical role: it determines the position of a target from relative information between the UAV and the target, obtained from sensors onboard the UAV [10]. However, target localization based on passive sensors faces two main challenges. The first is that bearing measurement bias introduces substantial errors in target localization [11]. The second is that the estimation accuracy may be reduced because the estimator relies solely on bearing information [12].
Target localization, which usually aims to obtain a target’s inertial position, requires bearing information relative to the world frame [13,14]. Consequently, attitude sensors and onboard cameras, which provide the UAV’s attitude and the target’s line of sight (LOS) angle relative to the UAV’s fuselage frame, respectively, are generally integrated into the bearing-only target localization problem. According to [15,16], targets can be accurately localized when the bearing measurements are perturbed only by Gaussian noise. In practice, however, this assumption does not hold: the outputs from both the attitude sensor and the onboard camera are biased [17]. Neglecting the bias can notably impair the target localization performance [11,17]. Considerable research effort has focused on calibrating sensor biases. The method proposed in [18] constructs the bias pseudo-measurement exclusively through the manipulation of local tracks, covariances, and the equivalent bias measurement matrices to estimate the sensor bias in sensor registration. The authors of [19] integrated sensor calibration and trajectory fusion within a multi-target tracking framework to mitigate the effect of bearing measurement bias on target tracking. The integration of data is crucial, particularly when a target needs to be observed simultaneously by two radar devices. In terms of state estimation, the Extended Kalman Filter (EKF) is a useful method for dealing with diffuse white noise models [20]. It also has the advantage of high computational efficiency compared with the Unscented Kalman Filter (UKF), which has been widely used in recent years for nonlinear filtering problems [21]. Given that the observability of the system directly impacts the performance of the state estimation [22,23], it is necessary to maintain and enhance the observability of the target localization system.
Observability is a fundamental property of a system that indicates whether its initial state can be uniquely determined from its outputs. Only if the system is observable can the states at any time be determined by a state estimator such as the Kalman Filter. The Fisher Information Matrix (FIM) is commonly used as a metric for assessing system observability [24,25]. The inverse of the FIM corresponds to the Cramér–Rao Lower Bound (CRLB), which sets a theoretical lower limit on the error covariance matrix of any unbiased estimator, thereby representing the best achievable performance in state estimation. As highlighted in [12], maximizing system observability typically involves maximizing the determinant of the FIM, which in turn minimizes the estimation error covariance of the filter and enhances its overall performance.
The observability of a bearing-only measurement system is dynamic and depends on the relative positions of the observer and the target [26,27,28,29,30]. Therefore, it is necessary to enhance the system’s observability through trajectory optimization. The authors of [31] used the rank of the observability matrix as a criterion to assess the system’s observability and to determine the reliability of different sensor locations. In [24,32,33], the determinant of the FIM was employed as a metric and maximized to generate optimal trajectories that enhance the system’s observability. Unfortunately, there has been no investigation in the literature into UAV trajectory optimization for target localization based on directly enhancing system observability.
This work first analyzes the observability of a target localization system with biased bearing measurements via the Lie derivative method and derives the conditions necessary to maintain system observability. To ensure observability, a control barrier function (CBF) inspired by the phototropism of plants was designed. This function restricts UAV motion so that the UAV avoids areas that may degrade observability, with adjustable avoidance levels. Additionally, the condition number of the system observability matrix was employed as a metric to quantify the system observability, helping to identify the factors that contribute to it. Based on this analysis, a multi-objective, nonlinear programming problem was established to maintain and enhance system observability. To solve this problem effectively, a penalty function was integrated into the Multi-Objective Gray Wolf Optimization Algorithm to handle the nonlinear constraints. Simulations confirmed the effectiveness of the proposed method. The UAV operated at a fixed altitude and was modeled on a 2D, obstacle-free map, with constraints on its speed and turn rate to limit its turning radius. The root-mean-square error (RMSE) of localization was used as the metric for localization accuracy. The effectiveness of the proposed Nonlinear Constrained Multi-Objective Gray Wolf Optimization Algorithm (NCMOGWOA) was verified through comparisons with the Multi-Objective Particle Swarm Optimization Algorithm (MOPSOA) [34], the Multi-Objective Arithmetic Optimization Algorithm (MOAOA) [35], and the Sequential Quadratic Programming (SQP) method [36]. While the MOPSOA and MOAOA are both heuristic and neglect the fitness among the nondominated solutions, the SQP method requires a suitable starting point. To address these limitations, we propose the NCMOGWOA, which shows faster convergence and lower localization error in simulations. It also outperforms the other methods in terms of the convergence metrics generational distance (GD) and inverted generational distance (IGD). Additionally, we explore the impact of the CBF attenuation rate and the initial flight path angle on trajectory optimization.
The main contributions of this paper include the following:
- (1) Observability Analysis: Deriving necessary conditions for maintaining the observability of the target localization system using only biased bearing measurements. A control barrier function is designed to ensure system observability by restricting UAV motion.
- (2) Optimization Metric: Utilizing the condition number of the observability matrix as a metric. A multi-objective optimization algorithm is proposed to enhance system observability.
- (3) Algorithm Improvement: To address the limitations of the MOPSOA and MOAOA, the NCMOGWOA incorporates nondominated sorting and a crowding distance mechanism to improve the solution accuracy. A penalty function is constructed to manage nonlinear constraints, and random starting points increase adaptability.
The remaining sections are organized as follows. The kinematic model of the UAV and the target localization system with biased bearing measurement information are detailed in the subsequent section. Then, we conduct both qualitative and quantitative analyses of the system’s observability and introduce the designed trajectory optimization method. The simulation results are provided in the penultimate section, and the conclusions are presented in the final section.
2. System Models
This paper explores a two-dimensional stationary target localization problem. As illustrated in
Figure 1, the inertial reference is denoted as
. The variables with subscripts
and
indicate those of the UAV and target, respectively. The speed of the UAV is represented by
;
and
denote the bearing angle and the relative distance between the UAV and target, respectively.
represents the flight path angle of the UAV defined in the inertial reference frame. The separation angle
is defined as the angle between the longitudinal axis of the UAV and the line of sight of the UAV, which can be expressed as the bearing angle and flight path angle of the UAV:
. To ensure the uniqueness of each angle, let
.
Assuming that the UAV moves at the same horizontal altitude as the target, the kinematics of the UAV can be formulated as follows:
where
, while
and
represent the position vector and turn rate of the UAV, respectively.
is defined as the position vector of the target at time
in the inertial reference frame, which the UAV cannot directly obtain. The objective of target localization is to calculate
with the UAV’s position, speed, and bearing measurement
(measured by the UAV onboard sensor). According to [15], bearing-based target localization is accurate only when the measurements are perturbed solely by Gaussian noise. However, in practice, the bearing information is collected by the attitude sensor and onboard camera of the UAV, both of which have biased outputs. As shown in
Figure 1,
is defined as the bias in the bearing measurement. To achieve better performance in target localization, it is necessary to estimate
to compensate for the measured bearing information; thus, the state vector
to be estimated is defined as
The discrete-time system dynamics can subsequently be formulated as
where
denotes the state transition matrix from
to
.
and
are the white Gaussian noise with corresponding covariances
and
, respectively.
is defined as the relative position vector; then, in the form of the relative position, the biased measurement function can be represented as
and the dynamic system (3) can be rewritten as
Since observability directly influences the accuracy of target estimation, this paper aims to maintain the observability of the dynamic system (5) and subsequently enhance its performance in the target localization system with biased bearing measurements.
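As a minimal sketch of the model in this section, the following Python snippet assumes the common planar unicycle form of the UAV kinematics (position driven by speed and flight path angle, flight path angle driven by turn rate) and a bearing measurement equal to the inertial line-of-sight angle plus a constant bias and noise. The function names, the Euler discretization, and the numerical values are illustrative assumptions and are not taken from the paper.

```python
import math

def step_uav(x, y, psi, v, omega, dt):
    """One Euler step of a planar unicycle model (assumed form of the UAV kinematics)."""
    x_next = x + v * math.cos(psi) * dt
    y_next = y + v * math.sin(psi) * dt
    psi_next = psi + omega * dt
    return x_next, y_next, psi_next

def biased_bearing(uav_xy, target_xy, bias, noise=0.0):
    """Inertial bearing from the UAV to the target, corrupted by a constant bias and noise."""
    dx = target_xy[0] - uav_xy[0]
    dy = target_xy[1] - uav_xy[1]
    return math.atan2(dy, dx) + bias + noise

# Hypothetical values: 20 m/s speed, 0.1 rad/s turn rate, 2 degree bearing bias.
x, y, psi = step_uav(0.0, 0.0, 0.0, 20.0, 0.1, 0.5)
z = biased_bearing((x, y), (100.0, 100.0), bias=math.radians(2.0))
```

An estimator that ignores `bias` would attribute the constant offset in `z` to the target position, which is why the bias is appended to the state vector.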
3. Bio-Inspired Observability Enhancement Optimization Model
In this section, qualitative and quantitative analyses of the system’s observability are provided, together with the observability conditions and the factors that influence them.
3.1. Qualitative Analysis of System Observability
Within the continuous-time framework, the dynamic system (5) can be replaced by
Definition 1. The system (6) is observable in the time interval if the initial state can be uniquely determined from , .
Defining
as the cumulative measurement vector, the arbitrary initial state
and its corresponding measurement
are related by
According to the implicit function theorem, the initial state
can be uniquely determined from the measurement
if and only if the observability matrix, denoted by
is nonsingular.
Theorem 1. The dynamic system (6) is observable only if the observability matrix has full rank, i.e., rank , where is the order of the system.
To compute the observability matrix
, the Lie derivative is employed [37]. For simplicity, the state and time symbols are both omitted. The Lie derivative of
with respect to
is expressed as
. The
j-th order Lie derivative can be calculated via
Consequently, the relationship between states and measurement can be represented by
and its corresponding observability matrix can be computed by
where
In the dynamic system (6), the bias
remains constant, providing no additional information in the computation of the Lie derivative vector. Therefore, when computing the observability matrix, only the first two orders of
are considered:
For simplicity, let
. Hence, in polar coordinates, we have
By substituting (14) into (13), the observability matrix can be simplified as
and its corresponding determinant can be computed as
According to Theorem 1, the system (6) is observable only when the observability matrix
has full rank, which is equivalent to the fact that the determinant of
is nonzero, i.e.,
When , the UAV remains stationary, and the bearing measurement cannot be updated; when , . Regardless of the angle measurement, distinguishing the next step relative position vector from the current step relative position vector is impossible. Hence, to ensure system observability, the separation angle and the relative speed between the UAV and the target must remain nonzero.
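The two degenerate cases above can be checked numerically. The sketch below is an illustration under assumed values: it computes how much the true bearing changes after the UAV flies straight for one time step. When the UAV heads directly at the target (zero separation angle) or does not move, the bearing does not change, so consecutive measurements carry no new information. The function name and all numbers are hypothetical.

```python
import math

def bearing_change(uav_xy, psi, v, target_xy, dt):
    """Change in the true bearing after flying straight for dt seconds.

    No angle wrapping is applied, which is adequate for this small example.
    """
    before = math.atan2(target_xy[1] - uav_xy[1], target_xy[0] - uav_xy[0])
    nx = uav_xy[0] + v * math.cos(psi) * dt
    ny = uav_xy[1] + v * math.sin(psi) * dt
    after = math.atan2(target_xy[1] - ny, target_xy[0] - nx)
    return after - before

target = (100.0, 0.0)
# Flying straight at the target: separation angle is zero.
head_on = bearing_change((0.0, 0.0), 0.0, 20.0, target, 1.0)
# Flying perpendicular to the line of sight: separation angle is 90 degrees.
offset = bearing_change((0.0, 0.0), math.pi / 2, 20.0, target, 1.0)
# Zero speed: the UAV does not move.
stationary = bearing_change((0.0, 0.0), math.pi / 2, 0.0, target, 1.0)
```

Only the `offset` case produces a bearing change, in line with the requirement that both the speed and the separation angle remain nonzero.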
3.2. Bio-Inspired Unobservable Area Avoidance Based on the CBF Method
The observability matrix is determined by the bearing angle , the relative distance , the speed , and the flight path angle of the UAV. When the UAV flight parameters are constant, the observability matrix can be uniquely determined by the relative positional relationship between the UAV and the target.
Definition 2. The observability of the dynamic system can be determined by the geometric relationship between the UAV and the target.
To guarantee the observability of the dynamic system, it is crucial to modify the geometric relationship between the UAV and the target. In this work, inspired by phototropism and plants’ dark avoidance behavior, the CBF was employed to categorize the observable and unobservable areas. Additionally, restrictions were imposed to ensure UAV movement within the observable areas and maintain the system’s observability.
Phototropism refers to the phenomenon in which plants bend toward light when they are exposed to it [38,39]. It is considered to be a mechanism by which plants adapt to low-light environments, as illustrated in Figure 2.
Definition 3. For a smooth function , define as a superlevel set of , its boundary as , and its interior as : Considering a general nonlinear system, if there exists a constant such that for all , satisfying then is a control barrier function of (19).
The control barrier function (CBF) is frequently employed for safety analysis and control in nonlinear systems [40,41]. In this work, the control barrier function was employed to delineate observable and unobservable areas and restrict UAV motion, thereby ensuring system observability.
From (1), the kinematics of the UAV can be rewritten as
where
.
By utilizing the system observability conditions provided in (16) and (17), the control barrier function can be formulated as
The CBF observability constraints can subsequently be established through the following inequality:
where
represents the attenuation rate of
.
and
denote the Lie derivatives along the vector fields
and
, respectively:
Substituting (24) and (25) into (23) yields the observability constraints of the system:
3.3. Influence of the CBF Attenuation Rate on System Observability
By integrating (23) from
to
, we have
Subsequently, the superlevel set
that satisfies the CBF observability constraint at
can be defined as
Let
denote the set of states reachable by the system after
when the constraints are satisfied. Hence, a solution exists at
if the UAV’s motion at
fulfills both the control input constraints and the observability CBF constraints, i.e.,
. The impact of
on system observability is illustrated through the geometric relationship between
and
, as shown in
Figure 3.
When
, as illustrated in
Figure 3a, and
, the UAV has the flexibility to either approach or move away from the unobservable area, while ensuring the system’s observability at
. However, the possibility of the UAV moving closer to the unobservable area increases the risk of the system becoming unobservable.
When
,
, as shown in
Figure 3b, the UAV is restricted to moving far from the unobservable area, ensuring observability. However, as the coverage area of
decreases, the feasible area for UAV motion diminishes.
From
Figure 3a,b, it is evident that with a certain value of
, as
decreases, as indicated by (28), the area of
diminishes. This compression reduces the area covered by
, thereby decreasing the likelihood of the UAV moving closer to the unobservable area and ensuring system observability.
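The effect of the attenuation rate can be illustrated with a short sketch. It assumes the standard CBF inequality from the literature, in which the time derivative of the barrier value is bounded below by the negative attenuation rate times the barrier value, so that integrating over a step gives an exponential lower bound. The names `h`, `gamma`, `lf_h`, and `lg_h` are hypothetical stand-ins for the barrier value, the attenuation rate, and the two Lie derivatives, and a scalar control input is assumed for simplicity.

```python
import math

def cbf_lower_bound(h_now, gamma, dt):
    """Smallest barrier value allowed after dt when dh/dt >= -gamma * h holds."""
    return h_now * math.exp(-gamma * dt)

def cbf_satisfied(lf_h, lg_h, u, h, gamma):
    """Check the standard CBF inequality L_f h + L_g h * u + gamma * h >= 0."""
    return lf_h + lg_h * u + gamma * h >= 0.0

# Same starting barrier value and time step, two attenuation rates.
loose = cbf_lower_bound(1.0, 2.0, 0.5)  # large gamma: h may decay quickly
tight = cbf_lower_bound(1.0, 0.2, 0.5)  # small gamma: h must stay near its current value
```

A smaller `gamma` raises the lower bound, which shrinks the admissible superlevel set and keeps the UAV farther from the unobservable area, at the cost of a smaller feasible region.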
3.4. Quantitative Analysis of System Observability
In addition to maintaining the system’s observability, we aim to enhance it to improve the state estimation performance. To analyze system observability quantitatively,
was defined as the observability metric:
where
represents the condition number of the observability matrix
.
and
denote the minimum and maximum singular values, respectively.
Definition 4. A system is considered to have weak observability if the condition number of the observability matrix is extremely large or infinite.
Definition 5. A system is considered to have strong observability if the condition number of the observability matrix is close to one.
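To make Definitions 4 and 5 concrete, the following sketch computes the condition number of a 2×2 matrix as the ratio of its largest to its smallest singular value, using the closed-form expressions for the 2×2 case. The 2×2 size and the example matrices are assumptions chosen for illustration; they are not the observability matrix derived in this paper.

```python
import math

def condition_number_2x2(m):
    """Ratio of the largest to the smallest singular value of a 2x2 matrix."""
    (a, b), (c, d) = m
    s = a * a + b * b + c * c + d * d          # squared Frobenius norm
    det = a * d - b * c
    disc = math.sqrt(max(s * s - 4.0 * det * det, 0.0))
    sigma_max = math.sqrt((s + disc) / 2.0)
    if sigma_max == 0.0:
        return math.inf                        # zero matrix
    sigma_min = abs(det) / sigma_max           # since sigma_max * sigma_min = |det|
    if sigma_min == 0.0:
        return math.inf                        # rank deficient: unobservable
    return sigma_max / sigma_min

strong = condition_number_2x2([[1.0, 0.0], [0.0, 1.0]])   # well conditioned
weaker = condition_number_2x2([[2.0, 0.0], [0.0, 1.0]])
singular = condition_number_2x2([[1.0, 2.0], [2.0, 4.0]]) # rows are parallel
```

A value near one corresponds to strong observability in the sense of Definition 5, while an infinite value corresponds to the rank-deficient case excluded by Theorem 1.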
Substituting (15) into (29), we have
Equation (30) indicates that
,
if and only if the following equation is satisfied:
Typically, the relative distance between the UAV and target significantly exceeds the speed of the UAV, i.e., . Consequently, is generally much larger than , and Equation (31) can be satisfied only when and .
By defining
, a heatmap was generated to visualize the distribution of
. As demonstrated in
Figure 4, when
is closer to 1 and
is closer to
,
is closer to 1, indicating improved system observability. Consequently, to enhance the observability, the UAV must approach the target closely, while also adjusting the flight path angle
to make the separation angle
approach
.
3.5. Trajectory Optimization Model
Based on the observability analysis above, a multi-objective optimization model was formulated to maintain and enhance the observability of the dynamic system. The objective functions were defined as
and
, respectively. Typically, the distance from the UAV to target
is significantly greater than the speed of the UAV; the proximity of
and 1 is comparable to the proximity of
and 1. Hence, considering a minimum safe distance
between the UAV and the target, the first objective function can be redefined as
.
where
and
denote the minimum and maximum UAV speeds, respectively;
and
denote the minimum and maximum UAV turn rates, respectively.
Although the optimization problem in (32) can be solved via traditional nonlinear optimization techniques, heuristic methods provide greater flexibility and adaptability, especially when searching for local optimal solutions. To improve the convergence speed, we used the NCMOGWOA in this study. As shown in Section 5, our simulations highlight the faster convergence of the NCMOGWOA compared with the other methods, demonstrating its effectiveness for the given optimization problem.
4. Nonlinear Constrained Multi-Objective Gray Wolf Optimization Algorithm (NCMOGWOA)
The optimization model presented in the last section constitutes a multi-objective, nonlinear programming problem. As the number of candidates in the state space increases, the number of potential combinations rises exponentially, which makes the problem difficult to solve with conventional methods [42,43].
The Gray Wolf Optimization Algorithm (GWOA) is a bio-inspired algorithm that simulates the predatory actions of gray wolf populations in nature [44]. It steers the search toward the best solutions found so far, which allows the optimum to be located quickly. In this paper, a Nonlinear Constrained Multi-Objective Gray Wolf Optimization Algorithm (NCMOGWOA) is employed to address the presented multi-objective, nonlinear programming problem.
4.1. Gray Wolf Optimization Algorithm (GWOA)
The GWOA is a meta-heuristic algorithm inspired by the predatory behavior observed in gray wolf populations. It combines the hierarchy and distribution patterns observed within these populations to simulate the hunting and encircling process of gray wolves when they pursue their prey. This process includes four steps: establishing social hierarchies, searching for prey, encircling prey, and attacking prey.
The wolves are classified into four distinct classes, α, β, δ, and ω, and each class has unique responsibilities within the pack. Wolf α possesses managerial skills and oversees decisions regarding food acquisition and location; wolf β aids in decision making and serves as a communicator; wolf δ follows the directives of wolf α and wolf β, undertaking tasks such as scouting and guarding; and wolf ω complies with the pack’s hierarchy, maintaining social equilibrium. The GWOA model can be expressed as
where
represents the distance between the wolf and the prey, while
and
denote the positions of the prey and the wolf, respectively. Both
and
are coefficient vectors. The parameter
denotes the convergence factor, which linearly decreases with each iteration, and
and
are randomly selected values within the range [0, 1].
In the hunting process, wolves α, β, and δ initially make a random estimation of the prey’s location, since it is unknown. They then guide the other wolves to assess and update the estimated location iteratively until an optimal solution is achieved.
The coefficient vector , with a range of , influences the wolf’s decision making regarding its current position relative to that of the prey. More precisely, when , the algorithm emphasizes exploration, causing the wolf to move farther away from the prey. Conversely, when , the algorithm emphasizes exploitation, prompting the wolf to move closer to the prey.
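The hunting step described above can be sketched as follows. This snippet follows the standard GWO update as it is commonly stated in the literature, which is assumed here to match the model used in this paper: each wolf computes a distance to each of the three leaders, derives one position estimate per leader, and moves to their average. The function and variable names are hypothetical.

```python
import random

def convergence_factor(t, max_iter):
    """Linearly decrease the convergence factor a from 2 to 0 over the run."""
    return 2.0 * (1.0 - t / max_iter)

def gwo_step(wolf, alpha, beta, delta, a, rng=random):
    """Move one wolf toward the three leaders (standard GWO position update)."""
    new_pos = []
    for d in range(len(wolf)):
        estimates = []
        for leader in (alpha, beta, delta):
            A = 2.0 * a * rng.random() - a        # coefficient in [-a, a]
            C = 2.0 * rng.random()                # coefficient in [0, 2)
            dist = abs(C * leader[d] - wolf[d])   # distance to this leader
            estimates.append(leader[d] - A * dist)
        new_pos.append(sum(estimates) / 3.0)      # average of the three estimates
    return new_pos

# With a = 0 the coefficient A vanishes and the wolf lands on the leaders' mean.
collapsed = gwo_step([5.0, 5.0], [0.0, 0.0], [3.0, 3.0], [6.0, 6.0], a=0.0)
```

Early in the run `a` is close to 2, so `A` can exceed one in magnitude and push wolves away from the leaders; as `a` shrinks, the update contracts toward them.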
4.2. Multi-Objective Gray Wolf Optimization Algorithm
To adapt the GWOA to multi-objective problems, two enhancements were introduced to the algorithm.
4.2.1. External Population Archive
An external population archive was introduced to store nondominated solutions. At each iteration, the algorithm generates new positions for the gray wolves. To assess its eligibility for the archive, each new gray wolf is compared with the gray wolves already stored there. If the new gray wolf is dominated by any wolf in the archive, it cannot join the pack. Conversely, if the new gray wolf dominates one or more archived wolves, it joins the pack and displaces every wolf it dominates. If neither dominates the other, the new gray wolf can join only if the archive has not reached its maximum capacity.
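The archive rules can be sketched in a few lines. The snippet assumes minimization of every objective and the usual Pareto-archive convention that a candidate is rejected as soon as any archived member dominates it. The grid-based removal used when the archive is full is omitted, and all names are hypothetical.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimization)."""
    no_worse = all(x <= y for x, y in zip(f_a, f_b))
    strictly_better = any(x < y for x, y in zip(f_a, f_b))
    return no_worse and strictly_better

def update_archive(archive, candidate, capacity):
    """Return the archive after offering it one candidate objective vector."""
    if any(dominates(member, candidate) for member in archive):
        return list(archive)                   # candidate is dominated: rejected
    kept = [m for m in archive if not dominates(candidate, m)]
    if len(kept) < capacity:                   # join only if there is room
        kept.append(candidate)
    return kept

archive = [(1.0, 4.0), (3.0, 2.0)]
rejected = update_archive(archive, (2.0, 5.0), capacity=5)
replaced = update_archive(archive, (0.5, 3.0), capacity=5)
full = update_archive(archive, (2.0, 3.0), capacity=2)
```

Here `rejected` leaves the archive unchanged, `replaced` swaps out the member that the candidate dominates, and `full` shows a mutually nondominated candidate being turned away because no slot is free.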
4.2.2. Decision-Making Wolf Selection
In [45], a roulette method was employed to choose the decision-making wolves from the archive. This method identifies the least crowded grid segment in the archive and randomly selects three of its solutions as wolves α, β, and δ, without any perceived superiority or inferiority among them. If that segment contains too few solutions, the selection falls back to the segment with the second lowest crowdedness.
4.3. Nonlinear Constraint Penalty Function
Multi-objective optimization problems frequently involve nonlinear inequality and equality constraints, which make them challenging to solve. However, the Multi-Objective Gray Wolf Optimization Algorithm (MOGWOA) does not account for these nonlinear constraints, so wolves may leave the allowable region during location updates [45]. In this paper, building on the MOGWOA, we consider only nonlinear inequality and equality constraints. A general model for multi-objective programming problems can be expressed as
where
and
represent the nonlinear equality and inequality constraints, respectively. The constraint penalty function
can be defined as follows:
where
and
are constants, and where
is a judgment function of inequality constraint
:
With the introduction of the constraint penalty function, the nonlinear constraints can be converted into the objective function, thereby simplifying the problem-solving process. The multi-objective programming problem can be reformulated as
As represented in (43), if the updated position of the wolves breaches the nonlinear constraints, the values of each objective function significantly increase, rendering the wolves’ location nonoptimal. In other words, the optimal solution in each iteration must satisfy all the constraints.
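A minimal sketch of this penalty mechanism is given below. It assumes a quadratic penalty, inequality constraints written in the form g(x) ≤ 0, and large constant weights; the exact form and constants used in this paper may differ. The judgment function is represented by the `violated` flag, and all names and values are hypothetical.

```python
def penalty(x, eq_constraints, ineq_constraints, c_eq=1e6, c_ineq=1e6):
    """Quadratic penalty for constraints h(x) = 0 and g(x) <= 0 (assumed form)."""
    total = 0.0
    for h in eq_constraints:
        total += c_eq * h(x) ** 2
    for g in ineq_constraints:
        value = g(x)
        violated = 1.0 if value > 0.0 else 0.0   # judgment function
        total += c_ineq * violated * value ** 2
    return total

def penalized_objectives(x, objectives, eq_constraints, ineq_constraints):
    """Add the same penalty to every objective value."""
    rho = penalty(x, eq_constraints, ineq_constraints)
    return [f(x) + rho for f in objectives]

# Toy problem: two objectives in one variable with the constraint x >= 1.
objectives = [lambda x: x[0] ** 2, lambda x: (x[0] - 2.0) ** 2]
ineq = [lambda x: 1.0 - x[0]]
feasible = penalized_objectives([1.5], objectives, [], ineq)
infeasible = penalized_objectives([0.5], objectives, [], ineq)
```

Because the penalty is added to every objective, an infeasible wolf is dominated by any feasible one with comparable objective values and therefore drops out of the archive.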
Algorithm 1 outlines the proposed NCMOGWOA. In line 2, the process begins with the random initialization of the wolf population through the following equation:
where
,
represents a uniformly distributed random number in the range [0, 1];
and
denote the lower and upper bounds of the dimension, respectively.
Algorithm 1: Nonlinear Constrained Multi-Objective Gray Wolf Optimization Algorithm (NCMOGWOA) |
1: | begin |
2: | Initialize the gray wolf population Xi (i = 1, 2, …, n) randomly within the feasible region |
3: | Initialize a, A, and C using Equations (35) and (36) |
4: | Calculate the objective values for each search agent using Equation (40) |
5: | The initial archive Ar0 ← The nondominated solutions |
6: | X∂1, X∂2, X∂3 ← The initial archive Ar0 (select the initial three wolves with the lowest objective |
7: | function values) |
8: | for t = 1, 2, …, Max number of iterations |
9: | Update the positions of each current search agent using Equations (37), (38), and (39) |
10: | Update a, A, and C using Equations (35) and (36) |
11: | Calculate [f1(x), f2(x), …, fn(x)] of all search agents using Equation (40) |
12: | Calculate ρ(x) for all search agents using the constraint penalty function |
13: | [f1(x), f2(x), …, fn(x)] ← [f1(x) + ρ(x), f2(x) + ρ(x), …, fn(x) + ρ(x)] |
14: | The archive Ar ← The nondominated solutions |
15: | If the archive is full |
16: | Omit one of the current archive members |
17: | The archive Ar ← The archive Ar + The new solution |
18: | end if |
19: | If solutions are outside the hypercubes |
20: | Update the grids to cover the new solutions |
21: | end if |
22: | X∂1, X∂2, X∂3 ← The archive Ar |
23: | end for |
24: | end |
25: | return archive Ar |
The proposed algorithm uses the archive to store the nondominated solutions. In each iteration of the loop, after a new wolf population is generated, the penalty function is calculated along with all the objective functions (lines 11 and 12). The objective functions are then updated by adding the penalty function to their original values (line 13).
All the new gray wolves are then compared, in terms of the updated objective functions, with the gray wolves stored in the archive. If a new gray wolf is dominated by any wolf in the archive, it cannot join the pack. Conversely, if it dominates one or more archived wolves, it joins the pack and displaces every wolf it dominates (lines 14–18).
After the search process, the three leading wolves are replaced by the three best wolves in the current archive (line 22). When the algorithm finishes, the updated archive, which stores all the nondominated solutions, is returned as the output.
4.4. Selection of the Optimal Solution
As the NCMOGWOA yields a set of Pareto optimal solutions rather than a single solution, the two objectives generally cannot be minimized simultaneously, which complicates the selection of the optimal UAV inputs.
Definition 6. Given a cost-based, multi-objective optimization problem with a feasible solution set : , , if satisfies.
Consequently, is denoted as the optimal solution of .
Definition 7. Given a cost-based, multi-objective optimization problem with a feasible solution set : , , if there exists and no other such that (46) holds
and at least one of (46) is a strict inequality, then
is denoted as the Pareto optimal solution of
.
The set containing all the Pareto optimal solutions of
is referred to as the Pareto optimal solution set of
. The graphical representation of the Pareto optimal solution set in the space of the objective function is defined as the Pareto front. These concepts are demonstrated in
Figure 5.
Figure 5 illustrates the Pareto solution of a two-objective optimization problem. Each black dot within the feasible solution set
represents a Pareto solution, collectively forming the Pareto solution set. The red line indicates the Pareto front, highlighting solutions where neither objective
nor
can be improved without sacrificing the other. Thus,
is not a Pareto solution.
To provide a comprehensive evaluation, we propose a synthesized evaluation method, referred to as the TCM, that combines the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) with the Criteria Importance Through Intercriteria Correlation (CRITIC) method. Applying the TCM to the Pareto front solution set obtained by the NCMOGWOA yields a single optimal solution to the problem.
In this paper, the CRITIC method is employed to assign weights to each objective function in every iteration. CRITIC is commonly used to weight indicators. It considers both the contrast within each evaluation indicator and the correlation among indicators, assigning smaller weights to indicators that are highly correlated with the others and larger weights to indicators that vary strongly across the alternatives [46].
With the weights generated by the CRITIC method, the TOPSIS method was employed to synthesize the assessments of the various objective function values. TOPSIS establishes the positive and negative ideal solutions of the evaluation problem, i.e., the best and worst values of each index, and ranks the candidate solutions by their relative closeness to the positive ideal solution, taking into account their distances to both ideal solutions [47].
The TCM process is illustrated in Figure 6. Once the Pareto front solution set was obtained by the NCMOGWOA in each iteration, the weight of each objective function could be computed using the CRITIC method. Based on these weights, the synthesized evaluation index was subsequently constructed using the TOPSIS method, and the final scores were derived for all Pareto solutions. We selected the optimal solution and the corresponding optimized variables from these scores.
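The TCM steps above can be sketched as follows, assuming that every objective is a cost to be minimized and that at least two Pareto solutions are available. The sketch uses min-max normalization for both CRITIC and TOPSIS, which is one common variant and may differ from the normalization used in this paper. All function names and the sample Pareto front are hypothetical.

```python
import math

def normalize_cost(matrix):
    """Min-max normalize cost-type objectives so that 1 is best and 0 is worst."""
    out_cols = []
    for col in zip(*matrix):
        lo, hi = min(col), max(col)
        span = hi - lo
        out_cols.append([(hi - v) / span if span > 0 else 1.0 for v in col])
    return [list(row) for row in zip(*out_cols)]

def critic_weights(norm):
    """CRITIC: weight = contrast (std) times conflict (sum of 1 - correlation)."""
    cols = list(zip(*norm))
    n, m = len(norm), len(cols)
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1))
            for c, mu in zip(cols, means)]

    def corr(j, k):
        if stds[j] == 0.0 or stds[k] == 0.0:
            return 0.0                         # constant column: no correlation
        cov = sum((p - means[j]) * (q - means[k])
                  for p, q in zip(cols[j], cols[k])) / (n - 1)
        return cov / (stds[j] * stds[k])

    info = [stds[j] * sum(1.0 - corr(j, k) for k in range(m)) for j in range(m)]
    total = sum(info)
    if total == 0.0:
        return [1.0 / m] * m                   # fall back to equal weights
    return [c / total for c in info]

def topsis_scores(norm, weights):
    """Relative closeness of each row to the positive ideal solution."""
    weighted = [[w * v for w, v in zip(weights, row)] for row in norm]
    cols = list(zip(*weighted))
    best = [max(c) for c in cols]
    worst = [min(c) for c in cols]
    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_worst = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom > 0 else 1.0)
    return scores

def select_by_tcm(pareto_objectives):
    """Return the index of the highest-scoring Pareto solution, weights, and scores."""
    norm = normalize_cost(pareto_objectives)
    weights = critic_weights(norm)
    scores = topsis_scores(norm, weights)
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, weights, scores

# Hypothetical Pareto front with two cost objectives.
front = [(1.0, 9.0), (2.0, 4.0), (5.0, 1.0)]
best_index, weights, scores = select_by_tcm(front)
```

For this sample front the two extreme solutions each excel in one objective only, so the compromise solution in the middle receives the highest closeness score.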
Upon examining the distribution of
in
Figure 4, it is evident that during the initial phase, the impact of
on the condition number is almost negligible due to its large magnitude. Consequently, it is imperative to prioritize the optimization of
to ensure a decrease in the value of
to an appropriate level.
However, solely concentrating on reducing the distance
can lead the UAV to violate the CBF constraint. Therefore, the optimization objectives of this problem are divided into two stages:
where
represents the optimal solution,
represents the Pareto optimal solution with the smallest value in
,
indicates the optimal solution selected by the TCM, and
is a constant.
Figure 7 illustrates the algorithm for target localization using only biased bearing information. The aim is to optimize the UAV trajectory to improve the localization performance. First, by employing the NCMOGWOA, the optimal Pareto set
was obtained, as was its corresponding objective function value vector
. Afterward, utilizing the optimal Pareto set
with the TCM, we selected the optimal solution according to (47), which was then used to update the position of the UAV. Furthermore, employing the Extended Kalman Filter (EKF) facilitated obtaining both
and
, where
and
denote the estimation of the target’s position and bias in the bearing measurement
, respectively. In closed-loop mode,
was employed to compensate for the bearing measurement
, enhancing its accuracy.
5. Experiments and Results
In this section, the bearing-only stationary target localization problem is solved by the proposed UAV trajectory optimization algorithm based on observability enhancement. We analyze the differences between the open-loop and closed-loop modes of the algorithm and explore the effects of varying the CBF attenuation rate and the initial flight path angle on trajectory optimization. Finally, we evaluate the localization accuracy and convergence of the proposed NCMOGWOA by comparing it with the MOPSOA [34], MOAOA [35], and SQP [36] methods. The comparison is conducted through state estimation and the GD and IGD metrics. All simulations were run in MATLAB R2021b on a Windows 11 system with an AMD Ryzen 5 5600H processor.
5.1. Trajectory Optimization Results for Target Localization
In the case of a stationary target, the state transition matrix
can be represented by an identity matrix. The relevant parameters are summarized in
Table 1.
In this study, the target position and the bearing measurement bias were estimated using the Extended Kalman Filter (EKF). The initial value for state estimation was set as follows:
The covariance Q and initial values of the error covariance matrix were set to
The target localization system, which relies on biased bearing measurements, operates in closed-loop mode. In this mode, the estimation of the bearing measurement bias
at each moment compensates for the subsequent bearing measurement, reducing its impact. Accurate estimation of bias
improves the precision of bearing measurements. In closed-loop mode, based on (4), the measurement function at each time instant
is described as follows:
where
denotes the estimation of bias
at time instant
.
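As a rough illustration of the estimator described above, the following Python sketch performs one EKF step for a stationary target (state transition matrix equal to the identity) with the bearing bias appended to the state. The state layout `[x_t, y_t, b]`, the measurement model `atan2(dy, dx) + b`, and the names `ekf_step`, `bearing`, and `bearing_jacobian` are assumptions made for this sketch; it is not the exact measurement function of (4), nor the MATLAB implementation used in the experiments.

```python
import numpy as np


def wrap_angle(a):
    """Wrap an angle to the interval [-pi, pi)."""
    return (a + np.pi) % (2.0 * np.pi) - np.pi


def bearing(state, uav_xy):
    """Assumed biased bearing model: atan2(dy, dx) + b, state = [x_t, y_t, b]."""
    dx = state[0] - uav_xy[0]
    dy = state[1] - uav_xy[1]
    return np.arctan2(dy, dx) + state[2]


def bearing_jacobian(state, uav_xy):
    """1x3 Jacobian of the assumed bearing model with respect to the state."""
    dx = state[0] - uav_xy[0]
    dy = state[1] - uav_xy[1]
    r2 = dx * dx + dy * dy
    return np.array([[-dy / r2, dx / r2, 1.0]])


def ekf_step(x_est, P, z, uav_xy, Q, R):
    """One EKF predict/update step for a scalar bearing measurement z.

    Q is the 3x3 process noise covariance, R the scalar measurement variance.
    """
    # Prediction: stationary target and constant bias, so F = I.
    x_pred = np.asarray(x_est, dtype=float).copy()
    P_pred = P + Q

    # Update with the wrapped bearing innovation.
    H = bearing_jacobian(x_pred, uav_xy)
    innovation = wrap_angle(z - bearing(x_pred, uav_xy))
    S = H @ P_pred @ H.T + R          # 1x1 innovation covariance
    K = P_pred @ H.T / S              # 3x1 Kalman gain
    x_new = x_pred + (K * innovation).ravel()
    P_new = (np.eye(3) - K @ H) @ P_pred
    return x_new, P_new
```

Wrapping the innovation keeps the update well behaved when the bearing crosses the ±180° boundary, which happens repeatedly when the UAV circles the target.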
The trajectory optimization results are illustrated in
Figure 8. In
Figure 8a, the UAV trajectory forms a circle of decreasing radius around the stationary target. When
, the UAV approaches the target, increasing the separation angle
. As
rapidly decreases to
, there is a sharp decline in the condition number, as shown in
Figure 8l. At
,
is selected as the optimal solution. To maintain the separation angle
near
, the UAV continues a circular path with a diminishing radius, enhancing system observability. When
reaches the safe distance
, the UAV slows down to maintain this distance, orbiting the target at
.
Figure 8c,d,k,l demonstrates that the observability matrix remained full rank throughout. The localization and bearing bias estimation errors converged to zero as the condition number decreased. Moreover, the estimated position of the target eventually converged to its real position, as shown in
Figure 8b. As
stabilizes, the speed of the UAV must decrease, causing a slight increase in the condition number. However, these fluctuations were minimal and had a negligible effect on estimation accuracy. The proposed trajectory algorithm thus clearly enhanced observability, resulting in accurate localization of the target and calibration of the bearing measurement.
5.2. Comparison between the Open-Loop and Closed-Loop Modes
This section analyzes the performance differences in trajectory optimization between the open-loop and closed-loop modes, which differ in whether the estimate of the bias b is used to compensate for the target bearing measurement. The simulation parameters for both modes are the same as those given in Table 1.
For simplicity, we denoted the open-loop mode as
and the closed-loop mode as
.
Figure 9 shows that the trends in the condition number were similar for both modes. However, a significant difference exists in the estimation of bias
: the error converged to nearly zero in
but not in
.
Figure 9a,c and
Table 2 demonstrate the superior performance of
in target localization, where the estimated target location converged to the actual position, unlike in
. Moreover, except for the initial period, the localization error in
remained lower than that in
and could converge to zero.
5.3. Effect of the CBF Attenuation Rate on Trajectory Optimization
To explore the effect of
on trajectory optimization, the attenuation rates
of the CBF were divided into three groups, namely,
, instead of being optimized as a parameter by the NCMOGWOA. To exclude other factors, the optimal solution was chosen by the TCM throughout the process. The simulation results are presented in
Figure 10.
Figure 10a,b illustrates that as
decreased, the constraints imposed by the CBF on the UAV were reduced. A smaller
allowed the UAV to approach the target more rapidly, leading to a quicker reduction in both
and the condition number, thus enhancing observability. When
, a smaller
resulted in a faster decrease in the condition number, as shown in
Figure 10d. After
reached
, the condition number stabilized across different cases.
Figure 10c shows that, although the localization error eventually converged to nearly zero in all scenarios, it was significantly lower with a smaller
. To exclude the effects of stochastic factors such as Gaussian noise, a Monte Carlo simulation was conducted. The mean and standard deviation (Std) of the root-mean-square error (RMSE) of the localization error over 100 runs are summarized in
Table 3, clearly demonstrating that a smaller
improves target localization performance.
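The Monte Carlo statistics reported here can be computed along the following lines. This is a minimal sketch: the function names, the array layout (one `(T, 2)` array of position estimates per run), and the use of the sample standard deviation (`ddof=1`) are assumptions, and the random data in the usage part are placeholders rather than results from the study.

```python
import numpy as np


def localization_rmse(est_xy, true_xy):
    """RMSE of the position error over one run.

    est_xy: (T, 2) estimated target positions; true_xy: (2,) true position.
    """
    err = np.asarray(est_xy, dtype=float) - np.asarray(true_xy, dtype=float)
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))


def monte_carlo_summary(runs, true_xy):
    """Mean and sample standard deviation of the per-run RMSE."""
    rmses = np.array([localization_rmse(run, true_xy) for run in runs])
    return float(rmses.mean()), float(rmses.std(ddof=1))


if __name__ == "__main__":
    # Placeholder data only: 100 synthetic runs with Gaussian position errors.
    rng = np.random.default_rng(0)
    true_xy = np.array([1000.0, 1000.0])
    runs = [true_xy + rng.normal(scale=5.0, size=(200, 2)) for _ in range(100)]
    mean_rmse, std_rmse = monte_carlo_summary(runs, true_xy)
    print(f"mean RMSE = {mean_rmse:.3f} m, Std = {std_rmse:.3f} m")
```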
Although a smaller
implied better observability, it is apparent that
should not be made arbitrarily small. When
became too small, the CBF constraints on the UAV became excessively stringent, as illustrated in
Figure 10a, resulting in a severely limited feasible area for the UAV. This limitation makes it challenging for UAVs to fulfill certain mission requirements, such as cruising. Therefore, it is crucial to optimize
to effectively constrain UAV motion based on mission objectives.
5.4. Effect of the Initial Flight Path Angle on Trajectory Optimization
To analyze the effect of the initial flight path angle on trajectory optimization, the initial flight path angles were divided into six groups: , , , , , and . The optimal solution was chosen by the TCM throughout the process.
Figure 11a shows the UAV trajectories for different initial angles. The trajectories fell into two categories, separated by a flight path angle of 45°. For angles less than 45°, the UAV followed a counter-clockwise circular path around the target with a continuously decreasing radius. Conversely, for angles greater than 45°, the UAV followed a clockwise circular path. Similarly, as shown in
Figure 11b, the separation angle converged to one of two values: −90° for initial flight path angles greater than 45°, and 90° for angles less than 45°.
When the flight path angle was exactly 45°, the UAV moved directly toward the target, resulting in a separation angle of zero. In this case, the target localization system became unobservable, meaning that the target’s position could not be accurately estimated.
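The loss of observability on a direct approach can be checked numerically. The sketch below stacks the measurement Jacobians of an assumed bearing model `atan2(dy, dx) + b` for a stationary target (identity state transition) and compares a straight-line approach with a circular path. The geometry, the state layout `[x_t, y_t, b]`, and the function names are illustrative assumptions, not the parameters of Table 1.

```python
import numpy as np


def bearing_row(target_xy, uav_xy):
    """Jacobian row of the assumed model atan2(dy, dx) + b w.r.t. [x_t, y_t, b]."""
    dx = target_xy[0] - uav_xy[0]
    dy = target_xy[1] - uav_xy[1]
    r2 = dx * dx + dy * dy
    return np.array([-dy / r2, dx / r2, 1.0])


def observability_matrix(target_xy, uav_path):
    """Stack one Jacobian row per UAV position (F = I for a stationary target)."""
    return np.vstack([bearing_row(target_xy, p) for p in uav_path])


target = np.array([1000.0, 1000.0])

# Path 1: fly straight at the target along the 45 degree line.
radial_path = [np.array([50.0 * k, 50.0 * k]) for k in range(10)]

# Path 2: circle the target at a constant radius of 1000 m.
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
circular_path = [target + 1000.0 * np.array([np.cos(a), np.sin(a)]) for a in angles]

O_radial = observability_matrix(target, radial_path)
O_circular = observability_matrix(target, circular_path)

rank_radial = np.linalg.matrix_rank(O_radial)      # rank deficient
rank_circular = np.linalg.matrix_rank(O_circular)  # full rank (3)
cond_circular = np.linalg.cond(O_circular)

print("radial path rank:", rank_radial)
print("circular path rank:", rank_circular)
print("circular path condition number:", cond_circular)
```

On the straight-line path the line-of-sight direction never changes, so every row has the form `[-a_k, a_k, 1]` and the matrix cannot reach rank 3; the circular path varies the line of sight and restores full rank.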
5.5. Comparison between the NCMOGWOA and Other Methods
To evaluate the effectiveness of the proposed NCMOGWOA, it was compared with the MOPSOA, MOAOA, and SQP methods [
34,
35,
36]. The same experimental parameters (
Table 1) were selected for the NCMOGWOA. The MOPSOA, MOAOA, and SQP parameters used in this study are summarized in
Table 4.
The results are illustrated in
Figure 12. To exclude stochastic effects such as Gaussian noise, a Monte Carlo simulation was conducted. The mean and standard deviation (Std) of the root-mean-square error (RMSE) are summarized in
Table 5. As shown in
Figure 12 and
Table 5, although each method achieved a low localization error, the proposed NCMOGWOA converged more rapidly. It also achieved a lower mean and Std for the target position and bearing bias errors, indicating a superior localization performance.
To assess algorithm performance quantitatively, the Generational Distance (GD) and Inverted Generational Distance (IGD) were used to evaluate convergence and comprehensive performance. They are expressed as follows:
where
and
denote the number of obtained Pareto solutions and true Pareto solutions, respectively;
denotes the Euclidean distance between the ith obtained Pareto solution and the closest true Pareto solution; and
denotes the Euclidean distance between the ith true Pareto solution and the closest obtained Pareto solution. A smaller GD indicates better convergence, and a smaller IGD indicates better comprehensive performance of the algorithm.
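For concreteness, the two metrics can be computed as in the sketch below. It assumes one common form, in which the square root of the summed squared nearest-neighbor distances is divided by the number of points; other normalizations exist in the literature, and the form used in this study may differ. The function names are illustrative.

```python
import numpy as np


def _min_distances(A, B):
    """For each row of A, the Euclidean distance to the closest row of B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def generational_distance(obtained, true_front):
    """GD: distances from obtained solutions to the true Pareto front."""
    d = _min_distances(obtained, true_front)
    return float(np.sqrt(np.sum(d ** 2)) / d.size)


def inverted_generational_distance(obtained, true_front):
    """IGD: distances from true Pareto solutions to the obtained set."""
    d = _min_distances(true_front, obtained)
    return float(np.sqrt(np.sum(d ** 2)) / d.size)
```

Because the IGD is measured from the true front, it penalizes both poor convergence and poor coverage, whereas the GD only reflects how close the obtained points lie to the front.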
Table 6 compares the proposed NCMOGWOA with the other methods via the GD and IGD metrics. The NCMOGWOA has slightly lower values, indicating better convergence. This improvement is attributed to the effective leader selection strategy, which involves choosing wolves
,
, and
from the current Pareto front, optimizing the process by leveraging the best results, and exploring diverse solutions across the objective space.
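The idea of drawing the three leaders from the current Pareto front can be sketched as follows. This is a generic illustration that assumes all objectives are minimized and that leaders are drawn uniformly at random from the non-dominated set; the actual selection rule of the NCMOGWOA is not reproduced here, and the function names are hypothetical.

```python
import numpy as np


def pareto_front_mask(F):
    """Boolean mask of the non-dominated rows of F (all objectives minimized)."""
    F = np.asarray(F, dtype=float)
    mask = np.ones(F.shape[0], dtype=bool)
    for i in range(F.shape[0]):
        # Row j dominates row i if it is no worse in every objective
        # and strictly better in at least one.
        dominates_i = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        mask[i] = not dominates_i.any()
    return mask


def pick_leaders(F, rng, k=3):
    """Pick k leader indices (e.g., three wolves) from the current Pareto front.

    Uniform random selection is an assumption of this sketch. Indices repeat
    only if the front holds fewer than k points.
    """
    front = np.flatnonzero(pareto_front_mask(F))
    return rng.choice(front, size=k, replace=front.size < k)
```

For example, with objective values `[[1, 4], [2, 2], [4, 1], [3, 3], [2, 5]]` the first three rows are non-dominated, so the leaders are drawn from indices 0, 1, and 2.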
Figure 13 illustrates the Pareto solutions from different algorithms over 50 runs. Although the Pareto solution sets of the NCMOGWOA, MOPSOA, and SQP were similar when t = 10 s, the NCMOGWOA performed better on both objective functions, showing good exploration ability early in the run. Additionally,
Figure 13a,b shows that the Pareto optimal solutions converged as the simulation time increased, which was consistent with the localization error trend in
Figure 12a, where all algorithms eventually optimized the objectives, reducing the localization error to nearly zero.
6. Discussion
Most previous studies focused on system observability and maintenance. This study used the condition number of the observability matrix to numerically analyze system observability. We formulated a multi-objective, nonlinear optimization problem for UAV trajectory planning to enhance observability. The proposed NCMOGWOA can efficiently solve this problem.
The simulation demonstrated that optimizing UAV trajectories significantly improved localization system observability and target localization performance. Compared with the other algorithms, the NCMOGWOA achieved better performance in both target localization and convergence.
This study focused on a two-dimensional observability matrix. As the dimension increases, computing the condition number becomes challenging, which limits its use as a metric. Future work will explore alternative metrics to quantify system observability.