Article

The Situation Assessment of UAVs Based on an Improved Whale Optimization Bayesian Network Parameter-Learning Algorithm

1 School of Automation, Northwestern Polytechnical University, Xi'an 710129, China
2 The Institute of Xi'an Aerospace Solid Propulsion Technology, Xi'an 710025, China
* Author to whom correspondence should be addressed.
Drones 2023, 7(11), 655; https://doi.org/10.3390/drones7110655
Submission received: 6 October 2023 / Revised: 28 October 2023 / Accepted: 30 October 2023 / Published: 1 November 2023

Abstract

To realize unmanned aerial vehicle (UAV) situation assessment, a Bayesian network (BN) for situation assessment is established. To address the difficulty of obtaining the BN parameters, an improved whale optimization algorithm based on parameter prior intervals (IWOA-PPI) for parameter learning is proposed. Firstly, according to the dependencies between the situation and its related factors, the structure of the BN is established. Secondly, in order to fully mine the prior knowledge of the parameters, the parameter constraints are transformed into parameter prior intervals using Monte Carlo sampling and interval transformation formulas. Thirdly, a variable encircling factor and a nonlinear convergence factor are proposed; the former enhances the local search capability of the whale optimization algorithm (WOA), and the latter enhances its global search capability. Finally, a simulated annealing strategy incorporating Levy flight is introduced to enable the WOA to jump out of local optima. In the experiment for the standard BNs, five parameter-learning algorithms are applied, and the results show that the IWOA-PPI is not only effective but also the most accurate. In the experiment for the situation BN, the situations of the assumed mission scenario are evaluated, and the results show that the proposed situation assessment method is correct and feasible.

1. Introduction

The modern working environments for UAVs are extremely complex and contain a large number and wide variety of entities, which brings difficulties and challenges for UAV situation assessment. Situation assessment is the process of perceiving the attributes, states, and behaviors of entities in an environment, understanding the environment's information, inferring the entities' intentions, and finally predicting the entities' future short-term actions [1]. Intention inference is the core of situation assessment, and the mission is the expression of the intention; therefore, the situation is embodied as the entities' mission in this article. The general methods of situation assessment include the analytic hierarchy process (AHP) [2,3], the technique for order preference by similarity to ideal solution (TOPSIS) [4,5], neural networks [6,7], fuzzy logic [8,9], Bayesian networks [10,11], etc.
All the above methods, except for Bayesian networks, achieve situation assessment to a certain degree, but each has shortcomings. The AHP relies too heavily on subjective human experience, and inaccurate experience leads to large deviations in the assessment results. The TOPSIS cannot handle single-target situation assessment, and in the multi-target case the distance from the optimal solution to the positive ideal solution may be close to its distance to the negative ideal solution, which can produce the opposite assessment result. Neural networks have black-box characteristics that make it hard to determine exactly how the output changes when certain inputs change. For fuzzy logic, it is difficult to determine the logic operations and inference methods. In addition, both neural networks and fuzzy logic produce only a single output for given inputs and cannot provide alternative feasible solutions.
A Bayesian network (BN) is a probabilistic graphical model that combines the abilities to express and infer uncertain knowledge and is capable of handling multivariate information. A BN consists of a structure and parameters [12]. The former abstracts the dependencies between the situation and its associated elements into a visual network, and the latter encodes the degree of association between the elements as a conditional probability table (CPT). The construction process of a BN is consistent with human cognitive habits regarding situations, which makes it easier for commanders to reason with and apply. Compared to the AHP, thanks to structure and parameter learning, BNs alleviate the influence of subjective human experience. In comparison with the TOPSIS, BNs handle both single-target and multi-target situation assessment. Compared with neural networks and fuzzy logic, BNs not only have a clear structure with white-box characteristics but also follow mature Bayesian criteria in probability calculation and inference. In addition, BNs output the probabilities of the different states of multiple nodes and thus provide decision-makers with multiple choices.
Despite the many advantages of BNs, the network parameters are sometimes difficult or even impossible to obtain. Combined with expert knowledge and sample data, the parameters are acquired via parameter learning. The main parameter-learning methods include constrained optimization methods and Bayesian estimation methods.
The constrained optimization methods represent expert knowledge as constraints and then treat parameter learning as an optimization problem. Niculescu [13], Campos [14], and Hou [15] transform parameter learning into a constrained optimization problem: they use a logarithmic likelihood or entropy function as the objective and perform convex optimization in the feasible domain limited by the constraints. Altendorf [16] and Liao [17] transform parameter learning into an unconstrained optimization problem: they construct a penalty function from the constraints and add it to the logarithmic likelihood function to obtain the augmented objective function to be optimized.
The Bayesian estimation methods transform expert knowledge into ranges of parameter values, calculate the hyper-parameters of the prior distribution of the parameters within these ranges, and finally combine the hyper-parameters and the sample data in the Bayesian maximum a posteriori (MAP) formula to obtain the BN parameters. Ren [18] and Di [19] assume that the parameters obey a uniform distribution in the feasible domain, and Chai [20] proposes to represent approximate equality constraints with a normal distribution; they use Beta distributions to approximate the uniform and normal distributions, respectively. Since the Beta distribution is the marginal distribution of the Dirichlet distribution, the hyper-parameters of the Dirichlet distribution are derived from the Beta distribution parameters, and the BN parameters are then obtained using the MAP formula. Gao [21] proposes a constrained Bayesian estimation (CBE) algorithm that enhances learning accuracy by introducing expert criteria. Di [22] proposes a constrained adjusted MAP (CaMAP) algorithm that chooses a reasonable equivalent sample size. The qualitative maximum a posteriori estimation (QMAP) algorithm proposed by Chang [23] performs Monte Carlo (MC) sampling on the feasible domain of the parameters determined by the constraints; the resulting pseudo prior counts are functionally equivalent to the hyper-parameters of the Dirichlet distribution, and the parameters are then computed using the MAP formula. Guo [24] improves the QMAP algorithm and proposes the further constrained QMAP (FC-QMAP) algorithm.
The above two classes of methods have the following defects. Firstly, the learning results of the convex optimization algorithms in [13,14,15] are often located on the boundary of the feasible domain of the parameters [25], which degrades inequality constraints into equality constraints and indicates a failure to fully utilize expert knowledge; moreover, directly optimizing within the region limited by ordinary constraints often fails to fully exploit the prior knowledge contained in the constraints. Secondly, the penalty function method in [16,17] needs a specific penalty function for each type of constraint, and the penalty factor is set manually, which is sometimes inaccurate. Thirdly, for some sample data, the learning results of QMAP [23] may violate some parameter constraints, which means the sample data are not fully exploited. In addition, the method in [19] only applies to monotonic constraints, and the method in [20] only applies to approximate equality constraints. In [21], the prior distribution is set to the BDeu prior rather than a transferred prior, which would be more meaningful. When no parameter constraints are available, the CaMAP [22] is inferior to the MAP. The FC-QMAP algorithm [24] only outperforms QMAP on small datasets.
To overcome the shortcomings of the above algorithms, an improved whale optimization algorithm based on parameter prior intervals (IWOA-PPI) for parameter learning is proposed in this article, and the parameters learned by the IWOA-PPI are substituted into the situation assessment BN to evaluate the situation. In this algorithm, expert knowledge is transformed into qualitative parameter constraints; parameter prior intervals (PPIs) are then calculated from the qualitative constraints, and finally the improved WOA searches within the PPIs to obtain the optimal parameters. By integrating the advantages of the constrained optimization methods and the Bayesian estimation methods, the new algorithm can not only mine the information in the expert knowledge as thoroughly as possible but also fully utilize the sample data. The main contributions of this article are as follows: (1) The concept of PPIs is proposed. The PPIs not only contain the prior knowledge of the parameters but also narrow the feasible domain of the parameters, thus improving the search accuracy of the subsequent IWOA. (2) A new variable encircling factor for the WOA is proposed. The variable encircling factor appropriately shrinks the local search area as the algorithm runs, enhancing the local search ability of the WOA.
The rest of the article is organized as follows: Section 2 introduces the preliminary knowledge, including BN parameter learning, parameter constraints, and the WOA. Section 3 establishes the structure of the BN for situation assessment. Section 4 proposes the improved whale optimization algorithm based on parameter prior intervals. Section 5 designs the simulation experiments. The parameters of four BNs are learned by five algorithms in the experiment for standard BNs, and the results prove that the IWOA-PPI proposed in this article is of the highest accuracy. The situation of the assumed mission scenario is evaluated in the experiment for the situation assessment BN, and the results verify that the BN established in this article can correctly infer the target intention. Finally, conclusions are drawn in Section 6.

2. Preliminaries

2.1. BN Parameter Learning

A BN is a directed acyclic graph with probabilities, generally denoted as $B = \langle G, \Theta \rangle$. $G = \langle V, E \rangle$ is the graph structure, where the set of nodes $V$ denotes the set of random variables and the set of directed arcs $E$ denotes the dependencies between the random variables. $\Theta$ is the set of network parameters denoting the conditional probability distributions between the states of the nodes; $\Theta$ is also denoted as $P(X_i \mid pa(X_i))$, where $pa(X_i)$ denotes the set of parents of node $X_i$. When its parents are given, node $X_i$ is conditionally independent of its non-descendant nodes. According to the Markov condition, the joint probability distribution of a BN can be represented as in Equation (1):
$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid pa(X_i)) \qquad (1)$$
When the structure $G$ is known, the parameter learning of a BN refers to the estimation of the true parameters of the BN from a given sample dataset $D$ according to a certain criterion. The BN parameter $\theta_{ijk}$ is defined in Equation (2):
$$\theta_{ijk} = P(X_i = k \mid pa(X_i) = j), \quad 1 \le i \le n, \; 1 \le j \le q_i, \; 1 \le k \le r_i \qquad (2)$$
where $i$ denotes the index of node $X_i$, $j$ denotes the configuration of $pa(X_i)$, and $k$ denotes the state of $X_i$.
There are two basic methods for BN parameter learning: maximum likelihood estimation (MLE) and Bayesian estimation. MLE is usually used for a sample set with a sufficient sample size and no missing data. The logarithmic likelihood function for BNs is defined in Equation (3):
$$L(\theta \mid D) = \log P(D \mid \theta) = \sum_{i=1}^{n} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} N_{ijk} \log \theta_{ijk} \qquad (3)$$
where $N_{ijk}$ denotes the number of samples that satisfy $X_i = k$ and $pa(X_i) = j$. The $\theta_{ijk}$ that maximizes $L(\theta \mid D)$ is the MLE, shown in Equation (4):
$$\theta_{ijk} = \frac{N_{ijk}}{\sum_{k=1}^{r_i} N_{ijk}} = \frac{N_{ijk}}{N_{ij}} \qquad (4)$$
where $N_{ij}$ denotes the number of samples that satisfy $pa(X_i) = j$.
MAP is usually used for a sample set with an insufficient sample size and no missing data. Assume that the prior distribution $P(\theta)$ of the BN parameters satisfies the Dirichlet distribution $D(\alpha_{ij1}, \alpha_{ij2}, \ldots, \alpha_{ijr_i})$. Since the Dirichlet distribution is the conjugate family of the multinomial distribution, the posterior distribution $P(\theta \mid D)$ also satisfies a Dirichlet distribution, which is expressed in Equation (5):
$$P(\theta \mid D) \propto \prod_{i=1}^{n} \prod_{j=1}^{q_i} \prod_{k=1}^{r_i} \theta_{ijk}^{N_{ijk} + \alpha_{ijk} - 1} \qquad (5)$$
At this point, the logarithmic posterior function for the BN is defined in Equation (6):
$$\log P(\theta \mid D) = \log P(\theta) P(D \mid \theta) + c \qquad (6)$$
where $c$ is a constant. The $\theta_{ijk}$ that maximizes $P(\theta \mid D)$ is the MAP estimate, shown in Equation (7):
$$\theta_{ijk} = \frac{N_{ijk} + \alpha_{ijk}}{\sum_{k=1}^{r_i} \left( N_{ijk} + \alpha_{ijk} \right)} \qquad (7)$$
where $\alpha_{ijk}$ is the hyperparameter of the Dirichlet distribution.
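As a minimal illustration of Equations (4) and (7), the sketch below computes the MLE and MAP estimates of a single conditional distribution; the counts $N_{ijk}$ and hyper-parameters $\alpha_{ijk}$ used here are hypothetical.

```python
import numpy as np

def mle(counts):
    """Maximum likelihood estimate of one conditional distribution theta_ij.,
    i.e., N_ijk / N_ij as in Equation (4); counts is the vector of N_ijk."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

def map_estimate(counts, alphas):
    """Maximum a posteriori estimate with Dirichlet hyper-parameters alpha_ijk,
    i.e., (N_ijk + alpha_ijk) / sum_k(N_ijk + alpha_ijk) as in Equation (7)."""
    counts = np.asarray(counts, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    return (counts + alphas) / (counts + alphas).sum()

# Hypothetical counts for one parent configuration of a 3-state node.
N_ijk = [6, 3, 1]
print(mle(N_ijk))                      # [0.6, 0.3, 0.1]
print(map_estimate(N_ijk, [2, 2, 2]))  # pulled towards uniform by the prior
```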

2.2. Parameter Constraints

Parameter constraints refer to the constraint relationships between the conditional probabilities of a BN given by domain experts. There are generally five types of parameter constraints.
1. Axiomatic constraint.
This constraint defines a special sum-of-parameters relationship and is given in Equation (8): the sum of the probabilities of the various states of a child node under a fixed parent combination is 1.
$$\sum_{k=1}^{r_i} \theta_{ijk} = 1, \quad 0 \le \theta_{ijk} \le 1, \; \forall i, j, k \qquad (8)$$
2. Range constraint.
This constraint defines the upper and lower bounds of a parameter and is given in Equation (9):
$$0 \le \alpha_{ijk} \le \theta_{ijk} \le \beta_{ijk} \le 1 \qquad (9)$$
where $\alpha_{ijk}$ and $\beta_{ijk}$ are the lower and upper bounds of the parameter.
3. Approximate equality constraint.
This constraint defines the approximate equality relationship between parameters and is given in Equation (10):
$$\theta_{ijk} \approx \theta_{i'j'k'}, \quad ijk \ne i'j'k' \qquad (10)$$
Equation (10) is not easy to apply, so the form of Equation (11) is usually adopted:
$$\left| \theta_{ijk} - \theta_{i'j'k'} \right| < \varepsilon, \quad i \ne i', \; j \ne j', \; k \ne k' \qquad (11)$$
where $\varepsilon$ is a very small positive rational number.
4. Inequality constraint.
This constraint defines the inequality relationships between parameters and includes the three types given in Equations (12)-(14):
• Intra-distribution constraint:
$$\theta_{ijk} \le \theta_{ijk'}, \quad k \ne k' \qquad (12)$$
• Cross-distribution constraint:
$$\theta_{ijk} \le \theta_{ij'k}, \quad j \ne j' \qquad (13)$$
• Inter-distribution constraint:
$$\theta_{ijk} \le \theta_{i'j'k'}, \quad i \ne i', \; j \ne j', \; k \ne k' \qquad (14)$$
5. Synergy constraint.
This constraint defines the inequality relationship between the sum or product of multiple parameters and includes the two types given in Equations (15) and (16):
• Additive synergy constraint:
$$\theta_{ij_1k} + \theta_{ij_2k} \le \theta_{ij_3k} + \theta_{ij_4k}, \quad j_1 \ne j_2 \ne j_3 \ne j_4 \qquad (15)$$
• Product synergy constraint:
$$\theta_{ij_1k}\,\theta_{ij_2k} \le \theta_{ij_3k}\,\theta_{ij_4k}, \quad j_1 \ne j_2 \ne j_3 \ne j_4 \qquad (16)$$
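As an illustration, the sketch below expresses the range, approximate equality, and intra-distribution constraints of Equations (9), (11), and (12) as simple predicate checks on one conditional distribution; the bounds, tolerance, and indices are hypothetical examples.

```python
import numpy as np

def satisfies_range(theta, k, lo, hi):
    """Range constraint (Equation (9)): lo <= theta[k] <= hi."""
    return lo <= theta[k] <= hi

def satisfies_approx_equal(theta_a, theta_b, eps=0.05):
    """Approximate equality constraint (Equation (11)): |theta_a - theta_b| < eps."""
    return abs(theta_a - theta_b) < eps

def satisfies_intra(theta, k_small, k_large):
    """Intra-distribution constraint (Equation (12)): theta[k_small] <= theta[k_large]."""
    return theta[k_small] <= theta[k_large]

theta_ij = np.array([0.7, 0.2, 0.1])   # one conditional distribution, sums to 1
print(satisfies_range(theta_ij, 0, 0.6, 1.0))            # True
print(satisfies_approx_equal(theta_ij[1], theta_ij[2]))  # False with eps = 0.05
print(satisfies_intra(theta_ij, 2, 0))                   # True
```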

2.3. Whale Optimization Algorithm

The whale optimization algorithm (WOA) is a meta-heuristic intelligent optimization algorithm proposed by Mirjalili [26], inspired by the natural phenomenon of the predatory behavior of humpback whales. This algorithm has the advantages of a high optimization accuracy and few control factors, and has been applied in many fields such as photovoltaic power generation [27], resource scheduling [28], and path planning [29]. In addition, some scholars have made improvements to the whale algorithm [30,31,32].
The WOA is divided into three parts, namely encircling prey, bubble-net attacking (exploitation phase), and the search for prey (exploration phase).
1. Encircling prey
The WOA assumes that the position of the individual that reaches the optimal solution in the current population is the prey position, and the other whales then move towards that position in order to encircle it. The mathematical model of encircling prey is expressed in Equations (17)-(20):
$$D = \left| C \cdot X^*(t) - X(t) \right| \qquad (17)$$
$$X(t+1) = X^*(t) - A \cdot D \qquad (18)$$
$$A = 2a \cdot r_1 - a \qquad (19)$$
$$C = 2 r_2 \qquad (20)$$
where $t$ is the iteration number, $X^*(t)$ is the optimal position, and $X(t)$ is the individual position. $A$ and $C$ are coefficient vectors determined by $a$, $r_1$, and $r_2$; $a$ decreases linearly from 2 to 0 over the iterations, and $r_1$ and $r_2$ are random vectors in $[0, 1]$.
2. Bubble-net attacking (exploitation phase)
Bubble-net attacking consists of two parts: the shrinking encircling mechanism and the spiral position update. When $|A| \le 1$, the former is realized through Equations (17) and (18). The mathematical model of the latter is expressed in Equations (21) and (22):
$$D' = \left| X^*(t) - X(t) \right| \qquad (21)$$
$$X(t+1) = X^*(t) + D' e^{bl} \cos(2\pi l) \qquad (22)$$
where $b$ defines the shape of the logarithmic spiral and $l$ is a random number in $[-1, 1]$.
When the whales implement bubble-net attacking, the shrinking of the encirclement and the spiral update occur simultaneously. To model this simultaneity, it is assumed that one of the two behaviors is selected with a probability of 50% each time, as shown in Equation (23):
$$X(t+1) = \begin{cases} X^*(t) - A \cdot D & \text{if } p < 0.5 \\ X^*(t) + D' e^{bl} \cos(2\pi l) & \text{if } p \ge 0.5 \end{cases} \qquad (23)$$
where $p$ is a random number in $[0, 1]$.
3. Search for prey (exploration phase)
When $|A| > 1$, the whales search randomly according to each other's positions. The mathematical model is expressed in Equations (24) and (25):
$$D = \left| C \cdot X_{rand}(t) - X(t) \right| \qquad (24)$$
$$X(t+1) = X_{rand}(t) - A \cdot D \qquad (25)$$
where $X_{rand}(t)$ is the position vector of a random individual in the current population.
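A minimal Python sketch of the WOA update rules in Equations (17)-(25) is given below; it minimizes a toy sphere function. The population size, iteration budget, spiral constant $b = 1$, and the per-dimension test on $|A|$ are illustrative assumptions, not settings used later in this article.

```python
import numpy as np

def woa(f, dim, lb, ub, n_whales=30, t_max=200):
    """Minimal WOA sketch following Equations (17)-(25); f is minimized."""
    rng = np.random.default_rng(0)
    X = rng.uniform(lb, ub, (n_whales, dim))
    best = X[np.argmin([f(x) for x in X])].copy()
    b = 1.0                                        # spiral shape constant (assumed)
    for t in range(t_max):
        a = 2 - 2 * t / t_max                      # linear convergence factor
        for i in range(n_whales):
            r1, r2 = rng.random(dim), rng.random(dim)
            A, C = 2 * a * r1 - a, 2 * r2
            p, l = rng.random(), rng.uniform(-1, 1)
            if p < 0.5:
                if np.all(np.abs(A) < 1):          # encircling prey, Eqs. (17)-(18)
                    D = np.abs(C * best - X[i])
                    X[i] = best - A * D
                else:                              # search for prey, Eqs. (24)-(25)
                    Xr = X[rng.integers(n_whales)]
                    D = np.abs(C * Xr - X[i])
                    X[i] = Xr - A * D
            else:                                  # spiral update, Eqs. (21)-(22)
                D = np.abs(best - X[i])
                X[i] = best + D * np.exp(b * l) * np.cos(2 * np.pi * l)
            X[i] = np.clip(X[i], lb, ub)
            if f(X[i]) < f(best):                  # greedy update of the best position
                best = X[i].copy()
    return best

# Toy usage: minimize the 5-dimensional sphere function.
print(woa(lambda x: np.sum(x ** 2), dim=5, lb=-10, ub=10))
```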

3. Structure Establishment of Situation Assessment BN

In this section, the influencing factors of the target intention are enumerated, and the structure of the situation assessment BN is established based on their dependencies.
When a target performs a mission, its intention is reflected to a certain extent by its attributes and state. Firstly, since it restricts the range of missions that the target can perform, the target type is an important attribute for inferring the intention. Secondly, the relative motion is the movement trend of the target relative to our UAVs, which indicates whether the target is approaching or leaving our UAVs. Finally, the relative velocity and relative height between the target and our UAVs can serve as an auxiliary basis for inferring the target intention. Therefore, the BN for intention inference is shown in Figure 1.
Similar to target intention inference, target type recognition requires obtaining target information, which generally includes the velocity, height, radar cross-section (RCS), and radar frequency band variability (RFV) of the target. Therefore, the BN for type recognition is shown in Figure 2.
The relative motion trend of the target is determined based on the distance and bearing of the target relative to our UAVs. Therefore, the BN for relative motion determination is shown in Figure 3.
By connecting the three sub-networks in Figure 1, Figure 2 and Figure 3 into a multi-layer BN, the structure of the situation assessment BN is established, as shown in Figure 4. Among them, the light-colored nodes such as velocity, height, etc., are the base nodes, and their states can be obtained via observation. The three dark-colored nodes, namely target intention, target type, and relative motion, are the hidden nodes that use base nodes as their child nodes. In conjunction with the CPTs and the states of the child nodes, the probabilities of each state of the hidden nodes are calculated and acquired. The relevant information on each node in the network is shown in Table 1.

4. IWOA-PPI for Parameter Learning

The establishment of the structure for a situation assessment BN is introduced in Section 3, and the IWOA-PPI for parameter learning is discussed in detail in this section. The parameter learning problem is treated as an unconstrained optimization problem in the PPIs, and the objective function can be acquired using Equation (3). The idea of the IWOA-PPI is to transform parameter constraints into PPIs, and then use the improved WOA to search for the optimal parameters. In this way, the algorithm has the advantages of Bayesian estimation, which fully utilizes prior information, and constrained optimization, which fully utilizes sample data. Therefore, the key to the algorithm is the PPIs and the improvements to the WOA. The former has the ability to mine the parameter prior information embedded in the expert knowledge, thereby improving the accuracy of the parameter learning. For the latter, since the original WOA has the shortcomings of a slow convergence speed, an inability to jump out of the local optima, and difficulty in reaching the global optima, it is necessary to make appropriate improvements to accelerate the search speed and enhance the global optimization ability. In this section, the establishment of the PPIs is described in Section 4.1, and the improvements to the WOA are explained in Section 4.2, Section 4.3 and Section 4.4.

4.1. Parameter Prior Interval

Expert knowledge is defined as the qualitative parameter relationships that are summarized by domain experts based on physical phenomena or objective laws. The knowledge contains prior information of the parameters and is typically represented as parameter constraints. In order to make full use of prior knowledge, the concept of PPIs is proposed.
PPIs are essentially the intervals bounded by the upper and lower bounds of the parameters, formally equivalent to the range constraint of Equation (9), but fundamentally different from general parameter constraints. Firstly, the PPI is a concise upper and lower bound form, but the parameter constraints are embodied in a variety of more complex forms. Secondly, the PPI is transformed by extracting the prior information of the parameters in the parameter constraints, so the prior knowledge can be fully utilized. Finally, compared to parameter constraints, PPIs narrow the parameters’ feasible domain to a more precise region, which accelerates the subsequent IWOA speed and improves its accuracy.
PPIs are obtained in three steps:
1. By performing MC sampling on the parameter space delimited by the parameter constraints, the parameters without sample data are obtained using Equation (26):
$$\theta_{ijk}^{MC} = P(X_i = k, pa(X_i) = j \mid \Omega) = \frac{1}{S} \sum_{l=1}^{S} P_l(X_i = k, pa(X_i) = j \mid \Omega) \qquad (26)$$
where $\Omega$ denotes the parameter constraints, $S$ denotes the number of MC samples, $P_l(X_i = k, pa(X_i) = j \mid \Omega)$ denotes the value $\theta_{ijk}^{l}$ of a single sample, and $\theta_{ijk}^{MC}$ denotes the mean value of all $\theta_{ijk}^{l}$ over the $S$ samples.
$\theta_{ijk}^{MC}$ is the parameter obtained via MC sampling without considering the data samples. Its value is close to the true parameter to some extent, and the degree of closeness is positively related to the number of constraints.
2. Using the interval transform formulas, the uncorrected PPIs are obtained. The $\theta_{ijk}^{MC}$ $(k = 1, 2, \ldots, r_i)$ are ordered from small to large, as shown in Equation (27):
$$\theta_{ijk_1}^{MC} \le \theta_{ijk_2}^{MC} \le \cdots \le \theta_{ijk_m}^{MC} \le \cdots \le \theta_{ijk_{r_i}}^{MC} \qquad (27)$$
where $k_m \in \{1, 2, \ldots, r_i\}$.
$d_{k_m}^{low}$ is defined as the lower bound interval of $\theta_{ijk_m}^{MC}$, $d_{k_m}^{up}$ is defined as the upper bound interval of $\theta_{ijk_m}^{MC}$, and $[\theta_{ijk_m}^{MC} - d_{k_m}^{low}, \theta_{ijk_m}^{MC} + d_{k_m}^{up}]$ is the uncorrected PPI of $\theta_{ijk_m}^{MC}$. The determination of $d_{k_m}^{low}$ and $d_{k_m}^{up}$ falls into two cases:
• When $\theta_{ijk_1}^{MC} < \theta_{ijk_2}^{MC} < \cdots < \theta_{ijk_m}^{MC} < \cdots < \theta_{ijk_{r_i}}^{MC}$, the interval transform formulas for $\theta_{ijk_m}^{MC}$ are implemented as Equations (28) and (29):
$$d_{k_m}^{low} = \frac{\theta_{ijk_m}^{MC} - \theta_{ijk_{m-1}}^{MC}}{\theta_{ijk_m}^{MC} / \theta_{ijk_{m-1}}^{MC}} \qquad (28)$$
$$d_{k_m}^{up} = \frac{\theta_{ijk_{m+1}}^{MC} - \theta_{ijk_m}^{MC}}{\theta_{ijk_{m+1}}^{MC} / \theta_{ijk_m}^{MC}} \qquad (29)$$
For $\theta_{ijk_1}^{MC}$, $d_{k_1}^{up}$ is calculated first, and then $d_{k_1}^{low} = d_{k_1}^{up}$. For $\theta_{ijk_{r_i}}^{MC}$, $d_{k_{r_i}}^{low}$ is calculated first, and then $d_{k_{r_i}}^{up} = d_{k_{r_i}}^{low}$.
• When $\theta_{ijk_p}^{MC} = \theta_{ijk_{p+1}}^{MC} = \cdots = \theta_{ijk_{p+q}}^{MC}$, the interval transform formula for $\theta_{ijk_m}^{MC}$ is implemented as Equation (30):
$$d_{k_m}^{low} = d_{k_m}^{up} = \omega \cdot \frac{\theta_{ijk_p}^{MC} + \theta_{ijk_{p+1}}^{MC} + \cdots + \theta_{ijk_{p+q}}^{MC}}{q} \qquad (30)$$
where $\omega$ is a weight coefficient that takes a small value.
3. Combining the parameter constraints, the PPIs are obtained. The uncorrected PPIs may violate some parameter constraints, so the intersection of the parameter constraints and the uncorrected PPIs is taken to obtain the PPIs. For example, in order to satisfy the axiomatic constraint, the lower bound of the PPI of $\theta_{ijk_1}^{MC}$ is $\max(0, \theta_{ijk_1}^{MC} - d_{k_1}^{low})$ and the upper bound of the PPI of $\theta_{ijk_{r_i}}^{MC}$ is $\min(\theta_{ijk_{r_i}}^{MC} + d_{k_{r_i}}^{up}, 1)$. A code sketch of the three steps above is given below.
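The following Python sketch follows the three steps above under simplifying assumptions: the constrained parameter space is sampled by rejection from a uniform Dirichlet distribution, only the strictly increasing case of Equations (28) and (29) is handled, and the example constraint (the probability of the first state is at least 0.5) is hypothetical.

```python
import numpy as np

def ppi_for_distribution(constraint, r, n_samples=5000, rng=None):
    """Sketch of the three PPI steps for one conditional distribution theta_ij.
    with r states; constraint(theta) is True when theta satisfies all expert
    constraints. Only the strictly increasing case of Eqs. (28)-(29) is handled."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Step 1: Monte Carlo sampling of the constrained parameter space (Eq. (26)).
    accepted = []
    while len(accepted) < n_samples:
        theta = rng.dirichlet(np.ones(r))       # uniform sample on the simplex
        if constraint(theta):
            accepted.append(theta)
    theta_mc = np.mean(accepted, axis=0)

    # Step 2: interval transform on the sorted Monte Carlo means (Eqs. (28)-(29)).
    order = np.argsort(theta_mc)
    s = theta_mc[order]
    low, up = np.zeros(r), np.zeros(r)
    for m in range(r):
        if m > 0:
            low[m] = (s[m] - s[m - 1]) / (s[m] / s[m - 1])
        if m < r - 1:
            up[m] = (s[m + 1] - s[m]) / (s[m + 1] / s[m])
    low[0], up[r - 1] = up[0], low[r - 1]       # boundary values mirror their neighbours

    # Step 3: intersect the uncorrected intervals with the axiomatic constraint [0, 1].
    lb = np.clip(s - low, 0.0, 1.0)
    ub = np.clip(s + up, 0.0, 1.0)
    ppi = np.empty((r, 2))
    ppi[order, 0], ppi[order, 1] = lb, ub       # map back to the original state order
    return theta_mc, ppi

# Hypothetical constraint: the probability of the first state is at least 0.5.
theta_mc, ppi = ppi_for_distribution(lambda th: th[0] >= 0.5, r=3)
print(theta_mc)   # Monte Carlo means of the three states
print(ppi)        # one [lower, upper] prior interval per state
```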

4.2. Variable Encircling Factor

To improve the encircling prey behavior, a variable encircling factor is proposed to enhance the local search ability of the WOA.
The value range of $C \cdot X^*(t)$ in Equation (17) is $[0, 2X^*(t)]$. Physically, $[0, 2X^*(t)]$ is a high-dimensional neighborhood centered on the current optimal position $X^*(t)$, with the range of each dimension being $[0, 2X_i^*(t)]$; this neighborhood is a rectangle in two dimensions, a cuboid in three dimensions, and a hyper-cuboid in higher dimensions. The encircling prey behavior of Equations (17) and (18) randomly selects a point $C \cdot X^*(t)$ within the neighborhood $[0, 2X^*(t)]$, and then every whale individual shrinks and moves from its current position $X(t)$ towards $C \cdot X^*(t)$.
$C$ is defined as the encircling factor, and its value range determines the size of $[0, 2X_i^*(t)]$. In the original WOA, every dimensional variable of $r_2$ in Equation (20) obeys the uniform distribution $U(0, 1)$; i.e., every dimensional variable of $C$ obeys the uniform distribution $U(0, 2)$. As the algorithm runs, $X^*(t)$ gradually approaches the optimal position, but because the neighborhood $[0, 2X^*(t)]$ remains too large, a large number of local searches are wasted. Therefore, a variable encircling factor $C^V$ obeying a normal distribution is proposed in Equation (31):
$$p(C_i^V) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(C_i^V - 1)^2}{2\sigma^2} \right) \qquad (31)$$
where $C_i^V$ is a one-dimensional variable of $C^V$.
The variability of the variable encircling factor $C^V$ is reflected in $\sigma$. Firstly, $\sigma$ is very large at the beginning of the algorithm; at this moment, the distribution of $C_i^V$ degenerates from the normal distribution $N(1, \sigma^2)$ to the uniform distribution $U(0, 2)$, i.e., towards the original WOA. Secondly, after a certain number of iterations, $3\sigma$ is set equal to 1, so $C_i^V$ obeys a normal distribution with $\sigma = 1/3$ and the $3\sigma$ region of $C_i^V$ is $[0, 2]$, which significantly increases the probability of $C \cdot X^*(t)$ approaching $X^*(t)$. Finally, $3\sigma$ decreases gradually to 0.1 in the later stage of the algorithm, and the $3\sigma$ region of $C_i^V$ becomes $[0.9, 1.1]$ in the end. This shrinking of $\sigma$ narrows the local search area and accelerates the local search.
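The following sketch shows one way the variable encircling factor $C^V$ might be sampled as the iterations proceed. The article fixes only the end points of the schedule (an early near-uniform phase, $3\sigma = 1$ after a certain number of iterations, and $3\sigma = 0.1$ at the end), so the switching point at $0.2\,t_{max}$ and the linear decay used here are assumptions.

```python
import numpy as np

def variable_encircling_factor(t, t_max, dim, rng):
    """Sketch of the variable encircling factor C^V of Equation (31).
    Assumed schedule: an early phase behaving like the original uniform U(0, 2)
    factor, then a normal N(1, sigma^2) whose 3*sigma band shrinks from 1 to 0.1."""
    if t < 0.2 * t_max:                       # assumed length of the early phase
        return rng.uniform(0.0, 2.0, dim)
    frac = (t - 0.2 * t_max) / (0.8 * t_max)  # 0 -> 1 over the remaining iterations
    three_sigma = 1.0 - 0.9 * frac            # 3*sigma: 1 -> 0.1
    return rng.normal(1.0, three_sigma / 3.0, dim)

rng = np.random.default_rng(0)
print(variable_encircling_factor(50, 500, 3, rng))   # wide spread around 1
print(variable_encircling_factor(499, 500, 3, rng))  # tightly clustered around 1
```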

4.3. Nonlinear Convergence Factor

The WOA performs a global search in the exploration phase and a local search in the exploitation phase, and it switches between the two phases based on the control coefficient $A$, which is determined by $a$. As $a$ decreases linearly, both global and local searches are performed in the early stage ($a > 1$), and only the local search is performed in the later stage ($a \le 1$). However, for certain optimization problems, the linear convergence factor $a$ makes the algorithm enter the later phase so early that a sufficient global search has not yet been performed, and consequently the global optimal solution cannot be found. To overcome this problem, a nonlinear convergence factor $a_{nl}$ is proposed in Equation (32):
$$a_{nl} = 2\cos\left( \frac{\pi}{2} \cdot \frac{t}{t_{max}} \right) \qquad (32)$$
where $t$ is the iteration number and $t_{max}$ is the maximum number of iterations.
The curves of $a$ and $a_{nl}$ over the iterations are shown in Figure 5, from which $a_{nl}$ exhibits two characteristics. Firstly, in terms of the number of iterations, $a > 1$ for 250 iterations, whereas $a_{nl} > 1$ for about 330 iterations; that is, $a_{nl}$ increases the number of iterations in which $|A_{nl}| > 1$ can occur. Secondly, in terms of value, whenever $a > 1$ and $a_{nl} > 1$, $a_{nl}$ is always greater than $a$ except at the first iteration, where $a = a_{nl}$; that is, $|A_{nl}|$ exceeds 1 more easily than $|A|$. These two characteristics mean that $a_{nl}$ increases the probability of a global search. In addition, the decrease of $a_{nl}$ is gentle in the early stage and sharp in the later stage, so the algorithm performs more global searches early on and faster local searches later. Hence, $a_{nl}$ better balances the global and local searches, and the performance of the WOA is improved.
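The short snippet below reproduces the comparison described for Figure 5, assuming $t_{max} = 500$ (consistent with the reported 250 iterations with $a > 1$ and roughly 330 with $a_{nl} > 1$).

```python
import numpy as np

t_max = 500                                   # assumed from the counts reported for Figure 5
t = np.arange(t_max)
a_lin = 2 * (1 - t / t_max)                   # original linear convergence factor
a_nl = 2 * np.cos(np.pi / 2 * t / t_max)      # nonlinear factor of Equation (32)

print(int(np.sum(a_lin > 1)))  # 250 iterations with a > 1
print(int(np.sum(a_nl > 1)))   # 334 iterations with a_nl > 1, matching "about 330" in the text
```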

4.4. Simulated Annealing Strategy Incorporating Levy Flight

Simulated annealing [33] has the ability to jump out of local optima through the Metropolis criterion, and Levy flight [34] has the ability to enhance the randomness of the search. Therefore, a simulated annealing strategy incorporating Levy flight is proposed. This strategy utilizes Levy flight for position updating and the Metropolis criterion to jump out of local optima, thereby enhancing the global search ability of the WOA.
Levy flight refers to a random walk whose step sizes follow a heavy-tailed distribution. Because of its alternation between short-distance searches and occasional long-distance jumps, this random walk is used to simulate animal foraging in nature. The Levy distribution is usually realized with the Mantegna algorithm, and the step length $s$ is defined in Equation (33):
$$s = \frac{u}{|v|^{1/\beta}} \qquad (33)$$
where $Levy(\beta) \sim s^{-1-\beta}$, $0 < \beta \le 2$, and $\beta$ is usually set to 1.5. $u \sim N(0, \sigma_u^2)$ and $v \sim N(0, 1)$, where $\sigma_u$ is defined in Equation (34):
$$\sigma_u = \left[ \frac{\Gamma(1+\beta)\sin(\pi\beta/2)}{\Gamma\left(\frac{1+\beta}{2}\right)\beta\, 2^{(\beta-1)/2}} \right]^{1/\beta} \qquad (34)$$
The position update of the Levy flight is shown in Equation (35):
$$X_i(t+1) = X_i(t) + \alpha \cdot Levy(\beta) \qquad (35)$$
where $\alpha$ is the step size control factor and $Levy(\beta)$ is the step length of the Levy flight.
Simulated annealing is derived from a simulation of the solid annealing and cooling process. Its core idea is to accept inferior solutions with a certain probability via the Metropolis function, thereby making it possible to jump out of local optima. The Metropolis function is defined in Equation (36):
$$p = \begin{cases} 1 & \Delta f > 0 \\ \exp(\Delta f / T) & \Delta f \le 0 \end{cases} \qquad (36)$$
where $p$ is the acceptance probability, $\Delta f$ is the increment of the objective function, and $T \in [T_{end}, T_0]$ is the current temperature, with $T_0$ and $T_{end}$ being the initial and final temperatures, respectively.
Assume that population 1 is obtained via one iteration of the WOA and that population 2 is obtained by applying Equation (35) to update the positions of the individuals in population 1. The fitness of the individuals in the two populations is then compared: if an individual in population 2 has better fitness than the corresponding individual in population 1, it replaces that individual; otherwise, it replaces the corresponding individual with the probability $p$ of Equation (36). The population obtained through the Metropolis function is population 3, which serves as the initial population for the next iteration of the WOA.
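The sketch below implements the Mantegna step of Equations (33) and (34), the position update of Equation (35), and the Metropolis test of Equation (36). The step-size control factor $\alpha = 0.01$ and the example vectors are assumptions for illustration.

```python
import numpy as np
from math import gamma, sin, pi

def levy_step(dim, beta=1.5, rng=None):
    """Levy-distributed step via the Mantegna algorithm (Equations (33) and (34))."""
    if rng is None:
        rng = np.random.default_rng()
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2) /
               (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def metropolis_accept(delta_f, T, rng):
    """Metropolis criterion of Equation (36): always accept an improvement,
    accept a worse solution with probability exp(delta_f / T)."""
    return delta_f > 0 or rng.random() <= np.exp(delta_f / T)

rng = np.random.default_rng(0)
x = np.array([0.3, 0.5, 0.2])                    # a candidate parameter vector (hypothetical)
x_levy = x + 0.01 * levy_step(3, rng=rng)        # Equation (35) with alpha = 0.01 (assumed)
print(x_levy)
print(metropolis_accept(-0.4, T=1.0, rng=rng))   # True or False depending on the draw
```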
In summary, the pseudo-code of the IWOA-PPI is shown in Algorithm 1.
Algorithm 1. Pseudo-code of the IWOA-PPI
Initialize the whale population X_l (l = 1, 2, ..., m) in the parameter prior intervals
Calculate the fitness of each search agent
T = T_0, X* = the best search agent
while (t < maximum number of iterations)
    for each search agent
        Update a_nl, A_nl, C^V, l, and p
        if (p < 0.5)
            if (|A_nl| < 1)
                Update the position X_l of the current search agent by Equation (18)
            else if (|A_nl| >= 1)
                Select a random search agent (X_rand)
                Update the position X_l of the current search agent by Equation (25)
            end if
        else if (p >= 0.5)
            Update the position X_l of the current search agent by Equation (22)
        end if
        Update the position X_l^Levy of the current search agent by Equation (35)
        Check whether X_l or X_l^Levy goes beyond the search space and amend it
        Calculate the fitness of X_l and X_l^Levy
        Δf = f(X_l^Levy) − f(X_l)
        if (Δf > 0) or (random(0, 1) <= p(Δf, T))
            X_l ← X_l^Levy
        end if
    end for
    Update X* if there is a better solution
    Update T
    t = t + 1
end while
return X*

5. Experiment and Discussion

In the experiment for the standard BNs, five algorithms, namely MLE, MAP, QMAP, the WOA, and the IWOA-PPI proposed in this article, are used to learn the parameters of four standard BNs. In the experiment for the situation assessment BN, the parameters of the situation assessment BN in Figure 4 are learned by the IWOA-PPI, and the learned parameters are then used to evaluate the targets' situations.
Both experiments are simulated in MATLAB 2018b. The construction and inference of the BNs, the collection of the sample data, and the MLE and MAP algorithms are implemented with the BNT toolbox. The QMAP, WOA, and IWOA-PPI are validated with the assistance of the BNT toolbox.

5.1. Experiment for the Standard BNs

The KL divergence [35] is used to measure the accuracy of the parameter-learning results and is defined in Equation (37):
$$KL(\theta \,\|\, \hat{\theta}) = \frac{1}{\sum_{i=1}^{n} r_i q_i} \sum_{i=1}^{n} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} \theta_{ijk} \log \frac{\theta_{ijk}}{\hat{\theta}_{ijk}} \qquad (37)$$
where $\theta_{ijk}$ denotes the true parameter and $\hat{\theta}_{ijk}$ denotes the learned parameter. The smaller the KL divergence is, the more accurate the learned parameters are.
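A small sketch of Equation (37) is given below; the dictionary layout of the parameters and the toy distributions are hypothetical.

```python
import numpy as np

def kl_divergence(theta_true, theta_learned):
    """Averaged KL divergence of Equation (37): the sum of theta*log(theta/theta_hat)
    over all parameters, divided by the total number of parameters."""
    total, n_params = 0.0, 0
    for key, p in theta_true.items():
        p = np.asarray(p, dtype=float)
        q = np.asarray(theta_learned[key], dtype=float)
        total += np.sum(p * np.log(p / q))
        n_params += p.size
    return total / n_params

# Toy example: two conditional distributions (two parent configurations of one node).
theta_true = {(1, 1): [0.7, 0.3], (1, 2): [0.4, 0.6]}
theta_hat  = {(1, 1): [0.6, 0.4], (1, 2): [0.5, 0.5]}
print(kl_divergence(theta_true, theta_hat))
```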
To verify the universality and effectiveness of the IWOA-PPI, four different sizes of BNs are adopted as learning objects, and the basic information of the four BNs is shown in Table 2. According to the number of nodes, arcs, and parameters, the standard BNs are categorized into four scales, namely small, medium, large, and very large.
The sample sizes are set to 40, 80, 120, 160, and 200, and the training samples are generated with the BNT toolbox. Except for synergy constraints, all kinds of constraints in Section 2 are considered and randomly selected. For example, the structure and CPTs of the Asia BN are shown in Figure 6.
According to the CPTs of the Asia BN, three typical constraints are as follows:
• Range constraint. For example, $0.1 \le P(\mathrm{tub}=0 \mid \mathrm{asia}=0) \le 0.8$;
• Approximate equality constraint. For example, $P(\mathrm{either}=0 \mid \mathrm{tub}=0, \mathrm{lung}=0) \approx P(\mathrm{either}=1 \mid \mathrm{tub}=0, \mathrm{lung}=1)$;
• Inequality constraint. For example, $P(\mathrm{bronc}=0 \mid \mathrm{smoke}=0) > P(\mathrm{bronc}=1 \mid \mathrm{smoke}=1)$.
After the samples and constraints are prepared, the five algorithms are used to learn the parameters of each network 20 times at each sample size, and the expectation and variance of the KL divergences are calculated. The KL divergences are reported as expectation ± standard deviation in Table 3. The comparisons of the KL divergences of the five algorithms are shown in Figure 7, Figure 8, Figure 9 and Figure 10; the horizontal axis represents the number of samples, and the vertical axis represents the value of the KL divergence. Because of the large differences in the KL divergences of the various algorithms, there are four graphs for each network: Graph (a) shows the KL divergences of all five algorithms, Graph (b) shows those of MLE, Graph (c) shows those of MAP and the WOA, and Graph (d) shows those of QMAP and the IWOA-PPI.
From Table 3 and Figure 7, Figure 8, Figure 9 and Figure 10, the following is clear:
  • The accuracy ranking of the five algorithms is IWOA-PPI > QMAP > WOA > MAP > MLE. There are roughly three levels of accuracy, the lowest level for MLE, the medium level for the WOA and MAP, and the highest level for the IWOA-PPI and QMAP.
  • For all networks and sample sizes, the IWOA-PPI proposed in this article has the smallest KL divergence among the five parameter-learning algorithms, which means that the learning results are the most accurate. The highest accuracy of the IWOA-PPI indicates that, in contrast to the QMAP and WOA, the IWOA-PPI absorbs the advantages of the Bayesian estimation and constrained optimization methods, and fully extracts information from both parameter constraints and sample data.
  • Comparing the WOA and the IWOA-PPI, the KL divergence of the former is about three times that of the latter. This indicates that the improvements in Section 4 significantly enhance the optimization ability of the WOA.

5.2. Experiment for the Situation Assessment BN

The experiment for the situation assessment BN includes two parts:
  • Use the IWOA-PPI to learn the parameters of the situation assessment BN in Figure 4;
  • For the assumed mission scenario, substitute the learned parameters into the situation assessment BN, and use this BN to evaluate the operational intentions of the opposing targets, i.e., the opposing situation.

5.2.1. Parameter Learning of the Situation Assessment BN

The parameter-learning process of the situation assessment BN is the same as that of the standard BNs. Using the function “sample_bnet” of the BNT toolbox, the samples are generated in the same way. However, the constraints of the situation assessment BN cannot be randomly generated because of the realistic physical meanings. The corresponding relationship between the parameters and the CPTs of nodes is introduced first, and then some examples of the constraints are listed.
The true parameters of the situation assessment BN are shown in Table 4, which displays the CPTs of the intention, target type, and relative motion nodes. Taking the target type node as an example, when the intention is patrol, the probability that the target type is AEW is 0.6, and this probability is denoted as the parameter $\theta_{211} = 0.6$. According to Table 1, the number of the target type node is 2. Mapping the state notation $\{p, r, j, a\}$ of the intention to $\{1, 2, 3, 4\}$ and the state notation $\{a, r, e, f\}$ of the target type to $\{1, 2, 3, 4\}$, the whole CPT between the target type and the intention is denoted by the parameters $\theta_{2jk}$, where $j, k \in \{1, 2, 3, 4\}$.
After discussing the corresponding relation between the CPT and the parameters, some typical constraints of the situation assessment BN are shown below with the explanation of their physical meanings.
• Range constraint.
$\theta_{233} = P(\text{target type} = 3 \mid \text{intention} = 3)$, $0.6 \le \theta_{233} \le 1$.
When the intention is jamming and the target type is EJA, the range of the conditional probability $P(\text{target type} \mid \text{intention})$ is [0.6, 1]. This indicates that if the target is performing a jamming mission, it has at least a 60% chance of being an EJA; since a probability cannot exceed one, the range is [0.6, 1].
• Approximate equality constraint.
$\theta_{411} = P(\text{relative motion} = 1 \mid \text{intention} = 1)$, $\theta_{412} = P(\text{relative motion} = 2 \mid \text{intention} = 1)$, $\theta_{411} \approx \theta_{412}$.
The conditional probability when the intention is patrol and the relative motion is approach is approximately equal to that when the intention is patrol and the relative motion is leave. When performing a patrol mission, in order to ensure comprehensive air surveillance of the defense focus, the AEW takes the defense focus as the center of a circle and flies around it at a certain patrol radius and patrol speed. Since the target flies in a circle within a fixed area, its approach and departure have no particular influence on our UAVs. Therefore, when the intention is patrol, the probabilities of approach and leave are approximately equal.
• Inequality constraint.
$\theta_{721} = P(\text{height} = 1 \mid \text{target type} = 2)$, $\theta_{722} = P(\text{height} = 2 \mid \text{target type} = 2)$, $\theta_{723} = P(\text{height} = 3 \mid \text{target type} = 2)$, $\theta_{721} < \theta_{723}$, $\theta_{722} < \theta_{723}$.
The conditional probability when the target type is RP and the height is low altitude is smaller than that when the target type is RP and the height is high altitude, and the same holds for medium altitude. Because the RP is usually at high altitude when conducting reconnaissance, the conditional probability $P(\text{height} = 3 \mid \text{target type} = 2)$ is the highest.
With the structure of the situation assessment BN, the samples, and the constraints, the parameters learned by the IWOA-PPI are shown in Table 5. The KL divergence between the true parameters and the learned parameters is 0.1123, which indicates that the learned parameters are close enough to the true ones to be used in their place for situation assessment.

5.2.2. Result of Situation Assessment

To assess the situations of the targets, three steps are required. Firstly, an assumed mission scenario is constructed. Secondly, the observed evidence of the targets is collected. Finally, the situation assessment BN is applied to every target to acquire the results.
1. The description of the assumed mission scenario.
The existing entities in the environment are several UAVs perceiving the situation and a ground radar, both belonging to our side, and an AEW, an RP, an EJA, and a fighter belonging to the opposing side.
The assumed missions of the opposing targets are as follows: at the beginning, the AEW is patrolling with a fighter escort, the RP is conducting reconnaissance, and the EJA has no clear mission. After a while, the ground radar is discovered by the RP. The EJA then starts to fly towards the radar and implement electronic jamming, and the fighter stops escorting the AEW and assaults the radar after the electronic jamming takes effect.
2. The acquisition of the observed evidence.
The observed evidence is necessary for situation assessment and is derived from the attributes and states of the targets. Since the attributes and states are continuous variables while the evidence, expressed as probabilities over discrete states, is discrete, fuzzy discretization is used to acquire the evidence. Taking the height node as an example, the process of acquiring the observed evidence is as follows.
  • The construction of the membership function.
    The fuzzy membership function of the height node is defined as Equation (38) and shown in Figure 11.
    μ H , 1 = 1 0 H < 5000 H / 2000 + 7 / 2 5000 H < 7000 0 other   μ H , 2 = H / 3000 5 / 3 5000 H < 8000 1 8000 H < 9000 H / 3000 + 4 9000 H < 12,000 0 other μ H , 3 = H / 4000 5 / 2 10,000 H < 14,000 1 H 14,000 0 other
    where μ H , 1 denotes the membership of low altitude, μ H , 2 denotes that of medium altitude, and μ H , 3 denotes that of high altitude.
    Figure 11. The membership function of height.
    Figure 11. The membership function of height.
    Drones 07 00655 g011
    The unit of height is meters. The green area is μ H , 1 , the red area is μ H , 2 , and the blue area is μ H , 3 . The brown area is the overlap between μ H , 1 and μ H , 2 , and the grey area is the overlap between μ H , 2 and μ H , 3 .
  • The transform from the fuzzy membership to the probability.
The probability–possibility transformation formula [36] is defined in Equation (39):
$$p_i(u) = \frac{\mu_i(u)^{1/\alpha}}{\sum_{i=1}^{n} \mu_i(u)^{1/\alpha}}, \quad 0 < \alpha < 1 \qquad (39)$$
where $u$ denotes the attribute or state, $\mu_i$ denotes the membership function, and $p_i(u)$ denotes the probability; $\alpha$ is set to 0.5 in this article.
For example, if the height of the target is 6000 m, the fuzzy memberships are [1/2, 1/3, 0] according to Equation (38), and the probabilities are [0.6923, 0.3077, 0] according to Equation (39). In this way, the evidence for the other attributes and states is acquired. The evidence at a certain moment is shown in Table 6.
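The sketch below reproduces this worked example by implementing Equations (38) and (39) directly; only the height node is covered.

```python
import numpy as np

def height_memberships(H):
    """Membership functions of Equation (38) for low, medium, and high altitude."""
    mu1 = 1.0 if H < 5000 else (-H / 2000 + 7 / 2 if H < 7000 else 0.0)
    if 5000 <= H < 8000:
        mu2 = H / 3000 - 5 / 3
    elif 8000 <= H < 9000:
        mu2 = 1.0
    elif 9000 <= H < 12000:
        mu2 = -H / 3000 + 4
    else:
        mu2 = 0.0
    mu3 = 0.0 if H < 10000 else (H / 4000 - 5 / 2 if H < 14000 else 1.0)
    return np.array([mu1, mu2, mu3])

def to_probabilities(mu, alpha=0.5):
    """Probability-possibility transformation of Equation (39)."""
    powered = mu ** (1 / alpha)
    return powered / powered.sum()

mu = height_memberships(6000)
print(mu)                    # [0.5, 0.3333..., 0.0]
print(to_probabilities(mu))  # [0.6923, 0.3077, 0.0]
```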
3. The application of the situation assessment BN.
Substituting the learned parameters and the evidence into the situation assessment BN, the results are inferred with the BNT toolbox according to the Bayesian formula. At each moment, the situation assessment BN is applied to each target, and the situations at all moments constitute the situation of the target over a period of time. The results of the situation assessment for all the targets are shown in Figure 12. The X-axis represents time, the Y-axis represents the intention, and the Z-axis represents the probability; the height of a bar gives the probability value, and the color of a bar indicates the moment (e.g., t1).
In Figure 12a, the probability of patrol is always the highest, which means that the AEW is patrolling all the time. It is discovered that the probability of the patrol before t5 is lower than that after t5, and the reason is that the AEW is approaching before t5 and leaving after t5 when the aircraft is in circular flight. The approaching behavior reduces the probability of patrolling. In Figure 12b, the probability of reconnaissance becomes the highest after t3, which means that the RP is conducting reconnaissance. In the beginning, since the RP is climbing and the heights are low and medium altitude, the true intention is not recognized. In Figure 12c, the probability of jamming becomes the highest after t3. Because the EJA does not have a clear mission in the beginning, the probabilities of patrol and jamming are close before t4. After the EJA starts to perform electronic jamming, the probability of jamming becomes the highest. In Figure 12d, the probability of patrol is the highest before t4, and the probability of assault becomes the highest starting from t4. This indicates a trend that the fighter performs the patrol at first and turns into an assault later, and the trend is the same as the fighter in the assumed mission scenario. According to Figure 12, all the situation assessment results are consistent with the description of the assumed mission scenario, and this proves that the situation assessment method proposed in this article is correct and feasible.

6. Conclusions

In this article, the situation assessment BN is established to evaluate the situation of the targets, and an improved whale optimization algorithm based on parameter prior intervals (IWOA-PPI) is proposed for parameter learning. In the IWOA-PPI, the prior knowledge embedded in the parameter constraint is maximally mined based on the PPIs, and the performance of the original WOA is enhanced by a variable encircling factor, a nonlinear convergence factor, and a simulated annealing strategy incorporating Levy flight. The experiment for the standard BNs proves that the parameter-learning algorithm proposed in this article is able to effectively learn parameters with optimal learning accuracy. The experiment for the situation assessment BN shows that the situation assessment BN established in this article has the ability to infer the correct intention and understand the situation.
In future research, the algorithm proposed in this article will be explored for application in other fields such as disaster area search and rescue, plant protection, and so on. In addition, work on applying the algorithm of this paper to real UAVs like quad-copters will be carried out.

Author Contributions

Conceptualization, W.L. and W.Z.; methodology, W.L.; software, W.L.; validation, W.L.; formal analysis, W.L.; investigation, W.L. and B.L.; resources, W.L. and W.Z.; data curation, W.L. and B.L.; writing—original draft preparation, W.L.; writing—review and editing, W.L.; visualization, W.L.; supervision, W.L., W.Z., B.L. and Y.G.; project administration, W.Z. and Y.G.; funding acquisition, W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant No. 62173277 and grant No. 62373301.

Data Availability Statement

Not applicable.

Acknowledgments

The authors are grateful to the Shaanxi Province Key Laboratory of Flight Control and Simulation Technology.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this article:
AEW	Air-Borne Early Warning
AHP	Analytic Hierarchy Process
BN	Bayesian Network
CPT	Conditional Probability Table
EJA	Electronic Jamming Aircraft
IWOA-PPI	Improved Whale Optimization Algorithm based on PPIs
MAP	Maximum A Posteriori
MC	Monte Carlo
MLE	Maximum Likelihood Estimation
PPI	Parameter Prior Interval
QMAP	Qualitative Maximum A Posteriori
RCS	Radar Cross-Section
RFV	Radar Frequency Band Variability
RP	Reconnaissance Plane
TOPSIS	Technique for Order Preference by Similarity to Ideal Solution
UAV	Unmanned Aerial Vehicle
WOA	Whale Optimization Algorithm

References

  1. Endsley, M.R. Toward a Theory of Situation Awareness in Dynamic Systems. Hum. Factors 1995, 37, 32–64. [Google Scholar] [CrossRef]
  2. Li, P.; Zhang, L.; Dai, L.; Zou, Y.; Li, X. An Assessment Method of Operator’s Situation Awareness Reliability Based on Fuzzy Logic-AHP. Saf. Sci. 2019, 119, 330–343. [Google Scholar] [CrossRef]
  3. Wang, H.; Chen, Z.; Feng, X.; Di, X.; Liu, D.; Zhao, J.; Sui, X. Research on Network Security Situation Assessment and Quantification Method Based on Analytic Hierarchy Process. Wirel. Pers. Commun. 2018, 102, 1401–1420. [Google Scholar] [CrossRef]
  4. Zhang, H.W.; Xie, J.W.; Ge, J.A.; Yang, C.X.; Liu, B.Z. Intuitionistic Fuzzy Set Threat Assessment Based on Improved TOPSIS and Multiple Times Fusion. Control Decis. 2019, 34, 811–815. [Google Scholar] [CrossRef]
  5. Yin, Y.; Zhang, R.; Su, Q. Threat Assessment of Aerial Targets Based on Improved GRA-TOPSIS Method and Three-Way Decisions. Math. Biosci. Eng. 2023, 20, 13250–13266. [Google Scholar] [CrossRef]
  6. Zhang, L.; Zhu, Y.; Shi, X.; Li, X. A Situation Assessment Method with an Improved Fuzzy Deep Neural Network for Multiple UAVs. Information 2020, 11, 194. [Google Scholar] [CrossRef]
  7. Yue, L.; Yang, R.; Zuo, J.; Luo, H.; Li, Q. Air Target Threat Assessment Based on Improved Moth Flame Optimization-Gray Neural Network Model. Math. Probl. Eng. 2019, 2019, 4203538. [Google Scholar] [CrossRef]
  8. D’Aniello, G. Fuzzy Logic for Situation Awareness: A Systematic Review. J. Ambient Intell. Humaniz. Comput. 2023, 14, 4419–4438. [Google Scholar] [CrossRef]
  9. Chen, J.; Gao, X.; Rong, J.; Gao, X. A Situation Awareness Assessment Method Based on Fuzzy Cognitive Maps. J. Syst. Eng. Electron. 2022, 33, 1108–1122. [Google Scholar] [CrossRef]
  10. Xu, X.; Yang, R.; Yu, Y. Situation Assessment for Air Combat Based on Novel Semi-Supervised Naive Bayes. J. Syst. Eng. Electron. 2018, 29, 768–779. [Google Scholar] [CrossRef]
  11. Sun, Y.; Ma, P.; Dai, J.; Li, D. A Cloud Bayesian Network Approach to Situation Assessment of Scouting Underwater Targets with Fixed-Wing Patrol Aircraft. CAAI Trans. Intell. Technol. 2023, 8, 532–545. [Google Scholar] [CrossRef]
  12. Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference; Morgan Kaufmann Series in Representation and Reasoning; Morgan Kaufmann Publishers: San Mateo, CA, USA, 1988; 552p. [Google Scholar]
  13. Niculescu, R.S.; Mitchell, T.M.; Rao, R.B. Bayesian Network Learning with Parameter Constraints. J. Mach. Learn. Res. 2006, 7, 1357–1383. [Google Scholar] [CrossRef]
  14. De Campos, C.P.; Tong, Y.; Ji, Q. Constrained Maximum Likelihood Learning of Bayesian Networks for Facial Action Recognition; Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2008; Volume 5304 LNCS, pp. 168–181. [Google Scholar]
  15. Hou, Y.; Zheng, E.; Guo, W.; Xiao, Q.; Xu, Z. Learning Bayesian Network Parameters with Small Data Set: A Parameter Extension under Constraints Method. IEEE Access 2020, 8, 24979–24989. [Google Scholar] [CrossRef]
Figure 1. Intention inference BN.
Figure 2. Type recognition BN.
Figure 3. Relative motion determination BN.
Figure 4. Situation assessment BN.
Figure 5. Curves of the two convergence factors: the blue curve refers to a and the black curve to the nonlinear convergence factor a_nl.
Figure 6. Structure and conditional probability tables of the Asia BN.
Figure 7. (a) KL divergences of 5 algorithms for the Asia BN. (b–d) KL divergences with error bars.
Figure 8. (a) KL divergences of 5 algorithms for the Alarm BN. (b–d) KL divergences with error bars.
Figure 9. (a) KL divergences of 5 algorithms for the Win95pts BN. (b–d) KL divergences with error bars.
Figure 10. (a) KL divergences of 5 algorithms for the Andes BN. (b–d) KL divergences with error bars.
Figure 12. (a) Situation assessment results of the AEW. (b) Situation assessment results of the RP. (c) Situation assessment results of the EJA. (d) Situation assessment results of the fighter.
Table 1. Relevant information on the nodes.

| Node | Number | State Set | State Notation |
| Intention | 1 | Patrol, Recon, Jamming, Assault | {p, r, j, a} |
| Type | 2 | AEW 1, RP 2, EJA 3, Fighter | {a, r, e, f} |
| Relative velocity | 3 | Low, Medium, High | {l, m, h} |
| Relative height | 4 | Low, Medium, High | {l, m, h} |
| Relative motion | 5 | Approach, Leave | {a, l} |
| Velocity | 6 | Low, Medium, High | {l, m, h} |
| Height | 7 | Low, Medium, High | {l, m, h} |
| RCS | 8 | Very small, Small, Medium, Large | {v, s, m, l} |
| RFV | 9 | Agility, Fixed | {a, f} |
| Relative distance | 10 | Decrease, Unchanged, Increase | {d, u, i} |
| Relative bearing | 11 | Low, Medium, High | {l, m, h} |

1 Air-borne early warning. 2 Reconnaissance plane. 3 Electronic jamming aircraft.
Table 2. Information of 4 BNs.

| BNs | Scale | Nodes | Arcs | Parameters |
| Asia | Small | 8 | 8 | 18 |
| Alarm | Medium | 37 | 46 | 509 |
| Win95pts | Large | 76 | 112 | 574 |
| Andes | Very large | 223 | 338 | 1157 |
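The four networks in Table 2 are standard benchmarks distributed as .bif files (for example, in the public bnlearn Bayesian network repository). The minimal sketch below, assuming the pgmpy library and a locally downloaded asia.bif, shows one way the small training sets used in the experiments could be drawn by forward sampling; it is not the authors' code, and file locations are placeholders.

```python
from pgmpy.readwrite import BIFReader
from pgmpy.sampling import BayesianModelSampling

# Load a standard network from a locally stored .bif file (hypothetical path).
model = BIFReader("asia.bif").get_model()

# Draw the small sample sizes of Table 3 by forward (ancestral) sampling.
sampler = BayesianModelSampling(model)
for n in (40, 80, 120, 160, 200):
    data = sampler.forward_sample(size=n)  # pandas DataFrame, one column per node
    print(n, data.shape)
```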
Table 3. KL divergences of five algorithms.

| BNs | Number of Samples | MLE | MAP | QMAP | WOA | IWOA-PPI |
| Asia | 40 | 24.27 ± 3.47 | 2.28 ± 0.67 | 0.62 ± 0.04 | 1.29 ± 0.44 | 0.46 ± 0.06 |
| | 80 | 19.75 ± 3.75 | 2.14 ± 0.54 | 0.57 ± 0.07 | 1.06 ± 0.18 | 0.42 ± 0.06 |
| | 120 | 14.06 ± 3.73 | 1.57 ± 0.42 | 0.51 ± 0.06 | 0.92 ± 0.17 | 0.40 ± 0.08 |
| | 160 | 11.27 ± 3.41 | 1.28 ± 0.31 | 0.44 ± 0.02 | 0.84 ± 0.23 | 0.35 ± 0.07 |
| | 200 | 8.90 ± 2.94 | 1.08 ± 0.26 | 0.41 ± 0.04 | 0.75 ± 0.18 | 0.28 ± 0.07 |
| Alarm | 40 | 183.31 ± 19.57 | 116.41 ± 8.56 | 16.15 ± 0.26 | 34.44 ± 3.96 | 11.43 ± 0.54 |
| | 80 | 172.97 ± 15.24 | 105.91 ± 6.87 | 15.01 ± 0.38 | 33.21 ± 2.61 | 10.79 ± 0.46 |
| | 120 | 152.38 ± 14.35 | 93.32 ± 6.31 | 14.16 ± 0.39 | 31.55 ± 3.10 | 10.44 ± 0.30 |
| | 160 | 145.58 ± 21.35 | 89.11 ± 5.99 | 13.65 ± 0.17 | 29.38 ± 1.83 | 10.25 ± 0.32 |
| | 200 | 127.44 ± 12.06 | 81.33 ± 6.90 | 13.21 ± 0.34 | 28.87 ± 2.22 | 9.88 ± 0.27 |
| Win95pts | 40 | 220.84 ± 12.81 | 133.36 ± 3.84 | 40.57 ± 0.23 | 100.51 ± 3.09 | 34.90 ± 0.82 |
| | 80 | 215.76 ± 20.75 | 129.42 ± 4.62 | 38.71 ± 0.25 | 98.92 ± 4.57 | 33.21 ± 1.09 |
| | 120 | 220.27 ± 15.16 | 124.19 ± 5.20 | 37.40 ± 0.29 | 95.72 ± 3.32 | 29.28 ± 0.56 |
| | 160 | 212.96 ± 15.84 | 119.66 ± 5.93 | 36.38 ± 0.31 | 93.97 ± 2.91 | 26.23 ± 0.55 |
| | 200 | 211.75 ± 13.20 | 118.32 ± 5.12 | 35.73 ± 0.36 | 89.08 ± 3.29 | 24.22 ± 0.34 |
| Andes | 40 | 702.29 ± 36.94 | 226.33 ± 8.55 | 35.68 ± 0.28 | 113.40 ± 3.54 | 29.73 ± 0.91 |
| | 80 | 590.41 ± 29.50 | 190.57 ± 10.73 | 31.33 ± 0.25 | 109.76 ± 3.05 | 28.24 ± 0.53 |
| | 120 | 504.44 ± 24.21 | 165.79 ± 10.02 | 28.49 ± 0.31 | 105.43 ± 3.47 | 25.49 ± 0.66 |
| | 160 | 457.17 ± 16.99 | 149.56 ± 9.31 | 26.57 ± 0.27 | 101.27 ± 2.75 | 22.24 ± 0.59 |
| | 200 | 423.23 ± 31.17 | 139.03 ± 8.80 | 25.76 ± 0.22 | 98.57 ± 3.16 | 20.68 ± 0.41 |
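Table 3 reports, for each network and sample size, the Kullback–Leibler (KL) divergence between the true and the learned parameters. The NumPy sketch below is only an illustration of how such a CPT-level divergence can be accumulated, assuming one conditional distribution per parent configuration (column) and summing over columns; the function name cpt_kl is hypothetical, and the exact aggregation and averaging over repeated runs used for Table 3 may differ.

```python
import numpy as np

def cpt_kl(true_cpt, learned_cpt):
    """KL divergence between two CPTs stored as (child_states x parent_configs)
    arrays; each column is one conditional distribution, and the per-column
    divergences are summed."""
    p = np.asarray(true_cpt, dtype=float)
    q = np.asarray(learned_cpt, dtype=float)
    mask = p > 0                      # treat 0 * log(0/q) as 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Example with the 'Target type' CPT, columns = Patrol, Recon, Jamming, Assault
# (true values from Table 4, learned values from Table 5).
true_type = np.array([[0.60, 0.10, 0.10, 0.05],
                      [0.20, 0.65, 0.10, 0.10],
                      [0.10, 0.15, 0.70, 0.10],
                      [0.10, 0.10, 0.10, 0.75]])
learned_type = np.array([[0.6323, 0.1007, 0.1017, 0.0678],
                         [0.1912, 0.6291, 0.0995, 0.0798],
                         [0.0886, 0.1705, 0.6889, 0.1040],
                         [0.0879, 0.0997, 0.1099, 0.7484]])
print(cpt_kl(true_type, learned_type))  # a small value indicates a close fit
```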
Table 4. True parameters in the form of conditional probability tables.

Parent node: Intention
| Node | State | Patrol | Recon | Jamming | Assault |
| Target type | AEW | 0.6 | 0.1 | 0.1 | 0.05 |
| | RP | 0.2 | 0.65 | 0.1 | 0.1 |
| | EJA | 0.1 | 0.15 | 0.7 | 0.1 |
| | Fighter | 0.1 | 0.1 | 0.1 | 0.75 |
| Relative velocity | Low | 0.7 | 0.5 | 0.3 | 0.1 |
| | Medium | 0.2 | 0.4 | 0.6 | 0.5 |
| | High | 0.1 | 0.1 | 0.1 | 0.4 |
| Relative height | Low | 0.3 | 0.1 | 0.5 | 0.2 |
| | Medium | 0.6 | 0.2 | 0.4 | 0.5 |
| | High | 0.1 | 0.7 | 0.1 | 0.3 |
| Relative motion | Approach | 0.5 | 0.7 | 0.8 | 0.9 |
| | Leave | 0.5 | 0.3 | 0.2 | 0.1 |

Parent node: Target type
| Node | State | AEW | RP | EJA | Fighter |
| Velocity | Low | 0.6 | 0.7 | 0.2 | 0.2 |
| | Medium | 0.3 | 0.2 | 0.7 | 0.3 |
| | High | 0.1 | 0.1 | 0.1 | 0.5 |
| Height | Low | 0.2 | 0.1 | 0.3 | 0.1 |
| | Medium | 0.7 | 0.1 | 0.6 | 0.5 |
| | High | 0.1 | 0.8 | 0.1 | 0.4 |
| RCS | Very small | 0.1 | 0.3 | 0.1 | 0.6 |
| | Small | 0.1 | 0.5 | 0.1 | 0.2 |
| | Medium | 0.2 | 0.1 | 0.7 | 0.1 |
| | Large | 0.6 | 0.1 | 0.1 | 0.1 |
| RFV | Agility | 0.8 | 0.3 | 0.9 | 0.2 |
| | Fixed | 0.2 | 0.7 | 0.1 | 0.8 |

Parent node: Relative motion
| Node | State | Approach | Leave |
| Relative distance | Decrease | 0.8 | 0.1 |
| | Unchanged | 0.1 | 0.1 |
| | Increase | 0.1 | 0.8 |
| Relative bearing | Low | 0.7 | 0.1 |
| | Medium | 0.2 | 0.2 |
| | High | 0.1 | 0.7 |
Table 5. Learned parameters in the form of conditional probability tables.

Parent node: Intention
| Node | State | Patrol | Recon | Jamming | Assault |
| Target type | AEW | 0.6323 | 0.1007 | 0.1017 | 0.0678 |
| | RP | 0.1912 | 0.6291 | 0.0995 | 0.0798 |
| | EJA | 0.0886 | 0.1705 | 0.6889 | 0.1040 |
| | Fighter | 0.0879 | 0.0997 | 0.1099 | 0.7484 |
| Relative velocity | Low | 0.6722 | 0.5060 | 0.2679 | 0.1135 |
| | Medium | 0.2360 | 0.3833 | 0.6354 | 0.5083 |
| | High | 0.0918 | 0.1107 | 0.0967 | 0.3782 |
| Relative height | Low | 0.2358 | 0.1269 | 0.5133 | 0.2142 |
| | Medium | 0.6394 | 0.1921 | 0.3613 | 0.5112 |
| | High | 0.1248 | 0.6810 | 0.1254 | 0.2746 |
| Relative motion | Approach | 0.5121 | 0.6813 | 0.7946 | 0.8582 |
| | Leave | 0.4879 | 0.3187 | 0.2054 | 0.1418 |

Parent node: Target type
| Node | State | AEW | RP | EJA | Fighter |
| Velocity | Low | 0.6385 | 0.6790 | 0.2347 | 0.1398 |
| | Medium | 0.2667 | 0.2309 | 0.6765 | 0.3525 |
| | High | 0.0948 | 0.0901 | 0.0888 | 0.5077 |
| Height | Low | 0.2365 | 0.1002 | 0.2825 | 0.1122 |
| | Medium | 0.6727 | 0.1258 | 0.6187 | 0.5118 |
| | High | 0.0908 | 0.7740 | 0.0988 | 0.3760 |
| RCS | Very small | 0.0901 | 0.2958 | 0.1001 | 0.6428 |
| | Small | 0.0880 | 0.5143 | 0.1003 | 0.1815 |
| | Medium | 0.1873 | 0.0952 | 0.6790 | 0.0877 |
| | Large | 0.6346 | 0.0947 | 0.1206 | 0.0880 |
| RFV | Agility | 0.7684 | 0.3120 | 0.9351 | 0.2230 |
| | Fixed | 0.2316 | 0.6880 | 0.0649 | 0.7770 |

Parent node: Relative motion
| Node | State | Approach | Leave |
| Relative distance | Decrease | 0.8184 | 0.0997 |
| | Unchanged | 0.0819 | 0.0976 |
| | Increase | 0.0997 | 0.8027 |
| Relative bearing | Low | 0.6843 | 0.0934 |
| | Medium | 0.2209 | 0.2084 |
| | High | 0.0948 | 0.6982 |
Table 6. Observed evidences.

| Target | Velocity | Height | RCS | RFV | Relative Velocity | Relative Height | Relative Distance | Relative Bearing |
| AEW | 0.9, 0.1, 0.0 | 0.2, 0.8, 0.0 | 0.0, 0.0, 0.3, 0.7 | 1.0, 0.0 | 0.9, 0.1, 0.0 | 0.1, 0.9, 0.0 | 0.0, 0.8, 0.2 | 0.0, 0.2, 0.8 |
| RP | 0.9, 0.1, 0.0 | 0.0, 0.1, 0.9 | 0.1, 0.9, 0.0, 0.0 | 0.0, 1.0 | 0.9, 0.1, 0.0 | 0.0, 0.1, 0.9 | 0.9, 0.1, 0.0 | 0.9, 0.1, 0.0 |
| EJA | 0.2, 0.8, 0.0 | 0.9, 0.1, 0.0 | 0.0, 0.0, 0.9, 0.1 | 1.0, 0.0 | 0.3, 0.7, 0.0 | 0.8, 0.2, 0.0 | 0.8, 0.2, 0.0 | 0.8, 0.2, 0.0 |
| Fighter | 0.0, 0.2, 0.8 | 0.2, 0.8, 0.0 | 0.9, 0.1, 0.0, 0.0 | 0.0, 1.0 | 0.0, 0.2, 0.8 | 0.2, 0.8, 0.0 | 0.8, 0.2, 0.0 | 0.8, 0.2, 0.0 |
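To make the assessment pipeline concrete, the NumPy sketch below combines the learned conditional probability tables of Table 5 with one soft-evidence row of Table 6 to compute a posterior over the Intention node. It is only an illustration, not the authors' implementation: it assumes a uniform prior over Intention (the prior is not listed in this section), treats each Table 6 row as virtual-evidence likelihood vectors on the leaf nodes, and the function name intention_posterior and the dictionary keys are hypothetical.

```python
import numpy as np

# Learned CPTs from Table 5, stored as (child_state x parent_state) arrays.
# Parent order for this block: Patrol, Recon, Jamming, Assault.
type_given_intent = np.array([[0.6323, 0.1007, 0.1017, 0.0678],   # AEW
                              [0.1912, 0.6291, 0.0995, 0.0798],   # RP
                              [0.0886, 0.1705, 0.6889, 0.1040],   # EJA
                              [0.0879, 0.0997, 0.1099, 0.7484]])  # Fighter
rvel_given_intent = np.array([[0.6722, 0.5060, 0.2679, 0.1135],
                              [0.2360, 0.3833, 0.6354, 0.5083],
                              [0.0918, 0.1107, 0.0967, 0.3782]])
rhgt_given_intent = np.array([[0.2358, 0.1269, 0.5133, 0.2142],
                              [0.6394, 0.1921, 0.3613, 0.5112],
                              [0.1248, 0.6810, 0.1254, 0.2746]])
rmot_given_intent = np.array([[0.5121, 0.6813, 0.7946, 0.8582],
                              [0.4879, 0.3187, 0.2054, 0.1418]])
# Parent order for this block: AEW, RP, EJA, Fighter.
vel_given_type = np.array([[0.6385, 0.6790, 0.2347, 0.1398],
                           [0.2667, 0.2309, 0.6765, 0.3525],
                           [0.0948, 0.0901, 0.0888, 0.5077]])
hgt_given_type = np.array([[0.2365, 0.1002, 0.2825, 0.1122],
                           [0.6727, 0.1258, 0.6187, 0.5118],
                           [0.0908, 0.7740, 0.0988, 0.3760]])
rcs_given_type = np.array([[0.0901, 0.2958, 0.1001, 0.6428],
                           [0.0880, 0.5143, 0.1003, 0.1815],
                           [0.1873, 0.0952, 0.6790, 0.0877],
                           [0.6346, 0.0947, 0.1206, 0.0880]])
rfv_given_type = np.array([[0.7684, 0.3120, 0.9351, 0.2230],
                           [0.2316, 0.6880, 0.0649, 0.7770]])
# Parent order for this block: Approach, Leave.
rdst_given_rmot = np.array([[0.8184, 0.0997],
                            [0.0819, 0.0976],
                            [0.0997, 0.8027]])
rbrg_given_rmot = np.array([[0.6843, 0.0934],
                            [0.2209, 0.2084],
                            [0.0948, 0.6982]])

def intention_posterior(ev, prior=np.full(4, 0.25)):
    """Posterior over {Patrol, Recon, Jamming, Assault} given soft evidence.
    ev maps leaf-node names to likelihood vectors (one Table 6 row)."""
    # Messages from leaves observed directly under Intention.
    m_rvel = ev["rvel"] @ rvel_given_intent
    m_rhgt = ev["rhgt"] @ rhgt_given_intent
    # Collapse the hidden Type node: combine its observed children, then marginalize.
    lam_type = (ev["vel"] @ vel_given_type) * (ev["hgt"] @ hgt_given_type) \
             * (ev["rcs"] @ rcs_given_type) * (ev["rfv"] @ rfv_given_type)
    m_type = lam_type @ type_given_intent
    # Same for the hidden Relative-motion node.
    lam_rmot = (ev["rdst"] @ rdst_given_rmot) * (ev["rbrg"] @ rbrg_given_rmot)
    m_rmot = lam_rmot @ rmot_given_intent
    post = prior * m_rvel * m_rhgt * m_type * m_rmot
    return post / post.sum()

# AEW row of Table 6.
aew_evidence = {"vel":  np.array([0.9, 0.1, 0.0]),
                "hgt":  np.array([0.2, 0.8, 0.0]),
                "rcs":  np.array([0.0, 0.0, 0.3, 0.7]),
                "rfv":  np.array([1.0, 0.0]),
                "rvel": np.array([0.9, 0.1, 0.0]),
                "rhgt": np.array([0.1, 0.9, 0.0]),
                "rdst": np.array([0.0, 0.8, 0.2]),
                "rbrg": np.array([0.0, 0.2, 0.8])}
print(intention_posterior(aew_evidence))  # most mass falls on Patrol
```

Under these assumptions, the AEW evidence row yields a posterior concentrated on the Patrol intention; the exact values shown in Figure 12 depend on the prior and evidence actually used by the authors.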