A Cooperative Search and Coverage Algorithm with Controllable Revisit and Connectivity Maintenance for Multiple Unmanned Aerial Vehicles

In this paper, we mainly study a cooperative search and coverage algorithm for a given bounded rectangle region, which contains several unknown stationary targets, by a team of unmanned aerial vehicles (UAVs) with non-ideal sensors and limited communication ranges. Our goal is to minimize the search time, while gathering more information about the environment and finding more targets. For this purpose, a novel cooperative search and coverage algorithm with controllable revisit mechanism is presented. Firstly, as the representation of the environment, the cognitive maps that included the target probability map (TPM), the uncertain map (UM), and the digital pheromone map (DPM) are constituted. We also design a distributed update and fusion scheme for the cognitive map. This update and fusion scheme can guarantee that each one of the cognitive maps converges to the same one, which reflects the targets’ true existence or absence in each cell of the search region. Secondly, we develop a controllable revisit mechanism based on the DPM. This mechanism can concentrate the UAVs to revisit sub-areas that have a large target probability or high uncertainty. Thirdly, in the frame of distributed receding horizon optimizing, a path planning algorithm for the multi-UAVs cooperative search and coverage is designed. In the path planning algorithm, the movement of the UAVs is restricted by the potential fields to meet the requirements of avoiding collision and maintaining connectivity constraints. Moreover, using the minimum spanning tree (MST) topology optimization strategy, we can obtain a tradeoff between the search coverage enhancement and the connectivity maintenance. The feasibility of the proposed algorithm is demonstrated by comparison simulations by way of analyzing the effects of the controllable revisit mechanism and the connectivity maintenance scheme. The Monte Carlo method is employed to validate the influence of the number of UAVs, the sensing radius, the detection and false alarm probabilities, and the communication range on the proposed algorithm.


Introduction
Recently, multiple UAVs have received more and more attention for their accomplishments in both military and civil applications. Cooperative search and coverage is one major application of multiple UAVs equipped with sensors [1], such as camera, radar, and sonar. The goal of cooperative search and coverage is to control multiple UAVs to find several unknown ground targets scattered in a given surveillance region, while maximally reducing the uncertainty of the environment and minimizing the search time [2].
In the cooperative search and coverage problem, there are two main technical issues that should be considered [3]. ( targets existence and uncertainty in the environment, and how to treat sensor observation results as evidence to update the knowledge of the environment so that the UAVs' belief can reflect the true existence or absence of the targets within areas of the surveillance region; (2) search path planning. This focuses on how develop cooperative control methods that enable UAVs to move in such a way as to maximize the possibility of finding targets or minimize the uncertainty in the environment. Extensive studies have been carried out on these two key issues. The environment representation is the primary problem of cooperative search and coverage. In general, a common method is to make the whole surveillance area become smaller cells, and each cell is associated with values, such as probability or uncertainty, thereby constituting a map for the search region. Each UAV maintains a map of the surveillance region that serves as the UAV's knowledge about the state of the region. There are several types of the map, such as the occupancy map [4][5][6], the target probability map [7][8][9], the uncertainty map [10][11][12], and so on. In fact, these maps can collectively be called the cognitive map, which is used to represent the environment and is incrementally updated based on the new sensor observations.
The continuous updating of the cognitive map reflects the collecting and processing of new sensory information about the environment for the UAVs. However, each UAV can only obtain local sensory information about the whole surveillance region due to its limited sensing range. In addition, considering the non-ideal sensor performance, there are a great variety of errors and uncertainties on the sensory information from the UAVs. Therefore, the sensory information from multiple UAVs needs to be combined through some update and fusion schemes so that the best knowledge of the environment can be obtained. Most available update and fusion schemes are broadly based on Bayesian theory [2,13,14] and Dempster-Shafer theory [15]. However, few of the existing update and fusion methods can guarantee that all individual cognitive maps can be converged to the same one that reflects the true target existence status of each cell in the whole surveillance region.
Search path planning is an important problem for efficient search and coverage by a team of UAVs. It is concerned with cooperative controlling of the movements of multi-UAVs in order to guarantee optimal search paths that maximize the possibility of finding targets and minimize the uncertainty in the environment. Different authors have developed several search path planning methods, such as reinforcement learning [15], potential field [16], group dispersion pattern [17], intelligence algorithm [18], dynamic programming [19], gradient optimization [20], mixed integer linear programming [21], Voronoi partitioning [22,23], and receding horizon optimization (RHO) [24][25][26]. In [22,23], the convex region is partitioned into Voronoi cells so that there is only one agent in each Voronoi cell according to the position of the agents. Therefore, the distributed multi-agents coverage control problem is converted into the coverage of Voronoi cell for single agent. Then, a gradient descent control law is designed to continually drive the agents toward the centroids of their Voronoi cells. In this way, the whole convex region can be covered by multi-agents without violating the collision constraint.
For the reason of the fact that the RHO could effectively handle the dynamic changes of the environment and constrain movements of UAVs, it is used to solve the cooperative search and coverage problems. Reference [24] presents a receding horizon cooperative search algorithm that jointly optimizes paths and sensor orientations for a team of UAVs searching for a mobile target. In reference [25], a receding horizon, motion-planning algorithm is used to obtain the optimal search path in the given horizon. In reference [26], the distributed model predictive control method is presented to solve the cooperative search moving targets problem. In addition, the attractant and repulsion pheromones are introduced to improve the effective cooperation between UAVs.
Although these above methods have been verified to be effective for the multi-UAVs cooperative search and coverage problems, they lack a controllable revisit mechanism to guide the UAVs to revisit sub-areas with large target probability or high uncertainty, so that the capability to capture the target and the efficiency of coverage are low.

Problem Formulations
As shown in Figure 1, there is a team of UAVs (A i , i = 1, 2, . . . , N) performing cooperative search and coverage mission in an unknown surveillance region that contained several stationary ground targets (T j , j = 1, 2, . . . , M). In unknown environment, for the UAVs, the targets number and their locations are unknown a priori. The UAVs need to use on-board sensors (e.g., cameras) to observe some areas of the environment, so that the UAVs can incrementally obtain knowledge of the environment and find targets. The decision on where to explore next is driven by the objective to increase the chance of finding targets or reducing the uncertainty in the environment. For this purpose (1) we need to design a cognitive map as the environment representation, to represent targets existence and uncertainty in the environment. In other words, each UAV maintains a cognitive map of the whole surveillance region that serves as the UAV's knowledge of the environment; (2) an update and fusion scheme need to be designed to guarantee all cognitive maps converge to the same one that reflects the targets' true existence or absence in each cell of the search region. As the UAVs move around and observe sub-areas of the region, the corresponding cells of the cognitive maps are updated to incorporate the information gained through the on-board sensors, as well as the communication network. The updated cognitive maps should be same and could reflect the true environment; (3) we need to design a distributed search and coverage control algorithm that plans the optimal paths for the UAVs to follow to search and coverage the surveillance region, while guarantying the constraints of the collision avoidance and connectivity preservation.

Problem Formulations
As shown in Figure 1, there is a team of UAVs (Ai, i = 1, 2, …, N) performing cooperative search and coverage mission in an unknown surveillance region that contained several stationary ground targets (Tj, j = 1, 2, …, M). In unknown environment, for the UAVs, the targets number and their locations are unknown a priori. The UAVs need to use on-board sensors (e.g., cameras) to observe some areas of the environment, so that the UAVs can incrementally obtain knowledge of the environment and find targets. The decision on where to explore next is driven by the objective to increase the chance of finding targets or reducing the uncertainty in the environment. For this purpose (1) we need to design a cognitive map as the environment representation, to represent targets existence and uncertainty in the environment. In other words, each UAV maintains a cognitive map of the whole surveillance region that serves as the UAV's knowledge of the environment; (2) an update and fusion scheme need to be designed to guarantee all cognitive maps converge to the same one that reflects the targets' true existence or absence in each cell of the search region. As the UAVs move around and observe sub-areas of the region, the corresponding cells of the cognitive maps are updated to incorporate the information gained through the on-board sensors, as well as the communication network. The updated cognitive maps should be same and could reflect the true environment; (3) we need to design a distributed search and coverage control algorithm that plans the optimal paths for the UAVs to follow to search and coverage the surveillance region, while guarantying the constraints of the collision avoidance and connectivity preservation.

The Description of Search Environment
As shown in Figure 2, an L × W rectangular region Ω is uniformly divided into Lx × Ly cells of the same size. The cell that is located in the m-th row and n-th column is identified by its identity number c = m + (n − 1) × Lx, c ∊ {1, 2, …, Lx × Ly}. Δx and Δy denote the length and width of the cells, respectively. Δx and Δy may be chosen as the flight distance of the UAV in a time step with the constant cruising speed. Each cell is located with its center μc = [xc, yc] T , in which xc and yc are coordinated with its center. ζc ∊ {0, 1} indicates whether a target exists in cell c or not, i.e., ζc = 1 indicates a target is present in cell c, and ζc = 0 indicates no target is present in that cell. To better describe the problem, it assumes that (i) there is, at most, one target in each cell and (ii) there are no threats and obstacles in the surveillance region.

The Description of Search Environment
As shown in Figure 2, an L × W rectangular region Ω is uniformly divided into L x × L y cells of the same size. The cell that is located in the m-th row and n-th column is identified by its identity number c = m + (n − 1) × L x , c ∈ {1, 2, . . . , L x × L y }. ∆x and ∆y denote the length and width of the cells, respectively. ∆x and ∆y may be chosen as the flight distance of the UAV in a time step with the constant cruising speed. Each cell is located with its center µ c = [x c , y c ] T , in which x c and y c are coordinated with its center. ζ c ∈ {0, 1} indicates whether a target exists in cell c or not, i.e., ζ c = 1 indicates a target is present in cell c, and ζ c = 0 indicates no target is present in that cell. To better describe the problem, it assumes that (i) there is, at most, one target in each cell and (ii) there are no threats and obstacles in the surveillance region.

Simplified Dynamic Model of UAV
We assume that the UAVs move on a fixed plane above the surveillance region. Let xi(k) = [μi(k), ψi(k)] T denote the state of Ai at time k. μi(k) = [xi(k), yi(k)] T represents the position of Ai, in which xi(k) and yi(k) are the planar coordinates of its projection onto the surveillance region. ψi(k) is the heading angle. The two-dimension motion of UAV in a horizontal plane is analyzed, and the kinematics of the UAV are c c cos ( ) In Equation (1), vc represents the constant cruising speed, and Δt represents the time step. Operator In [·] indicates a rounding operation that maps the flight distance of Ai over a time step Δt to the cell index increments (Δm, Δn) in surveillance region.
For simplicity, it is assumed that the UAV is equipped with an autopilot that holds constant altitude and ground speed. We only need to design guidance inputs to this low-level autopilot system for target searching. In this paper, the design guidance inputs are ui(k) = ψi(k), which are constrained by the dynamic limits of UAV, i.e., the turning rate Δψi(k) ∊ [−Δψmax, Δψmax]. According to Equation (1), the dynamics of the UAVs can be described as follows. At each Δt, Ai can only move from the current cell to the neighboring cells, due to the constraints of maneuverability. More specifically, there are eight possible flight orientations defined as ψi(k) ∊ {1(east), 2(northeast), 3(north), 4(northwest), 5(west), 6(southwest), and 7(south)}. Due to its physical curvature radius constraints, the UAV can only change its orientation at most once in a Δt. In this case, Ai has only three possible orientation choices (turn left, go straight, or turn right, denoted by Δψi(k) ∊ {l(left), s(straight), r(right)}) for the next time step. Thus, the maximum turning angle is Δψmax = 45°.

Communication Model
In this paper, we only consider the limited communication range and ignore the bandwidth limitation, the communication delay, and the interruption that will be the future research direction extending the current work. Thus, two UAVs can directly exchange information if the distance between them is no more than the communication range Rc. The network topology of the l UAVs at time k can be modeled as an undirected graph G(k) = (V, E(k)). V = {A1, …, AN} is the vertices set and E(k) = {(Ai, Aj)| (Ai, Aj) ∊ V; ||μi(k) − μj(k)|| ≤ Rc} is the edge set, in which ||·|| denotes the 2-norm for vectors. Let Ni(k) = {Aj|||μi(k) − μj(k)|| ≤ Rc; j = 1, …, N, and j ≠ i} denote the set of neighbors of Ai at time k. The adjacency matrix can be expressed as Figure 2. A rectangular surveillance region is uniformly divided into L x × L y cells of the same size.

Simplified Dynamic Model of UAV
We assume that the UAVs move on a fixed plane above the surveillance region. Let and y i (k) are the planar coordinates of its projection onto the surveillance region. ψ i (k) is the heading angle. The two-dimension motion of UAV in a horizontal plane is analyzed, and the kinematics of the UAV are In Equation (1), v c represents the constant cruising speed, and ∆t represents the time step. Operator In [·] indicates a rounding operation that maps the flight distance of A i over a time step ∆t to the cell index increments (∆m, ∆n) in surveillance region.
For simplicity, it is assumed that the UAV is equipped with an autopilot that holds constant altitude and ground speed. We only need to design guidance inputs to this low-level autopilot system for target searching. In this paper, the design guidance inputs are u i (k) = ψ i (k), which are constrained by the dynamic limits of UAV, i.e., the turning rate ∆ψ i (k) ∈ [−∆ψ max , ∆ψ max ]. According to Equation (1), the dynamics of the UAVs can be described as follows. At each ∆t, A i can only move from the current cell to the neighboring cells, due to the constraints of maneuverability. More specifically, there are eight possible flight orientations defined as ψ i (k) ∈ {1(east), 2(northeast), 3(north), 4(northwest), 5(west), 6(southwest), and 7(south)}. Due to its physical curvature radius constraints, the UAV can only change its orientation at most once in a ∆t. In this case, A i has only three possible orientation choices (turn left, go straight, or turn right, denoted by ∆ψ i (k) ∈ {l(left), s(straight), r(right)}) for the next time step. Thus, the maximum turning angle is ∆ψ max = 45 • .

Communication Model
In this paper, we only consider the limited communication range and ignore the bandwidth limitation, the communication delay, and the interruption that will be the future research direction extending the current work. Thus, two UAVs can directly exchange information if the distance between them is no more than the communication range R c . The network topology of the l UAVs at time k can be modeled as an undirected graph G(k) = (V, E(k)). V = {A 1 , . . . , A N } is the vertices set and is the edge set, in which ||·|| denotes the 2-norm for vectors. Let N i (k) = {A j |||µ i (k) − µ j (k)|| ≤ R c ; j = 1, . . . , N, and j = i} denote the set of neighbors of A i at time k. The adjacency matrix can be expressed as in which ω ij > 0 is the weight of the wireless link (A i , A j ). In this paper, ω ij is defined as in which d ij = ||µ i (k) − µ j (k)|| is the distance between A i and A j . From Equation (3), we can see that a greater between A i and A j results in the larger weight of the link (A i , A j ). Then, the Laplacian matrix can be expressed as is full connected if direct communication exists between every two vertices of the graph. G(k) is connected if a sequence of edges (a route) exists for any two vertices; otherwise, G(k) is unconnected. The information sharing is quite indispensable in the cooperative search and coverage mission, and thus the network connectivity must be preserved. Let 0 = λ 1 (k) ≤ λ 2 (k) ≤ . . . ≤ λ N (k) be the ordered eigenvalues of the Laplacian matrix L(k). According to the algebraic graph theory, if and only if λ 2 (k) > 0, the graph G(k) is connected. The second smallest eigenvalue of L(k), λ 2 (k), is also called as the algebraic connectivity of the graph.

Cognitive Map
One of the most important aspects of search and coverage mission is to create a representation of the environment that contains information about targets existence and uncertainty within each cell. In this section, the notion of cognitive map is used as the environmental representation. We firstly construct the target probability map (TPM), which is used to describe the probabilities of the cells being occupied by a target. Next, the other two maps are introduced based on TPM. One is uncertainty map (UM), and it can describe the uncertainty degree of the environment. The other is digital pheromone map (DPM), which is mainly used to establish the controllable revisit mechanism. Integrating TPM, UM, and DPM, the cognitive map can be designed.

The Target Probability Map
If prior intelligent information obtained is not accurate, UAV can not absolutely know the distribution of the targets. The TPM is used to describe target existing probability of each cell. The target existing probability p c (k) ∈ [0, 1] is modeled as a Bernoulli distribution, i.e., p c (k) = P(target present in cell c at time k), (1 − p c (k)) = P(no target present in cell c at time k). The higher p c (k) is, the more likely the UAV believes that a target is in cell c. p c (k) = 0.5 indicates that the UAV has no knowledge about target existence in cell c, because the probability that a target is present is equal to the probability that no target is present in cell c. Based on the above notations, the TPM is defined as In the mission processing, each UAV maintains its own TPM. The knowledge of each UAV on target existence state in cell c is based on its sensor observations and the shared information from the neighboring UAVs by communication. So, the update of individual TPM has two stages: update TPM based on sensor observations and update TPM based on shared information.

Update TPM Based on Sensor Observations
It is assumed that A i takes observations via a downward-facing camera. The field of view (FoV) is a circle with sensing radius R s that covers some cells in the surveillance region Ω. Therefore, at time k, the set of the covered cells inside the FoV of A i denotes Φ i,k The sensor observations about cell c at time k are defined as Z i,c,k ∈ {0, 1}, in which Z i,c,k = 1 indicates "target detection" and Z i,c,k = 0 indicates "non-target detection". However, the sensor is non-idear. The performance of the sensor can be described by detection probability p d and false alarm probability p f , i.e., P(Z i,c,k = 1|ζ c = 1) = p d and P(Z i,c,k = 1|ζ c = 0) = p f . We assume that, for all cells and UAVs, p d and p f are constant and known prior.
According to the sensor observations Z i,c,k , the p i,c,k p i,c (k) is update via following rule, which is based on Bayesian theory By introducing the nonlinear transformation that described in Equation (8), the update rule is rewritten as shown in Equation (9).
From Equation (9), we can see that the evolution of Q i,c,k depends on the number of detections that is taken over cell c up to time k, which is denoted as m i,c,k . In fact, m i,c,k → +∞, p i,c,k → 0 if no target is present in cell c, and p i,c,k → 1 if a target is present in cell c. The symbol "→" indicates "approaches". It means that the converged TPM, M i,TPM (k), can reflect the true existence and absence of the targets. To prove this view, the convergence property of the updating method (Equation (9)) is analyzed in Theorem 1.

Theorem 1.
Given the initial prior TPM 0 < p i,c,0 < 1 for A i ., and 0 < p f < 0.5 < p d < 1, the following conclusions hold by implementing the updating rule in Equation (9).

•
If ζ c = 1, which indicates a target is present in cell c, as m i,c,k → +∞, then Q i,c,k → −∞ (i.e., p i,c,k → 1) and (Q i,c,k /m i,c,k ) → p d ln(p f /p d ) If ζ c = 0, which indicates no target is present in cell c, as m i,c,k → +∞, then Q i,c,k → +∞ (i.e., p i,c,k → 0), and (Q i,c,k /m i,c,k ) → p f ln(p f /p d ) The proof of Theorem 1 is seen in Appendix A. From Equation (9), we can find several interesting properties. If 0 < p f < 0.5 < p d < 1, which means that the sensor could provide useful information, then p i,c (k) > p i,c (k + 1) if a target is present in cell c and p i,c (k) < p i,c (k + 1) if no target is present in cell c. Therefore, the upper bound p max and the lower bound p min of the target existing probability are introduced. If p i,c,k ≥ p max , the UAV has obtained enough evidence to support its belief in a target existing in cell c. If p i,c,k ≤ p min , the UAV confirms that no target exists in cell c. In order to confirm whether there is a target in cell c or not, the UAVs needs to detect cell c several times to update the target existence probability to approach the upper bound p max or the lower bound p min . Theorem 2 shows how to estimate the minimum number of observations required in a given cell c.

•
If a target is present in cell c, the minimum number of observations m + avg required in cell c to satisfy the condition p i,c,k ≥ p max is estimated as • If no target is present in the cell c, the minimum number of observations m − avg required in cell c to satisfy the condition p i,c,k ≤ p min is estimated as The proof of Theorem 2 is seen in [3]. We can see that the sensor performance is better, e.g., when the detection probability p d and the lower the false alarm probability p f are larger, the minimum number of observations required m + avg or m − avg is smaller. This means that the better the performance of the sensor, the faster the TPM converges.

Update TPM Based on Shared Information
First, according to the sensor observations Z i,c,k , each UAV A i at time k updates its own TPM using Bayesian rule in Equation (9).
Then, each UAV A i exchanges the updated TMP H i,c,k to neighboring UAVs, and updates its own TPM again (map fusion) by following consensus protocol in which ω i,c,k = 1 − (|N i (k)|/N), ω i,j,k = (1/N) for j ∈ N i (k), and ω i,c,k = 0 for j / ∈ N i (k). N i (k) is the set of neighbors of A i at time k.
Similar to the Theorem 1, the rationality and the convergence property of the distributed update and fusion method in Equation (13) is analyzed in Theorem 3.

Theorem 3.
Given the initial prior TPM 0 < p i,c,0 < 1 for A i ., and 0 < p f < 0.5 < p d < 1, if the network topology G(k) of the UAVs is connected at all times, the following conclusions hold by implementing the distributed update and fusion scheme in Equation (13).

•
If ζ c = 1, which indicates a target is present in the cell c, as m c,k → +∞, then Q i,c,k → −∞ (i.e., p i,c,k → 1) and If ζ c = 0, which indicates no target is present in the cell c, as m c,k → +∞, then Q i,c,k → +∞ (i.e., p i,c,k → 0), in which m c,k = N ∑ i=1 m i,c,k represents the total number of detections that taken over the cell c up to time k by the whole team of UAVs.
The proof of Theorem 3 is seen in Appendix B. Theorem 3 gives two important conclusions, as follows

1.
As m c,k → +∞, the converged TPM, M i,TPM (k), i = 1, 2, . . . , N, which is updated according to the distributed update and fusion scheme in Equation (13), can reflect the true existence and nonexistence of the targets. Thus, the rationality of the distributed update and fusion scheme in Equation (13) can be proven.
in which (m c,k /(Nk)) represents the global average detected rate of the cell c by the UAVs. Equation (14) shows that the speed of Q i,c,k approaching −∞ or +∞ is proportional to the global average detected rate of the cell c. This conclusion is important for improving the effectiveness of cooperative search and coverage.
When we implement Equation (13), there is a data overflow problem which is caused by extremely large or small value of Q i,c,k . Thus, we set a bound Q max > 0, such that Q i,c,k ∈ [−Q max , Q max ].

The Uncertainty Map
As mentioning above, if p min < p i,c,k < p max , A i cannot confirm whether there is a target in cell c or not; in other words, cell c is uncertain for A i . In order to quantify the uncertainty of the cells in the surveillance region, we define the following uncertainty associated with cell c, which is a function of its target existing probability p i,c,k in which K η > 0 is a gain parameter. |·| means absolute operator. According to Equations (8) and (15), the relationship between η i,c,k and p i,c,k is shown in Figure 3.
in which (mc,k/(Nk)) represents the global average detected rate of the cell c by the UAVs. Equation (14) shows that the speed of Qi,c,k approaching −∞ or +∞ is proportional to the global average detected rate of the cell c. This conclusion is important for improving the effectiveness of cooperative search and coverage. When we implement Equation (13), there is a data overflow problem which is caused by extremely large or small value of Qi,c,k. Thus, we set a bound Qmax > 0, such that Qi,c,k ∊ [−Qmax, Qmax].

The Uncertainty Map
As mentioning above, if pmin < pi,c,k < pmax, Ai cannot confirm whether there is a target in cell c or not; in other words, cell c is uncertain for Ai. In order to quantify the uncertainty of the cells in the surveillance region, we define the following uncertainty associated with cell c, which is a function of its target existing probability pi,c,k , , (15) in which Kη > 0 is a gain parameter. |·| means absolute operator. According to Equations (8) and (15), the relationship between ηi,c,k and pi,c,k is shown in Figure 3. It is seen that when the cell c has pi,c,k = 0.5, it has maximal uncertainty ηi,c,k = 1, which indicates that cell c is unknown for the UAVs completely. When cell c has pi,c,k = 1 or 0, it has minimal uncertainty ηi,c,k = 0. In this case, the UAV are completely sure about the target present or not. In the process of cooperative search and coverage, more attention should be paid to the cells with higher uncertainties. Based on the above notations, we can define the UM as

The Digital Pheromone Map
Theorem 3 shows the convergence speed of the TPM is proportional to the global average detected rate of the cells in the surveillance region. It means that, in order to reduce the uncertainty of the cells in the surveillance region as soon as possible, it is essential to improve the global average  It is seen that when the cell c has p i,c,k = 0.5, it has maximal uncertainty η i,c,k = 1, which indicates that cell c is unknown for the UAVs completely. When cell c has p i,c,k = 1 or 0, it has minimal uncertainty η i,c,k = 0. In this case, the UAV are completely sure about the target present or not. In the process of cooperative search and coverage, more attention should be paid to the cells with higher uncertainties. Based on the above notations, we can define the UM as

The Digital Pheromone Map
Theorem 3 shows the convergence speed of the TPM is proportional to the global average detected rate of the cells in the surveillance region. It means that, in order to reduce the uncertainty of the cells in the surveillance region as soon as possible, it is essential to improve the global average detected rate of the cells for the UAVs, which equates to improving the controllable revisit ability of the UAVs. For this reason, we develop a controllable revisit mechanism that is based on the characteristic of pheromone, such as "secrete, propagate and evaporation". This mechanism can be used to control the UAVs to revisit sub-areas with large target probability or high uncertainty, and hence improve the performance of search and coverage. The digital pheromone map is defined as in which s i,c,k is the pheromone strength in the cell c at time k. Digital pheromone supports three primary operations: (i) release: Pheromone can be released quantitatively into the cell; (ii) propagate: Pheromone propagates from a cell to its neighboring cells; (iii) evaporate: Pheromone gradually evaporates to zero over time. In order to simulate these three primary operations of pheromone, the propagation factor G s ∈ (0, 1) and the evaporation factor E s ∈ (0, 1) are defined.
Equation (18) describes evolution of the pheromone strength in cell c at time k is the pheromone releasing switch in cell c at time k. g(c, k) denotes the amount of pheromone propagated to cell c from its neighboring cells during the time period (k − 1, k], which can be described by the following Equation (19).
in which N(c) is the neighbor set of the cell c, the neighboring cells are denoted as c ∈ N(c), and |N(c )| is the number of the neighboring cells of cell c . s i (c , k − 1) is the pheromone strength in neighboring cell c at time (k − 1). k i (c , k) ∈ {0, 1} is switch coefficient of the pheromone releasing in the neighboring cell c at time k.
The key of controllable revisit mechanism is determining the switch coefficient of the pheromone releasing in the cells of the DPM at time k. The switching coefficient k i (c, k) indicates whether the cell c autonomously releases pheromone or not. If k i (c, k) = 1, cell c will release the pheromone, and the released pheromone will propagate to the neighboring cells. In this way, the pheromone fields will be formed and attract the UAVs to revisit cell c. In order to drive the UAVs to revisit the cells that have large target probability or high uncertainty, the pheromone releasing switch of cell c should be turned on in the following two cases.
Case 1: If target existing probability in cell c satisfies the condition p i,c,k > 0.5, it indicates that the UAVs are more likely to believe that a target exists in cell c. However, the UAVs do not have enough evidence to support their beliefs, since p i,c,k < p max . In this case, the UAVs are required to detect cell c again (revisit) so as to confirm target present state of cell c. Thus, the pheromone releasing switch of cell c should be turn on (i.e., k i (c, k) = 1), in order to attract the UAVs to revisit cell c. Once p i,c,k ≥ p max or p i,c,k ≤ p min , the switch should be turned off (i.e., k i (c, k) = 0) immediately, and, finally, the pheromone in cell c evaporates to zero over time.
Case 2: Assume that τ c is the last revisited time of cell c, T 0 is a pre-defined time threshold, and t is current time. If (t − τ c ) > T 0 , then k i (c, k) = 1, which means cell c has not been explored for a long time and should be revisited; otherwise, k i (c, k) = 0. Once cell c has revisited by a UAV, its pheromone releasing switch is turned off. After a period of time, if the condition (t − τ c ) > T 0 is satisfied again, the switch in cell c will be turn on again.

Distributed Path Planning Algorithm for Cooperative Search and Coverage
Optimal search and coverage paths can be designed based on the cognitive maps of the UAVs. Since the cognitive map of each UAV is incrementally updated based on its sensor observations and the shared information from other UAVs by communication, each UAV continually re-plans its path to guarantee the team of UAVs identifies maximum number of targets or gathers maximum information about environment. Therefore, multi-UAVs cooperative search and coverage problem is an on-line dynamic optimization problem. In this section, we plan optimal paths of the UAVs for cooperative search and coverage in the frame of distributed receding horizon optimizing.

Distributed Receding Horizon Optimizing Model for Cooperative Search and Coverage
In the frame of distributed receding horizon optimizing as shown in Figure 4, at time k, A i optimizes its control inputs U i (k) and computes its own path X i (k) based on the received the current state X −i (k) of its neighboring UAVs. A i computes its own path by solving the following local optimization problem Define J i as the "gain" at decision time step k. f i is the UAV dynamic model. Ξ and Θ denote the feasible state set and admissible control input set of UAV, respectively. U i (k) = {u i (k + 1|k), . . . , u i (k + T|k)} denote the control inputs sequence of A i over the time horizon [k + 1:k + T], which is determined at time step k. According to the simplified dynamic model of UAV, the control inputs u is the choose orientation ψ i of the aircraft.
} denote the received the current state of the neighboring UAVs of A i . Generally, in order to satisfy the collision avoidance and connectivity maintenance constraints, A i needs to obtain the future state sequence of its neighboring UAVs, denoted X −i (k + q k) = {x j (k + 1|k), . . . , x j (k + q|k), . . . , x j (k + T|k)|j ∈ N i (k), q = 1, 2, . . . , T}, when A i plans its own path X * However, due to the bandwidth limitation of the realistic wireless communication links, it is hard to receive the future state sequence of the all neighboring UAVs within one sampling time. In this paper, we adopt a feasible approach to reduce the communication between UAVs. Each UAV A i only receives the current state X −i (k) of its neighboring UAVs. Using current state X −i (k) and the UAV dynamic model, A i can estimate the future state sequence of the neighboring UAVs at next T time steps. Based on the estimate state sequence of the neighboring UAVs, A i computes its own path by solving the following local optimization problem in Equations (20) and (21). In order to guarantee the estimation accuracy and reduce the computation time, the planning horizon must be shorter. information about environment. Therefore, multi-UAVs cooperative search and coverage problem is an on-line dynamic optimization problem. In this section, we plan optimal paths of the UAVs for cooperative search and coverage in the frame of distributed receding horizon optimizing.

Distributed Receding Horizon Optimizing Model for Cooperative Search and Coverage
In the frame of distributed receding horizon optimizing as shown in Figure 4, at time k, Ai optimizes its control inputs ( ) i k U and computes its own path of its neighboring UAVs. Ai computes its own path by solving the following local optimization problem Define Ji as the "gain" at decision time step k. fi is the UAV dynamic model. Ξ and Θ denote the feasible state set and admissible control input set of UAV, respectively.
ui(k + T|k)} denote the control inputs sequence of Ai over the time horizon [k + 1:k + T], which is determined at time step k. According to the simplified dynamic model of UAV, the control inputs u is the choose orientation ψi of the aircraft.  (20) and (21). In order to guarantee the estimation accuracy and reduce the computation time, the planning horizon must be shorter.

Communication Network
Optimization in the planning horizon (RHO)

Search Path Decision Process
The whole process of the search path decision strategy is illustrated in Figure 5. There are three sub-stages in the whole decision process: (i) prediction; (ii) decision; and (iii) acting.

Search Path Decision Process
The whole process of the search path decision strategy is illustrated in Figure 5. There are three sub-stages in the whole decision process: (i) prediction; (ii) decision; and (iii) acting.
obtaining the state Expanding planning tree (Prediction) Optimal path selection (Decision) Vehicle dynamics (Acting) Target existing probability  Uncertainty  Digital pheromone strength   sensor  observations Z i,c,k from neighboring UAVs

Prediction Stage
In this stage, using current state xi(k) and the UAV dynamic model, each UAV Ai generates its predicted state sequence, in which xi(k + q|k) represents the predicted state at time (k + q).
According to the simplified UAV dynamic model, Ai can only move from one cell to another neighboring cell, and has only three possible orientation choices for the next time step, i.e., turn left, go straight, and turn right, based on the current position and orientation. Therefore, the predicted } reflects the set of reachable cells from the future time step (k + 1) to the future time step (k + q), based on the current state xi(k) at time k. These reachable cells form an expanding planning tree, as shown in Figure 6. The expanding planning tree is denoted is the set of the predicted reachable cells at the future time step (k + q). It is clear that the expanding planning tree contains 3 T candidate paths; the l-th path can be denoted as Pi l (k) = {Pi l (k + 1|k), …, Pi l (k + q|k), …,

Prediction Stage
In this stage, using current state x i (k) and the UAV dynamic model, each UAV A i generates its predicted state sequence, According to the simplified UAV dynamic model, A i can only move from one cell to another neighboring cell, and has only three possible orientation choices for the next time step, i.e., turn left, go straight, and turn right, based on the current position and orientation. Therefore, the predicted state sequence X i (k) = {x i (k + 1|k), . . . , x i (k + T|k)} reflects the set of reachable cells from the future time step (k + 1) to the future time step (k + q), based on the current state x i (k) at time k. These reachable cells form an expanding planning tree, as shown in Figure 6. The expanding planning tree is denoted is the set of the predicted reachable cells at the future time step (k + q). It is clear that the expanding planning tree contains 3 T candidate paths; the l-th path can be denoted as P i l (k) = {P i l (k + 1|k), . . . , P i l (k + q|k), . . . ,

Search Path Decision Process
The whole process of the search path decision strategy is illustrated in Figure 5. There are three sub-stages in the whole decision process: (i) prediction; (ii) decision; and (iii) acting. Vehicle dynamics (Acting)

Prediction Stage
In this stage, using current state xi(k) and the UAV dynamic model, each UAV Ai generates its predicted state sequence, in which xi(k + q|k) represents the predicted state at time (k + q).
According to the simplified UAV dynamic model, Ai can only move from one cell to another neighboring cell, and has only three possible orientation choices for the next time step, i.e., turn left, go straight, and turn right, based on the current position and orientation. Therefore, the predicted state sequence ( ) i k X = {xi(k + 1|k), …, xi(k + T|k)} reflects the set of reachable cells from the future time step (k + 1) to the future time step (k + q), based on the current state xi(k) at time k. These reachable cells form an expanding planning tree, as shown in Figure 6. The expanding planning tree is denoted is the set of the predicted reachable cells at the future time step (k + q). It is clear that the expanding planning tree contains 3 T candidate paths; the l-th path can be denoted as Pi l (k) = {Pi l (k + 1|k), …, Pi l (k + q|k), …,

Decision Stage
In this state, in the basis of the current knowledge about the environment (such as, the target existing probability p i,c,k , the uncertainty η i,c,k , and the digital pheromone strength s i,c,k ) available via the cognitive map, and the positions and orientations of the team of UAVs, each UAV uses a multi-objectives optimization function J i to select and update its search path. In other words, at each time step k, A i evaluates the value of J i associated with each path and selects the optimal path and determines the corresponding optimal control input (heading angle) sequence U*(k) = {ψ i * (k + 1|k),

Acting Stage
The first item ψ i * (k + 1|k) in the optimal decision sequence U*(k) is implemented to guide A i to visit the cells that waiting to be visited, and A i updates its own cognitive map into the next round of cycle.

Multi-Objectives of the Cooperative Search and Coverage Mission
In this section, we investigate different objectives of the search and coverage mission, which include (i) environment exploration; (ii) target discovery and environment coverage; (iii) collision avoidance and (iv) connectivity maintenance. So, we define the reward of environment exploration J A , the reward of target discovery and environment coverage J B , the cost of collision avoidance J C , and the cost of connectivity maintenance J D , respectively.

Environment Exploration
The main objective of the mission is exploring the whole environment to decrease the uncertainty about the existence and nonexistence of the targets in the cells of the environment. In other words, the UAV should follow the path where there is maximum uncertainty in the cognitive map. Thus, if A i selects the l-th candidate path, the reward of environment exploration J A can be defined as in which Φ i (c) represents the set of cells that would be searched by A i along the path P i l (k).

Target Discovery and Environment Coverage
Although the main objective of the search mission is to explore the environment to gather more information about it, one could also be interested in exploiting that information to concentrate the UAVs around the targets to capture them as soon as possible. The objective of target discovery and environment coverage is to distribute the UAVs across the environment while aggregating in more important sub-areas. These more important sub-areas refer to the cells that have high target probability (Case 1 in Section 3.3) or have not been explored for a long time (Case 2 in Section 3.3). Based on this consideration, we design the controllable revisit mechanism based on DPM in Section 3.3. Thus, if A i selects the l-th candidate path, P i l (k) = {P i l (k + 1|k), . . . , P i l (k + q|k), . . . , P i l (k + T|k)}, to follow, the reward of target discovery and environment coverage J B can be defined as

Collision Avoidance
We use the concept of virtual rivaling force, which is borrowed from [29], to solve the collision avoidance problem. The main idea of the rivaling force mechanism is to treat the paths of other UAVs as "soft obstacles" to be avoided in path selection. The virtual rivaling force F j→i exerted by A j to A i at time k is non-zero if the relative position and heading conditions are both hold. The relative position condition imposes the requirement that A j needs to be sufficiently close to A i . The relative heading condition means that, if A j exerts a rivaling force on A i , A j must be moving in the same or opposite direction as A i , approximately. The overall rivaling force exerted by all the other UAVs upon A i at time A schematic diagram of multi-UAVs collision avoidance strategy using virtual rivaling force is shown in Figure 7. P i 1 (k + 1|k), P i 2 (k + 1|k), and P i 3 (k + 1|k) are three candidate waypoints; θ 1 , θ 2 , and θ 3 are the angles between the direction of the virtual rivaling force F i and the directions in the candidate waypoints {P i 1 (k + 1|k), P i 2 (k + 1|k), and P i 3 (k + 1|k)}. The angles θ 1 , θ 2 , and θ 3 satisfy In order to avoid the collision between A i and A j , A i should select P i 1 (k + 1|k) as the next waypoint. Therefore, the cost of collision avoidance in waypoint P i l (k + q|k) is defined as in which |F i | is the magnitude of the overall rivaling force F i . θ(P i l (k + q|k)) ∈ [0,π] is the angle difference between the direction of F i and the direction in the waypoints P i l (k + q|k). If |F i | is small while θ(P i l (k + q|k)) is small, then the cost is small. Thus, if the UAV A i selects the l-th candidate path,

Collision Avoidance
We use the concept of virtual rivaling force, which is borrowed from [29], to solve the collision avoidance problem. The main idea of the rivaling force mechanism is to treat the paths of other UAVs as "soft obstacles" to be avoided in path selection. The virtual rivaling force Fj→i exerted by Aj to Ai at time k is non-zero if the relative position and heading conditions are both hold. The relative position condition imposes the requirement that Aj needs to be sufficiently close to Ai. The relative heading condition means that, if Aj exerts a rivaling force on Ai, Aj must be moving in the same or opposite direction as Ai, approximately. The overall rivaling force exerted by all the other UAVs upon Ai at time k is Fi = ∑j≠i Fj→i.
A schematic diagram of multi-UAVs collision avoidance strategy using virtual rivaling force is shown in Figure 7. Pi 1 (k + 1|k), Pi 2 (k + 1|k), and Pi 3 (k + 1|k) are three candidate waypoints; θ1, θ2, and θ3 are the angles between the direction of the virtual rivaling force Fi and the directions in the candidate waypoints {Pi 1 (k + 1|k), Pi 2 (k + 1|k), and Pi 3 (k + 1|k)}. The angles θ1, θ2, and θ3 satisfy 0 ≤ θ1 < θ2 < θ3 ≤ π. In order to avoid the collision between Ai and Aj, Ai should select Pi 1 (k + 1|k) as the next waypoint. Therefore, the cost of collision avoidance in waypoint Pi l (k + q|k) is defined as (24) in which |Fi| is the magnitude of the overall rivaling force Fi. θ(Pi l (k + q|k)) ∊ [0,π] is the angle difference between the direction of Fi and the direction in the waypoints Pi l (k + q|k). If |Fi| is small while θ(Pi l (k + q|k)) is small, then the cost is small. Thus, if the UAV Ai selects the l-th candidate path, Pi l (k) = {Pi l (k + 1|k), …, Pi l (k + q|k), …, Pi l (k + T|k)}, to follow, the cost of target collision avoidance JC is Figure 7. Collision avoidance based on virtual rivaling force.

Connectivity Maintenance
In order to realize the information exchange and sharing in the team of UAVs, it is usually required that the UAVs maintain a connected communication network. In this section, we present an algorithm that might be used to maintain communication connectivity. Our connectivity maintenance algorithm is based on the pairwise connectivity maintenance problem introduced in [30]. The pairwise connectivity maintenance problem is as shown in Figure 8. At time k, we consider two UAVs Ai and Aj at positions μi(k) and μj(k), such that ||μi(k) − μj(k)|| ≤ Rc. Rc is the communication range. If Ai and Aj are both restricted to moving inside the disk εi,j,k centered at μdisk(μi(k), μj(k)) = 0.5[μi(k) + μj(k)] with radius 0.5Rc, then the distance between the UAVs' positions at time (k + 1) is no more than Rc, i.e., the communication between Ai and Aj is still connected at time (k + 1).
The disk εi,j,k is the connectivity constraint set, which is defined as

Connectivity Maintenance
In order to realize the information exchange and sharing in the team of UAVs, it is usually required that the UAVs maintain a connected communication network. In this section, we present an algorithm that might be used to maintain communication connectivity. Our connectivity maintenance algorithm is based on the pairwise connectivity maintenance problem introduced in [30]. The pairwise connectivity maintenance problem is as shown in Figure 8. At time k, we consider two UAVs A i and A j at positions µ i (k) and µ j (k), such that ||µ i (k) − µ j (k)|| ≤ R c . R c is the communication range. If A i and A j are both restricted to moving inside the disk ε i,j,k centered at µ disk (µ i (k), µ j (k)) = 0.5[µ i (k) + µ j (k)] with radius 0.5R c , then the distance between the UAVs' positions at time (k + 1) is no more than R c , i.e., the communication between A i and A j is still connected at time (k + 1).
The disk ε i,j,k is the connectivity constraint set, which is defined as Therefore, in order to maintain the network connectivity, the motion of each UAV must be restricted. Specifically, if the network is connected at time k, a set ξ i,k of the allowable positions of A i need to be found. If the position of A i at time (k + 1) is inside the a set ξ i,k , i.e., µ i (k + 1) ∈ ξ i,k , the network is still connected at time (k + 1). Obviously, the allowable position constrained set ξ i, k of A i is in which N i (k) is the set of neighbors of A i at time k. For A i , the allowable position constrained set is determined by the intersection of all connectivity constraint sets ε i,j,k for j ∈ N i (k), as shown in Figure 9. Therefore, in order to maintain the network connectivity, the motion of each UAV must be restricted. Specifically, if the network is connected at time k, a set ξi,k of the allowable positions of Ai need to be found. If the position of Ai at time (k + 1) is inside the a set ξi,k, i.e., μi(k + 1) ∊ ξi,k, the network is still connected at time (k + 1). Obviously, the allowable position constrained set ξi, k of Ai is , , , (27) in which Ni(k) is the set of neighbors of Ai at time k. For Ai, the allowable position constrained set is determined by the intersection of all connectivity constraint sets εi,j,k for j ∊ Ni(k), as shown in Figure 9. Figure 8. The connectivity maintenance constraint. It is clear that the network topology determines the neighbors Ni(k) of Ai. The number of Ai's neighbors, denoted |Ni(k)|, determines the size of the allowable positions set ξi,k. With the increasing of |Ni(k)|, the size of ξi,k becomes small. This indicates that, if Ai needs to maintain more links with its neighbors, then the motion space of Ai will be smaller. It is not beneficial to improve the efficiency of the cooperative search and coverage. Thus, the tradeoff between the coverage enhancement and the connectivity maintenance should be considered, which aims to preserve a connected topology for the network while providing as much freedom for the UAVs as possible.
In this paper, the minimum spanning tree (MST) strategy is used to optimize the communication network topology. Based on such MST topology, only the links in the MST topology are maintained, and the redundant links are removed without violating the global connectivity condition. The Lemma 4 in [31] shows that, of all the related spanning sub-graphs GS applied for maintaining connectivity of network G, the sub-graph GMST based on the MST provides the largest allowable position Therefore, in order to maintain the network connectivity, the motion of each UAV must be restricted. Specifically, if the network is connected at time k, a set ξi,k of the allowable positions of Ai need to be found. If the position of Ai at time (k + 1) is inside the a set ξi,k, i.e., μi(k + 1) ∊ ξi,k, the network is still connected at time (k + 1). Obviously, the allowable position constrained set ξi, k of Ai is , , , in which Ni(k) is the set of neighbors of Ai at time k. For Ai, the allowable position constrained set is determined by the intersection of all connectivity constraint sets εi,j,k for j ∊ Ni(k), as shown in Figure 9. Figure 8. The connectivity maintenance constraint. It is clear that the network topology determines the neighbors Ni(k) of Ai. The number of Ai's neighbors, denoted |Ni(k)|, determines the size of the allowable positions set ξi,k. With the increasing of |Ni(k)|, the size of ξi,k becomes small. This indicates that, if Ai needs to maintain more links with its neighbors, then the motion space of Ai will be smaller. It is not beneficial to improve the efficiency of the cooperative search and coverage. Thus, the tradeoff between the coverage enhancement and the connectivity maintenance should be considered, which aims to preserve a connected topology for the network while providing as much freedom for the UAVs as possible.
In this paper, the minimum spanning tree (MST) strategy is used to optimize the communication network topology. Based on such MST topology, only the links in the MST topology are maintained, and the redundant links are removed without violating the global connectivity condition. The Lemma 4 in [31] shows that, of all the related spanning sub-graphs GS applied for maintaining connectivity of network G, the sub-graph GMST based on the MST provides the largest allowable position It is clear that the network topology determines the neighbors N i (k) of A i . The number of A i 's neighbors, denoted |N i (k)|, determines the size of the allowable positions set ξ i,k . With the increasing of |N i (k)|, the size of ξ i,k becomes small. This indicates that, if A i needs to maintain more links with its neighbors, then the motion space of A i will be smaller. It is not beneficial to improve the efficiency of the cooperative search and coverage. Thus, the tradeoff between the coverage enhancement and the connectivity maintenance should be considered, which aims to preserve a connected topology for the network while providing as much freedom for the UAVs as possible.
In this paper, the minimum spanning tree (MST) strategy is used to optimize the communication network topology. Based on such MST topology, only the links in the MST topology are maintained, and the redundant links are removed without violating the global connectivity condition. The Lemma 4 in [31] shows that, of all the related spanning sub-graphs G S applied for maintaining connectivity of network G, the sub-graph G MST based on the MST provides the largest allowable position constrained set on average for every UAVs. The distributed algorithm for the minimum spanning tree is presented in [32].
In order to ensure that the link between A i and A j is still connected at time (k + 1), we use the potential filed method to restrict the movement of A i inside the disk ε i,j,k . The potential field value V C i,j,k (c) at the location of cell c is given by in which µ c represents the position of cell c in the surveillance region. µ(ε i,j,k ) represents the center of the disk ε i,j,k . r c is a custom parameter that satisfies r c = 0.8R c . The potential field V C i,j,k is illustrated in Figure 10. If ∆d(c, ε i,j,k ) ≤ 0.5r c or ∆d(c, ε i,j,k ) > 0.5R c , then V C i,j,k = 0. If 0.5r c < ∆d(c, ε i,j,k ) < 0.5R c , V C i,j,k > 0 and V C i,j,k sharply increase when ∆d(c, ε i,j,k ) increases from 0.5r c to 0.5R c . When ∆d(c, constrained set on average for every UAVs. The distributed algorithm for the minimum spanning tree is presented in [32]. In order to ensure that the link between Ai and Aj is still connected at time (k + 1), we use the potential filed method to restrict the movement of Ai inside the disk εi,j,k. The potential field value C , , ( ) i j k V c at the location of cell c is given by in which μc represents the position of cell c in the surveillance region. μ(εi,j,k) represents the center of the disk εi,j,k. rc is a custom parameter that satisfies rc = 0.8Rc. The potential field V C i,j,k is illustrated in Figure 10. If ∆d(c, εi,j,k) ≤ 0.5rc or ∆d(c, εi,j,k) > 0.5Rc, then V C i,j,k = 0. If 0.5rc < ∆d(c, εi,j,k) < 0.5Rc, V C i,j,k > 0 and V C i,j,k sharply increase when ∆d(c, εi,j,k) increases from 0.5rc to 0.5Rc. When ∆d(c, εi,j,k) → 0.5Rc, V C i,j,k → +∞.  Figure 10. The potential field V C i,j,k is generated to maintain the connectivity between Ai and Aj. Therefore, the potential field that aims to restrict the movement of Ai inside the allowable position constrained set ξi, k is defined as Thus, if the UAV Ai selects the l-th candidate path, Pi l (k) = {Pi l (k + 1|k), …, Pi l (k + q|k), …, Pi l (k + T|k)}, to follow, the cost of connectivity maintenance JD can be defined as The total performance index function J(i, l, k) is the weighted sum of JA, JB, JC, and JD  Figure 10. The potential field V C i,j,k is generated to maintain the connectivity between A i and A j .
Therefore, the potential field that aims to restrict the movement of A i inside the allowable position constrained set ξ i, k is defined as Thus, if the UAV A i selects the l-th candidate path, P i l (k) = {P i l (k + 1|k), . . . , P i l (k + q|k), . . . , P i l (k + T|k)}, to follow, the cost of connectivity maintenance J D can be defined as The total performance index function J(i, l, k) is the weighted sum of J A , J B , J C , and J D

The Simulation Validation and Results Analysis
The simulations are carried out in Microsoft Visual C++6.0 on a 2.4GHz, 2GB RAM laptop (Lenovo ThinkPad T420si, Beijing, China).

Effect of the Controllable Revisit Mechanism Based on Digital Pheromone
In Scenario 1, there are four UAVs performing search and coverage mission over a 2 km × 2 km rectangular region. The surveillance region is uniformly divided into 50 × 50 cells of the same size. The size of each cell is 40 m × 40 m. Three targets are scattered in the surveillance region. Tables 1 and 2 list the initial settings of the UAVs and targets respectively. The communication range is R c = 4000 m. Thus, the communication range is large enough to maintain the direct communication between every two vertices so that the movement of the UAVs can be not restricted by the network connectivity constraint. The sensing radius is R s = 60 m. The detection and false alarm probabilities are p d = 0.9 and p f = 0.3, respectively. The time step in simulations is T s = 0.1 s.  Table 2. The initial settings of 3 targets in Scenario 1.

Targets T j Position (x j , y j )/m Occupant Cell (m, n)
T 1 (−620, −620) (10, 10) T 2 (−620, 580) (10, 40) T 3 (580, −620) (40, 10) In Scenario 1, we compared our proposed algorithm with the method of reference [16]. The essential difference of the two methods is that the controllable revisit mechanism based on digital pheromone is not taken account by the method of reference [16]. Thus, in order to verify the controllable revisit mechanism based on digital pheromone and enhance the capacities of target capture and region coverage for the UAVs, the following two groups of experiments are carried out: • Group A: the controllable revisit mechanism is considered (the proposed method); • Group B: the controllable revisit mechanism is not considered (the method of reference [16]).

Group A: With the Controllable Revisit Mechanism
First, when A 1 arrives at the cell (10, 10) and its sensor observations is "target detection", so the pheromone releasing switch of the cell (10, 10) in A 1 's cognitive map is turned on and the cell (10, 10) releases the pheromone to attract the A 1 to revisit it. At t = 1.4 s, as shown in Figure 11, T 1 presenting in the cell (10, 10) is confirmed by A 1 . Then, A 4 arrives at the cell (10, 40), and its sensor observation is "target detection", so the cell (10, 40) releases the pheromone to attract the A 4 to revisit it. At t = 7.7 s, as shown in Figure 12, T 2 presenting in the cell (10, 40) is confirmed by A 4 . At last, at t = 18.2 s, T 3 in the cell (40, 10) is confirmed by A 2 , as shown in Figure 13. Figure 14 shows the minimum distance between the UAVs. The collision distance is the length of the square cell, which is 40 m. The minimum distance is never lower than the collision distance; it means that any two UAVs are always not in the same cell; thus, the collision avoidance is guaranteed.

Group B: Without the Controllable Revisit Mechanism
The snapshots of finding the targets T2, T3, T1 are shown in Figures 15-17, respectively. From these snapshots, we can conclude that, due to lacking the controllable revisit mechanism, the capabilities of the UAVs for target capture in Group B are lower than Group A. Specifically, as shown in Figure 15, although A1 travels through the cell (10, 10) at the beginning of the search process, A1 does not revisit the cell (10, 10) to confirm whether there is a target in this cell or not, due to lacking the controllable revisit mechanism. Therefore, the target T1 is not captured (found and confirmed) early enough and is confirmed by A1 until t = 33.2 s, as shown in Figure 17. Figure 18 shows the minimum distance between the UAVs in Group B. It can be seen that two UAVs are never in the same cell; thus, the collision avoidance is guaranteed.

Group B: Without the Controllable Revisit Mechanism
The snapshots of finding the targets T 2 , T 3 , T 1 are shown in Figures 15-17, respectively. From these snapshots, we can conclude that, due to lacking the controllable revisit mechanism, the capabilities of the UAVs for target capture in Group B are lower than Group A. Specifically, as shown in Figure 15, although A 1 travels through the cell (10, 10) at the beginning of the search process, A 1 does not revisit the cell (10, 10) to confirm whether there is a target in this cell or not, due to lacking the controllable revisit mechanism. Therefore, the target T 1 is not captured (found and confirmed) early enough and is confirmed by A 1 until t = 33.2 s, as shown in Figure 17. Figure 18 shows the minimum distance between the UAVs in Group B. It can be seen that two UAVs are never in the same cell; thus, the collision avoidance is guaranteed.

Group B: Without the Controllable Revisit Mechanism
The snapshots of finding the targets T2, T3, T1 are shown in Figures 15-17, respectively. From these snapshots, we can conclude that, due to lacking the controllable revisit mechanism, the capabilities of the UAVs for target capture in Group B are lower than Group A. Specifically, as shown in Figure 15, although A1 travels through the cell (10,10) at the beginning of the search process, A1 does not revisit the cell (10, 10) to confirm whether there is a target in this cell or not, due to lacking the controllable revisit mechanism. Therefore, the target T1 is not captured (found and confirmed) early enough and is confirmed by A1 until t = 33.2 s, as shown in Figure 17. Figure 18 shows the minimum distance between the UAVs in Group B. It can be seen that two UAVs are never in the same cell; thus, the collision avoidance is guaranteed.   From the time when the targets are confirmed by the UAVs in Groups A and B, it can be concluded that the controllable revisit mechanism can concentrate the UAVs around the targets to capture them as soon as possible, and enhance the capacity of target capture for the UAVs.   From the time when the targets are confirmed by the UAVs in Groups A and B, it can be concluded that the controllable revisit mechanism can concentrate the UAVs around the targets to capture them as soon as possible, and enhance the capacity of target capture for the UAVs.   From the time when the targets are confirmed by the UAVs in Groups A and B, it can be concluded that the controllable revisit mechanism can concentrate the UAVs around the targets to capture them as soon as possible, and enhance the capacity of target capture for the UAVs.  From the time when the targets are confirmed by the UAVs in Groups A and B, it can be concluded that the controllable revisit mechanism can concentrate the UAVs around the targets to capture them as soon as possible, and enhance the capacity of target capture for the UAVs. Furthermore, in order to verify that the controllable revisit mechanism can enhance the region coverage capacity of the UAVs and then improve the performance of mission operation, we analyze and compare the global average region revisited rate and the global average uncertainty in Groups A and B, as shown in Figures 19 and 20, respectively.
First, we defined the following global average region revisited rate ("global" means it is averaged over the N UAVs and (L x × L y ) cells) to evaluate the region coverage capacity of the UAVs.
in which σ k is the global average region revisited rate at time k. σ i,k indicates the region revisited rate of A i at time k. ϑ i,k indicates the number of the cells that are being revisited by A i at time k. ε i,k indicates the number of the cells that need to be revisited at time k in the cognitive map of A i . Generally, Then, we defined the following global average uncertainty to evaluate the performance of mission operation.
It can be seen from Figure 19 that the global average region revisited rate of the cells in Group A is higher than that in Group B, due to considering the controllable revisit mechanism in Group A and lacking the controllable revisit mechanism in Group B. In Group A, after about 35 s, the target existence status of most of the cells has been confirmed, so that the average revisited rate remains substantially zero after about 35 s. However, at some moments (e.g., t = 32.2 s), the average revisited rate is approximately 100%. This is because the number of cells currently being revisited is approximately equal to the number of cells that need to be revisited at these moments. Thus, we can conclude that the controllable revisit mechanism can enhance the region coverage capacity of the UAVs.
It can be seen from Figure 20 that compared with Group B, the global average uncertainty in Group A decreases to 0 more quickly. In the controllable revisit mechanism, we design the digital pheromone as the "guidance information", which is used to concentrate the UAVs to revisit sub mission area with high target probability or maximum uncertainty. In this way, the global average detected rate of the cell increases accordingly. Therefore, according to Theorem 3, in Group A the convergence speed of the cognitive maps in the UAVs is higher than in Group B, which means that our method has better mission operation performance than the method in reference [16]. Furthermore, in order to verify that the controllable revisit mechanism can enhance the region coverage capacity of the UAVs and then improve the performance of mission operation, we analyze and compare the global average region revisited rate and the global average uncertainty in Groups A and B, as shown in Figures 19 and 20, respectively.
First, we defined the following global average region revisited rate ("global" means it is averaged over the N UAVs and (Lx × Ly) cells) to evaluate the region coverage capacity of the UAVs. , , 1 1 ,  It can be seen from Figure 19 that the global average region revisited rate of the cells in Group A is higher than that in Group B, due to considering the controllable revisit mechanism in Group A and lacking the controllable revisit mechanism in Group B. In Group A, after about 35 s, the target existence status of most of the cells has been confirmed, so that the average revisited rate remains substantially zero after about 35 s. However, at some moments (e.g., t = 32.2 s), the average revisited rate is approximately 100%. This is because the number of cells currently being revisited is approximately equal to the number of cells that need to be revisited at these moments. Thus, we can conclude that the controllable revisit mechanism can enhance the region coverage capacity of the UAVs.
It can be seen from Figure 20 that compared with Group B, the global average uncertainty in Group A decreases to 0 more quickly. In the controllable revisit mechanism, we design the digital pheromone as the "guidance information", which is used to concentrate the UAVs to revisit sub mission area with high target probability or maximum uncertainty. In this way, the global average detected rate of the cell increases accordingly. Therefore, according to Theorem 3, in Group A the convergence speed of the cognitive maps in the UAVs is higher than in Group B, which means that our method has better mission operation performance than the method in reference [16].

Effect of Different Connectivity Maintenance Control Strategies
In Scenario 2, we set limited communication radius Rc = 1000 m and compared our proposed algorithm with the method of reference [26]. The essential difference of the two methods is that, in our method, the MST topology strategy is used to optimize the communication network topology, and only the communication links in the MST topology are maintained without violating the global connectivity condition. In the method of reference [26], the communication network topology is full connected, which means that the links between each vehicle must be maintained during the mission process. Thus, in order to test the influence of different connectivity maintenance control strategies on the performance of mission operation, the following two groups of experiments are carried out: • Group A: the MST topology strategy (the proposed method); • Group B: the full connected topology (the method of reference [26]).

Group A: The Minimum Spanning Tree Topology
The snapshots, Figures 21-23, respectively, show the search paths and communication topology of the whole UAVs when the targets T1, T2, T3 are found in Group A. The communication topology of the UAVs is denoted by the green dashed lines in these snapshots; it can be seen that the minimum spanning tree is used to optimize the topology of the communication network during the search process, and only the communication links in the MST topology are maintained. The second smallest eigenvalue of the Laplacian matrix of the communication topology is illustrated in Figure 24. It can be seen that second smallest eigenvalue is always larger than zero, so the network connectivity is maintained during the mission process.   Figure 20. Comparison of the global average uncertainty (Scenario 1).

Effect of Different Connectivity Maintenance Control Strategies
In Scenario 2, we set limited communication radius R c = 1000 m and compared our proposed algorithm with the method of reference [26]. The essential difference of the two methods is that, in our method, the MST topology strategy is used to optimize the communication network topology, and only the communication links in the MST topology are maintained without violating the global connectivity condition. In the method of reference [26], the communication network topology is full connected, which means that the links between each vehicle must be maintained during the mission process. Thus, in order to test the influence of different connectivity maintenance control strategies on the performance of mission operation, the following two groups of experiments are carried out:

•
Group A: the MST topology strategy (the proposed method); • Group B: the full connected topology (the method of reference [26]).

Group A: The Minimum Spanning Tree Topology
The snapshots, Figures 21-23, respectively, show the search paths and communication topology of the whole UAVs when the targets T 1 , T 2 , T 3 are found in Group A. The communication topology of the UAVs is denoted by the green dashed lines in these snapshots; it can be seen that the minimum spanning tree is used to optimize the topology of the communication network during the search process, and only the communication links in the MST topology are maintained. The second smallest eigenvalue of the Laplacian matrix of the communication topology is illustrated in Figure 24. It can be seen that second smallest eigenvalue is always larger than zero, so the network connectivity is maintained during the mission process.

Effect of Different Connectivity Maintenance Control Strategies
In Scenario 2, we set limited communication radius Rc = 1000 m and compared our proposed algorithm with the method of reference [26]. The essential difference of the two methods is that, in our method, the MST topology strategy is used to optimize the communication network topology, and only the communication links in the MST topology are maintained without violating the global connectivity condition. In the method of reference [26], the communication network topology is full connected, which means that the links between each vehicle must be maintained during the mission process. Thus, in order to test the influence of different connectivity maintenance control strategies on the performance of mission operation, the following two groups of experiments are carried out:

•
Group A: the MST topology strategy (the proposed method); • Group B: the full connected topology (the method of reference [26]).

Group A: The Minimum Spanning Tree Topology
The snapshots, Figures 21-23, respectively, show the search paths and communication topology of the whole UAVs when the targets T1, T2, T3 are found in Group A. The communication topology of the UAVs is denoted by the green dashed lines in these snapshots; it can be seen that the minimum spanning tree is used to optimize the topology of the communication network during the search process, and only the communication links in the MST topology are maintained. The second smallest eigenvalue of the Laplacian matrix of the communication topology is illustrated in Figure 24. It can be seen that second smallest eigenvalue is always larger than zero, so the network connectivity is maintained during the mission process.

Group B: The Full Connected Topology
The snapshots, Figures 25-27, respectively, show the search paths and communication topology of the whole UAVs when the targets T2, T1, T3 are found in Group B. As shown in Figure 27, in order to maintain a fully connected communication topology, the UAVs are concentrated around the left half plane in the surveillance region. This causes the right half plane in the surveillance region that is not explored by the UAVs. The second smallest eigenvalue of the Laplacian matrix of the network

Group B: The Full Connected Topology
The snapshots, Figures 25-27, respectively, show the search paths and communication topology of the whole UAVs when the targets T2, T1, T3 are found in Group B. As shown in Figure 27, in order to maintain a fully connected communication topology, the UAVs are concentrated around the left half plane in the surveillance region. This causes the right half plane in the surveillance region that is not explored by the UAVs. The second smallest eigenvalue of the Laplacian matrix of the network

Group B: The Full Connected Topology
The snapshots, Figures 25-27, respectively, show the search paths and communication topology of the whole UAVs when the targets T2, T1, T3 are found in Group B. As shown in Figure 27, in order to maintain a fully connected communication topology, the UAVs are concentrated around the left half plane in the surveillance region. This causes the right half plane in the surveillance region that is not explored by the UAVs. The second smallest eigenvalue of the Laplacian matrix of the network

Group B: The Full Connected Topology
The snapshots, Figures 25-27, respectively, show the search paths and communication topology of the whole UAVs when the targets T 2 , T 1 , T 3 are found in Group B. As shown in Figure 27, in order to maintain a fully connected communication topology, the UAVs are concentrated around the left half plane in the surveillance region. This causes the right half plane in the surveillance region that is not explored by the UAVs. The second smallest eigenvalue of the Laplacian matrix of the network topology is illustrated in Figure 28. It can be seen that the network connectivity is maintained during the mission process. topology is illustrated in Figure 28. It can be seen that the network connectivity is maintained during the mission process.    topology is illustrated in Figure 28. It can be seen that the network connectivity is maintained during the mission process.    topology is illustrated in Figure 28. It can be seen that the network connectivity is maintained during the mission process.    Furthermore, in order to verify the connectivity maintenance scheme based on the MST topology optimization and efficaciously balance the coverage enhancement and the connectivity maintenance, and then improve the performance of mission operation, we analyze and compare the aggregated coverage of the whole surveillance region and the global average uncertainty in Groups A and B, as shown in Figures 29 and 30, respectively.
In Scenario 2, we defined the following aggregated coverage of the surveillance region for the UAVs to evaluate the region coverage capacity of the UAVs. Then, we also use the global average uncertainty, which is defined in Equation (33), to evaluate the performance of mission operation.
It can be seen that, in Group A, the aggregated coverage is higher than Group B (Figure 29), and the convergence speed of the cognitive maps in the whole UAVs is higher than Group B (Figure 30), which means that our method has better mission operation performance than the method in reference [26]. The reason for these results is that the connectivity maintenance scheme based on the MST communication topology optimization relaxes the communication constraints and gives more freedom for the UAVs to search the more areas. However in Group B the UAVs tend to maintain all the communication links with the other UAVs rather than exploring more areas. Hence, the connectivity maintenance scheme based on the MST communication topology optimization can efficaciously balance the coverage enhancement and the connectivity maintenance, which aim to maintain a connected topology for the network while minimizing the movement constraints on the UAVs.

Effect of Varying Number of UAVs
In

Effect of Varying Number of UAVs
In In Scenario 2, we defined the following aggregated coverage of the surveillance region for the UAVs to evaluate the region coverage capacity of the UAVs.
in which k denotes the aggregated coverage at time k. ω k indicates the aggregated number of the cells that have been searched at least once by at least one UAV up to time k. Then, we also use the global average uncertainty, which is defined in Equation (33), to evaluate the performance of mission operation.
It can be seen that, in Group A, the aggregated coverage is higher than Group B (Figure 29), and the convergence speed of the cognitive maps in the whole UAVs is higher than Group B (Figure 30), which means that our method has better mission operation performance than the method in reference [26]. The reason for these results is that the connectivity maintenance scheme based on the MST communication topology optimization relaxes the communication constraints and gives more freedom for the UAVs to search the more areas. However in Group B the UAVs tend to maintain all the communication links with the other UAVs rather than exploring more areas. Hence, the connectivity maintenance scheme based on the MST communication topology optimization can efficaciously balance the coverage enhancement and the connectivity maintenance, which aim to maintain a connected topology for the network while minimizing the movement constraints on the UAVs.

Effect of Varying Number of UAVs
In Scenario 3, we use different number of UAVs to test its influence on the average mission complete time (AMCT). In Monte-Carlo simulations, the number of targets is M = 3, while the number of UAVs is N = 5, 6 and 7; thus, the simulations can be divided into 3 groups of experiments. The number of experiments was 100 in each group. We need to calculate the AMCT of each group of experiments. The mission completion time is defined as the required mission execution time when the global average uncertainty η ≤ 0.01. For each experiment, the initial positions and the initial heading angles of the UAVs, and the initial positions of targets, are randomly generated. The communication range is R c = 4000 m. Thus, the communication range is large enough to maintain direct communication between each vertice so that the movement of the UAVs can be not restricted by the network connectivity constraint. The sensing radius is R s = 60 m. The detection and false alarm probabilities are p d = 0.9 and p f = 0.3, respectively. Figure 31 shows the AMCT for different numbers of UAVs, and we can summarize that the larger the number of UAVs, the smaller the AMCT. This is because if the number of UAVs is larger, more cells in the surveillance region are detected by the UAVs, and hence the global average detection rate of the cell will be higher. According to Theorem 3, the convergence speed of the cognitive maps in the UAVs is higher, so the AMCT is smaller.

Effect of Different Sensing Radius
In Scenario 4, we set different values of sensing radius to test its influence on the AMCT. In Monte-Carlo simulations, the sensing radius is 20 m, 60 m, 100 m if the number of UAVs is kept as N = 5 and the number of targets is kept as M = 3. The communication range is Rc = 4000 m. The detection and false alarm probabilities are pd = 0.9 and pf = 0.3, respectively. In Scenario 4, the mission completion time is defined as the required mission execution time when the global average uncertainty ≤ 0.1. For each experiment, the initial positions and the initial heading angles

Effect of Different Sensing Radius
In Scenario 4, we set different values of sensing radius to test its influence on the AMCT. In Monte-Carlo simulations, the sensing radius is 20 m, 60 m, 100 m if the number of UAVs is kept as N = 5 and the number of targets is kept as M = 3. The communication range is R c = 4000 m. The detection and false alarm probabilities are p d = 0.9 and p f = 0.3, respectively. In Scenario 4, the mission completion time is defined as the required mission execution time when the global average uncertainty η ≤ 0.1. For each experiment, the initial positions and the initial heading angles of the UAVs, and the initial positions of targets, are randomly generated. Figure 32 shows the AMCT for different values of sensing radius, and we can summarize that the larger the sensing radius, the smaller the AMCT. This is because if the sensing radius is larger, more cells in the surveillance region are detected by the UAVs, and hence the global average detected rate of the cell will be higher. According to Theorem 3, the convergence speed of the cognitive maps in the UAVs is higher, so the AMCT is smaller.

Effect of Different Sensing Radius
In Scenario 4, we set different values of sensing radius to test its influence on the AMCT. In Monte-Carlo simulations, the sensing radius is 20 m, 60 m, 100 m if the number of UAVs is kept as N = 5 and the number of targets is kept as M = 3. The communication range is Rc = 4000 m. The detection and false alarm probabilities are pd = 0.9 and pf = 0.3, respectively. In Scenario 4, the mission completion time is defined as the required mission execution time when the global average uncertainty ≤ 0.1. For each experiment, the initial positions and the initial heading angles of the UAVs, and the initial positions of targets, are randomly generated. Figure 32 shows the AMCT for different values of sensing radius, and we can summarize that the larger the sensing radius, the smaller the AMCT. This is because if the sensing radius is larger, more cells in the surveillance region are detected by the UAVs, and hence the global average detected rate of the cell will be higher. According to Theorem 3, the convergence speed of the cognitive maps in the UAVs is higher, so the AMCT is smaller.

Effect of Detection and False Alarm Probabilities
In Scenario 5, we set different detection probabilities and different false alarm probabilities to test their influence on the AMCT. In Monte-Carlo simulations, the detection probability pd is 0.6, 0.7 and 0.9, while the false alarm probability pf is 0.2, 0.3 and 0.4. The number of targets is M = 3. The number of UAVs is N = 5. The communication range is Rc = 4000 m. The sensing radius is Rs = 60 m. In Scenario 5, the mission completion time is defined as the required mission execution

Effect of Detection and False Alarm Probabilities
In Scenario 5, we set different detection probabilities and different false alarm probabilities to test their influence on the AMCT. In Monte-Carlo simulations, the detection probability p d is 0.6, 0.7 and 0.9, while the false alarm probability p f is 0.2, 0.3 and 0.4. The number of targets is M = 3. The number of UAVs is N = 5. The communication range is R c = 4000 m. The sensing radius is R s = 60 m. In Scenario 5, the mission completion time is defined as the required mission execution time when the global average uncertainty η ≤ 0.01. For each experiment, the initial positions and the initial heading angles of the UAVs, and the initial positions of targets, are randomly generated. Figure 33 shows the AMCT for different detection probabilities and different false alarm probabilities, and we can summarize that, for a given detection probability p d , the smaller the false alarm probability p f , the smaller the AMCT. For a given false alarm probability p f , the larger the detection probability p d and the smaller the AMCT. According to Theorem 2, the better the performance of the sensor is, such as the larger the detection probability p d and the smaller the false alarm probability p f , the minimum number of observations that require m + avg or m − avg is smaller. In other words, the target existence status in each cell can be confirmed by fewer searches, and hence the AMCT is smaller.
probabilities, and we can summarize that, for a given detection probability pd, the smaller the false alarm probability pf, the smaller the AMCT. For a given false alarm probability pf, the larger the detection probability pd and the smaller the AMCT. According to Theorem 2, the better the performance of the sensor is, such as the larger the detection probability pd and the smaller the false alarm probability pf, the minimum number of observations that require avg m + or avg m − is smaller.
In other words, the target existence status in each cell can be confirmed by fewer searches, and hence the AMCT is smaller.

Effect of Different Communication Range
In Scenario 6, we set different communication ranges to test their influence on the AMCT. In Monte-Carlo simulations, the communication range Rc is 800 m, 1000 m, 1200 m, 1500 m and 1800 m, and keep the number of UAVs is N = 4; the number of targets as M = 3. The sensing radius is Rs = 60 m. The detection and false alarm probabilities are pd = 0.9 and pf = 0.3, respectively. For each experiment, the initial positions and the initial heading angles of the UAVs, and the initial positions of targets, are randomly generated. In Scenario 6, the mission completion time is defined as the required mission execution time when the global average uncertainty η ≤ 0.1. Figure 34 shows the AMCT for different communication ranges, and we can summarize that the larger the communication range, the smaller the AMCT. This is because if the communication range is larger, the size of the connectivity constraint set εi,j,k is larger, and then the size of the allowable positions set ξi,k is larger, which gives more freedom for the UAVs without violating the network connectivity constraint. In this case, the UAVs can search more areas, and which leads to the higher average detected rate of the cell. According to Theorem 3, the convergence speed of the cognitive maps in the UAVs is higher, so the AMCT is smaller. However, it is worth noting that when the communication range is large enough, the AMCT is essentially unchanged. This is because if the communication range is large enough to maintain the direct communication links between every two vertices, the movements of the UAVs in the surveillance region are not bound by the communication range constraint. This means that the communication between UAVs can be seen as perfect, so that the influence of the communication range can be ignored.

Effect of Different Communication Range
In Scenario 6, we set different communication ranges to test their influence on the AMCT. In Monte-Carlo simulations, the communication range R c is 800 m, 1000 m, 1200 m, 1500 m and 1800 m, and keep the number of UAVs is N = 4; the number of targets as M = 3. The sensing radius is R s = 60 m. The detection and false alarm probabilities are p d = 0.9 and p f = 0.3, respectively. For each experiment, the initial positions and the initial heading angles of the UAVs, and the initial positions of targets, are randomly generated. In Scenario 6, the mission completion time is defined as the required mission execution time when the global average uncertainty η ≤ 0.1. Figure 34 shows the AMCT for different communication ranges, and we can summarize that the larger the communication range, the smaller the AMCT. This is because if the communication range is larger, the size of the connectivity constraint set ε i,j,k is larger, and then the size of the allowable positions set ξ i,k is larger, which gives more freedom for the UAVs without violating the network connectivity constraint. In this case, the UAVs can search more areas, and which leads to the higher average detected rate of the cell. According to Theorem 3, the convergence speed of the cognitive maps in the UAVs is higher, so the AMCT is smaller. However, it is worth noting that when the communication range is large enough, the AMCT is essentially unchanged. This is because if the communication range is large enough to maintain the direct communication links between every two vertices, the movements of the UAVs in the surveillance region are not bound by the communication range constraint. This means that the communication between UAVs can be seen as perfect, so that the influence of the communication range can be ignored. It is also worth noting that, in Scenario 1, 3, 4 and 5, the communication range is large enough to maintain the direct communication between every two vertices so that each UAV can exchange the target probability maps with all the other UAVs. Thus, there is no deviation between each two individual target probability maps. However, in Scenario 6, due to limited communication range, there exits the deviation between each two individual target probability maps. Hence, in Scenario 6, on one hand, we use the global average uncertainty, which is defined in Equation (33), to evaluate It is also worth noting that, in Scenario 1, 3, 4 and 5, the communication range is large enough to maintain the direct communication between every two vertices so that each UAV can exchange the target probability maps with all the other UAVs. Thus, there is no deviation between each two individual target probability maps. However, in Scenario 6, due to limited communication range, there exits the deviation between each two individual target probability maps. Hence, in Scenario 6, on one hand, we use the global average uncertainty, which is defined in Equation (33), to evaluate the convergence performance of the target probability maps in the UAVs. On the other hand, the following global average uncertainty deviation is defined to evaluate the consensus performance of the UAVs, which implement the distributed update and fusion scheme in Equation (13) for map merging.  It is also worth noting that, in Scenario 1, 3, 4 and 5, the communication range is large enough to maintain the direct communication between every two vertices so that each UAV can exchange the target probability maps with all the other UAVs. Thus, there is no deviation between each two individual target probability maps. However, in Scenario 6, due to limited communication range, there exits the deviation between each two individual target probability maps. Hence, in Scenario 6, on one hand, we use the global average uncertainty, which is defined in Equation (33), to evaluate the convergence performance of the target probability maps in the UAVs. On the other hand, the following global average uncertainty deviation is defined to evaluate the consensus performance of the UAVs, which implement the distributed update and fusion scheme in Equation (13)     From Figure 35, it can be seen that the larger communication range, the faster the global average uncertainty decreases to 0. This is because the larger communication range gives more freedom for the UAVs without violating the network connectivity constraint. In this case, the UAVs can search more areas, which leads to the higher average detected rate of the cells. According to Theorem 3, the map convergence speed of the whole UAVs is higher, which means the performance of mission operation is better (this conclusion is also confirmed in Figure 34).
From Figure 36, we can summarize that the larger the communication range, the faster the global From Figure 35, it can be seen that the larger communication range, the faster the global average uncertainty decreases to 0. This is because the larger communication range gives more freedom for the UAVs without violating the network connectivity constraint. In this case, the UAVs can search more areas, which leads to the higher average detected rate of the cells. According to Theorem 3, the map convergence speed of the whole UAVs is higher, which means the performance of mission operation is better (this conclusion is also confirmed in Figure 34).
From Figure 36, we can summarize that the larger the communication range, the faster the global average deviation of the uncertainties decreases to 0, which means the consensus performance is better. This is because, the larger the communication range, the more communication neighbors of each UAV A i . It means that more UAVs share their observation results with A i . It is beneficial to eliminate deviations between the individual probability maps of the UAVs.

Comparison of the Two Map Update Methods
To evaluate the effectiveness of our proposed distributed update and fusion scheme, two groups of experiments are carried out in Scenario 7.

•
Group A: uncooperative map update. Each UAV only updates its own TPM according to its sensor observations. • Group B: cooperative map merging. Each UAV first updates its own TPM according to its sensor observations, and then transmits the updated map to its neighbors for map fusion using our proposed update and fusion scheme in Equation (3).
We analyze and compare the global average uncertainty and the global average uncertainty deviation in Groups A and B, as shown in Figures 37 and 38, respectively.
From Figure 37, it can be seen that the average uncertainty converges faster using our proposed distributed update and fusion scheme than by using the uncooperative map update method. This is because, if the neighboring UAVs exchange their current observations with A i , A i can get more observations each time than in the case without exchanging the observations, which increases the global average detected rates (m c,k /(Nk)) over the covered cells. Specifically, in addition to the observations taken over the cells within its own sensing range, the UAV A i can also get the observations taken over the cells outside its field of view (FoV) but inside the FoV of its neighbors. According to Theorem 3, the convergence speed of the cognitive maps in the UAVs is higher. The performance of mission operation is better with a higher convergence speed.   From Figure 38, it can be seen that, implementing the distributed update and fusion scheme, the average uncertainty deviation can decrease faster to 0. This is because the deviation between individual probability maps can be eliminated by exchanging the TPM for map fusion. Eventually, all individual target probability maps can converge to the same one, which reflects the existence or absence of the targets within each cell. However, in the uncooperative map update method, each UAV only updates its own TPM according to its sensor observations. In this case, it is hard to guarantee consensus among UAVs to maintain similar target probability maps and thus lead to mission performance degradations.

Conclusions
This paper mainly studies cooperative search and coverage for a given bounded rectangle region by a team of UAVs with non-ideal sensors and limited communication ranges. The main contribution of this paper is to develop a distributed cooperative search and coverage algorithm, which generates paths to gather more information about the environment and find more targets. Following conclusions can be obtained.
1. By integrating TPM, UM, and DPM, the cognitive map can effectively represent targets existence, uncertainty, and revisiting requirement of each cell in the surveillance region. Thus, the cognitive map can serve as the UAV's knowledge of the environment, effectively.
Due to 1 T W k = 1 T , W k 1 = 1 and [W k ] i,j ≥ 0 for all i and j, we get e c,k = ∆ 1, k + ∆ 2,k ≤ ∆ 1,k + ∆ 2,k (A9) According to the consensus theory, it can be easily proved that, given any (N × 1) vector θ and any (N × N) matrix W k that is associated with a connected topology G(k), we can find a number 0 < λ < 1 such that the following inequality holds.
Therefore, we have Substituting Equations (A12) and (A13) into Equation (A9), we get According to Equation (9), it is easy to get