Multi-UAV Reconnaissance Task Assignment for Heterogeneous Targets Based on Modified Symbiotic Organisms Search Algorithm

This paper considers a reconnaissance task assignment problem for multiple unmanned aerial vehicles (UAVs) with different sensor capacities. A modified Multi-Objective Symbiotic Organisms Search algorithm (MOSOS) is adopted to optimize UAVs’ task sequence. A time-window based task model is built for heterogeneous targets. Then, the basic task assignment problem is formulated as a Multiple Time-Window based Dubins Travelling Salesmen Problem (MTWDTSP). Double-chain encoding rules and several criteria are established for the task assignment problem under logical and physical constraints. Pareto dominance determination and global adaptive scaling factors is introduced to improve the performance of original MOSOS. Numerical simulation and Monte-Carlo simulation results for the task assignment problem are also presented in this paper, whereas comparisons with non-dominated sorting genetic algorithm (NSGA-II) and original MOSOS are made to verify the superiority of the proposed method. The simulation results demonstrate that modified SOS outperforms the original MOSOS and NSGA-II in terms of optimality and efficiency of the assignment results in MTWDTSP.


Introduction
Unmanned aerial vehicles (UAVs) have played increasingly important roles in military reconnaissance during the last decade. When dealing with multiple reconnaissance tasks [1], it is often required that UAVs form teams and work collaboratively, which necessitates cooperative task assignments [2]. The purpose of cooperative task assignments is to allocate necessary tasks and determine the appropriate task execution sequence to the UAVs to maximize overall performance. The battlefield situation and the range of permissible performance of the UAV set are commonly considered in the cooperative task assignments.
The cooperative task assignment problem (CTAP) is usually considered as NP-Hard Problems. In some prior works, the basic CTAP was formulated as a mixed-integrated linear programming (MILP) problem [3], which aims to obtain an optimal assignment solution. Decentralized approaches were taken to deal with the dynamic, unexpected situations by using the information obtained by itself or through communications with neighboring UAVs [4]. With the intention of reducing the computational burden, optimization algorithms such as genetic algorithms (GA) were adopted to obtain suboptimal solutions [5]. Shaferman et al. [6] took UAVs' dynamic constrains into consideration to ensure flyable trajectories and applied a stochastic search method to improve assignment solutions. Gyeongtaek et al. [7] put forward a market-based distributed assignment algorithm for timing constrained tasks and determine the allocation sequence by the estimated time of arrival (ETA) of each UAV. In the works mentioned above, UAVs' paths to the task positions were not precisely calculated during the process of assignment.
In the actual assignment planner, the path for the UAVs to perform the tasks should also be taken into consideration. Thus, in some other prior studies, the basic CTAP were formulated as a travelling salesman problem (TSP) [8], which aims to find the optimal scheme for a salesman to visit all the cities with the shortest path. For the original TSP, the length between two cities is usually considered as Euclidean length. According to the dynamic characteristics and constraints of UAVs, Dubins vehicle model and Dubins path were introduced into the CTAP for route length computation [9]. Thus, the CTAP was formulated as a Dubins travelling salesman problem (DTSP) [10]. Furthermore, Zhang et al. [11] took the effective range of UAVs' sensors into consideration and formulated the CTAP as a Dubins travelling salesman problem with neighborhood (DTSPN). Wang et al. [12] considered the multi-UAV reconnaissance task allocation as an extended multiple DTSP, where the visit paths to the heterogeneous targets must meet specific constraints due to the targets' feature. Since the reconnaissance tasks are time-sensitive, the execution time required will affect UAVs' choices of tasks and flight paths. Considering the time windows of the tasks (targets), Karabulut et al. [13] formulated the TSP with a service time containing ready time and due time, which was so called TSPTW. Nunes et al. [14] explained the different circumstance, under which the time window was took as a hard temporal constraint and a soft temporal constraint.
As for the subject of CTAP, the literature mentioned above concentrated on the heterogeneity features and kinematic constraints of UAVs, but the features of targets were ignored or considered to be homogeneous. In this paper, we focus on the reconnaissance task allocation problem for UAVs, considering ground targets with heterogeneous features and sizes are considered. To describe the CTAP, a novel cooperative reconnaissance task assignment model for multi-UAV is formulated: multiple time window-based Dubins travelling salesmen problem (MTWDTSP). Compared with the MILP form, MTWDTSP takes into account the influence the influence of various flight paths on the assignment results; as for compared with common TSP form, MTWDTSP could deal with the effect of different time windows on task sequence. In this particular problem, heterogeneous reconnaissance targets are also presented as point, strip, and surface targets, according to the features of targets and the performance of UAV sensor. To accomplish the reconnaissance of each target, UAVs must cover all the heterogeneous targets with the airborne sensors.
The MTWDTSP is a typical NP-hard multi-objective optimization problem. Group intelligent optimization methods (such as Bayesian approach [15]) are employed to solve the problem with independent task points in a static environment within limited space, which can reduce the complexity of modeling and calculation. In recent years, heuristic and bio-inspired algorithms based group intelligent optimization methods have been adopted to solve the multi-objective optimization problem, including the simulated annealing algorithm [16], the Tabu search algorithm [17], genetic algorithms (NSGA-II) [18,19], the artificial fish school algorithm [20], ant colony algorithm [21] and the particle swarm optimization (niching PSO) [22][23][24][25]. These algorithms perform better than most traditional mathematical techniques in solving these problems, because they do not require substantial gradient information. Traditional group intelligent optimization algorithms can obtain locally optimal solutions for low-dimensional problems. As the number of UAVs and tasks (targets) grows, the traditional group intelligent optimization algorithms can easily premature and fall into local optimum. In order to search over the designed possible solution spaces as much as possible, Cheng and Prayogo [26] first proposed in 2014 a new meta-heuristic optimization algorithm known as symbiotic organisms search (SOS). In this paper, we introduce a modified multi-objective symbiotic organisms search (MOSOS) [27] to solve the MTWDTSP with dynamic time window constraints, task type constraints, and UAV sensor constraints. To improve the performance of MOSOS, a globally adaptive parasitism parameter [28] is used to increase convergence speed. Pareto optimal solution set is introduced to improve the variety of population and increase the probability of converging to the optimal solution. Also, a double-chain chromosomes encoding is adopted to pre-process the designed solution space for the efficiency of the algorithm.

Mathematical Model of the MTWDTSP
In this section, the models of sensors, targets, and UAVs are established. The paths of covering targets with different shapes are also designed to calculate the estimated reconnaissance time.

Modeling of the UAV and Sensor
In the particular problem, a set of heterogeneous targets need to be reconnoitered by different types of UAVs subject to several types of constraints. In this article, Dubins model for UAV [10] is introduced with the following basic assumptions: (1) the velocity of each UAV is constant; (2) the UAVs perform the reconnaissance task at given altitudes; (3) the UAVs fly at different altitudes without inter-UAV collision; (4) the flight time of each UAV is limited. At this point, the spatial dimension of the UAV can be reduced from three dimensions to two dimensions (X U -Z U plane): where x and z represent the UAV's coordinates in plane space, V U represents for the speed, r min represents the minimum turning radius, and ψ is the flight yaw angle. Airborne sensors are required to reconnoiter ground targets. It is assumed that the reconnaissance task on this target is finished when a ground target is fully covered by the sensor's field of view. In this study, a sensor's field of view is considered as a circle region with the constant opening angle on the ground, neglecting the influence of altitude and noise, as illustrated in Figure 1. In Figure 1, H U is the flight altitude, W U is the detection width of the sensor, the axis X U -Z U -Y U denotes the body/dynamic system of the UAV.

Mathematical Model of the MTWDTSP
In this section, the models of sensors, targets, and UAVs are established. The paths of covering targets with different shapes are also designed to calculate the estimated reconnaissance time.

Modeling of the UAV and Sensor
In the particular problem, a set of heterogeneous targets need to be reconnoitered by different types of UAVs subject to several types of constraints. In this article, Dubins model for UAV [10] is introduced with the following basic assumptions: (1) the velocity of each UAV is constant; (2) the UAVs perform the reconnaissance task at given altitudes; (3) the UAVs fly at different altitudes without inter-UAV collision; (4) the flight time of each UAV is limited. At this point, the spatial dimension of the UAV can be reduced from three dimensions to two dimensions (XU-ZU plane): where x and z represent the UAV's coordinates in plane space, VU represents for the speed, rmin represents the minimum turning radius, and  is the flight yaw angle.
Airborne sensors are required to reconnoiter ground targets. It is assumed that the reconnaissance task on this target is finished when a ground target is fully covered by the sensor's field of view. In this study, a sensor's field of view is considered as a circle region with the constant opening angle on the ground, neglecting the influence of altitude and noise, as illustrated in Figure  1. In Figure 1, HU is the flight altitude, WU is the detection width of the sensor, the axis XU-ZU-YU denotes the body/dynamic system of the UAV.

Modeling of Heterogeneous Ground Targets
In general, one of the fundamental features is the target shape. To facilitate the determination of the reconnaissance duration and path, the shape of a target can be divided into point targets, strip targets, and surface targets. Due to the different sizes of targets, the same UAV will have different detection times for each type of target. Assuming that the detection time is always lower than the time window, then the following provisions are made to determine a suitable target detection time: 1) For point targets, the default detection time is a fixed duration: 2) For strip targets, the UAV detects targets along the direction of strip extension (shown as

Modeling of Heterogeneous Ground Targets
In general, one of the fundamental features is the target shape. To facilitate the determination of the reconnaissance duration and path, the shape of a target can be divided into point targets, strip targets, and surface targets. Due to the different sizes of targets, the same UAV will have different detection times for each type of target. Assuming that the detection time is always lower than the time window, then the following provisions are made to determine a suitable target detection time: (1) For point targets, the default detection time is a fixed duration: t d = t Dec ; (2) For strip targets, the UAV detects targets along the direction of strip extension (shown as Figure 2), and the detection duration is t d = L st /V U , where L st represents the extension length of the target and V U is the UAV's speed. (3) For surface targets, the UAV must fly along the "Z" path to detect the entire area. Considering the irregularity of the detection area, as shown in Figure 3a, an equal interval rotation method should first be performed to find the smallest circumscribed rectangle (SCR) of the irregular area. Then the circumscribed rectangle should be divided into several smaller rectangles, and finally, the UAV can detect the whole area along the "Z" path. Denote the SCR as H min = L s f * W s f , where L sf and W sf represent the length and width, respectively, shown in Figure 3b. Therefore, the detection duration of surface targets is where w U is the detection width of a single UAV, · , · represent for the upward and downward rounding operators. 3) For surface targets, the UAV must fly along the "Z" path to detect the entire area.
Considering the irregularity of the detection area, as shown in Figure 3a, an equal interval rotation method should first be performed to find the smallest circumscribed rectangle (SCR) of the irregular area. Then the circumscribed rectangle should be divided into several smaller rectangles, and finally, the UAV can detect the whole area along the "Z" path.
Denote the SCR as * min sf sf H L W = , where Lsf and Wsf represent the length and width, respectively, shown in Figure 3(b). Therefore, the detection duration of surface targets is ( )    It is worth pointing out that the way to select the entry point of the surface or the strip and the generation of Dubins-Path may be found in [12].
The second feature of heterogeneous target is the detectable time. The time attribute of the reconnaissance task means that the target has a fixed time window (as shown in Figure 4): where is the earliest time available for detection and is the latest time available, which is considered as a hard-constraint. Denote as the arrival time of the UAV, and then the waiting time (shown in Figure 5.) before execution can be calculated as follows:  3) For surface targets, the UAV must fly along the "Z" path to detect the entire area. Considering the irregularity of the detection area, as shown in Figure 3a, an equal interval rotation method should first be performed to find the smallest circumscribed rectangle (SCR) of the irregular area. Then the circumscribed rectangle should be divided into several smaller rectangles, and finally, the UAV can detect the whole area along the ʺZ" path.
Denote the SCR as where Lsf and Wsf represent the length and width, respectively, shown in Figure 3(b). Therefore, the detection duration of surface targets is    It is worth pointing out that the way to select the entry point of the surface or the strip and the generation of Dubins-Path may be found in [12].
The second feature of heterogeneous target is the detectable time. The time attribute of the reconnaissance task means that the target has a fixed time window (as shown in Figure 4): where is the earliest time available for detection and is the latest time available, which is considered as a hard-constraint. Denote as the arrival time of the UAV, and then the waiting time (shown in Figure 5.) before execution can be calculated as follows: It is worth pointing out that the way to select the entry point of the surface or the strip and the generation of Dubins-Path may be found in [12].
The second feature of heterogeneous target is the detectable time. The time attribute of the reconnaissance task means that the target has a fixed time window (as shown in Figure 4): T task = (t begin , t end ), where t begin is the earliest time available for detection and t end is the latest time available, which is considered as a hard-constraint. Denote Get uav as the arrival time of the UAV, and then the waiting time (shown in Figure 5.) before execution t c can be calculated as follows: where t c = −1 means the failure of this task.

Assignment Model of Reconnaissance Tasks for Multi-UAVs
Task assignment problem for multiple cooperative UAVs can be described as follows: for while ensuring that the overall performance indicator Asn J will be optimized. The objective of the task assignment is to minimize the total cost (performance indicator), which can be expressed as: Considering the reconnaissance model, the number of the UAVs, the type of restrictions, and the time window of the task itself, certain tasks may not be successfully implemented. Therefore, determining how to prioritize the important tasks and make the total flight distance as short as possible become the optimization goals of the problem. To describe the above progress in a mathematical form, the decision variable of task assignments can be given as follows: Arrival Time 1 means the failure of this task.

Assignment Model of Reconnaissance Tasks for Multi-UAVs
Task assignment problem for multiple cooperative UAVs can be described as follows: for while ensuring that the overall performance indicator Asn J will be optimized. The objective of the task assignment is to minimize the total cost (performance indicator), which can be expressed as: Considering the reconnaissance model, the number of the UAVs, the type of restrictions, and the time window of the task itself, certain tasks may not be successfully implemented. Therefore, determining how to prioritize the important tasks and make the total flight distance as short as possible become the optimization goals of the problem. To describe the above progress in a mathematical form, the decision variable of task assignments can be given as follows: Arrival Time

Assignment Model of Reconnaissance Tasks for Multi-UAVs
Task assignment problem for multiple cooperative UAVs can be described as follows: for N U UAVs and N T reconnaissance tasks of geographically dispersed targets, under the decision variable X| N T N U , the whole group of the UAVs is able to complete all tasks satisfying all types of constraints C Asn (c 1 , c 2 · · · c N C ) while ensuring that the overall performance indicator J Asn will be optimized. The objective of the task assignment is to minimize the total cost (performance indicator), which can be expressed as: X| Considering the reconnaissance model, the number of the UAVs, the type of restrictions, and the time window of the task itself, certain tasks may not be successfully implemented. Therefore, determining how to prioritize the important tasks and make the total flight distance as short as possible become the optimization goals of the problem. To describe the above progress in a mathematical form, the decision variable of task assignments can be given as follows: where T = < 1,2 . . . . N T > denotes the set of the number for task points, k denotes the type of UAV, p denotes the pth drone of the kth type, and the number of all reconnaissance tasks is N T . Therefore, the decision variable has the following matrix form: a 00 a 01 · · · · · · · · · a 0T max a 10 a 11 · · · · · · · · · a 1T max . . .
Given that the base node number is 0, it can be seen that the element has the form of a sparse matrix. In this study, considering that each task reconnoiters only once at most, there is at most one nonzero item per row in the sparse matrix.

Mathematical Model of the Performance Indicator (PI) and Constraints
In this problem, the constraints of assignment involve two parts: physical and logical. Physical constraints are related to UAVs' flight characteristics and task properties, while logical constraints focus on the requirements of each type of task and the strategies for UAVs to implement the reconnaissance tasks. Some of the fundamental constraints are as follows [29,30]: (1) All UAVs start from the base and eventually return to the base. (2) Each task can be completed no more than once.
(3) The assignment scheme should match the time window and imaging requirements of each individual task. (4) Each UAV can be assigned only one task at the same time period. (5) The total time for each UAV to implement the tasks should not be longer than the longest flight time.
The details of the constraints are described as below: (1) PIs based on task assignment: where S(X) represents the total weight of the tasks that have been assigned, w j reflects the importance of the jth reconnaissance task, N U is the total number of all UAVs, and N(X) is the number of the employed UAVs to execute the assigned tasks. The PI (7) is designed in a flexible form with the purpose to let the task assignment model choose the proper objective function to optimize depending on the conditions of UAVs and tasks. λ represents the adjustment parameter, and divides min J 1 (X) into two parts: when λ = 1, minJ 1 (X) = min(N U + 2 − S(X)), which indicates that the PI chooses to optimize the maximum weight of assigned tasks; when λ = 0, minJ 1 (X) = minN(X), which indicates that the PI chooses to optimize the minimum number of the employed UAVs. Defining the symbolic functions as the following form: so that: If the value of sgn(χ) is kept within the range of 0 to 1, the following condition is satisfied: Then, it can be ensured that the PI optimizes the maximum number of tasks, and after all the tasks are successfully executed, the number of UAVs is minimized at the same time. Since N(X) ≤ N U , the following must hold: This equation is a universal calculation formula for all the possible decision variables X. Therefore, when designing the task weight coefficient w, it is necessary to satisfy the following requirement: Denote w = w 1 , w 2 , w 3 , . . . , w N T , and w = [0, w] represents the extension vector of a weight vector. Then, there is the following expression: If and only if the non-zero row element of the matrix of all elements in decision variable X is N T , then all targets have been successfully reconnoitered.
(2) PI based on flight distance: where Dis ij represents the distance from node i to node j. The total optimization PI can be written as follows: The PI must satisfy the following constraints [29]: (a) Each target is scouted for only once: (c) Each UAV departs from the base and eventually returns to the base (as mentioned above, the base is noted as 0): ∀j = 1, 2, 3, . . . , N T , and since χ kp ij = 1, then: (d) Time window constraints: Define the time window of task j : ms j , mo j , T j indicates when the UAV will detect the target; therefore, the constraint can be given as follows: where Ser k i is the reconnaissance time of the kth type of UAV, t k ij indicates the time for the kth type of UAV to fly from node i to node j, and t Cj is the wait time before executing the task.
(e) Sensor constraints of tasks: The minimum sensor accuracy required for the task j is SenT j , the sensor accuracy of the kth type of UAV is SenV k j , and the constraint can be given as follows: (f) The total flight time of each UAV cannot exceed the maximum flight time: where node 0 represents the base by default, last represents the last reconnaissance task point of the UAV, and the latter item t k last−0 represents the time to return from the last task point.

Modified SOS for MTWDTSP
The original MOSOS algorithm works on the cooperative behavior seen among organisms in nature. During the search process, each organism benefits from continuous interaction with others in three different phases [24]: mutualism, commensalism and parasitism. Mutualism allows the two sides to benefit from each other; commensalism benefits one party, while the other party is not affected; parasitism benefits one party, and the other party suffers. These three phases are adopted from the most common symbioses used by organisms to increase their fitness and survival advantage over the long term. The mechanisms for updating the best organism are triggered after one generation of organisms has completed its three phases. The phases are repeated until the stopping criterion is achieved. The original MOSOS enjoys some advantages such as simplicity in operation, few control parameters, good stability, and strong optimization ability, but there are several shortcomings such as early maturity and delay in later-search [30]. As for the proposed problem, combining with the motivation to improve the diversity and quality of the solution, the accuracy of the algorithm convergence, and reduce the computational complexity, several modifications are introduced to the original MOSOS algorithm [31]. A "Pareto dominance determination" approach [32] is adopted to preprocess the performance indicator of the proposed problem to improve the diversity of non-inferior solutions. Meanwhile, globally adaptive approach is used to improve speed, accuracy and convergence characteristics of the original MOSOS.

Double-Chain Encoding of the Decision Variable
Considering the constraints of the multi-UAV reconnaissance task assignment problem, a double-chain encoding method is developed, where both chromosomes are encoded by integers. In Section 2, the decision variable is given by Formulas (4)- (6); and then the individual variable dimensions for each particle are as follows: Considering the dimension, the number of individuals and the iteration equations, undoubtedly, there will be a complication of the problem and a slow convergence of the algorithm. As noted above, each element of the decision variable χ is in the form of a sparse matrix. If the sparse matrix is compressed and the location of nonzero elements is stored, then the particle's variable dimension will be compressed as follows: where M represents the number of nonzero elements in each sparse matrix. Compared to the original coding, the decision space is greatly compressed, but it is still large. It can be imagined that there will still be a large number of zero matrices in the optimal decision variable χ.
Considering the constraint functions (a), (b) and (c) proposed in Section 2.4, the double-chain encoding is used from the perspective of the task point, and the decision code string is then mapped to the task sequence of each UAV. For example, for the assignment of six tasks, there are currently three types of UAVs, each of which has 3, 2, and 3 UAVs, so that if T UAV = (i,j) represents the jth drone of the ith type, the example can be easily constructed as in Table 1.It is clear that for the UAV (1,1), the UAV passes through mission points 3 and 5, with mission point 5 flown to first and mission point 3 flown to last. A detailed decoding table is constructed as shown in Table 2. Considering the uniform dimensions of the double-chain encoding, T UAV = (i,j) is reduced dimensionally so that the following formulas are satisfied: then In summary, the decision variables of 2 × N T dimension can be defined to describe the mission assignment status of the UAV, and N T represents the number of tasks. The assignment scheme generated by this encoding determines that each task point is visited by a UAV, but in reality, due to constraints such as the time window, there will be situations where certain task points cannot be executed. M can be set to zero to indicate that a UAV was not available to execute the task.

Pareto Dominance Determination and Optimal Solution Set
According to the mathematical model constructed in Section 2, the task assignment problem can be summarized as a nonlinear multi-objective optimization (minimum) problem: where X = (χ 1 , χ 2 , χ 3 , . . . , χ n ) is the n-dimensional decision vector in R n space and c i (X) ≤ 0 (i = 1, 2, 3, . . . , p) represents the constraints of the optimization problem. The series of sub-PIs may conflict with each other; this nonlinear multi-objective optimization problem will produce multiple solutions. French economist V. Pareto proposed the concept of Pareto solution set for comparing the different feasible solutions. Assuming that for the minimization optimization problem, there are two feasible solutions X 1 and X 2 , if the following holds: Then, X 1 dominates X 2 , and X 1 is called a noninferior solution, and X 2 is called an inferior solution. If solutions X 1 and X 2 are not dominated by each other, then both solutions are noninferior solutions and are added to the noninferior solution set. This solution set is the so-called "Pareto solution set" and is often referred to as the optimal frontier (as shown in Figure 6).
Sensors 2019, 19, x FOR PEER REVIEW 10 of 20 where X , , , … , is the n-dimensional decision vector in space and 0( 1,2,3, … , )represents the constraints of the optimization problem. The series of sub-PIs may conflict with each other; this nonlinear multi-objective optimization problem will produce multiple solutions. French economist V. Pareto proposed the concept of Pareto solution set for comparing the different feasible solutions. Assuming that for the minimization optimization problem, there are two feasible solutions and , if the following holds: Then, dominates , and is called a noninferior solution, and is called an inferior solution. If solutions and are not dominated by each other, then both solutions are noninferior solutions and are added to the noninferior solution set. This solution set is the so-called "Pareto solution set" and is often referred to as the optimal frontier (as shown in Figure 6).  Figure 6 illustrates the process of minimizing the optimization of the two indicator functions. It can be clearly seen that as the number of iterations increases, the distribution of the optimal solution set tends to increasingly form a curve, and the red squares indicate the best frontier after 200 iterations. Then, using the artificial knowledge, the feasible solutions in the set are analyzed and compared based on different weights of PIs.
This paper focuses on the assignment problem for UAVs with obstacles and threats in the flight zone and presents a modified SOS algorithm based on Pareto dominance determination inspired by traditional evolutional algorithms and PSO. At the same time, a suitable coding operator is designed to effectively address all types of constraints to solve the task assignment problem.

Modified SOS with Global Adaptive Scaling Factor
Specific steps of the SOS algorithm can be expressed as follows: Step 1. Initialize the population: First, generate Nb random individual ʺbiologicalsʺ according to Equation (30); each ʺbiologicalʺ is an initial solution.
where represents the ith (i = 1, 2,..., Nb) ʺbiologicalʺ in the ecosystem, DX is the dimension of the  Figure 6 illustrates the process of minimizing the optimization of the two indicator functions. It can be clearly seen that as the number of iterations increases, the distribution of the optimal solution set tends to increasingly form a curve, and the red squares indicate the best frontier after 200 iterations. Then, using the artificial knowledge, the feasible solutions in the set are analyzed and compared based on different weights of PIs.
This paper focuses on the assignment problem for UAVs with obstacles and threats in the flight zone and presents a modified SOS algorithm based on Pareto dominance determination inspired by traditional evolutional algorithms and PSO. At the same time, a suitable coding operator is designed to effectively address all types of constraints to solve the task assignment problem.

Modified SOS with Global Adaptive Scaling Factor
Specific steps of the SOS algorithm can be expressed as follows: Step 1. Initialize the population: First, generate N b random individual "biologicals" according to Equation (30); each "biological" is an initial solution.
where X i represents the ith (i = 1, 2,..., N b ) "biological" in the ecosystem, D X is the dimension of the solution, and rand(1,D X ) is a 1 × D X dimension scaling factor vector. u b and l b are upper and lower bounds of the search space, respectively.
Step 2. Mutualism: At this stage a "biological" X j is picked randomly from the population to interact with X i and produce mutual benefits so that each progresses toward the optimal solution. X i and X j generate new solutions of X i and X j according to Equation (31). In the formula, X best is the optimal individual in the ecosystem, V M is a symbiotic vector and represents the middle point of the two individuals, and BF is the benefit factor; since each individual has a different degree of benefit, BF takes a value of 1 or 2.
Step 3. Globally Adaptive Parasitism [31]: For the original MOSOS algorithm, only one individual can benefit in a symbiotic relationship, while another individual is not affected. For the ith individual X i , one individual X j is randomly selected from other individuals. The original MOSOS algorithm always uses the search strategy of "current-to-best." Because of the randomness of the scaling factor, the search equation limits the guiding effect of the global optimal value, resulting in insufficient convergence accuracy and a long convergence time, which does not meet the requirement of the task assignment problem. For this reason, this paper proposes an adaptive hybrid search strategy with global optimal guidance. The specific form is as follows: where µ ∈ [0, 1] is the scaling factor; µ max , µ min are the maximum and minimum scaling factors, respectively; G max is the maximum number of cycles; and G I is the current number of iterations. This strategy maintains the difference between individuals by adding a differential disturbance factor. It simultaneously introduces an adaptive scaling factor. The initial cycle has a smaller µ. The algorithm emphasizes the global search, reduces the tendency to move toward a single point in the search space, and prevents the algorithm from falling into a local minimum. With the progress of the algorithm, µ gradually increases, the guiding role of the current optimal individual is strengthened, the leading algorithm advances, and the search process is no longer just a complete random search but instead is more purposeful and directional.

Optimal Solution Selection of the Proposed Algorithm
The core of the MOSOS algorithm is to determine the global optimal solution X best according to Pareto dominance determination, and only one set of noninferior solutions can be identified. Since the optimal solution must exist in the noninferior solution set, in this study the external noninferior profile C of the "biological" population is constructed for the selection of global optimal solution X best . Refer to the selection technique used in [29], solutions in the noninferior solution set is sorted by their dominance firstly, then compared with the external noninferior profile C, and replace the dominated solutions in profile C with better ones.
At the meantime, since each iteration of the MOSOS algorithm will inevitably generate many new solutions, the scale of external non-inferior profile will increase dramatically. With the purpose to distribute the optimal front edge as evenly as possible, it is necessary to eliminate a part of the densely distributed individuals and avoid falling into local optimum. Therefore, the external noninferior profile need to be "pruned" during the iterations. Refer to the crowded function defined in NSGA-II, an individual density function for sorting the solutions with normalized PIs and constraints is developed in this study as follows: As shown in Figure 7, (J i+1,1 − J i−1,1 ) 2 , (J i+1,2 − J i−1,2 ) 2 are the length and width of the quadrilateral, and the density of the individual i is reciprocal of their sum. Which reflects, the larger  Considering the efficiency of the proposed algorithm, the size of the external non-inferior profile C is limited to less than 1/5 of the size of the ecosystem. The following operations are required during the iterations of MOSOS: when the number of non-inferior solutions in the profile C exceeds its maximum size, calculate the density of each non-inferior solution. Then eliminate the individuals with the highest density until the number of non-inferior solutions does not exceed the largest scale of profile C.
In the end, in order to let individuals to track the position of Xbest quickly and accelerate the convergence of the algorithm, the individual with the lowest density in the external non-inferior profile C is selected as the global optimal solution Xbest. The algorithm of optimal solution selection is given as follows: Algorithm 1: Algorithm of the optimal solution selection.
Step 1. Initialize all the ʺbiologicalʺ individual variables and calculate their PI values and constraint function values.
Step 2. According to Section 3.1, all optimal noninferior solutions are added to the external nontrivial file C.
Step 3. Execute the SOS search iteration formula to obtain a new ʺbiologicalʺ individual X and calculate its index function value and constraint function value. Step 5. Conduct pruning operations on the non-inferiority external profile C, and select Xbest.
Step 6. Repeat Steps 2-5 until the SOS algorithm reaches the maximum number of iterations.

Numerical Simulation Results and Analysis
The proposed method for task assignments was tested via numerical simulation experiments. The numerical experiments were created by using Visual C++ (Ver. 10.0), and the figures illustrating Considering the efficiency of the proposed algorithm, the size of the external non-inferior profile C is limited to less than 1/5 of the size of the ecosystem. The following operations are required during the iterations of MOSOS: when the number of non-inferior solutions in the profile C exceeds its maximum size, calculate the density of each non-inferior solution. Then eliminate the individuals with the highest density until the number of non-inferior solutions does not exceed the largest scale of profile C.
In the end, in order to let individuals to track the position of X best quickly and accelerate the convergence of the algorithm, the individual with the lowest density in the external non-inferior profile C is selected as the global optimal solution X best . The algorithm of optimal solution selection is given as follows: Algorithm 1: Algorithm of the optimal solution selection.
Step 1. Initialize all the "biological" individual variables and calculate their PI values and constraint function values. Step 2. According to Section 3.1, all optimal noninferior solutions are added to the external nontrivial file C.
Step 3. Execute the SOS search iteration formula to obtain a new "biological" individual X and calculate its index function value and constraint function value. Step 5. Conduct pruning operations on the non-inferiority external profile C, and select X best .
Step 6. Repeat Steps 2-5 until the SOS algorithm reaches the maximum number of iterations.

Numerical Simulation Results and Analysis
The proposed method for task assignments was tested via numerical simulation experiments. The numerical experiments were created by using Visual C++ (Ver. 10.0), and the figures illustrating the results were created using MATLAB. The entire simulation was performed on a workstation consisting of a 3.5 GHz Intel Core-i7 CPU and 16 GB of physical RAM running 64-bit Microsoft Windows 10.
A total of 12 tasks were set in the battlefield. The parameter settings for all task points are shown in Table 3. The minimum turning radius r min is set as 1 km, and the detection width W U is set as 2 km. The distribution map of all task points is shown in Figure 8. Circles represent for the point targets, short strips and rectangles stand for strip targets and surface targets, respectively.  The distribution map of all task points is shown in Figure 8. Circles represent for the point targets, short strips and rectangles stand for strip targets and surface targets, respectively. There are the categories of UAVs on the battlefield, located at the starting positions (50, 0), of which parameters are shown in Table 4. There are the categories of UAVs on the battlefield, located at the starting positions (50, 0), of which parameters are shown in Table 4. The MOSOS algorithm is set to have a population size of 30; the number of termination iterations G max is 200, with a maximum external profile size of 30 and an external file of "biological" individual size of 5, scaling factors µ max = 0.87 and µ min = 0.13. Scenario 1: One UAV of each type, with a total of 3 UAVs are employed in the base for task assignment. Since the number of UAVs is much smaller than the number of tasks, UAVs are unable to complete of the tasks, then the adjustment parameter λ is set to 1 in this scenario. Figure 9 displayed the distribution of PIs of the initial positions of the population and the optimal frontier distribution of the initial generation. Figure 10 shows the evolution of the history frontier distribution in the iterative process, including the initial generation, the 50th generation, the 100th generation, and the 200th generation. Scenario 1: One UAV of each type, with a total of 3 UAVs are employed in the base for task assignment. Since the number of UAVs is much smaller than the number of tasks, UAVs are unable to complete of the tasks, then the adjustment parameter  is set to 1 in this scenario. Figure 9 displayed the distribution of PIs of the initial positions of the population and the optimal frontier distribution of the initial generation. Figure 10 shows the evolution of the history frontier distribution in the iterative process, including the initial generation, the 50th generation, the 100th generation, and the 200th generation.  From the perspective of implementing tasks more economically in the real combat, it is expected that existing UAVs could implement the tasks as much as possible, or implement all the tasks with less UAVs. Thus, the final feasible solution is selected as the one with the smallest value of PI J1, then code of assignment variables to the task is constructed as follows: The serial number and detailed flight sequence of each UAV is shown in Table 5, while the paths of each employed UAV are displayed in Figure 11. It can be seen that all the UAVs are employed to execute tasks with significant weight under permitted conditions. Besides, the reconnaissance sequences of the assigned targets for the UAVs are in a good order, and the total flight distance is 456.19 km. Thus, proposed method can obtain satisfied task allocation results for multi-UAV cooperative reconnaissance problems on heterogeneous targets. Scenario 1: One UAV of each type, with a total of 3 UAVs are employed in the base for task assignment. Since the number of UAVs is much smaller than the number of tasks, UAVs are unable to complete of the tasks, then the adjustment parameter  is set to 1 in this scenario. Figure 9 displayed the distribution of PIs of the initial positions of the population and the optimal frontier distribution of the initial generation. Figure 10 shows the evolution of the history frontier distribution in the iterative process, including the initial generation, the 50th generation, the 100th generation, and the 200th generation.  From the perspective of implementing tasks more economically in the real combat, it is expected that existing UAVs could implement the tasks as much as possible, or implement all the tasks with less UAVs. Thus, the final feasible solution is selected as the one with the smallest value of PI J1, then code of assignment variables to the task is constructed as follows: The serial number and detailed flight sequence of each UAV is shown in Table 5, while the paths of each employed UAV are displayed in Figure 11. It can be seen that all the UAVs are employed to execute tasks with significant weight under permitted conditions. Besides, the reconnaissance sequences of the assigned targets for the UAVs are in a good order, and the total flight distance is 456.19 km. Thus, proposed method can obtain satisfied task allocation results for multi-UAV cooperative reconnaissance problems on heterogeneous targets. From the perspective of implementing tasks more economically in the real combat, it is expected that existing UAVs could implement the tasks as much as possible, or implement all the tasks with less UAVs. Thus, the final feasible solution is selected as the one with the smallest value of PI J1, then code of assignment variables to the task is constructed as follows: The serial number and detailed flight sequence of each UAV is shown in Table 5, while the paths of each employed UAV are displayed in Figure 11. It can be seen that all the UAVs are employed to execute tasks with significant weight under permitted conditions. Besides, the reconnaissance sequences of the assigned targets for the UAVs are in a good order, and the total flight distance is 456.19 km. Thus, proposed method can obtain satisfied task allocation results for multi-UAV cooperative reconnaissance problems on heterogeneous targets.   Figure 11. Paths of all UAVs in Scenario 1.

Scenario 2:
Three UAVs of each type, with a total of 9 UAVs are prepared in the base for task assignment. Figure 12 displayed the distribution of PIs of the initial positions of the population and the optimal frontier distribution of the initial generation.  Figure 13 shows the evolution of the history frontier distribution in the iterative process, each point in the graph represents the best non-inferior solution to the number of iterations. Since there exists solutions for UAVs to complete all the tasks, the adjustment parameter  is set to 0 in the latter stage of optimization. As seen in the figure, with the increase in the number of iterations of the algorithm, the external archive is continuously approaching the best frontier, indicating that the Scenario 2: Three UAVs of each type, with a total of 9 UAVs are prepared in the base for task assignment. Figure 12 displayed the distribution of PIs of the initial positions of the population and the optimal frontier distribution of the initial generation.   Figure 11. Paths of all UAVs in Scenario 1.

Scenario 2:
Three UAVs of each type, with a total of 9 UAVs are prepared in the base for task assignment. Figure 12 displayed the distribution of PIs of the initial positions of the population and the optimal frontier distribution of the initial generation.  Figure 13 shows the evolution of the history frontier distribution in the iterative process, each point in the graph represents the best non-inferior solution to the number of iterations. Since there exists solutions for UAVs to complete all the tasks, the adjustment parameter  is set to 0 in the latter stage of optimization. As seen in the figure, with the increase in the number of iterations of the algorithm, the external archive is continuously approaching the best frontier, indicating that the  Figure 13 shows the evolution of the history frontier distribution in the iterative process, each point in the graph represents the best non-inferior solution to the number of iterations. Since there exists solutions for UAVs to complete all the tasks, the adjustment parameter λ is set to 0 in the latter stage of optimization. As seen in the figure, with the increase in the number of iterations of the algorithm, the external archive is continuously approaching the best frontier, indicating that the algorithm is convergent. The front-end distribution of all generations of noninferior solutions is uniform, which reflects the distribution of the real optimal frontier.  If taking the most reconnaissance tasks completed as the main PI, that is, the index function J1, then in this simulation process, a total of 6 drones were deployed to perform reconnaissance tasks of all 9 UAVs, and a total of 12 task points were visited. The code of assignment variables to the task is constructed as follows: 2 1 5 2 6 5 9 2 3 1 6 5 2 2 1 1 1 2 1 3 1 1 2 3 The detailed flight sequence of each UAV is shown in Table 6, while the paths of each employed UAV are displayed in Figure 14. It can be seen that all the targets are visited by the employed UAVs. The total flight distance of all employed UAVs is 937.52 km. Thus, it is proved once again that the proposed method can obtain satisfied task allocation results for multi-UAV cooperative reconnaissance problems on heterogeneous targets. With the intention to verify the superiority of the proposed method (IMOSOS) in Section 3, improved MOPSO (IMOPSO) [33], NSGA-II [12] and original MOSOS with the same simulation conditions and stopping criterion are implemented for comparison. The initial parameters of IMOPSO are set as: wmax = 0.9; wmin = 0.4; factor C1 = C2 = 1.454. The initial parameters of NSGA-II are set as: probabilities of crossover and mutation are 0.9 and 0.1, respectively; weight factors α and β of sub-objectives are both set to be 0.5. The initial parameters of original MOSOS is the same as the proposed method. The final feasible solutions of all the algorithms are selected as the one with the smallest value of PI J1.
Monte-Carlo simulations on Scenario 2 are carried out, as the 12 tasks mentioned above with random positions. And the statistical results of 200 runs displayed in Table 7. Among the results are the best, worst, and average (i.e. Avg.) value of the PIs proposed in Section 3. Also, the compute efficiency of each algorithm is given in Table 7. Table 8 shows the improvement ratio of IMOSOS compared with other algorithms. From the statistical results of the referred indicators, the IMOSOS If taking the most reconnaissance tasks completed as the main PI, that is, the index function J 1 , then in this simulation process, a total of 6 drones were deployed to perform reconnaissance tasks of all 9 UAVs, and a total of 12 task points were visited. The code of assignment variables to the task is constructed as follows: 2 1 5 2 6 5 9 2 3 1 6 5 2 2 1 1 1 2 1 3 1 1 2 3 The detailed flight sequence of each UAV is shown in Table 6, while the paths of each employed UAV are displayed in Figure 14. It can be seen that all the targets are visited by the employed UAVs. The total flight distance of all employed UAVs is 937.52 km. Thus, it is proved once again that the proposed method can obtain satisfied task allocation results for multi-UAV cooperative reconnaissance problems on heterogeneous targets.
With the intention to verify the superiority of the proposed method (IMOSOS) in Section 3, improved MOPSO (IMOPSO) [33], NSGA-II [12] and original MOSOS with the same simulation conditions and stopping criterion are implemented for comparison. The initial parameters of IMOPSO are set as: w max = 0.9; w min = 0.4; factor C1 = C2 = 1.454. The initial parameters of NSGA-II are set as: probabilities of crossover and mutation are 0.9 and 0.1, respectively; weight factors α and β of sub-objectives are both set to be 0.5. The initial parameters of original MOSOS is the same as the proposed method. The final feasible solutions of all the algorithms are selected as the one with the smallest value of PI J 1 .   Three main aspects have considerable influence on the computational efficiency of the algorithm in the simulation process: the number of "biological" individual, the iterations of the algorithm and the scale of the external non-inferior archive. The increased number of task points and the number of the UAVs have little effect on the computational efficiency of the algorithm. The encoding rules ensure that the number of task points is linearly related to the calculation time of the algorithm. If more tasks, UAVs or bases are considered in this particular problem, in order to solve the problem effectively, the size of encoded decision variables and iterations may increase correspondingly. Therefore, for the problem of multi-UAV multi-reconnaissance tasks, the proposed method can provide a valid suboptimal solution.

Conclusions
This paper considers a reconnaissance task assignment problem for multi-UAVs as a multiobjective, multi-constraint nonlinear optimization problem. A model of cooperative UAVs and  Table 7. Among the results are the best, worst, and average (i.e. Avg.) value of the PIs proposed in Section 3. Also, the compute efficiency of each algorithm is given in Table 7. Table 8 shows the improvement ratio of IMOSOS compared with other algorithms. From the statistical results of the referred indicators, the IMOSOS performs better in searching optimal solutions to the certain task assignment problem than the other three algorithms. Meanwhile, according to the average results of the referred indicators, the improved MOSOS can provide stable solutions as the three algorithms. Thus, with the help of problem formulation of MTWDTSP and modification to the original MOSOS, this proposed method enjoys certain superiority and efficiency for multi-UAV reconnaissance task allocation problems with heterogeneous targets. Three main aspects have considerable influence on the computational efficiency of the algorithm in the simulation process: the number of "biological" individual, the iterations of the algorithm and the scale of the external non-inferior archive. The increased number of task points and the number of the UAVs have little effect on the computational efficiency of the algorithm. The encoding rules ensure that the number of task points is linearly related to the calculation time of the algorithm. If more tasks, UAVs or bases are considered in this particular problem, in order to solve the problem effectively, the size of encoded decision variables and iterations may increase correspondingly. Therefore, for the problem of multi-UAV multi-reconnaissance tasks, the proposed method can provide a valid suboptimal solution.

Conclusions
This paper considers a reconnaissance task assignment problem for multi-UAVs as a multi-objective, multi-constraint nonlinear optimization problem. A model of cooperative UAVs and reconnaissance tasks of heterogeneous ground targets with time windows is built to describe this problem. To solve the task assignment problem, two PIs were constructed and need to be optimized. When dealing with the multi-UAV task assignment optimization model, this paper adopts the modified MOSOS algorithm based on the Pareto dominance determination and the global adaptive operation. To address the various constraints of the task assignment model, the decision variables designed from each UAV itself are mapped to the task point of view. Therefore, the double-chain encoding for modified MOSOS inherently meets the constraint requirements of the task assignment model and greatly reduces the dimensionality of the decision variables, thereby speeding up the convergence of the algorithm. The global best external archive and the best external individual archive are designed in this study. Furthermore, this study defines the density function of each individual, thus ensuring that the solution to the problem meets the requirements of diversity and homogeneity. Finally, numerical simulation results and statistical results of Monte-Carlo simulations are proposed to verify the superiority and efficiency of the introduced method.

Abbreviations
x The UAV's coordinates in plane space z The UAV's coordinates in plane space V U Speed of the UAV r min Minimum turning radius of the UAV The number of types of the UAV N p The number of the UAV of the kth type J 1 Performance indicator of tasks/UAVs λ Adjustment parameter S(X) The total weight of the tasks that have been assigned N(X) The number of the employed UAVs to execute the assigned tasks sgn(·) Symbol function w Weight vector J 2 Performance indicator of flight distance Dis ij The distance from task i to task j SenT j The minimum sensor accuracy required for the task j SenV k j The sensor accuracy of the kth type of UAV X i The ith (i = 1, 2,..., N b ) "biological" in the ecosystem D X The dimension of the solution BF The benefit factor µ The scaling factor G max The maximum number of cycles G I The current number of iterations C The external noninferior profile of the noninferior solution set