A Two-Stage Approach for Routing Multiple Unmanned Aerial Vehicles with Stochastic Fuel Consumption

The past decade has seen a substantial increase in the use of small unmanned aerial vehicles (UAVs) in both civil and military applications. This article addresses an important aspect of refueling in the context of routing multiple small UAVs to complete a surveillance or data collection mission. Specifically, this article formulates a multiple-UAV routing problem with the refueling constraint of minimizing the overall fuel consumption for all the vehicles as a two-stage stochastic optimization problem with uncertainty associated with the fuel consumption of each vehicle. The two-stage model allows for the application of sample average approximation (SAA). Although the SAA solution asymptotically converges to the optimal solution for the two-stage model, the SAA run time can be prohibitive for medium- and large-scale test instances. Hence, we develop a tabu search-based heuristic that exploits the model structure while considering the uncertainty in fuel consumption. Extensive computational experiments corroborate the benefits of the two-stage model compared to a deterministic model and the effectiveness of the heuristic for obtaining high-quality solutions.


Introduction
Advances in sensing, robotics, and wireless sensor networks have enabled the use of small unmanned aerial vehicles (UAVs) in environmental sensing applications such as crop monitoring [1][2][3], forest fire monitoring [4], and ecosystem management [5,6], as well as civil security applications such as border surveillance [7,8] and disaster management [9]. In military applications, small UAVs are preferred as they can be hand launched without the need for a runway [10]. More recently, a variety of studies have indicated that small UAVs have a huge potential for package delivery, last-mile delivery, and emergency response [11], and studies have also indicated that the use of small UAVs in emergency response applications would lead to significant cost reductions [12,13] because of their inexpensive operation, maintenance, and labor costs. Moreover, in the emergency response context, they also have the potential to save lives by transporting and delivering much-needed food and water supplies over any terrain. Worldwide, corporate companies such as Amazon and Google have begun field-testing their UAV capabilities through initiatives such as Amazon Prime Air [13] and Project Wing [14], respectively.
Despite the advantages of small UAV platforms, they come with other resource constraints due to their size and limited payload capacity. Small UAVs typically have fuel constraints and are therefore required to make one or more refueling stops in a long surveillance or data gathering application. For instance, consider the following problem setup involving multiple small UAVs that are supposed to visit a set of targets and collect data from them. We are given a set of depots and a set of targets. The depots are refueling sites, and a team of homogeneous small UAVs is initially stationed at one of the depots. It is safe to assume that any UAV is refueled to its capacity when it visits a depot. The fuel a vehicle consumes to travel between any pair of targets/depots is uncertain, and the description of this uncertainty is known via numerous samples. In this context, the following two-stage Fuel-Constrained Multiple-UAV Routing Problem (FCMURP) naturally arises: In the first stage, the FCMURP aims to compute a route for each UAV that starts and ends at the depot where all the vehicles are initially stationed, such that each target is visited at least once by some vehicle and no vehicle runs out of fuel as it traverses its route. If a vehicle does not have enough fuel, it can make a refueling stop at any depot. This fuel restriction is imposed in the first stage using a nominal value for the fuel consumed by any vehicle as it travels between a pair of targets/depots. The first-stage routing decisions for each vehicle are referred to as 'here-and-now' decisions in the parlance of two-stage stochastic programming; these decisions are typically made before the realization of the uncertainty is known, which conforms to most realistic settings where decisions have to be made in the face of uncertain parameters. In the second stage, additional refueling trips are added to the first-stage routes to ensure feasibility for the realization of the fuel consumption. The second-stage decisions are referred to as 'the recourse actions/decisions' because these decisions are typically made after the realization of uncertainties. The objective of the FCMURP is to minimize the sum of the travel distances for all the vehicles (first-stage objective function) and the expected travel distance for the additional refueling trips (second-stage objective function). An illustration of feasible first-stage and second-stage solutions is shown in Figure 1.

Related Work
The FCMURP is NP-hard because it is a generalization of the multiple traveling salesman problem (MTSP). The literature contains many variants of vehicle routing problems (VRPs) for UAVs with fuel constraints. The interested reader is referred to [15] for an overview of the formulations and algorithms for many of these variants. A single-vehicle deterministic variant of the FCMURP for ground vehicles was first introduced by Khuller et al. [16], who develop an approximation algorithm for the case where the travel costs are symmetric and satisfy the triangle inequality. Khuller et al. assume that the minimum fuel required to travel from any target to its nearest depot is equal to at most Fα/2 units, where α is a constant in the interval [0, 1) and F is the fuel capacity of the vehicle. This is a reasonable assumption since, in any case, one cannot have a feasible tour if there is a target that cannot be visited from any of the depots. Based on this assumption, Khuller et al. [16] present a (3(1 + α))/(2(1 − α)) approximation algorithm for the problem. Variants of the deterministic FCMURP in the context of routing UAVs are addressed in [17][18][19][20][21]. Specifically, Ref. [17] was the first paper to address the problem in the context of UAVs. The authors formulate the single-vehicle variant as a mixed-integer linear program (MILP) and present k-opt-based exchange heuristics to obtain feasible solutions within 7% of the optimal solution, on average. The MILP formulations are solved using off-the-shelf commercial solvers to obtain an optimal solution to the single-vehicle variant, and the authors also observe that the solvers are unable to handle their formulation for instances with more than 25 targets. Sundar et al. [18] extend the approximation algorithm in [16] to the asymmetric case and present heuristics to solve this case. Levy et al. [19] present variable neighborhood search heuristics for the deterministic FCMURP with heterogeneous vehicles, i.e., vehicles with different fuel capacities. More recently, Ref. [20] develop an approximation algorithm and heuristics for the deterministic FCMURP. Mitchell et al. [20] extend the MILP proposed in [17] to the multiple-vehicle setting. Like [17], they also report that the off-the-shelf mixed-integer solvers had difficulty computing an optimal solution. More recently, Sundar et al. [22,23] analytically and empirically analyze different MILP formulations for the deterministic FCMURP and develop additional valid inequalities for these.
Another line of work examines the deterministic FCMURP in the context of electric vehicles, which is referred to as the green vehicle routing problem (GVRP). The GVRP was first introduced in [24]. The authors of [24] propose an MILP formulation and several heuristics to tackle the GVRP. Since then, many algorithms have been developed to solve the deterministic capacitated and uncapacitated variants of the GVRP; an excellent survey of this work is presented in [25]. Other variants of the VRP that are closely related to the deterministic version of the FCMURP are the distance-constrained VRP [26][27][28][29][30] and the orienteering problem [31,32]. All these variants are well suited for ground vehicles. The distance-constrained VRP is a special case of the FCMURP with a single vehicle and a single depot. The FCMURP is also quite different and more general than the orienteering problem, which focuses on maximizing the number of targets visited by a vehicle subject to its fuel constraints. To the best of our knowledge, fuel-constrained, multiple-vehicle routing problems considering uncertainty in the fuel consumption of the vehicles have not been explored in the framework of stochastic optimization in the literature. Hence, this article can be considered as a first step in that direction. We also remark that the framework and the model that we present for the FCMURP in this article are fairly general and can accommodate many other practical constraints that arise in typical surveillance missions involving multiple UAVs. We use the sample average approximation (SAA) approach [33,34] to solve the two-stage stochastic program and obtain statistical estimates of the upper and lower bounds for the optimal FCMURP solution. The SAA approach is a Monte Carlo simulation-based sampling technique, in which the expected value of the objective function is approximated using a finite number of scenarios; nevertheless, SAA is computationally expensive for larger problem instances. To address the scalability of the SAA algorithm, we also propose a tabu search heuristic [35,36] to compute good-quality suboptimal solutions with relatively less computational effort. We use a tabu search heuristic because of its success with stochastic variants of many VRPs; the interested reader is referred to [37][38][39][40][41][42] for tabu searches applied to general applications and VRPs, respectively. Specifically, the tabu search algorithm presented in this article is adapted from the work in [39,43].

Contributions
The following are the main contributions of this article: (i) the FCMURP with uncertainty in the fuel consumption of all vehicles is modeled as a two-stage stochastic program with random recourse; (ii) SAA is used to obtain statistical estimates of the lower and upper bounds for the optimal solution of the two-stage stochastic program; (iii) a tabu search heuristic for finding suboptimal solutions for the two-stage stochastic model is proposed; and finally; (iv) the performance achieved by SAA and the tabu search heuristic is corroborated through extensive computational experiments.
The rest of this paper is organized as follows. Notation is introduced in Section 2, and the two-stage stochastic programming model formulation is presented in Section 3. Details about SAA using an exact method and a heuristic solution are presented in Sections 4 and 5, respectively. Computational results for randomly generated instances are presented in Section 6. Finally, the paper concludes in Section 7.

Notation
Let T = {t 1 , . . . , t n } denote the set of targets, let d 0 denote the depot where m homogeneous UAVs, each with fuel capacity F, are initially stationed, let D = {d 1 , . . . , d k } denote the set of additional depots or refueling sites, and let D = D {d 0 }. All the m UAVs initially stationed at the depot d 0 are assumed to be fueled to capacity. The FCMURP is defined on a directed graph G = (V, E), where V = T ∪ D denotes the set of vertices and E denotes the set of edges joining any pair of vertices. We assume that G does not contain any self-loops. For each edge (i, j) ∈ E, we let c ij andf ij represent the travel cost and the nominal fuel that will consumed by any UAV traversing the edge (i, j). We remark thatf ij is directly computed using the length of the edge (i, j) and the fuel economy of the UAVs. Additional notation that will be used in the mathematical formulation is as follows: for any set S ⊂ V, When S = {i}, we shall simply write δ + (i) and δ − (i) instead of δ + ({i}) and δ − ({i}), respectively.
The notation introduced next is for describing the uncertainty associated with the vehicles' fuel consumption. Let f denote a discrete random variable vector representing the fuel consumed by any UAV to traverse any edge in E. The vector f has |E| components, one for each edge, and the random variable in the vector f corresponding to edge (i, j) is denoted by f ij . Let Ω denote the set of scenarios for f , where ω ∈ Ω represents a random event or realization of the random variable f with a probability of occurrence p(ω). We use f ij (ω) to denote the fuel consumed by any UAV when traversing the edge (i, j), and f (ω) to denote the random vector for the realization ω ∈ Ω. Finally, we use E to denote the expectation operator, i.e., E Ω (α) = ∑ ω∈Ω p(ω)α. Table 1 lists all the notations introduced in this section for ease of reading. In the next section, we formulate the FCMURP as a two-stage stochastic program using the notation introduced in this section.
travel cost for the edge (i, j) ∈ Ê f ij nominal fuel consumed by the UAV to traverse the edge (i, j) ∈ E f random variable vector representing the fuel consumed by any UAV Ω set of scenarios for f ω ∈ Ω realization of random variable f p(ω) probability of occurrence of ω E expectation operator

Mathematical Formulation
The first-stage decision variables in the stochastic program are used to compute the initial set of routes for each of the UAVs such that every target is visited at least once by some UAV and no UAV ever runs out of fuel as it traverses its route. The fuel constraint for each UAV in the first-stage is enforced using the nominal fuel consumption valuef ij for each edge (i, j) ∈ E. For a realization ω ∈ Ω, the second-stage decision variables are used to compute the refueling trips that must be added to the first-stage routes based on the realized values of f ij (ω) for all (i, j) ∈ E.
Specifically, the first-stage decision variables are as follows: each edge (i, j) ∈ E is associated with a variable x ij that equals 1 if the edge (i, j) is traversed by some UAV, and 0 otherwise. We let x ∈ {0, 1} |E| denote the vector of all decision variables x ij . There is also a flow variable z ij associated with each edge (i, j) ∈ E that denotes the total nominal fuel consumed by any vehicle as it starts from depot i and reaches the vertex j. Additionally, for any A ⊆ E, we let x(A) = ∑ (i,j)∈A x ij . Analogous to the variable x ij in the first stage, we define a binary variable y ij (ω) for each edge (i, j) ∈ E. The variables y ij (ω) are used to define the refueling trips needed for any vehicle when the route defined by x is not feasible for the realization ω. Similarly, v ij (ω) is a flow variable analogous to z ij for every (i, j) ∈ E and ω ∈ Ω. Additional second-stage variables q ij (ω) for each (i, j) ∈ E are used to compute the cost of a refueling trip for the realization ω. Finally, for every i, j ∈ V and ω ∈ Ω, we letd be a depot in the set D that minimizes the overall refueling trip between i and j, i.e.,d = argmin d f id (ω) + f dj (ω). We remark that the dependence ofd on (i, j) ∈ E and ω is not explicitly shown and we leave it to the reader to understand the dependence from the context.

Objective Function
The objective function for the two-stage stochastic programming model for the FCMURP is the sum of the first-stage travel cost and the expected second-stage recourse cost. The second-stage recourse cost for a realization ω ∈ Ω of the fuel consumption of the vehicles is the cost of the additional refueling trips that are required for the realization ω. The recourse cost is a function of the first-stage routing decision x and the realization ω. Letting the recourse cost be denoted by β(x, f (ω)), the objective function for the two-stage stochastic optimization problem is given by (1)

First-Stage Routing Constraints
The constraints for the first stage enforce the routing constraints, i.e., the requirements that each target in T should be visited at least once by some vehicle and that each vehicle never runs out of fuel as it traverses its route. In the first stage, the fuel constraint is enforced using the nominal value of fuel consumed by any vehicle to traverse any edge (i, j) ∈ E. The first-stage routing constraints are as follows: Constraint (2a) forces the in-degree and out-degree of each refueling station to be equal. Constraints (2b) and (2c) ensure that all the UAVs leave and return to depot d 0 , where m is the number of UAVs. Constraint (2d) ensures that a feasible solution is connected. For each target i, the pair of constraints in (2e) require that some UAV visits the target i. Constraint (2f) eliminates subtours of the targets and defines the flow variables z ij for each edge (i, j) ∈ E using the nominal fuel consumption valuesf ij . Constraints (2g) and (2h) together impose the fuel constraints on the routes for all the UAVs. Finally, constraint (2i) imposes binary restrictions on the decision variables x ij .

Second-Stage Constraints
The second-stage model for a fixed x and f (ω) is given as follows: subject to: The objective function (3a) minimizes the total refueling trips for a given scenario and removes the cost of the edges between the targets which are used in the first stage but not in the second stage due to the additional refueling edges added in the second stage. Constraints (3b)-(3d) are analogous to constraints (2f)-(2h); they prevent subtours of the targets and impose the fuel constraints when the realization of the fuel consumption is given by ω. Constraint (3e) defines the actual recourse action for any UAV, i.e., given a realization ω, if the route defined by x is not feasible for ω, this constraint allows for adding refueling visits to the route without altering the sequence in which each UAV visits the targets. Furthermore, if the route corresponding to any UAV obtained using the first-stage variables x includes an edge (i, j), we force the refueling-site visit to occur at depotd, whered is precomputed using the vertices i, j ∈ V. Constraint (3f) ensures that the sequence of target visits for each UAV is not altered by the recourse action. Constraints (3g)-(3l) ensure that the recourse cost of additional visits to refueling stations is captured by the objective function (3a). Finally, constraint (3m) imposes the binary and continuous restrictions on the second-stage variables. It should be noted that due to constraint (3i), the second-stage model is a mixed-integer nonlinear model.

Tightening the Two-Stage Stochastic Formulation
In this section, we present the results we use to strengthen the constraints in the first and second stages of the formulation presented in the previous section. We say that a constraint is strengthened if it eliminates fractional solutions to the two-stage stochastic formulation of the FCMURP without removing any feasible FCMURP solutions. The following proposition strengthens the inequalities (2h) and (3d). Since the results are similar for (2h) and (3d), we present the proposition only for (2h).

Proposition 1.
For any i, j, k ∈ V, iff ij +f jk f ki , then the inequalities in (2h) can be strengthened as follows: where t i = min d∈Dfid and s i = min d∈Dfdi .
Proof. When j ∈ D or i ∈ D, constraints (2h) and(4b) specify the values of z id and z di , respectively. When both i, j ∈ D, constraint (2h) bounds the value of z ij . Hence, we only need to discuss the case where i, j ∈ T. When x ij = 1, the total fuel consumed by any vehicle that traverses the edge (i, j) cannot be greater than (F − t j ), where t j is the minimum amount of nominal fuel required by any vehicle to reach a refueling station or the depot from target j. Therefore, constraint (4a) strengthens the upper bound of z ij in (2h). Similarly, any vehicle that traverses edge (i, j) consumes at least (s i +f ij ) fuel. As a result, constraint (4c) strengthens the lower bound of z ij in (2h).
By virtue of Proposition 1, throughout the rest of the article, we shall assume that constraints (2h) and (3d) are replaced by their strengthened counterparts.

Proposition 2.
Using a new set of variables 0 xy ij (ω) 1, constraint (3i) can be linearized using the following set of constraints.
Using Proposition 2, the second-stage constraints in Section 3.3 can be reformulated as linear constraints with binary and continuous variables, making the two-stage stochastic program an MILP. From here on, unless otherwise stated, we shall assume that the second-stage constraints are linear.

Solution Approach: Sample Average Approximation (SAA)
In this section, we present an approach to solving the two-stage stochastic program for the FCMURP presented in Section 3 using sampling. Although the mathematical formulation of the FCMURP presented in Section 3 is an MILP, the number of realizations of the uncertain fuel consumption is typically very large, and hence, it is not computationally viable to directly input the formulation to an off-the-shelf commercial MILP solver. Nevertheless, a wealth of recent theoretical and empirical work in the literature has shown that an SAA approach can often accurately solve stochastic programs when the number of scenarios (realizations of the uncertainty) is prohibitively large [34,[45][46][47]. For the sake of completeness, we present details about SAA. The SAA of the two-stage stochastic program approximates the objective function C in (1) by its sample average approximation C Γ given by where Γ ⊂ Ω, with Ω denoting the sample space of realizations for f . The function C Γ is known to be an unbiased estimator of C [48] for all feasible solutions x, i.e., EC Γ (x) = C(x) for every feasible solution x. Hence, the approximating problem is now given by v Γ = min {C Γ : (2) and (3)}.
If C * denotes the optimal solution to the full two-stage stochastic program formulated in Section 3, then the value of v Γ will approach C * as the number of scenarios considered in the sample average problem, |Γ|, approaches |Ω| [49]. However, the solution v Γ is biased in the sense that Ev Γ C * [48]. In the forthcoming sections, we detail procedures to obtain statistical estimates of lower and upper bounds for the objective value of the full two-stage stochastic program, C * , using the solution obtained by solving multiple small problems in (7) for different sample sets Γ.

Statistical Estimate of a Lower Bound for C *
From the above, the lower bound for C * is given by Ev Γ . We note that the value v Γ is a random variable, as it depends on the random sample Γ ⊂ Ω. Nevertheless, a confidence interval for the value of v Γ can be computed using the following procedure, which was suggested in [48]. First, N independent samples Γ 1 , Γ 2 , . . . , Γ N with the same cardinality (|Γ k | = M) are drawn from the same distribution as that of Ω. For each of these samples, the associated SAA problem in (7) is solved. The objective values of the associated SAA problems, v Γ 1 , v Γ 2 , . . . , v Γ N are all independent and identically distributed with the mean L and standard error s L given by

Statistical Estimate of an Upper Bound for C *
To construct an upper bound for C * , we first observe that for each candidate solution x * k to the SAA problem in (7) with |Γ k | = M, and that the value of C(x * k ) is the actual cost of the objective function when the first-stage routes are specified by x * k . Similar to v Γ , the value C(x * k ) is a random variable since it depends on the sample set Γ k . Furthermore, this quantity C(x * k ) is undoubtedly larger than C * . Hence, each candidate solution to the N SAA problems can be used to obtain a statistical estimate of the upper bound using the following procedure. We first take a sample Λ ⊂ Ω (where Λ is different from the samples Γ k used to obtain the lower bound) and compute an estimate of the upper bound C(x * k ) for the candidate solution x * k as w Λ (x * k ) is then computed for each of the N candidate SAA solutions, and the x * that corresponds to the least upper bound of all of the w Λ (x * k ) is chosen as the feasible solution, i.e., x * = argmin{w Λ (x) : x ∈ x * 1 , x * 2 , . . . , x * N }. Typically, |Λ| |Γ k | since the samples in Λ are used only to evaluate a candidate solution as opposed to the Γ k s that are used to optimize large-scale MILPs that result from the SAAs. The standard error for the candidate upper bound x * is computed using an expression similar to the expression used for computing the standard error of the lower bound estimate in (8). We remark that this technique of obtaining the statistical estimate of an upper bound can be used for any feasible FCMURP solution obtained using any heuristic technique; each feasible solution would result in a different estimate for the upper bound.
The challenge arising for the SAA algorithm is that it still requires N SAA problems, which are MILPs, to be solved to optimality. The optimal solutions to these problems are required to construct the statistical estimates of the upper and lower bounds of the optimal solution of the two-stage stochastic formulation, C * , for the FCMURP. When the number of targets |T| is large, the MILPs can still pose a considerable challenge for commercial MILP solvers. Moreover, the second-stage model is itself an MILP, making it averse to traditional decomposition techniques such as the L-shaped method. For this reason, there is a need to develop heuristics to solve the SAA problems. These heuristics would aid in computing an estimate of the upper bound for C * , using the procedure detailed in Section 4.2. However, an estimate of the lower bound for C * can only be constructed using the optimal solutions to the SAA problems. In the forthcoming section, we present a tabu search heuristic to solve the individual SAA problems that are used to obtain statistical estimates of an upper bound for C * in medium-and large-sized test instances.

Heuristic Solution Technique
The heuristic solution technique for finding good suboptimal solutions to the two-stage stochastic programming formulation of the FCMURP proceeds in two phases. The first phase is the construction phase, in which an initial feasible solution to the two-stage stochastic program is constructed. The second phase is an improvement phase, in which a tabu search method is applied to the feasible solution from the first phase. The tabu search method performs a dynamic neighborhood search to improve the quality of the objective functions. For the first phase of the heuristic, we present an optimization-based construction algorithm motivated by the fact that the deterministic variant of the FCMURP is computationally not intensive, despite being an MILP itself. The construction heuristic solves multiple, massively parallelizable, instances of the deterministic FCMURP and combines all the solutions to construct a feasible solution for the two-stage stochastic program for the FCMURP. In the following subsection, we present the details of this construction heuristic.

Construction Heuristic
We begin by examining the objective function for the two-stage stochastic program given in (1). The objective function aims to minimize the sum of the total travel cost in the first stage and the expected cost of the refueling trips in the second stage. The input to the construction heuristic is a subset of scenarios ∆ ⊂ Ω and |∆| = |Γ k |. The main task of the construction heuristic is to determine the subset of edges in E that are most likely to be present in the optimal solution for the two-stage problem. The deterministic version of the FCMURP (which is basically the first-stage problem with the objective of minimizing the travel cost) is solved for each scenario in the set ∆, in decreasing order of the probability of the scenarios, with the fuel consumption taking the value f (ω) for the solution corresponding to the realization ω ∈ ∆. Let the optimal solution corresponding to the scenario ω ∈ ∆ be denoted by x ω . The values of x ω for ω ∈ ∆ that are obtained from the deterministic solutions are then used to construct a cumulative weighting function d ij for each edge (i, j) ∈ E, defined by where p(ω) is the probability of the scenario ω. These weights are used to transform the travel cost coefficients and the fuel consumption for each edge (i, j) ∈ E as The intuition behind the construction of the costsc ij for (i, j) ∈ E is that if an edge (i, j) ∈ E is present in the optimal solution of every deterministic solution x ω , ω ∈ ∆, then it is very likely to be a part of the optimal solution for the two-stage formulation. Hence, its transformed costc ij in this case becomes zero. A similar argument holds for the other case, where an edge is not a part of the optimal solution for every deterministic solution. A final deterministic solution to obtain the heuristic solution is then constructed by using the transformed cost and fuel consumption values in (10). Pseudocode for the heuristic is given in Algorithm 1.

Algorithm 1 Construction heuristic
Input: the instance data, ∆ ⊂ Ω Output: a heuristic solution x h 1: for ω ∈ ∆ do Parallelizable for each ω ∈ ∆ 2: solve the deterministic FCMURP with fuel consumption values set to f (ω) 3: x ω ← optimal solution of the deterministic problem 4: end for 5: for (i, j) ∈ E do 6: fuel consumption update 9: end for 10: x h ← optimal solution of deterministic FCMURP solved withc ij andf ij

Improvement Heuristic: Tabu Search Method
The tabu search method (TSM) is essentially a dynamic neighborhood search algorithm with a flexible way to restrict the next solution choice to some subset of neighbors of the current solution. Unlike a local descent-based search algorithm, TSM permits moves that degrade the current objective function value. The procedure maintains a modified neighborhood for the current solution that is constructed by maintaining a selective history of the states encountered during the search. Hence, TSM is a dynamic neighborhood search method, i.e., the neighborhood of the solution is not a static set, but rather a set that can change according to the search history. The algorithm presented in this section is based on the work in [39,43], which we adapt to our problem setting. The input to TSM is the feasible solution obtained using the construction heuristic presented in the previous section. At each iteration of TSM, a neighborhood for the current solution is constructed, and within the neighborhood, a new current solution is selected in accordance with a selection criterion. Then the algorithm continues with the new current solution.
Givenx, a feasible first-stage route for the FCMURP, TSM computes the cost ofx as where ∆ 1 and ∆ 2 represent feasible and infeasible scenarios for the given feasible solutionx (notice that ∆ 1 ∩ ∆ 2 = ∅ and ∆ 1 ∪ ∆ 2 = ∆ ⊂ Ω). The parameter ν penalizes the infeasiblex and is chosen as the sum of the largest β(x, f (ω)) and a large positive number. A dynamic neighborhood of the solutionx, denoted by g(x), is then constructed. Givenx, g(x) is obtained by performing a one-point exchange of the targets, i.e., two targets i and j in the solutionx are swapped. Suppose that in the current iteration a solution that is obtained by swapping two targets i and j is selected as a new current solution, then this target-target swap is inserted into to the 'tabu' list to prevent its selection for the next iterations. The purpose of this 'tabu' list is to keep track of exchanges that have taken place during the recent past. This prevents certain solutions from the recent past from belonging to the neighborhood g(x). Nevertheless, swaps may also be removed from the 'tabu' list if the aspiration criterion is satisfied i.e., a solution generated by exchanging the positions of the targets within a route has a better objective value than the current best objective value.
At each iteration, the algorithm selects a non-tabu solution in the neighborhood g(x) of the current solutionx that has a better objective value than the current solution (lines 3 and 4 in Algorithm 2). If such a solution cannot be found by the algorithm, then the best non-tabu solution in g(x) is chosen (line 6 in Algorithm 2). The chosen solution is then checked for feasibility and objective value improvement (line 8 in Algorithm 2) and if satisfied, is set to be the best solution thus far, i.e., x * . The primary termination criterion of the algorithm is given by the number of iterations, θ. In addition, the algorithm has a secondary termination criterion: if the best solution x * is not updated for τ iterations (τ < θ), then the algorithm is terminated (see line 17 in Algorithm 2). The complete pseudocode for the tabu search heuristic is shown in Algorithm 2. end if 16: update Ξ based on tabu list update 17: if x * is not updated for τ iterations then secondary termination criterion 18: set stop := 1 19: end if 20: set k := k + 1 21: end while

Computational Results
All the formulations and algorithms were implemented using the Java programming language, and CPLEX 12.6.2 was used as the underlying MILP solver. All simulations were performed using an Intel(R) Core(TM) i7 processor @2.70 GHz and 80 GB RAM. The performance of the algorithm was tested on randomly generated test instances.

Instance Generation
Two classes of test instances were randomly generated. The first set consisted of five instances with 10 targets. We refer to this set of instances as the small-scale instances for which all the algorithms presented in this article were tested and compared. The second set consisted of 120 instances with |T| ∈ {20, 30}, which we will refer to as large-scale instances. For all the instances, the coordinates of all targets were uniformly generated from a square grid of size 100 × 100 and there were four refueling sites. The locations of the depot and the four refueling sites were fixed for all the test instances.
For each of the randomly generated small-scale instances, there were three vehicles in the depot, and the fuel capacity F of the vehicles was varied linearly with a parameter λ that denoted the maximum distance between any depot and any target. F was set to 2.25λ for all the small-scale instances. For the large-scale instances, the number of targets, the number of vehicles m, and the value of F were chosen from the sets {20, 30}, {3, 4, 5}, and {2.25λ, 2.5λ, 2.75λ, 3λ}, respectively. For each combination (|T|, m, F), five random instances were generated, resulting in a total of 120 large-scale instances. To define the uncertainty regarding the fuel consumed by any UAV to traverse an edge (i, j) ∈ E, we let the mean fuel consumed for traversing the edge (i, j) be the Euclidean distance between the vertices i and j [50]. For the scenario generation, we also followed [50], which suggests a gamma distribution to model the uncertainties regarding fuel consumption. In general, the fuel consumption has the following two properties [50]: (i) the fuel consumed by any UAV to traverse any edge (i, j) is a higher deviation from the expected value due to external factors such as wind; and (ii) the deviations from the mean that are higher than the mean are much higher when compared to the deviations that are lower than the mean. A gamma distribution is an ideal candidate to exhibit such behavior. Hence, for the purposes of this article, we generated scenarios from the gamma distribution for the random fuel consumption vector, f , using a scale parameter 0.25 · E( f ) and a shape parameter value of four [50]. We also remark that, since the SAA and heuristic solutions are independent of the distributions used, the algorithm would still work with scenarios generated from any other fitting distribution. Finally, to make the scenarios more realistic, we divided the coordinate space in all test instances into four equal quadrants, and randomly designated one of the quadrants as 'congested', one as 'sparse', and the other two as 'mean'. For a scenario ω, all the edges incident to a vertex in a congested quadrant were given an f ij (ω) that is higher than its mean, E( f ij ). Similarly, all the edges connected with a vertex in a sparse quadrant were given an f ij (ω) that is lower than its E( f ij ). For the remaining edges, i.e., edges with both vertexes in mean quadrants, f ij (ω) was set to E( f ij ).
Instead of randomly assigning f ij (ω) for the edges based on the gamma distribution, this design creates correlations among the fuel consumption for different edges, making the scenarios more realistic.

Parameters for SAA and the Heuristic
The statistical estimate of the lower bound for the optimal objective value of the two-stage stochastic program was obtained using SAA as detailed in Section 4.1. The following parameter values were used to obtain the estimate: N = 10 and |Γ k | = 10 for every k ∈ {1, . . . , 10}. All 10 scenarios in each Γ k were independently drawn in accordance with the procedure detailed in the previous section. The scenario set Λ that was used to compute an estimate of the upper bound for C * from any feasible first-stage solution (the UAV routes) was generated similarly to the sets Γ k , k = 1, . . . , 10, with |Λ| = 1000. Java's concurrent libraries were used to parallelize the construction heuristic. Construction and improvement heuristics were run N times, each iteration with ∆ = Γ k , k ∈ {1, . . . , N}.

Performance Evaluation Criteria
We compare the objective values obtained using SAA and the heuristic solution with the traditional "Expected Value Problem" (EVP). The EVP is a single-stage deterministic problem that computes minimum travel costs for UAV routes with the fuel consumption set to their mean, i.e., the objective value for the EVP is given by ∑ (i,j)∈E c ij x ij . We denote the objective value for the EVP-i.e., the travel cost for the UAV routes obtained by solving the EVP-as EV in the following tables. Givenx, the optimal solution to the EVP, the expected cost of usingx in the two-stage stochastic program is given by C(x). In the following tables, this estimate is denoted as EEV (Expected cost of the Expected Value solution). Finally, we use LB-SAA, UB-SAA, and H to respectively denote the statistical estimates of the lower and upper bounds obtained by SAA and the estimate of the upper bound obtained using the heuristic. C(x) is estimated using the same scenarios that were used in obtaining the lower bound for the SAA. Given a subset of realizations of Ω, the difference between the optimal solution to the two-stage stochastic program and the EEV is referred to as the value of the stochastic solution (VSS) in the parlance of stochastic optimization. The VSS was first introduced in [51] and is a standard means for quantifying the usefulness of the stochastic programming approach. As is shown in the literature, for a minimization problem the cost of the optimal solution to the two-stage stochastic program is always less than or equal to the cost of using the expected value solution in a two-stage setting [33]. For a given instance and a subset of realizations, the VSS quantifies the gain from solving a two-stage stochastic program instead of a deterministic EVP.

Results for Small-Scale Test Instances
The purpose of the small-scale instances was to examine the run times for the SAA and heuristic approaches and to benchmark the objective values obtained using them. Table 2 presents the complete results for the 10-target instances. From this table, it can be observed that the estimates obtained using the tabu search heuristic are very close to the SAA upper bound estimates in the column UB-SAA. Furthermore, the heuristic objective values are also strictly greater than the EEV, indicating that there is value in using the heuristic to solve the two-stage problem rather than solving the corresponding deterministic EVP. A histogram of the run times for solving each problem instance in the SAA approach is shown in Figure 2. The figure indicates that even for 10-target instances, solving each SAA problem to optimality can be time consuming, and in the worst case it takes around three hours. This indicates the computational difficulty of solving the SAA problems to optimality. Nevertheless, the solutions obtained using the SAA approach are useful for evaluating the quality of the solutions obtained using the heuristic approach for the small-scale instances.

Results for Large-Scale Instances
For the large-scale set of instances with |T| ∈ {20, 30}, we compare the quality of the solutions produced by the heuristic with the EEV using VSS. We do not use the SAA approach for these instances due to the computational scalability issues of the SAA approach for these test instances. Tables 3-5 present the expected value solution's objective (EV), the EEV, and the heuristic solution objective value (H) for all instances in the large-scale set. Similar to Table 2, the values in brackets denote the standard deviations of the corresponding estimates. It is clear from Tables 3-5 that the heuristic is consistently able to provide a solution whose mean cost is strictly less than the mean EEV for every instance in this class, indicating that the value of VSS that is computed using the mean values is strictly greater than 0. The average and maximum values of the VSS (in percentages) for the large-scale set of instances computed using the mean values are 6.4% and 29.69%, respectively. A scatter plot of the mean EEV and mean heuristic solution cost is shown in Figure 3. The fact that all the points lie below the line with slope one indicates that the VSS is strictly greater than zero for all the instances. Finally, Figure 4 shows the box plot of the computation time taken by the tabu search heuristic to compute a feasible solution for the FCMURP on all the instances in the large-scale set. It can be observed from the box plot that the computation time for the heuristic solution approach is well within six minutes for all these instances.

Conclusions and Future Work
In this paper, we have introduced a new approach to address route planning for multiple UAVs under uncertainty (FCMURP) regarding the fuel consumption between any two targets. The new approach is a two-stage stochastic model with a first-stage objective function to minimize the route costs and a second-stage recourse cost for additional visits to refueling depots when the fuel is inadequate to complete the tour proposed by the first stage. We adopted the sample average approximation (SAA) method to solve the two-stage FCMURP model, with the second-stage expectation function approximated using a sampling-based approach. The SAA objective value will asymptotically converge to the optimal expected cost of the two-stage model, but it is still computationally expensive for large instances. Therefore, we developed a tabu search heuristic, which is very conducive to parallelization. Our heuristic provides good-quality solutions and is computationally efficient. For our computational experiments, we used randomly generated instances and created uncertainty that reflects more realistic environments. The results indicate that the solutions obtained with the heuristic are computationally efficient and not very different from the solutions for deterministic models and that they are also very robust in the face of uncertainty. Potential future research can consider uncertainty in the availability of UAVs for the mission, and a possible focus for the two-stage model would be to consider a risk-averse model where additional risk-measuring weights are added to the recourse objective function.
Funding: This research received no external funding.