Joint Trajectory and Communication Design for Buffer-Aided Multi-UAV Relaying Networks

With the rapid development of unmanned aerial vehicle (UAV) technology, UAV aided wireless communication has been widely studied in recent years. In this paper, a buffer aided multi-UAV relaying network is investigated to assist blocked ground communication. Considering the mobility and deployment flexibility of UAV relays, the air-to-ground communication links are modeled as Rician fading channels. On the basis of information causality, we derive the state change of the information in the buffer of UAV relays and maximize the end-to-end average throughput by jointly optimizing the relay selection, UAV transmit power, and UAV trajectory. However, the considered problem is a mixed integer non-convex optimization problem and is therefore difficult to solve directly with general optimization methods. To make the problem tractable, an efficient iterative algorithm based on the block coordinate descent and successive convex optimization techniques is proposed. The convergence of the proposed algorithm is verified analytically at the end of this paper. The simulation results show that, by alternately optimizing the relay selection, UAV transmit power, and UAV trajectory, the proposed algorithm converges quickly and significantly improves the average throughput compared with the benchmark schemes.

Studies have shown that the buffer aided relaying technique can effectively mitigate the adverse effects of time varying channels. It gives relays the ability to store data, departs from the traditional transmission mode, and flexibly exploits the best channel [7,8]. Therefore, in this paper, all UAV relays are assumed to be equipped with a buffer and employed in delay tolerant applications [4]. Benefiting from both relay node mobility and buffer aided relaying, UAV relays equipped with a buffer can store the information and forward it under more favorable conditions, provided that certain quality-of-service (QoS) requirements are satisfied.
Many studies have been carried out on UAVs, but little work considers buffer aided UAV relaying networks. In [2,6], the authors maximized the throughput of single-UAV and multi-UAV relaying systems, respectively, by jointly optimizing the UAV trajectory and transmit power. In [5], the authors derived a closed form propulsion power consumption model for rotary wing UAVs and minimized the total energy consumption of the network by jointly optimizing the UAV trajectory and communication time allocation. In [9], a basic principle combining the path loss exponent and small scale fading was proposed, which analyzed the optimization problem of large scale wireless networks by combining air-to-ground systems and widely used ground-to-ground channel models. In [10], the authors investigated the use of a UAV as a hybrid access point to provide wireless power transfer and assist obstructed ground communications in wireless powered communication networks. In [11][12][13], a new cooperative jamming method was studied to guarantee the communication security between UAVs and ground nodes, which leveraged the jamming from other nearby UAVs to prevent eavesdropping and combined UAV trajectory and communication design to maximize the average secrecy rate. The works in [14][15][16] focused on buffer aided UAV relaying networks. The authors in [14,15] considered the secrecy rate maximization problem in a two phase buffer aided single UAV relaying network. Based on the buffer constraint, the work in [16] considered a hybrid communication link consisting of free space optical and radio frequency links and maximized the average secrecy rate with the consideration of information causality.
The channel models between UAVs and ground nodes can be divided into three types: the line-of-sight (LoS) channel [2], the probabilistic LoS channel [17], and the Rician fading channel [18,19]. Because the LoS channel model ignores shadow fading and small scale fading, it may be inappropriate in urban/suburban areas. Therefore, the work in this paper is carried out under the Rician fading channel.
In this paper, a two phase buffer aided multi-UAV relaying network is studied, in which we choose two different UAVs to serve the source and destination at the same time, respectively. According to the buffer state and channel state information (CSI), the network will decide which UAV receives the data from the source and which UAV serves the destination. In particular, the UAV will never be chosen if its buffer is empty. In this paper, our goal is to maximize the end-to-end average throughput by jointly optimizing the relay selection, UAV transmit power control, and UAV trajectory based on the buffer constraints. However, the relay selection, UAV transmit power control, and UAV trajectory design are tightly coupled with each other in this formulated problem, which makes it difficult to solve directly via general optimization methods. To tackle the above problem, we relax the binary variables for relay selection into continuous variables and utilize the block coordinate descent and the successive convex optimization techniques to solve the considered problem with an efficient iterative algorithm [20]. The convergence of this iterative algorithm is proven analytically at the end of this paper.
The remainder of this paper is organized as follows. In Section 2, the buffer aided multi-UAV relaying system model and the problem formulation are introduced. In Section 3, the solution approach based on the block coordinate descent and the successive convex optimization techniques is proposed. In Section 4, the simulation results are presented to verify the convergence and performance of the proposed network. Finally, we summarize the conclusions of this paper in Section 5.

System Model
As illustrated in Figure 1, a buffer aided multi-UAV relaying network is considered. The model consists of K ≥ 2 UAVs, a source node S, and a destination node D. S and D are stationary ground nodes, and all UAVs are equipped with a buffer Q. For the convenience of analysis, the data rate instead of the packet rate is considered, and the data in the buffer follow the "first-in-first-out" rule. It is supposed that all UAVs operate in the time-division half-duplex mode and that the direct communication link between S and D is unavailable due to shadow fading and small-scale fading. In addition, it is assumed that, in each time slot, the considered network selects two different UAVs to communicate with S and D, respectively, according to the known CSI. There is always a UAV storing the data from S, and D will be served by a UAV if and only if that UAV's buffer is not empty. For interference avoidance, UAV j stores the data from S and UAV k forwards the data to D over orthogonal time slots, so the interference between UAVs is not considered. In order to make the UAV trajectory feasible, the distance between the initial and final location of the UAV during the flight period is assumed to satisfy the following constraint: Besides, in any time slot, all UAVs are subject to the maximum speed and collision avoidance constraints: The distance from UAV k to the ground nodes in time slot n is given by: In this paper, we consider low altitude UAVs, so tall ground obstacles and the UAV elevation angle are the key factors affecting whether air-to-ground (A2G) links are LoS links. Under practical conditions, the A2G channels of low altitude UAVs are affected by large scale fading and multi-path fading [21]. However, A2G links are still dominated by LoS links. Therefore, it is not appropriate to assume that A2G channels are purely LoS channels or purely Rayleigh channels. In the Rician fading channel, there are both LoS signals (main signals) and non-line-of-sight (NLoS) signals.
Therefore, in this paper, we assume that A2G communication channels obey Rician fading [18,19]. Specifically, the channels are dominated by the LoS path with random phases, and the amplitudes of the communication channels within one time slot are considered to be constant, because the horizontal location variation is virtually negligible compared with the altitude of the UAVs during a sufficiently small time slot [18]. Then, the Rician fading channel model is given by: where β is defined as the channel power gain at the reference distance d_0 = 1 m and G ≥ 0 denotes the Rician factor. The wavelength is λ = c/f_c, where c is the speed of light and f_c is the carrier frequency. For simplicity, θ_{m,k}[n] is modeled as a uniformly distributed random variable in [0, 2π) [18]. Moreover, g_{m,k}[n] is the coefficient of the Rayleigh fading component, which follows a complex Gaussian distribution with zero mean and unit variance.
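As a rough illustration of the channel description above, the following Python sketch draws Rician channel coefficients from a LoS term with a uniformly random phase and a unit-variance complex Gaussian scattered term. The function name `rician_channel`, the path loss exponent of 2, and the example values (d = 100 m, G = 10) are assumptions made for this sketch and are not taken from the paper's model equation.

```python
import numpy as np

def rician_channel(d, beta0, G, rng):
    """Draw one complex channel coefficient at distance d (metres).

    beta0: channel power gain at the 1 m reference distance (linear scale).
    G:     Rician factor, G >= 0 (G = 0 reduces to Rayleigh fading).
    Assumption: path loss exponent of 2, i.e. average power gain beta0 / d**2.
    """
    theta = rng.uniform(0.0, 2.0 * np.pi)  # random LoS phase in [0, 2*pi)
    # Scattered component: complex Gaussian with zero mean and unit variance.
    g = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2.0)
    los = np.sqrt(G / (G + 1.0)) * np.exp(1j * theta)
    nlos = np.sqrt(1.0 / (G + 1.0)) * g
    return np.sqrt(beta0) / d * (los + nlos)

rng = np.random.default_rng(0)
beta0 = 1e-7       # linear value of a -70 dB gain
distance = 100.0   # hypothetical UAV-to-node distance in metres
samples = np.array([rician_channel(distance, beta0, 10.0, rng) for _ in range(20000)])
mean_gain = np.mean(np.abs(samples) ** 2)  # should approach beta0 / distance**2
```

Because the LoS and scattered terms are weighted by G/(G+1) and 1/(G+1), the average power gain depends only on the distance, while G controls how strongly individual samples fluctuate around it.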
A binary variable a_{k,m}[n], m ∈ {S, D}, ∀n, k, is defined to indicate whether UAV k communicates with ground node S or D in time slot n, i.e., a_{k,m}[n] = 1 means communicating, and a_{k,m}[n] = 0 means not communicating. It is also assumed that S and D communicate with different UAVs and that each of S and D communicates with at most one UAV in the same time slot; thus the relay selection constraints can be obtained as: Besides, p_S is the constant transmit power of S, and p_k[n] denotes the transmit power of UAV k in time slot n, which satisfies both the average power constraint and the peak power constraint, expressed by P_mean and P_peak, respectively. Thus, the transmit power satisfies the following constraints: According to the above discussion and analysis, the maximum achievable rate from S to UAV j in time slot n can be written as: where σ^2 is defined as the power of the additive white Gaussian noise (AWGN) at the receiver. Accordingly, these data will be temporarily stored in the buffer of UAV j. At the end of time slot n, the remaining data in the buffer of UAV j can be expressed as: Then, the maximum achievable rate from UAV k to D in time slot n can be written as: Therefore, at the end of time slot n, the remaining data in the buffer of UAV k can be expressed as: Considering the buffer constraint, R_{k,D}[n] can be reformulated as: Since the buffer equipped on the UAV is allowed to store the data collected from S in time slot n and forward it to D in any of the remaining slots, we impose information causality constraints, which require that the data forwarded by a UAV have already been received from S in earlier time slots.
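The buffer evolution described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `step_buffer`, its arguments, and the example rates are hypothetical, the buffer is unbounded, and a UAV either receives from S or forwards to D in a slot, never both.

```python
def step_buffer(q_prev, recv_rate, send_rate, from_s, to_d):
    """Advance one UAV buffer by one time slot.

    q_prev:    data in the buffer at the end of the previous slot.
    recv_rate: achievable rate from S to this UAV in the current slot.
    send_rate: achievable rate from this UAV to D in the current slot.
    from_s / to_d: relay selection indicators for this UAV and slot.
    Returns (new buffer level, amount actually forwarded to D).
    """
    # Half-duplex relaying: a UAV cannot serve S and D in the same slot.
    assert not (from_s and to_d)
    # Information causality: only data that is already stored can be forwarded.
    served = min(send_rate, q_prev) if to_d else 0.0
    stored = recv_rate if from_s else 0.0
    return q_prev - served + stored, served

# Slot 1: the UAV is selected by S and stores 3.0 units of data.
q1, served1 = step_buffer(0.0, recv_rate=3.0, send_rate=5.0, from_s=True, to_d=False)
# Slot 2: the UAV is selected for D; its link supports 5.0 but only 3.0 is stored.
q2, served2 = step_buffer(q1, recv_rate=2.0, send_rate=5.0, from_s=False, to_d=True)
```

In the second slot the forwarded amount is capped by the buffer content rather than by the link rate, which is the effect the buffer constraint on the UAV-to-D rate is meant to capture.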

Problem Formulation
, ∀k, n}, which corresponds to the relay selection, UAV transmit power, and UAV trajectory design, respectively. It is assumed that the locations of the stationary ground nodes are known, and our goal is to maximize the average throughput of the network by jointly optimizing A, P, and Q. We formulate the average throughput maximization problem as: Inequality (17b) is the information causality constraint, which implies that the data forwarded by a UAV must already be stored in its buffer. Inequalities (1)-(3) are constraints on the UAV's mobility, including the UAV's initial and final location, the maximum speed of the UAV, and collision avoidance. Inequalities (6)-(8) are relay selection constraints, and Inequalities (9) and (10) are the average power and peak power constraints on the transmit power. Since all UAVs can only transmit the data that have been previously stored from S, according to the Shannon formula, Problem (17) can be rewritten as: Since the objective function is non-concave and the constraints are non-convex, Problem (18) is non-convex and difficult to solve directly. First, since the optimization variables A for relay selection are binary, the constraints in (6)-(8) involve integer constraints. Second, Constraint (3) is non-convex with regard to the UAV trajectory variables Q. Finally, the information causality constraint in (18b) is non-convex because of the coupling of the three optimization variables. In the next section, an efficient algorithm is proposed to make the problem mathematically tractable.
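To make the mobility constraints concrete, the following Python sketch checks a candidate trajectory against a per-slot maximum speed limit and a pairwise minimum separation. The array layout, the function name `trajectory_feasible`, the slot length `delta`, and the separation `d_min` are assumptions made for this illustration. The constraint on the initial and final locations is omitted.

```python
import numpy as np

def trajectory_feasible(q, v_max, delta, d_min):
    """Check speed and collision avoidance constraints for K UAVs.

    q:     array of shape (K, N, 2) with horizontal positions per UAV and slot.
    v_max: maximum speed in m/s.
    delta: slot length in seconds (assumed value).
    d_min: minimum safety distance in metres (assumed value).
    """
    # Displacement between consecutive slots must not exceed v_max * delta.
    steps = np.linalg.norm(np.diff(q, axis=1), axis=-1)
    speed_ok = bool(np.all(steps <= v_max * delta + 1e-9))
    # Every pair of UAVs must stay at least d_min apart in every slot.
    sep_ok = True
    num_uavs = q.shape[0]
    for j in range(num_uavs):
        for k in range(j + 1, num_uavs):
            sep = np.linalg.norm(q[j] - q[k], axis=-1)
            sep_ok = sep_ok and bool(np.all(sep >= d_min - 1e-9))
    return speed_ok and sep_ok

# Two UAVs approaching each other along the x-axis in 20 m steps.
q = np.zeros((2, 3, 2))
q[0, :, 0] = [-300.0, -280.0, -260.0]
q[1, :, 0] = [300.0, 280.0, 260.0]
ok_fast = trajectory_feasible(q, v_max=60.0, delta=0.5, d_min=10.0)
ok_slow = trajectory_feasible(q, v_max=30.0, delta=0.5, d_min=10.0)
```

With a 0.5 s slot, a speed limit of 60 m/s allows 30 m per slot, so the 20 m steps are feasible, whereas a 30 m/s limit allows only 15 m per slot and the same trajectory is rejected.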

The Proposed Algorithm
In this section, to make Problem (18) more tractable, a method based on the alternating optimization of variable blocks is proposed. We decompose the original Problem (18) into two sub-problems that are solved iteratively. First, Sub-problem 1 optimizes the relay selection A and transmit power P for a given UAV trajectory Q. Second, Sub-problem 2 optimizes the UAV trajectory Q for a given relay selection A and power allocation P. Finally, we present the overall algorithm and prove its convergence. By iterating between the two sub-problems, we obtain a suboptimal solution to the original problem.

Sub-Problem 1: Optimizing Relay Selection and UAV Power Control for a Given UAV Trajectory
For a given UAV trajectory, the relay selection and transmit power allocation of Sub-problem 1 can be expressed as the following problem: We note that Problem (19) is still non-convex because the constraints in (6)-(8) involve integer constraints and the information causality constraint in (18b) is non-convex. To handle the integer constraints, we relax the binary variables in (8) into continuous variables and introduce slack variables u_{k,m}[n], which yields Problem (20). It can be proven that Problem (20) attains an optimal solution at which Constraints (20c) and (20d) hold with equality. Therefore, Problems (20) and (19) have the same optimal value and the same optimal solutions of A and P. Next, we concentrate on solving Problem (20).
However, Problem (20) is still non-convex since the variables A and the slack variables u_{k,m}[n] are coupled and the objective function is non-concave. Generally, a maximization problem with a non-concave objective function is non-convex and thus difficult to solve optimally. To address this, a local convex approximation method is adopted, which ensures convergence to a locally optimal solution. Thus, a_{k,m}[n] u_{k,m}[n], m ∈ {S, D}, can be reformulated as [22]: Then, since the first-order Taylor expansion of a convex function at any point is a global lower bound of that function [23], the lower bound function of (a_{k,m}[n] + u_{k,m}[n])^2 After such approximation, the objective function is transformed into a concave function, so Problem (19) can be optimized by solving the following problem: Problem (25) attains an optimal solution at which Constraints (25g) and (25h) hold with equality. Problem (25) is convex and can be solved effectively by the interior-point method or by standard convex optimization tools such as CVX [24]. In this way, the optimal solution {A, P} of Problem (25) is obtained for a given Q. In addition, both A and P serve as the inputs for Sub-problem 2 in the next subsection.
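The lower bound used in this step can be checked numerically. The Python sketch below linearizes the convex term (a + u)^2 around a point (a0, u0) and, as an assumption for illustration, combines it with the identity a·u = ((a + u)^2 − (a − u)^2)/4 to obtain a concave lower bound on the product. The paper's exact reformulation from [22] may differ, and the function names are hypothetical.

```python
def square_lower_bound(a, u, a0, u0):
    """First-order Taylor expansion of (a + u)**2 at (a0, u0).

    Because (a + u)**2 is convex, this tangent plane is a global lower bound.
    """
    s0 = a0 + u0
    return s0 ** 2 + 2.0 * s0 * ((a + u) - s0)

def product_lower_bound(a, u, a0, u0):
    """Concave lower bound of a * u.

    Assumption: uses a*u = ((a+u)**2 - (a-u)**2) / 4 and replaces the first
    square by its tangent plane; the second square stays and enters negatively.
    """
    return (square_lower_bound(a, u, a0, u0) - (a - u) ** 2) / 4.0

# Evaluate the gap between a*u and its bound on a grid over [0, 1] x [0, 1].
grid = [i / 10.0 for i in range(11)]
a0, u0 = 0.5, 0.3
gaps = [a * u - product_lower_bound(a, u, a0, u0) for a in grid for u in grid]
min_gap = min(gaps)
tight_gap = a0 * u0 - product_lower_bound(a0, u0, a0, u0)
```

The gap is non-negative everywhere and vanishes at the expansion point, which is the property the convergence argument relies on when the expansion point is updated with the previous iterate.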

Sub-Problem 2: Optimizing the UAV Trajectory for a Given Relay Selection and Power Control
For a given relay selection and transmit power, by defining P n = It can be shown that when the constraint (27c) takes the equal sign, there is an optimal solution for Problem (27), because we can decrease t k,m [n] , m ∈ {S, D} to improve the objective value. Consequently, Problems (26) and (27) have the same optimal value and optimal solution of Q. Next, we concentrate on handling Problem (27).
It is easy to see that t_{k,m}[n] and log2(1 + P_n/t_{k,m}[n]) are convex. However, the objective function is non-concave, and the constraints in (3) and (27b) are non-convex, so Problem (27) is still non-convex. In order to make Problem (27) solvable, we obtain a lower bound of the objective function of (27) by taking the first-order Taylor expansion of log2(1 + P_n/t_{k,m}[n]) at the points t^f_{k,m}[n] and then maximize the resulting concave bound to obtain the solution of Sub-problem 2. Specifically, the first-order Taylor expansion of the function log2(A + B/x) is: Therefore, the first-order Taylor expansion of log2(1 + P_n/t_{k,m}[n]) is given below: Then, Constraint (27b) can be rewritten as: , which can be written as: Then, Problem (26) can be approximated as the following problem: After such approximation, we note that the objective function of Problem (32) is concave and all constraints are convex. Therefore, Problem (32) is a convex optimization problem that can be efficiently solved by optimization solvers. In addition, the optimal objective value obtained from the approximate Problem (32) is a lower bound on that of Problem (26).
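For reference, the tangent of log2(A + B/x) at a point x0 > 0 has slope −B·log2(e)/(x0(A·x0 + B)), and for A, B > 0 the function is convex in x, so the tangent lies below it. The Python sketch below verifies this numerically; the function names and the test values A = 1, B = 3, x0 = 2 are chosen only for illustration.

```python
import math

def log_term(x, A, B):
    """The convex function log2(A + B / x) for x > 0 and A, B > 0."""
    return math.log2(A + B / x)

def log_term_lower_bound(x, x0, A, B):
    """First-order Taylor expansion of log2(A + B / x) at x0.

    The derivative at x0 is -B * log2(e) / (x0 * (A * x0 + B)).
    """
    slope = -B * math.log2(math.e) / (x0 * (A * x0 + B))
    return log_term(x0, A, B) + slope * (x - x0)

A, B, x0 = 1.0, 3.0, 2.0
xs = [0.5 + 0.25 * i for i in range(39)]  # grid from 0.5 to 10.0
gaps = [log_term(x, A, B) - log_term_lower_bound(x, x0, A, B) for x in xs]
min_gap = min(gaps)
gap_at_x0 = log_term(x0, A, B) - log_term_lower_bound(x0, x0, A, B)
```

Setting A = 1 and B = P_n gives the bound on log2(1 + P_n/t) at a local point, and the bound is affine in t.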

Overall Algorithm and Convergence
In summary, the proposed algorithm applies the block coordinate descent method to find a suboptimal solution of Problem (18) by alternately solving the two sub-problems (25) and (32) in an iterative manner. Specifically, we divide the optimization variables of the original Problem (18) into three blocks, i.e., the relay selection A, UAV transmit power P, and UAV trajectory Q. Then, we alternately solve the two sub-problems (25) and (32) to update the block variables {A, P} and Q. Algorithm 1 details the steps of the proposed algorithm, in which R^l = f(A^l, P^l, Q^l) is the objective value of Problem (18) with variables A^l, P^l, and Q^l in iteration l, and ε > 0 is the convergence tolerance.
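Algorithm 1 Proposed joint relay selection, power allocation, and trajectory optimization algorithm.
1: Initialization: Set the initial feasible solution {A^0, P^0, Q^0}. Set the accuracy tolerance ε, the iteration number l = 0, and R^l = f(A^l, P^l, Q^l).
2: repeat
3: For given {A^l, P^l, Q^l}, solve Problem (25) to find the optimal solution {A^{l+1}, P^{l+1}}.
4: For given {A^{l+1}, P^{l+1}, Q^l}, solve Problem (32) to find the optimal solution Q^{l+1}.
5: Update l = l + 1.
6: until the increase of the objective value R^l is below the tolerance ε.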
Next, we show the convergence of the proposed algorithm. Let us define R^{lb,l}_{A,P}(A, P, Q) = R^l_{A,P} and R^{lb,l}_Q(A, P, Q) = R^l_Q as the objective values of Problems (25) and (32) based on A, P, and Q, respectively. Recall that a non-decreasing sequence that is bounded above is convergent. First, it is obvious that the objective values we obtain are all non-negative. For given A^l, P^l, and Q^l, the details of obtaining the optimal solution of (25) in Step 3 can be written as [26]: where R(A^l, P^l, Q^l) is the objective value of Problem (18) at the given solution {A^l, P^l, Q^l}. Since Inequality (23) is obtained from the first-order Taylor expansion of the objective function, the approximation is tight at the expansion point {A^l, P^l}, i.e., the approximate and original objective values coincide there. Therefore, Equality (a) is satisfied. For given A^l, P^l, Q^l, {A^{l+1}, P^{l+1}} is the optimal solution of Problem (25), so Inequality (b) holds. According to the previous discussion, the objective value of Problem (25) is a lower bound on that of Problem (19) at {A^{l+1}, P^{l+1}}; hence, Inequality (c) is established. According to the inequality in (33), although Problem (25) is only an approximation used to obtain the relay selection and UAV transmit power, the objective value of the original Problem (18) is still non-decreasing. Similarly, the details of obtaining the optimal solution of (32) in Step 4 show that: Combining (33) and (34), it can be concluded that: Based on the analysis above, we can conclude that the objective value of Problem (18) is non-decreasing over the iterations l and is upper bounded by a finite value. Therefore, the sequence of objective values is non-negative, non-decreasing, and bounded, i.e., the proposed algorithm converges.
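The alternating structure and the stopping rule can be illustrated with a toy problem. The Python sketch below is not the paper's algorithm: the objective `f`, the closed-form block updates, and the helper `alternate_maximize` are assumptions chosen so that each block update is an exact maximizer, which is enough to exhibit the non-decreasing objective sequence used in the convergence argument.

```python
def alternate_maximize(f, solve_block1, solve_block2, x0, y0, eps=1e-6, max_iter=100):
    """Alternately maximize f over two blocks until the gain is at most eps."""
    x, y = x0, y0
    history = [f(x, y)]
    for _ in range(max_iter):
        x = solve_block1(y)   # plays the role of Step 3: update the first block
        y = solve_block2(x)   # plays the role of Step 4: update the second block
        history.append(f(x, y))
        if history[-1] - history[-2] <= eps:
            break
    return x, y, history

# Toy concave objective with coupled blocks; its maximizer is (1, 1) with value 1.
def f(x, y):
    return -x ** 2 - y ** 2 + x * y + x + y

x_opt, y_opt, history = alternate_maximize(
    f,
    solve_block1=lambda y: (y + 1.0) / 2.0,  # argmax over x for fixed y
    solve_block2=lambda x: (x + 1.0) / 2.0,  # argmax over y for fixed x
    x0=0.0,
    y0=0.0,
)
```

Each update can only increase the objective, and the objective is bounded above, so the recorded values form a non-decreasing convergent sequence. In the paper's setting the block updates would be the solutions of the convex Problems (25) and (32) rather than closed-form expressions.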
Next, we discuss the complexity of the proposed algorithm. As shown in Algorithm 1, the key idea of the proposed algorithm is to alternately optimize the relay selection, UAV transmit power control, and UAV trajectory. To make the problem more tractable, we decompose the original problem into two sub-problems that are solved iteratively. In each iteration, the main complexity of the proposed algorithm lies in Steps 3 and 4, which require solving a series of convex optimization problems. The computational costs of Steps 3 and 4 are about O(L(KN + 2N)^3) and O(L(2N)^3), respectively, where K is the number of UAVs, N is the number of time slots, and L is the number of iterations of the proposed algorithm [27,28].

Numerical Results
In this section, we provide simulation results to verify the effectiveness of the proposed algorithm for buffer aided multi-UAV relaying networks with joint relay selection, UAV transmit power, and UAV trajectory optimization. We assumed that the network had K = 2 UAVs and that the locations of S and D were (−500, 0, 0) and (500, 0, 0), respectively. The initial locations of the UAVs were set as (−300, 0, 100) and (300, 0, 100), respectively. It was assumed that all UAVs flew at the fixed altitude H = 100 m and that the maximum speed of the UAVs was V_max = 60 m/s. The communication bandwidth was assumed to be B = 10 MHz, and the noise power was σ^2 = −110 dBm. The channel power gain at the reference distance d_0 = 1 m was set as β = −70 dB. The transmit power of S and the maximum transmit power of the UAVs were assumed to be p_S = 0.1 W and P_max = 0.4 W, respectively. In addition, the average power was P_mean = P_max/4. The initial transmit power of the UAVs was assumed to be equal to p_S, i.e., p_k[n] = p_S. Furthermore, we assumed that the buffers of the UAVs were of infinite size.
Figure 2 shows the UAV trajectories optimized by the proposed algorithm for the flight period T = 60 s. The initial locations of the UAVs and the ground nodes are marked with purple and black markers, respectively. In this figure, each asterisk indicates a UAV location sampled every 0.5 s. The initial and final locations of all UAVs satisfied the feasibility constraint in (1), and all UAVs kept the minimum safety distance to avoid collision. From this figure, we can see that the two UAVs tried to get close to their served ground nodes in every time slot under the flying constraints. Because the UAVs served the source or destination in different time slots, the trajectories of UAV1 and UAV2 may sometimes be a straight line from the source to the destination node.
Then, in Figure 3, the flight safety of the proposed algorithm is demonstrated for the flight period T = 60 s. It can be observed that the distance between the two UAVs varied with time and always exceeded the minimum safety distance. Therefore, the trajectories of the two UAVs satisfied the collision avoidance constraints in every time slot.
Figure 3. Distance between UAV1 and UAV2 compared with the minimum safety distance.
In Figure 4, the average throughput of the proposed joint optimization is compared with that of the benchmark schemes. First, as expected, the average throughput could be increased by optimizing any variable (relay selection and transmit power and/or flight trajectory). Second, it is observed that the proposed joint optimization algorithm could greatly increase the average throughput in comparison with optimizing a single block of variables. The results further demonstrated that the trajectory optimization was more effective than optimizing the relay selection and transmit power. Finally, it was found that, with the joint optimization algorithm, the average throughput improved as the flight period increased. In short, the proposed algorithm could significantly improve the system performance in terms of throughput.
Figure 4. Average throughput (bit/s) of four schemes: no optimization, only relay and power optimized, only trajectory optimized, and joint optimization.
Next, we demonstrate the convergence behavior of Algorithm 1 in Figure 5, which illustrates the average throughput of the network for different flight periods. From this figure, it can be observed that, as the number of iterations of the proposed algorithm increased, the average throughput of the network gradually converged, and convergence was achieved after about six iterations. Therefore, the curves matched our expectation and verified the effectiveness of the proposed algorithm in maximizing the average throughput. Furthermore, it can be found that the system average throughput could be increased by prolonging the flight period.

Conclusions
A buffer aided multi-UAV relaying network with joint trajectory and communication design was studied in this paper. Specifically, based on information causality and aiming at maximizing the average throughput of the network, the relay selection, UAV transmit power, and UAV trajectory were jointly optimized. Since the maximization problem was a mixed integer non-convex problem, it was reformulated and decomposed into two sub-problems that were solved iteratively through the block coordinate descent and successive convex optimization techniques. The convergence of the proposed algorithm was then verified analytically. Numerical results demonstrated that Algorithm 1 could significantly improve the average throughput of the considered network. Furthermore, the transition of the buffer states was taken into account, and the proposed algorithm can avoid collisions between UAVs in multi-UAV relaying networks.