Iterative Trajectory Optimization for Physical-Layer Secure Buffer-Aided UAV Mobile Relaying

With the fast development of commercial unmanned aerial vehicle (UAV) technology, there is increasing research interest in UAV communications. In this work, the mobility and deployment flexibility of UAVs are exploited to form a buffer-aided relaying system that assists terrestrial communication when the direct link is blocked. Optimal trajectory design for the UAV-enabled mobile relaying system with a randomly located eavesdropper is investigated from the physical-layer security perspective to improve the overall secrecy rate. Based on the mobility of the UAV relay, a wireless channel model that changes with the trajectory is established and exploited for improved secrecy. The secrecy rate is maximized by optimizing the discretized trajectory anchor points subject to the information causality and UAV mobility constraints. However, the problem is non-convex and therefore difficult to solve. To make the problem tractable, we instead optimize the increments of the trajectory anchor points iteratively in a two-dimensional space, which decomposes the problem into a sequence of progressively refined convex approximation problems. Convergence of the proposed iterative trajectory optimization technique is proved analytically by the squeeze principle. Simulation results show that finding the optimal trajectory by iteratively updating the displacements is effective and converges quickly. The simulation results also show that the distribution of the eavesdropper location influences the security performance of the system. Specifically, an eavesdropper further away from the destination is beneficial to the system's overall secrecy rate. Furthermore, an eavesdropper further away from the destination also results in shorter trajectories, which implies better energy efficiency as well.

investigated. In [26], secrecy rate maximization was achieved by optimal power allocation at the source and the relay. Zhang et al. added the UAV trajectory to the design problem and studied maximization of the sum secrecy rate of the UAV by jointly designing the UAV trajectory and the transmit power control [27]. However, these works rely on a strong assumption of fixed and known eavesdropper location. More recently, multiple potential eavesdroppers with imperfect knowledge of the eavesdropper locations were considered in [28], where robust design of the UAV trajectory and the transmit power for PHY security optimization was investigated. Still, how UAV-enabled secure mobile relaying benefits from buffer-aided relaying is under-investigated.
In this work, PHY security of a buffer-aided UAV mobile relaying system is studied. Specifically, a four-node system model containing a source, a destination, a UAV mobile relay with finite data buffer, and a randomly located eavesdropper is considered. The sum secrecy rate of the system is maximized through UAV relay trajectory optimization. The main contributions of this work are summarized in the following.

•
Instead of making a strong assumption of known and static eavesdropper location/channel, in this work, a randomly located eavesdropper with only the statistical information of its location known to the legitimate system is considered in the secure trajectory design for buffer-aided UAV mobile relaying.

•
By discretizing the total flight time into N equal quasi-static time slots and exploiting the buffer-aided relaying protocol, a sum secrecy rate maximization problem is formulated to find the optimal UAV relay trajectory anchor points that achieve the maximum sum secrecy rate.

•
The lower bounds of the maximal achievable rates are derived through Taylor expansion. The accuracy of the lower bounding technique is guaranteed by additionally upper bounding the rates in the constraints of the optimization problem.

•
To make the original non-convex problem tractable, an iterative trajectory optimization scheme is proposed. Specifically, instead of optimizing the trajectory anchor points of the UAV directly, the increments from the previous iteration for each anchor point are iteratively optimized. The problem is then decomposed into successive convex approximation subproblems by invoking the rate bounds in an iterative procedure. The convergence of this trajectory iteration method is proved analytically by the squeeze principle.
Simulation results illustrate that the method of finding the optimal trajectory by iterative incrementing of the anchor points is effective and fast converging. The simulation results show that the trajectory of the UAV converges in around 10 iterations, and the performance of the system's sum secrecy rate is significantly improved. The location of the eavesdropper affects the security performance of the system. Specifically, the eavesdropper further away from the destination is more favorable to the system's secrecy capacity. Furthermore, it was observed that having higher maximum UAV speed is also beneficial to the improvement of the secrecy rate performance.
The remainder of this paper is organized as follows. In Section 2, we present the buffer-aided UAV relaying system model and give an initial description of the trajectory optimization problem. In Section 3, the solution approach based on decomposition and progressive convex approximation of the original non-convex problem is proposed. Three propositions are presented, and we prove analytically that the trajectory iteration method converges. Simulation results are presented in Section 4, and concluding remarks are made in Section 5.

System Model and Problem Description
A UAV mobile relaying wireless communication system model as shown in Figure 1 is considered in this work. There are four single-antenna nodes in the model: a single source (S), a single destination (D), a UAV mobile relay (R), and an eavesdropper (E). Suppose the source and the destination are fixed on a straight line on the ground, which is designated as the $d_x$ axis in the model. The positions of the source and the destination in the two-dimensional (2D) space are denoted by $(L_s, 0)$ and $(L_d, 0)$, respectively. The ground-based eavesdropper is located at $(L_e, 0)$. In this work, it is assumed that $L_e$ is a random variable, and a uniformly distributed $L_e$ is considered in the subsequent analysis to demonstrate the proposed solution approach. Specifically, $L_e$ is uniformly distributed in an interval $[a, b]$, where $a$ and $b$ are two real-valued constants with $a \le b$. This work and the proposed solution technique can be extended to scenarios with more complex geometries of the node locations. Direct communication between the source and the destination is assumed to be blocked. In addition, it is assumed that the eavesdropper cannot receive direct transmissions from the source either. The UAV moves in the 2D geographical area at a fixed height $h$ above the terrestrial communication system to assist communications between the source and the destination. The use of the UAV relay also raises information security issues, because the ground-based eavesdropper can now receive the signals forwarded by the UAV relay. Ignoring the take-off and landing processes, the UAV serves as a mobile relay for a finite time horizon $T$, and its starting and ending points are denoted as SP and EP, respectively, as shown in Figure 1. For convenience, we designate the location of SP as the origin, and the location of EP is denoted as $(L, 0)$. As the UAV moves, the distance between the UAV and each terminal is constantly changing, and the channel gains of the corresponding communication links change accordingly.
A dynamic channel model is established to reflect these changes with the UAV location. The UAV relay's service time interval $T$ is divided into $N$ equally spaced time slots. Each time slot is sufficiently short to guarantee the quasi-static assumption, i.e., the wireless channels are almost constant within one time slot. The $N$ time slots then correspond to $N$ decision instants for the trajectory, and the UAV position in the $n$th time slot is denoted by $(d_x[n], d_y[n])$. The channel coefficients of the S-R, R-D, and R-E links in the $n$th time slot are denoted by $h_{sr}[n]$, $h_{rd}[n]$, and $h_{re}[n]$, respectively. It is assumed that the UAV relay is flying at a height where the path is clear of obstacles to allow total freedom in the trajectory design. This requires the UAV to fly at a relatively high altitude to be well above all the buildings. Consequently, as discussed in [13], the line-of-sight (LOS) path dominates the air-to-ground channel. The large-scale free-space path-loss is the dominating factor in $h_{sr}[n]$, $h_{rd}[n]$, and $h_{re}[n]$. The S-R channel path-loss is given as [29]

$$PL_{sr}[n] = PL_{sr}(d_0) + 10\bar{n}\log\left(d_{sr}[n]/d_0\right), \quad n = 1, \ldots, N, \tag{1}$$

where $d_0$ is the free-space reference distance, $\bar{n}$ is the path-loss exponent, and $d_{sr}[n]$ is the distance between the transmitter and the receiver in time slot $n$. A path-loss exponent $\bar{n} = 2$ is used due to the large elevation angle of the air-to-ground communication system model under consideration [13]. As a result, (1) is written as

$$PL_{sr}[n] = C_0 + 20\log\left(d_{sr}[n]\right), \quad n = 1, \ldots, N, \tag{2}$$

where $C_0 = PL_{sr}(d_0) - 20\log(d_0)$. Let $C = 10^{C_0/10}$, the large-scale S-R channel coefficient in time slot $n$ is approximately given as , $n = 1, \ldots, N$.
The approximate channel coefficients of the R-D and R-E channels can be obtained similarly as and , $n = 1, \ldots, N$.
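The distance-dependent model above can be illustrated with a short numerical sketch. The Python snippet below is only an illustration, not the authors' code: the function names, the example coordinates, and the reference loss `pl_ref_db` are hypothetical, and the logarithm is assumed to be base 10 (path loss in dB). It evaluates the UAV-to-ground distance for each anchor point and the resulting log-distance path loss with exponent 2.

```python
import math

def distance_to_ground_node(uav_xy, node_x, height):
    """Distance from a UAV at (x, y, height) to a ground node at (node_x, 0, 0)."""
    dx = uav_xy[0] - node_x
    dy = uav_xy[1]
    return math.sqrt(dx * dx + dy * dy + height * height)

def path_loss_db(distance, pl_ref_db, d_ref=1.0, exponent=2.0):
    """Log-distance path loss: PL(d0) + 10 * exponent * log10(d / d0)."""
    return pl_ref_db + 10.0 * exponent * math.log10(distance / d_ref)

# Hypothetical values, for illustration only.
anchors = [(0.0, 0.0), (100.0, 50.0), (200.0, 80.0)]  # (d_x[n], d_y[n]) in metres
source_x = 200.0                                       # assumed L_s
uav_height = 100.0                                     # assumed h

# Path loss of the S-R link in each time slot; it changes only with the trajectory.
losses_db = [
    path_loss_db(distance_to_ground_node(q, source_x, uav_height), pl_ref_db=40.0)
    for q in anchors
]
```

Because the loss depends only on the anchor points, the whole sequence can be computed off-line before the flight, which is the property the trajectory design relies on.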
As the UAV relay moves in the 2D geographic area in the air, the wireless-channel states constantly change, resulting in different h sr [n], h rd [n], and h re [n] values in different time slots. The corresponding achievable rate and secrecy rate also change accordingly. In contrast to conventional wireless communication systems where the channel coefficients' changes with time are mainly due to fading that has a random nature, in the UAV mobile relaying system studied in this work, based on the above assumptions of the air-to-ground channels, these changes are primarily determined by the UAV trajectory and therefore can be planned ahead, in an off-line manner. It is then possible to improve the sum achievable secrecy rate of the UAV mobile relaying system by designing a favorable UAV trajectory. The computation task of finding the optimal trajectory, as a result, can be offloaded to a ground-based computing facility with controllable communication overhead considering the limited computing power and battery lifetime of the UAV relay. This is important to the practical implementation of the proposed design technique.
Denote by $R_s[n]$ and $R_d[n]$ the maximum achievable rates of the S-R and R-D channels in the $n$th time slot. It is straightforward to show that and where $p_s$ and $p_r$ represent the transmit power of the source and the UAV relay, respectively, $W$ is the communication bandwidth, and $N_0$ is the power spectral density of the additive white Gaussian noise (AWGN). Because only statistical information about the eavesdropper location is known to the legitimate communication system, and the UAV position keeps changing over time, the eavesdropper's ergodic achievable rate in the $n$th time slot, denoted by $R_e[n]$, is a reasonable measure of the eavesdropper capability. By definition of the ergodic rate, $R_e[n]$ is the expected value of the R-E rate over the distribution of the eavesdropper location.
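The ergodic eavesdropper rate can be approximated numerically, which may help when checking a closed-form expression. The sketch below rests on two assumptions that are not confirmed by the text: the per-slot rate has the Shannon form $W\log_2(1 + p\,g/(N_0 W))$, and the channel power gain is $g = 1/(C d^2)$. All function names and parameter values are hypothetical. It averages the R-E rate over a uniformly distributed $L_e$ with the midpoint rule.

```python
import math

def shannon_rate(power, gain, bandwidth, noise_psd):
    # Assumed rate form: W * log2(1 + p * g / (N0 * W)), with g a power gain.
    return bandwidth * math.log2(1.0 + power * gain / (noise_psd * bandwidth))

def power_gain(distance, c_lin):
    # Assumed free-space power gain 1 / (C * d^2).
    return 1.0 / (c_lin * distance ** 2)

def ergodic_eve_rate(uav_xy, height, a, b, power, c_lin, bandwidth, noise_psd,
                     samples=2000):
    # Midpoint-rule average of the R-E rate over L_e ~ Uniform[a, b].
    step = (b - a) / samples
    total = 0.0
    for k in range(samples):
        l_e = a + (k + 0.5) * step
        d = math.sqrt((uav_xy[0] - l_e) ** 2 + uav_xy[1] ** 2 + height ** 2)
        total += shannon_rate(power, power_gain(d, c_lin), bandwidth, noise_psd)
    return total / samples

# Hypothetical parameters, for illustration only.
params = dict(height=100.0, power=0.1, c_lin=1e4, bandwidth=1e6, noise_psd=1e-17)
near = ergodic_eve_rate((400.0, 50.0), a=300.0, b=500.0, **params)
far = ergodic_eve_rate((400.0, 50.0), a=700.0, b=900.0, **params)
```

With these illustrative numbers, `near` exceeds `far`, i.e., the eavesdropper's average rate is larger when its location interval is closer to the UAV.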
The idea of PHY security is based on the notion of perfect secrecy, which requires that the information leaked to the eavesdropper about the transmitted message be asymptotically zero. The maximal achievable secrecy rate, or secrecy capacity, characterizes the maximal rate at which the legitimate receiver can reliably recover the message while the eavesdropper obtains no information about the message. The underlying idea is that the existence of the eavesdropper undermines reliable transmission between the legitimate parties from the information security perspective: the mutual information between the legitimate parties is penalized by the amount of the mutual information of the transmitter-eavesdropper link. Conditioned on the quasi-static fading in one time slot, the second-hop (R-D/R-E) channel can be modeled as a discrete memoryless AWGN wire-tap channel. The corresponding ergodic secrecy rate in the $n$th time slot is then given as

To improve PHY security in the trajectory design, the following optimization problem P1 is formulated, which maximizes the sum ergodic secrecy rate by finding the optimal UAV trajectory points $(d_x[n], d_y[n])$ for all $n = 2, \ldots, N$.
where v and B represent the UAV's maximum speed and buffer size, respectively. Equation (9a) is the information causality and buffer size constraint for buffer-aided relaying, which implies that the forwarded secrecy packets must be cached in a buffer of size no larger than B. And (9b) sets constraints on the UAV's mobility, taking into consideration both the UAV's starting and ending locations as well as the maximum UAV speed. Owing to the form of the objective function and the information causality constraint (9a), it can be shown that the original problem P1 is non-convex. In the following section, we reformulate P1 by change of variables and successive convex approximation to make the problem mathematically tractable.
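A feasibility check for the two constraint groups can be sketched as follows. The exact forms of (9a) and (9b) are not reproduced here, so the snippet assumes common forms: the relay may forward in slot $n$ only data received in earlier slots, the backlog never exceeds $B$, and the displacement per slot is at most $v$ times the slot length. The function names, the slot indexing, and the tolerance are hypothetical.

```python
import math

def mobility_ok(path, start, end, v_max, slot_len, tol=1e-9):
    """path: anchor points including the fixed start and end points (assumed)."""
    if math.dist(path[0], start) > tol or math.dist(path[-1], end) > tol:
        return False
    max_step = v_max * slot_len  # largest displacement allowed in one slot
    return all(math.dist(p, q) <= max_step + tol for p, q in zip(path, path[1:]))

def causality_ok(received, forwarded, buffer_size, tol=1e-9):
    """received[n], forwarded[n]: data received / forwarded by the relay in slot n."""
    cum_in = 0.0
    cum_out = 0.0
    for r_in, r_out in zip(received, forwarded):
        cum_out += r_out
        if cum_out > cum_in + tol:               # forwarding data not yet received
            return False
        cum_in += r_in
        if cum_in - cum_out > buffer_size + tol:  # backlog exceeds buffer size B
            return False
    return True

# Hypothetical example: three slots, buffer of 4 units.
feasible = causality_ok([2.0, 2.0, 0.0], [0.0, 2.0, 2.0], buffer_size=4.0)
```

A check of this kind is useful for validating a candidate trajectory returned by a solver before evaluating its secrecy rate.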

The Progressive Convex Approximation Method for the Non-Convex Problem
In this section, the design variables are first changed to transform the original problem P1 into a more tractable form. An iterative updating procedure of the trajectory anchor points, based on optimizing the increment of each anchor point in each iteration, is proposed. Lower bounding the rate expressions in each algorithm iteration by Taylor expansion results in convex subproblems, which can be readily solved by standard techniques for convex optimization. This successive convex approximation procedure is shown to approach the optimal trajectory progressively with good convergence properties. The optimality gap of the proposed iterative optimization technique is shown to be very small after only a few algorithm iterations.

Change of Variables and Lower Bounding the Achievable Rates
It can be observed from Problem P1 that optimizing the trajectory anchor points (d x [n], d y [n]) directly is cumbersome due to the analytic forms of the objective function and the constraints. Alternatively, because of the assumption of linear motion between decision (anchor) points, we propose to optimize the trajectory increments for each anchor point, denoted (η[n] ≥ 0, ξ[n] ≥ 0), in an iterative procedure. The results have shown that finding the optimal trajectory through optimizing the increments is effective and fast converging.
Assume the trajectory increment on the $n$th trajectory anchor point obtained in the $l$th algorithm iteration is $\{\eta^{(l)}[n], \xi^{(l)}[n]\}$, $n = 0, 1, \ldots, N$. By setting an initial trajectory, e.g., the straight line segment from the source to the destination, it is straightforward to obtain the corresponding initial values of the anchor points, i.e., {(d

The trajectory anchor points for the $l$th algorithm iteration are updated after the $(l-1)$th algorithm iteration as

The achievable rate of the S-R channel in the $l$th algorithm iteration is calculated as where the channel coefficient h

Similarly, the achievable rates of the R-D and R-E channels for the current iteration, denoted R

The iterative procedure that updates {(d

The subproblem P1$^{(l)}$ of the $l$th iteration obtained after the conversion is still non-convex. In order to deal with the non-convexity in the rate expressions in P1$^{(l)}$, a lower bounding technique based on Taylor expansion is proposed. The idea of the proposed technique is illustrated through three propositions in the following.
where a

Proof. First, we define the function form with constants $\lambda > 0$ and $A$. When

The achievable rate of the S-R channel in the $(l+1)$th algorithm iteration is given as where d . Equation (16) can be fitted into the form of (15) with the coefficients given by

As a result, where a (l)

The lower bound of the R-D channel rate $R_d^{(l+1)}[n]$ in (14) can be obtained in the same way, with a (l)

The coefficients $\lambda_d$ and $A_d$ in (19a) are given by

This completes the proof.
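The lower bounding step can be checked numerically. The sketch below assumes that the function form used in the proof is $f(z) = \log_2\left(1 + \lambda/(A + z)\right)$ with $\lambda > 0$ and $A > 0$; this form is an assumption, since the defining equation is not reproduced here. Such a function is convex for $z > -A$, so its first-order Taylor expansion at $z = 0$ is a global lower bound. The names `f` and `f_taylor_lb` and the numerical values are hypothetical.

```python
import math

def f(z, lam, A):
    # Assumed rate-like function: log2(1 + lam / (A + z)), defined for z > -A.
    return math.log2(1.0 + lam / (A + z))

def f_taylor_lb(z, lam, A):
    # First-order Taylor expansion of f at z = 0.
    # f'(0) = -lam / (A * (A + lam) * ln 2), obtained from
    # d/dz [ln(A + z + lam) - ln(A + z)] evaluated at z = 0.
    slope = -lam / (A * (A + lam) * math.log(2.0))
    return f(0.0, lam, A) + slope * z

# Hypothetical constants; z plays the role of the change in squared distance
# caused by a trajectory increment (an interpretation, not taken from the text).
lam, A = 50.0, 100.0
gaps = [f(z, lam, A) - f_taylor_lb(z, lam, A) for z in range(-50, 1001)]
```

Every entry of `gaps` is nonnegative and the gap vanishes at $z = 0$, which is what makes the bound tight at the current trajectory and usable inside an iterative scheme.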
Unlike the source and the destination, the location of the eavesdropper is assumed to be random and follows a uniform distribution. Lower bounding the eavesdropper rate therefore needs to be done in a slightly different way compared with $R_s$ and $R_d$ in Proposition 1. We give the following Proposition 2 about the ergodic eavesdropper rate, which takes into consideration the distribution of the eavesdropper location.

Proposition 2. The following inequality must hold for any trajectory increment
where E a

Proof. Please see Appendix A.

Convergence of the Iterative Trajectory Optimization Technique
In this subsection, we first show the accuracy of the lower bounds on the rate expressions obtained in Section 3.1, which is important to the validity of the proposed iterative optimization algorithm. To guarantee validity and accuracy of the lower bounding technique proposed in Section 3.1, the following two additional inequality constraints on the lower bounds are introduced to the optimization problem.
Combining the above inequalities with Propositions 1 and 2, we have

By the squeeze principle, equalities must hold for any feasible solution to the optimization problem that adopts the additional constraints. As a result,

In the meantime, adding constraints (22a) and (22b) to the optimization problem with the above lower bounding technique also guarantees convergence of the trajectory iteration. Next, we show convergence of the proposed iterative trajectory optimization technique through the following Proposition 3.

Proposition 3. The sum secrecy rate of the UAV relay system converges if the following inequalities hold.
Proof. For convenience, in the following proof the iteration index l and the time index n are omitted because the general results apply to all the trajectory points.
If inequalities (22a) and (22b) hold, then as in (23) we have $R_d \ge R_d^{\mathrm{lb}} \ge R_d$ and $R_e \ge R_e^{\mathrm{lb}} \ge R_e$. By definition, the secrecy rate to be maximized is

In Section 3.1, it is assumed that the optimization variables are nonnegative, i.e., $\eta \ge 0$, $\xi \ge 0$. A positive pair $(\eta, \xi)$ is found in each algorithm iteration and leads to an improved secrecy rate, until both $\eta$ and $\xi$ are zero. The secrecy rate is thus non-decreasing over the iterations.
Hence, R * is monotonically increasing and bounded with respect to the optimization variables η and ξ. The convergence of the proposed iterative optimization method is thus proved.
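The closing step of the argument can be written out explicitly. The block below is a sketch of the standard monotone-bounded-sequence reasoning, not a reproduction of the authors' derivation; the symbols $R^{(l)}$ (sum secrecy rate after iteration $l$) and $R_{\max}$ (any finite upper bound) are notation assumed here. A finite bound exists because the transmit powers are finite and the UAV stays at altitude $h > 0$, so the R-D rate, and hence the secrecy rate, cannot grow without limit.

```latex
% Sketch: a non-decreasing sequence that is bounded above has a limit.
% R^{(l)} and R_{\max} are assumed notation, not taken from the paper.
\begin{align}
  R^{(l+1)} &\ge R^{(l)}, \qquad l = 0, 1, 2, \ldots \\
  R^{(l)} &\le R_{\max} < \infty, \qquad l = 0, 1, 2, \ldots \\
  \Longrightarrow \quad \lim_{l \to \infty} R^{(l)} &= \sup_{l} R^{(l)} \le R_{\max}.
\end{align}
```

This guarantees convergence of the objective value; it does not by itself state how close the limit is to the global optimum of the original non-convex problem.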
Based on the above discussions, the original Problem P1 can be accurately solved through an iterative procedure as described in Section 3.1 by solving the following constrained Problem P2 (l) in each algorithm iteration until convergence.
[n], n = 1, . . . , N;

By combining Proposition 1 and Proposition 2, the information causality constraint (9a) in Problem P1 becomes (24a). Constraints (24b)-(24c) and the additional variables R are added to the optimization problem to guarantee validity and convergence of the proposed lower bounding solution approach. It can be shown that the feasible set of the variables is convex and that the second-order derivatives (Hessians) of the objective and all constraint functions are positive semidefinite. As a result, Problem P2$^{(l)}$ for the $l$th iteration is a convex problem, which can be readily solved by standard convex optimization solvers such as CVX [30].
The proposed iterative UAV trajectory optimization algorithm for secure UAV mobile relaying is summarized in Algorithm 1.

5: Set l = l + 1.
6: Until convergence or a predefined maximum number of iterations is reached.
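The overall loop structure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm as published: `solve_increments` is a hypothetical stand-in for solving the convex subproblem P2$^{(l)}$ (for example with a convex solver), the anchor points are updated by adding the returned increments, and the stopping rule compares successive objective values. A toy objective and a toy solver are used only so that the example runs on its own.

```python
def iterate_trajectory(anchors, solve_increments, objective, tol=1e-4, max_iter=20):
    """Update anchors by increments until the objective stops improving."""
    history = [objective(anchors)]
    for _ in range(max_iter):
        # Stand-in for the convex subproblem of the current iteration.
        increments = solve_increments(anchors)
        # New anchor point = previous anchor point + optimized increment.
        anchors = [(x + eta, y + xi) for (x, y), (eta, xi) in zip(anchors, increments)]
        history.append(objective(anchors))
        if abs(history[-1] - history[-2]) <= tol:
            break
    return anchors, history

# Toy problem (hypothetical): pull every anchor toward a fixed target point.
TARGET = (10.0, 5.0)

def toy_objective(anchors):
    return -sum((x - TARGET[0]) ** 2 + (y - TARGET[1]) ** 2 for x, y in anchors)

def toy_solver(anchors):
    # Nonnegative increments that move each anchor halfway to the target.
    return [(0.5 * (TARGET[0] - x), 0.5 * (TARGET[1] - y)) for x, y in anchors]

final_anchors, history = iterate_trajectory(
    [(0.0, 0.0), (2.0, 1.0)], toy_solver, toy_objective
)
```

In the toy run the recorded objective values are non-decreasing, which mirrors the monotonicity used in the convergence argument; in a real implementation the toy solver would be replaced by the convex program for the increments.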

Numerical Results
In this section, simulation results are presented to verify the proposed iterative trajectory optimization technique for secure buffer-aided UAV mobile relaying. The UAV-assisted mobile relaying system model as shown in Figure 1 is adopted. The starting point SP and the end point EP of the trajectory are designated as the origin and $(L, 0)$, respectively. The source and the destination are located at fixed points $(L_s, 0)$ and $(L_d, 0)$ on the horizontal axis. The eavesdropper location follows a uniform distribution over $[a, b]$ on the horizontal axis. The UAV relay moves in the upper half-plane of the 2D space, i.e., $d_y > 0$, at height $h$ above the terrestrial communication system. Simulation parameters are summarized in Table 1. Among them, $v$, $a$, and $b$ are adjusted in the simulations to observe their impacts on the system performance.

Convergence of the Secrecy Rate Performance
We first investigated how the average ergodic secrecy rate of the buffer-aided UAV mobile relaying system achieved by the proposed iterative optimization scheme changes with the number of trajectory iterations and the UAV relay's maximum speed. The boundaries of the uniformly distributed eavesdropper location were a = 300 m and b = 500 m. The total flight time was set to 80 s. Maximum UAV speed values of v = 16 m/s, v = 18 m/s, and v = 20 m/s were examined. The simulation results (average secrecy rate versus iteration number) are shown in Figure 2. The average secrecy rate of a system without trajectory optimization is shown as the solid line without marks in the figure to provide a performance benchmark. It can be observed from Figure 2 that the overall average secrecy rate increases as the proposed trajectory optimization algorithm iterates. The performance achieved by the proposed algorithm improved very quickly in the first two to three iterations and levelled off in fewer than 10 iterations for all the scenarios examined. The performance achieved by three algorithm iterations was over 99% of that at convergence (10 iterations). Because the subproblem in each iteration is convex and can be readily solved by a classic convex optimization algorithm, the complexity of each algorithm iteration is almost fixed. The overall complexity of the proposed iterative optimization algorithm is therefore mainly determined by how fast the iterative procedure converges. The numerical study showed that near-optimal solutions were obtained in a small number of (around 3) iterations in all the scenarios examined, which indicates relatively low complexity of the proposed algorithm in practical implementations. This is desirable from both theoretical study and practical system design perspectives.
It is also observed that higher maximum UAV speed is beneficial to the system's overall secrecy rate. Increasing v from 16 m/s to 20 m/s resulted in over 9% improvement to the average secrecy rate. This is because the greater the maximum UAV speed, the less constrained the trajectory. A more favorable trajectory that achieves greater secrecy rate can be obtained accordingly. Obviously, the greater the number of iterations, the closer the trajectory to the optimal. That means as the trajectory is updated in the proposed iterative procedure, it is gradually optimized and eventually converges to the optimal trajectory.
How the location of the eavesdropper impacts the overall average secrecy rate performance was studied by examining different boundary values a and b for the uniform distribution. Maximum UAV speed v = 20 m/s and total flight time T = 80 s were used in this part of simulation. Simulation results for 5 different [a, b] combinations are presented in Figure 3.

[Figure 3: average secrecy rate (bits/s/Hz) versus number of iterations for (a, b) = (100, 300), (300, 500), (500, 700), (600, 800), and (700, 900).]

For all the scenarios examined, the overall average secrecy rate increases as the trajectory optimization algorithm iterates, and fast convergence as in Figure 2 can also be observed. The eavesdropper location further away from the destination (closer to the source) is shown to be beneficial to the overall average secrecy rate performance. This is mainly because when the first hop communication is completely obstructed on the ground, the forwarded signal from the UAV relay is the only source of information leakage to the eavesdropper. It is, therefore, not desirable to have an eavesdropper closer to the destination such that the R-D and R-E channels are more correlated, which violates the basic principle for PHY security design.

Trajectory Regarding Iteration Number and Eavesdropper Location Distribution
We next present the obtained UAV trajectory in the 2D space to show how the optimized trajectory is approached as the number of algorithm iterations increases. The impact of eavesdropper location on the optimized UAV trajectory was also investigated. In this part, the total flight time was set to T = 80 s, and the maximum UAV speed was v = 16 m/s. An eavesdropper uniformly distributed over [300, 500] on the $d_x$ axis was considered to demonstrate the iterative update process of the UAV trajectory. It is observed in Figure 4 that as the proposed algorithm iterates, the UAV's trajectory gradually converges. Convergence of the trajectory was achieved at about 10 iterations, which validates the effectiveness of the proposed algorithm in finding an optimized trajectory.
In Figure 5, the optimized UAV trajectories obtained by 12 algorithm iterations are presented for three groups of eavesdropper locations. The selection of 12 iterations was based on the authors' observations from the numerical studies (as shown in Figures 2 and 3), in which this number was sufficient to give the converged trajectory. It can be observed that when the eavesdropper location is further away from the destination (closer to the source), the UAV's optimized trajectory has a shorter total flight distance. That means with fixed flight time, if the eavesdropper is expected to be closer to the destination, the UAV needs to fly faster to create a trajectory that can avoid potential eavesdropping as much as possible. As a result, having an eavesdropper far away from the destination is beneficial to both the sum secrecy rate performance and the energy efficiency.

[Figure 5: optimized UAV trajectories for three groups of eavesdropper locations. When the eavesdropper location is further away from the destination, the UAV's optimized trajectory has a shorter total flight distance, which is both spectrum-efficient and energy-efficient.]

Conclusions
The trajectory optimization problem for PHY security of a buffer-aided UAV mobile relaying system with a randomly located eavesdropper has been studied. The problem of optimizing the anchor points of the discretized piecewise linear trajectory for maximized sum secrecy rate under information causality and maximum UAV speed constraints has been formulated and shown to be non-convex. By changing the optimization variables to the iterative trajectory increments on each anchor point and invoking a lower bounding technique for the achievable rates, the problem has been reformulated and decomposed into a series of convex optimization subproblems through an iterative procedure. Based on the squeeze principle, convergence of the iterative optimization approach has been achieved by adding extra upper bound constraints to the achievable rates. This successive convex approximation procedure is shown to approach the optimal trajectory progressively with good convergence property. The optimality gap between the approximate convex problem and the original non-convex problem has been shown to be very small with only a few (about 3) iterations. The complexity of the proposed iterative optimization algorithm is thus practically low. The optimal PHY secure UAV relay trajectory has been obtained through the iterative procedure after a few iterations. It has been observed from the simulation results that higher maximum UAV speed would improve the sum secrecy rate performance because it gives higher flexibility to the trajectory. The simulation results have also revealed that an eavesdropper further away from the destination is beneficial to both the sum secrecy rate performance and the UAV relay's energy efficiency.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
In this Appendix, we prove that the ergodic eavesdropper rate is lower bounded as shown in Proposition 2.
Proof. By definition, the ergodic achievable rate of the R-E channel is given by where d