Joint Beamforming and Trajectory Design for Aerial Intelligent Reflecting Surface-Aided Secure Transmission

Wang, Yanping; Qiao, Jingping; Zhang, Chuanting

doi:10.3390/electronics11182802


by Yanping Wang ¹, Jingping Qiao ¹,* and Chuanting Zhang ²

¹ School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China

² Department of Electrical and Electronic Engineering, University of Bristol, Bristol BS8 1QU, UK

* Author to whom correspondence should be addressed.

Electronics 2022, 11(18), 2802; https://doi.org/10.3390/electronics11182802

Submission received: 10 August 2022 / Revised: 2 September 2022 / Accepted: 2 September 2022 / Published: 6 September 2022

(This article belongs to the Special Issue MIMO: Multiple Input Multiple Output Technology for Physical-Layer Security)


Abstract:

This paper studies the secure transmission challenge facing future communication systems. In the considered model, confidential communication between legitimate users is strengthened by an aerial intelligent reflecting surface (AIRS) deployed on an aerial platform such as an unmanned aerial vehicle (UAV). The average secrecy rate over all time slots is adopted as the metric for information security during the AIRS flight. The transmit beamforming, phase-shift matrix, and trajectory of the AIRS are then jointly designed to maximize the average secrecy rate between legitimate users. Because the formulated objective function is non-convex and the three key variables are coupled, we resort to an alternating strategy that decomposes the original problem into three sub-problems and solves them iteratively. In particular, the transmit beamforming is designed using the generalized eigenvalue optimization method, and a closed-form solution is derived. For the AIRS-related optimization, a Minorization-Maximization (MM)-based algorithm and a deep deterministic policy gradient (DDPG)-based method are proposed to obtain the phase-shift matrix and the trajectory, respectively. Simulation results show that AIRS assistance can obtain nearly twice the secrecy performance of terrestrial intelligent reflecting surface (TIRS) systems.

Keywords:

physical layer security; aerial intelligent reflecting surface; passive beamforming; trajectory optimization

1. Introduction

Wireless communication revolutionizes human life in various aspects, gradually leading us to a more convenient and intelligent society. Despite the advantages of wireless communication, the broadcast nature of the wireless medium makes it vulnerable to interception attacks by malicious users. This poses a challenge to secure wireless communication. To conquer this challenge, physical layer security (PLS) emerged and attracted tremendous attention from both academia and industry [1,2,3]. As a valuable method to ensure the security of wireless communication, PLS utilizes the randomness and uniqueness of wireless channels to protect the confidential signals from being intercepted by adversaries and achieve secure and reliable communication [4].

As a promising PLS technology, cooperative relaying can help expand the communication range and maintain the wireless network’s security [5,6]. However, the performance gain of traditional active wireless relays requires additional power consumption, and relays usually work in half-duplex mode with lower spectral efficiency. Even though the full-duplex mode can enhance the spectral efficiency, additional interference is introduced in confidential signal transmission [7]. Recently, the emerging intelligent reflecting surface (IRS) technology has provided a novel solution to the above challenges in PLS systems [8,9,10,11]. The IRS is a uniform planar array composed of a large number of composite material elements, and the phase shift of each element can be independently regulated to reflect incident electromagnetic waves [12,13]. Thereby, the propagation environment and performance can be improved with the assistance of the IRS. Moreover, compared with cooperative relays, the IRS also has the advantage of realizing full-duplex communication with lower energy consumption because the passive reflecting elements do not decode or amplify and forward information signals [14]. In [15,16], the programmable IRS was introduced to assist confidential information transmission, and the active beamformer at the transmitter and the passive beamformer at the IRS were jointly optimized for secrecy performance maximization. To further enhance information security, IRS-empowered secure transmission has been introduced into multiple-input multiple-output (MIMO) systems [17,18] as well as millimeter-wave (mmWave) and terahertz (THz) systems [19,20]. These works show that, in comparison to baseline schemes without an IRS, the IRS can significantly suppress information leakage to malicious users, hence boosting the secrecy performance.

The IRS deployments involved in the above work mainly focus on the terrestrial IRS (TIRS). That is, the IRS is installed on walls or building surfaces in terrestrial networks. Such a deployment restricts the TIRS to half-angle reflection (i.e., 180-degree reflection), so it can serve users in only half of the space [21]. In complicated environments such as urban areas, the half-angle reflection of the TIRS makes it necessary to deploy multiple IRSs to cover all users in the wireless network, which results in both additional communication overhead and signal attenuation. Recently, the aerial IRS (AIRS), which can establish full-angle reflection, has attracted much attention [22]. In AIRS-aided communications, the AIRS is mounted on aerial platforms such as UAVs and satellites and thus enjoys more flexible deployment. The AIRS can therefore adapt to the real-time communication environment and easily construct strong line-of-sight (LoS) links for both ground and aerial users. It is shown in [21] that the AIRS provides a higher probability of LoS links between communication users, and its achievable data rate is more than twice that of the TIRS. However, the benefits of the AIRS in secure communications are rarely explored. Recent work in [23,24,25,26,27] mainly focused either on AIRS-assisted communications without regard to information security or on TIRS-aided secure air-ground communications. Specifically, Refs. [23,24] jointly designed the AIRS’s placement and three-dimensional (3D) passive beamforming in one time slot to maximize the worst-case signal-to-noise ratio (SNR) in the target communication area. Refs. [25,26,27] proposed joint optimization schemes of the TIRS phase shifts and UAV trajectory based on successive convex approximation (SCA) and a deep deterministic policy gradient (DDPG) framework to improve information security. The schemes in [23,24,25,26,27] cannot be directly applied in AIRS-empowered secure communication systems because of the different system models. In summary, all the above work is listed in Table 1. It can be seen that the listed IRS-aided secure UAV communication schemes focus on terrestrial IRS-aided propagation between the UAV and ground users rather than on the secure transmission problem between ground users. Moreover, the AIRS mounted on an aerial platform combines the flexibility and mobility of the platform with the passive reflection characteristics of the IRS, but this couples the phase-shift and trajectory designs, leading to an intractable secure transmission problem.

Inspired by the aforementioned challenges, in this paper, we investigate an AIRS-assisted secure communication system, where a multi-antenna transmitter sends confidential signals to its desired destination, while one passive eavesdropper remains silent and tries to decode the confidential signals. Since the direct link is blocked by infrastructure, an AIRS mounted on an aerial platform is employed to facilitate LoS links and help secure transmission. The main objective is to maximize the average secrecy rate during the flight period of the AIRS and to investigate the effects of the phase shifts and trajectory of the AIRS on secure transmission. We summarize our contributions as follows.

A novel AIRS-assisted secure transmission scenario is investigated. With massive reflecting elements mounted on an aerial platform, AIRS facilitates the free-space LoS link transmission of confidential signals.
Aiming to guarantee the information security, the average secrecy rate during the flight time is introduced as the secrecy performance metric. Then, we formulate the average secrecy rate maximization problem by jointly designing transmit beamforming, phase-shift matrix, and trajectory of AIRS.
To cope with the non-convex problem that is formulated, we propose an iterative algorithm. The solutions are derived alternately by dividing the original problem into three sub-problems.
For the AIRS, the Minorization-Maximization (MM) algorithm-based method is proposed to derive the closed-form solution of phase shifts. Additionally, the DDPG-based algorithm is proposed to design AIRS’s trajectory.

2. System Model

In this paper, an AIRS-empowered secure communication system is considered, in which a source node Alice with $N_{t}$ antennas sends confidential signals to its desired user Bob. At the same time, Eve attempts to intercept the transmitted information as a passive eavesdropper. As shown in Figure 1, the direct connection between Alice and Bob is blocked by infrastructure in a complex urban environment, and an AIRS is exploited to help the confidential signal transmission from Alice to Bob. We assume that the AIRS consists of M reflecting elements and an aerial platform (e.g., a UAV [28]). Owing to the massive reflecting elements installed on the aerial platform, the AIRS can steer the reflected beam around obstacles and establish an LoS link between ground users.

In the AIRS-assisted secure communication system, the height of Alice is denoted by $h_{A}$, and its 3D Cartesian coordinate is $[l_{A}, h_{A}] = [x_{A}, y_{A}, h_{A}]$. The locations of the legitimate user Bob and the eavesdropper Eve are assumed to be $[l_{D}, 0] = [x_{D}, y_{D}, 0]$ and $[l_{E}, 0] = [x_{E}, y_{E}, 0]$, respectively. For the AIRS, we assume the flight period T of the aerial platform is divided into N equally spaced time slots of duration $δ_{t}$, i.e., $T = N δ_{t}$. In the n-th time slot, the location of the AIRS is $[q [n], h]$, where h is the constant flight altitude chosen to avoid collision with buildings, and $q [n] = {[x [n], y [n]]}^{T}$ is the coordinate in the $xy$-plane. In particular, due to mission and network deployment requirements, the starting and target positions of the AIRS are defined as $[q_{0}, h] = [x_{0}, y_{0}, h]$ and $[q_{T}, h] = [x_{T}, y_{T}, h]$, respectively. The maximum speed of the AIRS during flight is $V_{max}$, so the maximum distance that the AIRS can fly between two adjacent slots is $V_{max} δ_{t}$.

Note that, as mentioned in [29,30], the mobility of aerial platforms may make it challenging to obtain instantaneous channel state information (CSI) of the aerial platform-ground user links. Here, we assume that the ground control station connected with the AIRS can make full use of all available resources to estimate the CSI and angle of arrival between the users and the aerial platform carrying the AIRS [31]. Moreover, we suppose that the eavesdropper is an idle legal user in the communication network that is not authorized to receive the private signals. Thus, both the legitimate and wiretap CSI are assumed to be perfectly known to the system.

2.1. Channel Model

Since the AIRS is deployed on an aerial platform and flies in the air, the Alice-AIRS link is dominated by the LoS path. Thus, the free-space path loss model is employed, and the Alice-AIRS channel in the n-th time slot is modeled as

H_{SI}^{H} [n] = \sqrt{ρ_{0} d_{AI}^{- 2} [n]} a_{r} (ϑ_{a}^{I}, ϑ_{e}^{I}) a_{t}^{H} (ϑ_{a}^{A}),

(1)

where $d_{AI} [n] = \sqrt{∥ q [n] - l_{A} ∥^{2} + {(h - h_{A})}^{2}}$ is the distance from Alice to the AIRS and $ρ_{0}$ is the channel gain at the reference distance $d_{0} = 1$ m.

$a_{r} (ϑ_{a}^{I}, ϑ_{e}^{I})$ is the receive array response vector at the AIRS, and $ϑ_{a}^{I}$ and $ϑ_{e}^{I}$ are the azimuth and elevation angles of arrival (AoA). In addition, $a_{t} (ϑ_{a}^{A})$ represents the transmit array response vector at Alice and $ϑ_{a}^{A}$ is the azimuth angle of departure (AoD). Under the uniform square planar array (USPA) assumption at the AIRS (a USPA with $\sqrt{M} \times \sqrt{M}$ elements on the $xz$-plane is adopted), the receive array response vector $a_{r} (ϑ_{a}^{I}, ϑ_{e}^{I})$ is written as

a_{r} (ϑ_{a}^{I}, ϑ_{e}^{I}) = \frac{1}{\sqrt{M}} {[1, \dots, e^{j \frac{2 π}{λ} d (p \cos (ϑ_{a}^{I}) \sin (ϑ_{e}^{I}) + q \cos (ϑ_{e}^{I}))}, \dots, e^{j \frac{2 π}{λ} d ((\sqrt{M} - 1) \cos (ϑ_{a}^{I}) \sin (ϑ_{e}^{I}) + (\sqrt{M} - 1) \cos (ϑ_{e}^{I}))}]}^{T},

(2)

where $d = λ / 2$ is the spacing between two adjacent elements at the AIRS and $λ = c / f_{c}$ is the wavelength. The parameters $0 \leq p \leq \sqrt{M} - 1$ and $0 \leq q \leq \sqrt{M} - 1$ are the element indices in the 2D plane (the index q here is unrelated to the AIRS position $q [n]$).

A uniform linear array (ULA) is utilized at the transmitter Alice, so the transmit array response vector $a_{t} (ϑ_{a}^{A})$ can be expressed as

a_{t} (ϑ_{a}^{A}) = [1, e^{j \frac{2 π}{λ} d sin (ϑ_{a}^{A})}, \dots, e^{j (N_{t} - 1) \frac{2 π}{λ} d sin (ϑ_{a}^{A})}] .

(3)

Then, the channel between the AIRS and Bob or Eve is expressed as

g_{k} [n] = \sqrt{ρ_{0} d_{k}^{- α} [n]} a_{t} (ϑ_{a, k}^{I}, ϑ_{e, k}^{I}),

(4)

in which $k \in \{ID, IE\}$ indexes the link between the AIRS and Bob or Eve. $d_{ID} [n] = \sqrt{∥ q [n] - l_{D} ∥^{2} + h^{2}}$ is the distance from the AIRS to Bob, and $d_{IE} [n] = \sqrt{∥ q [n] - l_{E} ∥^{2} + h^{2}}$ is the distance from the AIRS to Eve. $α$ is the path loss exponent, and $a_{t} (ϑ_{a, k}^{I}, ϑ_{e, k}^{I})$ is the transmit array response vector at the AIRS, which is obtained according to (2).
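The channel definitions in (1)-(4) can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the function names, the default value of `rho0`, and the choice to pass the angles in directly (instead of deriving them from the geometry) are all assumptions made for the example.

```python
import numpy as np

def uspa_response(M, azimuth, elevation, spacing_over_lambda=0.5):
    """Array response of a sqrt(M) x sqrt(M) USPA, following (2)."""
    side = int(round(np.sqrt(M)))
    assert side * side == M, "M must be a perfect square"
    # p and q are the element indices 0 .. sqrt(M) - 1 along the two array axes.
    p, q = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
    phase = 2 * np.pi * spacing_over_lambda * (
        p * np.cos(azimuth) * np.sin(elevation) + q * np.cos(elevation)
    )
    return np.exp(1j * phase).reshape(-1) / np.sqrt(M)

def ula_response(Nt, azimuth, spacing_over_lambda=0.5):
    """Array response of an Nt-element ULA, following (3)."""
    n = np.arange(Nt)
    return np.exp(1j * 2 * np.pi * spacing_over_lambda * n * np.sin(azimuth))

def alice_airs_channel(q, l_A, h, h_A, a_r, a_t, rho0=1e-3):
    """H_SI^H[n] in (1): an M x Nt LoS matrix (rho0 is an assumed value)."""
    d_AI = np.sqrt(np.sum((q - l_A) ** 2) + (h - h_A) ** 2)
    return np.sqrt(rho0 / d_AI**2) * np.outer(a_r, a_t.conj())

def airs_user_channel(q, l_k, h, a, alpha=2.0, rho0=1e-3):
    """g_k[n] in (4) for a ground user located at l_k."""
    d_k = np.sqrt(np.sum((q - l_k) ** 2) + h**2)
    return np.sqrt(rho0 * d_k ** (-alpha)) * a
```

Passing the array responses in as arguments keeps the geometry-to-angle mapping, which the text does not spell out, outside the sketch.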

2.2. Signal Model

In the considered AIRS-assisted secure communication system, the source node Alice sends the confidential signal $x_{s}$ to the AIRS using a transmit beamformer $w [n]$; the AIRS then adjusts the phase shifts of its elements to reflect the signal to Bob, and Eve can also intercept the reflected signal. In the n-th time slot, the signal received by Bob is

y_{D} = g_{ID}^{H} [n] Θ [n] H_{SI}^{H} [n] w [n] x_{s} + n_{D},

(5)

where $E \{| x_{s} |^{2}\} = P_{s}$, and the transmit beamformer at Alice is $w [n] \in C^{N_{t} \times 1}$ with ${∥ w [n] ∥}^{2} \leq 1$. $g_{ID} [n] \in C^{M \times 1}$ is the AIRS-Bob channel and $n_{D} \sim CN (0, N_{0})$ is the additive white Gaussian noise (AWGN) received at Bob. Let $Θ [n] = diag \{e^{j θ_{1} [n]}, \dots, e^{j θ_{M} [n]}\}$ denote the reflecting matrix at the AIRS, where $θ_{m} [n] \in [0, 2 π), m = 1, \dots, M$, is the phase shift of the m-th reflecting element in the n-th time slot.

Similarly, the signal intercepted by Eve is

y_{E} = g_{IE}^{H} [n] Θ [n] H_{SI}^{H} [n] w [n] x_{s} + n_{E},

(6)

where $g_{IE} [n] \in C^{M \times 1}$ is the AIRS-Eve channel, which can be obtained from (4), and $n_{E} \sim CN (0, N_{0})$ is the noise received at Eve.

Based on the received signals in (5) and (6), the signal-to-noise ratios (SNRs) at Bob and Eve are

γ_{D} = \frac{P_{s}}{N_{0}} {| g_{ID}^{H} [n] Θ [n] H_{SI}^{H} [n] w [n] |}^{2},

(7)

γ_{E} = \frac{P_{s}}{N_{0}} {| g_{IE}^{H} [n] Θ [n] H_{SI}^{H} [n] w [n] |}^{2} .

(8)

Then, the achievable information rates at Bob and Eve are obtained as

R_{D} [n] = {\log}_{2} (1 + \frac{P_{s}}{N_{0}} {| g_{ID}^{H} [n] Θ [n] H_{SI}^{H} [n] w [n] |}^{2}),

(9)

R_{E} [n] = {\log}_{2} (1 + \frac{P_{s}}{N_{0}} {| g_{IE}^{H} [n] Θ [n] H_{SI}^{H} [n] w [n] |}^{2}) .

(10)

According to the definition of secrecy rate in [32], the secrecy rate averaged over the N time slots, i.e., the average secrecy rate, is written as

\begin{aligned} R_{s}^{aver} &= \frac{1}{N} \sum_{n = 1}^{N} R_{s} [n] \\ &= \frac{1}{N} \sum_{n = 1}^{N} {[R_{D} [n] - R_{E} [n]]}^{+} \\ &= \frac{1}{N} \sum_{n = 1}^{N} {[{\log}_{2} (\frac{1 + \frac{P_{s}}{N_{0}} {| g_{ID}^{H} [n] Θ [n] H_{SI}^{H} [n] w [n] |}^{2}}{1 + \frac{P_{s}}{N_{0}} {| g_{IE}^{H} [n] Θ [n] H_{SI}^{H} [n] w [n] |}^{2}})]}^{+}, \end{aligned}

(11)

where ${[a]}^{+} = \max (a, 0)$ ensures that the secrecy rate is non-negative.
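As a hedged illustration of (7)-(11), the per-slot secrecy rate can be evaluated directly from the channel quantities. The function and argument names below are chosen for this sketch only; `H_SI_H` stands for the M x N_t matrix $H_{SI}^{H} [n]$ and `theta` for the vector of phase shifts $θ_{m} [n]$.

```python
import numpy as np

def slot_secrecy_rate(g_D, g_E, theta, H_SI_H, w, Ps, N0):
    """Secrecy rate [R_D - R_E]^+ of one time slot, using (7)-(10)."""
    # Theta[n] H_SI^H[n] w[n]: multiplying by the diagonal Theta is elementwise.
    reflected = np.exp(1j * theta) * (H_SI_H @ w)
    # np.vdot conjugates its first argument, so vdot(g, v) equals g^H v.
    snr_D = Ps / N0 * abs(np.vdot(g_D, reflected)) ** 2
    snr_E = Ps / N0 * abs(np.vdot(g_E, reflected)) ** 2
    R_D = np.log2(1.0 + snr_D)
    R_E = np.log2(1.0 + snr_E)
    return max(R_D - R_E, 0.0)

def average_secrecy_rate(per_slot_rates):
    """Average of the per-slot secrecy rates over the N slots, as in (11)."""
    return float(np.mean(per_slot_rates))
```

When Eve's effective channel is at least as strong as Bob's, the per-slot value is clipped to zero, which is exactly the role of the $[\cdot]^{+}$ operator.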

2.3. Problem Formulation

With the objective of maximizing the achievable average secrecy rate over N time slots, a joint optimization problem of the transmit beamforming, phase shifts, and trajectory of the AIRS is formulated, i.e.,

\begin{aligned} \max_{w [n], Θ [n], Q} \quad & \frac{1}{N} \sum_{n = 1}^{N} R_{s} [n] \\ \text{s.t.} \quad & C1 : {∥ w [n] ∥}^{2} \leq 1, \\ & C2 : θ_{m} [n] \in [0, 2 π), m = 1, \dots, M, \\ & C3 : q [0] = q_{0}, q [N] = q_{T}, \\ & C4 : ∥ q [n + 1] - q [n] ∥ \leq V_{max} δ_{t}, n = 0, \dots, N - 1, \end{aligned}

(12)

where $Q = [q [0], \dots, q [n], \dots, q [N]]$ denotes the AIRS trajectory. Constraint $C1$ is the transmit power constraint on the beamformer $w [n]$, and $C2$ is the adjustable range of the AIRS’s phase shifts. Constraints $C3$ and $C4$ are related to the AIRS’s position and trajectory. In particular, $C3$ fixes the starting position before the first slot and the target position at the end of the N slots, while $C4$ bounds the distance travelled between two adjacent slots according to the maximum flight speed.
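To make the trajectory constraints $C3$ and $C4$ concrete, the following sketch checks whether a candidate trajectory satisfies them. The helper names and the straight-line baseline are illustrative assumptions and are not part of the proposed method.

```python
import numpy as np

def trajectory_is_feasible(Q, q0, qT, v_max, delta_t, tol=1e-9):
    """Check C3 (end points) and C4 (per-slot displacement) for rows q[0..N]."""
    Q = np.asarray(Q, dtype=float)
    starts_ok = np.allclose(Q[0], q0)
    ends_ok = np.allclose(Q[-1], qT)
    # Distance flown between every pair of adjacent slots.
    steps = np.linalg.norm(np.diff(Q, axis=0), axis=1)
    speed_ok = np.all(steps <= v_max * delta_t + tol)
    return bool(starts_ok and ends_ok and speed_ok)

def straight_line_trajectory(q0, qT, N):
    """Hypothetical baseline: N equal steps from q0 to qT (N + 1 waypoints)."""
    return np.linspace(q0, qT, N + 1)
```

A straight line from $q_{0}$ to $q_{T}$ is feasible only if $∥ q_{T} - q_{0} ∥ \leq N V_{max} δ_{t}$, which is also a necessary condition for any trajectory to satisfy $C3$ and $C4$ simultaneously.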

Constraint $C2$ is non-convex, since it imposes unit-modulus constraints on the diagonal elements of $Θ [n]$. Moreover, the three variables $(w [n], Θ [n], Q)$ are coupled with each other in the objective function for each time slot. As a result, the problem in (12) is non-convex and difficult to solve directly. Instead, the following iterative algorithm is proposed, in which the transmit beamforming, phase-shift matrix, and trajectory of the AIRS are derived alternately.

3. Average Secrecy Rate Maximization

Instead of solving the average secrecy rate maximization problem in (12) directly, we divide the original problem into three sub-problems with respect to $w [n]$, $Θ [n]$, and $Q$, respectively, and then solve them alternately. The idea is based on the fact that, in the objective function, the transmit beamforming $w [n]$ depends only on the phase shift $Θ [n]$ and the position $q [n]$ of the same time slot, and is independent of the variables of other time slots. Thus, we propose an iterative algorithm to derive $w [n]$ and $Θ [n]$ alternately. With the obtained $w [n]$ and $Θ [n]$, the AIRS position $q [n]$ in the n-th time slot is designed using the DDPG algorithm, a deep reinforcement learning algorithm for continuous action spaces.

3.1. Transmit Beamforming Design

Given the phase-shift matrix $Θ [n]$ and the trajectory of the AIRS, the sub-problem of the transmit beamforming vector $w [n]$ at Alice can be written as

\begin{aligned} \max_{w [n]} \quad & \frac{1}{N} \sum_{n = 1}^{N} {\log}_{2} (\frac{w^{H} [n] [\frac{1}{ξ} I_{N_{t}} + \frac{P_{s}}{N_{0}} H_{SI} [n] Θ^{H} [n] g_{ID} [n] g_{ID}^{H} [n] Θ [n] H_{SI}^{H} [n]] w [n]}{w^{H} [n] [\frac{1}{ξ} I_{N_{t}} + \frac{P_{s}}{N_{0}} H_{SI} [n] Θ^{H} [n] g_{IE} [n] g_{IE}^{H} [n] Θ [n] H_{SI}^{H} [n]] w [n]}) \\ \text{s.t.} \quad & {∥ w [n] ∥}^{2} = ξ, 0 \leq ξ \leq 1 . \end{aligned}

(13)

In the above problem, we introduce a parameter $ξ \in [0, 1]$ to convert the inequality power constraint into an equality. It can be observed that the sub-problem with respect to the transmit beamformer $w [n]$ is a generalized eigenvector problem. Hence, we propose the following Proposition 1 to obtain the solution of (13).

Proposition 1.

The closed-form solution of $w [n]$ that maximizes the average secrecy rate is

w^{opt} [n] = e^{unit},

(14)

where $e^{unit}$ is the unit-norm eigenvector of the matrix $H_{E}^{- 1} H_{D}$ corresponding to its maximum eigenvalue. The matrices are defined as $H_{E} = I_{N_{t}} + \frac{P_{s}}{N_{0}} H_{SI} [n] Θ^{H} [n] g_{IE} [n] g_{IE}^{H} [n] Θ [n] H_{SI}^{H} [n]$ and $H_{D} = I_{N_{t}} + \frac{P_{s}}{N_{0}} H_{SI} [n] Θ^{H} [n] g_{ID} [n] g_{ID}^{H} [n] Θ [n] H_{SI}^{H} [n]$.

Proof.

Please see Appendix A. □
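The closed form in Proposition 1 can be checked numerically with a few lines of NumPy. The sketch below is only an illustration under stated assumptions: `effective_matrix` builds $I + \frac{P_{s}}{N_{0}} h h^{H}$ from a hypothetical effective channel vector $h = H_{SI} [n] Θ^{H} [n] g_{k} [n]$, and all function names are chosen for the example.

```python
import numpy as np

def effective_matrix(h, Ps_over_N0):
    """I + (Ps/N0) h h^H, the form shared by H_D and H_E in Proposition 1."""
    return np.eye(h.size) + Ps_over_N0 * np.outer(h, h.conj())

def optimal_beamformer(H_D, H_E):
    """Unit-norm eigenvector of H_E^{-1} H_D for its largest eigenvalue."""
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(H_E, H_D))
    # The eigenvalues are real up to round-off, so compare their real parts.
    w = eigvecs[:, np.argmax(eigvals.real)]
    return w / np.linalg.norm(w)

def rayleigh_ratio(w, H_D, H_E):
    """Generalized Rayleigh quotient w^H H_D w / w^H H_E w."""
    return (np.vdot(w, H_D @ w) / np.vdot(w, H_E @ w)).real
```

For any other beamformer the quotient returned by `rayleigh_ratio` cannot exceed the value attained by `optimal_beamformer`, which is the property the proposition relies on.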

3.2. Phase-Shift Design

With the given AIRS trajectory and the obtained transmit beamformer $w [n]$, the average secrecy rate maximization problem with respect to the phase shifts $Θ [n]$ can be written as

\begin{aligned} \max_{\hat{θ} [n]} \quad & \frac{1}{N} \sum_{n = 1}^{N} {\log}_{2} (\frac{{\hat{θ}}^{T} [n] [\frac{1}{M} I_{M} + {\bar{R}}_{D}] {\hat{θ}}^{*} [n]}{{\hat{θ}}^{T} [n] [\frac{1}{M} I_{M} + {\bar{R}}_{E}] {\hat{θ}}^{*} [n]}) \\ \text{s.t.} \quad & \hat{θ} [n] = {[e^{j θ_{1} [n]}, \dots, e^{j θ_{m} [n]}, \dots, e^{j θ_{M} [n]}]}^{T}, \\ & θ_{m} [n] \in [0, 2 π), m = 1, \dots, M . \end{aligned}

(15)

In the above problem, we define the vector $\hat{θ} [n]$ such that the phase-shift matrix in (12) is $Θ [n] = diag \{\hat{θ} [n]\}$. The matrices ${\bar{R}}_{D}$ and ${\bar{R}}_{E}$ are expressed as ${\bar{R}}_{D} = \frac{P_{s}}{N_{0}} diag \{g_{ID}^{H} [n]\} H_{SI}^{H} [n] w [n] w^{H} [n] H_{SI} [n] diag \{g_{ID} [n]\}$ and ${\bar{R}}_{E} = \frac{P_{s}}{N_{0}} diag \{g_{IE}^{H} [n]\} H_{SI}^{H} [n] w [n] w^{H} [n] H_{SI} [n] diag \{g_{IE} [n]\}$, respectively.

The problem in (15) also has the form of a generalized eigenvalue problem, but the unit-modulus constraints implied by $θ_{m} [n] \in [0, 2 π)$ make it difficult to solve with a traditional generalized eigenvector solver. To cope with this difficulty, we propose to use the MM method [33,34] to construct a surrogate objective function and derive a closed-form solution. The design of the MM algorithm is more challenging when malicious eavesdroppers are introduced into the communication system. We first find a surrogate function that approximates the objective, and then solve for the optimal phase shifts based on the reformulated problem. Here, we define the vector $φ = {\hat{θ}}^{*} [n]$ and the matrices ${\hat{R}}_{D} = \frac{1}{M} I_{M} + {\bar{R}}_{D}$ and ${\hat{R}}_{E} = \frac{1}{M} I_{M} + {\bar{R}}_{E}$. Then, for each time slot, the ratio inside the logarithm in (15) can be rewritten as $f (φ) = φ^{H} {\hat{R}}_{D} φ / φ^{H} {\hat{R}}_{E} φ$. Let $φ_{iter}^{i}$ denote the value of $φ$ in the i-th iteration. Then, a lower bound $g (φ | φ_{iter}^{i})$ that touches $f (φ)$ at the point $φ_{iter}^{i}$ can be constructed.

Proposition 2.

The objective function $f (φ)$ is lower bounded as

f (φ) = \frac{φ^{H} {\hat{R}}_{D} φ}{φ^{H} {\hat{R}}_{E} φ} \geq g (φ | φ_{iter}) + [f (φ_{iter}) - g (φ_{iter} | φ_{iter})],

(16)

where $f (φ_{iter}) - g (φ_{iter} | φ_{iter})$ is a constant term.

Proof.

Please see Appendix B. □

According to Proposition 2, in the $(i + 1)$-th iteration of the MM method, both the second and third terms on the right-hand side of (16) are constants. Hence, the optimal reflecting matrix is determined by the first term $g (φ | φ_{iter})$. In other words, the optimization problem of the phase shifts in the $(i + 1)$-th iteration is equivalent to

φ_{iter}^{i + 1} = \arg \max_{φ} R [{(u^{i})}^{H} φ], \quad \text{s.t.} \ θ_{m} [n] \in [0, 2 π),

(17)

where $R [\cdot]$ denotes the real part of its argument. The vector $u^{i}$ can be derived from Proposition 2 and is written as

u^{i} = \frac{{\hat{R}}_{D} φ_{iter}^{i}}{{(φ_{iter}^{i})}^{H} {\hat{R}}_{E} φ_{iter}^{i}} - \frac{{(φ_{iter}^{i})}^{H} {\hat{R}}_{D} φ_{iter}^{i}}{{[{(φ_{iter}^{i})}^{H} {\hat{R}}_{E} φ_{iter}^{i}]}^{2}} [{\hat{R}}_{E} - λ_{max} ({\hat{R}}_{E}) I_{M}] φ_{iter}^{i},

(18)

where $λ_{max} ({\hat{R}}_{E})$ denotes the maximum eigenvalue of ${\hat{R}}_{E}$. After basic algebraic manipulations [35], the solution of the problem in (17) is

∠ φ_{iter}^{i + 1} = ∠ u^{i} .

(19)

Consequently, in the $(i + 1)$-th iteration, the solution of $\hat{θ}$ is $∠ {\hat{θ}}^{i + 1} = - ∠ u^{i}$. When the MM iterations converge, the optimal $θ_{m}^{opt} [n]$ is obtained.
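The MM update in (17)-(19) can be sketched as a short fixed-point iteration. This is a minimal illustration, assuming that Hermitian positive-definite matrices `R_D` and `R_E` (playing the roles of ${\hat{R}}_{D}$ and ${\hat{R}}_{E}$) are already available; the function names and the iteration count are choices made for the example.

```python
import numpy as np

def ratio(phi, R_D, R_E):
    """f(phi) = phi^H R_D phi / phi^H R_E phi."""
    return (np.vdot(phi, R_D @ phi) / np.vdot(phi, R_E @ phi)).real

def mm_phase_shifts(R_D, R_E, phi0, iters=50):
    """Iterate (18)-(19); returns the final phi and the values of f per step."""
    M = phi0.size
    # R_E - lambda_max(R_E) I, reused in every evaluation of (18).
    shifted_R_E = R_E - np.linalg.eigvalsh(R_E)[-1] * np.eye(M)
    phi = np.array(phi0, dtype=complex)
    history = [ratio(phi, R_D, R_E)]
    for _ in range(iters):
        num = np.vdot(phi, R_D @ phi).real
        den = np.vdot(phi, R_E @ phi).real
        u = (R_D @ phi) / den - (num / den**2) * (shifted_R_E @ phi)
        # (19): keep unit modulus and copy the phase of u element by element.
        phi = np.exp(1j * np.angle(u))
        history.append(ratio(phi, R_D, R_E))
    return phi, history
```

Because each step maximizes a lower bound that touches $f (φ)$ at the current iterate, the recorded values of `history` should be non-decreasing, and the phase shifts follow as $\hat{θ} = φ^{*}$.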

3.3. AIRS Trajectory Design

With the obtained transmit beamformer $w$ and phase-shift vector $\hat{θ}$ at the AIRS, the AIRS trajectory is designed for average secrecy rate maximization, i.e.,

\begin{aligned} \max_{Q} \quad & \frac{1}{N} \sum_{n = 1}^{N} R_{s} [n] \\ \text{s.t.} \quad & q [0] = q_{0}, q [N] = q_{T}, \\ & ∥ q [n + 1] - q [n] ∥ \leq V_{max} δ_{t} . \end{aligned}

(20)

According to the channel model in Section 2.1, the location of the AIRS affects the path loss of all communication links. Thus, the trajectory $Q = [q [0], \dots, q [n], \dots, q [N]]$ is coupled with all channel coefficients, which makes the problem in (20) non-convex. To cope with this non-convex problem, we resort to the DDPG algorithm and design deep neural networks for it, i.e., actor networks and critic networks. The actor network takes the state of the communication system as its input feature and generates the action. The critic network takes the state and the action as inputs and outputs the corresponding Q-value. We detail the design of the neural networks and their inputs and outputs in the following subsections.

3.3.1. State, Action, and Reward

To maximize the average secrecy rate of the AIRS-empowered secure communication, all system-related conditions are included in the state space $S$, which is shown in Table 2. More specifically, the locations of all users, the phase shifts and trajectory of the AIRS, and the transmit beamformer are considered as states in our learning algorithm. As for the choice of actions, we select the moving direction and velocity v of the AIRS in each time slot as the action in the action space $A$ of the neural network.

Since the considered AIRS is mounted on a UAV, we also consider several practical constraints on the flight of the UAV, namely the maximum flight speed $V_{max}$, the starting position $q_{0}$, and the target position $q_{T}$. Thus, the reward function of the neural network accounts for the following conditions: (1) the rate at Bob is greater than the minimum rate requirement; (2) the speed does not exceed the maximum speed of the AIRS; and (3) the AIRS has to reach the target position $q_{T}$ within N time slots.

3.3.2. Loss Function

In the DDPG algorithm, there are four networks, namely the evaluate actor network, evaluate critic network, target actor network, and target critic network. In the n-th time slot, the evaluate actor network $π (s; μ^{π})$ iteratively updates the policy network parameters $μ^{π}$ and selects the current action $a (n)$ according to the current state $s (n)$. The target actor network $π' (s; {μ^{π}}')$ selects the next action $a (n + 1)$ according to the next state $s (n + 1)$ in the experience pool, and its parameters ${μ^{π}}'$ are updated from $μ^{π}$. The evaluate critic network $Q (s, a; μ^{Q})$ iteratively updates the value network parameters $μ^{Q}$ and calculates the current Q-value. The target critic network $Q' (s, a; {μ^{Q}}')$ calculates the target value $Q'$, and its parameters ${μ^{Q}}'$ are updated periodically.

We use loss functions to measure the quality of the learned networks by describing the difference between the target network and the evaluate network, and we minimize them to bring the predicted value closer to the target value. The loss functions of the actor network and the critic network are

L (μ^{π}) = \frac{1}{N_{b}} \sum_{n = 1}^{N_{b}} - Q (s (n), π (s (n))),

(21)

L (μ^{Q}) = \frac{1}{N_{b}} \sum_{n = 1}^{N_{b}} {[r (n) + γ Q' (s (n + 1), π' (s (n + 1))) - Q (s (n), a (n))]}^{2},

(22)

where $N_{b}$ is the size of the mini-batch, $γ$ is the discount factor, and $r (n)$ is the reward received by the agent in the n-th time slot, defined as

r (n) = η_{rate} R_{s} [n] + η_{dist} d_{AT} [n],

(23)

where $η_{rate}$ is a binary penalty indicator on the secrecy rate $R_{s} [n]$, with $η_{rate} = 1$ if the constraints on the AIRS’s flight are satisfied and $η_{rate} = 0$ otherwise. In addition, $η_{dist}$ is a constant penalty coefficient on the distance $d_{AT} [n] = ∥ q [n] - q_{T} ∥$ between the current position and the target position $q_{T}$, i.e., the position of the last time slot.
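The reward in (23) and the two losses in (21) and (22) reduce to a few array operations once the Q-values are available. The sketch below is illustrative only: the Q-values are passed in as plain arrays instead of being produced by neural networks, and the negative default for `eta_dist` is an assumption (the text calls it a penalty term but does not give its value or sign).

```python
import numpy as np

def reward(R_s, q, q_T, constraints_ok, eta_dist=-0.01):
    """Reward (23); eta_dist < 0 is assumed so that distance is penalized."""
    eta_rate = 1.0 if constraints_ok else 0.0
    d_AT = np.linalg.norm(np.asarray(q, dtype=float) - np.asarray(q_T, dtype=float))
    return eta_rate * R_s + eta_dist * d_AT

def critic_loss(r, q_target_next, q_current, gamma=0.99):
    """Mini-batch loss (22) with the target y = r + gamma * Q'(s', pi'(s'))."""
    y = np.asarray(r) + gamma * np.asarray(q_target_next)
    return float(np.mean((y - np.asarray(q_current)) ** 2))

def actor_loss(q_of_policy_actions):
    """Mini-batch loss (21): the negative mean of Q(s, pi(s))."""
    return float(-np.mean(q_of_policy_actions))
```

In a full implementation these quantities would be differentiated with respect to $μ^{Q}$ and $μ^{π}$ by an automatic differentiation framework; here they only show how each term of the formulas is assembled.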

3.3.3. Neural Network

In the n-th time slot, the actor network takes the current state $s (n)$ from $S$ as its input and then generates the action $a (n)$ on the basis of the policy function. The inputs of the actor network are the nine state parameters listed in Table 2, and two action parameters, the magnitude and direction of the AIRS velocity, are produced by the output layer using the tanh activation function. The angle range of the AIRS velocity direction is $(- π, π)$. The neural network diagram of the actor is shown in Figure 2. The critic network takes the current $s (n)$ and $a (n)$ as inputs and then generates $Q (s (n), a (n))$ for evaluating the actor network based on our designed reward function. The neural network diagram of the critic is shown in Figure 3, and the whole DDPG algorithm is summarized in Algorithm 1. The parameters involved in the neural networks are listed in Table 3.

Algorithm 1 DDPG Algorithm

1: Input: transmit beamformer $w$, phase-shift matrix $Θ$, locations of all users, maximum velocity, starting and target positions of the AIRS, actor and critic network structures.
2: Output: actor network with $μ^{π}$, critic network with $μ^{Q}$, and the trajectory $Q = [q [0], \dots, q [n], \dots, q [N]]$ in state space.
3: Initialize the evaluate actor network $π (s; μ^{π})$ and the evaluate critic network $Q (s, a; μ^{Q})$ with weights $μ^{π}$ and $μ^{Q}$.
4: Initialize the target actor network $π' (s; {μ^{π}}')$ and the target critic network $Q' (s, a; {μ^{Q}}')$ with ${μ^{π}}' \leftarrow μ^{π}$ and ${μ^{Q}}' \leftarrow μ^{Q}$.
5: Initialize the experience replay buffer $\mathcal{M}$.
6: for each episode do
7:   Initialize a random exploration noise $\mathcal{N}$, and include the AIRS’s starting position, velocity, phase-shift matrix, and transmit beamformer in the state space.
8:   for each time slot $n = 1, \dots, N$ do
9:     Get the current observation state $s (n)$, and obtain $a (n) = π (s (n); μ^{π}) + \mathcal{N}$.
10:     The central controller takes the action $a (n)$ and observes the reward $r (n)$; the model then transfers to a new state $s (n + 1)$.
11:     Store $T (n) = (s (n), a (n), r (n), s (n + 1))$ in the experience replay buffer $\mathcal{M}$.
12:   end for
13:   Randomly sample a mini-batch of $N_{b}$ transitions $T (n)$ from $\mathcal{M}$ and calculate $y (n) = r (n) + γ Q' (s (n + 1), π' (s (n + 1)))$.
14:   Minimize the loss function $L (μ^{Q})$ to update the parameter $μ^{Q}$.
15:   Minimize the loss function $L (μ^{π})$ to update the parameter $μ^{π}$.
16:   Soft updates for the target networks:
17:   ${μ^{Q}}' = τ μ^{Q} + (1 - τ) {μ^{Q}}'$.
18:   ${μ^{π}}' = τ μ^{π} + (1 - τ) {μ^{π}}'$.
19: end for
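Two bookkeeping pieces of Algorithm 1, the experience replay buffer (steps 5, 11, and 13) and the soft target update (steps 16-18), can be sketched without any deep learning framework. The class and function names are hypothetical, and the network parameters are represented as a plain list of NumPy arrays for the purpose of the example.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Fixed-capacity store of transitions T(n) = (s, a, r, s_next)."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition when full.
        self.storage = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.storage.append((s, a, r, s_next))

    def sample(self, batch_size):
        """Draw a uniformly random mini-batch without replacement."""
        batch = random.sample(list(self.storage), batch_size)
        s, a, r, s_next = map(np.array, zip(*batch))
        return s, a, r, s_next

    def __len__(self):
        return len(self.storage)

def soft_update(target_params, eval_params, tau):
    """Steps 17-18: mu' <- tau * mu + (1 - tau) * mu' for every parameter."""
    return [tau * e + (1.0 - tau) * t for t, e in zip(target_params, eval_params)]
```

A small value of $τ$ makes the target networks track the evaluate networks slowly, which is the usual reason for keeping separate target copies in DDPG.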

3.4. Iterative Algorithm Description

On the basis of Propositions 1 and 2 and Algorithm 1, the transmit beamformer $w$, the phase-shift vector $\hat{θ}$, and the trajectory $Q$ of AIRS can be updated alternately, and the solution of the average secrecy rate maximization problem is obtained. We summarize the whole iterative procedure as Algorithm 2. In addition, since the solution of each sub-problem is derived by maximizing the secrecy performance, the average secrecy rate is non-decreasing after each iteration. Consequently, under the transmit power limitation, the proposed iterative algorithm is guaranteed to converge.

The optimization problem investigated in this paper is non-convex, so we divide it into three sub-problems and solve them iteratively. The solution of each sub-problem is obtained by maximizing the secrecy rate of the system with the other variables fixed. Thus, the objective function is non-decreasing after each iteration. Moreover, because of the transmit power limitation, the achievable secrecy rate is bounded above by a finite value. Therefore, the convergence of the proposed iterative algorithm is guaranteed.

Algorithm 2 Algorithm for Average Secrecy Rate Maximization

1: Initialize $w^{0}[n]$, $\hat{θ}^{0}[n]$, $Q^{0} = [q[0], \dots, q[n], \dots, q[N]]$, and set the locations of all users, the maximum velocity of AIRS, and the tolerance $ε$.
2: Calculate the average secrecy rate $(R_{s}^{aver})^{0} = R_{s}^{aver}(w^{0}[n], \hat{θ}^{0}[n], Q^{0})$.
3: Set the iteration index $i = 0$.
4: repeat
5: $i = i + 1$
6: Given $\hat{θ}^{i - 1}[n]$ and $Q^{i - 1}$, derive the transmit beamformer $w^{i}[n]$ using Proposition 1.
7: With the obtained $w^{i}[n]$ and given $Q^{i - 1}$, find $\hat{θ}^{i}[n]$ according to Proposition 2.
8: With the obtained $w^{i}[n]$ and $\hat{θ}^{i}[n]$, initialize the state space $S$, and derive the trajectory of AIRS $Q^{i}$ based on the DDPG method in Algorithm 1.
9: Calculate the average secrecy rate $(R_{s}^{aver})^{i} = R_{s}^{aver}(w^{i}[n], \hat{θ}^{i}[n], Q^{i})$.
10: until $|(R_{s}^{aver})^{i} - (R_{s}^{aver})^{i - 1}| \leq ε$.
11: return $w^{opt}[n] = w^{i}[n]$, $\hat{θ}^{opt}[n] = \hat{θ}^{i}[n]$, $n = 1, \dots, N$, and $Q^{opt} = Q^{i}$.
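
The control flow of Algorithm 2 can be sketched as a generic alternating loop. The sketch below is an assumption-laden illustration rather than the paper's implementation: the three update callables stand in for Proposition 1, Proposition 2, and the DDPG trajectory design, and the toy objective at the bottom is a hypothetical concave function of three scalars, chosen only so that each block update has a closed form.

```python
def alternating_maximization(w0, theta0, q0, update_w, update_theta, update_q,
                             rate, eps=1e-4, max_iter=100):
    """Update w, theta, and q in turn until the objective changes by at most eps."""
    w, theta, q = w0, theta0, q0
    prev = rate(w, theta, q)
    history = [prev]
    for _ in range(max_iter):
        w = update_w(theta, q)          # stands in for Proposition 1
        theta = update_theta(w, q)      # stands in for Proposition 2
        q = update_q(w, theta)          # stands in for Algorithm 1 (DDPG)
        cur = rate(w, theta, q)
        history.append(cur)
        if abs(cur - prev) <= eps:      # stopping rule of step 10
            break
        prev = cur
    return w, theta, q, history

# Hypothetical toy objective: each block update is its exact maximizer.
def toy_rate(w, theta, q):
    return -(w - 1.0) ** 2 - (theta - w) ** 2 - (q - theta) ** 2

w, theta, q, hist = alternating_maximization(
    0.0, 0.0, 0.0,
    update_w=lambda theta, q: (1.0 + theta) / 2.0,
    update_theta=lambda w, q: (w + q) / 2.0,
    update_q=lambda w, theta: theta,
    rate=toy_rate, eps=1e-10, max_iter=200,
)
```

In the toy problem every block update is an exact maximizer, so the recorded objective values never decrease, which mirrors the convergence argument given above for a bounded objective.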

3.5. Complexity Analysis

We assume that the proposed iterative algorithm converges after $N_{iter}$ iterations. Its complexity is determined by the transmit beamforming design, the AIRS phase-shift design, the AIRS trajectory design, and $N_{iter}$. For the design of transmit beamforming, on the basis of Proposition 1, the complexity is $O(N_{t}^{3} + N_{t} M^{2})$. For the design of AIRS phase shifts, the complexity relies mainly on the number of iterations of the MM algorithm, $I_{iter}$, and the calculation of Equation (18). Therefore, the complexity of obtaining the phase shifts is $O((M^{3} + M^{2} N_{t}) I_{iter})$. For the design of the AIRS trajectory, we assume that the deep neural network has $L$ layers with $n_{k}$ neurons in the $k$-th layer. With a mini-batch size of $N_{b}$ and $N$ time slots, the complexity of the DDPG algorithm is $O(N_{b} N (\sum_{k = 1}^{L - 1} n_{k} n_{k + 1}))$. Consequently, the complexity of our proposed iterative algorithm is about $O(N_{iter} (N_{t}^{3} + I_{iter} (M^{3} + M^{2} N_{t}) + N_{b} N (\sum_{k = 1}^{L - 1} n_{k} n_{k + 1})))$.
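
To get a feel for which term dominates, the three per-iteration terms of the overall expression can be evaluated numerically. The following sketch assumes the simulation values $N_{t} = 20$, $M = 64$, $N_{b} = 4$, and $N = 20$; the MM iteration count `i_iter = 10` and the input and output widths of the network (103 and 2) are hypothetical placeholders, and constant factors hidden by the $O(\cdot)$ notation are ignored.

```python
def mlp_products(layer_sizes):
    """Sum of n_k * n_{k+1} over consecutive layers."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

def per_iteration_terms(n_t, m, i_iter, n_b, n_slots, layer_sizes):
    """Return the beamforming, phase-shift, and trajectory terms of the overall expression."""
    beam = n_t ** 3
    phase = i_iter * (m ** 3 + m ** 2 * n_t)
    traj = n_b * n_slots * mlp_products(layer_sizes)
    return beam, phase, traj

# Hidden widths follow the simulation setup; input/output widths are assumptions.
layers = (103, 128, 128, 64, 64, 2)
beam, phase, traj = per_iteration_terms(
    n_t=20, m=64, i_iter=10, n_b=4, n_slots=20, layer_sizes=layers)
```

Under these assumed values the phase-shift and trajectory terms are of the same order, and both are far larger than the beamforming term.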

4. Simulation Results and Analysis

In this section, simulation results are presented to evaluate the average secrecy rate performance of our proposed AIRS-aided secure transmission scheme. Unless otherwise stated, it is assumed that Alice is located at $l_{A} = [50, 30]$ m with the height $h_{A} = 10$ m, while the desired destination and the passive eavesdropper are located at $[100, 150, 0]$ and $[150, 180, 0]$, respectively. Moreover, we assume that AIRS needs to fly from the starting position $q_{0} = [0, 0]$ m to the target position $q_{T} = [200, 200]$ m within $N = 20$ time slots. The constant flight altitude of AIRS is $h = 50$ m, and the maximum flight speed is limited to $V_{max} = 20$ m/s. In addition, the number of antennas at Alice is $N_{t} = 20$, and the number of reflecting elements at AIRS is $M = 64$. The element spacing at both Alice and AIRS is set to half a wavelength. All channels are modeled based on the rank-one channel model mentioned in Section 2.1. Without loss of generality, we set the path loss exponent to $α = 2.5$ and the channel gain at the reference distance $d_{0} = 1$ m to $ρ_{0} = -10$ dBm. The transmit power is $P_{s} = 20$ dBm.

In addition, the DDPG algorithm involves two neural networks, the actor network and the critic network. We set the number of neurons in the four hidden layers of the actor network to $(128, 128, 64, 64)$. The critic network also has four hidden layers, with the number of neurons set to $(128, 128, 64, 64)$. The learning rates of the actor network and the critic network are set to $3 \times 10^{-4}$ and $1 \times 10^{-3}$, respectively, and the constant penalty term is set to $η_{dist} = 0.1$. The discount factor is set to $γ = 0.99$, and the standard deviation of the exploration noise is $σ_{N} = 0.1$. The size of the experience replay buffer is $1 \times 10^{4}$, and the mini-batch size is 4. The soft update coefficient is set to $τ = 0.005$.
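
As an illustration of how the replay buffer and exploration noise settings above fit together, the following standard-library sketch stores transitions in a bounded buffer and perturbs an action with Gaussian noise. The class and function names are hypothetical, and zero-mean Gaussian noise is an assumption; only the capacity ($10^{4}$), the mini-batch size (4), and the noise deviation (0.1) are taken from the text.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (s, a, r, s_next) transitions; oldest entries are dropped."""

    def __init__(self, capacity=10_000, seed=0):
        self._buf = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self._buf.append((s, a, r, s_next))

    def sample(self, batch_size=4):
        # Uniform sampling without replacement from the stored transitions.
        return self._rng.sample(list(self._buf), batch_size)

    def __len__(self):
        return len(self._buf)

def noisy_action(action, sigma=0.1, rng=None):
    """Add zero-mean Gaussian exploration noise to each action component."""
    rng = rng or random.Random(0)
    return [a + rng.gauss(0.0, sigma) for a in action]

buffer = ReplayBuffer()
for n in range(10_005):                  # more transitions than the capacity
    buffer.store(n, noisy_action([0.0, 0.0]), 0.0, n + 1)
batch = buffer.sample()
```

Once the buffer is full, each new transition evicts the oldest one, so sampling always draws from the most recent $10^{4}$ transitions.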

In the AIRS-assisted secure communication system, we propose an iterative algorithm for secrecy rate maximization. The effectiveness of the proposed algorithm depends on whether and how fast it converges. Therefore, we select two time slots during the AIRS flight as samples to study the convergence behavior. As shown in Figure 4, the iterative algorithm converges monotonically and quickly for both sampled time slots. Specifically, for time slot 1, the secrecy rate levels off after 10 iterations, while the secrecy rate in time slot 2 reaches a plateau after nine iterations. That is, after 10 iterations, both time slots have converged, and the secrecy rate remains unchanged in the remaining iterations. After the iterative algorithm converges, the joint design of transmit beamforming, phase shifts, and trajectory for each slot is obtained, and the secrecy performance is maximized. These results corroborate the convergence analysis in Section 3.

The transmit power is a key factor that affects the security performance of physical layer security systems. An increase in transmit power enhances the signal strength received by all receivers. Therefore, it can increase the information rate of the legitimate user Bob, but it can also increase the information leakage to the illegal user Eve. Figure 5 displays the variation of the average secrecy rate with the transmit power at the transmitter Alice. It can be observed that increasing the power at Alice to a certain extent enhances the secrecy rate performance. The reason is that a moderate increase in transmit power raises the information rate received at Bob, thereby increasing the secrecy rate of each time slot. However, excessive transmit power can also result in more information being eavesdropped on. Therefore, setting an appropriate power is crucial for the secure transmission of information. Moreover, the performance comparison of $M = 81$ and $M = 64$ reveals that the more reflecting elements the AIRS is equipped with, the better the average secrecy rate performance.

To verify the effectiveness of the developed AIRS-assisted secure transmission scenario, the secrecy performance of terrestrial IRS (TIRS)-based scenarios is also included as a baseline in Figure 5. Unlike the AIRS mounted on aerial platforms, the reflecting elements of a TIRS are installed on the surfaces of buildings or other infrastructure on the ground. Thus, a TIRS in terrestrial networks can only achieve a coverage of $180^{\circ}$, while AIRS can achieve a full coverage of $360^{\circ}$ due to its flexibility and mobility in the air. It can be observed from Figure 5 that, in comparison with TIRS-assisted secure transmission, our proposed AIRS-assisted scenario obtains a higher secrecy rate. When the transmit power is $P_{s} = 10$ dBm and the number of reflecting elements is $M = 64$, AIRS assistance achieves a secrecy rate gain of $34.5\%$ over TIRS-assisted systems. Moreover, when the number of reflecting elements of AIRS is increased to $M = 81$, the secrecy rate of our proposed AIRS-assisted scheme is almost twice that of TIRS systems. This superior performance relies on the full-angle coverage of AIRS, which enables AIRS to facilitate enhanced LoS connectivity for terrestrial users. Thus, in contrast to the terrestrial LoS connection constructed by TIRS, which may suffer from terrestrial obstacles and strong fading, the air LoS can achieve better transmission performance, which in turn improves the average secrecy rate.

In wireless communication systems, the configuration of multiple antennas can significantly improve the information transmission rate and security because more antennas lead to sharper information beams, thereby reducing information leakage to eavesdroppers. Figure 6 analyzes the impact of the number of transmit antennas on the average secrecy rate. In Figure 6, our proposed trajectory design method is compared with two baselines, the straight trajectory and the polyline trajectory, and all trajectories are shown in Figure 7. Specifically, for the straight trajectory, we assume that the AIRS flies straight from its starting horizontal position $q_{0} = [0, 0]$ m to the target position $q_{T} = [200, 200]$ m with a constant velocity $v = 14.14$ m/s. For the polyline trajectory, the AIRS also starts from the starting position; it first flies straight to Alice's position, then to Bob's position, and finally to the target position. In contrast to the preset fixed flight trajectories, our proposed method uses the DDPG algorithm to design an optimal trajectory that maximizes the system secrecy rate based on the real-time channel state.
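
The two baseline trajectories can be reproduced from the coordinates given in the simulation setup. The sketch below resamples each path at constant speed into $N + 1$ waypoints; the function name is hypothetical, and the slot duration of 1 s is an assumption that is consistent with the quoted $v = 14.14$ m/s for the straight path over $N = 20$ slots.

```python
import numpy as np

def resample_path(vertices, n_slots):
    """Place n_slots + 1 equally spaced waypoints along a piecewise-linear path.

    Returns the waypoints and the distance flown per slot.
    """
    v = np.asarray(vertices, dtype=float)
    seg = np.linalg.norm(np.diff(v, axis=0), axis=1)   # segment lengths
    s = np.concatenate(([0.0], np.cumsum(seg)))        # arc length at each vertex
    t = np.linspace(0.0, s[-1], n_slots + 1)           # arc length at each waypoint
    x = np.interp(t, s, v[:, 0])
    y = np.interp(t, s, v[:, 1])
    return np.stack([x, y], axis=1), s[-1] / n_slots

start, target = [0.0, 0.0], [200.0, 200.0]
alice, bob = [50.0, 30.0], [100.0, 150.0]

straight, straight_step = resample_path([start, target], n_slots=20)
polyline, polyline_step = resample_path([start, alice, bob, target], n_slots=20)
```

With a 1 s slot, the per-slot distance equals the speed in m/s, so both baselines can be checked against the limit $V_{max} = 20$ m/s.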

Figure 6 shows that, as the number of transmit antennas increases, the average secrecy rate improves greatly. Intuitively, this is because more transmit antennas at Alice provide more spatial degrees of freedom for confidential information transmission, which improves the secrecy rate. In addition, the results show that our proposed DDPG-based AIRS trajectory design achieves the best average secrecy rate performance, while the polyline trajectory performs better than the straight trajectory. Unlike the straight and polyline trajectories, our trajectory design takes the dynamic communication environment and the distance-based path loss into account, so the state and actions of the neural network are updated with the dynamic channel conditions. The designed neural network can then output the optimal real-time position of AIRS. Therefore, the proposed DDPG-based trajectory design obtains the best secrecy rate performance compared with the preset trajectories. In particular, when the transmit power is $P_{s} = 100$ mW and the number of transmit antennas is $N_{t} = 20$, the proposed trajectory design achieves a secrecy rate gain of $15\%$ over the polyline trajectory.

In addition, consistent with the analysis of Figure 5, the system with transmit power $P_{s} = 100$ mW obtains a higher secrecy rate than the system with $P_{s} = 50$ mW.
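
Figure 5 reports the transmit power in dBm, whereas Figure 6 uses mW, so it can help to convert between the two when comparing them. The helper names below are hypothetical; the conversion itself is the standard definition of dBm as decibels relative to 1 mW.

```python
import math

def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

def mw_to_dbm(p_mw):
    """Convert a power in milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)

default_power_mw = dbm_to_mw(20.0)   # default P_s = 20 dBm
low_power_dbm = mw_to_dbm(50.0)      # the 50 mW curve of Figure 6
```

The default setting $P_{s} = 20$ dBm therefore corresponds to the 100 mW curve, and 50 mW is roughly 17 dBm.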

5. Conclusions

In this paper, we investigated an AIRS-empowered secure communication system, in which an AIRS mounted on an aerial platform is introduced to assist confidential information transmission. To guarantee information security during the flight period, the average secrecy rate was investigated, and the transmit beamforming at the transmitter, the phase-shift matrix, and the trajectory of AIRS were jointly designed. To cope with the formulated non-convex problem, we divided the secrecy rate maximization problem into three sub-problems and proposed an iterative algorithm. Specifically, we optimized the IRS phase-shift matrix by the MM algorithm and designed the trajectory of AIRS by the DDPG algorithm. Simulation results have demonstrated that our proposed algorithm can markedly enhance the secrecy performance.

Author Contributions

Methodology, Y.W. and J.Q.; data and experiment, Y.W. and C.Z.; writing—original draft preparation, Y.W.; writing—review and editing, J.Q. and C.Z.; funding acquisition, J.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 61901247), and Natural Science Foundation of Shandong Province of China (No. ZR2019BF032).

Data Availability Statement

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Proposition 1

To ensure the security of information transmission, a non-negative secrecy rate $R_{s}^{aver} \geq 0$ should be guaranteed, i.e., $γ_{D} \geq γ_{E}$. Therefore, in the following analysis, the operator $[\cdot]^{+}$ in (11) is removed. In the formulated average secrecy rate maximization problem, we introduce an extra coefficient $ξ$ to convert the inequality power constraint on $w$ into an equality one, that is, $\|w\|^{2} = ξ$ with $0 \leq ξ \leq 1$. Thus, the sub-problem with respect to the transmit beamformer can be formulated as (13).

We first study the relationship between the objective function and $ξ$. In (13), the objective function can be expressed as

$$R_{s}^{aver}(ξ) = \frac{1 + k_{d} ξ}{1 + k_{e} ξ}, \tag{A1}$$

where $k_{d}$ and $k_{e}$ can be derived from (13). The derivative of $R_{s}^{aver}(ξ)$ is easily obtained as

$$\frac{d R_{s}^{aver}(ξ)}{d ξ} = \frac{k_{d} - k_{e}}{(1 + k_{e} ξ)^{2}}. \tag{A2}$$

Because $(1 + k_{e} ξ)^{2}$ is greater than zero, the sign of the derivative depends on $(k_{d} - k_{e})$. The positive secrecy rate constraint implies $k_{d} \geq k_{e}$, so $R_{s}^{aver}(ξ)$ is a monotonically increasing function of $ξ$. Thus, the optimal coefficient is $ξ^{opt} = 1$.
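
The monotonicity argument can be checked numerically. In the sketch below, the values $k_{d} = 5$ and $k_{e} = 2$ are hypothetical and only chosen to satisfy $k_{d} > k_{e}$; the function names are likewise illustrative.

```python
def secrecy_ratio(xi, k_d, k_e):
    """Objective of (A1) as a function of the power coefficient xi."""
    return (1.0 + k_d * xi) / (1.0 + k_e * xi)

def secrecy_ratio_derivative(xi, k_d, k_e):
    """Closed-form derivative of (A2)."""
    return (k_d - k_e) / (1.0 + k_e * xi) ** 2

k_d, k_e = 5.0, 2.0                       # hypothetical values with k_d > k_e
grid = [i / 100 for i in range(101)]      # xi in [0, 1]
values = [secrecy_ratio(x, k_d, k_e) for x in grid]
best_xi = grid[values.index(max(values))]
```

On this grid the maximum is attained at the upper end of the interval, in line with $ξ^{opt} = 1$; swapping the two constants makes the derivative negative, so the maximum would then move to $ξ = 0$.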

With the obtained $ξ^{opt}$, the sub-problem in (13) can be expressed as a generalized eigenvector problem, and the optimal solution of $w[n]$ can be derived as

$$w^{opt} = e^{unit}, \tag{A3}$$

where $e^{unit}$ is the unit-norm eigenvector corresponding to the largest eigenvalue of the matrix $H_{E}^{-1} H_{D}$. The matrices $H_{D}$ and $H_{E}$ can be found in Section 3.1.
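
A small numerical sketch of the eigenvector solution in (A3) is given below. The matrices are hypothetical stand-ins: they are built as identity plus a rank-one term from random channel vectors, which is only an assumption about the structure of $H_{D}$ and $H_{E}$ (their actual definitions are in Section 3.1), and the function names are illustrative.

```python
import numpy as np

def secrecy_beamformer(H_D, H_E):
    """Unit-norm eigenvector of H_E^{-1} H_D for its largest eigenvalue."""
    vals, vecs = np.linalg.eig(np.linalg.solve(H_E, H_D))
    w = vecs[:, np.argmax(vals.real)]
    return w / np.linalg.norm(w)

def ratio(w, H_D, H_E):
    """Generalized Rayleigh quotient (w^H H_D w) / (w^H H_E w)."""
    return (w.conj() @ H_D @ w).real / (w.conj() @ H_E @ w).real

rng = np.random.default_rng(0)
n_t = 6                                            # small antenna count for the demo
h_d = rng.normal(size=n_t) + 1j * rng.normal(size=n_t)
h_e = rng.normal(size=n_t) + 1j * rng.normal(size=n_t)
H_D = np.eye(n_t) + 10.0 * np.outer(h_d, h_d.conj())   # assumed structure
H_E = np.eye(n_t) + 10.0 * np.outer(h_e, h_e.conj())   # assumed structure

w_opt = secrecy_beamformer(H_D, H_E)
best = ratio(w_opt, H_D, H_E)
```

For Hermitian positive definite matrices, the value of the quotient at `w_opt` equals the largest eigenvalue of $H_{E}^{-1} H_{D}$, and no other beamformer attains a larger value.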

Appendix B. Proof of Proposition 2

To solve the sub-problem with respect to the phase-shift vector $\hat{θ}[n]$, the MM algorithm is utilized, and a surrogate function approximating the objective function is first defined. Here, we define the parameter $y = ϕ^{H} \hat{R}_{E} ϕ$, so that the objective function $f(ϕ)$ is written as

$$f(ϕ, y) = \frac{ϕ^{H} \hat{R}_{D} ϕ}{y}. \tag{A4}$$

It can be proved that $f(ϕ, y)$ is jointly convex over $ϕ$ and $y$. Given a fixed point $(ϕ_{0}, y_{0})$, a linear approximation of $f(ϕ, y)$ can be obtained as

$$\begin{aligned} f(ϕ, y) &\geq \frac{ϕ_{0}^{H} \hat{R}_{D} ϕ_{0}}{y_{0}} + \frac{1}{y_{0}} ϕ_{0}^{H} \hat{R}_{D}^{H} (ϕ - ϕ_{0}) + \frac{1}{y_{0}} (ϕ - ϕ_{0})^{H} \hat{R}_{D} ϕ_{0} - \frac{ϕ_{0}^{H} \hat{R}_{D} ϕ_{0}}{y_{0}^{2}} (y - y_{0}) \\ &= 2 \frac{\operatorname{Re}(ϕ_{0}^{H} \hat{R}_{D}^{H} ϕ)}{y_{0}} - \frac{y}{y_{0}^{2}} ϕ_{0}^{H} \hat{R}_{D}^{H} ϕ_{0}. \end{aligned} \tag{A5}$$

Since $\hat{R}_{E}$ is a Hermitian matrix, $λ_{max} I_{M} - \hat{R}_{E}$ is positive semi-definite, in which $λ_{max}$ is the largest eigenvalue of $\hat{R}_{E}$. Then, we have

$$y \leq ϕ^{H} λ_{max} ϕ + 2 \operatorname{Re}(ϕ^{H} (\hat{R}_{E} - λ_{max} I_{M}) ϕ_{0}) + ϕ_{0}^{H} (λ_{max} I_{M} - \hat{R}_{E}) ϕ_{0}. \tag{A6}$$

Therefore, substituting (A6) into (A5) with $y_{0} = ϕ_{0}^{H} \hat{R}_{E} ϕ_{0}$, the surrogate function can be defined as

$$g(ϕ | ϕ_{0}) = 2 \frac{\operatorname{Re}(ϕ_{0}^{H} \hat{R}_{D} ϕ)}{ϕ_{0}^{H} \hat{R}_{E} ϕ_{0}} - \frac{ϕ_{0}^{H} \hat{R}_{D} ϕ_{0}}{(ϕ_{0}^{H} \hat{R}_{E} ϕ_{0})^{2}} \{ϕ^{H} λ_{max}(\hat{R}_{E}) ϕ + 2 \operatorname{Re}(ϕ^{H} [\hat{R}_{E} - λ_{max}(\hat{R}_{E}) I_{M}] ϕ_{0})\}. \tag{A7}$$

Then, $f(ϕ) = \frac{ϕ^{H} \hat{R}_{D} ϕ}{ϕ^{H} \hat{R}_{E} ϕ} \geq g(ϕ | ϕ_{0}) + [f(ϕ_{0}) - g(ϕ_{0} | ϕ_{0})]$ is proved.
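
The quadratic upper bound in (A6), which drives the MM step, can be verified numerically. The sketch below uses a random Hermitian positive semi-definite matrix as a hypothetical stand-in for $\hat{R}_{E}$ and random unit-modulus vectors for $ϕ$ and $ϕ_{0}$; the unit-modulus choice is an assumption that reflects typical IRS phase-shift vectors, and the function names are illustrative.

```python
import numpy as np

def quad(R, x):
    """Real-valued quadratic form x^H R x."""
    return (x.conj() @ R @ x).real

def majorizer(phi, phi0, R_E):
    """Right-hand side of (A6) evaluated at phi for the fixed point phi0."""
    lam = np.linalg.eigvalsh(R_E)[-1]          # largest eigenvalue of R_E
    A = R_E - lam * np.eye(len(phi))           # negative semi-definite
    return (lam * (phi.conj() @ phi).real
            + 2.0 * (phi.conj() @ A @ phi0).real
            - quad(A, phi0))

def unit_modulus(rng, m):
    """Random vector with unit-modulus entries."""
    return np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=m))

rng = np.random.default_rng(1)
m = 8
G = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
R_E = G @ G.conj().T                           # hypothetical Hermitian PSD matrix
phi0 = unit_modulus(rng, m)
gaps = [majorizer(p, phi0, R_E) - quad(R_E, p)
        for p in (unit_modulus(rng, m) for _ in range(100))]
```

Every gap is non-negative, and the bound is tight at $ϕ = ϕ_{0}$. For unit-modulus $ϕ$ the term $ϕ^{H} λ_{max} ϕ$ reduces to the constant $M λ_{max}$, so the bound is linear in $ϕ$.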

References

1. Chorti, A.; Barreto, A.N.; Köpsell, S.; Zoli, M.; Chafii, M.; Sehier, P.; Fettweis, G.; Poor, H.V. Context-Aware Security for 6G Wireless: The Role of Physical Layer Security. IEEE Commun. Stand. Mag. 2022, 6, 102–108.
2. Angueira, P.; Val, I.; Montalban, J.; Seijo, Ó.; Iradier, E.; Fontaneda, P.S.; Fanari, L.; Arriola, A. A Survey of Physical Layer Techniques for Secure Wireless Communications in Industry. IEEE Commun. Surv. Tutor. 2022, 24, 810–838.
3. Tang, Z.; Hou, T.; Liu, Y.; Zhang, J.; Hanzo, L. Physical Layer Security of Intelligent Reflective Surface aided NOMA Networks. IEEE Trans. Veh. Technol. 2022, 71, 7821–7834.
4. Li, G.; Yang, H.; Zhang, J.; Liu, H.; Hu, A. Fast and Secure Key Generation with Channel Obfuscation in Slowly Varying Environments. In Proceedings of the IEEE Conference on Computer Communications (IEEE INFOCOM), Online, 2–5 May 2022; pp. 1–10.
5. Dong, L.; Han, Z.; Petropulu, A.P.; Poor, H.V. Improving Wireless Physical Layer Security via Cooperating Relays. IEEE Trans. Signal Process. 2010, 58, 1875–1888.
6. Qiao, J.; Zhang, H.; Zhao, F.; Yuan, D. Secure Transmission and Self-Energy Recycling With Partial Eavesdropper CSI. IEEE J. Sel. Areas Commun. 2018, 36, 1531–1543.
7. Wu, Q.; Zhang, R. Towards Smart and Reconfigurable Environment: Intelligent Reflecting Surface Aided Wireless Network. IEEE Commun. Mag. 2019, 58, 106–112.
8. Gong, S.; Lu, X.; Hoang, D.T.; Niyato, D.; Shu, L.; Kim, D.I.; Liang, Y.C. Toward Smart Wireless Communications via Intelligent Reflecting Surfaces: A Contemporary Survey. IEEE Commun. Surv. Tutor. 2020, 22, 2283–2314.
9. Yang, H.; Xiong, Z.; Zhao, J.; Niyato, D.; Xiao, L.; Wu, Q. Deep Reinforcement Learning-Based Intelligent Reflecting Surface for Secure Wireless Communications. IEEE Trans. Wirel. Commun. 2021, 20, 375–388.
10. Dong, L.; Wang, H.M. Enhancing Secure MIMO Transmission via Intelligent Reflecting Surface. IEEE Trans. Wirel. Commun. 2020, 19, 7543–7556.
11. Khan, W.U.; Lagunas, E.; Ali, Z.; Javed, M.A.; Ahmed, M.; Chatzinotas, S.; Ottersten, B.; Popovski, P. Opportunities for physical layer security in UAV communication enhanced with intelligent reflective surfaces. arXiv 2022, arXiv:2203.16907.
12. Wu, Q.; Zhang, S.; Zheng, B.; You, C.; Zhang, R. Intelligent Reflecting Surface-Aided Wireless Communications: A Tutorial. IEEE Trans. Commun. 2021, 69, 3313–3351.
13. Liu, Y.; Liu, X.; Mu, X.; Hou, T.; Xu, J.; Di Renzo, M.; Al-Dhahir, N. Reconfigurable Intelligent Surfaces: Principles and Opportunities. IEEE Commun. Surv. Tutor. 2021, 23, 1546–1577.
14. Huang, C.; Zappone, A.; Alexandropoulos, G.C.; Debbah, M.; Yuen, C. Reconfigurable Intelligent Surfaces for Energy Efficiency in Wireless Communication. IEEE Trans. Wirel. Commun. 2019, 18, 4157–4170.
15. Yu, X.; Xu, D.; Sun, Y.; Ng, D.W.K.; Schober, R. Robust and Secure Wireless Communications via Intelligent Reflecting Surfaces. IEEE J. Sel. Areas Commun. 2020, 38, 2637–2652.
16. Cui, M.; Zhang, G.; Zhang, R. Secure Wireless Communication via Intelligent Reflecting Surface. IEEE Wirel. Commun. Lett. 2019, 8, 1410–1414.
17. Dong, L.; Wang, H.M. Secure MIMO Transmission via Intelligent Reflecting Surface. IEEE Wirel. Commun. Lett. 2020, 9, 787–790.
18. Asaad, S.; Wu, Y.; Bereyhi, A.; Müller, R.R.; Schaefer, R.F.; Poor, H.V. Secure Active and Passive Beamforming in IRS-Aided MIMO Systems. IEEE Trans. Inf. Forensics Secur. 2022, 17, 1300–1315.
19. Qiao, J.; Alouini, M.S. Secure Transmission for Intelligent Reflecting Surface-Assisted mmWave and Terahertz Systems. IEEE Wirel. Commun. Lett. 2020, 9, 1743–1747.
20. Qiao, J.; Zhang, C.; Dong, A.; Bian, J.; Alouini, M.S. Securing Intelligent Reflecting Surface Assisted Terahertz Systems. IEEE Trans. Veh. Technol. 2022, 71, 8519–8533.
21. Ye, J.; Qiao, J.; Kammoun, A.; Alouini, M.S. Non-Terrestrial Communications Assisted by Reconfigurable Intelligent Surfaces. Proc. IEEE 2022.
22. Hashida, H.; Kawamoto, Y.; Kato, N. Intelligent Reflecting Surface Placement Optimization in Air-Ground Communication Networks Toward 6G. IEEE Wirel. Commun. 2020, 27, 146–151.
23. Lu, H.; Zeng, Y.; Jin, S.; Zhang, R. Aerial Intelligent Reflecting Surface: Joint Placement and Passive Beamforming Design with 3D Beam Flattening. IEEE Trans. Wirel. Commun. 2021, 20, 4128–4143.
24. Lu, H.; Zeng, Y.; Jin, S.; Zhang, R. Enabling Panoramic Full-Angle Reflection via Aerial Intelligent Reflecting Surface. In Proceedings of the IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland, 7–11 June 2020; pp. 1–6.
25. Pang, X.; Zhao, N.; Tang, J.; Wu, C.; Niyato, D.; Wong, K.K. IRS-Assisted Secure UAV Transmission via Joint Trajectory and Beamforming Design. IEEE Trans. Commun. 2021, 70, 1140–1152.
26. Guo, X.; Chen, Y.; Wang, Y. Learning-Based Robust and Secure Transmission for Reconfigurable Intelligent Surface Aided Millimeter Wave UAV Communications. IEEE Wirel. Commun. Lett. 2021, 10, 1795–1799.
27. Ge, L.; Dong, P.; Zhang, H.; Wang, J.B.; You, X. Joint beamforming and trajectory optimization for intelligent reflecting surfaces-assisted UAV communications. IEEE Access 2020, 8, 78702–78712.
28. AlJubayrin, S.; Al-Wesabi, F.N.; Alsolai, H.; Duhayyim, M.A.; Nour, M.K.; Khan, W.U.; Mahmood, A.; Rabie, K.; Shongwe, T. Energy Efficient Transmission Design for NOMA Backscatter-Aided UAV Networks with Imperfect CSI. Drones 2022, 6, 190.
29. Khan, W.U.; Jamshed, M.A.; Lagunas, E.; Chatzinotas, S.; Li, X.; Ottersten, B. Energy efficiency optimization for backscatter enhanced NOMA cooperative V2X communications under imperfect CSI. IEEE Trans. Intell. Transp. Syst. 2022.
30. Ihsan, A.; Chen, W.; Khan, W.U. Energy-efficient backscatter aided uplink NOMA roadside sensor communications under channel estimation errors. arXiv 2021, arXiv:2109.05341.
31. Alfattani, S.; Jaafar, W.; Hmamouche, Y.; Yanikomeroglu, H.; Yongaçoglu, A.; Đào, N.D.; Zhu, P. Aerial platforms with reconfigurable smart surfaces for 5G and beyond. IEEE Commun. Mag. 2021, 59, 96–102.
32. Wyner, A.D. The wire-tap channel. Bell Syst. Tech. J. 1975, 54, 1355–1387.
33. Lange, K. Optimization (Chapter 8); Springer Science & Business Media: New York, NY, USA, 2013; Volume 95, pp. 185–219.
34. Peng, Z.; Zhang, Z.; Pan, C.; Li, L.; Swindlehurst, A.L. Multiuser full-duplex two-way communications via intelligent reflecting surface. IEEE Trans. Signal Process. 2021, 69, 837–851.
35. Yu, X.; Xu, D.; Schober, R. Enabling Secure Wireless Communications via Intelligent Reflecting Surfaces. In Proceedings of the IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA, 9–13 December 2019; pp. 1–6.

Figure 1. The system model of AIRS-assisted secure communication systems.

Figure 2. The actor network architecture with four hidden layers.

Figure 3. The critic network architecture with four hidden layers.

Figure 4. Convergence performance of the proposed iteration algorithm.

Figure 5. Secrecy rate versus the transmit power at Alice, in which the number of reflecting elements is set to $M = [64, 81]$.

Figure 6. The average secrecy rate performance relative to the number of antennas at Alice, where the transmit power is set to $P_{s} = [50, 100]$ mW.

Figure 7. Three different AIRS trajectories in $N = 20$ time slots are given, including the straight trajectory, the polyline trajectory, and the designed optimal trajectory with $P_{s} = 100$ mW.

Table 1. Related work on IRS and UAV assisted communication.

Reference	Scenario	Description	Design Objective
[15,16]	Secure communication in MISO system	TIRS-assisted secure communication between multi-antenna access point and a ground user	Maximize the secrecy rate of the legitimate communication link
[17,18]	Secure transmission in MIMO system	TIRS-aided secure transmission from BS to multi-users	Maximize the secrecy rate
[19,20]	Secure transmission in mmWave/THz systems	TIRS-aided secure transmission of terrestrial network in mmWave/THz bands	Maximize the system secrecy rate
[22]	AIRS-aided cellular network	AIRS-assisted communication between BS and aerial users	Maximize mean signal-to-interference-noise ratio (SINR)
[23,24]	AIRS-assisted communication system	AIRS-enhanced terrestrial communication within a given area	Maximize the worst-case signal-to-noise ratio
[25]	Secure UAV communication system	TIRS-aided secure transmission from UAV to a terrestrial user	Maximize the average secrecy rate
[26]	Robust secure UAV communication system	TIRS-aided secure transmission in mmWave UAV system with imperfect CSI	Maximize the sum secrecy rate of all users
[27]	TIRS-aided UAV communication system	Multi-TIRSs aided communication from UAV to ground user	Maximize the received power at the ground user

Table 2. Description of states in the DDPG algorithm.

No.	State	Numeric Types	Dimension
1	transmit beamformer	complex vector	$N_{t}$
2	phase-shift vector of AIRS	complex vector	$M$
3	AIRS’s current speed	real scalar	1
4	location of Alice	real vector	3
5	location of Bob	real vector	3
6	location of Eve	real vector	3
7	starting position of AIRS	real vector	3
8	target position of AIRS	real vector	3
9	current location of AIRS	real vector	3

Table 3. Notations for parameters used in the neural network.

Parameter	Description	Parameter	Description
$a (n)$	action at n-th time slot	$s (n)$	state at n-th time slot
$π (s; μ^{π})$	current actor network	$π' (s; μ^{π'})$	target actor network
$Q (s, a; μ^{Q})$	current critic network	$Q' (s, a; μ^{Q'})$	target critic network
$d_{A T} [n]$	distance from $q [n]$ to target $q_{T}$	$r (n)$	reward function
$η_{r a t e}$	binary penalty indicator on $R_{s} [n]$	$γ$	discount factor
$η_{d i s t}$	binary penalty indicator on $d_{A T} [n]$	$N_{b}$	size of mini-batch

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
