Energy-Efficient Aerial STAR-RIS-Aided Computing Offloading and Content Caching for Wireless Sensor Networks

Yang, Xiaoping; Wang, Quanzeng; Yang, Bin; Cao, Xiaofang

doi:10.3390/s25020393

¹ College of Computer Science, Beijing University of Technology, Beijing 100124, China

² School of Business, Beijing Wuzi University, Beijing 101149, China

* Authors to whom correspondence should be addressed.

Sensors 2025, 25(2), 393; https://doi.org/10.3390/s25020393

Submission received: 31 October 2024 / Revised: 3 January 2025 / Accepted: 8 January 2025 / Published: 10 January 2025

(This article belongs to the Special Issue Recent Developments in Wireless Network Technology)

Abstract

Unmanned aerial vehicle (UAV)-based wireless sensor networks (WSNs) hold great promise for supporting ground-based sensors due to the mobility of UAVs and the ease of establishing line-of-sight links. UAV-based WSNs equipped with mobile edge computing (MEC) servers effectively mitigate challenges associated with long-distance transmission and the limited coverage of edge base stations (BSs), emerging as a powerful paradigm for both communication and computing services. Furthermore, incorporating simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) as passive relays significantly enhances the propagation environment and service quality of UAV-based WSNs. However, most existing studies place STAR-RISs in fixed positions, ignoring the flexibility of STAR-RISs. Some other studies equip UAVs with STAR-RISs, and UAVs act as flight carriers, ignoring the computing and caching capabilities of UAVs. To address these limitations, we propose an energy-efficient aerial STAR-RIS-aided computing offloading and content caching framework, where we formulate an energy consumption minimization problem to jointly optimize content caching decisions, computing offloading decisions, UAV hovering positions, and STAR-RIS passive beamforming. Given the non-convex nature of this problem, we decompose it into a content caching decision subproblem, a computing offloading decision subproblem, a hovering position subproblem, and a STAR-RIS resource allocation subproblem. We propose a deep reinforcement learning (DRL)–successive convex approximation (SCA) combined algorithm to iteratively achieve near-optimal solutions with low complexity. The numerical results demonstrate that the proposed framework effectively utilizes resources in UAV-based WSNs and significantly reduces overall system energy consumption.

Keywords:

unmanned aerial vehicle; wireless sensor networks; simultaneously transmitting and reflecting reconfigurable intelligent surface; computing offloading; content caching

1. Introduction

Wireless sensor networks (WSNs), comprising a large number of sensor nodes, can transmit large volumes of data with high efficiency [1,2]. Their compactness, cost-effectiveness, and ease of deployment make WSNs highly effective for a wide range of real-time applications. With the rapid development of WSNs, the explosive growth of sensor devices has intensified the demand for high data rates and ultra-low-latency services [3]. Traditional cloud computing paradigms struggle to meet the diverse service requirements of delay-sensitive and computing-intensive tasks [4]. Recently, mobile edge computing (MEC), which provides computing and caching services for nearby sensors, has enabled local task offloading, avoiding the need to send data to distant cloud centers and thus reducing costs [5]. This has facilitated the transition of WSN devices to large-scale deployment, allowing for the real-time monitoring of environmental conditions to offer essential insights for urban planning and management. Moreover, by deploying MEC servers at access points (APs) or base stations (BSs), it is possible not only to cache popular content on the edge cloud to reduce the delay and energy consumption of sensor content requests but also to offload tasks requiring computation [6]. However, traditional MEC servers, typically deployed at fixed base stations, often suffer from limited coverage and from non-line-of-sight (NLoS) transmission, which degrades signal quality and reduces overall service efficiency and performance [7].

In recent years, unmanned aerial vehicles (UAVs) have gained widespread application across various industries due to their mobility, low operational cost, and ability to easily establish line-of-sight (LoS) communication links [8]. These unique characteristics enable UAVs to effectively address challenges inherent to traditional communication systems, such as fixed deployment locations, high infrastructure costs, and limited adaptability to specialized scenarios [9]. Furthermore, equipped with MEC servers, UAVs can not only provide computing and caching services to sensors but also function as aerial relays to offload tasks to other nodes, thereby significantly enhancing the flexibility and efficiency of network services [10,11,12]. However, UAVs have limited computing, caching, and endurance capabilities, so low-power solutions are crucial to improving UAV network performance [13].

A recently proposed, promising approach to reducing UAV energy consumption is to deploy simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) as an alternative to UAVs for signal relaying [14]. Each element of a STAR-RIS, capable of supporting both electric and magnetic currents, can simultaneously reconfigure transmitted and reflected signals, thereby achieving full-space coverage [15,16]. However, most existing studies in this field assume that STAR-RISs are deployed in a fixed position [17], with UAVs primarily offering computing [18,19] or caching capabilities [20,21,22]. The fixed deployment of STAR-RISs limits their ability to flexibly adjust the distance between themselves and the sensors, thereby degrading task offloading transmission performance. Other studies have proposed mounting STAR-RISs on UAVs but assume that the UAVs have no computing and caching resources [23,24]. Consequently, UAVs serve only as flight carriers, leaving their potential computing and caching capabilities underutilized. This significantly reduces resource utilization and overall task processing efficiency in UAV-based WSNs.

While significant progress has been made in this field, several critical research gaps remain unaddressed. First, the fixed deployment of STAR-RISs restricts their adaptability to dynamic WSN environments, leading to suboptimal task offloading transmission performance. Second, while some studies equip UAVs with STAR-RISs, they overlook UAVs’ inherent computing and caching capabilities, resulting in the underutilization of WSN resources and reduced system efficiency. Third, existing research predominantly lacks a comprehensive joint optimization framework that simultaneously considers caching decisions, offloading strategies, UAV hovering positions, and STAR-RIS passive beamforming. These limitations hinder the ability to fully capitalize on the dynamic and heterogeneous resources of UAV-based WSNs. To address these limitations, we propose a novel STAR-RIS-assisted computing offloading and content caching framework to minimize system energy consumption for UAV-based WSNs. In this context, UAVs have computing and caching capabilities, which can greatly enhance task processing efficiency and response speed. Subsequently, given the limited energy capacity of UAVs, we optimize their energy utilization by separating the relay functionality. To this end, a STAR-RIS, as a passive relay, is introduced to assist the sensor nodes in forwarding tasks that can be reasonably allocated to UAV-based WSN resources. Furthermore, by installing a STAR-RIS on the UAV, the system can dynamically adjust the relative positioning between the STAR-RIS and the sensors, further enhancing efficiency in task transmission and processing. However, several key challenges must still be addressed to fully achieve this. First, traditional static caching strategies are not suitable for the dynamic characteristics of UAV-based WSNs. Therefore, it is essential to design effective caching strategies that ensure fast responses to sensor requests. 
Second, to use resources efficiently and avoid single-point overload, tasks must be offloaded dynamically among the edge cloud, UAVs, and sensors. Third, because UAV endurance is limited, using a STAR-RIS as the relay node can save the energy that the UAV would otherwise spend forwarding tasks, provided that the signal transmission and reflection paths are optimized. To further improve the endurance of the UAV system, it is therefore crucial to design an effective transmission and reflection coefficient matrix for the STAR-RIS. Finally, UAVs equipped with STAR-RISs not only cache and process sensor tasks but also serve as relay nodes, providing additional communication links between sensors and the edge cloud. Therefore, it is essential to jointly optimize caching decisions, offloading decisions, UAV hovering positions, and passive beamforming to minimize overall system energy consumption.

Tackling these challenges, the main contributions of this paper can be summarized as follows:

(1): We propose a novel aerial STAR-RIS-aided computing offloading and content caching framework to minimize system energy consumption for WSNs. This framework leverages flexible deployment and caching, computing, and communication (3C) resources to offer adaptive computation and caching services. Additionally, a STAR-RIS is introduced as a passive relay to assist sensors in forwarding tasks, which can reasonably allocate UAV-based WSN resources. Lastly, by installing a STAR-RIS on a UAV, the system can flexibly adjust the position between the STAR-RIS and the sensors to improve task transmission performance.
(2): Since the energy consumption minimization problem is non-convex, we decompose it into four subproblems: content caching decision, computation offloading decision, UAV hovering position, and STAR-RIS resource allocation. For the content caching decision subproblem, the network caching decisions are optimized with a new deep reinforcement learning (DRL) algorithm. For the other subproblems, we use the Karush–Kuhn–Tucker (KKT) conditions and the successive convex approximation (SCA) algorithm to iteratively solve and optimize system energy consumption.
(3): The numerical results demonstrate that the proposed STAR-RIS-aided computing offloading and content caching framework significantly reduces system energy consumption in UAV-based WSNs compared with the benchmarks, especially in scenarios with limited network resources or adverse channel conditions.

The rest of this paper is organized as follows: We first briefly review the related works of this paper in Section 2 and then give the overview and mathematical description of the system model in Section 3. In Section 4 and Section 5, the optimization algorithm and iterative solution process for the proposed model are introduced. In Section 6, we discuss the convergence and complexity of the proposed DRL-SCA algorithm. Simulation environments are presented, and the results are discussed in Section 7, followed by conclusions in Section 8.

2. Related Works

In this section, we present related works on computation offloading and content caching in three key aspects: MEC, UAVs, and STAR-RIS-aided UAVs. The three subsections are in a progressive relationship, from MEC to UAVs and then to STAR-RIS-aided UAVs, which helps to identify gaps in existing research and gradually demonstrates the superiority of the proposed framework.

2.1. Computing Offloading and Content Caching in MEC

In the context of MEC, the key element is the edge server, which provides computing resources, caching capability, and connectivity [25]. When computing tasks are offloaded to MEC nodes, content caching can effectively reduce the latency and bandwidth cost of acquiring and initializing applications [26,27]. Several studies have explored the joint optimization of content caching and computing offloading in MEC [5,19,26]. Liao et al. [5] studied a joint service caching and task offloading problem in a multi-user, multi-BS, cloud-based MEC system to optimize execution delay, energy consumption, total benefit, and task offloading rate. Yu et al. [19] proposed optimal joint service caching and task offloading strategies to minimize the overall execution delay at the mobile terminal side in a MEC scenario. Bi et al. [26] minimized user computation delay and energy consumption by jointly optimizing resource allocation, computing offloading, and content caching decisions in MEC networks.

2.2. Computing Offloading and Content Caching in UAVs

Although MEC can provide low-cost computing and caching services for nearby sensors, MEC is typically deployed at fixed BSs, which limits its adaptability to sensors’ demands, coverage, and service quality [5,7]. To overcome these limitations, UAV communication networks have become crucial due to their high mobility, strong LoS links, and fast deployment capabilities [28]. The UAVs equipped with MEC servers can plan their hovering positions or flight trajectories strategically, offering 3C resources dynamically [20]. Several studies have investigated the joint optimization of computing offloading and content caching in UAVs [21,22]. Zhao et al. [21] studied a UAV-enabled MEC network where UAVs equipped with caching and computation capabilities collaborate with ground BSs to fulfill sensor requests, enhancing communication resources for data transmission. Huang et al. [22] proposed a UAV-assisted Internet of Vehicles (IoV) framework, where both UAVs and BSs provide computing and caching services for smart vehicles, to minimize average task processing delay and maximize the UAV cache hit ratio.

2.3. Computing Offloading and Content Caching in STAR-RIS-Aided UAVs

Although UAVs can provide flexible computing and caching services, it is difficult for them to meet the demands of high-energy-consuming tasks because of their limited endurance [13,17,29]. Therefore, low-power solutions are essential to improving the performance of UAV-based WSNs. Fortunately, a promising approach to reducing UAV energy consumption is the deployment of STAR-RISs as an alternative to UAVs for signal relaying [23,30]. Aung et al. [23] deployed STAR-RISs on UAVs and minimized energy consumption for IoT devices and aerial STAR-RISs by jointly optimizing task offloading, aerial STAR-RIS trajectory, amplitude, phase shift coefficients, and transmitting power in a STAR-RIS-aided MEC network. Zhang et al. [30] maximized the sum rate of all users by jointly optimizing the STAR-RISs’ beamforming vectors, the UAVs’ trajectory, and the power allocation in a STAR-RIS-assisted UAV communication system.

However, most studies assume that STAR-RISs are fixed and UAVs only provide computing or caching capabilities. Since STAR-RISs cannot flexibly adjust the distance between themselves and the sensor, task offloading transmission performance is reduced [30]. Some other studies assumed that STAR-RISs are mounted on UAVs but UAVs do not have computing and caching resources [23]. Since a UAV can only serve as a flight carrier, the computing and caching service resources of UAV-based WSNs cannot be utilized, which significantly reduces resource utilization and overall task processing efficiency. To overcome these limitations, we propose a novel energy-efficient aerial STAR-RIS-assisted computing offloading and content caching framework, which installs a STAR-RIS on a UAV, fully leveraging UAV-based WSNs’ computational and caching capacities for improved resource utilization and task execution efficiency. In this framework, we formulate a problem to minimize system energy consumption by jointly optimizing the content caching decisions, computing offloading decisions, UAV hovering positions, and STAR-RIS passive beamforming.

3. System Model and Problem Formulation

In this section, we provide a comprehensive description of the network overview, communication model, computation model, caching model, and energy consumption model for the proposed aerial STAR-RIS-aided WSN. We also formulate a system energy consumption minimization optimization problem. The notation is summarized in Table 1.

3.1. Network Overview

As shown in Figure 1, the considered aerial STAR-RIS-aided WSN consists of an edge cloud c, a UAV u equipped with computing and caching servers and a STAR-RIS, and many single-antenna sensors. UAV u serves the sensors, the set of which is denoted by

M = {1, 2, \dots, m, \dots, M}

, and the set of tasks is denoted by

K = {1, 2, \dots, k, \dots, K}

. Moreover, we assume that content popularity follows the Zipf distribution. We consider that edge cloud c and UAV u have limited caching and computing capacities and that the sensors only have limited computing capacities. The UAV has mobility and can adjust its position relative to the connected sensors to improve the quality of signal transmission. In this network, UAV u is connected to edge cloud c and to the sensors by wireless links. In particular, the STAR-RIS consists of

M_{c}

×

M_{r}

passive units, which are arranged as a uniform planar array (UPA). Each column and row of the UPA has

M_{c}

and

M_{r}

passive units, respectively [31]. The

(m_{c}, m_{r})

-th STAR-RIS element is used to represent the element of the

m_{r}

-th row and

m_{c}

-th column of the STAR-RIS. We assume that the direct communication links between the sensors and edge cloud c are blocked and that the STAR-RIS elements work in energy-splitting (ES) mode [32]. In the proposed STAR-RIS-aided UAV system, the sensor m can process the partial tasks; then, UAV u can partially process the remaining tasks uploaded by the sensors and return the results to the sensors. At the same time, the other remaining tasks are forwarded to edge cloud c for processing through the STAR-RIS elements, and the results are returned to the sensors. In this paper, the main focus is on minimizing system energy consumption in the STAR-RIS-aided UAV system by jointly optimizing the content caching decisions, computing offloading decisions, UAV hovering positions, and STAR-RIS passive beamforming.
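The Zipf popularity assumption stated above can be sketched in a few lines. This is only an illustration: the function name `zipf_popularity` and the skew value 0.8 are assumptions made here, since this section does not state the Zipf exponent.

```python
import numpy as np

def zipf_popularity(num_tasks: int, skew: float = 0.8) -> np.ndarray:
    """Request probability of tasks ranked 1..num_tasks under a Zipf law.

    The skew exponent is a hypothetical value; the section does not give one.
    """
    ranks = np.arange(1, num_tasks + 1, dtype=float)
    weights = ranks ** (-skew)          # popularity decays with rank
    return weights / weights.sum()      # normalize to a probability vector

# Hypothetical example with K = 5 tasks.
popularity = zipf_popularity(num_tasks=5, skew=0.8)
```

A larger skew concentrates requests on the few most popular tasks, which is the regime where caching those tasks at UAV u or edge cloud c pays off most.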

As illustrated in Figure 2, we use virtual reality (VR) as a typical application scenario. The sensors need to process a variety of VR application tasks, such as object tracking, object identification, and scene rendering. Each task has a different data size and requires a different amount of computing capacity. For instance, object-tracking tasks require a large amount of data to be transmitted, while object identification tasks and scene-rendering tasks require greater computing resources. Therefore, we adopt two parameters to model heterogeneous computation tasks. For computation task k, we define

F_{k} = \{ω_{k}, L_{k}\}

, where

ω_{k}

represents the computing resources required to complete the task and

L_{k}

represents the data size of the task, i.e., the size of the data that need to be transmitted to UAV u or edge cloud c.

When the computing task is not cached in the system, we consider the partial offloading scheme for delay-sensitive computation tasks in STAR-RIS-aided UAV systems. This kind of computation offloading model allows tasks to be calculated in parallel at the sensors, UAV u, and edge cloud c. The tasks processed at the sensors are referred to as local tasks, the tasks processed at UAV u are referred to as UAV-offloading tasks, and the tasks that are offloaded to edge cloud c are called edge cloud-offloading tasks. Figure 3 presents the time allocation for task processing in the STAR-RIS-aided UAV system, where the sensors utilize the same resource block with duration

T_{m}^{k, max}

to transmit and compute tasks.

In the local execution phase, the sensors process their tasks by the local computing servers. In the UAV task offloading phase, some of the remaining tasks are uploaded to UAV u and are processed by the computing server of UAV u. In the edge cloud task offloading phase, some of the remaining tasks are forwarded to edge cloud c through the STAR-RIS elements for processing. When the tasks are completed, the computing results obtained at both UAV u and edge cloud c are returned to the sensors. In downlink communication, since UAV u and edge cloud c tend to have high transmit power and the computing results are usually of small size, the downloading time is comparatively negligible in the UAV task offloading phase and edge cloud task offloading phase.

3.2. Communication Model

This subsection introduces the communication model and gives the uplink data rate when the sensor offloads tasks to UAV u and edge cloud c. We assume that the sensor does not move while a task is being offloaded. A 3D Cartesian coordinate system is established to describe the locations of the sensors, UAV u, STAR-RIS, and edge cloud c. The locations of sensor m and edge cloud c are described by vector

r_{m} = (x_{m}, y_{m}, 0)

and vector

r_{c} = (x_{c}, y_{c}, 0)

. We assume that UAV u is hovering at a fixed position in every slot to provide computing services for the sensors [33]. The position of UAV u is

r_{u} = (x_{u}, y_{u}, h_{u})

, and the position of the

(m_{c}, m_{r})

-th STAR-RIS element is

r_{(m_{c}, m_{r})} = (x_{(m_{c}, m_{r})}, y_{(m_{c}, m_{r})}, h_{(m_{c}, m_{r})})

.

Due to the high probability of LoS links in UAV communication, the communication channels between sensor m and UAV u, between sensor m and the

(m_{c}, m_{r})

-th STAR-RIS element, and between the

(m_{c}, m_{r})

-th STAR-RIS element and edge cloud c are assumed to be LoS links, all following the free-space path loss model [34]. So, the channel gain between node n and another node

n^{'}

can be formulated as follows:

\begin{matrix} h_{n, n^{'}} = & \sqrt{g_{0}} d_{n, n^{'}}^{- 2}, \end{matrix}

(1)

where

n \neq n^{'}

,

n \in \{m, (m_{c}, m_{r})\}

, and

n^{'} \in \{u, (m_{c}, m_{r}), c\}

.

g_{0}

is the received power at a distance of 1 m for a transmission power of 1 W and

d_{n, n^{'}} = \sqrt{{(x_{n} - x_{n^{'}})}^{2} + {(y_{n} - y_{n^{'}})}^{2} + {(h_{n} - h_{n^{'}})}^{2}}

. Since no interference term is considered, the signal-to-noise ratio (SNR) of the wireless link from node n to another node

n^{'}

, denoted by

γ_{n, n^{'}}

, can be expressed as

γ_{n, n^{'}} = \frac{P_{n} {| h_{n, n^{'}} |}^{2}}{σ_{n, n^{'}}^{2}},

(2)

where

P_{n}

is the transmit power of node n. We assume that the noise power has constant variance

σ_{n, n^{'}}^{2}

. Therefore, the transmission rate from node n to node

n^{'}

can be given by

R_{n, n^{'}} = B_{n, n^{'}} {log}_{2} (1 + \frac{P_{n} {| h_{n, n^{'}} |}^{2}}{σ_{n, n^{'}}^{2}}),

(3)

where

B_{n, n^{'}}

is the transmission bandwidth of the wireless link from node n to node

n^{'}

.
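Equations (1)–(3) can be chained into a short numerical sketch. The function names and all parameter values below (reference gain, positions, transmit power, noise power, bandwidth) are hypothetical and chosen only for illustration; the exponent in `channel_gain` follows Equation (1) exactly as written.

```python
import math

def channel_gain(g0: float, pos_a, pos_b) -> float:
    """Eq. (1) as written: h = sqrt(g0) * d^(-2), with d the Euclidean distance."""
    d = math.dist(pos_a, pos_b)
    return math.sqrt(g0) * d ** -2

def snr(p_tx: float, h: float, noise_power: float) -> float:
    """Eq. (2): received signal power over noise power."""
    return p_tx * abs(h) ** 2 / noise_power

def rate(bandwidth_hz: float, p_tx: float, h: float, noise_power: float) -> float:
    """Eq. (3): achievable rate in bit/s."""
    return bandwidth_hz * math.log2(1.0 + snr(p_tx, h, noise_power))

# Hypothetical setup: sensor on the ground, UAV hovering 100 m above it.
sensor = (0.0, 0.0, 0.0)
uav = (0.0, 0.0, 100.0)
h_mu = channel_gain(1e-3, sensor, uav)       # g0 = 1e-3 is an assumed value
gamma_mu = snr(0.1, h_mu, 1e-13)             # P_m = 0.1 W, noise power 1e-13 W
rate_mu = rate(1e6, 0.1, h_mu, 1e-13)        # B = 1 MHz
```

Because the rate depends on the hovering position only through the distance, moving the UAV closer to a sensor raises that sensor's rate, which is what the hovering position subproblem exploits.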

3.2.1. Channel in UAV Task Offloading

UAV u equipped with the MEC server has computing resources, which allows the sensors to transfer some tasks to UAV u for processing. The signal received by UAV u can be written as

z_{u} = \sum_{m \in M} h_{m, u} \sqrt{P_{m}} s_{m} + n_{u},

(4)

where

s_{m}

is the corresponding signal with

E \{| s_{m} |^{2}\} = 1

[35].

n_{u}

is the additive white Gaussian noise (AWGN) received by UAV u. The noise power is

σ_{u}^{2}

, i.e.,

n_{u} = n \sim N (0, σ_{u}^{2})

.

According to (2) and (3), the SNR and transmission rate of the link from sensor m to UAV u are denoted by

γ_{m, u}

and

R_{m, u}

.

3.2.2. Channel in Edge Cloud Task Offloading

Subject to its energy limitations and task time constraints, UAV u can only perform part of its received remaining tasks. The remaining tasks are forwarded to edge cloud c for processing through the STAR-RIS. In order to better distinguish the channel gain from sensor m to the

(m_{c}, m_{r})

-th STAR-RIS element, the channel gain from the

(m_{c}, m_{r})

-th STAR-RIS element to edge cloud c is expressed as

g_{(m_{c}, m_{r}), c}

. Therefore, the channel gains from sensor m to the STAR-RIS and from the STAR-RIS to edge cloud c are denoted by

H_{m, s} \in C^{M_{c} M_{r} \times 1}

and

G_{s, c}^{H} \in C^{1 \times M_{c} M_{r}}

, respectively.

Θ_{a} = diag [\sqrt{β_{(1, 1)}^{a}} e^{j θ_{(1, 1)}^{a}}, \sqrt{β_{(1, 2)}^{a}} e^{j θ_{(1, 2)}^{a}}, \dots, \sqrt{β_{(m_{c}, m_{r})}^{a}} e^{j θ_{(m_{c}, m_{r})}^{a}}, \dots, \sqrt{β_{(M_{c}, M_{r})}^{a}} e^{j θ_{(M_{c}, M_{r})}^{a}}]

is the transmission or reflection coefficient matrix of the STAR-RIS for the incident signal from sensor m, where

\sqrt{β_{(m_{c}, m_{r})}^{a}} \in [0, 1]

and

θ_{(m_{c}, m_{r})}^{a} \in [0, 2 π)

denote the amplitude and phase shift of the

(m_{c}, m_{r})

-th STAR-RIS element for the sensor’s signal, respectively, and

a \in \{r, t\}

. Let

Θ_{a}

be the reflection

(a = r)

or transmission

(a = t)

beamforming vectors. In consequence, the following constraint is required when the STAR-RIS is in ES mode:

\sum_{a \in \{r, t\}} β_{(m_{c}, m_{r})}^{a} \leq 1 .

(5)

Hence, the signal received at edge cloud c can be obtained as

z_{c} = \sum_{m \in M} G_{s, c}^{H} Θ_{a} H_{m, s} \sqrt{P_{m}} s_{m} + n_{c},

(6)

where

n_{c}

is the AWGN received at edge cloud c with variance

σ_{c}^{2}

, i.e.,

n_{c} = n \sim N (0, σ_{c}^{2})

. If sensor m is in the transmission region, then

a = t

; otherwise, if sensor m is in the reflection region,

a = r

.

Similarly, the SNR and transmission rate of the link from sensor m to edge cloud c via the STAR-RIS are denoted by

γ_{m, c}

and

R_{m, c}

, respectively, where the channel gain from sensor m via the STAR-RIS to edge cloud c is

G_{s, c}^{H} Θ_{a} H_{m, s}

.
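The cascaded channel through the STAR-RIS and the energy-splitting constraint (5) can be illustrated with a small NumPy sketch. The helper names, the random channel vectors, and the equal energy split are assumptions made for the example; the phase alignment used here is a standard way to maximize a single cascaded link and is not claimed to be the optimization used in the paper.

```python
import numpy as np

def es_coefficients(beta_t, theta_t, theta_r):
    """Build diagonal transmission/reflection matrices in ES mode.

    Assumes beta_r = 1 - beta_t, i.e. constraint (5) is met with equality.
    """
    beta_t = np.asarray(beta_t, dtype=float)
    beta_r = 1.0 - beta_t
    Theta_t = np.diag(np.sqrt(beta_t) * np.exp(1j * np.asarray(theta_t)))
    Theta_r = np.diag(np.sqrt(beta_r) * np.exp(1j * np.asarray(theta_r)))
    return Theta_t, Theta_r

def cascaded_gain(G, Theta, H):
    """Effective channel G^H Theta H for length-N complex vectors G and H."""
    return np.vdot(G, Theta @ H)   # vdot conjugates its first argument

# Hypothetical 4-element surface with random complex channels.
rng = np.random.default_rng(0)
n = 4
H = rng.normal(size=n) + 1j * rng.normal(size=n)   # sensor -> STAR-RIS
G = rng.normal(size=n) + 1j * rng.normal(size=n)   # STAR-RIS -> edge cloud
aligned = -np.angle(np.conj(G) * H)                # phases that co-phase every path
Theta_t, Theta_r = es_coefficients(np.full(n, 0.5), aligned, aligned)
gain_r = cascaded_gain(G, Theta_r, H)
```

With co-phased elements, all N paths add constructively, so the effective gain equals the amplitude factor times the sum of the per-element products |G_i||H_i|.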

We do not take packet loss and downlink transmission delay into account in this paper, because the downlink transmission rate is higher than the uplink transmission rate and the size of the data after task processing is much smaller than that of the data before processing.

3.3. Computation Model

In this study, we consider a divisible computation task, allowing it to be segmented into multiple parts. Taking video analysis as an example, a large video file containing numerous frames can be divided into several video clips through segmentation. This enables a portion of the clips to be initially processed locally at the sensor, while others are handled by UAV u, and the remaining clips are offloaded to edge cloud c. Detailed explanations will be provided in subsequent sections. Additionally, we disregard the delay associated with transmitting the processed task results from the UAV and edge cloud back to the sensors, as the output data size is significantly smaller compared with that of the input data.

We define the integer caching decision variable,

X_{j^{'}}^{k} \in {0, 1}

(j^{'} \in {u, c})

, which indicates whether task k is cached at node

j^{'}

(

X_{j^{'}}^{k}

= 1) or not (

X_{j^{'}}^{k}

= 0). Therefore, the task caching strategy can be represented as follows:

X = \{X_{j^{'}}^{1}, X_{j^{'}}^{2}, \dots, X_{j^{'}}^{k}\}

. For the task k offloading problem of sensor m, we define decision variable

α_{m, j}^{k} \in [0, 1]

,

j \in {m, u, c}

.

α_{m, j}^{k} \in [0, 1]

is the task offloading ratio of sensor m at node j. (In this paper, we assume that sensor m initiates a task request to node j. Node j senses the information of the task (task size and time constraints) and then makes a task offloading decision based on its own computing capability. We call the above process the sensing phase, and since this time is very short, we ignore this time.) Consequently, the following is a representation of the task offloading policy:

α = \{α_{m, m}^{k}, α_{m, u}^{k}, α_{m, c}^{k}\}

,

\sum_{j \in {m, u, c}} α_{m, j}^{k} = 1

.

Service delay is the overall service time for task k when task k is not cached, consisting of two parts: (i) uplink communication delay; (ii) computation delay. In downlink communication, since the UAV and edge cloud tend to have high transmit power and the computing results are usually of small size, the downloading time is also negligible. Next, we discuss the delay and corresponding energy consumption.

3.3.1. Energy Consumption for Uplink Communication

With

X_{j^{'}}^{k} = 0

indicating that task k is not cached and

α_{m, j^{'}}^{k}

denoting the offloading decision variable, the uplink transmission delay of offloading task k from sensor m to node

j^{'}

can be expressed as

T_{m, j^{'}}^{t r, k} = \frac{α_{m, j^{'}}^{k} L_{k}}{R_{m, j^{'}}} .

(7)

Therefore, when task k is not cached, the total uplink communication energy consumption for sensor m offloading task k can be calculated as

\begin{matrix} E_{m}^{t r, k} = & \sum_{j^{'} \in {u, c}} P_{m} T_{m, j^{'}}^{t r, k} . \end{matrix}

(8)
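As a worked illustration of Equations (7) and (8), the following sketch computes the uplink delay per destination and the resulting transmission energy of a sensor. The task size, offloading ratios, rates, and transmit power are hypothetical numbers.

```python
import math

def uplink_delay(alpha: float, data_bits: float, rate_bps: float) -> float:
    """Eq. (7): time to upload the offloaded share of task k."""
    return alpha * data_bits / rate_bps

def uplink_energy(p_tx: float, alphas: dict, data_bits: float, rates: dict) -> float:
    """Eq. (8): sensor transmit energy summed over destinations u and c."""
    return sum(p_tx * uplink_delay(alphas[j], data_bits, rates[j]) for j in ("u", "c"))

# Hypothetical task: 1 Mbit, 30% offloaded to the UAV and 20% to the edge cloud.
alphas = {"u": 0.3, "c": 0.2}
rates = {"u": 3e6, "c": 2e6}          # bit/s, assumed values
t_u = uplink_delay(alphas["u"], 1e6, rates["u"])
e_tr = uplink_energy(0.1, alphas, 1e6, rates)   # P_m = 0.1 W
```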

3.3.2. Energy Consumption for Computation

With

X_{j^{'}}^{k} = 0

indicating that task k is not cached and

α_{m, j}^{k}

denoting the offloading decision variable, the computation delay for computing offloading task k at node j can be expressed as

\begin{matrix} T_{m, j}^{c o m, k} = \frac{α_{m, j}^{k} ω_{k}}{f_{j}}, \end{matrix}

(9)

where

ω_{k}

is the number of required computation resources for task k, i.e., the number of CPU cycles required for computing 1-bit task data.

f_{j}

is the computing capability (CPU cycles per second) of node j.

Therefore, the total computing energy consumption of sensor m for computing task request k can be expressed as

\begin{matrix} E_{m}^{c o m, k} & = \sum_{j \in {m, u, c}} E_{m, j}^{c o m, k} \\ & = \sum_{j \in {m, u, c}} κ_{j} T_{m, j}^{c o m, k} f_{j}^{3}, \end{matrix}

(10)

where

κ_{j}

is the effective capacitance coefficient of node j that depends on the processor’s chip architecture.
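Equations (9) and (10) can be checked numerically with the sketch below. It follows Equation (9) as written, so `omega` is treated as the total CPU cycles of the task; the CPU frequency and capacitance coefficient are assumed values.

```python
import math

def compute_delay(alpha: float, omega: float, f: float) -> float:
    """Eq. (9): processing time of the share alpha of task k at a node with speed f."""
    return alpha * omega / f

def compute_energy(kappa: float, alpha: float, omega: float, f: float) -> float:
    """One term of Eq. (10): kappa * T * f^3, which equals kappa * alpha * omega * f^2."""
    return kappa * compute_delay(alpha, omega, f) * f ** 3

# Hypothetical local share: half of a 1e9-cycle task on a 1 GHz sensor CPU.
t_local = compute_delay(0.5, 1e9, 1e9)
e_local = compute_energy(1e-27, 0.5, 1e9, 1e9)   # kappa = 1e-27 is an assumed value
```

Since the energy grows with the square of the CPU frequency while the delay falls only linearly, shifting load to a faster node shortens the delay but can raise the total energy, which is the trade-off the offloading ratios balance.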

3.4. Caching Model

In this subsection, we describe the caching model. Task caching involves storing completed tasks and associated data within UAV u or edge cloud c. Specifically, an independent resource container is maintained on UAV u or edge cloud c. The caching process operates as follows: First, sensors send a computing task request. If UAV u or edge cloud c has already cached this task, the respective node notifies the sensors of its availability on the caching servers. Consequently, the sensor can avoid offloading the same task to UAV u or edge cloud c. Finally, after the task is processed by UAV u or edge cloud c, the results are sent back to the sensors. This caching mechanism reduces the need for redundant task offloading, thereby lowering sensor energy consumption and minimizing offloading delays.

Despite its benefits, task caching still faces many challenges: (i) although UAV u and edge cloud c have greater caching and computational capacities compared with sensors, they still cannot cache or handle all kinds of computation tasks; (ii) unlike traditional caching strategies, task caching requires the consideration of not only the data size and computational resources necessary for each task but also task popularity. Consequently, designing an effective caching strategy presents significant challenges. We introduce an integer caching decision variable,

X_{j^{'}}^{k} \in {0, 1}

, to indicate whether task k is cached at node

j^{'}

(

X_{j^{'}}^{k}

= 1) or not (

X_{j^{'}}^{k}

= 0). Therefore, the computation caching strategy can be defined as

X = \{X_{j^{'}}^{1}, X_{j^{'}}^{2}, \dots, X_{j^{'}}^{k}\}

. In this study, we evaluate the task duration and energy consumption on UAV u or edge cloud c in scenarios with and without task caching. For task caching (

X_{j^{'}}^{k}

= 1), the task duration, simplified to the processing delay, is denoted by

T_{m, j^{'}}^{c a c h e, k}

and can be expressed as

T_{m, j^{'}}^{c a c h e, k} = \frac{ω_{k}}{f_{j^{'}}} .

(11)

The primary energy consumption occurs within UAV u or edge cloud c, with sensors incurring no energy cost. Accordingly, the energy consumption associated with task caching can be formulated as

\begin{matrix} E_{m, j^{'}}^{c a c h e, k} = & κ_{j^{'}} T_{m, j^{'}}^{c a c h e, k} f_{j^{'}}^{3}, \end{matrix}

(12)

where

κ_{j^{'}}

is the effective capacitance coefficient of node

j^{'}

that depends on the processor’s chip architecture.
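As an illustrative sketch, Equations (11) and (12) can be evaluated numerically as follows. The function names and the numerical values in the comments are hypothetical and are chosen only for illustration. Substituting (11) into (12) shows that the caching energy reduces to the product of the capacitance coefficient, the required CPU cycles, and the squared CPU frequency.

```python
def cached_delay(omega_k: float, f_node: float) -> float:
    """Eq. (11): processing delay of a cached task, T = omega_k / f_{j'}."""
    return omega_k / f_node


def cached_energy(kappa: float, omega_k: float, f_node: float) -> float:
    """Eq. (12): E = kappa * T * f^3, i.e., kappa * omega_k * f^2."""
    return kappa * cached_delay(omega_k, f_node) * f_node ** 3


# Hypothetical example: a task of 1e9 CPU cycles on a 2 GHz node
# with an assumed capacitance coefficient of 1e-27.
example_delay = cached_delay(1e9, 2e9)
example_energy = cached_energy(1e-27, 1e9, 2e9)
```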

3.5. Problem Formulation

Based on the task communication, computation, and caching process mentioned above, the overall delay and energy consumption of sensor m from sending task request k to obtaining computing results are expressed as

\begin{matrix} T_{m}^{k} & = \sum_{j^{'} \in {u, c}} X_{j^{'}}^{k} T_{m, j^{'}}^{c a c h e, k} + (1 - \sum_{j^{'} \in {u, c}} X_{j^{'}}^{k}) max \{κ_{m} T_{m, m}^{c o m, k} f_{m}^{3}, max_{j^{'} \in {u, c}} \{T_{m, j^{'}}^{t r, k} + T_{m, j^{'}}^{c o m, k}\}\}; \end{matrix}

(13)

\begin{matrix} E_{m}^{k} & = \sum_{j^{'} \in {u, c}} X_{j^{'}}^{k} E_{m, j^{'}}^{c a c h e, k} + (1 - \sum_{j^{'} \in {u, c}} X_{j^{'}}^{k}) (E_{m}^{tr, k} + E_{m}^{c o m, k}) . \end{matrix}

(14)

In this paper, for task caching (

X_{j^{'}}^{k}

= 1), the primary energy consumption occurs within UAV u or edge cloud c, with sensors incurring no energy cost. The primary energy consumption, simplified to the processing energy consumption, is denoted by

E_{m, j^{'}}^{c a c h e, k}

. Without task caching (

X_{j^{'}}^{k}

= 0), the primary energy consumption includes the total uplink communication energy consumption for sensor m offloading task k and the total computing energy consumption of computing task request k. Therefore, we formulate a system energy consumption minimization problem by jointly optimizing caching decision

X = \{X_{j^{'}}^{1}, X_{j^{'}}^{2}, \dots, X_{j^{'}}^{k}\}

(j^{'} \in {u, c})

, offloading decision

α_{m, j}^{k} \in [0, 1]

(

j \in {m, u, c}

), hovering position of UAV u

r_{u}

, and passive beamforming

Θ

in the STAR-RIS-aided UAV system. According to (14), the optimization problem for minimizing the total energy consumption of the STAR-RIS-aided UAV system can be formulated as

\begin{matrix} P_{1} : & min_{\begin{matrix} X, α, r_{u}, Θ \end{matrix}} \sum_{m \in M} \sum_{k \in K} E_{m}^{k} \end{matrix}

(15)

\begin{matrix} s . t . & C_{1} : α_{m, j}^{k} \in [0, 1], \forall m, j \in \{m, u, c\}, \end{matrix}

(15a)

\begin{matrix} C_{2} : \sum_{j \in \{m, u, c\}} α_{m, j}^{k} = 1, \end{matrix}

(15b)

\begin{matrix} C_{3} : \sum_{a \in \{r, t\}} β_{(m_{c}, m_{r})}^{a} \leq 1, β_{(m_{c}, m_{r})}^{a} \in [0, 1], \forall a, \forall m_{c}, \forall m_{r}, \end{matrix}

(15c)

\begin{matrix} C_{4} : θ_{(m_{c}, m_{r})}^{a} \in [0, 2 π), \forall a, \forall m_{c}, \forall m_{r}, \end{matrix}

(15d)

\begin{matrix} C_{5} : x_{u} \leq x_{u}^{max}, y_{u} \leq y_{u}^{m a x}, \end{matrix}

(15e)

\begin{matrix} C_{6} : \sum_{k \in K} X_{j^{'}}^{k} S^{k} \leq O_{j^{'}}, \forall k, \forall j^{'}, \end{matrix}

(15f)

\begin{matrix} C_{7} : \sum_{k \in K} α_{m, j}^{k} ω_{k} \leq f_{j}, \forall k, \forall m, \forall j, \end{matrix}

(15g)

\begin{matrix} C_{8} : X_{j^{'}}^{k} \in {0, 1}, \forall k, \forall j^{'}, \end{matrix}

(15h)

\begin{matrix} C_{9} : T_{m}^{k} \leq T_{m}^{k, max}, \forall m, \forall k . \end{matrix}

(15i)

Constraint

C_{1}

ensures that the task offloading ratio at each sensor must be between 0 (indicating no offloading) and 1 (indicating full offloading), ensuring that the offloading task is appropriately distributed. Constraint

C_{2}

ensures that the sum of offloading ratios for task k across all offloading targets (sensors m, UAV u, and edge cloud c) is equal to 1. Constraint

C_{3}

governs the amplitude response of the

(m_{c}, m_{r})

-th STAR-RIS element, limiting it to a range of

[0, 1]

. Constraint

C_{4}

defines the phase shift of the

(m_{c}, m_{r})

-th STAR-RIS element, ensuring that it lies within the range

[0, 2 π)

. This is because the phase shift is a periodic quantity, and its effective range should be between 0 and

2 π

. Constraint

C_{5}

restricts the UAV’s coordinates,

x_{u}

and

y_{u}

, which must not exceed the maximum allowed values

x_{u}^{max}

and

y_{u}^{max}

, ensuring that the UAV operates within a defined area. Constraint

C_{6}

ensures that the total cached content at each node does not exceed its maximum caching capacity

O_{j^{'}}

. Constraint

C_{7}

ensures that the total computational resources allocated to handle tasks at each node do not exceed its maximum computation capacity

f_{j}

. Constraint

C_{8}

restricts the caching binary decision variables

X_{j^{'}}^{k}

to take the value of either 0 or 1, indicating whether or not a task is cached at a specific node. Constraint

C_{9}

ensures that the completion time of task k does not exceed the maximum tolerable time

T_{m}^{k, max}

for the task. This guarantees that all tasks are completed within the allowed time window, ensuring that delay requirements are met.
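A minimal sketch of how some of these constraints could be checked for a candidate solution is given below. It covers only the offloading constraints (C1 and C2) for one sensor and task pair and the caching constraints (C6 and C8) for one node. The function names, the dictionary layout, and the tolerance are assumptions made for illustration.

```python
from typing import Dict, List


def check_offloading(alpha: Dict[str, float], tol: float = 1e-9) -> bool:
    """C1 and C2: ratios over targets {m, u, c} lie in [0, 1] and sum to 1."""
    in_range = all(-tol <= a <= 1.0 + tol for a in alpha.values())
    return in_range and abs(sum(alpha.values()) - 1.0) <= tol


def check_caching(x: List[int], sizes: List[float], capacity: float) -> bool:
    """C6 and C8: binary decisions whose total cached size fits the node capacity."""
    if any(v not in (0, 1) for v in x):
        return False
    return sum(v * s for v, s in zip(x, sizes)) <= capacity
```

For example, the split {m: 0.2, u: 0.5, c: 0.3} passes the first check, whereas {m: 0.6, u: 0.6, c: 0.0} violates C2.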

Due to the different hovering positions of UAV u, the channel gain of the sensor-to-UAV channel link and the sensor-to-STAR-RIS-to-edge cloud channel link may differ, which in turn affects the transmission energy consumption. For ease of calculation, we assume that the

h_{u}

of UAV u is fixed [36]. We optimize the

x_{u}

and

y_{u}

of UAV u to minimize the total energy consumption of the system. Meanwhile, we assume that the coordinates of STAR-RIS are the same as those of UAV u.

As shown in Figure 4, we adopt an alternating optimization technique and decompose problem (15) into four subproblems:

(1): Content caching decision subproblem: Given $α = α^{0}$ , $r_{u} = r_{u}^{0}$ , and $Θ = Θ^{0}$ , i.e., when $α, r_{u},$ and $Θ$ are fixed, problem (15) optimizes the caching decision vector $X$ to minimize the total energy consumption of the system. We adopt the DRL algorithm to optimize the content caching decision, denoted by $X^{*}$ .
(2): Computing offloading decision subproblem: Given $X = X^{*}$ , $r_{u} = r_{u}^{0}$ , and $Θ = Θ^{0}$ , i.e., when $X, r_{u},$ and $Θ$ are fixed, problem (15) optimizes the offloading decision vector $α$ to minimize the total energy consumption of the system. We adopt the KKT conditions to obtain an optimal solution, denoted by $α^{*}$ .
(3): UAV hovering position subproblem: Given $X = X^{*}$ , $α = α^{*}$ , and $Θ = Θ^{0}$ , i.e., when $X, α,$ and $Θ$ are fixed, problem (15) optimizes the hovering position vector $r_{u}$ of UAV u to minimize the total energy consumption of the system. We adopt the SCA method to optimize the hovering position of UAV u, denoted by $r_{u}^{*}$ .
(4): STAR-RIS resource allocation subproblem: Given $X = X^{*}$ , $α = α^{*}$ , and $r_{u} = r_{u}^{*}$ , the transmission and reflection coefficient matrix $Θ$ is optimized to minimize the total energy consumption of the system. We adopt the SCA method to obtain an optimal solution, denoted by $Θ^{*}$ .

Figure 4. The proposed optimization framework of the energy consumption minimization problem.
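The four-step decomposition can be sketched as a generic alternating loop. In the sketch below, the four solver callables and the energy callable are placeholders supplied by the caller, and the stopping rule based on the change in energy is an assumption; the sketch does not reproduce the actual PPO, KKT, or SCA solvers.

```python
from typing import Any, Callable, Tuple


def alternating_optimization(
    init: Tuple[Any, Any, Any, Any],
    solve_caching: Callable,
    solve_offloading: Callable,
    solve_position: Callable,
    solve_beamforming: Callable,
    energy: Callable,
    max_iters: int = 50,
    tol: float = 1e-6,
):
    """Update X, alpha, r_u, and Theta in turn, each with the other three fixed."""
    X, alpha, r_u, theta = init
    prev = energy(X, alpha, r_u, theta)
    cur = prev
    for _ in range(max_iters):
        X = solve_caching(alpha, r_u, theta)       # subproblem (1)
        alpha = solve_offloading(X, r_u, theta)    # subproblem (2)
        r_u = solve_position(X, alpha, theta)      # subproblem (3)
        theta = solve_beamforming(X, alpha, r_u)   # subproblem (4)
        cur = energy(X, alpha, r_u, theta)
        if abs(prev - cur) <= tol:                 # assumed stopping rule
            break
        prev = cur
    return (X, alpha, r_u, theta), cur
```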

4. Content Caching Decision Optimization

Since content caching decision optimization is only related to

X

but independent of other variables in

P_{1}

, caching decision optimization can be solved in advance with given offloading decisions and the hovering position of the UAV, as well as the transmission and reflection coefficient matrix. The subproblem can be written as

\begin{matrix} P_{2} : & min_{\begin{matrix} X \end{matrix}} \sum_{m \in M} \sum_{k \in K} E_{m}^{k} \\ s . t . (C_{6}) - (C_{9}) \end{matrix}

(16)

Because

X

is a binary vector,

P_{2}

is still a mixed-integer nonlinear programming (MINLP) problem. Traditional optimization methods like SCA may not effectively handle such problems, especially in environments where the network state changes rapidly and unpredictably. The traditional binary-relax SCA approach first relaxes the binary variables

X

from the discrete space

\{0, 1\}

into the continuous space [0, 1] and then forces them to round back after the SCA-based iterations. Therefore, such “relaxation” may lead the solution to converge to a local minimum in real-time dynamic systems.

To address these challenges, we utilize the proximal policy optimization (PPO) algorithm, a DRL method. As shown in Figure 5, the PPO algorithm uses neural networks to model complex relationships between system states and actions, learning directly from interactions with the environment. By leveraging the PPO algorithm, caching decisions are dynamically adjusted based on real-time network conditions, such as caching state, content popularity, historical request access frequency, and network topology. The intelligent agent iteratively updates its caching strategy to maximize a reward function reflecting caching efficiency, like the cache hit rate, enabling near-optimal caching strategies that adapt to changing conditions and enhance overall system performance.

PPO is implemented within the actor–critic framework, comprising a policy network (actor) and a value network (critic). In this setup, the actor generates actions, while the critic evaluates them. A significant limitation of the basic actor–critic framework is its low sample efficiency, which requires extensive interactions with the environment to converge. To address this, PPO introduces two major contributions: mini-batch updates to improve data efficiency and a clipped surrogate loss to constrain policy updates. The PPO algorithm allows for a small difference between the target policy

π_{θ} (a_{t}^{c a} | s_{t}^{c a})

and the behavior policy

π_{θ_{old}} (a_{t}^{c a} | s_{t}^{c a})

, where

a_{t}^{c a}

and

s_{t}^{c a}

denote the action taken and the state observed at time t. This is achieved by using a clipping function that limits the extent of policy change. If the policy update exceeds a predefined threshold, the clipping function prevents further increase.

4.1. Intelligent Caching MDP Model

In the content caching decision subproblem for aerial STAR-RIS-aided WSNs, we leverage caching state, content popularity, historical request frequency, and network topology data to realize the optimal content caching decisions. The caching update model is formulated as a Markov decision process (MDP). In each time slot

t \in T

, the agent observes the current state,

s_{t}^{c a} \in S^{c a}

, and selects an action

a_{t}^{c a} \in A^{c a}

. Upon executing action

a_{t}^{c a}

, the agent receives an immediate reward

r_{t}^{c a}

and the environment transitions to the next state,

s_{t + 1}^{c a}

. The transition tuple

(s_{t}^{c a}, a_{t}^{c a}, r_{t}^{c a}, s_{t + 1}^{c a})

is stored in an experience replay buffer for agent training. To derive the optimal solution for problem

P_{2}

under heterogeneous scenarios, state space

s_{t}^{c a}

, action space

a_{t}^{c a}

, and reward function

r_{t}^{c a}

in the proposed intelligent caching MDP model are designed as follows:

(1): State: State $s_{t}^{c a}$ in slot t includes caching state information $M_{t}$ , content popularity $P_{t}$ , historical request access frequency $F_{t}$ , and network topology information $G_{t}$ . Thus, the state vector in slot t is expressed as

$\begin{matrix} s_{t}^{c a} = \{M_{t}, P_{t}, F_{t}, G_{t}\}, \end{matrix}$

(17)

where $M_{t}$ represents the content caching status across all caching nodes in time slot t, expressed as $M_{t} = \{M_{u, t}, M_{c, t}\}$ . Additionally, $F_{t} = \{F_{1, t}, \dots, F_{k, t}, \dots, F_{K, t}\}$ denotes the historical access frequencies for all requests.
(2): Action: During the process of caching decision, the optimal content caching decision, $a_{t}^{c a}$ , includes the cached content across all nodes for the upcoming time slot $t + 1$ . The expression for $a_{t}^{c a}$ is given by

$\begin{matrix} a_{t}^{c a} = & [C_{1, c, t + 1}, \dots, C_{k, c, t + 1}, \dots, C_{K, c, t + 1}, C_{1, u, t + 1}, \dots, C_{k, u, t + 1}, \dots, C_{K, u, t + 1}], \end{matrix}$

(18)

where $C_{k, c, t + 1}$ represents whether the content of task request k is cached in edge cloud c in time slot t + 1.
(3): Reward: The formulation of the reward function plays a crucial role in guiding the exploration of the caching update problem and ensuring algorithm convergence. Consequently, the reward function in time slot t is defined as follows:

$\begin{matrix} r_{t} = λ \sum_{k = 1}^{N_{s t}} R_{k} + (1 - λ) \sum_{j^{'} \in {u, c}} H_{t}^{j^{'}}, \end{matrix}$

(19)

where $N_{s t}$ is the maximum number of training steps, $λ$ is the weight parameter, $H_{t}^{j^{'}}$ is the number of cache hits for node $j^{'}$ in time slot t, and $R_{k}$ represents sensor satisfaction at step k.
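The reward in (19) can be sketched as a weighted sum of the accumulated sensor satisfaction and the cache hits at the caching nodes. The function name, the data layout, and the check that the weight lies in [0, 1] are assumptions made for illustration.

```python
from typing import Dict, Sequence


def caching_reward(
    satisfaction: Sequence[float],
    cache_hits: Dict[str, int],
    lam: float,
) -> float:
    """Eq. (19): r_t = lam * sum_k R_k + (1 - lam) * sum_{j'} H_t^{j'}."""
    if not 0.0 <= lam <= 1.0:
        # Assumption: the weight parameter is a convex-combination weight.
        raise ValueError("lam is assumed to lie in [0, 1]")
    return lam * sum(satisfaction) + (1.0 - lam) * sum(cache_hits.values())
```

A larger weight emphasizes sensor satisfaction, whereas a smaller weight emphasizes the cache hit count.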

4.2. PPO-Based Content Caching Process

As shown in Algorithm 1, the PPO algorithm optimizes content caching decisions by iteratively adjusting the policy parameters by using a clipped objective function and advantage estimation to maximize cumulative rewards. The process starts with initializing the actor network’s policy parameters

θ

and

θ_{old}

to ensure consistent learning. The critic network is initialized with parameters

ϕ

to evaluate state values. Key hyperparameters, including learning rate

α

, discount factor

γ

, and clipping parameter

ϵ

, are set to guide the training process. This setup forms the foundation for the algorithm’s ability to adapt and optimize caching strategies dynamically. Then, the environment is reset to its initial state in each episode. At each time step, the agent observes the current state (

s_{t}^{c a}

), selects an action

a_{t}^{c a}

based on the current policy

θ

, executes the action, and receives a reward

r_{t}

.

The critic network, parameterized by

ϕ

, is trained by using gradient descent to minimize the loss function

\begin{matrix} L (ϕ) = \sum_{t = 1}^{T} {(δ_{t})}^{2}, \end{matrix}

(20)

where

δ_{t} = r_{t} + γ V_{ϕ} (s_{t + 1}^{c a}) - V_{ϕ} (s_{t}^{c a})

. The generalized advantage estimator (GAE) is as shown in

\begin{matrix} A_{t} = \sum_{l = 0}^{\infty} {(γ λ)}^{l} δ_{t + l} . \end{matrix}

(21)
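The temporal-difference residual, the critic loss in (20), and the advantage in (21) can be sketched as follows. The infinite sum in (21) is truncated at the end of a finite trajectory and evaluated with the backward recursion A_t = δ_t + γλ A_{t+1}, which is a common implementation choice and an assumption here. The parameter `lam` denotes the decay factor λ in (21), and the function names are hypothetical.

```python
from typing import List, Sequence


def td_residuals(rewards: Sequence[float], values: Sequence[float], gamma: float) -> List[float]:
    """delta_t = r_t + gamma * V(s_{t+1}) - V(s_t); values has len(rewards) + 1 entries."""
    return [r + gamma * values[t + 1] - values[t] for t, r in enumerate(rewards)]


def critic_loss(deltas: Sequence[float]) -> float:
    """Eq. (20): sum of squared TD residuals."""
    return sum(d * d for d in deltas)


def gae(deltas: Sequence[float], gamma: float, lam: float) -> List[float]:
    """Truncated Eq. (21), computed backwards over a finite trajectory."""
    advantages = [0.0] * len(deltas)
    running = 0.0
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages
```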

The actor network updates its policy parameters

θ

by maximizing a clipped objective function, ensuring stable updates. The clipped objective function is defined as follows:

\begin{matrix} J_{PPO}^{θ} (θ) = \sum_{(s_{t}, a_{t})} min (w A_{t}, clip (w, 1 - ϵ, 1 + ϵ) A_{t}), \end{matrix}

(22)

where

ϵ

is a clip fraction. The policy ratio of the target policy to the behavior policy can be expressed as

\begin{matrix} w = \frac{π_{θ} (a_{t}^{c a} | s_{t}^{c a})}{π_{θ_{old}} (a_{t}^{c a} | s_{t}^{c a})} . \end{matrix}

(23)

Periodically, the behavior policy parameters are synchronized with the current policy to maintain consistency. By iteratively repeating these steps, the PPO algorithm effectively learns optimal caching strategies, adapting to dynamic changes in the network environment and improving overall performance.
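A minimal sketch of the clipped objective in (22) with the policy ratio in (23) is given below. Computing the ratio from log-probabilities is an implementation choice assumed here for numerical convenience, and the function name and default clipping value are hypothetical.

```python
import math
from typing import Sequence


def clipped_surrogate(
    logp_new: Sequence[float],
    logp_old: Sequence[float],
    advantages: Sequence[float],
    eps: float = 0.2,
) -> float:
    """Eq. (22): sum over samples of min(w * A, clip(w, 1 - eps, 1 + eps) * A)."""
    total = 0.0
    for ln_new, ln_old, adv in zip(logp_new, logp_old, advantages):
        w = math.exp(ln_new - ln_old)                    # Eq. (23)
        w_clipped = min(max(w, 1.0 - eps), 1.0 + eps)
        total += min(w * adv, w_clipped * adv)
    return total
```

When the ratio exceeds 1 + ϵ and the advantage is positive, the clipped term is selected, so the objective no longer rewards a further increase in the ratio.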

Algorithm 1: PPO-based content caching process

5. Offloading Decision, UAV Hovering Position, and STAR-RIS Resource Allocation

Once content caching has been solved by PPO, computing offloading decision

α

, UAV hovering position

r_{u}

, and STAR-RIS passive beamforming

Θ

can be further optimized by KKT and SCA iteratively.

\begin{matrix} P_{3} : & min_{\begin{matrix} α, r_{u}, Θ \end{matrix}} \sum_{m \in M} \sum_{k \in K} ((1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) (E_{m}^{tr, k} + E_{m}^{c o m, k}) + \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*} E_{m, j^{'}}^{c a c h e, k}) \\ s . t . & (C_{1}) - (C_{5}), (C_{7}), (C_{9}), \end{matrix}

(24a)

\begin{matrix} \sum_{k \in K} {X_{j^{'}}^{k}}^{*} S^{k} \leq O_{j^{'}}, \forall k, \forall j^{'}, \end{matrix}

(24b)

\begin{matrix} {X_{j^{'}}^{k}}^{*} \in {0, 1}, \forall k, \forall j^{'} . \end{matrix}

(24c)

5.1. Offloading Decision

Given

X = X^{*}

,

r_{u} = r_{u}^{0}

, and

Θ = Θ^{0}

, problem (15) is rewritten as

\begin{matrix} min_{\begin{matrix} α \end{matrix}} = & \sum_{m \in M} \sum_{k \in K} ((1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) (\sum_{j^{'} \in {u, c}} P_{m} \frac{α_{m, j^{'}}^{k} L_{k}}{R_{m, j^{'}}^{0}} + \sum_{j \in {m, u, c}} κ_{j} α_{m, j}^{k} ω_{k} f_{j}^{2})) \\ s . t . & (15a), (15b), (15g), \end{matrix}

(25a)

\begin{matrix} {T_{m}^{k}}^{0} \leq T_{m}^{k, max}, \forall m, \forall k, \end{matrix}

(25b)

where in

R_{m, j^{'}}^{0}

and

{T_{m}^{k}}^{0}

,

r_{u} = r_{u}^{0}

and

Θ = Θ^{0}

.

Consider the non-convexity of problem (25) under the given

X = X^{*}

,

r_{u}^{0}

, and

Θ^{0}

. A proximal quadratic regularization term, i.e.,

\sum_{m \in M} \sum_{k \in K} ξ {α_{m, j}^{k}}^{2}

, is added to (25a) to overcome the issue, where

ξ

is a positive scalar parameter. The regularized problem is equivalent to the original one in (25) as

ξ

→ 0. Therefore,

E_{p r o x}^{t o t} (X^{*}, α, r_{u}^{0}, Θ^{0})

can be written as

\begin{matrix} E_{p r o x}^{t o t} (X^{*}, α, r_{u}^{0}, Θ^{0}) = \sum_{m \in M} \sum_{k \in K} ((1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) \\ (\sum_{j^{'} \in {u, c}} P_{m} \frac{α_{m, j^{'}}^{k} L_{k}}{R_{m, j^{'}}^{0}} + \sum_{j \in {m, u, c}} (κ_{j} α_{m, j}^{k} ω_{k} f_{j}^{2} + ξ {α_{m, j}^{k}}^{2}))) . \end{matrix}

(26)

Hence, problem (25) is rewritten as

\begin{matrix} min_{\begin{matrix} α \end{matrix}} & E_{p r o x}^{t o t} (X^{*}, α, r_{u}^{0}, Θ^{0}) \end{matrix}

(27a)

\begin{matrix} s . t . & (15a), (15b), (15g), (25b) . \end{matrix}

(27b)

According to (27),

E_{p r o x}^{t o t} (X^{*}, α, r_{u}^{0}, Θ^{0})

is a function of

α

, and its Hessian matrix is positive semidefinite. Given

X^{*}

,

r_{u}^{0}

, and

Θ^{0}

, problem (27) is convex, and its optimal task offloading decision vector can be obtained by using the KKT conditions. The Lagrangian function of problem (27), denoted by

L (α, ε, χ, ζ, ξ)

, is written as (28), where

ε

,

χ

,

ζ

, and

ξ

are the nonnegative Lagrangian multipliers associated with constraints (15a), (15b), (15g), and (25b), respectively.

\begin{matrix} L (α, ε, χ, ζ, ξ) = & \sum_{m \in M} \sum_{k \in K} \{(1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) (\sum_{j^{'} \in {u, c}} P_{m} \frac{α_{m, j^{'}}^{k} L_{k}}{R_{m, j^{'}}^{0}} + \sum_{j \in {m, u, c}} (ξ {α_{m, j}^{k}}^{2} \\ + κ_{j} α_{m, j}^{k} ω_{k} f_{j}^{2}))\} + ε_{m} (α_{m, j}^{k} - 1) + χ_{m} (\sum_{j \in {m, u, c}} (α_{m, j}^{k} - 1)) \\ + ζ_{m} (\sum_{k \in K} α_{m, j}^{k} ω_{k} - f_{j}) + ξ_{m} ({T_{m}^{k}}^{0} - T_{m}^{k, max}) . \end{matrix}

(28)

According to the KKT conditions, the optimal task offloading decision vector

α^{*}

is given by

α^{*} = \underset{\{\tilde{α}, \tilde{ε}, \tilde{χ}, \tilde{ζ}, \tilde{ξ}\}}{arg min} E_{p r o x}^{t o t} (X^{*}, α, r_{u}^{0}, Θ^{0}) .

(29)

If

\{\tilde{α}, \tilde{ε}, \tilde{χ}, \tilde{ζ}, \tilde{ξ}\}

is any point in the feasible solution set

℧_{α}

that satisfies the KKT conditions (30), by solving (30), the feasible solution set

℧_{α}

can be derived. The optimal task offloading decisions can be obtained, as given in (29).

\begin{matrix} \frac{\partial L}{\partial α_{m, j}^{k}} |_{α_{m, j}^{k} = {\tilde{α}}_{m, j}^{k}, ε_{m} = {\tilde{ε}}_{m}, χ_{m} = {\tilde{χ}}_{m}, ζ_{m} = {\tilde{ζ}}_{m}, ξ_{m} = {\tilde{ξ}}_{m}} = 0, \forall m, \forall k, \forall j, \end{matrix}

(30a)

\begin{matrix} 0 \leq {\tilde{α}}_{m, j}^{k} \leq 1, \forall m, \forall k, \forall j, \end{matrix}

(30b)

\begin{matrix} \sum_{j \in {m, u, c}} {\tilde{α}}_{m, j}^{k} = 1, \forall m, \forall k, \forall j, \end{matrix}

(30c)

\begin{matrix} \sum_{k \in K} {\tilde{α}}_{m, j}^{k} ω_{k} \leq f_{j}, \forall m, \forall k, \forall j, \end{matrix}

(30d)

\begin{matrix} {\tilde{T_{m}^{k}}}^{0} \leq T_{m}^{k, max}, \forall m, \forall k, \forall j, \end{matrix}

(30e)

\begin{matrix} {\tilde{ε}}_{m} ({\tilde{α}}_{m, j}^{k} - 1) = 0, \forall m, \forall k, \forall j, \end{matrix}

(30f)

\begin{matrix} {\tilde{χ}}_{m} (\sum_{j \in {m, u, c}} ({\tilde{α}}_{m, j}^{k} - 1)) = 0, \forall m, \forall k, \forall j, \end{matrix}

(30g)

\begin{matrix} {\tilde{ζ}}_{m} (\sum_{k \in K} {\tilde{α}}_{m, j}^{k} ω_{k} - f_{j}) = 0, \forall m, \forall k, \forall j, \end{matrix}

(30h)

\begin{matrix} {\tilde{ξ}}_{m} ({\tilde{T_{m}^{k}}}^{0} - T_{m}^{k, max}) = 0, \forall m, \forall k, \forall j, \end{matrix}

(30i)

\begin{matrix} {\tilde{ε}}_{m} \geq 0, \forall m, \end{matrix}

(30j)

\begin{matrix} {\tilde{χ}}_{m} \geq 0, \forall m, \end{matrix}

(30k)

\begin{matrix} {\tilde{ζ}}_{m} \geq 0, \forall m, \end{matrix}

(30l)

\begin{matrix} {\tilde{ξ}}_{m} \geq 0, \forall m . \end{matrix}

(30m)
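To illustrate how the KKT conditions yield a solution, consider a simplified instance for a single sensor and task in which only constraints (15a) and (15b) are kept, so that the regularized objective becomes the sum over j of c_j α_j + ξ α_j², where c_j is a hypothetical per-target cost that aggregates the transmission and computing terms. Stationarity then gives α_j = clip(-(c_j + χ)/(2ξ), 0, 1), and the multiplier χ of the sum constraint can be found by bisection. This is a sketch under these simplifying assumptions, with constraints (15g) and (25b) omitted.

```python
from typing import List, Sequence


def solve_offloading_kkt(costs: Sequence[float], xi: float, iters: int = 100) -> List[float]:
    """Minimize sum_j (c_j * a_j + xi * a_j^2) s.t. sum_j a_j = 1 and 0 <= a_j <= 1."""

    def alpha_of(chi: float) -> List[float]:
        # Stationarity of the Lagrangian, projected onto the box [0, 1].
        return [min(max(-(c + chi) / (2.0 * xi), 0.0), 1.0) for c in costs]

    lo = -max(costs) - 2.0 * xi   # every ratio equals 1 here, so the sum is >= 1
    hi = -min(costs)              # every ratio equals 0 here, so the sum is <= 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(alpha_of(mid)) > 1.0:
            lo = mid              # sum too large: increase the multiplier
        else:
            hi = mid
    return alpha_of(0.5 * (lo + hi))
```

With equal costs, the task is split evenly across the targets; a target with a much higher cost receives a zero ratio.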

5.2. UAV Hovering Position

Given

X = X^{*}

,

α = α^{*}

, and

Θ = Θ^{0}

, problem (15) is rewritten as

\begin{matrix} min_{\begin{matrix} r_{u} \end{matrix}} = & \sum_{m \in M} \sum_{k \in K} ((1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) \sum_{j^{'} \in {u, c}} P_{m} \frac{α_{m, j^{'}}^{k} L_{k}}{R_{m, j^{'}}^{0}}) \\ s . t . & (15e), \end{matrix}

(31a)

\begin{matrix} {T_{m}^{k}}^{0} \leq T_{m}^{k, max}, \forall m, \forall k, \end{matrix}

(31b)

where in

R_{m, j^{'}}^{0}

and

{T_{m}^{k}}^{0}

,

Θ = Θ^{0}

.

We adopt an alternating optimization technique and decompose the UAV hovering position problem into two steps: (1) Given

y_{u} = y_{u}^{0}

, problem (31) optimizes

x_{u}

, and we adopt SCA and the dual decomposition method to obtain an optimal solution, denoted by

x_{u}^{*}

. (2) Given

x_{u} = x_{u}^{*}

, problem (31) optimizes

y_{u}

, and we also adopt SCA and the dual decomposition method to obtain an optimal solution, denoted by

y_{u}^{*}

.

Given

y_{u} = y_{u}^{0}

, problem (31) can be rewritten as

\begin{matrix} min_{\begin{matrix} x_{u} \end{matrix}} = & \sum_{m \in M} \sum_{k \in K} ((1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) \sum_{j^{'} \in {u, c}} P_{m} \frac{α_{m, j^{'}}^{k} L_{k}}{R_{m, j^{'}}^{0}}) \end{matrix}

(32a)

\begin{matrix} s . t . & x_{u} \leq x_{u}^{max}, \end{matrix}

(32b)

\begin{matrix} {T_{m}^{k}}^{0} \leq T_{m}^{k, max}, \forall m, \forall k, \end{matrix}

(32c)

where in

R_{m, j^{'}}^{0}

and

{T_{m}^{k}}^{0}

,

y_{u} = y_{u}^{0}

and

Θ = Θ^{0}

.

The non-convexity of objective (32a) arises from the hovering position

x_{u}

in

r_{u}

in

{log}_{2} (1 + γ_{m, j^{'}}^{0})

. By replacing

R_{m, j^{'}}^{0}

with an auxiliary variable, denoted by

{\tilde{R}}_{m, j^{'}}^{0}

, problem (32) can be reformulated in a more tractable form as

\begin{matrix} min_{\begin{matrix} x_{u} \end{matrix}} & \sum_{m \in M} \sum_{k \in K} ((1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) \sum_{j^{'} \in {u, c}} P_{m} \frac{α_{m, j^{'}}^{k} L_{k}}{{\tilde{R}}_{m, j^{'}}^{0}}) \\ s . t . & (32b), \end{matrix}

(33a)

\begin{matrix} {\tilde{T_{m}^{k}}}^{0} \leq T_{m}^{k, max}, \forall m, \forall k, \end{matrix}

(33b)

\begin{matrix} B_{m, j^{'}} {log}_{2} (1 + γ_{m, j^{'}}^{0}) \geq {\tilde{R}}_{m, j^{'}}^{0}, \forall m, \forall j^{'} . \end{matrix}

(33c)

To address the non-convexity of (33c), the SCA method is used to derive a near-optimal solution. We first convexify

{log}_{2} (1 + γ_{m, j^{'}}^{0})

by utilizing a logarithmic approximation as follows:

{log}_{2} (1 + γ_{m, j^{'}}^{0}) \geq \frac{δ_{m, j^{'}} ln γ_{m, j^{'}}^{0} + λ_{m, j^{'}}}{ln 2},

(34)

which is tight when

γ_{m, j^{'}}^{0}

=

{\tilde{γ}}_{m, j^{'}}^{0}

.

δ_{m, j^{'}}

and

λ_{m, j^{'}}

are two approximation constants about

γ_{m, j^{'}}^{0}

, defined as follows:

δ_{m, j^{'}} = \frac{{\tilde{γ}}_{m, j^{'}}^{0}}{1 + {\tilde{γ}}_{m, j^{'}}^{0}};

(35)

λ_{m, j^{'}} = ln (1 + {\tilde{γ}}_{m, j^{'}}^{0}) - \frac{{\tilde{γ}}_{m, j^{'}}^{0}}{1 + {\tilde{γ}}_{m, j^{'}}^{0}} ln {\tilde{γ}}_{m, j^{'}}^{0} .

(36)
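The logarithmic approximation in (34) with the constants in (35) and (36) can be checked numerically with the following sketch. The function names are hypothetical, and the signal-to-noise ratios are assumed to be strictly positive.

```python
import math
from typing import Tuple


def log_bound_coeffs(gamma_tilde: float) -> Tuple[float, float]:
    """Eqs. (35) and (36): constants evaluated at the expansion point gamma_tilde > 0."""
    delta = gamma_tilde / (1.0 + gamma_tilde)
    lam = math.log(1.0 + gamma_tilde) - delta * math.log(gamma_tilde)
    return delta, lam


def rate_lower_bound(gamma: float, gamma_tilde: float) -> float:
    """Right-hand side of (34): (delta * ln(gamma) + lam) / ln 2."""
    delta, lam = log_bound_coeffs(gamma_tilde)
    return (delta * math.log(gamma) + lam) / math.log(2.0)
```

Evaluating `rate_lower_bound(g, g)` reproduces log2(1 + g), which reflects the tightness of the bound at the expansion point.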

Let

x_{u} = {\tilde{x}}_{u}

. Then, constraint (33c) can be approximated by its concave lower bound, as given by

R_{m, j^{'}} \geq B_{m, j^{'}} \frac{δ_{m, j^{'}} ln γ_{m, j^{'}}^{0} ({\tilde{x}}_{u}) + λ_{m, j^{'}}}{ln 2} .

(37)

Therefore, problem (33) can be approximated by a convex problem, as given by

\begin{matrix} min_{\begin{matrix} x_{u} \end{matrix}} & \sum_{m \in M} \sum_{k \in K} ((1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) \sum_{j^{'} \in {u, c}} P_{m} \frac{α_{m, j^{'}}^{k} L_{k}}{{\tilde{R}}_{m, j^{'}}^{0}}) \\ s . t . & (32b), (33b), \end{matrix}

(38a)

\begin{matrix} B_{m, j^{'}} \frac{δ_{m, j^{'}} ln γ_{m, j^{'}}^{0} ({\tilde{x}}_{u}) + λ_{m, j^{'}}}{ln 2} \geq {\tilde{R}}_{m, j^{'}}^{0}, \forall m, \forall j^{'} . \end{matrix}

(38b)

The optimal hovering position

x_{u}^{*}

in

r_{u}^{*}

of UAV u can be readily obtained by using CVX; instead, we utilize the Lagrangian dual approach for better computational efficiency. The Lagrangian function

L (x_{u}, ς, ι, ω)

is given in (39), where

ς

,

ι

, and

ω

are the Lagrangian multipliers associated with the constraints of problem (38).

\begin{matrix} L (x_{u}, ς, ι, ω) = \sum_{m \in M} \sum_{k \in K} ((1 - \sum_{j^{'} \in {u, c}} {X_{j^{'}}^{k}}^{*}) \sum_{j^{'} \in {u, c}} P_{m} \frac{α_{m, j^{'}}^{k} L_{k}}{{\tilde{R}}_{m, j^{'}}^{0}}) + ς_{m} (x_{u} - x_{u}^{m a x}) \\ + ι_{m} ({\tilde{T_{m}^{k}}}^{0} - T_{m}^{k, max}) + ω_{m} ({\tilde{R}}_{m, j^{'}}^{0} - B_{m, j^{'}} \frac{δ_{m, j^{'}} ln γ_{m, j^{'}}^{0} ({\tilde{x}}_{u}) + λ_{m, j^{'}}}{ln 2}) . \end{matrix}

(39)

The Lagrangian dual function of (39) can be given by

\begin{matrix} D (ς, ι, ω) = min_{\begin{matrix} x_{u} \end{matrix}} L (x_{u}, ς, ι, ω) . \end{matrix}

(40)

With the convexity of problem (38), the optimal solutions of both the original problem (38) and the dual problem (40) satisfy the KKT conditions. By solving

\frac{\partial L (x_{u}, ς, ι, ω)}{\partial x_{u}} = 0

, the optimal solution

x_{u}^{*}

is derived. The optimal hovering position

r_{u}^{*} = (x_{u}^{*}, y_{u}, h_{u})

, where

x_{u}^{*}

can be obtained as

\begin{matrix} x_{u}^{*} = \underset{x_{u}}{arg min} L (x_{u}, ς, ι, ω) . \end{matrix}

(41)

The Lagrangian multipliers

ς

,

ι

, and

ω

are given by

\begin{matrix} ς_{m} [k + 1] = & {[ς_{m} [k] + △_{ς} [k] (x_{u} - x_{u}^{m a x})]}^{+}; \end{matrix}

(42)

\begin{matrix} ι_{m} [k + 1] = & {[ι_{m} [k] + △_{ι} [k] ({\tilde{T_{m}^{k}}}^{0} - T_{m}^{k, max})]}^{+}; \end{matrix}

(43)

\begin{matrix} ω_{m} [k + 1] = & {[ω_{m} [k] + △_{ω} [k] ({\tilde{R}}_{m, j^{'}}^{0} - B_{m, j^{'}} \frac{δ_{m, j^{'}} ln γ_{m, j^{'}}^{0} ({\tilde{x}}_{u}) + λ_{m, j^{'}}}{ln 2})]}^{+}, \end{matrix}

(44)

where

△_{1} [k] = {(△_{ς} [k], △_{ι} [k], △_{ω} [k])}^{T}

is the step size vector to update the Lagrangian multipliers

ς, ι

, and

ω

in the k-th iteration, respectively.

△_{1}

is also updated per iteration. Without loss of generality, we assume that each element in

△_{1}

has the same step size in an iteration.
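The multiplier updates in (42)-(44) share the projected form [μ + △ g]^+, where g is the constraint violation. The sketch below shows this update together with a toy one-dimensional problem, minimizing x² subject to 1 - x ≤ 0, which is not taken from the system model and serves only to illustrate the iteration. The function names and the fixed step size are assumptions.

```python
from typing import Tuple


def update_multiplier(mu: float, step: float, violation: float) -> float:
    """Projected subgradient step [mu + step * violation]^+, as in (42)-(44)."""
    return max(0.0, mu + step * violation)


def dual_ascent_toy(step: float = 0.5, iters: int = 200) -> Tuple[float, float]:
    """Toy problem: min x^2 s.t. 1 - x <= 0, with L(x, mu) = x^2 + mu * (1 - x)."""
    mu = 0.0
    x = 0.0
    for _ in range(iters):
        x = mu / 2.0                                # argmin over x of L(x, mu)
        mu = update_multiplier(mu, step, 1.0 - x)   # violation g(x) = 1 - x
    return x, mu
```

In this toy problem, the iterates approach the primal solution x = 1 and the multiplier μ = 2.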

According to problem (41), the optimal solution

x_{u}^{*}

is derived. Given

x_{u} = x_{u}^{*}

, problem (31) is still non-convex, so we again adopt SCA and the dual decomposition method to obtain a solution for the y-coordinate, denoted by

y_{u}^{*}

. Therefore, the optimal hovering position can be given by

\begin{matrix} r_{u}^{*} = (x_{u}^{*}, y_{u}^{*}, h_{u}) . \end{matrix}

(45)

5.3. STAR-RIS Resource Allocation

Given

X = X^{*}

,

α = α^{*}

, and

r_{u} = r_{u}^{*}

, problem (15) can be rewritten as

\begin{matrix} min_{\begin{matrix} Θ \end{matrix}} & \sum_{m \in M} \sum_{k \in K} ((1 - {X_{c}^{k}}^{*}) P_{m} \frac{α_{m, c}^{k} L_{k}}{R_{m, c}^{'}}) \\ s . t . & (15c), (15d), \end{matrix}

(46a)

\begin{matrix} {T_{m}^{k}}^{'} \leq T_{m}^{k, max}, \forall m, \forall k, \end{matrix}

(46b)

where in

R_{m, c}^{'}

and

{T_{m}^{k}}^{'}

,

X = X^{*}

,

α = α^{*}

, and

r_{u} = r_{u}^{*}

.

The non-convexity of both objective (46a) and constraint (46b) arises from

Θ

in

{log}_{2} (1 + γ_{m, c}^{'})

for the given

X^{*}

,

α^{*}

, and

r_{u}^{*}

. Similarly, an approximate variable

{\tilde{γ}}_{m, c}^{'}

replaces

γ_{m, c}^{'}

to overcome its non-convexity. Problem (46) is rewritten as

\begin{matrix} min_{\begin{matrix} Θ \end{matrix}} & \sum_{m \in M} \sum_{k \in K} ((1 - {X_{c}^{k}}^{*}) P_{m} \frac{α_{m, c}^{k} L_{k}}{{\tilde{R}}_{m, c}^{'}}) \\ s . t . & (15c), (15d), \end{matrix}

(47a)

\begin{matrix} {\tilde{T_{m}^{k}}}^{'} \leq T_{m}^{k, max}, \forall m, \forall k, \end{matrix}

(47b)

\begin{matrix} γ_{m, c}^{'} \geq {\tilde{γ}}_{m, c}^{'}, \forall m . \end{matrix}

(47c)

Being part of

γ_{m, c}^{'}

,

{|G_{s, c}^{H} Θ_{a} H_{m, s}|}^{2}

can be rewritten as

{|G_{s, c}^{H} Θ_{a} H_{m, s}|}^{2} = {|ν_{a}^{H} ϕ_{s, c}^{H} H_{m, s}|}^{2} = ν_{a}^{H} Υ ν_{a},

(48)

where

ν_{a}^{H}

,

ϕ_{s, c}^{H}

, and

Υ

can be written as

\begin{matrix} ν_{a}^{H} = & [\sqrt{β_{(1, 1)}^{a}} e^{j θ_{(1, 1)}^{a}}, \sqrt{β_{(1, 2)}^{a}} e^{j θ_{(1, 2)}^{a}}, \dots, \\ \sqrt{β_{(m_{c}, m_{r})}^{a}} e^{j θ_{(m_{c}, m_{r})}^{a}}, \dots, \sqrt{β_{(M_{c}, M_{r})}^{a}} e^{j θ_{(M_{c}, M_{r})}^{a}}]; \end{matrix}

(49)

\begin{matrix} ϕ_{s, c}^{H} = & diag [g_{(1, 1), c}, g_{(1, 2), c}, \dots, g_{(m_{c}, m_{r}), c}, \dots, g_{(M_{c}, M_{r}), c}]; \end{matrix}

(50)

\begin{matrix} Υ = & ϕ_{s, c}^{H} H_{m, s} H_{m, s}^{H} ϕ_{s, c} . \end{matrix}

(51)
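The identity in (48) can be verified numerically with the sketch below. It assumes a single-antenna sensor and edge cloud, so that the two channels are vectors, and it assumes that the entries of the row vector G^H are the coefficients g placed on the diagonal in (50). Both are conventions adopted here for illustration, and the function names are hypothetical.

```python
import cmath
from typing import List, Sequence


def star_ris_coeff(beta: float, theta: float) -> complex:
    """One entry of nu_a^H in (49): sqrt(beta) * exp(j * theta)."""
    return (beta ** 0.5) * cmath.exp(1j * theta)


def cascaded_gain(g: Sequence[complex], v: Sequence[complex], h: Sequence[complex]) -> float:
    """|G^H Theta H|^2 with a diagonal Theta, i.e., |sum_i g_i * v_i * h_i|^2."""
    return abs(sum(gi * vi * hi for gi, vi, hi in zip(g, v, h))) ** 2


def quadratic_form(g: Sequence[complex], v: Sequence[complex], h: Sequence[complex]) -> float:
    """nu^H Upsilon nu with Upsilon = b b^H, b_i = g_i * h_i, and row vector nu^H = v."""
    b: List[complex] = [gi * hi for gi, hi in zip(g, h)]
    n = len(b)
    total = 0j
    for i in range(n):
        for j in range(n):
            # The column vector nu has entries conj(v_j) because nu^H = v.
            total += v[i] * (b[i] * b[j].conjugate()) * v[j].conjugate()
    return total.real
```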

As a result, constraints (15c) and (15d) are replaced by the following constraint:

\sum_{a \in \{r, t\}} ν_{a}^{2} [(m_{c}, m_{r})] \leq 1,

(52)

where

ν_{a} [(m_{c}, m_{r})]

is the

(m_{c}, m_{r})

-th element of

ν_{a}^{H}

. Clearly, (52) is convex.

To tackle non-convex constraint (47c), the SCA method is adopted. Specifically, since

ν_{a}^{H} Υ ν_{a}

is convex in

ν_{a}

, its lower bound can be derived as follows:

ν_{a}^{H} Υ ν_{a} \geq - ν_{a}^{(n) H} Υ ν_{a}^{(n)} + 2 Re [ν_{a}^{H} Υ ν_{a}^{(n)}],

(53)

where

ν_{a}^{(n)}

is obtained in the previous iteration. Then, constraint (47c) can be replaced by the following constraint:

- ν_{a}^{(n) H} Υ ν_{a}^{(n)} + 2 Re [ν_{a}^{H} Υ ν_{a}^{(n)}] \geq \frac{{\tilde{γ}}_{m, c}^{'} σ_{c}^{2}}{P_{m}},

(54)

which is linear. As a result, problem (47) is rewritten as

\begin{matrix} min_{\begin{matrix} ν_{a} \end{matrix}} & \sum_{m \in M} \sum_{k \in K} ((1 - {X_{c}^{k}}^{*}) P_{m} \frac{α_{m, c}^{k} L_{k}}{{\tilde{R}}_{m, c}^{'}}) \\ s . t . & (47b), (52), (54) . \end{matrix}

(55)
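The first-order lower bound in (53), on which constraint (54) is built, can be checked with a short sketch. Here the vectors are treated as column vectors stored as lists of complex numbers, and a rank-one matrix b b^H stands in for the positive semidefinite matrix Υ. These choices and the function names are assumptions made for illustration.

```python
from typing import List, Sequence


def hermitian_form(u: Sequence[complex], M: Sequence[Sequence[complex]], w: Sequence[complex]) -> complex:
    """Return u^H M w for complex vectors u, w and a nested-list matrix M."""
    return sum(
        u[i].conjugate() * M[i][j] * w[j]
        for i in range(len(u))
        for j in range(len(w))
    )


def sca_lower_bound(nu: Sequence[complex], nu_n: Sequence[complex], M: Sequence[Sequence[complex]]) -> float:
    """Right-hand side of (53): -nu_n^H M nu_n + 2 Re[nu^H M nu_n]."""
    return -hermitian_form(nu_n, M, nu_n).real + 2.0 * hermitian_form(nu, M, nu_n).real


def rank_one_psd(b: Sequence[complex]) -> List[List[complex]]:
    """Build the positive semidefinite matrix b b^H."""
    return [[bi * bj.conjugate() for bj in b] for bi in b]
```

The bound is affine in ν for a fixed previous iterate and coincides with the quadratic form when ν equals that iterate.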

Although convex problem (55) can be solved, e.g., by using the CVX toolbox, we utilize the Lagrangian dual approach to improve the computational efficiency. The Lagrangian function

L (ν, ϱ, ϖ, ρ)

is given in (56), where

ϱ

,

ϖ

, and

ρ

are the Lagrangian multipliers associated with the three constraints of problem (55).

\begin{matrix} L (ν, ϱ, ϖ, ρ) = \sum_{k \in K} ((1 - {X_{c}^{k}}^{*}) P_{m} \frac{α_{m, c}^{k} L_{k}}{{\tilde{R}}_{m, c}^{'}}) + ϱ_{m} [{\tilde{T_{m}^{k}}}^{'} - T_{m}^{k, max}] \\ + ϖ_{m} [\sum_{a \in \{r, t\}} ν_{a}^{2} [(m_{c}, m_{r})] - 1] + ρ_{m} [\frac{{\tilde{γ}}_{m, c}^{'} σ_{c}^{2}}{P_{m}} + ν_{a}^{(n) H} Υ ν_{a}^{(n)} - 2 Re [ν_{a}^{H} Υ ν_{a}^{(n)}]] . \end{matrix}

(56)

The Lagrangian dual function is written as

D (ϱ, ϖ, ρ) = min_{\begin{matrix} ν \end{matrix}} L (ν, ϱ, ϖ, ρ) .

(57)

With the convexity of problem (55), the optimal solutions of both the original problem (55) and problem (57) satisfy the KKT conditions. By solving

\frac{\partial L (ν, ϱ, ϖ, ρ)}{\partial ν_{a}} = 0

, the optimal solution

ν_{a}^{*}

is derived. The optimal

ν_{a}^{*}

is obtained as

ν_{a}^{*} = \underset{ν_{a}}{arg min} L (ν, ϱ, ϖ, ρ) .

(58)

The Lagrangian multipliers

ϱ, ϖ

, and

ρ

are given by

\begin{matrix} ϱ_{m} [k + 1] = & {\{ϱ_{m} [k] + △_{ϱ} [k] [{\tilde{T_{m}^{k}}}^{'} - T_{m}^{k, max}]\}}^{+}; \end{matrix}

(59)

\begin{matrix} ϖ_{m} [k + 1] = & {\{ϖ_{m} [k] + △_{ϖ} [k] [\sum_{a \in \{r, t\}} ν_{a}^{2} [(m_{c}, m_{r})] - 1]\}}^{+}; \end{matrix}

(60)

\begin{matrix} ρ_{m} [k + 1] = & {\{ρ_{m} [k] + △_{ρ} [k] (\frac{{\tilde{γ}}_{m, c}^{'} σ_{c}^{2}}{P_{m}} + ν_{a}^{(n) H} Υ ν_{a}^{(n)} - 2 Re [ν_{a}^{H} Υ ν_{a}^{(n)}])\}}^{+}, \end{matrix}

(61)

where

△_{2} [k] = {(△_{ϱ} [k], △_{ϖ} [k], △_{ρ} [k])}^{T}

is the step size vector to update the Lagrangian multipliers

ϱ

,

ϖ

, and

ρ

in the k-th iteration, respectively.

△_{2}

is updated in each iteration.
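The multiplier updates (59)-(61) follow the usual projected subgradient pattern: minimize the Lagrangian for fixed multipliers, then move each multiplier along its constraint residual and project onto the nonnegative axis. The sketch below illustrates that pattern on a deliberately small toy problem (minimize x² subject to 1 − x ≤ 0), not on problem (55) itself; the function name `dual_ascent`, the constant step size, and the toy objective are assumptions introduced only for illustration.

```python
def dual_ascent(step=0.5, iterations=200):
    """Toy problem (hypothetical): minimize x**2 subject to 1 - x <= 0."""
    lam = 0.0  # multiplier, kept nonnegative in the same way as ϱ, ϖ, and ρ
    x = 0.0
    for _ in range(iterations):
        # Inner step, cf. (58): minimize x**2 + lam * (1 - x) over x,
        # which has the closed-form minimizer x = lam / 2.
        x = lam / 2.0
        # Constraint residual g(x) = 1 - x; a positive value means violation.
        residual = 1.0 - x
        # Outer step, cf. (59)-(61): ascent step followed by the projection {.}^+.
        lam = max(0.0, lam + step * residual)
    return x, lam


x_opt, lam_opt = dual_ascent()
print(x_opt, lam_opt)
```

For this toy problem the iterates approach x = 1 and a multiplier of 2, which is the KKT point. In the source, the step sizes are updated in every iteration, whereas this sketch keeps a constant step for brevity.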

According to (49) and (58), the optimal transmission and reflection matrix

Θ_{a}^{*}

is obtained as

Θ_{a}^{*} = diag {(ν_{a}^{*})}^{H} .

(62)
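As a small illustration of (62), the sketch below builds the diagonal coefficient matrices from given reflection and transmission vectors and checks the per-element constraint that appears in (56) and (60). It assumes that constraint is the energy budget |ν_r|² + |ν_t|² ≤ 1 for each element; the function name `star_ris_matrices` and the list-of-lists matrix layout are hypothetical choices made for this example.

```python
import cmath
import math


def star_ris_matrices(nu_r, nu_t, tol=1e-9):
    """Return (Theta_r, Theta_t) = (diag(nu_r^H), diag(nu_t^H)) as nested lists."""
    if len(nu_r) != len(nu_t):
        raise ValueError("reflection and transmission vectors differ in length")
    for idx, (zr, zt) in enumerate(zip(nu_r, nu_t)):
        # Assumed per-element budget: reflected plus transmitted energy <= 1.
        if abs(zr) ** 2 + abs(zt) ** 2 > 1.0 + tol:
            raise ValueError(f"element {idx} violates the energy constraint")

    def diag_conj(vec):
        n = len(vec)
        # The Hermitian transpose of a vector conjugates each entry.
        return [[vec[i].conjugate() if i == j else 0j for j in range(n)]
                for i in range(n)]

    return diag_conj(nu_r), diag_conj(nu_t)


# One element with amplitudes 0.6 (reflect) and 0.8 (transmit).
nu_r = [0.6 * cmath.exp(1j * math.pi / 2)]
nu_t = [0.8 * cmath.exp(1j * 0.0)]
theta_r, theta_t = star_ris_matrices(nu_r, nu_t)
print(theta_r[0][0], theta_t[0][0])
```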

6. Convergence and Complexity Analysis

Algorithm 2 details the complete DRL-SCA algorithm. In each outer loop, the action

X^{*}

generated by the PPO algorithm is substituted into (29), (38), and (55) as a given parameter for the computing offloading decision, UAV hovering position, and STAR-RIS passive beamforming optimization. Block coordinate descent (BCD) is then used to iteratively optimize the computing offloading decision

α^{*}

, UAV hovering position

r_{u}^{*} = (x_{u}^{*}, y_{u}^{*}, h_{u})

, and STAR-RIS passive beamforming

Θ

. Each iteration’s solution serves as the input of feasible points for the subsequent one. Therefore,

α^{*}

,

r_{u}^{*}

, and

Θ^{*}

will become a new state for the PPO agent in the next time slot.

Algorithm 2: DRL-SCA-based caching decision, offloading decision, hovering position of the UAV, and passive beamforming of STAR-RIS for UAV-based WSNs

Therefore, a change of action eventually leads to a state transition, which drives the agent to learn actions that improve the reward tied to the optimization objective. Even when an analytical prediction is hard to obtain because of the complexity of the environment and the embedded SCA, deep reinforcement learning can still find a feasible solution in such a complicated dynamic environment.
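The alternating structure of the inner loop can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions, not the authors' Algorithm 2: the convex quadratic `objective` is a hypothetical stand-in for the energy objective, the three scalar blocks `a`, `r`, and `t` play the roles of the offloading decision, the hovering position, and the passive beamforming, and the caching action from the PPO agent is treated as already fixed and is therefore not modeled.

```python
def objective(a, r, t):
    # Hypothetical convex stand-in for the energy objective η.
    return (a - 1) ** 2 + (r - 2) ** 2 + (t + 1) ** 2 + 0.5 * a * r + 0.5 * r * t


def bcd(a=0.0, r=0.0, t=0.0, tol=1e-9, max_iter=100):
    """Block coordinate descent: minimize over one block with the others fixed."""
    history = [objective(a, r, t)]
    for _ in range(max_iter):
        a = 1.0 - 0.25 * r        # exact minimizer over a (role of α)
        r = 2.0 - 0.25 * (a + t)  # exact minimizer over r (role of r_u)
        t = -1.0 - 0.25 * r       # exact minimizer over t (role of Θ)
        history.append(objective(a, r, t))
        # Stop when one full sweep no longer improves the objective noticeably.
        if history[-2] - history[-1] < tol:
            break
    return (a, r, t), history


solution, history = bcd()
print(solution, len(history) - 1)
```

Because every block update is an exact minimization with the other blocks held fixed, the recorded objective values never increase from one sweep to the next, which is the property the convergence argument relies on.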

Lemma 1.

Algorithm 2 converges to at least a locally suboptimal solution within a finite number of iterations.

Proof.

The initial problem (15) is decomposed into four subproblems and addressed iteratively by using the BCD method. Specifically, subproblems (16), (29), (38), and (55) are optimized in an alternating sequence to acquire a suboptimal solution. The solution obtained in each iteration is subsequently used as the feasible input for the following iteration.

Let

η (X^{l}, α^{l}, r_{u}^{l}, Θ^{l})

represent the value of the original objective function (15) obtained during the l-th iteration.

The DRL algorithm generates an improved solution

X^{l + 1}

in Step 3 that meets the condition

\begin{matrix} η (X^{l}, α^{l}, r_{u}^{l}, Θ^{l}) \overset{(a)}{\geq} η (X^{l + 1}, α^{l}, r_{u}^{l}, Θ^{l}), \end{matrix}

(63)

where (a) follows the inherent nature of learning approaches that always tend to seek a better reward defined in (19).

In Step 4, the suboptimal solution for offloading decision

α^{l + 1}

can be obtained by solving (29), and inequality (b) in (64) holds because (29) is convex. In Step 5, the suboptimal solution for the UAV hovering position

r_{u}^{l + 1}

can be obtained by solving (38), and inequality (c) holds because (38) is convex at the given feasible point under constraints (32b), (33b), and (38b) and can therefore be solved optimally. In Step 6, the suboptimal solution for the STAR-RIS passive beamforming

Θ^{l + 1}

can be obtained by solving (55), and inequality (d) holds because (55) is convex at the given feasible point under constraints (47b), (52), and (54) and can therefore be solved optimally.

\begin{matrix} η (X^{l + 1}, α^{l}, r_{u}^{l}, Θ^{l}) & \overset{(b)}{\geq} η (X^{l + 1}, α^{l + 1}, r_{u}^{l}, Θ^{l}) \\ & \overset{(c)}{\geq} η (X^{l + 1}, α^{l + 1}, r_{u}^{l + 1}, Θ^{l}) \\ & \overset{(d)}{\geq} η (X^{l + 1}, α^{l + 1}, r_{u}^{l + 1}, Θ^{l + 1}) . \end{matrix}

(64)

Inequality (64) shows that solving subproblems (29), (38), and (55) never increases the energy consumption within an iteration. Combined with (63), the objective function is therefore nonincreasing over the iterations. Due to the constraints, the minimum achievable energy consumption is bounded below by a finite value. Consequently, Algorithm 2 is assured to converge to at least a locally suboptimal solution within a finite number of iterations. □
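The finite-iteration claim can be made explicit under one extra assumption that the source does not state: the outer loop of Algorithm 2 stops as soon as the decrease of the objective falls below a tolerance ε > 0 (ε is a hypothetical parameter introduced here for illustration). Writing η^{l} for the objective value after the l-th iteration and η_{min} for its finite lower bound, (63) and (64) chain into

η^{l} \geq η^{l + 1} \geq η_{min}, \quad l = 0, 1, 2, \ldots

Every iteration that does not trigger the stopping rule lowers the objective by at least ε, so the total number of iterations l_{max} satisfies

l_{max} \leq \lfloor \frac{η^{0} - η_{min}}{ε} \rfloor + 1 .

Without such a stopping rule, monotonicity and boundedness guarantee only that the sequence of objective values converges.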

7. Performance Evaluation and Discussion

In this section, we present numerical results to evaluate the effectiveness of the proposed energy-efficient aerial STAR-RIS-aided computing offloading and content caching framework for WSNs. The simulation scenario and parameter settings are initially outlined, followed by an in-depth discussion of the simulation results.

7.1. Simulation Scenario and Parameter Settings

The simulation setup used for the numerical experiments is detailed below.

We assume that the reference locations of the ground edge cloud (c), the UAV (u), and the STAR-RIS are positioned at

(0, 20, 0)

meters,

(x_{u}^{*}, y_{u}^{*}, 20)

meters, and

(x_{u}^{*}, y_{u}^{*}, 20)

meters, respectively. Additionally, the four ground sensors are fixed at

(- 40, 40, 0)

meters,

(- 40, - 40, 0)

meters,

(40, 40, 0)

meters, and

(40, - 40, 0)

meters, representing a typical distributed sensor network configuration [37]. The detailed simulation parameters are summarized in Table 2 [38]. For example, the data size to be transmitted is set to

L_{k} = 6 \times 10^{8}

bits, representing a typical high-volume data transmission scenario in UAV-based WSNs. Moreover, we modeled data transmission as periodic, assuming a consistent flow of information typical of high-demand applications [39]. The maximum hovering range of the UAV is constrained within 40 m (

x_{u}^{max} = y_{u}^{max} = 40

m), while the communication bandwidth

B_{m, j^{'}}

is set to 3.2 MHz, ensuring sufficient transmission capacity for the tasks.

ω_{k}

is the number of CPU cycles required to compute one bit of task k and is set to 600 r/bit. The STAR-RIS consists of

M_{c} \times M_{r} = 9

passive elements.
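To give a feel for the magnitudes in Table 2, the sketch below converts the transmit power from dBm to watts and evaluates the energy of processing one complete task on a sensor. The energy model E = κ f² ω L (energy per CPU cycle equal to κ f²) is the one commonly used in the MEC literature and is an assumption here, since this section does not restate the computing energy model; the function names are hypothetical.

```python
import math


def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000


def local_energy(kappa, f_cycles_per_s, omega_cycles_per_bit, bits):
    """Assumed model: energy per cycle is kappa * f**2, total cycles are omega * bits."""
    return kappa * f_cycles_per_s ** 2 * omega_cycles_per_bit * bits


# Values taken from Table 2.
KAPPA = 1e-28   # effective switched capacitance coefficient
F_M = 2e8       # sensor computing capability, cycles per second
OMEGA_K = 600   # CPU cycles per bit
L_K = 6e8       # task size in bits

p_m_watt = dbm_to_watt(30)
e_local = local_energy(KAPPA, F_M, OMEGA_K, L_K)
print(p_m_watt, e_local)
```

With these values, 30 dBm corresponds to 1 W, and the assumed model yields about 1.44 J for a task processed entirely on the sensor.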

For the caching DRL algorithm, detailed parameter settings are provided to ensure stability and convergence of the learning process: Different learning rates

α

are evaluated (e.g., 0.0001, 0.0003, and 0.0005) to assess their impact on convergence speed and stability. A learning rate of 0.0003 is found to provide the best balance between performance and robustness. The clipping parameter

ϵ

is set to 0.2. This parameter constrains policy updates, ensuring stable and gradual improvements during training. The discount factor

γ

is fixed at 0.99, which balances immediate rewards and long-term benefits, crucial to achieving consistent performance across episodes. These parameter settings were chosen to optimize the performance of the PPO algorithm within the DRL-SCA framework and ensure the reliable operation of the proposed model under dynamic conditions [40,41].
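The roles of the clipping parameter and the discount factor can be illustrated with the standard PPO clipped surrogate objective and a discounted return. This is a minimal pure-Python sketch of those two textbook formulas under the settings quoted above (ϵ = 0.2, γ = 0.99); it is not the authors' training code, and the function names are hypothetical.

```python
def clipped_surrogate(ratios, advantages, eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over the samples."""
    terms = []
    for ratio, adv in zip(ratios, advantages):
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        # Taking the minimum removes the incentive to move the policy
        # ratio outside the interval [1 - eps, 1 + eps].
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


def discounted_return(rewards, gamma=0.99):
    """Sum of gamma**t * reward_t, accumulated backwards for numerical simplicity."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


print(clipped_surrogate([1.5], [1.0]))     # ratio above 1 + eps, positive advantage
print(clipped_surrogate([0.5], [-1.0]))    # ratio below 1 - eps, negative advantage
print(discounted_return([1.0, 1.0, 1.0]))
```

In the first call the objective is capped at (1 + ϵ) times the advantage, and in the second at (1 − ϵ) times the advantage, which is how the clipping keeps each policy update gradual.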

Moreover, to demonstrate the advantages of the proposed framework, we compare it against four existing benchmark schemes: fixed position, full offloading, no STAR-RIS, and no caching [30,42]. These benchmarks provide insights into the effectiveness of the proposed joint optimization framework in improving energy efficiency and overall system performance. All simulated schemes are detailed below:

(1): Proposed policy: We minimize the total energy consumption by jointly optimizing the content caching decision, the offloading decision, the hovering position of the UAV, and the STAR-RIS transmission and reflection coefficient matrix in the STAR-RIS-aided UAV system.
(2): Fixed-position policy: The UAV stays at a fixed position without dynamic adjustment. This benchmark is used to analyze the effectiveness and advantages of optimizing the UAV hovering position.
(3): Full offloading policy: All tasks requested by users are offloaded to the UAV or the cloud for processing. This benchmark is used to analyze the effectiveness and advantages of the partial offloading strategy.
(4): No-STAR-RIS policy: The task requests sent by users are transmitted directly to the edge cloud, without the STAR-RIS transmission and reflection coefficient matrix. This benchmark is used to analyze the effectiveness and advantages of the STAR-RIS transmission and reflection coefficient matrix.
(5): No-caching policy: The caching capability of the UAV and the edge cloud is not used, and the content requested by users is not cached in advance. Under this policy, the user therefore applies the partial offloading strategy directly and processes the task locally, at the UAV, or at the edge cloud. This benchmark is used to analyze the effectiveness and advantages of the content caching strategy.
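The five schemes above can be summarized as a set of feature switches, which is one convenient way to organize such an ablation-style comparison in simulation code. The sketch assumes that each benchmark disables exactly one component of the proposed policy, which is how the descriptions read; the class name `Scheme`, its field names, and the dictionary keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Which components of the joint optimization are active (illustrative only)."""
    optimize_position: bool = True   # UAV hovering position is optimized
    partial_offloading: bool = True  # tasks may be split instead of fully offloaded
    use_star_ris: bool = True        # STAR-RIS coefficients are used and optimized
    use_caching: bool = True         # content is cached at the UAV or edge cloud


SCHEMES = {
    "proposed": Scheme(),
    "fixed_position": Scheme(optimize_position=False),
    "full_offloading": Scheme(partial_offloading=False),
    "no_star_ris": Scheme(use_star_ris=False),
    "no_caching": Scheme(use_caching=False),
}

for name, scheme in SCHEMES.items():
    print(name, scheme)
```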

7.2. The Discussion of the Simulation Results

This section evaluates and discusses the performance of the proposed DRL-SCA algorithm in the energy-efficient aerial STAR-RIS-aided computing offloading and content caching framework by comparing it with various benchmark methods across diverse scenarios.

Figure 6 plots the total energy consumption of the system versus the number of iterations. We see that the proposed method ensures that the total energy consumption of the system converges to the optimal value after only several iterations, confirming that the optimal caching strategy, partial offloading strategy, hovering position of the UAV, and STAR-RIS transmission and reflection matrix are always available. Moreover, the convergence speed of the four benchmark schemes is slightly slower than that of the proposed scheme. Finally, the figure verifies that the energy consumption of the proposed STAR-RIS-aided computing offloading and content caching framework for UAV systems is lower than that of the benchmark schemes.

Figure 7 shows the total energy consumption of the system versus the network bandwidth. The total energy consumption of all five schemes decreases as the network bandwidth increases. The reason is that a larger bandwidth for offloading improves the transmission rate between the sensors and the UAV, as well as the transmission rate among the sensors, the STAR-RIS, and the edge cloud, which reduces the transmission delay and energy consumption. At a lower network bandwidth, the proposed caching strategy optimization, UAV hovering position optimization, partial offloading optimization, and STAR-RIS transmission reflection coefficient matrix significantly reduce energy consumption compared with the other strategies. However, as the network bandwidth increases, the task transmission rate increases and the bandwidth becomes the dominant factor in the system energy consumption. The system energy consumption of the benchmark solutions therefore gradually approaches that of the proposed solution, but the proposed joint optimization policy always achieves the lowest energy consumption.
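The bandwidth dependence described above can be reproduced with a few lines of arithmetic. The sketch takes the transmission energy as P L / R, matching the form of the objective in (55), and assumes a Shannon-type rate R = B log2(1 + SINR); the rate expression, the fixed SINR value of 100, and the function name `tx_energy` are assumptions for illustration, and the offloading ratio is set to one.

```python
import math


def tx_energy(power_w, bits, bandwidth_hz, sinr):
    """Transmission energy P * L / R with an assumed rate R = B * log2(1 + SINR)."""
    rate = bandwidth_hz * math.log2(1.0 + sinr)  # bit/s
    return power_w * bits / rate                 # joules


# P_m = 1 W (30 dBm) and L_k = 6e8 bit from Table 2; the SINR is hypothetical.
for bandwidth in (1.6e6, 3.2e6, 6.4e6):
    print(bandwidth, tx_energy(1.0, 6e8, bandwidth, sinr=100.0))
```

Under these assumptions, doubling the bandwidth halves the transmission energy, so the absolute savings from each additional unit of bandwidth shrink, which is consistent with the narrowing gap between the schemes at high bandwidth.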

Figure 8 shows that as the CPU cycles required for computing 1 bit of task data increase, the system consumes more computing resources when processing tasks of the same size, thereby increasing computing energy consumption. However, when the computing power of UAVs and local sensors is limited, more tasks are offloaded to the edge cloud for processing, which increases transmission energy consumption. At the same time, as the number of offloaded tasks increases, network resources become increasingly limited. Under limited network resources, the proposed caching strategy optimization, UAV hovering position optimization, and STAR-RIS transmission reflection coefficient matrix optimization significantly reduce system transmission energy consumption compared with the other strategies. Therefore, the system energy consumption gap between the proposed solution and the other benchmark solutions gradually widens. However, as network resources become increasingly limited, the energy consumption under full offloading is affected only by computing resources: the computing energy consumption increases, while the transmission energy consumption changes little. Therefore, the system energy consumption of the full offloading solution gradually approaches that of the proposed partial offloading solution.

Figure 9 evaluates the total energy consumption of the proposed scheme and the benchmark schemes for various computation task sizes. First, a larger computation task size leads to a higher total energy consumption of the system, because both the transmission energy consumption and the computing energy consumption for processing the tasks increase with the task size. When local sensors and UAVs are unable to handle tasks because of their limited computation capability, more computation tasks have to be offloaded to the distant edge cloud for processing. This not only increases computing energy consumption but also greatly increases transmission energy consumption, because the limited resources of UAVs and local sensors force tasks to be offloaded to a distant cloud. Furthermore, the advantage of the proposed scheme over the benchmark schemes in total energy consumption is marginal when the computation task size is small, while the advantage becomes increasingly substantial as the task size increases.

Figure 10 shows that the system energy consumption of all STAR-RIS-related schemes decreases significantly as the number of STAR-RIS elements increases. This is because more STAR-RIS elements provide more channel gain and effectively reduce transmission energy consumption. However, the no-STAR-RIS policy transmits the task requests sent by the sensors directly to the edge cloud without using the STAR-RIS transmission reflection coefficient matrix. Therefore, as the number of STAR-RIS elements increases, the system energy consumption under this policy remains unchanged. With a smaller number of STAR-RIS elements, the proposed caching strategy optimization, UAV hovering position optimization, partial offloading optimization, and STAR-RIS transmission reflection coefficient matrix significantly reduce energy consumption compared with the other strategies. However, a larger number of STAR-RIS elements provides more channel gain, which greatly reduces the transmission energy consumption of offloading tasks to a more distant cloud for processing. Therefore, the gap between the system energy consumption of the benchmark solutions and that of the proposed solution gradually narrows, but the proposed joint optimization strategy always achieves the lowest energy consumption.

Figure 11 illustrates that the system energy consumption of all schemes increases significantly with the sensors' transmit power. This is because, as the sensors' transmit power increases, the transmission delay gradually decreases, but the transmission energy consumption is proportional to the transmit power. Therefore, when the sensors' offloading ratio remains unchanged, the overall energy consumption of the system gradually increases. At a lower transmit power, the proposed caching strategy optimization, UAV hovering position optimization, partial offloading optimization, and STAR-RIS transmission reflection coefficient matrix significantly reduce energy consumption compared with the other strategies. However, a higher transmit power greatly increases the transmission energy consumption of offloading tasks to the edge cloud or UAV for processing. Therefore, the gap between the system energy consumption of the benchmark solutions and that of the proposed solution gradually narrows, but the proposed joint optimization strategy always achieves the lowest energy consumption.
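The trade-off between shorter delay and higher power can be checked with a short calculation. As a sketch under stated assumptions (a Shannon-type rate, a fixed offloading ratio, channel gain g = {| h |}^{2}, noise power σ^{2}, and the SINR taken as P_{m} g / σ^{2}), the energy for transmitting L bits over bandwidth B is

E (P_{m}) = \frac{P_{m} L}{B \log_{2} (1 + P_{m} g / σ^{2})} = \frac{σ^{2} L \ln 2}{g B} \cdot \frac{x}{\ln (1 + x)}, \quad x = \frac{P_{m} g}{σ^{2}} .

Differentiating the last factor gives

\frac{d}{d x} \frac{x}{\ln (1 + x)} = \frac{\ln (1 + x) - \frac{x}{1 + x}}{\ln^{2} (1 + x)} > 0 \quad for \; x > 0,

because \ln (1 + x) > x / (1 + x) for every x > 0. The rate grows only logarithmically with the transmit power, so the reduction in delay never compensates for the linear growth of the power term, and the transmission energy increases with P_{m}.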

We define the SINR as

γ_{n, n^{'}} = \frac{P_{n} {| h_{n, n^{'}} |}^{2}}{σ_{n, n^{'}}^{2}}

. Figure 12 shows that at a greater SINR, the signal is stronger relative to the noise. That is, the signal transmission quality is better, which improves transmission efficiency and helps to reduce energy consumption during the transmission process. At a smaller SINR, the proposed caching strategy optimization, UAV hovering position optimization, partial offloading optimization, and STAR-RIS transmission reflection coefficient matrix significantly reduce energy consumption compared with the other strategies. However, a higher SINR provides a better channel state, which greatly reduces the transmission energy consumption of offloading tasks to a more distant cloud for processing. Therefore, the gap between the system energy consumption of the benchmark solutions and that of the proposed solution gradually narrows, but the proposed joint optimization strategy always achieves the lowest energy consumption.

Figure 13 illustrates the convergence of the average weighted reward per episode for the DRL caching agent at varying learning rates. The DRL caching agent converges rapidly at all learning rates and achieves its best performance at a learning rate of 0.0003. A higher learning rate gives the current Q-value more influence than the prior Q-value, leading to faster updates. However, excessively high learning rates may destabilize the learning process and hinder convergence by over-adjusting to recent rewards. Therefore, selecting an appropriate learning rate is crucial to balancing learning speed and stability.

The above simulation results demonstrate the effectiveness of the proposed DRL-SCA algorithm in optimizing energy consumption in the aerial STAR-RIS-aided computing offloading and content caching framework. Across various scenarios, our joint optimization strategy consistently outperforms benchmark methods. Specifically, the results reveal that the proposed framework achieves faster convergence in energy optimization, as shown by the rapid decrease in energy consumption over iterations. The analysis highlights that key factors such as network bandwidth, computation task size, CPU cycles per bit, STAR-RIS element count, and sensors’ transmission power significantly impact the system’s energy performance. For instance, the proposed strategy exhibits substantial energy savings, particularly under conditions of limited network resources or high computational demand, owing to the efficient coordination of offloading and caching decisions, UAV hovering positions, and STAR-RIS passive beamforming. Additionally, the results demonstrate that higher STAR-RIS element counts and better SINR conditions further enhance energy efficiency by improving channel gain and transmission quality. The sensitivity analysis of DRL caching learning rates confirms the stability and rapid convergence of the DRL component, with optimal performance achieved at a learning rate of 0.0003. Overall, the results validate that the proposed framework achieves superior energy efficiency and scalability while maintaining robustness under dynamic conditions, showcasing its applicability in real-world UAV-based wireless sensor networks.

8. Conclusions

In this paper, the energy-efficient STAR-RIS-aided computing offloading and content caching framework is proposed in order to meet the service requirements of delay-sensitive tasks for UAV-based WSNs. Firstly, we formulated the system energy consumption minimization problem, aiming to jointly optimize content caching decisions, computing offloading decisions, UAV hovering positions, and STAR-RIS passive beamforming. Subsequently, to tackle the non-convex problem of system energy consumption minimization, we decomposed it into four subproblems and proposed a DRL-SCA algorithm for iterative optimization, achieving near-optimal solutions with low complexity. According to numerical results, the suggested framework significantly reduces the network energy consumption of the overall system in aerial STAR-RIS-aided WSNs, exhibiting a fast convergence rate.

In our future research, we plan to investigate the impact of various design parameters, including the power-to-weight ratio, to further optimize the overall system performance. We will explore the application of multi-agent proximal policy optimization (MAPPO) to enhance the content caching decisions, computing offloading decisions, UAV hovering positions, and STAR-RIS passive beamforming decision process by enabling collaborative optimization among multiple agents, further improving adaptability and the overall performance of the framework. To enhance real-world applicability, future work will focus on developing lightweight optimization models for key parameters and testing the framework under dynamic conditions, such as varying data loads, network congestion, and hardware constraints. Additionally, we will incorporate more complex real-world factors, including UAV mobility, multi-task coordination, and heterogeneous sensor networks, to further expand the framework’s applicability. Finally, we plan to explore advanced feature extraction and training techniques to improve the scalability of MAPPO algorithms for large-scale and complex network environments.

Author Contributions

Conceptualization, X.Y. and X.C.; methodology, X.Y. and Q.W.; software, X.Y. and B.Y.; validation, X.Y., Q.W. and B.Y.; formal analysis, X.Y. and X.C.; investigation, B.Y. and X.C.; resources, X.C.; data curation, X.Y. and Q.W.; writing—original draft preparation, X.Y.; writing—review and editing, Q.W. and X.C.; visualization, B.Y.; supervision, X.Y. and X.C.; project administration, X.Y. and X.C.; funding acquisition, X.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Social Science Foundation of China (22CGL017).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data analyzed during the current study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wang, N.C.; Chen, Y.L.; Huang, Y.F.; Huang, L.C.; Wang, T.Y.; Chuang, H.Y. Energy Efficient Geocasting Based on Q-Learning for Wireless Sensor Networks. In Proceedings of the 2019 International Conference on Machine Learning and Cybernetics (ICMLC), Kobe, Japan, 7–10 July 2019; pp. 1–4.
2. Akyildiz, I.F.; Melodia, T.; Chowdhury, K.R. Wireless Multimedia Sensor Networks: Applications and Testbeds. Proc. IEEE 2008, 96, 1588–1605.
3. Shen, J.; Wang, A.; Wang, C.; Hung, P.C.K.; Lai, C.F. An Efficient Centroid-Based Routing Protocol for Energy Management in WSN-Assisted IoT. IEEE Access 2017, 5, 18469–18479.
4. Ji, J.; Zhu, K.; Yi, C.; Niyato, D. Energy Consumption Minimization in UAV-Assisted Mobile-Edge Computing Systems: Joint Resource Allocation and Trajectory Design. IEEE Internet Things J. 2021, 8, 8570–8584.
5. Liao, Z.; Yin, G.; Tang, X.; Liu, P. A Cooperative Community-Based Framework for Service Caching and Task Offloading in Multi-Access Edge Computing. IEEE Trans. Netw. Serv. Manag. 2024, 21, 3224–3235.
6. Hao, Y.; Chen, M.; Hu, L.; Hossain, M.S.; Ghoneim, A. Energy Efficient Task Caching and Offloading for Mobile Edge Computing. IEEE Access 2018, 6, 11365–11373.
7. Zhang, K.; Gui, X.; Ren, D.; Li, D. Energy–Latency Tradeoff for Computation Offloading in UAV-Assisted Multiaccess Edge Computing System. IEEE Internet Things J. 2021, 8, 6709–6719.
8. Hu, X.; Wong, K.K.; Yang, K.; Zheng, Z. UAV-Assisted Relaying and Edge Computing: Scheduling and Trajectory Optimization. IEEE Trans. Wireless Commun. 2019, 18, 4738–4752.
9. Zhang, T.; Wang, Y.; Liu, Y.; Xu, W.; Nallanathan, A. Cache-Enabling UAV Communications: Network Deployment and Resource Allocation. IEEE Trans. Wireless Commun. 2020, 19, 7470–7483.
10. Li, M.; Cheng, N.; Gao, J.; Wang, Y.; Zhao, L.; Shen, X. Energy-Efficient UAV-Assisted Mobile Edge Computing: Resource Allocation and Trajectory Optimization. IEEE Trans. Veh. Technol. 2020, 69, 3424–3438.
11. Gao, X.; Zhai, L. Service Experience Oriented Cooperative Computing in Cache-Enabled UAVs Assisted MEC Networks. IEEE Trans. Mob. Comput. 2024, 23, 9721–9736.
12. Zhong, L.; Yang, S.; Song, K.; Wang, M.; Jiang, K.; Muntean, G.M. MDC2: An Integrated Communication and Computing Framework to Optimize Edge-Assisted Caching for Improved Multimedia Services in UAV-Based IoT Networks. IEEE Internet Things J. 2024, 11, 32393–32403.
13. Guo, H.; Liu, J. UAV-Enhanced Intelligent Offloading for Internet of Things at the Edge. IEEE Trans. Ind. Inform. 2020, 16, 2737–2746.
14. Wu, C.; You, C.; Liu, Y.; Gu, X.; Cai, Y. Channel Estimation for STAR-RIS-Aided Wireless Communication. IEEE Commun. Lett. 2022, 26, 652–656.
15. Liu, Z.; Li, Z.; Wen, M.; Gong, Y.; Wu, Y.C. STAR-RIS-Aided Mobile Edge Computing: Computation Rate Maximization With Binary Amplitude Coefficients. IEEE Trans. Commun. 2023, 71, 4313–4327.
16. Yang, S.; Xie, C.; Lyu, W.; Ning, B.; Zhang, Z.; Yuen, C. Near-Field Channel Estimation for Extremely Large-Scale Reconfigurable Intelligent Surface (XL-RIS)-Aided Wideband mmWave Systems. IEEE J. Sel. Areas Commun. 2024, 42, 1567–1582.
17. Mozaffari, M.; Saad, W.; Bennis, M.; Nam, Y.H.; Debbah, M. A Tutorial on UAVs for Wireless Networks: Applications, Challenges, and Open Problems. IEEE Commun. Surv. Tut. 2019, 21, 2334–2360.
18. Qin, X.; Song, Z.; Hou, T.; Yu, W.; Wang, J.; Sun, X. Joint Resource Allocation and Configuration Design for STAR-RIS-Enhanced Wireless-Powered MEC. IEEE Trans. Commun. 2023, 71, 2381–2395.
19. Yu, S.; Langar, R.; Fu, X.; Wang, L.; Han, Z. Computation Offloading With Data Caching Enhancement for Mobile Edge Computing. IEEE Trans. Veh. Technol. 2018, 67, 11098–11112.
20. Wang, J.; Liu, K.; Pan, J. Online UAV-Mounted Edge Server Dispatching for Mobile-to-Mobile Edge Computing. IEEE Internet Things J. 2020, 7, 1375–1386.
21. Zhao, Y.; Liu, C.; Hu, X.; He, J.; Peng, M.; Ng, D.W.K.; Quek, T.Q. Joint Content Caching, Service Placement and Task Offloading in UAV-Enabled Mobile Edge Computing Networks. IEEE J. Sel. Areas Commun. 2024. early access.
22. Huang, J.; Zhang, M.; Wan, J.; Chen, Y.; Zhang, N. Joint Data Caching and Computation Offloading in UAV-assisted Internet of Vehicles via Federated Deep Reinforcement Learning. IEEE Trans. Veh. Technol. 2024. early access.
23. Aung, P.S.; Nguyen, L.X.; Tun, Y.K.; Han, Z.; Hong, C.S. Aerial STAR-RIS Empowered MEC: A DRL Approach for Energy Minimization. IEEE Wireless Commun. Lett. 2024, 13, 1409–1413.
24. Lyu, W.; Xiu, Y.; Yang, S.; Yeoh, P.L.; Li, Y.; Zhang, Z. Weighted Sum Age of Information Minimization in Wireless Networks With Aerial IRS. IEEE Trans. Veh. Technol. 2023, 72, 5390–5394.
25. Ren, J.; Yu, G.; He, Y.; Li, G.Y. Collaborative Cloud and Edge Computing for Latency Minimization. IEEE Trans. Veh. Technol. 2019, 68, 5031–5044.
26. Bi, S.; Huang, L.; Zhang, Y.J.A. Joint Optimization of Service Caching Placement and Computation Offloading in Mobile Edge Computing Systems. IEEE Trans. Wireless Commun. 2020, 19, 4947–4963.
27. Xiu, Y.; Zhao, Y.; Yang, S.; Xu, M.; Niyato, D.; Li, Y.; Wei, N. Delay Minimization for Movable Antennas-Enabled Anti-Jamming Communications with Mobile Edge Computing. arXiv 2024, arXiv:2409.14418.
28. Meng, K.; Wu, Q.; Xu, J.; Chen, W.; Feng, Z.; Schober, R.; Swindlehurst, A.L. UAV-Enabled Integrated Sensing and Communication: Opportunities and Challenges. IEEE Wireless Commun. 2024, 31, 97–104.
29. Wu, H.; Lyu, F.; Zhou, C.; Chen, J.; Wang, L.; Shen, X. Optimal UAV Caching and Trajectory in Aerial-Assisted Vehicular Networks: A Learning-Based Approach. IEEE J. Sel. Areas Commun. 2020, 38, 2783–2797.
30. Zhang, Q.; Zhao, Y.; Li, H.; Hou, S.; Song, Z. Joint Optimization of STAR-RIS Assisted UAV Communication Systems. IEEE Wireless Commun. Lett. 2022, 11, 2390–2394.
31. Wei, Z.; Cai, Y.; Sun, Z.; Ng, D.W.K.; Yuan, J.; Zhou, M.; Sun, L. Sum-Rate Maximization for IRS-Assisted UAV OFDMA Communication Systems. IEEE Trans. Wireless Commun. 2021, 20, 2530–2550.
32. Mu, X.; Liu, Y.; Guo, L.; Lin, J.; Schober, R. Simultaneously Transmitting and Reflecting (STAR) RIS Aided Wireless Communications. IEEE Trans. Wireless Commun. 2022, 21, 3083–3098.
33. Su, Y.; Pang, X.; Chen, S.; Jiang, X.; Zhao, N.; Yu, F.R. Spectrum and Energy Efficiency Optimization in IRS-Assisted UAV Networks. IEEE Trans. Commun. 2022, 70, 6489–6502.
34. Luo, Y.; Ding, W.; Zhang, B. Optimization of Task Scheduling and Dynamic Service Strategy for Multi-UAV-Enabled Mobile-Edge Computing System. IEEE Trans. Cogn. Commun. Netw. 2021, 7, 970–984.
35. Zhang, Z.; Chen, J.; Liu, Y.; Wu, Q.; He, B.; Yang, L. On the Secrecy Design of STAR-RIS Assisted Uplink NOMA Networks. IEEE Trans. Wireless Commun. 2022, 21, 11207–11221.
36. Zeng, Y.; Chen, S.; Cui, Y.; Yang, J.; Fu, Y. Joint Resource Allocation and Trajectory Optimization in UAV-Enabled Wirelessly Powered MEC for Large Area. IEEE Internet Things J. 2023, 10, 15705–15722.
37. Su, Y.; Pang, X.; Lu, W.; Zhao, N.; Wang, X.; Nallanathan, A. Joint Location and Beamforming Optimization for STAR-RIS Aided NOMA-UAV Networks. IEEE Trans. Veh. Technol. 2023, 72, 11023–11028.
38. Shnaiwer, Y.N.; Kaneko, M. Minimizing IoT Energy Consumption by IRS-Aided UAV Mobile Edge Computing. IEEE Netw. Lett. 2023, 5, 16–20.
39. Ma, Z.; Xiao, M.; Xiao, Y.; Pang, Z.; Poor, H.V.; Vucetic, B. High-Reliability and Low-Latency Wireless Communication for Internet of Things: Challenges, Fundamentals, and Enabling Technologies. IEEE Internet Things J. 2019, 6, 7946–7970.
40. Tang, Q.; Li, B.; Yang, H.H.; Li, Y.; He, S.; Yang, K. Delay and Load Fairness Optimization with Queuing Model in Multi-UAV Assisted MEC: A Deep Reinforcement Learning Approach. IEEE Trans. Netw. Serv. Manag. 2024. early access.
41. Qin, P.; Fu, Y.; Zhang, J.; Geng, S.; Liu, J.; Zhao, X. DRL-Based Resource Allocation and Trajectory Planning for NOMA-Enabled Multi-UAV Collaborative Caching 6G Network. IEEE Trans. Veh. Technol. 2024, 73, 8750–8764.
42. Lin, N.; Bai, L.; Hawbani, A.; Guan, Y.; Mao, C.; Liu, Z.; Zhao, L. Deep-Reinforcement-Learning-Based Computation Offloading for Servicing Dynamic Demand in Multi-UAV-Assisted IoT Network. IEEE Internet Things J. 2024, 11, 17249–17263.

Figure 1. System model of aerial STAR-RIS-aided WSN.

Figure 2. Illustration of task caching and offloading for STAR-RIS-aided UAV system.

Figure 3. Time allocation for task processing in STAR-RIS-aided UAV system.

Figure 5. Workflow of PPO algorithm.

Figure 6. Energy consumption versus the number of iterations.

Figure 7. Energy consumption versus network bandwidth.

Figure 8. Energy consumption versus CPU cycles required for computing 1 bit of task data.

Figure 9. Energy consumption versus computation task size.

Figure 10. Energy consumption versus number of elements.

Figure 11. Energy consumption versus sensors’ transmit power.

Figure 12. Energy consumption versus SINR.

Figure 13. Convergence of average weighted reward sum for various caching DRL learning rates.

Table 1. Summary of important notation.

Symbol	Definition
$M_{c}$	Total number of elements in each column of STAR-RIS
$M_{r}$	Total number of elements in each row of STAR-RIS
$r_{j}$	Location of node j
$r_{(m_{c}, m_{r})}$	Location of the $(m_{c}, m_{r})$-th STAR-RIS element
$h_{m, u}$	Channel gain from sensor m to UAV u
$h_{m, (m_{c}, m_{r})}$	Channel gain from sensor m to the $(m_{c}, m_{r})$-th STAR-RIS element
$g_{(m_{c}, m_{r}), c}$	Channel gain from the $(m_{c}, m_{r})$-th STAR-RIS element to edge cloud c
P	Transmit power
B	Spectrum bandwidth
R	Transmission rate
$f_{j}$	Computing capability (CPU cycles per second) of node j
$\alpha$	Task offloading ratio
$\beta_{(m_{c}, m_{r})}^{a}$	Amplitude coefficient of the $(m_{c}, m_{r})$-th element for reflection or transmission ($a \in \{r, t\}$)
$\theta_{(m_{c}, m_{r})}^{a}$	Phase shift of the $(m_{c}, m_{r})$-th element for reflection or transmission ($a \in \{r, t\}$)
$\kappa_{j}$	Effective switched capacitance of node j
$L_{k}$	Requested data size of task k (uplink)
$\omega_{k}$	Number of CPU cycles required to compute 1 bit of task k

Table 2. Simulation parameters.

Notation	Simulation Value	Notation	Simulation Value
$\kappa_{j}$	$1 \times 10^{-28}$	$P_{m}$	30 dBm
$f_{c}$	$1 \times 10^{10}$ cycles/s	$B_{m, j'}$	3.2 MHz
$f_{u}$	$40 \times 10^{7}$ cycles/s	$L_{k}$	$6 \times 10^{8}$ bit
$f_{m}$	$2 \times 10^{8}$ cycles/s	$\omega_{k}$	600 cycles/bit
$T_{m}^{k, \max}$	0.1 s	$M_{c} \times M_{r}$	9
$g_{0}$	$-30$ dB	$x_{u}^{\max}$	40 m
$\sigma_{n, n'}^{2}$	$-174$ dBm	$y_{u}^{\max}$	40 m
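
Several entries in Table 2 are given in logarithmic units (dBm, dB), whereas rate and energy expressions generally need linear quantities. The following is a minimal sketch, not the authors' simulation code, that collects the Table 2 values in one place and converts the logarithmic entries to linear units. The dictionary `SIM_PARAMS`, its key names, and the helpers `dbm_to_watt` and `db_to_linear` are hypothetical names introduced only for illustration. The sketch also assumes that the computing capabilities are in CPU cycles per second, following the definition of $f_{j}$ in Table 1.

```python
def dbm_to_watt(p_dbm: float) -> float:
    """Convert a power level in dBm to watts (0 dBm = 1 mW)."""
    return 10 ** ((p_dbm - 30) / 10)


def db_to_linear(g_db: float) -> float:
    """Convert a gain in dB to a linear power ratio."""
    return 10 ** (g_db / 10)


# Values copied from Table 2; the key names are hypothetical.
SIM_PARAMS = {
    "kappa_j": 1e-28,      # effective switched capacitance
    "f_c": 1e10,           # edge cloud computing capability (assumed cycles/s)
    "f_u": 40e7,           # UAV computing capability (assumed cycles/s)
    "f_m": 2e8,            # sensor computing capability (assumed cycles/s)
    "T_max_s": 0.1,        # maximum tolerable delay in seconds
    "g0_db": -30.0,        # g_0 in dB
    "noise_dbm": -174.0,   # noise power in dBm, as listed
    "P_m_dbm": 30.0,       # sensor transmit power in dBm
    "B_hz": 3.2e6,         # bandwidth in Hz
    "L_k_bits": 6e8,       # task data size in bits
    "omega_k": 600,        # CPU cycles per bit
    "num_elements": 9,     # M_c x M_r STAR-RIS elements
    "x_u_max_m": 40.0,     # UAV x-range in metres
    "y_u_max_m": 40.0,     # UAV y-range in metres
}

# Linear-scale quantities derived from the logarithmic entries.
p_m_watt = dbm_to_watt(SIM_PARAMS["P_m_dbm"])      # 30 dBm corresponds to 1 W
g0_linear = db_to_linear(SIM_PARAMS["g0_db"])      # -30 dB corresponds to 1e-3
noise_watt = dbm_to_watt(SIM_PARAMS["noise_dbm"])
```

Keeping the table values in their published units and converting at the point of use makes it easier to check a script against Table 2 line by line.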

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
