Article

Online Resource Allocation and Trajectory Optimization of STAR–RIS–Assisted UAV–MEC System

NEUQ WiNet Laboratory, Northeastern University, Shenyang 110819, China
*
Author to whom correspondence should be addressed.
Drones 2025, 9(3), 207; https://doi.org/10.3390/drones9030207
Submission received: 24 January 2025 / Revised: 3 March 2025 / Accepted: 12 March 2025 / Published: 14 March 2025

Abstract

In urban environments, complex propagation conditions often block the links between ground users (GUs) and unmanned aerial vehicles (UAVs), resulting in poor communication quality. Although traditional reconfigurable intelligent surfaces (RISs) can improve wireless channel quality, they only reflect signals and therefore have limited coverage. For this reason, we study a novel simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR–RIS)–assisted UAV–mobile edge computing (UAV–MEC) network, which serves multiple users located in both the transmission area and the reflection area, and switches between reflection and transmission modes according to the relative positions of the UAV, GUs, and STAR–RIS, providing users with more flexible and efficient services. The system jointly considers user transmit power, time slot allocation, UAV flight trajectory, STAR–RIS mode selection, and the phase angle matrix, minimizing long–term energy consumption while keeping the task backlog queues stable. Since the proposed problem is a long–term stochastic optimization problem, we use the Lyapunov method to transform it into three deterministic online optimization subproblems and solve them alternately. Specifically, we first use the Lambert function to derive a closed-form solution for the transmit power; then, we use Lagrange duality and the Karush–Kuhn–Tucker conditions to solve the time slot allocation; finally, successive convex approximation is used to obtain the UAV trajectory with lower complexity, and the triangle inequality is used to solve the STAR–RIS phase shift. The simulation results show that the proposed scheme outperforms the benchmark schemes in maintaining queue stability and reducing energy consumption.

1. Introduction

The rapid growth of Internet of Things (IoT) devices has driven the development of mobile edge computing (MEC) systems supported by unmanned aerial vehicles (UAVs), known as UAV–MEC systems. These systems address overload traffic requirements and improve communication quality [1,2,3]. UAVs, with their low cost and high maneuverability, enable rapid deployment in areas lacking infrastructure. As airborne MEC nodes, they aggregate computing resources near ground users (GUs), reducing communication distances and enhancing quality of service (QoS) [4,5]. However, challenges remain in UAV–MEC applications, particularly in harsh wireless environments caused by ground obstacles.
To address these challenges, reconfigurable intelligent surfaces (RISs) have been introduced to improve energy efficiency and communication stability in UAV–MEC systems [6,7,8]. Recent studies have explored combining UAVs and RISs to optimize communication quality and energy consumption [9,10,11]. However, traditional RISs only reflect signals, which limits their coverage. To overcome this, simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR–RISs) have been proposed, enabling 360-degree full-space coverage and enhanced signal propagation [12,13,14].
Unlike traditional RIS, GU signals can be reflected and transmitted simultaneously by various elements of STAR–RIS, achieving 360-degree full-space coverage. STAR–RIS provides additional degrees of freedom for signal propagation operations by simultaneously optimizing transmission and reflection coefficients, which can further enhance the performance of various wireless communication systems [12,13,14]. Reference [15] added STAR–RIS to UAV networks and, under the constraint of service quality requirements, jointly optimized the position and power allocation of UAVs and passive reflection/transmission beamforming of STAR–RIS to maximize the overall transmission rate. Reference [16] proposed using STAR–RIS installed on UAVs to provide an improved signal-to-noise ratio for numerous GUs in remote or inaccessible areas. By optimizing the STAR–RIS phase shift matrix and UAV position, the overall data rate of GUs under UAV mobility constraints is maximized. Reference [17] proposed a downlink system model for full-coverage STAR–RIS UAV secure communication, which achieves maximum energy efficiency by exploring joint resource allocation of the system. The above research proves that STAR–RIS can improve the communication performance of UAV systems. Therefore, we attempt to incorporate STAR–RIS into the UAV–assisted MEC network to provide better channel conditions for GU offloading. References [18,19] both propose a STAR–RIS–assisted UAV–MEC scheme that enables bidirectional task offloading to both ground base stations and UAVs, optimizing resource allocation, user scheduling, STAR–RIS beamforming, and UAV trajectory to maximize offloaded tasks while ensuring QoS. Extensive experimental analyses in these studies have demonstrated that STAR–RIS significantly outperforms conventional RIS in enhancing the service quality of UAV–assisted MEC systems. 
However, these studies do not fully address critical aspects such as long–term system stability and energy consumption under varying network conditions. Therefore, further research is needed to explore these challenges and develop more robust solutions for practical deployment.
In the STAR–RIS–assisted UAV–MEC network, GUs randomly generate tasks in each time slot, many of which cannot be processed immediately, forming task queues. Excessive queue lengths can delay task processing, degrading system performance and user experience. Thus, ensuring long–term queue stability for GUs and UAVs is a critical challenge. The Lyapunov method is a powerful framework for ensuring the long–term stable operation of dynamic systems. By constructing a Lyapunov function, it balances system utility and queue backlog, ensuring that task queues remain stable while optimizing performance metrics such as energy efficiency and task completion rates. Specifically, the drift-plus-penalty approach is employed to minimize system energy consumption while maintaining queue stability, which is critical for real-time applications in mobile edge computing (MEC) systems. Existing studies, such as [20,21], demonstrate the effectiveness of the Lyapunov method in UAV–assisted MEC systems. For instance, reference [20] jointly optimizes computation offloading, resource allocation, and UAV trajectory to minimize energy consumption under random task arrivals, while [21] minimizes the average weighted energy consumption of ground users (GUs) while ensuring data queue stability and UAV energy constraints. These works highlight that the Lyapunov method uniquely achieves long–term queue stability and improved quality of service (QoS), which is challenging for other optimization algorithms to accomplish. However, the above studies did not simultaneously consider the issue of task queue stability in STAR–RIS–assisted UAV–MEC networks. The stochastic optimization problem in MEC systems is challenging due to time coupling. Existing approaches, such as intelligent optimization algorithms [22], convex optimization via CVX [23], and reinforcement learning [24], provide approximate solutions but suffer from high time complexity, making them unsuitable for real-time applications. 
Therefore, designing an efficient, low-complexity online optimization algorithm is essential.
This paper proposes an efficient and energy-saving resource allocation scheme for STAR–RIS assisted UAV–MEC networks. By considering the high-speed mobility of UAVs and the random arrival of GU tasks, our scheme ensures the long–term stability of both GU and UAV task queues. The key contributions of this work are summarized as follows:
  • A STAR–RIS–assisted UAV–MEC network model is established, where UAVs equipped with MEC servers provide MEC services to GUs. STAR–RIS dynamically switches between reflection and transmission modes based on the relative positions of UAVs, STAR–RIS, and GUs, improving channel conditions and offering flexible signal transmission methods. This overcomes the coverage limitations of traditional RIS, which can only reflect GU signals. Considering the random arrival of GU tasks and the need for long–term queue stability, this paper jointly optimizes GU transmit power, offloading time allocation, STAR–RIS phase shift, and UAV trajectory to minimize system energy consumption while ensuring queue stability.
  • To solve the long–term optimization problem, we introduce the Lyapunov method to transform it into deterministic subproblems for each time slot. A low-complexity alternating optimization algorithm is designed, which uses the Lambert function to derive a closed-form solution for the transmit power, applies Lagrange duality and the Karush–Kuhn–Tucker (KKT) conditions to optimize the offloading time allocation, employs successive convex approximation (SCA) for low-complexity UAV trajectory planning, and solves the STAR–RIS phase shift using the triangle inequality.
  • The proposed algorithm is validated through extensive simulations, where it is compared with traditional RIS models and the chaotic particle swarm optimization (CPSO) algorithm. The results demonstrate that our approach outperforms benchmark methods in ensuring task queue stability and reducing energy consumption.
The rest of this paper is organized as follows: Section 2 presents our STAR–RIS–assisted UAV–MEC network model and related optimization problems. In Section 3, the process of transforming problems using Lyapunov theory and the algorithm of this paper are introduced. Section 4 analyzes the effectiveness of the algorithm, and our simulation results are presented in Section 5. Finally, Section 6 provides a summary of this paper.

2. System Model and Problem Formulation

2.1. Overview

As illustrated in Figure 1, a basic STAR–RIS–assisted UAV–MEC network consists of a computing-oriented UAV acting as the aerial MEC server, a STAR–RIS located in the center of the network, and $K$ single-antenna GUs that are randomly and uniformly distributed in the urban area. We assume that the STAR–RIS consists of $M_R = M_{Rx} \times M_{Ry}$ elements, where $M_{Rx}$ and $M_{Ry}$ represent the numbers of horizontal and vertical elements, respectively. We define the set of all GUs as $\mathcal{K} = \{1, 2, \ldots, K\}$. Based on the relative positions of the UAV, STAR–RIS, and GUs, the GUs are divided into two categories: GUs on the same side of the STAR–RIS as the UAV are called reflection GUs, while GUs on the opposite side are called transmission GUs. In addition, each GU continuously generates computationally intensive tasks and offloads a portion of them to the UAV's MEC server through a partial offloading mode. There are two signal links between a GU and the UAV: the direct link, and the reflection or transmission link established through the STAR–RIS. The signals from both links reach the receiver at the same time and together form the received signal.
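The reflection/transmission split described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the STAR–RIS is a vertical plane through its horizontal position with a known horizontal normal vector; the function name `classify_gus` and the `normal` parameter are hypothetical and do not come from the paper.

```python
def classify_gus(uav_xy, ris_xy, gu_xy_list, normal=(1.0, 0.0)):
    """Split GU indices into (reflection, transmission) lists.

    A GU counts as a reflection GU when it lies on the same side of the
    STAR-RIS plane as the UAV, and as a transmission GU otherwise.
    Assumption: the plane passes through ris_xy with horizontal normal `normal`.
    """
    def side(point):
        # Signed offset of a point from the STAR-RIS plane along the normal.
        return ((point[0] - ris_xy[0]) * normal[0]
                + (point[1] - ris_xy[1]) * normal[1])

    uav_side = side(uav_xy)
    reflection, transmission = [], []
    for k, gu in enumerate(gu_xy_list):
        # Same sign (or lying on the plane) -> same side as the UAV.
        if side(gu) * uav_side >= 0:
            reflection.append(k)
        else:
            transmission.append(k)
    return reflection, transmission


gus = [(5.0, 1.0), (-5.0, 2.0), (3.0, -4.0)]
east = classify_gus((10.0, 0.0), (0.0, 0.0), gus)
west = classify_gus((-10.0, 0.0), (0.0, 0.0), gus)
```

As the UAV crosses the STAR–RIS plane, the two sets swap, which is why the sets depend on the time slot.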
As shown in Figure 2, the system consists of three modules, namely the UAV module, GU module, and STAR–RIS module. The Software–Defined Networking (SDN) controller decouples the forwarding plane and control plane of network devices, managing network devices, orchestrating network services, and scheduling business traffic through controllers. The SDN controller deployed on the UAV collects real-time network environment information of the entire system and develops a resource allocation scheme in a programmed manner based on the collected information to improve system performance. Specifically, the system information (task generation volume, trajectory, queue length, etc.) of GUs, STAR–RIS, and the UAV is transmitted to the SDN controller, which quickly generates a reasonable resource allocation scheme by executing the algorithm described in this paper. Furthermore, the system transmits control decisions to the UAV, GUs, and STAR–RIS through the communication units of each module. In this process, GUs, upon receiving control commands from the SDN controller, offload tasks to the UAV through their communication units. The UAV, after receiving control commands from the SDN controller, accepts the offloaded tasks through its communication unit and adjusts its flight trajectory via its flight control unit. Simultaneously, STAR–RIS, upon receiving control instructions from the SDN controller, adjusts its mode and phase through its control unit. By reflecting or transmitting GU signals, STAR–RIS improves the channel conditions between GUs and the UAV. These three modules work collaboratively to effectively execute control decisions and allocate resources efficiently and reasonably.
The UAV flies over the GUs during the flight cycle $T$ to provide MEC services. The total service duration $T$ is divided into $N$ equal time slots, indexed by $\mathcal{N} = \{1, 2, 3, \ldots, N\}$, each of length $\tau = T/N$. The value of $\tau$ is small enough to assume that the UAV position and channel conditions remain unchanged within each time slot. Moreover, we use a three-dimensional Cartesian coordinate system and let the UAV fly at a fixed height $H_u$ to avoid additional energy consumption caused by altitude changes. Its time-varying horizontal position is denoted as $S_u(n) = [x_u(n), y_u(n)]$, and the position of GU $k$, which remains almost static within a time slot $\tau$, is denoted as $S_k = [x_k, y_k, 0]$. The STAR–RIS is mounted at a fixed position on a building, denoted by $S_R = [x_R, y_R, H_R]$.

2.2. Communication Model

The communication link between a GU and the UAV consists of a direct link and a STAR–RIS–assisted (reflection or transmission) link. Therefore, we need to model the channels among the UAV, GUs, and STAR–RIS separately. In time slot $n$, $h_k^{UG}(n)$ represents the channel gain between the UAV and GU $k$, $h^{UR}(n)$ represents the channel gain between the UAV and the STAR–RIS, and $h_k^{RG}(n)$ represents the channel gain between the STAR–RIS and GU $k$. In complex urban scenarios, the communication link between the UAV and the GUs may be obstructed by obstacles such as buildings and trees, which degrades the direct link. We assume that the communication link between the UAV and the GUs is composed of line–of–sight (LoS) and non–line–of–sight (NLoS) links. The path loss models for the LoS and NLoS links are [25]
$$h_k^{\xi}(n) = \left( \frac{2\pi d_k^{UG}(n) f_r}{c} \right)^{2} \eta_{\xi}, \quad \xi = 0, 1, \tag{1}$$
where $d_k^{UG}(n)$ is the three–dimensional distance between GU $k$ and the UAV, $f_r$ is the carrier frequency, $c$ is the speed of light, $\xi$ represents the communication link state ($\xi = 0$ for the LoS channel condition and $\xi = 1$ for the NLoS channel condition), and $\eta_{\xi}$ represents the additional path loss. The probability of an LoS connection between them is expressed as [25]
$$P_k^{LoS}(n) = \frac{1}{1 + A e^{-B\left(\gamma_k^{UG}(n) - A\right)}}, \tag{2}$$
where $A$ and $B$ are environmental parameters, and $\gamma_k^{UG}(n) = (180/\pi) \arcsin\left(H_u / d_k^{UG}(n)\right)$ is the elevation angle between GU $k$ and the UAV. Since only LoS and NLoS links are considered in our scenario, the NLoS probability is $P_k^{NLoS}(n) = 1 - P_k^{LoS}(n)$. The phase angle of the direct link is composed of the phase angle of the LoS link and a random phase angle. Considering that the dominant channel between the UAV and a GU is the LoS link, the phase angle of the LoS link can be used in place of the phase angle of the average path gain. According to reference [26], we use the statistical average of the LoS and NLoS path gains to determine the channel gain:
$$h_k^{UG}(n) = \left( P_k^{LoS}(n)\, h_k^{0}(n) + P_k^{NLoS}(n)\, h_k^{1}(n) \right) \cdot \vartheta, \tag{3}$$
where $\vartheta = e^{-j 2\pi d_k^{UG}(n) / \lambda_c}$ is the phase term of the LoS link, and $\lambda_c = c / f_r$ represents the carrier wavelength. This phase is caused by the propagation delay of the LoS component of the UAV–GU link; it is determined only by the positions of the UAV and the GU and is therefore known to the system.
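As a rough numerical illustration of the direct-link model, the sketch below evaluates the LoS probability from the elevation angle and combines given LoS and NLoS gains into the averaged channel with the LoS phase term. The environmental values `A_ENV` and `B_ENV` and the gain values are illustrative assumptions, not the paper's simulation settings, and the function names are hypothetical.

```python
import cmath
import math


def los_probability(uav_height, distance, a, b):
    """LoS probability as a function of the elevation angle in degrees."""
    elevation_deg = (180.0 / math.pi) * math.asin(uav_height / distance)
    return 1.0 / (1.0 + a * math.exp(-b * (elevation_deg - a)))


def direct_channel(p_los, gain_los, gain_nlos, distance, wavelength):
    """Probability-weighted average gain multiplied by the LoS phase term."""
    average_gain = p_los * gain_los + (1.0 - p_los) * gain_nlos
    phase = cmath.exp(-2j * math.pi * distance / wavelength)
    return average_gain * phase


# Illustrative environment parameters (assumed, not from the paper).
A_ENV, B_ENV = 9.61, 0.16
p_far = los_probability(100.0, 500.0, A_ENV, B_ENV)   # low elevation angle
p_near = los_probability(100.0, 110.0, A_ENV, B_ENV)  # UAV almost overhead
h_direct = direct_channel(p_near, 1e-6, 1e-8, 110.0, 0.15)
```

The LoS probability grows as the UAV moves closer to being directly above the GU, which is one reason the trajectory affects the direct link.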
Since both the UAV and the STAR–RIS are far above the ground, the link between them can be considered a pure LoS link. The channel coefficient between the UAV and the STAR–RIS is modeled as [27]
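$$h^{UR}(n) = \sqrt{\frac{\beta_0}{d^{UR}(n)^{a^{UR}}}} \cdot e^{-j\frac{2\pi}{\lambda_c} d^{UR}(n)} \times \left[ 1, e^{-j\frac{2\pi}{\lambda_c}\Delta_{Rx} \sin\theta^{UR}(n) \cos\delta^{UR}(n)}, \ldots, e^{-j\frac{2\pi}{\lambda_c}\Delta_{Rx}(M_{Rx}-1) \sin\theta^{UR}(n) \cos\delta^{UR}(n)} \right]^{H} \otimes \left[ 1, e^{-j\frac{2\pi}{\lambda_c}\Delta_{Ry} \sin\theta^{UR}(n) \sin\delta^{UR}(n)}, \ldots, e^{-j\frac{2\pi}{\lambda_c}\Delta_{Ry}(M_{Ry}-1) \sin\theta^{UR}(n) \sin\delta^{UR}(n)} \right]^{H}, \tag{4}$$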
where $\beta_0$ is the unit path gain, and $a^{UR}$ is the path gain exponent of the channel between the UAV and the STAR–RIS. The angles satisfy $\sin\theta^{UR}(n) = \frac{H_u(n) - H_R}{d^{UR}(n)}$, $\cos\delta^{UR}(n) = \frac{y_R - y_u(n)}{\sqrt{(x_R - x_u(n))^2 + (y_R - y_u(n))^2}}$, and $\sin\delta^{UR}(n) = \frac{x_R - x_u(n)}{\sqrt{(x_R - x_u(n))^2 + (y_R - y_u(n))^2}}$. $d^{UR}(n)$ is the three–dimensional distance between the UAV and the STAR–RIS, and $\Delta_{Rx}$ and $\Delta_{Ry}$ are the spacings between STAR–RIS elements in the horizontal and vertical directions, respectively. In addition, the channel between the STAR–RIS and GU $k$ adopts the Rician fading model [28,29]:
$$h_k^{RG}(n) = \sqrt{\frac{\beta_0}{d_k^{RG}(n)^{a^{RG}}}} \left[ \sqrt{\frac{k^{RG}}{1 + k^{RG}}}\, h_k^{RG,LoS}(n) + \sqrt{\frac{1}{1 + k^{RG}}}\, \Delta h_k^{RG}(n) \right], \tag{5}$$
where
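$$h_k^{RG,LoS}(n) = e^{-j\frac{2\pi}{\lambda_c} d_k^{RG}(n)} \times \left[ 1, e^{-j\frac{2\pi}{\lambda_c}\Delta_{Rx} \sin\theta_k^{RG}(n) \cos\delta_k^{RG}(n)}, \ldots, e^{-j\frac{2\pi}{\lambda_c}\Delta_{Rx}(M_{Rx}-1) \sin\theta_k^{RG}(n) \cos\delta_k^{RG}(n)} \right]^{H} \otimes \left[ 1, e^{-j\frac{2\pi}{\lambda_c}\Delta_{Ry} \sin\theta_k^{RG}(n) \sin\delta_k^{RG}(n)}, \ldots, e^{-j\frac{2\pi}{\lambda_c}\Delta_{Ry}(M_{Ry}-1) \sin\theta_k^{RG}(n) \sin\delta_k^{RG}(n)} \right]^{H} \tag{6}$$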
is the LoS component of the STAR–RIS–GU channel, which is known to the system. $\Delta h_k^{RG}(n)$, $a^{RG} \geq 0$, and $k^{RG} \geq 0$ represent the random scattering component, path gain exponent, and Rician factor of the STAR–RIS–GU channel, respectively. The angles satisfy $\sin\theta_k^{RG}(n) = \frac{H_u(n)}{d_k^{RG}(n)}$, $\cos\delta_k^{RG}(n) = \frac{y_R - y_k(n)}{\sqrt{(x_R - x_k(n))^2 + (y_R - y_k(n))^2}}$, and $\sin\delta_k^{RG}(n) = \frac{x_R - x_k(n)}{\sqrt{(x_R - x_k(n))^2 + (y_R - y_k(n))^2}}$. $d_k^{RG}(n)$ is the three–dimensional distance from GU $k$ to the STAR–RIS. $\Delta h_k^{RG}(n)$ follows a circularly symmetric complex Gaussian distribution with a mean of 0 and a variance of 1.
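The array-response part of the STAR–RIS channels is a Kronecker product of a horizontal and a vertical phase progression. The following sketch builds that vector in plain Python; it deliberately omits the amplitude prefactor, the common distance phase, and the Hermitian transpose, so it only illustrates the element ordering and unit-modulus structure. The helper names are hypothetical.

```python
import cmath
import math


def steering(num_elements, spacing, wavelength, direction):
    """Phase progression [1, e^{-j 2 pi spacing*direction / wavelength}, ...]."""
    return [cmath.exp(-2j * math.pi * spacing * m * direction / wavelength)
            for m in range(num_elements)]


def kron(a, b):
    """Kronecker product of two vectors given as Python lists."""
    return [x * y for x in a for y in b]


def planar_response(m_x, m_y, d_x, d_y, wavelength,
                    sin_theta, cos_delta, sin_delta):
    """Array response of an m_x-by-m_y planar surface (prefactors omitted)."""
    horizontal = steering(m_x, d_x, wavelength, sin_theta * cos_delta)
    vertical = steering(m_y, d_y, wavelength, sin_theta * sin_delta)
    return kron(horizontal, vertical)


# Half-wavelength spacing and arbitrary illustrative angles (assumed values).
response = planar_response(4, 3, 0.075, 0.075, 0.15, 0.6, 0.8, 0.6)
broadside = planar_response(4, 3, 0.075, 0.075, 0.15, 0.0, 1.0, 0.0)
```

The resulting vector has $M_{Rx} \times M_{Ry}$ entries, one per STAR–RIS element, which matches the size of the diagonal phase matrices used later.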
Time Division Multiple Access (TDMA) is a well–established transmission technique widely used in IoT communication networks, effectively avoiding interference among GUs through precise phase alignment in each sub–slot [30]. Compared to Frequency Division Multiple Access (FDMA), TDMA reduces inter–user interference by approximately 10% in dynamic environments [31], making it particularly suitable for the flexible and changing topology of STAR–RIS–assisted UAV–MEC networks. Therefore, this paper adopts TDMA communication. In this system, the GUs are divided into two sets in the $n$-th time slot. The first set contains the reflection GUs, $\mathcal{K}_{n,r} = \{1, 2, 3, \ldots, K_{n,r}\}$, with cardinality $K_{n,r}$, and the second set contains the transmission GUs, $\mathcal{K}_{n,t} = \{1, 2, 3, \ldots, K_{n,t}\}$, with cardinality $K_{n,t}$, where $K_{n,r} + K_{n,t} = K$. Although the STAR–RIS can quickly change its phase angle matrix, it is difficult for it to switch quickly between reflection and transmission modes; we therefore design a dedicated time slot partitioning protocol, as shown in Figure 3. Each time slot is further divided into sub–slots, with the $k$-th GU offloading task data in its sub–slot $\delta_k(n)$ subject to the constraints
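$$\mathrm{C1}: \sum_{k \in \mathcal{K}} \delta_k(n) \leq \tau, \quad \forall n \in \mathcal{N}, \tag{7}$$
$$\mathrm{C2}: \delta_k(n) \geq 0, \quad \forall k \in \mathcal{K}, \ \forall n \in \mathcal{N}. \tag{8}$$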
Similar to reference [32], because the computation results are small, the downlink transmission of results is ignored. In addition, because there are two types of GUs, the sub–slots in the $n$-th time slot are further divided into two groups. Let $\tau_r(n) = \sum_{k_{n,r} \in \mathcal{K}_{n,r}} \delta^{r}_{k_{n,r}}(n)$ be the total offloading time of all reflection GUs, where $\delta^{r}_{k_{n,r}}(n)$ is the sub–slot allocated to reflection GU $k_{n,r}$. Similarly, $\tau_t(n) = \sum_{k_{n,t} \in \mathcal{K}_{n,t}} \delta^{t}_{k_{n,t}}(n)$ is the total offloading time of all transmission GUs, where $\delta^{t}_{k_{n,t}}(n)$ is the sub–slot allocated to transmission GU $k_{n,t}$. These satisfy $\tau_r(n) + \tau_t(n) = \tau$, and the STAR–RIS operates all of its elements in reflection mode during $\tau_r(n)$ and in transmission mode during $\tau_t(n)$. Therefore, the reflection and transmission coefficient matrices of the STAR–RIS are $\Phi_k^{r}(n) = \mathrm{diag}\left( e^{j\phi^{r}_{k,1,1}(n)}, \ldots, e^{j\phi^{r}_{k,m_{Rx},m_{Ry}}(n)}, \ldots, e^{j\phi^{r}_{k,M_{Rx},M_{Ry}}(n)} \right)$ and $\Phi_k^{t}(n) = \mathrm{diag}\left( e^{j\phi^{t}_{k,1,1}(n)}, \ldots, e^{j\phi^{t}_{k,m_{Rx},m_{Ry}}(n)}, \ldots, e^{j\phi^{t}_{k,M_{Rx},M_{Ry}}(n)} \right)$, where $\phi^{r}_{k,m_{Rx},m_{Ry}}(n)$ and $\phi^{t}_{k,m_{Rx},m_{Ry}}(n)$ represent the reflection and transmission phase shifts of the elements, respectively.
Now, we can obtain the channel gain between the GU and UAV in the n-th time slot as
$$h_k(n) = h_k^{UG}(n) + \left( h_k^{RG}(n) \right)^{H} \cdot \Phi_k^{z}(n) \cdot h^{UR}(n), \tag{9}$$
where $\Phi_k^{z}(n) = \mathrm{diag}\left( e^{j\phi^{z}_{k,1,1}(n)}, \ldots, e^{j\phi^{z}_{k,m_{Rx},m_{Ry}}(n)}, \ldots, e^{j\phi^{z}_{k,M_{Rx},M_{Ry}}(n)} \right)$, $z \in \{r, t\}$, represents the reflection coefficient matrix or the transmission coefficient matrix, and $H$ denotes the Hermitian transpose.
Each GU transmits data to the UAV on the same channel. According to Shannon’s theorem, the offloading task volume of each GU to the UAV in the n-th time slot is expressed as
$$D_k^{lu}(n) = \delta_k(n) W \log_2\left( 1 + \frac{P_k(n) h_k(n)}{N_0} \right), \tag{10}$$
where $W$ represents the channel bandwidth, $P_k(n)$ is the transmit power of GU $k$ in the $n$-th time slot, and $N_0$ is the power of the additive white Gaussian noise.
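To make the combined channel and the offloaded data volume concrete, the following sketch evaluates both for small hand-picked vectors. It takes the magnitude of the complex combined channel as the scalar gain passed to the rate expression, which is an assumption of this sketch since the text writes $h_k(n)$ directly; the function names are hypothetical.

```python
import cmath
import math


def effective_channel(h_direct, h_ris_gu, phases, h_uav_ris):
    """Direct link plus the STAR-RIS cascaded link with diagonal phase shifts."""
    cascaded = sum(g.conjugate() * cmath.exp(1j * phi) * u
                   for g, phi, u in zip(h_ris_gu, phases, h_uav_ris))
    return h_direct + cascaded


def offloaded_bits(sub_slot, bandwidth, power, gain, noise):
    """Shannon-rate data volume offloaded within one sub-slot."""
    return sub_slot * bandwidth * math.log2(1.0 + power * gain / noise)


# Two STAR-RIS elements with zero phase shift (toy values for illustration).
h = effective_channel(1 + 0j, [1j, 1 + 0j], [0.0, 0.0], [1j, 1 + 0j])
bits = offloaded_bits(0.5, 1e6, 1.0, abs(h), 1.0)
```

In this toy case each cascaded term adds in phase with the direct link, which is the situation the phase-shift design tries to create.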

2.3. Task and Queue Model

The amount of task data generated by the GUs is large, and the processors cannot process it immediately. Therefore, each GU and the UAV are assumed to maintain a task buffer queue. In each time slot, a GU first caches newly generated tasks in its local queue, and then uses a partial offloading strategy to offload tasks to the UAV's queue while also processing tasks locally. We assume that GU $k$ generates a task in each time slot, with size $A_k(n)$ and mean arrival $\mathbb{E}\{A_k(n)\} = \lambda_k$, where $\lambda_k$ is the average size of the randomly generated tasks. $\varphi$ represents the number of central processing unit (CPU) cycles required to process 1 bit of data. Define the CPU cycle frequency of GU $k$ as $f_k^{l}$, and the CPU cycle frequency of the UAV edge server as $f_u$. The task data are queued serially in the cache, following the first–in, first-out (FIFO) principle. Given the large volume of tasks, the processors of GU $k$ and the UAV must operate at their rated power to meet the requirements of GU $k$. Thus, the processing capacities of GU $k$ and the UAV in each time slot are
$$D_k^{l} = \frac{f_k^{l} \cdot \tau}{\varphi}, \tag{11}$$
$$D_u = \frac{f_u \cdot \tau}{\varphi}. \tag{12}$$
Subsequently, $Q_k(n)$ and $L(n)$ are defined as the task queues of GU $k$ and the UAV, respectively. The GUs and the UAV process tasks in sequence, and tasks that arrive in a queue within one time slot are processed in the next time slot. Therefore, the task queues are updated as follows:
$$Q_k(n+1) = \left[ Q_k(n) - D_u - D_k^{lu}(n) \right]^{+} + A_k(n), \tag{13}$$
$$L(n+1) = \left[ L(n) - D_u \right]^{+} + \sum_{k \in \mathcal{K}} D_k^{lu}(n), \tag{14}$$
where $[x]^{+} = \max\{0, x\}$. To ensure queue stability, the time–averaged queue backlogs must satisfy
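$$\mathrm{C3}: \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \mathbb{E}\{Q_k(n)\} < \infty, \tag{15}$$
$$\mathrm{C4}: \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \mathbb{E}\{L(n)\} < \infty. \tag{16}$$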
In (15) and (16), the expectations are taken over the random factors affecting the task queues of the GUs and the UAV. Over a long service period, the queue lengths must remain bounded so that the edge computing system stays stable, GU tasks are handled in time, and GU satisfaction increases.
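The queue dynamics can be sketched in a few lines. The GU update below takes a single `served_bits` argument for everything that leaves the GU queue in a slot, leaving it to the caller to decide which processing and offloading terms it contains; that parameterization, like the function names, is an assumption of this sketch rather than the paper's notation.

```python
def capacity_bits(cpu_hz, slot_s, cycles_per_bit):
    """Bits a processor can handle in one slot: frequency * slot / cycles per bit."""
    return cpu_hz * slot_s / cycles_per_bit


def update_gu_queue(backlog, served_bits, arrivals):
    """[backlog - served]^+ plus the tasks that arrived during the slot."""
    return max(0.0, backlog - served_bits) + arrivals


def update_uav_queue(backlog, uav_capacity, offloaded):
    """[backlog - capacity]^+ plus the data offloaded by all GUs in the slot."""
    return max(0.0, backlog - uav_capacity) + sum(offloaded)


# Toy run over three slots with constant arrivals (illustrative numbers only).
gu_backlog, uav_backlog = 10.0, 5.0
history = []
for _ in range(3):
    offloaded = [2.0]
    gu_backlog = update_gu_queue(gu_backlog, 4.0 + offloaded[0], 3.0)
    uav_backlog = update_uav_queue(uav_backlog, 8.0, offloaded)
    history.append((gu_backlog, uav_backlog))
```

Because service exceeds arrivals in this toy run, both backlogs settle at small values instead of growing, which is the behavior the stability constraints ask for on average.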

2.4. Energy Consumption Model

2.4.1. Offloading Energy Consumption

In the $n$-th time slot, GU $k$ offloads part of its tasks in the allocated sub–slot $\delta_k(n)$, and each GU is equipped with a communication module with adjustable transmit power. The transmission energy consumption of GU $k$ is expressed as
$$e_k^{lu}(n) = P_k(n) \cdot \delta_k(n). \tag{17}$$

2.4.2. Flight Energy Consumption

The UAV flies from its initial position to its final position, and its flight energy consumption consists of two parts: the energy for maintaining flight altitude and the propulsion energy. Because each time slot is sufficiently short, we assume that the UAV flies at a constant speed in a straight line between two consecutive positions, which simplifies the calculation of flight energy consumption. In addition, the time the UAV spends at the fixed altitude is fixed, so the energy consumption for maintaining altitude is a constant. We therefore only need to optimize the propulsion energy consumption of the UAV; based on the existing analytical model for rotary-wing UAVs, it is given by
$$e_f(n) = \frac{M_g \cdot \tau \cdot v(n)^2}{2}, \tag{18}$$
where $M_g$ is the mass of the UAV, and $v(n)$ is the speed of the UAV in the $n$-th slot, which must satisfy $v(n) \leq v_{max}$, where $v_{max}$ represents the maximum flight speed of the UAV.
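The two energy terms are simple enough to compute directly. The sketch below derives the slot speed from two consecutive horizontal positions, under the straight-line, constant-speed assumption stated above, and evaluates the propulsion and offloading energies; the function names are hypothetical.

```python
import math


def flight_energy(mass, slot_s, pos_now, pos_next):
    """Return (propulsion energy, speed) for one slot of straight-line flight."""
    distance = math.hypot(pos_next[0] - pos_now[0], pos_next[1] - pos_now[1])
    speed = distance / slot_s
    return mass * slot_s * speed ** 2 / 2.0, speed


def offload_energy(power, sub_slot):
    """Transmission energy of a GU: transmit power times its sub-slot length."""
    return power * sub_slot


def speed_feasible(speed, v_max):
    """Check the maximum-speed constraint for one slot."""
    return speed <= v_max


energy, speed = flight_energy(2.0, 1.0, (0.0, 0.0), (3.0, 4.0))
```

Since the energy grows with the square of the speed, spreading a displacement evenly over several slots costs less than covering it in one.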

2.5. Problem Formulation

Similar to reference [18], the battery capacities of the UAV and GUs are limited, so we need to optimize the total energy consumption of the system. However, considering only the energy consumption of the system can lead to a large backlog of queues, making it difficult for GUs to process tasks in a timely manner and meet their needs. Therefore, we jointly optimize resource allocation, UAV trajectory, and STAR–RIS phase shift, and minimize the average energy consumption of each time slot while maintaining the stability of the task queue. The optimization problem is
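$$\begin{aligned} \mathrm{P1}: \ & \min_{\Lambda(n)} \ \frac{1}{N} \sum_{n \in \mathcal{N}} \left( \sum_{k \in \mathcal{K}} e_k^{lu}(n) + \varepsilon e_f(n) \right) \\ \text{s.t.} \ & \mathrm{C1}, \mathrm{C2}, \mathrm{C3}, \mathrm{C4}, \\ & \mathrm{C5}: 0 \leq P_k(n) \leq P_{max}, \\ & \mathrm{C6}: 0 \leq \phi^{z}_{k,m_{Rx},m_{Ry}}(n) < 2\pi, \\ & \mathrm{C7}: \left\| S_u(n+1) - S_u(n) \right\|_2 \leq \tau v_{max}, \\ & \mathrm{C8}: S_u(1) = S_I, \ S_u(N) = S_F, \end{aligned} \tag{19}$$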
where $\Lambda(n) = \{P(n), \delta(n), \Phi(n), S_u(n)\}$ is the set of optimization variables, $P(n) = \{P_1(n), P_2(n), \ldots, P_K(n)\}$ is the set of transmit powers, $\delta(n) = \{\delta_1(n), \delta_2(n), \ldots, \delta_K(n)\}$ is the set of time slot allocations, $\Phi(n) = \{\Phi_1^{z}(n), \Phi_2^{z}(n), \ldots, \Phi_K^{z}(n)\}$ is the set of STAR–RIS phase angle matrices, $\varepsilon$ is the attenuation factor of the flight energy consumption, $V$ is the weight parameter that controls the preference for minimizing UAV energy consumption, $P_{max}$ is the maximum transmit power of each GU, and $S_I$ and $S_F$ are the initial and final locations of the UAV, respectively. C1 and C2 ensure that each GU is allocated a non-negative offloading time and that the total offloading time does not exceed the length of the time slot; C3 and C4 ensure the stability of the GU and UAV task queues; C5 bounds the transmit power of GU $k$; C6 constrains the STAR–RIS phase angles; C7 limits the UAV speed; and C8 fixes the initial and final positions of the UAV.
Our goal is long–term, and constraints C 3 and C 4 are also long–term, making the problem a long–term stochastic optimization problem. The system undergoes dynamic and random changes over time, further complicating the problem. Traditional methods, such as intelligent optimization algorithms or reinforcement learning, require prior knowledge of global statistical information, which is impractical in real-world scenarios. Additionally, problem P 1 is highly time-coupled, increasing computational complexity. In contrast, the Lyapunov optimization method maintains task queue stability by adjusting queue inputs and outputs. It makes control decisions based solely on current time slot information, transforming long–term stochastic optimization problems into online deterministic ones. Therefore, we adopt the Lyapunov optimization method to decouple the problem in time and convert it into an online optimization problem.

3. Problem Solution

Due to the coupling and randomness of the variables, the proposed problem is NP–hard, and the Lyapunov optimization method has significant advantages in solving such problems. Similar to references [32,33], in this section we transform the long–term optimization problem into online optimization problems based on Lyapunov theory. First, we define the quadratic Lyapunov function as
$$U(n) = \sum_{k \in \mathcal{K}} \frac{Q_k(n)^2}{2} + \frac{L(n)^2}{2}. \tag{20}$$
The conditional Lyapunov drift function between two consecutive slots is
$$\Delta U(n) = \mathbb{E}\left\{ U(n+1) - U(n) \mid \Theta(n) \right\}, \tag{21}$$
where $\Theta(n) \triangleq \{Q_1(n), Q_2(n), \ldots, Q_K(n), L(n)\}$ represents the queue backlog state of the system in the $n$-th time slot. Using the objective function of the original problem as the penalty function, the drift–plus–penalty function can be expressed as
$$\Delta_v U(n) = \Delta U(n) + V \cdot \mathbb{E}\left\{ \sum_{k \in \mathcal{K}} e_k^{lu}(n) + \varepsilon e_f(n) \,\middle|\, \Theta(n) \right\}, \tag{22}$$
where $V \geq 0$ is the trade-off control coefficient that balances system utility and queue stability. Rather than minimizing $\Delta_v U(n)$ directly, we minimize its upper bound, which is given by Lemma 1.
Lemma 1.
For any values of the optimization variables and any queue backlogs, the Lyapunov drift-plus-penalty function satisfies the following inequality:
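$$\Delta_v U(n) \leq \mathbb{E}\left\{ \frac{B^{b}(n)}{2} + \sum_{k \in \mathcal{K}} \left[ \frac{B_k^{a}(n)}{2} + D_k^{lu}(n)^2 + L(n) D_k^{lu}(n) - Q_k(n) D_k^{lu}(n) + V e_k^{lu}(n) \right] + V \varepsilon e_f(n) \,\middle|\, \Theta(n) \right\}, \tag{23}$$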
where $B_k^{a}(n)$ and $B^{b}(n)$ are constants. The detailed derivation of (23) is provided in Appendix A.
According to inequality (23), the optimization problem can be transformed into
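$$\begin{aligned} \mathrm{P2}: \ & \min_{\Lambda(n)} \ \sum_{k \in \mathcal{K}} \left[ \frac{1}{2} D_k^{lu}(n)^2 + \left( L(n) - Q_k(n) \right) D_k^{lu}(n) + V e_k^{lu}(n) \right] + V \varepsilon e_f(n) \\ \text{s.t.} \ & \mathrm{C1}, \mathrm{C2}, \mathrm{C5}, \mathrm{C6}, \mathrm{C7}, \mathrm{C8}. \end{aligned} \tag{24}$$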
As shown in Figure 4, converting P1 to P2 turns the problem into an online optimization problem. In P2, the optimization variables are strongly coupled, which makes the problem difficult to solve directly. We therefore decompose it into three subproblems that are optimized alternately: (1) transmit power optimization; (2) time slot allocation; (3) optimization of the UAV trajectory and the STAR–RIS phase matrix. The three subproblems are solved in turn until the change in the variables is less than $\zeta_1$, yielding an approximate solution to the objective function.
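The per-slot objective and the alternating procedure can be outlined as follows. This is only a structural sketch: `p2_objective` evaluates the objective of P2 for given per-GU quantities, and `alternate` applies a list of block-update functions in turn until the largest change in any variable falls below a tolerance playing the role of $\zeta_1$. The function names and the flat list used to hold the variables are assumptions of the sketch, and the actual subproblem solvers are not implemented here.

```python
def p2_objective(offloaded, gu_queues, uav_queue, tx_energy,
                 flight_energy, v, eps):
    """Per-slot drift-plus-penalty objective of P2."""
    per_gu = sum(0.5 * d * d + (uav_queue - q) * d + v * e
                 for d, q, e in zip(offloaded, gu_queues, tx_energy))
    return per_gu + v * eps * flight_energy


def alternate(variables, steps, tol=1e-6, max_iter=100):
    """Apply each block update in turn until the variables stop changing."""
    current = list(variables)
    for _ in range(max_iter):
        previous = list(current)
        for step in steps:
            # Each step re-optimizes one block with the others held fixed.
            current = step(current)
        change = max(abs(a - b) for a, b in zip(current, previous))
        if change < tol:
            break
    return current


value = p2_objective([2.0], [5.0], 1.0, [0.5], 3.0, 2.0, 0.1)
```

In an actual implementation, `steps` would hold the three subproblem solvers in the order listed above, each taking and returning the full set of variables.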

3.1. Transmission Power Optimization

When $\delta(n)$, $\Phi(n)$, and $S_u(n)$ are fixed, the GU power allocation problem is expressed as
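$$\begin{aligned} \mathrm{P2.1}: \ & \min_{P(n)} \ \sum_{k \in \mathcal{K}} \left[ \frac{1}{2} D_k^{lu}(n)^2 + \left( L(n) - Q_k(n) \right) D_k^{lu}(n) + V \cdot P_k(n) \delta_k(n) \right] \\ \text{s.t.} \ & \mathrm{C5}, \end{aligned} \tag{25}$$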
where $D_k^{lu}(n) = \delta_k(n) W \log_2\left( 1 + \frac{P_k(n) h_k(n)}{N_0} \right)$. If $L(n) - Q_k(n) \geq 0$, the objective is non-decreasing in $P_k(n)$, so $P_k(n) = 0$ is the solution of P2.1. Therefore, we only consider the case $L(n) - Q_k(n) < 0$. Because the objective of P2.1 is transcendental in $P_k(n)$, the optimal transmit power is obtained by solving for its stationary point. We set the first-order partial derivative of the objective with respect to $P_k(n)$ to zero, which, after clearing the common denominator, gives
δ k W 2 h k ( n ) log 2 1 + P k ( n ) h k ( n ) N 0 + W h k ( n ) L ( n ) Q k ( n ) + V ( N 0 + P k ( n ) h k ( n ) ) ln 2 = 0 .
By rearranging Equation (26), we can obtain
$$
\left( \frac{V \ln^2 2 \, P_k(n)}{\delta_k(n) W^2} + \frac{V \ln^2 2 \, N_0}{\delta_k(n) W^2 h_k(n)} \right) e^{\frac{V \ln^2 2 \, P_k(n)}{\delta_k(n) W^2} + \frac{V \ln^2 2 \, N_0}{\delta_k(n) W^2 h_k(n)}} = \frac{V \ln^2 2 \, N_0}{\delta_k(n) W^2 h_k(n)} \, 2^{\frac{Q_k(n) - L(n)}{\delta_k(n) W}}.
\tag{27}
$$
Equation (27) can be solved using the Lambert function, denoted as $\mathcal{W}(\cdot)$, which is the inverse of $f(x) = x e^x$. Write (27) as $G(n) = w(P_k(n)) e^{w(P_k(n))}$, where $w(P_k(n)) = \frac{V \ln^2 2}{\delta_k(n) W^2} P_k(n) + \frac{V \ln^2 2 \, N_0}{\delta_k(n) W^2 h_k(n)}$ and $G(n) = \frac{V \ln^2 2 \, N_0}{\delta_k(n) W^2 h_k(n)} \, 2^{\frac{Q_k(n) - L(n)}{\delta_k(n) W}}$. Applying $\mathcal{W}(\cdot)$ to both sides, we obtain
$$
\hat{P}_k(n) = \frac{\delta_k(n) W^2 h_k(n) \, \mathcal{W}\big(G(n)\big) - V \ln^2 2 \, N_0}{V \ln^2 2 \, h_k(n)}.
\tag{28}
$$
Taking the power constraint into account, we obtain
$$
P_k^*(n) =
\begin{cases}
0, & \hat{P}_k(n) < 0, \\
\hat{P}_k(n), & 0 \le \hat{P}_k(n) \le P_{max}, \\
P_{max}, & \hat{P}_k(n) > P_{max}.
\end{cases}
\tag{29}
$$
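The closed-form power update in (27)–(29) can be checked numerically with a short sketch. The code below is an illustration under stated assumptions: `lambert_w0` is a hand-written Newton iteration for the principal Lambert branch (restricted to non-negative arguments, which is what $G(n)$ produces), all argument names are hypothetical, and the inputs are in normalized units so that $2^{(Q_k(n) - L(n))/(\delta_k(n) W)}$ does not overflow.

```python
import math


def lambert_w0(z, tol=1e-12, max_iter=100):
    """Principal branch of the Lambert function for z >= 0 (Newton iteration)."""
    if z < 0.0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starts at or above the root, so Newton descends monotonically
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def optimal_power(delta, bandwidth, h, n0, v, q, l, p_max):
    """Evaluate (28) and clamp it to [0, p_max] as in (29)."""
    if l - q >= 0.0 or delta <= 0.0:
        return 0.0  # case L(n) - Q_k(n) >= 0: transmitting cannot lower the objective
    a = v * math.log(2.0) ** 2 * n0 / (delta * bandwidth ** 2 * h)
    g = a * 2.0 ** ((q - l) / (delta * bandwidth))  # right-hand side of (27)
    p_hat = (lambert_w0(g) / a - 1.0) * n0 / h      # algebraically equal to (28)
    return min(max(p_hat, 0.0), p_max)


def stationarity_residual(p, delta, bandwidth, h, n0, v, q, l):
    """Left-hand side of (26); it should vanish at an unclamped solution."""
    return (delta * bandwidth ** 2 * h * math.log2(1.0 + p * h / n0)
            + bandwidth * h * (l - q)
            + v * (n0 + p * h) * math.log(2.0))
```

Evaluating `stationarity_residual` at the returned power is a quick way to confirm that an unclamped result satisfies (26); a clamped result (0 or `p_max`) will generally leave a nonzero residual.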

3.2. Time Slot Allocation

After fixing $P(n)$, $\Phi(n)$, and $S_u(n)$, the system time slot allocation problem can be expressed as
$$
\mathrm{P2.2}: \min_{\delta(n)} \sum_{k \in \mathcal{K}} \left[ \frac{1}{2} D_k^{lu}(n)^2 + \big(L(n) - Q_k(n)\big) D_k^{lu}(n) + V P_k(n) \delta_k(n) \right] \quad \text{s.t.} \ \mathrm{C1}, \mathrm{C2}.
\tag{30}
$$
Problem P2.2 is a convex optimization problem that can be solved using Lagrange duality and the KKT conditions. If $L(n) - Q_k(n) \ge 0$, it is clear that $\delta_k(n) = 0$ is the solution of P2.2. Therefore, we only consider the case $L(n) - Q_k(n) < 0$. The Lagrangian of the above problem can be written as
$$
\mathcal{L}\big(\delta_k(n), \alpha, \nu_k\big) = \sum_{k \in \mathcal{K}} \left[ \frac{1}{2} \left( \delta_k(n) W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) \right)^2 + \big(L(n) - Q_k(n)\big) \delta_k(n) W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) + V P_k(n) \delta_k(n) \right] + \alpha \left( \sum_{k \in \mathcal{K}} \delta_k(n) - \tau \right) - \sum_{k \in \mathcal{K}} \nu_k \delta_k(n),
\tag{31}
$$
where $\alpha \ge 0$ and $\nu_k \ge 0$ are Lagrange multipliers, and $\nu_k \in \{\nu_1, \nu_2, \ldots, \nu_K\}$. According to the KKT conditions, the optimal time slot allocation $\delta_k^*(n)$ satisfies
$$
\delta_k^*(n) \left( W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) \right)^2 + \big(L(n) - Q_k(n)\big) W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) + V P_k(n) + \alpha^* - \nu_k^* = 0,
\tag{32a}
$$
$$
\alpha^* \left( \sum_{k \in \mathcal{K}} \delta_k^*(n) - \tau \right) = 0,
\tag{32b}
$$
$$
\nu_k^* \delta_k^*(n) = 0.
\tag{32c}
$$
Rearranging (32a), the optimal $\delta_k^*(n)$ is
$$
\delta_k^*(n) = \frac{\nu_k^* - \alpha^* - \big(L(n) - Q_k(n)\big) W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) - V P_k(n)}{\left( W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) \right)^2}.
\tag{33}
$$
Substituting (33) into Equation (32c) gives
$$
\nu_k^* \, \frac{\nu_k^* - \alpha^* - \big(L(n) - Q_k(n)\big) W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) - V P_k(n)}{\left( W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) \right)^2} = 0.
\tag{34}
$$
This equation has two solutions: one is $\nu_{k1}^* = 0$, and the other is $\nu_{k2}^* = \alpha^* + \big(L(n) - Q_k(n)\big) W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) + V P_k(n)$. Since $\nu_k^* \ge 0$ must hold and the dual function is to be maximized, we choose
$$
\nu_k^* = \left[ \alpha^* + \big(L(n) - Q_k(n)\big) W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) + V P_k(n) \right]^+.
\tag{35}
$$
Substituting (33) into Equation (32b), with $\nu_k^*$ given by (35), gives
$$
\alpha^* \left( \sum_{k \in \mathcal{K}} \frac{\nu_k^* - \alpha^* - \big(L(n) - Q_k(n)\big) W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) - V P_k(n)}{\left( W \log_2\!\left(1 + \frac{P_k(n) h_k(n)}{N_0}\right) \right)^2} - \tau \right) = 0.
\tag{36}
$$
Equation (36) depends only on $\alpha^*$, so it can be solved by bisection to obtain the optimal Lagrange multiplier $\alpha^*$. We then substitute $\alpha^*$ into (35) to obtain the optimal $\nu_k^*$. Finally, we substitute $\alpha^*$ and $\nu_k^*$ into (33) to obtain the optimal $\delta_k^*(n)$.
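The bisection on $\alpha^*$ can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the helper name `slot_allocation` and its arguments are hypothetical, with `rates[k]` standing for $W \log_2(1 + P_k(n) h_k(n)/N_0)$, `weights[k]` for $Q_k(n) - L(n)$, and `costs[k]` for $V P_k(n)$. Combining (33) and (35) gives $\delta_k(\alpha) = [\,\text{weights}_k \cdot \text{rates}_k - \text{costs}_k - \alpha\,]^+ / \text{rates}_k^2$, which is non-increasing in $\alpha$.

```python
def slot_allocation(rates, weights, costs, tau, tol=1e-12, max_iter=200):
    """Solve the KKT system (32a)-(32c) by bisection on the multiplier alpha."""

    def delta_of(alpha):
        # delta_k from (33) with nu_k chosen as in (35): a clipped linear function of alpha
        return [max(0.0, (w * r - c - alpha) / (r * r)) if r > 0.0 else 0.0
                for r, w, c in zip(rates, weights, costs)]

    deltas = delta_of(0.0)
    if sum(deltas) <= tau:
        return deltas, 0.0  # slot budget is slack, so alpha* = 0 by (32b)

    # Total allocation decreases in alpha and is zero at the largest margin.
    lo, hi = 0.0, max(w * r - c for r, w, c in zip(rates, weights, costs))
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if sum(delta_of(mid)) > tau:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol:
            break
    return delta_of(hi), hi  # hi is always feasible: sum(delta) <= tau
```

Returning the allocation at the upper end of the bracket keeps the result on the feasible side of the slot budget even when the bisection stops early.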

3.3. UAV Trajectory Planning and STAR–RIS Phase Optimization

The communication gain between the GU and the UAV in this system depends on the modulus and phase of each communication link, so the UAV trajectory and the STAR–RIS phase angle matrix must be optimized jointly. Because this paper uses the TDMA protocol, the optimal STAR–RIS phase matrix can be obtained for each GU in its own time slot regardless of the UAV's position. Therefore, we assume that the STAR–RIS phase matrix takes its optimal value in each time slot, which reduces the problem to one that depends only on the UAV trajectory. In summary, we can directly optimize the UAV trajectory while fixing $P(n)$ and $\delta(n)$, resulting in the following:
$$
\mathrm{P2.3}: \min_{S_u(n)} \sum_{k \in \mathcal{K}} \left[ \frac{1}{2} D_k^{lu}(n)^2 + \big(L(n) - Q_k(n)\big) D_k^{lu}(n) \right] + V \varepsilon \, \frac{M_g \tau \, \|v(n)\|^2}{2} \quad \text{s.t.} \ \mathrm{C7}, \mathrm{C8}.
\tag{37}
$$
To solve problem P2.3, we address both the non-convexity of the objective function and the constraints C6, C7, and C8. First, we eliminate the impact of constraints C7 and C8 by adjusting the UAV's flight mode. The UAV operates in normal flight mode to serve GUs, and in each time slot the optimal position for the next time slot is determined by solving P2.3. If the solution $S_u^*(n)$ satisfies the return-to-destination requirement, the UAV calculates the average flight speed needed to move from $S_u(n-1)$ to $S_u^*(n)$ and compares it with the maximum speed $v_{max}$. The UAV then moves toward the optimal position at the lower of the two speeds. If $S_u^*(n)$ does not satisfy the requirement, the UAV switches from normal flight mode to return flight mode and flies toward the destination at the maximum flight speed $v_{max}$. In this way, the influence of constraints C7 and C8 is eliminated. Next, to address the non-convexity of the objective function, we use iterative SCA to solve for the optimal position $S_u^*(n)$. By approximating P2.3 as a convex problem and solving it iteratively, we obtain an approximate solution for the optimal trajectory. Specifically, by replacing $D_k^{lu}(n)$ with its first-order Taylor expansion with respect to $x_u(n)$ and $y_u(n)$ and substituting it into the objective function, we can obtain
C k a ( n ) = W δ k ( n ) P k ( n ) N 0 + P k ( n ) h k ( n ) i ln 2 180 A B H u ( x u ( n ) i x k ) e ( B [ γ k U G ( n ) A ] ) h k 1 ( n ) i h k 0 ( n ) i π d k U G ( n ) i 2 1 + A e ( B [ γ k U G ( n ) i A ] ) 2 1 + ( H u d k U G ( n ) i ) 2 ( x u ( n ) i x k ) 2 + ( y u ( n ) i y k ) 2 + H u 2 4 π d k U G ( n ) i f r c 3 π f r ( x u ( n ) i x k ) c ( x u ( n ) i x k ) 2 + ( y u ( n ) i y k ) 2 + H u 2 P k N L o S ( n ) i η 1 + P k L o S ( n ) i η 0 M R β 0 a U R ( d U R ( n ) i ) a U R 2 1 ( x u ( n ) i x k ) 2 ( d R G ) a R G ( x u ( n ) i x R ) 2 + ( y u ( n ) i y R ) 2 + ( H u H R ) 2 ,
C k b ( n ) = W δ k ( n ) P k ( n ) N 0 + P k ( n ) h k ( n ) i ln 2 180 A B H u ( y u ( n ) i y k ) e ( B [ γ k U G ( n ) i A ] ) h k 0 ( n ) i h k 1 ( n ) i h k 0 ( n ) i π ( d k U G ( n ) i ) 2 ( 1 + A e ( B [ γ k U G ( n ) i A ] ) 2 1 + ( H u d k U G ( n ) i ) 2 ( x u ( n ) i x k ) 2 + ( y u ( n ) i y k ) 2 + H u 2 4 π d k U G ( n ) i f r c 3 π f r ( y u ( n ) i y k ) c ( x u ( n ) i x k ) 2 + ( y u ( n ) i y k ) 2 + H u 2 P k N L o S ( n ) i η 1 + P k L o S ( n ) i η 0 M R β 0 a U R ( d U R ( n ) i ) a U R 2 1 ( y u ( n ) i y k ) 2 ( d R G ) a R G ( x u ( n ) i x R ) 2 + ( y u ( n ) i y R ) 2 + ( H u H R ) 2 ,
$$
C_k^c(n) = D_k^{lu}(n)^i - C_k^a(n) \, x_u(n)^i - C_k^b(n) \, y_u(n)^i,
\tag{40}
$$
$$
F\big(S_u(n)\big) = \sum_{k \in \mathcal{K}} \Big[ \tfrac{1}{2} (C_k^a)^2 x_u(n)^2 + \tfrac{1}{2} (C_k^b)^2 y_u(n)^2 + C_k^a C_k^b \, x_u(n) y_u(n) + \big( C_k^a C_k^c + C_k^a (L(n) - Q_k(n)) \big) x_u(n) + \big( C_k^b C_k^c + C_k^b (L(n) - Q_k(n)) \big) y_u(n) + \tfrac{1}{2} (C_k^c)^2 + C_k^c \big(L(n) - Q_k(n)\big) \Big] + V \varepsilon \, \frac{M_g \tau \, \|v(n)\|^2}{2},
\tag{41}
$$
where $[\cdot]^i$ denotes the value at the expansion point of the $i$-th iteration. After approximating the problem as a convex problem, we solve for the optimal solution of (41) by finding its extremum point. Setting the first partial derivatives of $F(S_u(n))$ with respect to $x_u(n)$ and $y_u(n)$ to zero gives two linear equations
$$
\frac{\partial F(S_u(n))}{\partial x_u(n)} = \sum_{k \in \mathcal{K}} \Big[ (C_k^a)^2 x_u(n) + C_k^a C_k^b \, y_u(n) + C_k^a C_k^c + C_k^a \big(L(n) - Q_k(n)\big) \Big] + \frac{V \varepsilon M_g \big(x_u(n) - x_u(n-1)\big)}{\tau} = 0,
\tag{42a}
$$
$$
\frac{\partial F(S_u(n))}{\partial y_u(n)} = \sum_{k \in \mathcal{K}} \Big[ (C_k^b)^2 y_u(n) + C_k^a C_k^b \, x_u(n) + C_k^b C_k^c + C_k^b \big(L(n) - Q_k(n)\big) \Big] + \frac{V \varepsilon M_g \big(y_u(n) - y_u(n-1)\big)}{\tau} = 0.
\tag{42b}
$$
Adding (42a) and (42b) gives a single equation, as shown below:
$$
\left( \sum_{k \in \mathcal{K}} \big[ (C_k^a)^2 + C_k^a C_k^b \big] + \frac{V \varepsilon M_g}{\tau} \right) x_u(n) + \left( \sum_{k \in \mathcal{K}} \big[ (C_k^b)^2 + C_k^a C_k^b \big] + \frac{V \varepsilon M_g}{\tau} \right) y_u(n) + \sum_{k \in \mathcal{K}} \big( C_k^c + L(n) - Q_k(n) \big) \big( C_k^a + C_k^b \big) - \frac{V \varepsilon M_g \big( x_u(n-1) + y_u(n-1) \big)}{\tau} = 0.
\tag{43}
$$
Equation (43) is linear in $x_u(n)$ and $y_u(n)$, so any point $(x_u(n), y_u(n))$ on this line can serve as the position of the UAV in the next time slot, and the equation therefore admits multiple candidate solutions. Considering the time and energy consumed by the UAV's position changes, we choose the point closest to $S_u(n-1)$ as the optimal position $S_u^*(n)$ to minimize the UAV's flight distance. The optimal position $S_u^*(n)$ obtained in this way is then used as the expansion point for iteration $i + 1$. When the change in the trajectory between iterations is less than $\zeta_2$, an asymptotically optimal solution is obtained.
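Choosing the point on the line closest to $S_u(n-1)$ is an orthogonal projection, and the speed rule described earlier is a simple clamp on the step length. The sketch below illustrates both steps under stated assumptions: `a`, `b`, and `c` are hypothetical names for the coefficients of the linear equation written as $a x + b y + c = 0$, and `speed_limited_step` is an illustrative helper, not part of the paper.

```python
import math


def closest_point_on_line(a, b, c, prev):
    """Orthogonal projection of prev onto the line a*x + b*y + c = 0."""
    norm_sq = a * a + b * b
    if norm_sq == 0.0:
        raise ValueError("a and b must not both be zero")
    x0, y0 = prev
    t = (a * x0 + b * y0 + c) / norm_sq  # signed offset along the normal (a, b)
    return (x0 - a * t, y0 - b * t)


def speed_limited_step(prev, target, v_max, tau):
    """Move from prev toward target, covering at most v_max * tau in one slot."""
    dx, dy = target[0] - prev[0], target[1] - prev[1]
    dist = math.hypot(dx, dy)
    reach = v_max * tau
    if dist <= reach:
        return target  # required average speed does not exceed v_max
    scale = reach / dist
    return (prev[0] + dx * scale, prev[1] + dy * scale)
```

In an SCA loop, the projected point would serve as the next expansion point, and the clamp would be applied once to the converged position before the UAV moves.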
In the above UAV trajectory solving process, we assumed that the STAR–RIS phase matrix takes its optimal value, so it only remains to derive the optimal phase matrix for the final optimal position $S_u^*(n)$. Because the TDMA protocol is used, phase optimization can be performed separately in the time slot allocated to each GU, which gives
$$
\mathrm{P2.3}: \max_{\Phi_k(n)} \left| h_k^{UG}(n) + \big(h_k^{RG}(n)\big)^H \cdot \Phi_k^z(n) \cdot h^{UR}(n) \right| \quad \text{s.t.} \ \mathrm{C8}.
\tag{44}
$$
By the triangle inequality, the following holds:
$$
\left| h_k^{UG}(n) + \big(h_k^{RG}(n)\big)^H \cdot \Phi_k^z(n) \cdot h^{UR}(n) \right| \le \left| h_k^{UG}(n) \right| + \left| \big(h_k^{RG}(n)\big)^H \cdot \Phi_k^z(n) \cdot h^{UR}(n) \right|.
\tag{45}
$$
Inequality (45) holds with equality if and only if $\arg\big(h_k^{UG}(n)\big) = \arg\big( (h_k^{RG}(n))^H \cdot \Phi_k^z(n) \cdot h^{UR}(n) \big) = \vartheta$, so we ultimately obtain
$$
\phi_{k, m_{Rx}, m_{Ry}}^{z}(n) = \operatorname{mod}\left[ \arg\big( h_{k, m_{Rx}, m_{Ry}}^{UG}(n) \big) - \arg\big( \hat{h}_{k, m_{Rx}, m_{Ry}}^{RG}(n) \big) - \arg\big( \hat{h}_{m_{Rx}, m_{Ry}}^{UR}(n) \big) \right],
\tag{46}
$$
where $\hat{h}_{k, m_{Rx}, m_{Ry}}^{RG}(n)$ represents the component of $h_k^{RG}(n)$ corresponding to element $(m_{Rx}, m_{Ry})$, and $\hat{h}_{m_{Rx}, m_{Ry}}^{UR}(n)$ represents the component of $h^{UR}(n)$ corresponding to element $(m_{Rx}, m_{Ry})$. Thus, we obtain the optimal phase angle matrix $\Phi_k^{z*}(n)$ for the STAR–RIS.
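The phase-alignment rule can be verified numerically: once every cascaded term is rotated onto the phase of the direct link, the triangle inequality is met with equality. The sketch below is illustrative only and rests on assumptions not stated in the paper: each STAR–RIS element is modeled with unit amplitude (the transmission and reflection amplitude coefficients are ignored), the channel values are made up, and the per-element cascaded term is taken as `conj(h_rg[m]) * exp(j*phi[m]) * h_ur[m]`.

```python
import cmath
import math


def aligned_phases(h_ug, h_rg, h_ur):
    """Per-element phase shifts that align each cascaded term with the direct link."""
    target = cmath.phase(h_ug)
    return [(target - cmath.phase(g.conjugate() * u)) % (2.0 * math.pi)
            for g, u in zip(h_rg, h_ur)]


def combined_gain(h_ug, h_rg, h_ur, phases):
    """|h_UG + sum_m conj(h_RG[m]) * exp(j*phi[m]) * h_UR[m]| with unit-amplitude elements."""
    cascaded = sum(g.conjugate() * cmath.exp(1j * p) * u
                   for g, u, p in zip(h_rg, h_ur, phases))
    return abs(h_ug + cascaded)


# Hypothetical channel coefficients for a three-element surface.
h_ug = 0.3 + 0.4j
h_rg = [1.0 + 1.0j, -2.0 + 0.5j, 0.1 - 0.7j]
h_ur = [0.5 - 0.2j, 1.0j, -1.0 - 1.0j]

best = combined_gain(h_ug, h_rg, h_ur, aligned_phases(h_ug, h_rg, h_ur))
upper = abs(h_ug) + sum(abs(g) * abs(u) for g, u in zip(h_rg, h_ur))
```

With aligned phases, `best` matches `upper` up to floating-point error, while any other phase choice (for example all zeros) gives a gain no larger than `upper`.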

4. Algorithm Analysis

In this section, the system stability and complexity of the proposed algorithm are studied. Theorem 1 shows that there is a trade-off between queue backlog and energy consumption.
Theorem 1.
Assuming that the trade-off between system stability and energy consumption is feasible, we can obtain
$$
\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \mathbb{E}\{\Theta(n)\} \le \frac{C_D + V \sum_{k \in \mathcal{K}} e_k^{opt} + V \varepsilon e_f^{opt}}{\epsilon},
\tag{47}
$$
$$
\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \mathbb{E}\left\{ \sum_{k \in \mathcal{K}} e_k(n) + \varepsilon e_f(n) \right\} \le \frac{2 C_D}{V} + \sum_{k \in \mathcal{K}} e_k^{opt} + \varepsilon e_f^{opt}.
\tag{48}
$$
It can be concluded from (47) and (48) that the average queue length and the average energy consumption of the algorithm are both upper bounded. Please refer to Appendix B for the proof.
Theorem 1 states that the bound on the average energy consumption decreases as the control factor V increases, while the bound on the total queue length increases with V. In addition, when the control factor V is large enough, the average energy consumption approaches the optimal values $e_k^{opt}$ and $e_f^{opt}$. Therefore, there is a trade-off between average energy consumption and queue backlog in this system.
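The opposite trends of the two bounds can be made concrete by evaluating their right-hand sides for several values of V. The sketch below is only an illustration: the constants passed in are hypothetical placeholders rather than values from the paper, `slack` stands for $\epsilon$ in (47), `eps_weight` stands for the weight $\varepsilon$ on the flight energy, and the energy bound follows the form $2 C_D / V$ as read from (48).

```python
def queue_bound(v, c_d, e_opt_sum, eps_weight, e_f_opt, slack):
    """Right-hand side of (47): grows linearly with V."""
    return (c_d + v * e_opt_sum + v * eps_weight * e_f_opt) / slack


def energy_bound(v, c_d, e_opt_sum, eps_weight, e_f_opt):
    """Right-hand side of (48): decays toward the optimal energy as V grows."""
    return 2.0 * c_d / v + e_opt_sum + eps_weight * e_f_opt


# Hypothetical constants, chosen only to show the trend.
params = dict(c_d=1.0e6, e_opt_sum=0.2, eps_weight=0.01, e_f_opt=5.0)
trend = [(v, queue_bound(v, slack=0.5, **params), energy_bound(v, **params))
         for v in (1e5, 1e6, 1e7)]
```

Each tuple in `trend` holds V, the queue bound, and the energy bound; the second entry rises with V while the third falls, which is the trade-off described above.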
Finally, we analyze the complexity of the proposed algorithm. Firstly, the complexity of the algorithm is determined by the alternating iteration process of the three subproblems and the solving process of the three subproblems. Due to the fact that the optimal transmission power, optimal time slot allocation, and optimal STAR–RIS phase matrix are all closed–form solutions with low complexity in the solving process of each subproblem, we mainly consider the complexity of UAV trajectory optimization. Considering the linear convergence of the algorithm and the nested nature of the two processes, we can conclude that the total complexity of the algorithm is O ( log ( 1 / ζ 1 ) log ( 1 / ζ 2 ) ) . In contrast, the complexity of solving the problem in this paper using the CVX toolbox is O ( N 7 log ( 1 / ζ 1 ) log ( 1 / ζ 2 ) ) . From this, it can be concluded that the algorithm proposed in this paper is a low-complexity algorithm that is more suitable for practical applications.

5. Result Analysis

We evaluate the proposed algorithm in terms of average queue backlog and average total energy consumption through extensive numerical experiments. We simulate a scenario with one UAV, one STAR–RIS, and 10 GUs evenly distributed over a large area of 1000 × 1000 m². The initial and final positions of the UAV are $S_I = (500, 500)$ and $S_F = (500, 500)$, with an altitude of $H_u = 100$ m and a maximum speed $v_{max}$ of 15 m/s. The number of STAR–RIS elements is $R_x = 20$, $R_y = 20$, and the deployment location is $S_R = [0, 0, 0]$. The service duration of the UAV is set to N = 300 time slots, with a time slot length of $\tau = 1$ s. The important simulation parameters are listed in Table 1.
Figure 5 shows the flight trajectories of the proposed scheme and of the UAV without STAR–RIS assistance under different Lyapunov control factors. As shown in the figure, in the proposed scheme, the larger the penalty coefficient V, the shorter the UAV's flight distance. This is because a larger V gives energy consumption a higher weight, so the system focuses more on energy consumption. The UAV therefore reduces its flight distance to save energy, which corresponds to the theoretical result in (48). Specifically, when $V = 1 \times 10^5$, the system pays more attention to queue stability, so the UAV hovers near the STAR–RIS to obtain better channel gain for GUs and handle more tasks. When $V = 1 \times 10^6$, the system gradually shifts its focus from queue stability to energy consumption, shortening the UAV's flight distance. As V increases further to $V = 1 \times 10^7$, the system pays even more attention to energy consumption, further shortening the UAV's flight distance. In addition, in the UAV–MEC system without STAR–RIS assistance, the communication environment is extremely harsh, so the UAV chooses shorter flight distances to reduce energy consumption.
Figure 6 shows the influence of the Lyapunov control factor on system stability. In the figure, it can be seen that the queue length of GUs is much larger than that of UAV, because UAVs have a much faster processing speed than GUs. In addition, as the time slots increase, all GU queues first grow rapidly, then slowly decrease, and eventually stabilize in the region. All UAV queues first grow and then gradually stabilize. This is because when the UAV takes off from its initial position, it is far away from the GU center and STAR–RIS, with a poor channel gain, which makes it difficult to offload tasks in a timely manner, resulting in a backlog of GU queues. As the UAV trajectory is adjusted, the UAV will gradually approach the center and STAR–RIS. At this time, the GU’s channel gain gradually increases, allowing more tasks to be offloaded onto the UAV, resulting in a gradual decrease in the GU’s queue. When the UAV reaches the optimal position, the communication channel gain fluctuates very little, and the number of tasks generated in each time slot is balanced with the number of tasks processed locally and by the MEC server. At this point, both the GU queue and the UAV queue tend to stabilize. Furthermore, as the V value increases, the length of each queue also increases. The reason is that the increase in V leads to a higher proportion of energy consumption, and the system tends to lean more towards energy consumption between queue stability and energy consumption, resulting in an increase in queue backlog.
Figure 7 illustrates the variations in system energy consumption under different control factors V. At the initial stage of UAV task processing, the communication channel gain is relatively low due to the UAV’s distant position from the GUs and the STAR–RIS. This results in higher energy consumption for transmission tasks, as more power is required to maintain reliable communication over a weaker channel. Consequently, the total energy consumption is initially high. As the UAV moves closer to the center area of the GUs and the STAR–RIS over successive time slots, the channel gain improves significantly. This improvement reduces the energy required for transmission tasks, leading to a gradual decrease in total energy consumption. However, when the UAV reaches its final position, the channel gain begins to decrease again due to the increased distance from the GUs or potential obstacles, causing a corresponding rise in energy consumption. Furthermore, the control factor V plays a critical role in balancing energy consumption and system performance. As V increases, the system places greater emphasis on minimizing energy consumption, which is achieved by optimizing the UAV’s trajectory and transmission power allocation. This results in a noticeable reduction in system energy consumption, as demonstrated in Figure 7. This behavior aligns with the theoretical framework established in (47), which highlights the trade–off between energy efficiency and system performance controlled by V.
Figure 8 provides the channel gains of four GUs. The total channel gain is the sum of the direct channel gain and STAR–RIS channel gain. As can be seen from the figure, the channel gain brought by the STAR–RIS auxiliary system makes the total gain of each GU much higher than the direct channel gain, greatly improving the communication quality between GUs and the UAV. Furthermore, all three channel gains gradually increase as the drone approaches GUs and STAR–RIS, and due to different positions, GUs enjoy different STAR–RIS gains. GUs closer to STAR–RIS have greater channel gains when communicating with the UAV.
In Figure 9, we plot the change curves of the task backlog queue for our proposed scheme and two benchmark schemes. From Figure 9, it can be seen that if the greedy algorithm is used, the system only focuses on the current optimal resource allocation and does not pay attention to long–term system stability, resulting in more intense competition among GUs and the inability to maintain queue stability. This algorithm that only considers current interests leads to a continuous increase in task queue backlog, resulting in a decrease in system performance and affecting GU experience. In addition, the scheme proposed in reference [34] uses traditional RIS to optimize the unmanned aerial vehicle–mobile edge computing (UAV–MEC) system. While this method does achieve stable results, the length of the queue backlog remains longer compared to our proposed scheme. The reliance on traditional RIS introduces limitations due to its static positioning, which hinders efficient resource utilization and prolongs the processing delays. Therefore, STAR–RIS is introduced in the model of this paper to avoid the positional defects of traditional RIS. STAR–RIS can be dynamically deployed at the center position of ground users (GUs), providing significant channel gain improvements. This adaptability ensures that the communication links are consistently optimized, thereby enhancing overall system performance. At the same time, a low–complexity and highly efficient resource allocation algorithm is designed to effectively reduce queue backlog. This algorithm takes into account both current and future resource demands, ensuring balanced and stable queue lengths. By intelligently allocating resources and leveraging the enhanced capabilities of STAR–RIS, our proposed scheme outperforms existing methods in maintaining system stability and reducing task backlog.
In order to investigate the impact of more factors on the total energy consumption of the system, Figure 10 studies the effects of different GU and STAR–RIS unit numbers on the total energy consumption. Both benchmark schemes were set up to compare with the proposed scheme in this paper. In Figure 10a, as the number of GUs increases from 5 to 25, the total energy consumption increases from 0.15 J to 0.49 J. This is because the increase in GUs leads to intensified competition for communication resources, resulting in higher energy consumption. In Figure 10b, as the number of STAR–RIS units increases from 100 to 500, the total energy consumption decreases from 0.28 J to 0.15 J. This is because the increase in STAR–RIS units improves the channel gain of the communication link and reduces the energy consumption of GU offloading tasks.
Meanwhile, we can conclude that the total energy consumption of the proposed scheme in this article is lower than that of the other three benchmark schemes. The greedy algorithm only considers the current energy consumption and does not take a holistic perspective, making it unsuitable for solving the long–term optimization problem in this article. Therefore, it has the highest energy consumption. Although CPSO has lower energy consumption than the greedy strategy, as an intelligent optimization algorithm, it is prone to getting stuck in local optima, and due to its high complexity, it is difficult to apply in practical scenarios. However, the algorithm in this article cleverly utilizes the properties of convex functions to transform the problem into three subproblems to be solved alternately and uses mathematical methods to derive closed-form solutions of each subproblem, effectively reducing the complexity of the algorithm and significantly improving the accuracy of the solution. Therefore, the algorithm proposed in this article can achieve lower energy consumption. In terms of modeling, traditional RIS is used to modify the UAV–MEC system in reference [34]. Although it effectively reduces energy consumption, the STAR–RIS proposed in this article outperforms traditional RIS in modeling and provides more communication resources.

6. Conclusions

This paper investigates STAR–RIS–assisted UAV–MEC networks and proposes a low-complexity resource allocation algorithm based on Lyapunov theory. By replacing traditional RIS with STAR–RIS, we improve communication quality in harsh environments, achieving full-space coverage between UAVs and GUs. The Lyapunov theory transforms the long–term optimization problem into an online optimization problem, and the objective function is solved by iteratively optimizing three subproblems. Simulation results demonstrate that the proposed scheme outperforms the benchmark methods (greedy algorithm, CPSO algorithm, and traditional RIS scheme) in terms of queue stability and energy efficiency. Specifically, the proposed scheme reduces the average user task queue length by 46.7%, 33.3%, and 20% compared to the greedy algorithm, CPSO algorithm, and traditional RIS scheme, respectively. Furthermore, it reduces the average energy consumption by 68.6%, 56%, and 42.1% relative to the greedy algorithm, CPSO algorithm, and traditional RIS scheme, respectively. Future work will explore resource allocation schemes for multi-UAV/RIS scenarios.

Author Contributions

Conceptualization, X.H. and H.Z.; methodology, X.H.; software, H.Z.; validation, X.H. and H.Z.; formal analysis, H.Z.; investigation, X.H.; resources, H.Z.; data curation, H.Z.; writing—original draft preparation, H.Z.; writing—review and editing, X.H.; visualization, W.Z.; supervision, D.H.; project administration, W.Z.; funding acquisition, X.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available in Section 5.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

According to the inequality $\big([a - b]^+ + c\big)^2 \le a^2 + b^2 + c^2 + 2a(c - b)$, the following can be obtained:
$$
\begin{aligned}
Q_k(n+1)^2 - Q_k(n)^2 &= \Big( \big[ Q_k(n) - D_k^l - D_k^{lu}(n) \big]^+ + A_k(n) \Big)^2 - Q_k(n)^2 \\
&\le \big( D_k^l + D_k^{lu}(n) \big)^2 + A_k(n)^2 + 2 Q_k(n) \big( A_k(n) - D_k^l - D_k^{lu}(n) \big) \\
&\le \big( D_k^l + D_k^{lu,max}(n) \big)^2 + A_k(n)^2 + 2 Q_k(n) \big( A_k(n) - D_k^l - D_k^{lu}(n) \big) \\
&= B_k^a(n) - 2 Q_k(n) D_k^{lu}(n),
\end{aligned}
\tag{A1}
$$
where $B_k^a(n) = \big( D_k^l + D_k^{lu,max}(n) \big)^2 + A_k(n)^2 + 2 Q_k(n) \big( A_k(n) - D_k^l \big)$ is a constant that is independent of the optimization variables.
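The elementary inequality behind (A1) can be spot-checked numerically for non-negative inputs. In the sketch below, `q` plays the role of $Q_k(n)$, `served` the role of $D_k^l + D_k^{lu}(n)$, and `arrivals` the role of $A_k(n)$; the function names and the random test ranges are illustrative assumptions, not part of the paper.

```python
import random


def queue_update(q, served, arrivals):
    """One step of the queue recursion Q(n+1) = [Q(n) - served]^+ + arrivals."""
    return max(q - served, 0.0) + arrivals


def drift_gap(q, served, arrivals):
    """Right side minus left side of ([a-b]^+ + c)^2 <= a^2 + b^2 + c^2 + 2a(c-b)."""
    lhs = queue_update(q, served, arrivals) ** 2
    rhs = q * q + served * served + arrivals * arrivals + 2.0 * q * (arrivals - served)
    return rhs - lhs


rng = random.Random(0)  # fixed seed so the check is repeatable
gaps = [drift_gap(rng.uniform(0.0, 100.0), rng.uniform(0.0, 100.0), rng.uniform(0.0, 100.0))
        for _ in range(10000)]
```

The gap is non-negative because $([a - b]^+)^2 \le (a - b)^2$ and $c\,[a - b]^+ \le a c$ whenever $a, b, c \ge 0$, so every entry of `gaps` should be non-negative up to rounding.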
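Based on $\Big( \big[ x + \sum_{i=1}^{Y} y_i \big]^+ + \sum_{j=1}^{Z} z_j \Big)^2 - x^2 \le \sum_{i=1}^{Y} (x + y_i)^2 + \sum_{j=1}^{Z} (x + z_j)^2 - (Y + Z) x^2 + \Big( \sum_{i=1}^{Y} y_i^{max} + \sum_{j=1}^{Z} z_j^{max} \Big)^2$, we can obtain the following:
$$
\begin{aligned}
L(n+1)^2 - L(n)^2 &= \Big( \big[ L(n) - D^u \big]^+ + \sum_{k \in \mathcal{K}} D_k^{lu}(n) \Big)^2 - L(n)^2 \\
&\le \big( L(n) - D^u \big)^2 + \sum_{k \in \mathcal{K}} \big( L(n) + D_k^{lu}(n) \big)^2 - (M + 1) L(n)^2 + \Big( D^u + \sum_{k \in \mathcal{K}} D_k^{lu,max} \Big)^2 \\
&= \big( L(n) - D^u \big)^2 - L(n)^2 + \sum_{k \in \mathcal{K}} \Big[ D_k^{lu}(n)^2 + 2 L(n) D_k^{lu}(n) \Big] + \Big( D^u + \sum_{k \in \mathcal{K}} D_k^{lu,max} \Big)^2 \\
&= B^b(n) + \sum_{k \in \mathcal{K}} \Big[ D_k^{lu}(n)^2 + 2 L(n) D_k^{lu}(n) \Big],
\end{aligned}
\tag{A2}
$$
where $B^b(n) = \big( L(n) - D^u \big)^2 - L(n)^2 + \Big( D^u + \sum_{k \in \mathcal{K}} D_k^{lu,max} \Big)^2$ is a constant that is independent of the optimization variables. Dividing (A1) and (A2) by 2 and adding them together gives
$$
\begin{aligned}
\Delta_V U(n) &= \Delta U(n) + V \cdot \mathbb{E}\left\{ \sum_{k \in \mathcal{K}} e_k^{lu}(n) + \varepsilon e_f(n) \,\middle|\, \Theta(n) \right\} \\
&= \mathbb{E}\left\{ \sum_{k \in \mathcal{K}} \frac{Q_k(n+1)^2 - Q_k(n)^2}{2} + \frac{L(n+1)^2 - L(n)^2}{2} + V \sum_{k \in \mathcal{K}} e_k^{lu}(n) + V \varepsilon e_f(n) \,\middle|\, \Theta(n) \right\} \\
&\le \mathbb{E}\left\{ \frac{B^b(n)}{2} + \sum_{k \in \mathcal{K}} \left[ \frac{B_k^a(n)}{2} + \frac{D_k^{lu}(n)^2}{2} + L(n) D_k^{lu}(n) - Q_k(n) D_k^{lu}(n) + V e_k^{lu}(n) \right] + V \varepsilon e_f(n) \,\middle|\, \Theta(n) \right\}.
\end{aligned}
\tag{A3}
$$
This completes the proof of Lemma 1.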

Appendix B

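Assuming problem P2 is feasible, consider a task with an arrival rate $\lambda$. For all time slots and any $\sigma > 0$, there exists an independent and stable task offloading and resource allocation algorithm that satisfies the following inequalities:
$$
\mathbb{E}\left\{ \sum_{k \in \mathcal{K}} A_k(n) \,\middle|\, \Theta(n) \right\} \le D^u + \sum_{k \in \mathcal{K}} D_k^l - \epsilon,
\tag{A4}
$$
$$
\mathbb{E}\left\{ \sum_{k \in \mathcal{K}} D_k^{lu}(n) \,\middle|\, \Theta(n) \right\} \le D^u - \epsilon,
\tag{A5}
$$
$$
\mathbb{E}\left\{ \sum_{k \in \mathcal{K}} e_k(n) \,\middle|\, \Theta(n) \right\} \le \sum_{k \in \mathcal{K}} e_k^{opt} + \sigma,
\tag{A6}
$$
$$
\mathbb{E}\left\{ \varepsilon e_f(n) \,\middle|\, \Theta(n) \right\} \le \varepsilon e_f^{opt} + \sigma,
\tag{A7}
$$
where $\epsilon$ is a positive constant. Letting $\sigma \to 0$ and substituting Equations (A4)–(A7) into Equation (A3), we obtain
$$
\begin{aligned}
\Delta_V U(n) &\le \mathbb{E}\left\{ \frac{D^u(n)^2}{2} + \sum_{k \in \mathcal{K}} \frac{\big( D_k^l + D_k^{lu,max}(n) \big)^2}{2} + \sum_{k \in \mathcal{K}} \frac{\big( D_k^{lu,max}(n) + D_k^l \big)^2}{2} + \sum_{k \in \mathcal{K}} \frac{A_k(n)^2}{2} + \sum_{k \in \mathcal{K}} D_k^{lu}(n)^2 \right\} \\
&\quad - \epsilon \, \mathbb{E}\left\{ L(n) + \sum_{k \in \mathcal{K}} Q_k(n) \,\middle|\, \Theta(n) \right\} + V \varepsilon e_f^{opt} + \sum_{k \in \mathcal{K}} V e_k^{opt} \\
&\le \mathbb{E}\left\{ \frac{D^u(n)^2}{2} + \sum_{k \in \mathcal{K}} \frac{\big( D_k^l + D_k^{lu,max}(n) \big)^2}{2} + \sum_{k \in \mathcal{K}} \frac{\big( D_k^{lu,max}(n) + D_k^l \big)^2}{2} + \frac{\big( D^u + \sum_{k \in \mathcal{K}} D_k^l - \epsilon \big)^2}{2} + \big( D^u - \epsilon \big)^2 \right\} \\
&\quad - \epsilon \, \mathbb{E}\{\Theta(n)\} + V \varepsilon e_f^{opt} + \sum_{k \in \mathcal{K}} V e_k^{opt} \\
&= C_D - \epsilon \, \mathbb{E}\{\Theta(n)\} + V \varepsilon e_f^{opt} + \sum_{k \in \mathcal{K}} V e_k^{opt},
\end{aligned}
\tag{A8}
$$
where $C_D$ is a constant. We sum the above inequality over all time slots except $n = N$, that is, over $n \in \mathcal{N}' \triangleq \{ n \in \mathcal{N} \mid n \ne N \}$. Using the telescoping property of the drift sum, we can obtain
$$
\mathbb{E}\{ U(N) - U(0) \} + \sum_{n \in \mathcal{N}'} \mathbb{E}\left\{ \sum_{k \in \mathcal{K}} V e_k(n) + V \varepsilon e_f(n) \right\} \le N C_D - \sum_{n \in \mathcal{N}'} \epsilon \, \mathbb{E}\{\Theta(n)\} + N \sum_{k \in \mathcal{K}} V e_k^{opt} + N V \varepsilon e_f^{opt}.
\tag{A9}
$$
Because the system queues are initially empty, $\mathbb{E}\{U(0)\} = 0$, and because the system eventually reaches a steady state, $\mathbb{E}\{U(N)\}$ is a bounded constant. Letting $N \to \infty$, dividing both sides of inequality (A9) by $N \epsilon$, and dropping some non-negative terms, we can obtain
$$
\begin{aligned}
\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \mathbb{E}\{\Theta(n)\} &\le \frac{C_D}{\epsilon} + \frac{V \varepsilon e_f^{opt}}{\epsilon} + \frac{\sum_{n \in \mathcal{N}'} \sum_{k \in \mathcal{K}} V e_k^{opt}}{N \epsilon} \\
&= \frac{C_D}{\epsilon} + \frac{V \varepsilon e_f^{opt}}{\epsilon} + \frac{1}{\epsilon} \sum_{k \in \mathcal{K}} \frac{1}{N} \sum_{n \in \mathcal{N}'} V e_k^{opt} \\
&= \frac{C_D + V \sum_{k \in \mathcal{K}} e_k^{opt} + V \varepsilon e_f^{opt}}{\epsilon}.
\end{aligned}
\tag{A10}
$$
Inequality (A10) shows that the system queue backlog is upper bounded, which means that the entire system is stable. Similarly, letting $N \to \infty$, dividing both sides of inequality (A9) by $N V$, and dropping some non-negative terms, we can obtain
$$
\lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \mathbb{E}\left\{ \sum_{k \in \mathcal{K}} e_k(n) + \varepsilon e_f(n) \right\} \le \frac{2 C_D}{V} + \sum_{k \in \mathcal{K}} e_k^{opt} + \varepsilon e_f^{opt}.
\tag{A11}
$$
This completes the proof of Theorem 1.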

References

  1. Gong, Y.; Yao, H.; Wang, J.; Li, M.; Guo, S. 6G Internet of Things: Edge Intelligence-Driven Joint Offloading and Resource Allocation for Future 6G Industrial Internet of Things. IEEE Trans. Netw. Sci. Eng. 2024, 11, 5644–5655.
  2. Lu, Y.; Luo, Z. An Effective Scheme for Delay Minimization in a Multi-UAV-Enabled NOMA-MEC System. IEEE Commun. Lett. 2025, 29, 40–44.
  3. Yan, J.; Wang, W.; Liu, J.; Deng, J.; Yuan, H.; Zhu, Y. Task Demand-Oriented Collaborative Offloading and Deployment Strategy in Software-Defined UAV-Assisted Edge Networks. IEEE Sens. J. 2025, 25, 1641–1655.
  4. Tang, X.; Tang, Q.; Yu, R.; Li, X. Digital Twin-Empowered Task Assignment in Aerial MEC Network: A Resource Coalition Cooperation Approach with Generative Model. IEEE Trans. Netw. Sci. Eng. 2025, 12, 13–27.
  5. Yahya, M.; Naeem, M.; Kaleem, Z.; Alenezi, A.H.; Ejaz, W. Robust Multicriterion Offloading in Digital-Twin-Assisted UAV Networks. IEEE Internet Things J. 2025, 12, 1643–1654.
  6. Wang, S.; Song, X.; Song, T.; Yang, Y. Fairness-Aware Computation Offloading with Trajectory Optimization and Phase-Shift Design in RIS-Assisted Multi-UAV MEC Network. IEEE Internet Things J. 2024, 11, 20547–20561.
  7. Liao, Y.; Song, Y.; Xia, S.; Han, Y.; Xu, N.; Zhai, X. Energy Minimization of RIS-Assisted Cooperative UAV–USV MEC Network. IEEE Internet Things J. 2024, 11, 32490–32502.
  8. Wang, T.; Fang, F.; Ding, Z. An SCA and Relaxation Based Energy Efficiency Optimization for Multi-User RIS-Assisted NOMA Networks. IEEE Trans. Veh. Technol. 2022, 71, 6843–6847.
  9. Liu, Q.; Han, J.; Liu, Q. Joint Task Offloading and Resource Allocation for RIS–assisted UAV for Mobile Edge Computing Networks. In Proceedings of the 2023 IEEE/CIC International Conference on Communications in China (ICCC), Dalian, China, 10–12 August 2023; pp. 1–6.
  10. Zhang, M.; Su, Z.; Xu, Q.; Qi, Y.; Fang, D. Energy-Efficient Task Offloading in UAV-RIS-Assisted Mobile Edge Computing with NOMA. In Proceedings of the IEEE INFOCOM 2024 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Vancouver, BC, Canada, 20 May 2024; pp. 1–6.
  11. Duo, B.; He, M.; Wu, Q.; Zhang, Z. Joint Dual-UAV Trajectory and RIS Design for ARIS-Assisted Aerial Computing in IoT. IEEE Internet Things J. 2023, 10, 19584–19594.
  12. Sultan, R.; Shamseldeen, A. STAR–RIS-Aided Full-Duplex Communication for Massive MIMO IoT Systems. In Proceedings of the 2023 IEEE 15th International Conference on Computational Intelligence and Communication Networks (CICN), Bangkok, Thailand, 22–23 December 2023; pp. 50–55.
  13. Ahmed, M.; Wahid, A.; Laique, S.S.; Khan, W.U.; Ihsan, A.; Xu, F.; Chatzinotas, S.; Han, Z. A Survey on STAR–RIS: Use Cases, Recent Advances, and Future Research Challenges. IEEE Internet Things J. 2023, 10, 14689–14711.
  14. Xiao, H.; Hu, X.; Mu, P.; Wang, W.; Zheng, T.X.; Wong, K.K.; Yang, K. Simultaneously Transmitting and Reflecting RIS (STAR–RIS) Assisted Multi-Antenna Covert Communication: Analysis and Optimization. IEEE Trans. Wirel. Commun. 2024, 23, 6438–6452.
  15. Su, Y.; Pang, X.; Lu, W.; Zhao, N.; Nallanathan, A. Joint Location and Beamforming Optimization for STAR–RIS Aided NOMA-UAV Networks. IEEE Trans. Veh. Technol. 2023, 72, 11023–11028.
  16. Ahmed, S.; Kamal, A.E. Sky’s the Limit: Navigating 6G with ASTAR–RIS for UAVs Optimal Path Planning. In Proceedings of the 2023 IEEE Symposium on Computers and Communications (ISCC), Gammarth, Tunisia, 9–12 July 2023; pp. 582–587.
  17. Zhang, H.; Zhang, X.; Long, K.; Ren, C.; Nallanathan, A. Resource Allocation for STAR-IRS-Aided UAV Secure Communication. In Proceedings of the ICC 2024-IEEE International Conference on Communications, Denver, CO, USA, 9–13 June 2024; pp. 5658–5663.
  18. Xiao, H.; Zhang, X.; Wang, W.; Wong, K.-K. STAR–RIS Enhanced UAV–MEC Networks: Bi-Directional Task Offloading. In Proceedings of the 2024 IEEE/CIC International Conference on Communications in China (ICCC), Hangzhou, China, 7–9 August 2024; pp. 30–35.
  19. Xiao, H.; Hu, X.; Wang, W.; Su, Z.; Wong, K.-K.; Yang, K. STAR–RIS and UAV Combination for Simultaneous Task Offloading and Communications in MEC Networks. In Proceedings of the 2024 16th International Conference on Wireless Communications and Signal Processing (WCSP), Hefei, China, 24–26 October 2024; pp. 857–862.
  20. Zhang, J.; Zhou, L.; Tang, Q.; Ngai, E.C.H.; Hu, X.; Zhao, H.; Wei, J. Stochastic Computation Offloading and Trajectory Scheduling for UAV-Assisted Mobile Edge Computing. IEEE Internet Things J. 2019, 6, 3688–3699.
  21. Yang, Z.; Bi, S.; Ding, Z.; Zhang, Y.-J.A. Dynamic Trajectory and Offloading Control of UAV-enabled MEC under User Mobility. In Proceedings of the 2021 IEEE International Conference on Communications Workshops (ICC Workshops), Montreal, QC, Canada, 14–23 June 2021; pp. 1–6.
  22. Kumar, M.; Kishor, A.; Singh, P.K.; Dubey, K. Deadline-Aware Cost and Energy Efficient Offloading in Mobile Edge Computing. IEEE Trans. Sustain. Comput. 2024, 9, 778–789.
  23. Bao, S.; Zhang, S.; Chi, K.; Chen, X.; Gao, W. Lyapunov-Based Computation Rate Maximization for Wireless Powered Edge Computing. In Proceedings of the 2023 19th International Conference on Mobility, Sensing and Networking (MSN), Nanjing, China, 14–16 December 2023; pp. 207–214.
  24. Mei, H.; Yang, K.; Liu, Q.; Wang, K. 3D-Trajectory and Phase-Shift Design for RIS-Assisted UAV Systems Using Deep Reinforcement Learning. IEEE Trans. Veh. Technol. 2022, 71, 3020–3029.
  25. Zeng, Y.; Chen, S.; Cui, Y.; Du, J. Efficient Trajectory Planning and Dynamic Resource Allocation for UAV-Enabled MEC System. IEEE Commun. Lett. 2024, 28, 597–601.
  26. Hazarika, B.; Singh, K.; Li, C.-P.; Schmeink, A.; Tsang, K.F. RADiT: Resource Allocation in Digital Twin-Driven UAV-Aided Internet of Vehicle Networks. IEEE J. Sel. Areas Commun. 2023, 41, 3369–3385.
  27. Cao, X.; Yang, B.; Huang, C.; Yuen, C.; Di Renzo, M.; Niyato, D.; Han, Z. Reconfigurable Intelligent Surface-Assisted Aerial-Terrestrial Communications via Multi-Task Learning. IEEE J. Sel. Areas Commun. 2021, 39, 3035–3050. [Google Scholar] [CrossRef]
  28. Cai, Y.; Wei, Z.; Hu, S.; Liu, C.; Ng, D.W.K.; Yuan, J. Resource Allocation and 3D Trajectory Design for Power-Efficient IRS-Assisted UAV-NOMA Communications. IEEE Trans. Wirel. Commun. 2022, 21, 10315–10334. [Google Scholar] [CrossRef]
  29. Mei, H.; Yang, K.; Shen, J.; Liu, Q. Joint Trajectory-Task-Cache Optimization with Phase-Shift Design of RIS-Assisted UAV for MEC. IEEE Wirel. Commun. Lett. 2021, 10, 1586–1590. [Google Scholar] [CrossRef]
  30. Deng, D.; Rao, W.; Liu, B.; Jia, D.; Sheng, Y.; Wang, J.; Xiong, S. TA-MAC: A Traffic-Aware TDMA MAC Protocol for Safety Message Dissemination in MEC–assisted VANETs. In Proceedings of the 2020 29th International Conference on Computer Communications and Networks (ICCCN), Honolulu, HI, USA, 3–6 August 2020; pp. 1–9. [Google Scholar]
  31. Yang, Q.; Li, Y.; Peng, Y.; Wu, T. Performance Comparison of RIS—Aided Uplink NOMA and OMA Multiuser Networks. In Proceedings of the 2023 IEEE 11th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China, 8–10 December 2023; pp. 1890–1896. [Google Scholar]
  32. Kota, N.R.; Naidu, K. Minimizing Energy Consumption in H-NOMA Based UAV-Assisted MEC Network. IEEE Commun. Lett. 2023, 27, 2536–2540. [Google Scholar] [CrossRef]
  33. Hao, C.; Chen, Y.; Mai, Z.; Chen, G.; Yang, M. Joint Optimization on Trajectory, Transmission and Time for Effective Data Acquisition in UAV-Enabled IoT. IEEE Trans. Veh. Technol. 2022, 71, 7371–7384. [Google Scholar] [CrossRef]
  34. Zhuo, Z.; Dong, S.; Zheng, H.; Zhang, Y. Method of Minimizing Energy Consumption for RIS Assisted UAV Mobile Edge Computing System. IEEE Access 2024, 12, 39678–39688. [Google Scholar] [CrossRef]
Figure 1. Scenario of UAV–MEC network system assisted by STAR–RIS.
Figure 2. Schematic diagram of UAV–MEC network process assisted by STAR–RIS.
Figure 3. TDMA time slot partition protocol.
Figure 4. Algorithm design framework.
Figure 5. Flight trajectories under different control factors.
Figure 6. Queue length under different control factors.
Figure 7. Total energy consumption under different control factors.
Figure 8. Changes in channel gain for different GUs.
Figure 9. Task backlog queue under different schemes.
Figure 10. Energy consumption under different environment parameters. (a) Energy consumption for different numbers of GUs. (b) Energy consumption for different numbers of STAR–RIS elements.
Table 1. Numerical simulation network parameters.

| Notation | Definition | Value |
|---|---|---|
| $W$ | Communication bandwidth | $3 \times 10^{4}$ Hz |
| $P_{\max}$ | Maximum transmit power of GU | 100 mW |
| $N_0$ | Noise power density | $1 \times 10^{9}$ dBm/Hz |
| $\varphi$ | Process density | $10^{3}$ cycles/bit |
| $A_k$ | Average task arrival rate | 40 kbit/Hz |
| $M_g$ | Weight of UAV | 9.8 kg |
| $\beta_0$ | Unit path loss | $1 \times 10^{7}$ |
| $f_r$ | Communication carrier frequency | 1 GHz |
| $\eta_0, \eta_1$ | Excessive path loss of LoS and NLoS | 0.1, 21 |
| $A, B$ | Constant values of environment | 9.61, 0.16 |
| $f_k^{l}$ | Processing frequency at local | $2 \times 10^{7}$ Hz |
| $f_u$ | Processing frequency at UAV | $2 \times 10^{8}$ Hz |
| $\Delta R_x, \Delta R_y$ | Distance between the array antennas | 1 m, 1 m |
| $\varepsilon$ | Energy attenuation factor of the UAV | $3 \times 10^{3}$ |
| $a_{UR}, a_{RG}$ | Path loss exponent | 1, 3.6 |
| $k_{RG}$ | Rician factor | 2 dB |

Hu, X.; Zhao, H.; Zhang, W.; He, D. Online Resource Allocation and Trajectory Optimization of STAR–RIS–Assisted UAV–MEC System. Drones 2025, 9, 207. https://doi.org/10.3390/drones9030207
