1. Introduction
Over the past few years, the swift advancement of mobile computing has been propelled by the growing use of different mobile devices, including smartphones and wearable devices, which enable computing and communication to be carried out anytime and anywhere. However, due to resource limitations, mobile devices often struggle to meet users’ quality of service expectations. Although cloud computing has significant advantages in resource management and computational performance compared to local computing, it usually relies on remote data centers with long data transmission latency [
1,
2], which can severely impact applications that demand high real-time performance such as multimedia or medical monitoring. In recent years, fog computing has emerged as a promising paradigm to address the limitations of traditional cloud-based architectures in smart city environments, such as high latency and limited context awareness. A recent systematic literature review by Rahman et al. [
3] provides a comprehensive classification of fog computing research into service-based, resource-based, and application-based approaches, highlighting its crucial role in latency-sensitive urban applications. Additionally, by offloading computing tasks to servers closer to users or task sources, mobile edge computing (MEC) provides an efficient solution to these challenges, alleviating the computational burden on user devices, and effectively reducing the data transmission delay compared with cloud computing.
For the multi-user scenario, researchers have proposed solutions to reduce the energy consumption of the system. In [
4], the authors introduced an energy-efficient beamforming scheme for downlink multi-user MISO systems, in which a base station equipped with dynamic metasurface antennas (DMAs) simultaneously serves multiple users, and solved it with an efficient alternating optimization (AO) algorithm. Additionally, researchers have investigated computational offloading strategies and suggested various methods to jointly optimize energy consumption and delay. In [5], the authors combined energy consumption and delay into a single framework, reformulating the problem as the search for an optimal solution within a finite policy space, and applied a heuristic approach to make offloading decisions for mobile devices. In [
6], the authors utilized an orthogonal frequency division multiplexing (OFDM) transmission mechanism to reduce inter-user interference while optimizing offloading decisions and bandwidth resources to lower the overall energy consumption of the MEC system. In [
7], the authors considered the uncertainty of resource demand and the delay constraints of heterogeneous computing tasks for dynamic time-varying systems, and co-optimized the offloading decision, computational resources, and bandwidth resources in the MEC system to minimize the total system energy consumption. Yousefpour et al. [
8] proposed a comprehensive IoT–fog–cloud architecture aimed at minimizing IoT service delay through fog offloading. Their framework allows IoT tasks to be adaptively offloaded among fog nodes based on real-time queue states and estimated processing times, effectively balancing load and improving response times. By developing both a Markovian queueing model and an event-driven simulator, they demonstrate that their fog-collaboration policy significantly lowers average service delay across different traffic patterns (e.g., light vs. heavy tasks). This work highlights the potential of collaborative fog offloading to meet stringent quality of service (QoS) requirements in large-scale IoT deployments, and it provides valuable insights for subsequent research on robust, low-latency fog-based architectures. The authors of [
9] introduced a computational offloading algorithm that jointly considers task prioritization and partial offloading. This algorithm makes offloading decisions based on the task's tolerable delay, the minimum computation delay, and the lowest energy consumption. The offloading task priorities and MEC server rankings are determined independently. Tasks with higher priority are offloaded first to MEC servers with greater computational power, alleviating the processing load on local devices. In [10], the authors proposed a mobile edge computing model consisting of multiple users and MEC servers, where each user has multiple independent tasks. By optimizing task offloading, computation, and communication resource allocation, the model seeks the overall decision that minimizes the weighted total energy cost and latency for all users. However, the above studies do not consider the cache capacity of the MEC servers; they all assume that every service program is cached on the servers.
In practice, the cache capacity of the MEC server’s storage is constrained, preventing it from caching all service programs. To address this problem, the authors of [
11] proposed a cache-assisted computation offloading scheme and optimized the caching policy, offloading policy, and resource allocation to decrease task processing time and conserve energy on user devices. In [
12], the authors explored an integrated optimization approach for multi-user computation offloading and service caching, modeling it as a mixed-integer nonlinear programming problem to reduce the system's task cost. In [
13], the authors designed a collaborative caching algorithm to assist computation offloading, which determined different cache contents for different tasks and designed corresponding update policies to optimize the usage of cache and computational resources in the system. The authors of [
14] used game theory to develop a suboptimal offloading strategy that incorporates service caching and D2D communication in multi-access networks, aiming to reduce computational offloading overhead by optimizing the offloading decision. In [
15], a cache-enhanced computational offloading system model was proposed, extending the local caching of a single region in an MEC network to collaborative caching across multiple regions. This approach improves the overall cache hit rate, with offloading decisions ultimately determined using deep reinforcement learning. The above studies assumed that users are within mobile network coverage and did not consider users in complex environments, such as areas where terrestrial infrastructure is scarce or unavailable.
Mobile networks cannot cover all areas due to complex terrain, infrastructure limitations, or natural disasters. The use of UAVs has become increasingly prevalent in wireless communication systems due to their flexibility to enable dynamic deployment [
16]. Equipping UAVs with MEC servers to serve users who are outside wireless network coverage can significantly improve the user experience compared to traditional terrestrial MEC networks. In [17], researchers investigated the dominant barriers and key techniques of THz-ISAC-UAV from the transceiver design perspective. To enhance both energy and hardware efficiency while addressing challenges related to distance and mobility, the authors focused on three critical technologies: UM-MIMO-ISAC hybrid beamforming, THz-ISAC waveform design, and communication and sensing channel state information acquisition. Finally, they provided a comprehensive discussion of the underlying principles and the key challenges associated with each.
In [
18], focusing on multi-UAV MEC systems, researchers examined service caching-based cooperative computation and resource allocation, proposing an optimization problem that minimizes the worst-case task completion delay under device and UAV energy constraints. In [
19], the authors utilized UAV-assisted service caching to obtain an optimal offloading and resource allocation policy that minimizes energy consumption under delay constraints by determining the 3D location of the UAV and the deployment of its services in the edge servers. The authors of [
20] proposed a model to minimize the total UAV endurance under delay constraints, which jointly optimizes the offloading and caching decisions, the UAV hovering trajectory, and its computational resource allocation to improve the communication and computational resource utilization.
Research on UAV-assisted service cache-based computational offloading is still at an early stage, and the related literature and results are relatively limited. Refs. [18, 19] studied UAV-assisted service cache-based computation offloading, but both assumed that users had already cached the required services. In practice, the service requirements of users are usually highly dynamic and diverse, so it is difficult to match and cache suitable services for all users in advance; moreover, some computational tasks rely not only on static code or data but also require real-time back-end processing. Assuming that all users hold caches is therefore unrealistic. In addition, the authors of [18] proposed a scheme that offloads user tasks to the base station for computation when the UAV's computational capability or cache resources cannot meet the user's needs. However, this scheme does not account for users that are far from the base station, which makes it difficult for the UAV to cover both the users and the base station at the same time. Moreover, when the UAV has not cached the relevant service program, the scheme only considers downloading it from the base station, which wastes the service resources available on user devices. In [19], the authors assumed that only tasks whose services are cached by the UAV can be offloaded; this ignores the possibility that the UAV downloads programs from either the user or the base station, and therefore underutilizes the computational resources of the UAV. In [
20], the authors assumed that the user’s computing power and cache capacity are much smaller than those of the UAV. In addition, they did not study local computing and caching, and concentrated on offloading computing workloads and the processing of UAV-cached tasks at the current and previous moments, without considering service caching. To tackle the aforementioned problems, this paper presents a UAV-assisted resource optimization algorithm based on service caching and downloading, which introduces a relay UAV and a computing UAV in order to assist users in completing computation tasks. To reduce total system energy consumption, this paper jointly optimizes UAV positions, caching decisions, computational resource allocation for the computing UAV, and user offloading strategies. The key contributions are summarized as follows:
- (1) A UAV-assisted cooperative computing model based on service caching and downloading is constructed. It consists of users, a relay UAV, a computing UAV, and a base station, with the users located outside the coverage range of the base station. The computing UAV can cache some of the service programs and can provide computing and downloading services to users. The relay UAV has wider communication coverage and can download programs from the base station and forward them to a user or the computing UAV when neither has cached the relevant service program.
- (2) Under the constraints of delay and the cache capacity of the computing UAV, an optimization problem is formulated to minimize system-wide energy consumption. The problem is a mixed-integer nonlinear programming (MINLP) problem, which is divided into three sub-problems. The UAV location deployment and service caching sub-problem involves both discrete and continuous variables and is solved using a simulated annealing algorithm; the computational resource allocation sub-problem is analyzed to establish the relationship between the resource allocation ratio and the overall system energy consumption and is solved using a greedy algorithm; the offloading strategy sub-problem has binary optimization variables and is solved using a genetic algorithm in which the offloading strategy serves as the genetic code of each individual.
- (3) The solution to the objective problem is obtained through global iteration over the sub-problems. Simulation results demonstrate that the proposed algorithm achieves substantial energy savings compared to benchmark algorithms. In comparison with the algorithm introduced in [18], the overall energy consumption is reduced by approximately 40%.
The remainder of this paper is organized as follows: The system model is described in Section 2, and the objective problem based on this system model is presented in Section 3. In Section 4, the optimization method for each sub-problem is given and the simulation results are analyzed. Finally, the paper is summarized in Section 5.
  2. System Model
As shown in Figure 1, the system model consists of a relay UAV, a computing UAV, multiple ground users, and a base station. Considering the UAV's constrained capacity, the computation function and the long-distance communication function are handled by two separate UAVs: the computing UAV carries an MEC server to provide computation and service program caching services for users, and the relay UAV assists the communication between users, the computing UAV, and the base station. The set of user devices is denoted by 
; each device is assigned a computational task to complete, and execution of the task can take place on the local device or on the computing UAV.
Let 
 signify the task assigned to device 
m, where 
 represents the size of the task data, 
 represents the amount of CPU cycles needed to process a single bit of task data, and 
 denotes the upper limit of permissible latency for processing the task. Let 
 be the user device offloading strategy, where 
 implies that the task is executed directly on the user device, and 1 means that the task is executed on the computing UAV, which can serve multiple user devices at the same time [
21]. Each task requires a corresponding program to execute; the set of all such programs is denoted as 
, and the size of each program is 
. Here, 
 is used to denote that the computational task of the device 
m requires a program 
p, and the tasks of multiple devices may require the same program to execute. The base station can cache the service programs required by all user devices because it is equipped with a high-performance server, whereas user devices and UAVs can cache only some of the programs due to their limited cache capacity.
A three-dimensional Cartesian coordinate system is used to specify the coordinates of the UAV and the user equipment, assuming that the positions of the UAV and the user equipment remain unchanged during the execution of the computational task, that the UAV hovers at a fixed altitude of , whose horizontal coordinates are represented by , that the user equipment and the base station coordinates are represented by  and , respectively, and that the horizontal altitude of both can be neglected with respect to the UAV.
  2.1. Communication Model
This model involves wireless communication between the user device, UAV, and base station. Assuming that the communication spectrum between the three is independent and there is no interference with each other, the bandwidth of the link between the computing UAV and the relay UAV is , the bandwidth of the link between the base station and the relay UAV is , the link bandwidths between user devices and the two UAVs are both .
  2.1.1. Communication Between Ground Equipment (User Equipment, Base Stations) and UAVs
Real-world environments contain multiple scatterers and obstructions, so signals do not propagate according to the free-space model and incur additional path loss; it is therefore difficult to reflect the actual situation using the free-space path loss (FSPL) model [
22]. In this paper, a probability-based path loss model [
23] is used, which integrates the occurrence probability of line-of-sight (LoS) and non-line-of-sight (NLoS) communications and their corresponding path loss characteristics. The LoS and NLoS communication probabilities between the base station and the relay UAV are, respectively,
          where 
 signifies the Euclidean distance between the base station and the relay UAV, and 
a and 
b are environment-dependent constants. The path loss between the base station and the relay UAV is
          where 
 is the signal’s carrier frequency, 
c is the speed of light, and 
 is the additional path loss of line-of-sight and non-line-of-sight links. Therefore, the average path loss between the base station and the relay UAV is
The channel gain between the two is
Similarly, the channel gains for the computing UAV–user and relay UAV–user connections can be obtained, respectively,
          where 
 is the average path loss between the computing UAV and user devices, while 
 denotes the average path loss between the relay UAV and user devices.
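The probabilistic path loss model described above can be sketched as follows. This is a minimal illustration, assuming the commonly used sigmoid form of the LoS probability in the elevation angle; the function names, the parameter values for a, b, and the additional losses, and the 2 GHz carrier are all hypothetical and are not taken from the paper.

```python
import math

C_LIGHT = 3.0e8  # speed of light in m/s


def los_probability(distance_m, altitude_m, a, b):
    """LoS probability from the elevation angle in degrees (assumed sigmoid form)."""
    elevation_deg = math.degrees(math.asin(altitude_m / distance_m))
    return 1.0 / (1.0 + a * math.exp(-b * (elevation_deg - a)))


def average_path_loss_db(distance_m, altitude_m, carrier_hz, a, b,
                         eta_los_db, eta_nlos_db):
    """Probability-weighted mean of the LoS and NLoS path losses, in dB."""
    free_space_db = 20.0 * math.log10(
        4.0 * math.pi * carrier_hz * distance_m / C_LIGHT)
    p_los = los_probability(distance_m, altitude_m, a, b)
    return (p_los * (free_space_db + eta_los_db)
            + (1.0 - p_los) * (free_space_db + eta_nlos_db))


def channel_gain(path_loss_db):
    """Convert a path loss in dB to a linear channel gain."""
    return 10.0 ** (-path_loss_db / 10.0)


# Illustrative values only: UAV at 100 m altitude, 200 m horizontal offset.
distance = math.hypot(200.0, 100.0)
loss_db = average_path_loss_db(distance, 100.0, 2.0e9, 9.61, 0.16, 1.0, 20.0)
gain = channel_gain(loss_db)
```

The NLoS probability is taken as the complement of the LoS probability, so the average loss always lies between the pure LoS and pure NLoS values.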
Accordingly, the communication rate between the base station and the relay UAV is
          where 
 is the transmission power of the base station, and 
N is the noise power.
The data transmission rate between the relay UAV and the user device is
          where 
 is the transmit power of the relay UAV.
If a user device offloads the computational task to the computing UAV, the data transfer rate is
          where 
 is the device’s transmission power.
If a user device downloads the relevant service program from the computing UAV, the data transfer rate is
          where 
 is the transmit power of the computing UAV.
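All four link rates above depend on a bandwidth, a transmit power, a channel gain, and the noise power. The sketch below assumes the standard Shannon-capacity form for such a rate; the function names and the numeric values are illustrative assumptions, not the paper's notation.

```python
import math


def link_rate_bps(bandwidth_hz, tx_power_w, channel_gain, noise_power_w):
    """Achievable rate of a single interference-free link (Shannon form)."""
    snr = tx_power_w * channel_gain / noise_power_w
    return bandwidth_hz * math.log2(1.0 + snr)


def transfer_time_s(data_bits, rate_bps):
    """Time needed to move a task or a service program over the link."""
    return data_bits / rate_bps


# Illustrative values only: 1 MHz bandwidth and an SNR of exactly 1.
rate = link_rate_bps(1.0e6, 1.0, 1.0e-9, 1.0e-9)
delay = transfer_time_s(2.0e6, rate)
```

Only the transmit power and the channel gain change between the base station, relay UAV, computing UAV, and user device links, so one helper covers all of them.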
  2.1.2. Communication Between the Relay UAV and the Computing UAV
Considering that the UAV has sufficient hovering height, the communication environment can be assumed to be free space, so the path loss is
          where 
 is the distance between the relay UAV and the computing UAV. The channel gain between the two is
Therefore, the data transfer rate between the relay UAV and the computing UAV is
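For reference, the standard free-space relations that match the description above take the following form. The symbols are assumptions chosen for illustration ($d$ for the distance between the two UAVs, $f_c$ for the carrier frequency, $B$ for the inter-UAV bandwidth, $P$ for the transmit power, $N$ for the noise power) and may differ from the paper's own notation.

```latex
\begin{align}
  L^{\mathrm{FS}} &= 20\log_{10}\!\left(\frac{4\pi f_c d}{c}\right) \quad \text{(dB)}, \\
  g &= 10^{-L^{\mathrm{FS}}/10}, \\
  R &= B\log_{2}\!\left(1 + \frac{P\,g}{N}\right).
\end{align}
```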
  2.2. Cache Model
The computing UAV can cache some programs for use by the user devices. The caching decision of the computing UAV for a program 
p is denoted as 
; 
 denotes that it has cached the service program 
p, and 0 indicates that it has not cached the program. Since the computing UAV has a constrained cache capacity, the cumulative size of the cached programs cannot surpass its maximum storage limit; then,
        where 
K is the cache capacity of the computing UAV.
Each user device also has a limited cache capacity; hence,
        where 
 indicates the cache capacity of the device 
m, 
 is a binary constant, and 
 indicates that the device 
m has cached the program 
p.
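The cache capacity constraint can be checked with a few lines of code. The sketch below assumes the caching decision is a 0/1 list aligned with a list of program sizes; the function name and the example numbers are hypothetical.

```python
def cache_is_feasible(cache_decision, program_sizes, capacity):
    """Return True if the total size of the cached programs fits the capacity."""
    used = sum(size for cached, size in zip(cache_decision, program_sizes) if cached)
    return used <= capacity


# Illustrative values only: three programs of sizes 4, 5, and 3 and a capacity of 8.
fits = cache_is_feasible([1, 0, 1], [4, 5, 3], 8)       # 4 + 3 = 7
overflows = cache_is_feasible([1, 1, 1], [4, 5, 3], 8)  # 4 + 5 + 3 = 12
```

The same check applies to the computing UAV and to each user device, with the corresponding capacity.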
  2.3. Computational Model
The flowchart of the computation offloading model designed in this paper is shown in 
Figure 2, where the computation tasks for each device can be computed locally or executed remotely on the computing UAV. When a user chooses local computation, there will be the following three scenarios:
- 1. If the user device m has cached the relevant service program, the delay and energy consumption during task execution are as follows:
            where  is the effective switched capacitance coefficient, and  denotes the local computing resources of the device.
- 2. If the user device has not cached the service program but the computing UAV has, the device downloads the service program from the computing UAV, and the task completion delay and energy consumption are, respectively,
- 3. If neither the user device m nor the computing UAV has cached the relevant service program, the device downloads the service program from the base station through the relay UAV, and the task completion delay and energy consumption are, respectively,
When a task is offloaded by a user to the computing UAV for computation, the following three scenarios occur:
- 1. If the computing UAV has cached the corresponding service program, the task completion delay and energy consumption are, respectively,
            where  is the ratio of the computing UAV's computing resources allocated to the user device.
- 2. If the computing UAV has not cached the service program but the user device m has, the computing UAV downloads the service program from the device, and the task completion delay and energy consumption are, respectively,
- 3. If neither the computing UAV nor the user device m has cached the service program, the computing UAV downloads the service program from the base station through the relay UAV, and the task completion delay and energy consumption are, respectively,
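The six scenarios above differ mainly in where the service program comes from. The sketch below encodes that decision logic and, as an assumption, the widely used dynamic-power model for local computation (delay equal to cycles divided by CPU frequency, energy equal to the capacitance coefficient times frequency squared times cycles). All names and numeric values are illustrative.

```python
def program_source(offload, device_has_program, uav_has_program):
    """Return where the executing node obtains the service program."""
    if offload:
        # Task runs on the computing UAV.
        if uav_has_program:
            return "none"
        if device_has_program:
            return "user_device"
        return "base_station_via_relay"
    # Task runs locally on the user device.
    if device_has_program:
        return "none"
    if uav_has_program:
        return "computing_uav"
    return "base_station_via_relay"


def local_compute(data_bits, cycles_per_bit, cpu_hz, kappa):
    """Delay (s) and energy (J) of local execution under the assumed model."""
    cycles = data_bits * cycles_per_bit
    return cycles / cpu_hz, kappa * cpu_hz ** 2 * cycles


# Illustrative values only.
delay_s, energy_j = local_compute(1.0e6, 100.0, 1.0e9, 1.0e-27)
```

Any download delay and energy implied by `program_source` would be added on top of the computation terms.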
Therefore, for the user device, the delay and system energy consumption required to accomplish its task are, respectively,
  2.4. Description of the Problem
This research jointly optimizes the location deployment of the computing UAV 
, the relay UAV 
, the service caching of the computing UAV 
C, the resource allocation for computation 
, and the offloading decision of user devices 
 in order to minimize the total system energy consumption. Since the mechanical energy consumption is essentially constant in this model, this paper treats it as a given and focuses on optimizing the communication and computational energy consumption. The problem can be expressed as
Constraint 
 enforces that the user device’s offloading decision must be a binary variable. Constraint 
 indicates that a binary variable is used to represent the computing UAV’s caching decision, and Constraint 
 indicates that the UAV’s computational resource allocation ratio is a continuous variable. Constraint 
 indicates that the computing UAV will only allocate resources to users who perform offloading, and the sum of the resource allocation ratios will not exceed 1. Constraint 
 indicates that the sum of the program sizes cached in the computing UAV must not exceed the maximum storage capacity of the UAV. Constraint 
 means that the maximum tolerable delay sets an upper limit on each task’s completion time. Constraint 
 limits the location of the two UAVs to ensure that they can communicate, where 
, 
 indicates the UAV antenna’s half-power beamwidth [
24]. Constraint 
 guarantees that the relay UAV maintains communication with the base station, where 
, 
 denotes the base station’s communication range in 3D space. For user devices that perform computational offloading or need to download a service program from the computing UAV, the constraint 
 restricts these devices to be within the coverage area of the computing UAV, where 
. If user devices need to download service programs from the base station via the relay UAV, the constraint 
 restricts these devices to be within the coverage area of the relay UAV, where 
.
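The constraints on the offloading decisions and the resource allocation ratios can be verified programmatically. The sketch below covers only those constraints (binary offloading decisions, ratios between 0 and 1, resources granted only to offloading users, and a total of at most 1); the function name and tolerance are assumptions.

```python
def allocation_is_valid(offload, ratios, tol=1e-9):
    """Check the offloading and resource-allocation constraints described above."""
    if len(offload) != len(ratios):
        raise ValueError("offload and ratios must have the same length")
    for decision, ratio in zip(offload, ratios):
        if decision not in (0, 1):
            return False  # offloading decision must be binary
        if not 0.0 <= ratio <= 1.0:
            return False  # ratio must lie in [0, 1]
        if decision == 0 and ratio > tol:
            return False  # no resources for users that compute locally
    return sum(ratios) <= 1.0 + tol


# Illustrative values only.
ok = allocation_is_valid([1, 0, 1], [0.4, 0.0, 0.5])
wasted = allocation_is_valid([1, 0, 1], [0.4, 0.2, 0.4])
overbooked = allocation_is_valid([1, 1], [0.7, 0.6])
```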
  3. Joint Optimization Algorithm
Because the original problem is non-convex, it is decomposed into three sub-problems that are interrelated but can be solved separately; the three sub-problems are then updated iteratively to obtain an approximate or globally optimal solution to the overall problem. The sub-problem of UAV location deployment and service caching contains both discrete variables (the caching decisions) and continuous variables (the UAV locations) and is itself non-convex, which makes it difficult to solve with conventional analytic or gradient-based methods. Simulated annealing, a typical global heuristic, can search a large solution space and, to a certain extent, avoid becoming trapped in local optima, which makes it suitable for complex optimization problems with mixed discrete and continuous variables. In the computational resource allocation sub-problem, when the other variables are fixed, the task delay is monotonic in the computational resources allocated to the user, so the optimal allocation can be obtained directly with the greedy idea of “allocating as few resources as possible” while satisfying the delay constraint. This sub-problem is relatively simple, and the greedy algorithm efficiently yields a closed-form or approximately closed-form solution. Finally, the offloading decision sub-problem is a purely discrete binary optimization problem whose dimension grows with the number of users and which is also non-convex. Genetic algorithms can efficiently find an approximately globally optimal solution in a large discrete search space through the “selection-crossover-mutation” mechanism, which suits this kind of discrete combinatorial optimization.
These three algorithms are chosen to solve the corresponding sub-problems separately, which exploits their respective strengths for each sub-problem's characteristics, spreads the complexity across the sub-problems, and ultimately yields a near-optimal solution to the overall problem through iterative solving.
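The global iteration over the three sub-problems can be sketched as an alternating loop. This is a minimal skeleton under stated assumptions: each sub-problem solver is a hypothetical callable that takes the current joint solution and returns an updated one, and the stopping rule (improvement below a tolerance) is an illustrative choice rather than the paper's criterion.

```python
def alternating_optimization(state, solvers, objective, max_rounds=20, tol=1e-6):
    """Apply the sub-problem solvers in turn until the objective stops improving."""
    best_state, best_value = state, objective(state)
    for _ in range(max_rounds):
        candidate = best_state
        for solve in solvers:
            candidate = solve(candidate)  # other variables stay fixed inside each solver
        value = objective(candidate)
        if best_value - value < tol:
            break  # no meaningful improvement in this round
        best_state, best_value = candidate, value
    return best_state, best_value


# Toy stand-ins for the sub-problem solvers; each fixes one block of variables.
toy_solvers = [lambda s: {**s, "x": 1.0}, lambda s: {**s, "y": -2.0}]
toy_objective = lambda s: (s["x"] - 1.0) ** 2 + (s["y"] + 2.0) ** 2
final_state, final_value = alternating_optimization(
    {"x": 5.0, "y": 5.0}, toy_solvers, toy_objective)
```

In the setting of this section, the three callables would wrap the simulated annealing, greedy, and genetic routines, and the objective would be the total system energy consumption.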
  3.1. UAV Position and Cache Decision Solving
When the variables 
, 
 are determined, the problem 
 can be rewritten as
The problem remains non-convex, and this paper employs the simulated annealing algorithm to solve it. Simulated annealing is a heuristic optimization approach that draws on the annealing of solids, in which slowly lowering the temperature allows the internal particles of the solid to gradually reach a more stable arrangement. The algorithm applies this idea by gradually reducing the temperature parameter T while searching the solution space and accepting worse solutions with a certain probability, so that it can escape local optima and eventually find a better solution in the solution space [
25]. When the simulated annealing algorithm is used to solve the problem 
, the solution space, the fitness function, and the Metropolis criterion are defined as follows:
- 1. Solution space: In this paper, the solution space can be expressed as  The initial solution  can be generated from random values, and the optimal solution  is obtained after n iterations of the simulated annealing algorithm. In the global iteration stage, the initial solution  of the Nth run of the simulated annealing algorithm is the best solution  of the (N−1)th run.
- 2. Fitness function: To improve the search efficiency of the algorithm and the quality of the solution, the fitness function consists of the objective function augmented by a weighted penalty value, i.e.,
            where  represents the penalty value accumulated during the iteration process when the solution does not satisfy the constraints in ,  is the value obtained in the nth iteration using the current solution, Y is the constraint value, and  corresponds to the different weights of the penalty value. The logarithmic form of the penalty value is used to control its growth and avoid numerical explosion.
- 3. Metropolis criterion: This criterion determines whether to accept a new solution at each iteration of the algorithm. By introducing a certain amount of randomness, it helps the algorithm avoid converging to a local optimum and thus explore the solution space for the global optimum more efficiently. The probability of accepting a new solution is
            where  represents the difference between the new fitness value and the current fitness value, T denotes the current temperature, and the update equation is
            where  is the initial temperature,  is the cooling rate, and n is the number of iterations in the algorithm.
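For reference, the standard forms of the Metropolis acceptance rule and of a geometric cooling schedule, which are consistent with the quantities defined in item 3, are given below. The symbols $\Delta F$, $T_0$, and $\alpha$ are assumed names for the fitness difference, the initial temperature, and the cooling rate; the paper's exact notation may differ.

```latex
\begin{align}
  P_{\mathrm{accept}} &=
  \begin{cases}
    1, & \Delta F \le 0, \\
    \exp\!\left(-\dfrac{\Delta F}{T}\right), & \Delta F > 0,
  \end{cases} \\
  T &= T_0\,\alpha^{n}, \qquad 0 < \alpha < 1 .
\end{align}
```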
Firstly, the relevant parameters are initialized, the total energy consumption and the penalty term under the current scheme are calculated, and the penalty term is added to the total energy consumption to obtain the fitness value. Then, the variables are optimized by the simulated annealing algorithm; specifically, a neighborhood search method is used to generate the new solution to obtain the new fitness value, and a decision is made on whether to accept the new solution according to the acceptance criterion. Finally, the temperature is lowered by continuous iteration until the temperature reaches a certain threshold value. The specific solution process is shown in Algorithm 1.
The algorithm’s computational complexity is structured into two primary layers. For the outer annealing loop, it takes 
 iterations for the temperature to decrease from the initial 
T to 
; the loop also exits when 
 reaches the specified 
, so the outer loop time complexity is 
. For the inner loop at each temperature, the main operations are generating the neighborhood solution and calculating the total energy consumption and the fitness value. The neighborhood solution is generated by randomly choosing the number of perturbations to apply, which in the worst case is of the same order of magnitude as the number of service program types, so the time complexity is 
, where 
 is the number of service program types. Since the overall energy consumption and the fitness value are calculated for each user, the time complexity of both operations is 
, i.e., the time complexity of the inner loop at each temperature is 
. In summary, the time complexity of Algorithm 1 is 
.
        
| Algorithm 1 Cache and UAV position optimization algorithm based on simulated annealing algorithm | 
| Require: initialization constants, offloading decisions , computing resource allocation ratios , caching decisions C, UAV positions  and
 Ensure: C,  and
  1: Initialize , , initial temperature , cooling rate , minimum temperature threshold , number of iterations per temperature L, maximum number of consecutive non-improvements allowed , penalty factor 
  2: Calculate the overall energy consumption of the system  based on the current solution, with the fitness value 
  3: Let 
  4: while  and  do
  5:    
  6:    for  to L do
  7:       Calculate the overall energy consumption of the system  and the fitness value , respectively, based on the neighborhood solution.
  8:       if  or  then
  9:          
 10:          
 11:       end if
 12:       if  then
 13:          
 14:          
 15:          
 16:       end if
 17:    end for
 18:    if  then
 19:       
 20:    else
 21:       
 22:    end if
 23:    
 24: end while
 25: return 
 | 
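The overall structure of Algorithm 1 (an outer cooling loop with a stall counter and an inner loop with Metropolis acceptance) can be sketched as follows. This is not the paper's implementation: the toy fitness, the neighborhood move, the logarithmic penalty weight, and all parameter values are hypothetical stand-ins for the energy model and constraints.

```python
import math
import random


def simulated_annealing(initial, fitness, neighbor, t0=100.0, alpha=0.9,
                        t_min=1e-3, steps_per_temp=50, max_stall=30, rng=None):
    """Minimize `fitness` over solutions produced by `neighbor`."""
    rng = rng or random.Random(0)
    current, current_fit = initial, fitness(initial)
    best, best_fit = current, current_fit
    temperature, stall = t0, 0
    while temperature > t_min and stall < max_stall:
        improved = False
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            cand_fit = fitness(candidate)
            delta = cand_fit - current_fit
            # Metropolis criterion: always accept improvements, sometimes accept worse.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_fit = candidate, cand_fit
            if current_fit < best_fit:
                best, best_fit = current, current_fit
                improved = True
        stall = 0 if improved else stall + 1  # consecutive non-improving temperatures
        temperature *= alpha                  # geometric cooling
    return best, best_fit


SIZES, CAPACITY = (4, 3, 5), 8  # toy program sizes and cache capacity


def toy_fitness(solution):
    """Toy energy plus a logarithmic penalty for exceeding the cache capacity."""
    cache, (x, y) = solution
    energy = (x - 3.0) ** 2 + (y - 4.0) ** 2 + sum(2.0 for c in cache if not c)
    overflow = max(0.0, sum(s for c, s in zip(cache, SIZES) if c) - CAPACITY)
    return energy + 10.0 * math.log1p(overflow)


def toy_neighbor(solution, rng):
    """Flip one caching bit and jitter the position (mixed discrete/continuous move)."""
    cache, (x, y) = solution
    cache = list(cache)
    i = rng.randrange(len(cache))
    cache[i] = 1 - cache[i]
    return tuple(cache), (x + rng.uniform(-0.5, 0.5), y + rng.uniform(-0.5, 0.5))


start = ((0, 0, 0), (0.0, 0.0))
best, best_fit = simulated_annealing(start, toy_fitness, toy_neighbor)
```

The solution pairs a tuple of caching bits with a continuous position, mirroring the mixed discrete and continuous variables of this sub-problem.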
  3.2. Optimization of Computing Resource Allocation
The problem 
 can be rewritten when the variables 
Q, 
C, 
 are determined:
Equation (29) suggests that, when the other variables are known, the total system energy consumption required to complete the computational task is proportional to the computational resources allocated by the computing UAV, so as few computational resources as possible should be allocated while satisfying the delay constraints. For the user device 
m, it can be obtained from Equation (
28) and Constraint 
 that
The resource allocation strategy that attains the lowest value of the objective function is obtained by converting the inequality in Equation (34) into an equality. The specific implementation is shown in Algorithm 2.
        
| Algorithm 2 Greedy-based UAV computational resource allocation algorithm | 
| Require: initialization constants, offloading decisions , caching decision C, UAV positions  and
 Ensure:
  1: resource allocations 
  2: for  to  do
  3:    
  4:    
  5: end for
  6: return 
 | 
Since the number of iterations of the above algorithm depends only on the number of user devices, its time complexity is linear in the number of user devices.
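The greedy idea of allocating as few resources as possible can be sketched in a single pass over the users. The sketch assumes that the remote computation delay equals the required CPU cycles divided by the allocated share of the UAV's CPU frequency, and that every other delay component (transmission, program download) is already known; the field names and numbers are hypothetical.

```python
def greedy_allocation(tasks, uav_cycles_per_s):
    """Give each offloading user the smallest ratio that still meets its deadline."""
    ratios = []
    for task in tasks:
        if not task["offload"]:
            ratios.append(0.0)  # local tasks receive no UAV resources
            continue
        slack = task["deadline_s"] - task["transfer_delay_s"]
        if slack <= 0:
            raise ValueError("deadline cannot be met even with full resources")
        # Smallest ratio such that cycles / (ratio * F) equals the remaining slack.
        ratios.append(task["cycles"] / (uav_cycles_per_s * slack))
    if sum(ratios) > 1.0:
        raise ValueError("offloading set is infeasible for this UAV")
    return ratios


# Illustrative values only.
tasks = [
    {"offload": 1, "cycles": 1.0e9, "deadline_s": 1.5, "transfer_delay_s": 0.5},
    {"offload": 0, "cycles": 5.0e8, "deadline_s": 1.0, "transfer_delay_s": 0.0},
    {"offload": 1, "cycles": 2.0e9, "deadline_s": 1.0, "transfer_delay_s": 0.0},
]
ratios = greedy_allocation(tasks, 1.0e10)
```

Each user is visited once, which matches the linear time complexity stated above.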
  3.3. Offloading Decision Optimization
The optimization problem  can be rewritten when the variables Q, C, and  are determined:
The problem involves discrete variables, which makes it suitable for a genetic algorithm. The genetic algorithm is a probabilistic search and optimization method driven by concepts from natural selection and genetic evolution. It simulates the evolutionary process and continuously improves the quality of the solution through the mechanisms of “selection”, “crossover”, and “mutation”, ultimately yielding a near-optimal solution to the problem [26].
1. Population formation and optimal individual selection: In this paper, the population is assumed to contain  individuals, and each individual in the population is an offloading strategy, so the size of the initialized population matrix is . To avoid the inefficiency caused by a completely random population, the best solution of the previous optimization is used as part of the initial solution of the current optimization. The fitness values of individuals are calculated according to Equations (32) and (33), and the tournament selection method randomly selects multiple individuals from the population each time and chooses the individual with the lowest fitness as the parent.
2. Individual crossover and mutation: Two parents produce offspring by single-point crossover, and a mutation operation is then performed on the offspring, i.e., the gene locus is randomly flipped according to the probability . The crossover and mutation processes are illustrated in Figure 3 and Figure 4.
3. Population update: Replacing the current population with the offspring generated in step 2 ensures that the new population inherits superior genes and remains diverse.
Firstly, the relevant parameters are initialized, and the optimal individuals obtained previously are mutated to obtain the initial population. Secondly, the fitness value is obtained by calculating the energy consumption and task completion time of each individual in the current population, and the better individuals are retained by the tournament selection method shown in Algorithm 3. Then, a crossover point is randomly selected for gene exchange, and the genes are flipped with a certain probability. Finally, upon reaching the maximum iteration count, the iteration is stopped to obtain the optimal offloading decision. Algorithm 4 illustrates the overall execution of the proposed algorithm.
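The crossover and mutation operators described above can be sketched as follows. This is an illustrative example that assumes each individual is a list of binary genes (0 for local computation, 1 for offloading); the function names and this encoding are assumptions, not taken from the paper.

```python
import random


def single_point_crossover(p1, p2, rng):
    """Swap the tails of two equal-length gene lists at a random cut point."""
    cut = rng.randrange(1, len(p1))  # cut lies strictly inside the chromosome
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def bit_flip_mutation(individual, p_mut, rng):
    """Flip each binary gene independently with probability p_mut."""
    return [1 - g if rng.random() < p_mut else g for g in individual]


if __name__ == "__main__":
    rng = random.Random(0)
    child_a, child_b = single_point_crossover([0] * 6, [1] * 6, rng)
    print(child_a, child_b)
    print(bit_flip_mutation(child_a, 0.1, rng))
```

Both operators return new lists and leave the parents unchanged, so the same parent can safely be reused within a generation.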
        
| Algorithm 3 Tournament selection method | 
| Require: population matrix , fitness value for each individual in the population , number of individuals randomly selected from the population
 Ensure:
 1: Count the number of individuals in the current population
 2: 
 3: Within  randomly select  individual indices  that represent the individuals competing in the current round of the tournament.
 4: Find the individual with the best fitness among the selected individuals 
 5: return 
 | 
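A minimal Python sketch of the tournament selection in Algorithm 3 is given below. The function and argument names are hypothetical. Because the objective is energy consumption, the "best" competitor is taken to be the one with the lowest fitness value, as stated in the text.

```python
import random


def tournament_select(fitness, k, rng):
    """Draw k distinct individuals at random and return the index of the fittest.

    fitness : list of fitness values, one per individual (lower is better)
    k       : number of competitors per tournament (k <= len(fitness))
    rng     : random.Random instance used for sampling
    """
    competitors = rng.sample(range(len(fitness)), k)
    return min(competitors, key=lambda i: fitness[i])


if __name__ == "__main__":
    rng = random.Random(1)
    print(tournament_select([5.0, 3.0, 9.0, 1.0], k=2, rng=rng))
```

A larger `k` increases selection pressure. When `k` equals the population size, the globally best individual always wins.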
| Algorithm 4 Genetic algorithm-based optimization algorithm for offloading decision | 
| Require: initialization constants, compute resource allocation ratio , caching decision C, UAV positions  and , penalty factor , last offloading optimization strategy
 Ensure:
   1: Initialize population size , maximum number of iterations , crossover rate , mutation rate 
   2: Generate initial populations: 
   3: 
   4: 
   5: 
   6: for  to  do
   7:    for  to  do
   8:       calculate the fitness value for each individual
   9:    end for
  10:    
  11:    if  then
  12:       
  13:       
  14:    end if
  15:    for  to  do
  16:       Two parents ,  were selected by Algorithm 3, and 
  17:       if  then
  18:          Randomly select the intersection point and perform a crossover operation on  and 
  19:       end if
  20:       for  to  do
  21:          if  then
  22:             
  23:          end if
  24:       end for
  25:       
  26:    end for
  27:    
  28: end for
  29: 
  30: return 
                      
 | 
Since the fitness of each individual in the population must be evaluated in every generation, the time complexity of this operation is ; for each individual, crossover and mutation are constant-time operations, so the time complexity of both population crossover and mutation is . In summary, the time complexity of the algorithm is . The three algorithms above are executed in a cyclic iteration to obtain the final solution. For any user device unable to obtain a feasible solution, the required energy consumption for task completion is assigned as .
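The penalty treatment of infeasible devices can be illustrated with a short sketch. It assumes, for the purpose of the example only, that a device is infeasible when its task delay exceeds its maximum tolerable delay; the function name and arguments are hypothetical.

```python
def penalized_energy(energies, delays, t_max, penalty):
    """Sum per-device energy, replacing infeasible devices by a fixed penalty.

    energies[m] : energy needed by device m under the current decision (J)
    delays[m]   : resulting task completion delay of device m (s)
    t_max[m]    : maximum tolerable delay of device m (s)
    penalty     : energy value charged to a device that misses its deadline
    """
    total = 0.0
    for e, d, t in zip(energies, delays, t_max):
        total += e if d <= t else penalty
    return total


if __name__ == "__main__":
    print(penalized_energy([1.0, 2.0, 3.0], [0.5, 1.5, 0.9], [1.0, 1.0, 1.0], 100.0))
```

Choosing a penalty much larger than any realistic energy value steers the search away from decisions that leave devices without a feasible solution.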
The global iteration is shown in Algorithm 5, and the overall time complexity of the algorithm is , where , , and .
        
| Algorithm 5 Overall iterative algorithm | 
| 1: Randomly initialize {, , , , }
 2: Initialize iteration number  and convergence threshold 
 3: repeat
 4:    Given (, ), obtain the optimal (, , ) by Algorithm 1
 5:    Given (, , , ), obtain the optimal  by Algorithm 2
 6:    Given (, , , ), obtain the optimal  by Algorithm 4
 7:    
 8: until the objective value decreases by less than  or the maximum number of iterations is reached.
 | 
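The outer loop of Algorithm 5 follows a block-wise alternating pattern: each sub-algorithm updates its own variables while the others are held fixed, and the loop stops once the objective no longer improves by more than a threshold. The sketch below shows that pattern on a toy two-variable problem. The function name, the `steps` interface, and the toy objective are assumptions for illustration and stand in for Algorithms 1, 2, and 4.

```python
def alternating_optimization(steps, objective, state, eps=1e-6, max_iters=50):
    """Apply each block update in turn until the objective stops improving.

    steps     : list of functions, each mapping state -> updated state
    objective : function mapping state -> scalar cost
    Returns the final state and the objective value after each outer iteration.
    """
    history = [objective(state)]
    for _ in range(max_iters):
        for step in steps:
            state = step(state)
        history.append(objective(state))
        if history[-2] - history[-1] < eps:  # improvement below threshold
            break
    return state, history


if __name__ == "__main__":
    # Toy objective with two coupled variables (x, y).
    f = lambda s: (s[0] - s[1]) ** 2 + (s[0] - 1.0) ** 2 + (s[1] - 3.0) ** 2
    update_x = lambda s: ((s[1] + 1.0) / 2.0, s[1])  # exact minimizer in x
    update_y = lambda s: (s[0], (s[0] + 3.0) / 2.0)  # exact minimizer in y
    state, history = alternating_optimization([update_x, update_y], f, (0.0, 0.0))
    print(state, len(history) - 1)
```

Because each block update cannot increase the objective in this toy problem, the recorded history is non-increasing, which is the behavior the convergence threshold relies on.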
  4. Simulation Results and Analysis
In this work, the performance of the proposed algorithm is assessed through simulations in MATLAB 2023a. The corresponding simulation scenario is depicted in Figure 1, where user devices are presumed to be dispersed within the area of  and the UAV computational resource is 7 GHz. The environment-related parameters, such as  and , and the detailed simulation configurations are presented in Table 1 [20].
To assess and contrast the effectiveness of this algorithm, this paper uses three comparison algorithms:
1. Algorithm 1: the cache decision of the computing UAV is not optimized, i.e., the random cache at initialization is used as the final decision, and the other variables are optimized according to the algorithm presented in this paper.
2. Algorithm 2: the positions of the computing UAV and the relay UAV are not optimized, i.e., the UAV positions are fixed, and the other variables are optimized according to the algorithm presented in this paper.
3. Algorithm 3: the algorithm of [18], which adopts a single-UAV architecture with limited UAV coverage; the UAV has both computing and relay functions, and it only considers downloading the service program from the base station when the UAV has not cached the related service program.
The convergence behavior of the proposed algorithm is demonstrated in Figure 5. As shown in the figure, the total energy consumption gradually decreases as the number of iterations increases. Specifically, the proposed algorithm converges to the final solution within at most 5 iterations across various conditions, demonstrating its fast convergence rate. Furthermore, when the UAV’s computing capacity is 7 GHz, the total energy consumption decreases by 36.1% and 33.6% for 25 and 30 users, respectively, which highlights the effectiveness of the proposed algorithm.
Figure 6 illustrates the influence of UAV cache capacity on overall energy consumption. As the cache capacity increases, the energy consumption decreases. A larger cache allows the UAV to store more services, which saves the transmission energy otherwise spent on downloading the services from the base station, so the curve shows a decreasing trend. The figure also demonstrates that the algorithm proposed in this paper outperforms the other three algorithms. The caching policy of Algorithm 1 is not determined by the task types and offloading decisions of the user devices, so the services cached on the UAV may not be the ones the devices require. The device and the computing UAV then download the service program from the base station through the relay UAV, which increases the transmission energy consumption. Algorithm 2 fixes the positions of the two UAVs, so some devices may be outside the UAV coverage range. A device that is outside the range and needs to download service programs for local computation has no feasible solution, and its energy consumption is assigned a large value as a penalty. In addition, because the UAVs are not at their optimal locations, the task upload or service download delay is larger, and the transmission energy consumption is larger as well, which raises the overall energy consumption of this algorithm. The UAV in Algorithm 3 performs both computing and relaying and, subject to its coverage, needs to balance its communication distance to the base station against its distance to the users. More devices therefore fall outside the UAV’s coverage, and more devices have an infeasible solution than with the algorithm in this paper.
Figure 7 depicts the overall energy consumption trends of the algorithms as the maximum tolerable delay grows. For small tolerable delays, task processing has less flexibility, and the device tends to complete the computation on the local device or the UAV with larger computational resources in order to finish within the acceptable delay limit. The computational energy consumption is therefore larger. As the maximum tolerable delay increases, task processing has more options. When the overall delay of the task under both local computation and offloading to the UAV is within the tolerance range, the device opts for the option that consumes less energy, resulting in a decrease in the device’s total energy consumption. The algorithm in this paper dynamically adjusts the offloading decision to prioritize local computing with lower energy consumption or low-power resource allocation (e.g., the lower bound of resource allocation in Equation (37)) when the delay margin increases. In contrast, Algorithms 1–3 are unable to flexibly utilize the delay margin due to rigid caching strategies or fixed UAV positions.
Figure 8 illustrates the variation in overall system energy consumption as the number of user devices changes. The total energy consumption of all four algorithms increases with the number of user devices. Since the algorithm in this paper optimizes the cache decision of the computing UAV, its cache hit rate is higher than that of Algorithm 1, which saves part of the transmission energy of the service programs. More computing tasks make it more likely that a greater number of service types is needed. However, the cache capacity of the computing UAV is limited, so service programs that are not cached must be downloaded from the base station via the relay UAV, which increases the transmission energy. Therefore, the overall energy consumption increases. In Algorithms 2 and 3, the fixed UAV location or the single-function UAV architecture means that, as the number of users increases, some users are out of coverage and are forced to use energy-intensive local computation, so the total energy consumption is greater than that of the algorithm in this paper.
Figure 9 presents the relationship between average task size and overall system energy consumption, demonstrating that larger task sizes result in higher energy consumption. Given a fixed CPU workload per bit, an increase in task size directly raises the total number of CPU cycles needed for task execution and therefore the computational latency. Under a constant maximum tolerable latency, the device tends to complete the computation on a local device or UAV with larger computational resources, which increases the computational energy consumption. A larger task size also makes it more likely that the task completion delay exceeds the tolerated delay, which further affects the overall energy consumption. Additionally, the low cache hit rate of Algorithm 1 requires frequent downloads of service programs from the base station, which adds transmission energy. For Algorithms 2 and 3, some users may be at the edge of coverage or in low-coverage areas, which adds delay and transmission energy consumption.
Figure 10 shows that the overall energy consumption of the system decreases as the UAV computational resources increase. For the algorithm in this paper and Algorithm 1, when the computational resources are small, few users can receive the UAV service and more devices resort to local computation, which leads to larger total energy consumption. As the computational resources increase, the UAV can provide sufficient computational services to the users in the coverage area, resulting in smaller computational energy consumption. For Algorithms 2 and 3, the computational resources are already sufficient once they exceed 7 GHz, and increasing them further does not allow more user devices to be served. Additional resources do not extend the service coverage, so the users beyond the coverage cannot utilize them, and the reduction in the overall energy consumption is limited.
Figure 11 gives the mean value and the standard deviation of the energy consumption obtained from multiple tests (with different user locations each time) for a given computational resource. The figure shows that the average energy consumption exhibits an overall decreasing trend as the UAV computational resources increase, while the length of the error bars shows how strongly the different user locations cause the energy consumption to fluctuate under each computational resource.
   5. Conclusions
This paper investigates a UAV-assisted computing offloading model based on service caching with the objective of lowering system-wide energy consumption. The simulated annealing algorithm is applied to optimize UAV location deployment and computing UAV caching decisions, while a greedy-based approach is used for UAV computational resource allocation. Additionally, a genetic algorithm is adopted for selecting the optimal offloading strategy for user devices. The simulation results confirm the effectiveness of the proposed algorithm. When the number of users is small, the gap between the algorithms is relatively small; when the number of users increases to 30, the gap becomes significant. At 30 users, the energy consumption of the algorithm in this paper is about 12 J, while the energy consumption of the worst algorithm is as high as 24 J, indicating that the algorithm has better scalability and energy-saving effects for large-scale user scenarios. When the delay is short (0.7 s), the energy consumption of each algorithm is relatively high, but as the delay increases, the energy consumption of the algorithm in this paper decreases the most, to only about 5 J at 1.1 s, while the energy consumption of the comparison algorithms remains at 7.5 J or higher, indicating that the algorithm makes better use of the delay redundancy for energy consumption optimization. When the task size is small, the difference between the algorithms is relatively limited; when the task size increases to  bits, the gap increases. The energy consumption of the algorithm in this paper is kept at about 10 J, while the energy consumption of the worst algorithm is as high as 20 J, showing that the algorithm has better energy-saving advantages under large-task-load scenarios.
As the cache capacity increases, the algorithm in this paper shows the most obvious decrease, from about 8.5 J to 6.0 J, which saves about 29% of the energy consumption, while the algorithm with the highest energy consumption remains above 13 J under the same caching condition, which indicates that the proposed algorithm utilizes the cache resources more fully to reduce the transmission overhead. The energy consumption of the proposed algorithm is about 9.0 J when the UAV’s computational resources are  Hz and decreases to 6.0 J when the UAV’s computational resources are  Hz, which is a reduction of 33%. The energy consumption of the comparative algorithms is still 20% to 50% higher than that of the proposed algorithm, which indicates that the proposed algorithm can better utilize the additional computational resources and further reduce the energy consumption. Future work will concentrate on improving latency performance, UAV trajectory optimization, and other related topics in UAV-assisted computing offloading scenarios.