UAV-Assisted Caching Strategy Based on Content Cache Pricing in Vehicular Networks

Abstract: A UAV-assisted caching strategy that considers content cache pricing in vehicular networks is proposed to address the high communication load and high backhaul link overhead of vehicular networks. We consider a traffic scenario consisting of a content provider (CP), a network operator (NO), and multiple mobile users, where the NO has a set of cache-enabled roadside units (RSUs) and an unmanned aerial vehicle (UAV). The CP leases some popular contents to the NO for profit, and the NO places the leased content in the local caches of its RSUs to save expensive backhaul transmission overhead and latency. However, both the NO and the CP are selfish and their interests conflict because they have opposing expectations for content pricing. To take the interests of both into account, this paper defines the utilities of the CP and the NO and uses the Stackelberg game framework to model the competition between the two entities, where the CP acts as the leader and sets the rental price of the content and the NO acts as the follower and responds to the CP's actions. An iteration-based dynamic programming algorithm is designed to find the Stackelberg equilibrium. Meanwhile, a caching-capable UAV is introduced into the vehicular network and a Dijkstra-based path planning algorithm is designed to further increase the total utility of the NO by optimizing the trajectory of the UAV. The simulation results show that the proposed strategy reasonably allocates the benefits of the CP and the NO, reduces the average request delay, and increases the utility of the NO; for example, the request latency of vehicle users is reduced by 27% and the total utility of the NO is increased by 13%.


Introduction
With the rapid development of vehicular networks, more and more in-vehicle applications are enriching the functions of vehicles, such as intelligent transportation, in-vehicle entertainment, in-vehicle office, road safety, and driverless applications. Vehicles need to obtain a variety of information from the outside world, such as traffic information, entertainment information, and real-time news, in order to provide users with a better driving experience. However, the number of vehicles is increasing dramatically and the information that users want to obtain is becoming more and more diverse, which leads to a dramatic increase in the communication load of the vehicular network. In addition, due to the high-speed mobility of vehicles, the topology of the vehicular network changes rapidly, making the communication links of vehicles easily interrupted, which leads to a degradation of the user's quality of experience. In order to reduce the workload of vehicular networks and improve the communication quality of the networks, caching techniques have been introduced in vehicular networks.
Caching in vehicular networks places content at nodes at the edge of the network, such as RSUs and vehicles, so that vehicles can receive the requested content directly from surrounding nodes. This shortens the communication distance for requesting vehicles and reduces the traffic load on content servers and networks. The caching strategies commonly applied in vehicular networks can be categorized into non-cooperative caching [1,2] and cooperative caching [3-6], which reduce the impact of vehicle mobility on the vehicular network and improve the quality of service.
However, the above studies only determine the caching strategy from the perspective of the caching node, aiming at reducing the traffic load on the network or enhancing the user's quality of experience, and do not consider the overhead caused by caching as well as the benefits gained. In order to reduce request latency and save overhead, NO often chooses to rent some contents from remote CP at a cost and cache them on the network's edge devices, so a reasonable rental price and caching decision will benefit both the NO and the CP.
Most current papers that study the caching problem from an economic perspective do not use flexible aerial mobile caching devices, such as UAVs, for assistance. Vehicle users in the vehicular network move at high speed and have only a short contact time with cache nodes at fixed positions on the ground; introducing UAVs with a caching function into the vehicular network can effectively alleviate this problem and reduce the average request latency of vehicle users.
In summary, existing caching strategies in vehicular networks do not simultaneously consider the overhead, the revenue, and the utilization of aerial resources in the delivery process, so this paper designs a UAV-assisted caching strategy based on content cache pricing in vehicular networks. Specifically, we consider a traffic scenario consisting of a CP, an NO, and multiple mobile users. The CP rents some popular content to the NO for profit, while the NO puts this content into the local caches of its RSUs to save costly backhaul transmission overhead and latency. However, the CP and the NO are both selfish and their interests conflict because they have opposite expectations on content pricing. Therefore, we model the competition between the CP and the NO as a Stackelberg game to jointly maximize the profits of both. Meanwhile, a UAV with caching capability is used to further reduce the request latency and increase the utility of the NO. The innovative points of this paper are as follows:
(1) A UAV-assisted vehicular network caching model considering the economic relationship between the CP and the NO is constructed. The CP leases some popular contents to the NO for profit and the NO caches the leased contents in the RSUs to save expensive backhaul transmission overhead and latency. The utilities of the CP and the NO are defined by analyzing the benefit relationship between the CP, the NO, and the vehicle users.
(2) The competing interests of CP and NO are modeled using the Stackelberg game model, where CP is the leader and NO is the follower. An iteration-based dynamic programming algorithm is designed to find the Stackelberg equilibrium point to obtain the optimal caching decision for the RSU and content rental price.
(3) A UAV with caching function is introduced into the vehicular network, which caches the contents already leased by the RSUs. A Dijkstra-based path planning algorithm is designed to further increase the total utility of the NO and improve the quality of service by optimizing the trajectory of the UAV.
The remainder of this paper is organized as follows: Section 2 reviews the related work; Section 3 gives the system model and the content delivery decision and establishes the optimization problem; Section 4 designs the joint optimization algorithm for the caching decision of the RSUs and the trajectory of the UAV; Section 5 presents and analyzes the simulation results; Section 6 concludes the paper.

Related Work
In terms of caching strategies, the undifferentiated caching strategy proposed by [1] means that the content is cached at each network node through which it passes. The caching with probability strategy proposed by [2] means that a network node caches the content that passes through that node with some fixed probability. These non-cooperative caching strategies make the content have a high redundancy in the network, resulting in a wastage of caching resources. In cooperative caching strategies, multiple nodes in the network cooperate with each other to effectively reduce the redundancy of content in the network. The authors of [3] identified and formulated the problem of maximizing the average cache hit rate considering the time-varying topology of the network, vehicle mobility, user preferences, and the limited cache capacity of the RSU. A cache update policy based on learning automata is designed to determine the appropriate content to be cached in the RSU. The authors of [4] proposed a file partitioning and grouping scheme in a distributed coding cache system that designs a static single-server system for heterogeneous file transfers. The authors of [5] designed a collaborative edge caching scheme based on location and popular content and proposed an optimal collaborative content layout for macro base stations and RSUs to reduce transmission latency and service cost. Ref. [6] proposed an active caching scheme for parked vehicles that uses parked vehicles to cache data in advance at appropriate times and locations so that users can receive the data as they pass by.
In order to study the issue of caching from an economic perspective, the authors of [7] considered a concave pricing mechanism where CPs are charged through a concave price function of the time that content is cached in the CP cache. The proposed concave pricing mechanism is analyzed theoretically and provides a solution for CPs to optimally choose the length of time to stay in the ISP cache. The authors of [8] considered transcoding between different versions of video and the base station considered the energy consumption value when caching video clips; then, based on the different uses of caching, base station computational resources, and backhaul links by the in-vehicle users, they proposed a network resource pricing algorithm to improve the flexibility of the utilization of in-vehicle network resources. Ref. [9] focused on the complex relationship between competition and cooperation among multiple CPs and solved a non-cooperative game model based on game theory to study the interaction between caching and pricing strategies of ICN entities. By establishing the optimal utility function of each entity, the optimal cache share of ISP (internet service provider) and the optimal pricing of ISP and CP are obtained. The authors of [10] proposed a joint video pricing and cache placement strategy by considering the heterogeneity of video file sizes and the classical law of demand in the field of economics, maximizing the profit of both CP and NO in the case of non-cooperative base station caching and cooperative base station caching. The authors of [11] focused on caching and transaction models for vehicular networks and proposed a generic cache valuation and online pricing framework to achieve the goals of incentive compatibility, personal rationality, privacy preservation, computational efficiency, and utility maximization.
In terms of utilizing UAVs, the authors of [12] optimized the caching and computational resource allocation of the network by optimizing the trajectory and flight altitude of the UAVs. In [13], an active caching scheme was designed in which UAVs were dispatched to provide content delivery services to vehicular users in a specific area. Ref. [14] considered content delivery in terrestrial networks for low earth orbit (LEO) satellites and cache-assisted UAV communications. The minimum achievable throughput per ground user (GU) is maximized by co-optimizing cache placement, UAV resource allocation, and trajectory with limited cache capacity and flight time. The authors of [15] considered a scenario where infrastructure is rendered unusable due to a disaster situation and use UAVs to service vehicles in the affected area to meet the quality of service for users. The authors of [16] used UAVs to help with infrastructure operations. Since deploying multiple UAVs incurs greater costs, the number of UAVs dispatched is optimized to provide coverage of specific areas. In [17], the authors investigated communication with multiple UAVs with an onboard network and proposed an efficient collaborative UAV sensing and sending protocol. Ref. [18] proposed a joint optimization problem for UAV deployment and content placement to minimize the average request latency and proposed a Q-learning algorithm to solve the content placement problem. In [19], the authors proposed a UAV network that transmits data from the vehicle to the core network, where one UAV serves the vehicle and the others are used to relay the data, saving energy by limiting the maneuverability of the UAVs in a certain area.

System Model
The system model is shown in Figure 1. The system includes a CP, an NO, and multiple mobile vehicle users; the NO has a set of cache-enabled RSUs and a cache-enabled UAV, where the coverage of the RSUs is $C_R$ and the coverage of the UAV is $C_U$, with $C_R > C_U$. The CP rents some of its contents to the NO and the NO caches the rented content in the RSUs to save the expensive transmission cost. The roads in the city are two-way, with lanes appearing in pairs, and there are two road sections, each of length $L$. An RSU is deployed next to each road section to serve the vehicles on that section; the RSU on the left road section is denoted as $R_l$ and the RSU on the right road section as $R_r$. A UAV with caching capabilities is deployed over the road sections to assist the RSUs in providing services to vehicle users.

Traffic Model
In a dense urban vehicle environment, vehicles at the same moment tend to travel at similar speeds. However, traffic conditions change over time and affect the number of vehicles on the lane; for example, changes in traffic lights lead to changes in vehicle density. In this paper, we assume a set of continuous time periods, denoted as $T = \{T_1, \cdots, T_x, \cdots, T_X\}$; the vehicles within each time period travel at the same uniform speed, while vehicles in different time periods travel at different speeds. Let $n^l_1, \cdots, n^l_x, \cdots, n^l_X$ and $n^r_1, \cdots, n^r_x, \cdots, n^r_X$ denote the number of vehicles entering the roadway from the left and from the right, respectively, in each time period. Then, the total number of vehicle users entering from the left side of the roadway is $N_l = \sum_{x=1}^{X} n^l_x$ and the total number entering from the right side is $N_r = \sum_{x=1}^{X} n^r_x$.
For the movement of the UAV, the road is divided into several rectangular blocks, and the side length of each block equals the coverage diameter of the UAV. Each time period $T_x$ is taken as one trajectory optimization cycle: the UAV stays in a block during the cycle to serve the vehicle users in that block and, at the end of the cycle, selects an adjacent block to fly to for the next cycle, i.e., whenever the vehicle density changes, the UAV's position also changes. Different flight trajectories result in different gains and time delays. It is worth noting that frequently replacing the content cached at a node incurs a large overhead, so the caching decision is optimized over a long time scale, whereas the UAV is mobile and its flight trajectory must be planned for each time period, which is a short-time-scale optimization; the relationship between the two is shown in Figure 2.
Appl. Sci. 2023, 13, x FOR PEER REVIEW
Figure 2. Short-term optimization period for the UAV's trajectory and long-term optimization period for caching policies.

Content Request Model
Assume that there is a total of $I$ contents, each of the same size $s$. The set of contents is denoted as $F = \{f_1, \cdots, f_i, \cdots, f_I\}$. Due to the unstable topology of the vehicular network, the transmission link of a vehicle user may be interrupted, resulting in content acquisition failure. To improve the success rate of content transmission, this paper divides each content into a number of equal-sized content blocks; if the communication link breaks during transmission, the content blocks whose transmission is incomplete are discarded and must be transmitted again after the link is re-established.
The probability of a file being requested depends on its popularity ranking and follows a Zipf distribution; the probability that the file ranked $\tau$ in popularity is requested can be expressed as [20]
$$p_\tau = \frac{\tau^{-\gamma}}{\sum_{j=1}^{I} j^{-\gamma}},$$
where $\gamma$ denotes the exponent of the Zipf distribution (the larger $\gamma$ is, the more the users' requests concentrate on the top-ranked files) and $\tau$ denotes the rank of the file in the file library. It is assumed that each vehicle user requests content according to popularity before driving off the first road section; the request time follows a uniform distribution. The content request indicator variable is denoted by $re^t_{n,i}$, where $re^t_{n,i} = 1$ indicates that $v_n$ has requested $f_i$ at moment $t$, and $re^t_{n,i} = 0$ otherwise.
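As an illustration of the popularity model, the following minimal sketch computes request probabilities under the standard normalized Zipf form, in which the probability of rank $\tau$ is proportional to $\tau^{-\gamma}$. The function name `zipf_popularity` and the example values are hypothetical and are not taken from the paper.

```python
def zipf_popularity(num_contents: int, gamma: float) -> list[float]:
    """Request probability for popularity ranks 1..num_contents.

    Assumes the standard normalized Zipf form p(rank) ~ rank^(-gamma).
    """
    weights = [rank ** (-gamma) for rank in range(1, num_contents + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only: 5 contents, Zipf exponent 0.8.
probs = zipf_popularity(num_contents=5, gamma=0.8)
```

With `gamma = 0` every content is equally likely; increasing `gamma` shifts the probability mass toward the top-ranked contents.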

Caching Model and Content Delivery Strategy
In the scenario of this paper, the NO leases some contents from the CP, caches them in the RSUs, and pays a fee to the CP, while the UAV caches content already leased by the RSUs without paying an additional fee. The caching decisions of $R_m$ are described by a set of caching decision indicator variables. The UAV selects the most popular content among the content rented by the NO to cache. If a node caches the content requested by a vehicle user, it is a cache hit; otherwise, it is a cache miss. This paper involves two kinds of caching nodes, the RSUs and the UAV, and the UAV only caches content leased by the RSUs. Thus, for each content there are three caching cases: not cached by any caching node, cached only by an RSU, and cached by both an RSU and the UAV. If the content is not cached by any node, the vehicle user requesting it waits until the vehicle leaves the road and then receives the content from the CP via the RSU relay. If the content is cached only by the RSU, the vehicle user obtains the content via the RSU. If the content is cached by both the RSU and the UAV, the vehicle user requesting it receives the content from the UAV first, as the transmission speed of the UAV is greater than that of the RSU. This situation is more complex and the delivery process can be divided into three cases, as shown in Figure 3. Taking the delivery decision for content $f_i$ as an example, the analysis is as follows.
Figure 3. Content delivery cases (a)-(c), each marking the coverage of the UAV, the coverage of $R_l$, and the coverage of $R_r$.
For the case where both RSUs and the UAV have cached $f_i$, take the example of the UAV staying on the road section where $R_l$ is located, as shown in Figure 3a. When a vehicle drives in from the $R_l$ side of the road, if the vehicle user requests $f_i$ while outside the UAV coverage, such as $V_1$, the user first obtains $f_i$ from $R_l$ and, if the content is still not fully obtained when the vehicle enters the UAV coverage (rectangular block 2), continues to obtain $f_i$ from the UAV. If the vehicle user requests $f_i$ while within the coverage of the UAV, e.g., $V_2$, the user first obtains $f_i$ from the UAV; if the content is still not fully obtained when the vehicle leaves the UAV coverage, the remainder is obtained from $R_l$ and, if it is still not fully obtained when the vehicle leaves the $R_l$ section, from $R_r$. If the vehicle user requests $f_i$ after driving through the coverage area of the UAV, e.g., $V_3$, it first obtains $f_i$ from $R_l$ and continues to obtain $f_i$ from $R_r$ if the content is still not fully obtained when it leaves the section where $R_l$ is located. When a vehicle drives in from the $R_r$ side, as in $V_4$, it first acquires $f_i$ from $R_r$; if it has not yet fully acquired the content when it leaves the $R_r$ section and is not in the coverage area of the UAV, it continues to acquire $f_i$ from $R_l$; and if it has still not fully acquired the content when it enters the coverage area of the UAV, it continues to acquire $f_i$ from the UAV.
The delivery process is the same when the UAV stays in the section where $R_r$ is located.
Next, consider the case where the UAV and one RSU cache $f_i$, the other RSU does not, and the UAV stays on the road section of the RSU that caches $f_i$. Take the example where $R_l$ caches $f_i$, $R_r$ does not, and the UAV stays on the $R_l$ side, as shown in Figure 3b. When a vehicle drives in from the $R_l$ side of the roadway, if the vehicle user sends a request while outside the coverage of the UAV, e.g., $V_5$, the user first obtains $f_i$ from $R_l$ and continues to obtain $f_i$ from the UAV if the content is still not fully obtained when the vehicle enters the coverage of the UAV. If the vehicle user sends a request within the coverage area of the UAV, e.g., $V_6$, it first obtains $f_i$ from the UAV and then continues to obtain $f_i$ from $R_l$ if the content is still not fully obtained when the vehicle leaves the UAV's coverage area. If the content is still not fully obtained when the vehicle leaves the section where $R_l$ is located, the user enters a wait state until the vehicle has driven the entire road and then obtains the content from the CP. If the vehicle user requests $f_i$ after driving through the UAV coverage, e.g., $V_7$, it first obtains the content from $R_l$ and, if it has still not fully obtained $f_i$ when it leaves the section where $R_l$ is located, it enters a wait state until it leaves the $R_r$ section and then obtains $f_i$ from the CP. When the vehicle enters from the $R_r$ side of the road, e.g., $V_8$, it sends a request and then enters a wait state until it enters the $R_l$ section and starts to obtain $f_i$ from $R_l$; if it has still not fully obtained the content when it enters the UAV coverage, it continues to obtain it from the UAV.
Finally, consider the case where the UAV and one RSU cache $f_i$, the other RSU does not, and the UAV stays on the road section of the RSU that does not cache the content. Take the example where $R_l$ caches $f_i$, $R_r$ does not, and the UAV stays on the $R_r$ side, as shown in Figure 3c. When a vehicle drives in from the $R_l$ side of the roadway, as in $V_9$, it first acquires the content from $R_l$ and, if it has not fully acquired the content when it leaves the $R_l$ section, continues to acquire $f_i$ from the UAV. When the vehicle approaches from the $R_r$ side, if the vehicle user sends a request while outside the coverage area of the UAV, e.g., $V_{10}$, it enters a wait state until it is inside the coverage area and then obtains $f_i$ from the UAV. If the vehicle user sends a request while in the coverage area of the UAV, e.g., $V_{11}$, it first obtains $f_i$ from the UAV and, if the content is still not fully obtained when the vehicle leaves the coverage area of the UAV, it enters the wait state until it drives into the $R_l$ section and obtains $f_i$ from $R_l$. If the vehicle user sends a request after driving through the coverage area of the UAV, e.g., $V_{12}$, it enters the wait state until it drives into the $R_l$ section and obtains $f_i$ from $R_l$.
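The three delivery cases can be read as one position-based rule: inside the UAV's block the UAV serves the user if it caches the content, otherwise the RSU of the current road section serves the user if it caches the content, and otherwise the user waits. The sketch below encodes this simplified reading. The function `serving_node`, the coordinate convention (position measured from the outer end of the $R_l$ section, over a road of length $2L$), and the example numbers are assumptions made for illustration; link breaks and block-level retransmission are not modelled.

```python
from typing import Optional, Tuple


def serving_node(
    position: float,
    section_len: float,
    rsu_l_cached: bool,
    rsu_r_cached: bool,
    uav_cached: bool,
    uav_block: Tuple[float, float],
) -> Optional[str]:
    """Return which node serves a vehicle at `position`, or None if it waits.

    Positions in [0, section_len) belong to the R_l section and positions in
    [section_len, 2 * section_len] to the R_r section (assumed convention).
    """
    block_start, block_end = uav_block
    # The UAV is preferred inside its coverage block because it transmits faster.
    if uav_cached and block_start <= position <= block_end:
        return "UAV"
    if position < section_len:
        return "R_l" if rsu_l_cached else None
    return "R_r" if rsu_r_cached else None


# Example in the spirit of Figure 3b: R_l and the UAV cache the content,
# R_r does not, and the UAV hovers over an (illustrative) block on the R_l side.
node = serving_node(250.0, 500.0, True, False, True, (200.0, 300.0))
```

Evaluating the function along a vehicle's path, in either driving direction, reproduces the order of serving nodes described for the example vehicles, with `None` marking the wait states.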

Communication Model and Latency Analysis
This section analyzes the communication model and latency for the vehicle user. When the vehicle user $v_n$ is at position $l \in (0, 2L)$, the rate at which the roadside unit $R_m$, $m \in \{l, r\}$, transmits content to $v_n$ is determined by the transmit power $P_m$ of $R_m$, the channel bandwidth $B$, and the channel gain $G_{m,n}(l) = \chi \cdot d_{m,n}(l)^{-\delta}$ between $R_m$ and $v_n$, where $d_{m,n}(l)$ is the distance between $R_m$ and $v_n$.
For communication between the UAV and vehicle users, this paper considers a probability-based air-to-ground communication model consisting of two communication channels: a line-of-sight (LoS) channel and a non-line-of-sight (NLoS) channel. The probability of line-of-sight transmission, $P_{LoS}$, depends on the elevation angle $\theta$ from the UAV to the vehicle user and on $\beta_1$ and $\beta_2$, which are constant parameters influenced by environmental factors; the probability of non-line-of-sight transmission is $P_{NLoS} = 1 - P_{LoS}$. The path losses $L_{LoS}$ and $L_{NLoS}$ between the UAV and its associated vehicle user depend on $\eta_{LoS}$ and $\eta_{NLoS}$, the attenuation factors corresponding to the line-of-sight and non-line-of-sight links, on the carrier frequency $f_c$, on the speed of light $c$, and on the distance $d_{u,n}(l)$ between the UAV and the vehicle user $v_n$. Therefore, the average path loss of the UAV-to-vehicle user link is $\bar{L} = P_{LoS} L_{LoS} + P_{NLoS} L_{NLoS}$. The content transmission rate from the UAV to the vehicle user $v_n$ at moment $t$ then follows from the bandwidth $B$, the transmit power $P_U$ of the UAV, the average path loss, and the Gaussian white noise variance $\sigma^2$ at the receiver.
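To make the quantities above concrete, the following sketch evaluates both link types under commonly used textbook forms. It assumes a Shannon-capacity rate, a sigmoid LoS probability in the elevation angle, and free-space path loss plus an excess attenuation term; the paper's exact expressions may differ. All function names, units, and the use of noise power in watts are assumptions made for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def rsu_rate(bandwidth_hz, tx_power_w, distance_m, chi, delta, noise_w):
    """Assumed Shannon rate of the RSU link with gain chi * d^(-delta)."""
    gain = chi * distance_m ** (-delta)
    return bandwidth_hz * math.log2(1.0 + tx_power_w * gain / noise_w)


def los_probability(elevation_deg, beta1, beta2):
    """Assumed sigmoid LoS probability as a function of the elevation angle."""
    return 1.0 / (1.0 + beta1 * math.exp(-beta2 * (elevation_deg - beta1)))


def average_path_loss_db(distance_m, elevation_deg, carrier_hz,
                         eta_los_db, eta_nlos_db, beta1, beta2):
    """P_LoS * L_LoS + P_NLoS * L_NLoS, with free-space loss plus excess loss."""
    free_space_db = 20.0 * math.log10(
        4.0 * math.pi * carrier_hz * distance_m / SPEED_OF_LIGHT)
    p_los = los_probability(elevation_deg, beta1, beta2)
    loss_los = free_space_db + eta_los_db
    loss_nlos = free_space_db + eta_nlos_db
    return p_los * loss_los + (1.0 - p_los) * loss_nlos


def uav_rate(bandwidth_hz, tx_power_w, path_loss_db, noise_w):
    """Assumed Shannon rate of the UAV link after applying the path loss."""
    received_w = tx_power_w * 10.0 ** (-path_loss_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + received_w / noise_w)
```

Because both rates depend on the distance to the vehicle, they change as the vehicle moves along the road, which is why the delay analysis evaluates the rate slot by slot.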
The request delay for the vehicle user consists of a waiting delay $d_w$ and a transmission delay $d_t$ at each cache node. The waiting delay is the time the vehicle user travels without any cache node transmitting content to it before the content is fully obtained; the transmission delay is calculated from the transmission rate. Assume that the length of each time slot is $t$, that the size of the content to be transmitted is $S$, which is an integer multiple of the content block size $s$, and that the transmission link is broken at position $L_{break}$; the calculation of the transmission delay is shown in Figure 4. Suppose the current time slot is the $q$-th time slot after transmission starts. At this time, the vehicle user's position is $l_q = l + t \cdot v \cdot q$, the amount of data that can be transmitted in this time slot is $s_t(l_q) = t \cdot R(l_q)$, and the size of the data not yet transmitted is $s_r(l_q) = S - \sum_{j=1}^{q-1} t \cdot R(l_j)$. If $s_r(l_q) > s_t(l_q)$, the current time slot cannot complete the content delivery and transmission continues in the next time slot. Otherwise, the current time slot completes the content delivery, and the time used for data transmission in this time slot is $s_r(l_q) / R(l_q)$. The content transmission ends in this time slot and the transmission delay of the vehicle user at this cache node can be expressed as
$$d_t = (q - 1) \cdot t + \frac{s_r(l_q)}{R(l_q)}.$$
If the transmission link is broken or switched during transmission, i.e., the vehicle position exceeds $L_{break}$, the content blocks whose transmission is incomplete are discarded. The size of the content whose transmission is complete is $s_t = s \cdot \lfloor s_t / s \rfloor$, where the $s_t$ inside the floor is the amount of data transmitted before the break and $\lfloor \cdot \rfloor$ rounds down; the size of the content still to be transmitted is then $s_r = S - s_t$. The user continues to acquire the remaining content blocks at the next cache node. The transmission delay of the vehicle user at that cache node is then the time spent transmitting until the vehicle reaches $L_{break}$.
In summary, in conjunction with the content delivery policy in Section 3.3, the request latency can be calculated for each vehicle user.
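The slot-by-slot delay calculation described in this section can be sketched as follows. This is a simplified illustration under stated assumptions: the link break is checked once per slot, the vehicle speed is positive, and `rate_at` is a hypothetical callback that returns the transmission rate at a given position. The function name and the example values are not from the paper.

```python
import math
from typing import Callable, Tuple


def transmission_delay(
    size: float,
    start_pos: float,
    speed: float,
    slot_len: float,
    rate_at: Callable[[float], float],
    break_pos: float,
    block_size: float,
) -> Tuple[float, float]:
    """Return (delay at this cache node, content size still to be fetched)."""
    sent = 0.0
    q = 1  # slots are numbered from 1 after transmission starts
    while True:
        pos = start_pos + slot_len * speed * q  # l_q = l + t * v * q
        if pos > break_pos:
            # Link broken: only whole blocks count as delivered.
            completed = block_size * math.floor(sent / block_size)
            return (q - 1) * slot_len, size - completed
        rate = rate_at(pos)
        remaining = size - sent         # s_r(l_q)
        capacity = slot_len * rate      # s_t(l_q)
        if remaining > capacity:
            sent += capacity            # delivery continues in the next slot
            q += 1
        else:
            # Delivery finishes partway through slot q.
            return (q - 1) * slot_len + remaining / rate, 0.0


# Illustrative run: constant rate of 10 units per second, 25 units of content.
delay, left = transmission_delay(25.0, 0.0, 1.0, 1.0, lambda p: 10.0, 100.0, 5.0)
```

When the function returns a non-zero remainder, the same routine can be called again for the next cache node with `size` set to that remainder, and the per-node delays and any waiting delays are summed to obtain the request latency.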

Price Model
In this paper, we use the revenue sharing contract model from the literature [10], where each RSU in the NO pays a fee a to the CP for caching a content and a vehicle user pays a fee b for acquiring a content from the NO. For the benefits received from the vehicle user, the CP and NO will split the benefits proportionally according to how the vehicle acquires the content. If the NO caches the content requested by the user and completes delivery of the content before it drives off the entire road, it will split the benefits with the CP in proportion θ 1 , i.e., the benefits of size θ 1 · b belong to the NO and the remaining benefits belong to the CP. If the NO does not complete delivery before the vehicle user drives off, it will need to acquire the content from the CP and then forward it to the vehicle user, splitting the benefits in proportion θ 2 , θ 1 > θ 2 .
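As a small illustration of the revenue sharing contract, the split of a single user fee $b$ can be written as a function of how the content was delivered. The function and parameter names are hypothetical; only the proportions $\theta_1$ and $\theta_2$ and the condition $\theta_1 > \theta_2$ come from the text.

```python
from typing import Tuple


def settle_request(fee_b: float, theta1: float, theta2: float,
                   delivered_from_cache: bool) -> Tuple[float, float]:
    """Split one user fee b between NO and CP; returns (NO share, CP share)."""
    if not theta1 > theta2:
        raise ValueError("the contract assumes theta1 > theta2")
    # theta1 applies when the NO delivers the cached content in time,
    # theta2 when the NO has to fetch the content from the CP and forward it.
    theta = theta1 if delivered_from_cache else theta2
    return theta * fee_b, (1.0 - theta) * fee_b
```

For example, with $b = 10$, $\theta_1 = 0.5$ and $\theta_2 = 0.25$, the NO keeps 5 when it delivers from its cache and 2.5 when it has to forward the content from the CP.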

RSU Caching Strategy Considering Content Cache Pricing
In this paper, we consider the utilities of both CP and NO. For the NO, we consider not only its economic utility but also its time cost. This section optimizes the utility of the network and the request latency of the vehicle users in three steps. First, the utility functions of CP and NO are derived by treating the CP as a merchant, the NO as a customer, and content as a commodity. Second, the competition between them is modeled using the Stackelberg game framework, and the utility of both is optimized by adjusting the pricing of the content and the caching decisions of the RSUs. The effect of the UAV is not considered at this stage; it is worth noting that the UAV caches content already leased by the RSUs and pays no fee to the CP. Finally, once the rental price and the caching decisions of the RSUs are determined, the trajectory of the UAV is optimized to reduce the request latency of vehicle users and thereby further improve the utility of the NO.

Stackelberg Game for Joint Pricing and Cache Decision Optimization
Profit maximization of CP and profit maximization of NO are two conflicting optimization objectives. Assume that both CP and NO are selfish and intend to maximize their own revenues. The CP wants to increase the rental price in order to generate more revenue from the rented content. However, a higher price increases the rental cost and reduces the benefit of the NO, which reduces the amount of content rented by the RSUs and may in turn reduce the total utility of the CP. Game theory is an effective way to balance these competing interests and to arrive at a price acceptable to both parties together with an optimal cache placement strategy.

Utility Function
The total utility U MNO of the NO is determined by the benefit W MNO obtained from the vehicle user and the average delay D ave of content delivery. The utility of the CP consists of the rent obtained from the NO and the benefit obtained from the vehicle user. Different caching decisions and vehicle request locations yield different utilities; this section defines the utility functions for CP and NO.
Suppose that a vehicle drives at a uniform speed $V_m$ on a road section covered by $R_m$. To obtain the content completely through $R_m$ alone, the vehicle must drive at least a distance $K_m$ within the coverage of $R_m$, with $K_m \le L$. Equivalently, it must send the request within the first $L - K_m$ of the section after entering it. Since the time at which a vehicle user sends a request obeys a uniform distribution, this determines the probability that a vehicle user obtains the content $f_i$ completely from $R_m$.

Since requests for content are independent, the total utility can be split into the sum of the utilities of all content in the cache. Taking content $f_i$ as an example, four caching cases arise.

When both RSUs cache $f_i$, i.e., $y_{l,i} = 1$ and $y_{r,i} = 1$, a vehicle user requesting $f_i$ is certain to obtain the content completely before driving out of the entire roadway. This case yields the benefit $w^{MNO}_{i,[1,1]}$ and the average request delay $d^{ave}_{[1,1]}$ for the NO and the utility $u^{CP}_{i,[1,1]}$ for the CP. The delay term is built from $D_i(l)$, the transmission delay when the vehicle user requests content $f_i$ after driving $l$ meters into the section, which can be calculated according to the method in Section 3.4.

When $R_l$ caches $f_i$ and $R_r$ does not, i.e., $y_{l,i} = 1$ and $y_{r,i} = 0$, the benefits and request delays brought by vehicle users driving in from different directions differ. For vehicles driving in from the $R_l$ section, only requests sent within the first $L - K_l$ meters can be served through the RSU alone; otherwise the vehicle user cannot receive $f_i$ completely before driving out of the $R_l$ section and enters the waiting state until it drives out of the whole roadway and receives the content from the CP through NO forwarding. Since vehicle users only send requests within the first section they drive into, vehicles that drive in from the $R_r$ section and request $f_i$ are certain to receive the content: they send the request and wait until they drive into the $R_l$ section and receive the content from $R_l$. This case yields the benefit $w^{MNO}_{i,[1,0]}$ and the average request delay $d^{ave}_{[1,0]}$ for the NO and the utility $u^{CP}_{i,[1,0]}$ for the CP.

When $R_l$ does not cache $f_i$ and $R_r$ caches $f_i$, i.e., $y_{l,i} = 0$ and $y_{r,i} = 1$, the benefit $w^{MNO}_{i,[0,1]}$ and the average request delay $d^{ave}_{[0,1]}$ obtained by the NO follow from the same analysis with the roles of the two RSUs exchanged, and the utility $u^{CP}_{i,[0,1]}$ obtained by the CP has the same form as $u^{CP}_{i,[1,0]}$ above.

When neither $R_l$ nor $R_r$ caches $f_i$, i.e., $y_{l,i} = 0$ and $y_{r,i} = 0$, the vehicle obtains the content from the CP after it has driven the entire roadway. This case yields the benefit $w^{MNO}_{i,[0,0]}$ and the average request delay $d^{ave}_{[0,0]}$ for the NO and the utility $u^{CP}_{i,[0,0]}$ for the CP.

Combining the four cases gives the benefit obtained by the NO for content $f_i$ and the average request latency of vehicle users for $f_i$. The utility obtained by the NO for $f_i$ combines the two, where $r$ is a weighting factor for benefit and delay and $q$ is a constant used to make the trend of delay the same as that of benefit. The utility of the CP for content $f_i$ is obtained from the four case utilities in the same way.
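The probability of complete delivery from a single RSU can be made explicit under the stated assumptions. Because the vehicle moves at constant speed and the request time is uniformly distributed, the request position is uniformly distributed over the section of length $L$; a request succeeds through $R_m$ alone only if it falls in the first $L - K_m$ of the section. The symbol $P_m$ is introduced here for illustration only.

```latex
P_m \;=\; \Pr\{\text{request position} \le L - K_m\}
     \;=\; \frac{L - K_m}{L}, \qquad 0 \le K_m \le L .
```

For instance, if the section is $L = 500$ m long and a vehicle needs $K_m = 200$ m of coverage to receive the whole content, then $P_m = 300/500 = 0.6$.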

Stackelberg Game Model
In this paper, the competitive relationship between the CP and the NO is modeled as a Stackelberg game, where the CP acts as a leader and the NO acts as a follower in response to the CP's actions. More specifically, CP first gives the rental price a and informs NO about it. The NO determines the amount of content it rents for each RSU and the optimal cache placement policy that maximizes its utility based on the lease price a, the expected number of content requests, and the average content request probability. Thus, the Stackelberg game consists of two subproblems: the problem of how the leader (CP) determines the lease price to maximize its utility and the problem of how the follower (NO) conducts caching to maximize its utility.
(1) The utility maximization problem of CP (P1.1), where constraint (C1) indicates that the lease price does not exceed the maximum unit price set by the market and (C2) indicates that the cache decision indicator variable of the RSU takes the value of 0 or 1.
(2) The utility maximization problem of NO (P1.2), where constraint (C3) indicates that the cached content data size cannot exceed the maximum cache capacity.
The utilities of the leader and the follower are optimal when the Stackelberg game reaches the equilibrium point; the lease price and cache decision corresponding to the equilibrium point are the optimal values. If either entity deviates from the equilibrium point, its own benefit is reduced. According to the definition in [7], the equilibrium point of the Stackelberg game in this paper is defined as follows. Let $a^*$ denote the optimal solution of the utility maximization problem P1.1 for CP and $Y_R^*$ denote the optimal solution of the utility maximization problem P1.2 for NO given the optimal lease price $a^*$. If, for any $(a, Y_R)$ in the feasible region, the CP cannot increase its utility by moving away from $a^*$ and the NO cannot increase its utility by moving away from $Y_R^*$, then $(a^*, Y_R^*)$ is the equilibrium point of the proposed Stackelberg game.
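For reference, Stackelberg equilibrium conditions of this kind are conventionally written as a pair of inequalities. The notation $U^{CP}(a, Y_R)$ and $U^{MNO}(a, Y_R)$ for the two utilities as functions of the lease price and the cache decision is assumed here and may differ from the paper's own symbols.

```latex
U^{CP}\!\left(a^{*}, Y_R^{*}\right) \;\ge\; U^{CP}\!\left(a, Y_R^{*}\right),
\qquad
U^{MNO}\!\left(a^{*}, Y_R^{*}\right) \;\ge\; U^{MNO}\!\left(a^{*}, Y_R\right).
```

The first inequality says the leader gains nothing by changing its price while the follower keeps its best response; the second says the follower gains nothing by changing its cache decision at the announced price.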

Iterative-Based Dynamic Programming Algorithm
For problem P1.2 with a fixed lease price a, the task can be viewed as a knapsack problem in which the two capacity-limited RSUs correspond to two knapsacks and placing a content into different knapsacks yields different benefits. The content caching decision is optimized according to the cache capacity and the utility each content obtains in the different RSUs so as to maximize the total utility. However, this problem differs from the traditional knapsack problem in two respects. First, there are multiple knapsacks (two RSUs). Second, a content obtains different benefits in different RSUs, and the caching states of the two RSUs for the same content affect each other's utility: if the utility obtained by NO is $u_l$ when only $R_l$ caches content $f_i$ and $u_r$ when only $R_r$ caches it, the total utility obtained when both RSUs cache $f_i$ is not equal to $u_l + u_r$. Therefore, this paper designs a dynamic programming-based cache optimization algorithm to solve this problem.
The core idea of the dynamic programming algorithm is to decompose the original knapsack problem into a set of smaller knapsack problems and to relate the optimal solution of the original problem to the optimal solutions of the smaller ones. Following this idea, we use a three-dimensional array $DP_{C_{R_l}, C_{R_r}, I}$ to store the solutions of these smaller knapsack problems. An entry $db_{c_l,c_r,i} \in DP_{C_{R_l}, C_{R_r}, I}$ denotes the maximum utility when the cache capacity of $R_l$ is $c_l$, the cache capacity of $R_r$ is $c_r$, and the index of the cacheable content lies between 0 and $i$. Initially, $db_{c_l,c_r,0} = 0$, $db_{1,0,0} = u^{MNO}_{1,[1,0]}$, and $db_{0,1,0} = u^{MNO}_{1,[0,1]}$; the remaining entries are calculated recursively from the optimal solutions of smaller knapsack problems. Equation (31) determines the caching decision of NO for content $f_i$. Meanwhile, this paper uses $Y_R = (Y^l_i, Y^r_i)$ to record the caching state of the RSUs, where $Y^l_i$ and $Y^r_i$ denote the sets of caching decisions of $R_l$ and $R_r$ for the first $i$ contents, respectively. The specific procedure is shown in Algorithm 1.

Algorithm 1: Dynamic programming-based cache optimization algorithm
For $c_l = 0$ to $C_l$ do:
    For $c_r = 0$ to $C_r$ do:
        $db_{c_l,c_r,0} = 0$
    end for
end for
For $c_l = 0$ to $C_l$ do:
    For $c_r = 0$ to $C_r$ do:
        For $i = 1$ to $I$ do:
            If $c_l \ge s$ and $c_r \ge s$:
                ...
            Else if $c_l \ge s$:
                ...
                If $db_{c_l-1,c_r,i-1} + u^{MNO}_{i,[1,0]}$ ...

Based on Algorithm 1, an iteration-based dynamic programming algorithm is designed in this paper to find the equilibrium point of the Stackelberg game; the specific process is shown in Algorithm 2. Algorithm 2 involves an iterative interaction between the CP and the NO. As the leader, the CP first initializes the lease price a to 0 and then starts the game. During the game, the NO uses Algorithm 1 to obtain the optimal content caching decision and the corresponding maximum utility at the current rental price, based on the content popularity and the local cache space. The CP then calculates its own utility from the caching decision made by the NO, increases the rental price a by a small step ∆a, and the interaction is repeated.

Algorithm 2: Iteration-based dynamic programming algorithm
Initialize $a = 0$
While $a$ does not exceed the maximum unit price in (C1) do:
    Calculate the maximum utility $W^{MNO}_a$ of NO when the lease price is $a$ by Algorithm 1
    Calculate the utility $W^{CP}_a$ of CP when the lease price is $a$ according to Equation (26)
    $a = a + \Delta a$
end while
The lease price $a^*$ that maximizes $W^{CP}_a$ is the final lease price
The utility of NO is $W^{MNO}_{a^*}$ and the cache decision is $Cache_{a^*}$
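The interplay of the two algorithms can be sketched in Python. This is an illustrative reading rather than the paper's implementation: contents are assumed to occupy one unit of cache each, the per-content utilities are supplied as tables indexed by the placement $(y_l, y_r)$, and the utility model in `sweep_price` (NO utility equals a placement gain minus rent, CP utility equals rent plus a fee share) is a simplifying assumption. All names are hypothetical.

```python
from typing import Dict, List, Tuple

Placement = Tuple[int, int]  # (cached in R_l, cached in R_r)
PLACEMENTS: List[Placement] = [(0, 0), (1, 0), (0, 1), (1, 1)]


def best_caching(utils: List[Dict[Placement, float]],
                 cap_l: int, cap_r: int) -> Tuple[float, List[Placement]]:
    """Two-knapsack DP over unit-size contents (assumption).

    utils[i][(yl, yr)] is the NO utility of content i under that placement.
    Returns the maximum total utility and the placement chosen per content.
    """
    n = len(utils)
    # db[i][cl][cr]: best utility with the first i contents and capacities cl, cr
    db = [[[0.0] * (cap_r + 1) for _ in range(cap_l + 1)] for _ in range(n + 1)]
    choice = [[[(0, 0)] * (cap_r + 1) for _ in range(cap_l + 1)]
              for _ in range(n + 1)]
    for i in range(1, n + 1):
        for cl in range(cap_l + 1):
            for cr in range(cap_r + 1):
                best, arg = float("-inf"), (0, 0)
                for yl, yr in PLACEMENTS:
                    if yl > cl or yr > cr:
                        continue  # placement does not fit
                    val = db[i - 1][cl - yl][cr - yr] + utils[i - 1][(yl, yr)]
                    if val > best:
                        best, arg = val, (yl, yr)
                db[i][cl][cr] = best
                choice[i][cl][cr] = arg
    # Backtrack to recover the caching decision Y_R.
    plan: List[Placement] = []
    cl, cr = cap_l, cap_r
    for i in range(n, 0, -1):
        yl, yr = choice[i][cl][cr]
        plan.append((yl, yr))
        cl, cr = cl - yl, cr - yr
    plan.reverse()
    return db[n][cap_l][cap_r], plan


def sweep_price(gain: List[Dict[Placement, float]],
                cp_share: List[Dict[Placement, float]],
                a_max: float, step: float, cap_l: int, cap_r: int):
    """Leader raises the price from 0 in steps; follower best-responds via DP."""
    best = None
    k = 0
    while k * step <= a_max:
        a = k * step
        # Assumed NO utility: placement gain minus rent for each cached copy.
        utils = [{p: g[p] - a * sum(p) for p in PLACEMENTS} for g in gain]
        u_no, plan = best_caching(utils, cap_l, cap_r)
        # Assumed CP utility: rent income plus its share of user fees.
        u_cp = sum(a * sum(p) + cp_share[i][p] for i, p in enumerate(plan))
        if best is None or u_cp > best[1]:
            best = (a, u_cp, u_no, plan)
        k += 1
    return best  # (a*, CP utility, NO utility, cache decision)
```

Enumerating all four placements per content keeps the interaction between the two RSUs explicit: the utility of caching in both is looked up directly rather than assumed to be the sum of the two single-RSU utilities.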

Trajectory Optimization of UAV
After determining the rental price of the content and the caching decision of the RSU, the utility of the CP is determined, but the utility of the NO can be further enhanced by optimizing the trajectory of the UAV. The optimization problem can be expressed as follows: where G denotes the trajectory of the UAV, constraint (C4) is a constraint on the flight speed of the UAV, (C5) indicates that the size of the UAV's cached content data cannot exceed the maximum cache capacity, and (C6) indicates that the UAV's cache decision indicator variable takes the value of 0 or 1.
In this paper, we assume that the road is divided into $Z$ small road blocks, each as long as the coverage diameter $C_U$ of the UAV, so that $Z = 2L / C_U$. The UAV stays directly above the center of a block during a trajectory optimization cycle $T_x$ to serve the vehicle users within that block and, at the end of the cycle, selects an adjacent block to serve during the next trajectory optimization cycle $T_{x+1}$. The starting and ending points of the UAV are the same and fixed to facilitate the charging of the UAV. For an optimization cycle $T_x$ and a given UAV location, the request delay $d_i(l)$ of a vehicle user who makes a request at location $l$ when the UAV caches content $f_i$ can be calculated based on the analysis of the content delivery policy and content request delay in Sections 3.4 and 3.5, which gives the average request delay of users requesting $f_i$ with the UAV. According to the analysis in Section 4.1.1, the average request delay $d^{ave}_i$ for $f_i$ when the UAV does not cache $f_i$ can also be calculated; the delay saved by the UAV by caching $f_i$ is the difference between the two averages.

The introduction of a UAV can also increase the benefit of NO. Consider the case where the UAV and one RSU cache $f_i$ and the other RSU does not. If the UAV stays on the road section of the RSU that caches $f_i$ (for example, $R_l$ caches $f_i$, $R_r$ does not, and the UAV stays on the $R_l$ side), the probability of obtaining the content increases to $P_U$ for vehicle users driving in from the $R_l$ side. If the UAV stays on the road section of the RSU that does not cache $f_i$ (for example, $R_l$ caches $f_i$, $R_r$ does not, and the UAV stays on the $R_r$ side), the probability of obtaining the content increases to 1 for vehicle users driving in from the $R_l$ side. Thus, the benefit that the UAV can bring to the NO by caching $f_i$ is

$$w^{save}_i = \begin{cases} (P_U - P) \cdot N_l \cdot PR_i \cdot c \cdot y_{l,i} \cdot (1 - y_{r,i}) + (1 - P) \cdot N_r \cdot PR_i \cdot c \cdot (1 - y_{l,i}) \cdot y_{r,i}, & \text{if the UAV stays on the } R_l \text{ side,} \\ (1 - P) \cdot N_l \cdot PR_i \cdot c \cdot y_{l,i} \cdot (1 - y_{r,i}) + (P_U - P) \cdot N_r \cdot PR_i \cdot c \cdot (1 - y_{l,i}) \cdot y_{r,i}, & \text{if the UAV stays on the } R_r \text{ side.} \end{cases}$$
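The two-case expression for $w^{save}_i$ translates directly into a small function. The parameter names are hypothetical, and the interpretation of $N_l$ and $N_r$ as the numbers of vehicles entering from each side, $PR_i$ as the request probability of $f_i$, and $c$ as the benefit per delivered content is a reading of the symbols, not something defined in this section.

```python
def uav_saved_benefit(p: float, p_u: float, n_l: float, n_r: float,
                      pr_i: float, c: float, y_l: int, y_r: int,
                      uav_side: str) -> float:
    """Benefit the UAV adds for content f_i when exactly one RSU caches it."""
    only_left = y_l * (1 - y_r)    # R_l caches f_i, R_r does not
    only_right = (1 - y_l) * y_r   # R_r caches f_i, R_l does not
    if uav_side == "l":
        return ((p_u - p) * n_l * pr_i * c * only_left
                + (1 - p) * n_r * pr_i * c * only_right)
    if uav_side == "r":
        return ((1 - p) * n_l * pr_i * c * only_left
                + (p_u - p) * n_r * pr_i * c * only_right)
    raise ValueError("uav_side must be 'l' or 'r'")
```

When both RSUs or neither RSU cache $f_i$, both indicator products are zero and the function returns 0, matching the formula.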
The problem can therefore be simplified to problem P3. The UAV caches the content leased by the RSUs according to popularity, so the utility saved when the UAV is on each road block in each trajectory optimization cycle can be calculated; the utility saved when the UAV stays on road block $L_z$ in optimization cycle $T_x$ is denoted by $u^{save}_{x,z}$. P3 can then be regarded as a shortest path problem on a directed graph, as shown in Figure 5, where the point $(x, z)$ indicates that the UAV stays on block $L_z$ during the optimization cycle $T_x$. An edge connecting two points represents a move of the UAV and $u^{save}_{x,z}$ is used as the weight of the edge.

This problem can be solved by Dijkstra's algorithm, a greedy strategy that starts from the starting point and repeatedly visits the unvisited point closest to the starting point until the ending point is reached. The details are shown in Algorithm 3. First, the starting point $L_{begin}$ and the ending point $L_{end}$ are given, with $L_{begin} = L_{end}$, and the directed graph is constructed. Initially, the starting point is marked and its distance is recorded as 0, i.e., $dis_{0,(T_0, L_{begin})} = 0$. All points except the starting point are unmarked and their distance to the starting point is recorded as positive infinity, i.e., $dis_{0,p} = \infty$. Next, iteration is performed. The point just added is denoted as $pt$ and its set of neighboring points as $pt_{nei}$. For each neighboring point $p \in pt_{nei}$, the distance $dis_{pt,p}$ from the starting point to $p$ via $pt$ is calculated as the weight of the edge from $pt$ to $p$ plus the distance from $pt$ to the starting point. If $dis_{pt,p}$ is less than the distance $dis_{0,p}$ recorded for $p$, $dis_{0,p}$ is updated to $dis_{pt,p}$. After the distance update, the unmarked point closest to the starting point is selected, marked, and included in the set of optimal paths. The last two steps are repeated until the ending point is reached, which yields the final path and the saved utility.

Algorithm 3: Dijkstra-based path planning algorithm
Given $L_{begin} = L_{end}$, construct the directed graph
For each cycle $x$ and each block $z = 1$ to $Z$: set $flag_{(x,z)} = 0$ and $dis_{0,(x,z)} = \infty$
Set $dis_{0,(T_0, L_{begin})} = 0$, mark the starting point, and denote it as $pt$
While the ending point is not marked do:
    For $p$ in $pt_{nei}$:
        $dis_{pt,p}$ = weight of the edge from $pt$ to $p$ + $dis_{0,pt}$
        If $dis_{pt,p} < dis_{0,p}$: $dis_{0,p} = dis_{pt,p}$
    end for
    Select the closest point to the starting point from the unmarked points, denoted as $pt$
    $flag_{pt} = 1$, add $pt$ to the trajectory $G$ of the UAV
end while

In summary, the complexity of the algorithm used for this strategy is $O(C_l \cdot C_r \cdot I)$. The algorithm can accurately find the optimal RSU caching strategy and UAV flight trajectory, but its complexity is high and its computing time is long, so it is suited to scenarios with a small number of contents and high accuracy requirements.
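The path planning step can be sketched with a heap-based Dijkstra search over the layered graph of (cycle, block) points. Two assumptions are made for this illustration and are not taken from the paper: the UAV may either stay on its block or move to a directly adjacent block between cycles, and each saved utility is converted into a non-negative cost `u_max - u_save`, so that the shortest path corresponds to the largest total saved utility (every path from start to end has the same number of edges, so the two objectives coincide). Function and variable names are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def plan_trajectory(u_save: List[List[float]],
                    start: int) -> Tuple[float, List[int]]:
    """u_save[x][z]: utility saved if the UAV hovers over block z in cycle x.

    The trajectory starts on block `start` in cycle 0 and must return to it
    in the last cycle. Returns (total saved utility, block index per cycle).
    """
    n_cycles, n_blocks = len(u_save), len(u_save[0])
    u_max = max(max(row) for row in u_save)
    source, goal = (0, start), (n_cycles - 1, start)
    dist: Dict[Tuple[int, int], float] = {source: u_max - u_save[0][start]}
    prev: Dict[Tuple[int, int], Tuple[int, int]] = {}
    heap = [(dist[source], 0, start)]
    marked = set()
    while heap:
        d, x, z = heapq.heappop(heap)
        if (x, z) in marked:
            continue
        marked.add((x, z))                 # closest unmarked point
        if (x, z) == goal:
            break
        if x + 1 == n_cycles:
            continue                       # last cycle has no successors
        for nz in (z - 1, z, z + 1):       # stay or move to an adjacent block
            if 0 <= nz < n_blocks:
                nd = d + (u_max - u_save[x + 1][nz])
                if nd < dist.get((x + 1, nz), float("inf")):
                    dist[(x + 1, nz)] = nd
                    prev[(x + 1, nz)] = (x, z)
                    heapq.heappush(heap, (nd, x + 1, nz))
    # Rebuild the trajectory G from the predecessor links.
    path = []
    node = goal
    while node != source:
        path.append(node[1])
        node = prev[node]
    path.append(start)
    path.reverse()
    total = sum(u_save[x][z] for x, z in enumerate(path))
    return total, path
```

In the example `plan_trajectory([[0, 0, 0], [1, 5, 1], [1, 2, 9], [0, 0, 0]], 0)`, the block with saved utility 9 is not chosen because the UAV could not return to its starting block in time; the planned trajectory is `[0, 1, 1, 0]`.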

Analysis of Simulation Results
To evaluate the performance of the proposed algorithm, we simulate it in Python and compare it with the caching strategy in [10]. The path loss exponent is $\alpha = 4$, the channel fading gain is $\chi = 10^{-2}$, the channel bandwidth is $B = 1.1$ MHz, and the power of the Gaussian noise is $P_n = -110$ dBm; the other simulation parameters are shown in Table 1.

Figure 6 gives the variation of utility with the total number of contents, where the total utility of the NO is determined by both the benefit of the NO and the average delay of the vehicle users. From Figure 6a-d, it can be seen that, as the total number of contents increases, the total utility of the CP and the total utility of the NO decrease, the benefit of the NO increases, and the average request delay increases. This is because the cache capacity of the RSUs and the UAV is fixed: the larger the total amount of content, the smaller the fraction of content cached, the lower the probability that vehicles receive content through cache nodes, and the larger the content request latency. Since the total utility of NO is determined by its benefit and the average request latency, it decreases as the total amount of content increases. The NO's demand for content during the game therefore falls, so the rental price of content decreases and the total utility of CP decreases. Compared with the strategy in [10], the benefit of CP under the strategy in this paper is slightly lower, while all other performance metrics are better. This is because the policy in [10] does not split the content into blocks, so content that is not fully received when the vehicle user drives out of the coverage of a cache node must be downloaded again at the next cache node, which lowers the probability of successful delivery. Moreover, the policy in [10] does not use a UAV, which leads to an increase in delay, a decrease in the benefit of the NO, and a decrease in the utility of the NO. Since the total benefit shared between CP and NO is fixed, the utility of CP under [10] is slightly greater than under the strategy in this paper. The strategy in this paper also performs better than the caching-by-popularity strategy, because the dynamic programming algorithm optimizes both the content rental price and the caching decisions of the RSUs.

Figure 7 gives the relationship between the performance of this paper's strategy and the Zipf distribution parameter for different total numbers of contents. From Figure 7b-d, it can be seen that, when the total number of contents is constant, the average delay decreases and the benefit and total utility of NO increase as the Zipf distribution parameter increases. This is because, as the Zipf distribution parameter increases, the requests of vehicle users become more and more concentrated on highly popular content, so the cache hit rate increases, the request latency of vehicles decreases, and the rate at which vehicles successfully obtain the content increases. As the cache hit rate increases, the benefit and total utility of NO increase. From Figure 7a, the total utility of CP decreases as the Zipf distribution parameter increases when the total number of contents is constant. This is because, as the Zipf distribution parameter increases, the request rate of vehicle users for content with low popularity decreases, which lowers the NO's demand for such content, so the content rental price decreases and the total utility of CP decreases accordingly.

Figure 8 gives the relationship between the performance of this paper's strategy and the vehicle speed for different total numbers of contents. From Figure 8a-d, it can be seen that, as the vehicle speed increases, the average delay decreases, the benefit and total utility of NO increase, and the total benefit of CP decreases. This is because, as vehicle speed increases, the time vehicle users spend traveling on the roadway becomes shorter, the maximum waiting delay becomes shorter, and the average delay decreases. As the vehicle speed increases, the contact time between the vehicle and each cache node also decreases and the success rate of content delivery decreases, so the NO's demand for content decreases and the content rental price decreases, resulting in a decrease in the total utility of CP and an increase in the benefit and total utility of NO.

Figure 9 gives the relationship between the performance of this paper's strategy and the communication radius of the RSU for different total numbers of contents. From Figure 9a-d, it can be seen that the average delay, the benefit, and the total utility of NO decrease as the communication radius of the RSU increases, while the total benefit of CP increases. This is because, as the communication radius of the RSU increases, the success rate of the vehicle user in obtaining content from the RSU increases, so the average request delay of the vehicle user decreases. At the same time, the NO's demand for content grows and the content rental price increases, so the total utility of CP increases and the benefit of NO decreases. The total utility of NO is determined by both the delay and the benefit of NO and, taken together, it decreases slightly as the RSU communication radius increases.
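The effect of the Zipf distribution parameter discussed for Figure 7 can be illustrated with a short calculation. The sketch assumes the standard Zipf law, in which the request probability of the content ranked $i$ is proportional to $i^{-\gamma}$, and a cache that simply holds the $k$ most popular contents; it does not reproduce the paper's simulation, and the function names are hypothetical.

```python
from typing import List


def zipf_popularity(num_contents: int, gamma: float) -> List[float]:
    """Request probability of each content, ranked from most to least popular."""
    weights = [1.0 / (rank ** gamma) for rank in range(1, num_contents + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]


def top_k_hit_ratio(num_contents: int, cache_size: int, gamma: float) -> float:
    """Fraction of requests that target the cache_size most popular contents."""
    return sum(zipf_popularity(num_contents, gamma)[:cache_size])


if __name__ == "__main__":
    for gamma in (0.4, 0.8, 1.2):
        print(gamma, round(top_k_hit_ratio(100, 10, gamma), 3))
```

For a fixed catalogue and cache size, a larger parameter concentrates requests on the highest-ranked contents, so the share of requests served from the cache grows, which is the mechanism behind the lower delay described above.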

Conclusions
In this paper, we investigated a UAV-assisted caching strategy based on content cache pricing in vehicular networks. In a traffic scenario consisting of CP, NO, and multiple vehicle users, the CP leases some popular content to the NO for benefits and the NO caches this leased content in RSUs to save expensive backhaul transmission overhead and latency. We analyzed the content delivery process and benefit exchanges between CP, NO, and vehicle users and defined the utilities of CP and NO. Then, the competition between CP and NO was modeled using the Stackelberg game framework and an iteration-based dynamic programming algorithm was designed to find the Stackelberg equilibrium by optimizing the caching decisions of RSUs and the rental prices of contents. Finally, a cache-capable UAV was introduced into the vehicular network to cache the content already leased by the NO and a Dijkstra-based path planning algorithm was designed to further increase the total utility of the NO and improve the service quality by optimizing the trajectory of the UAV. The simulation results show that the strategy in this paper distributes the benefits of CP and NO reasonably, reduces the average request latency, and increases the utility of NO. Compared with [10], the strategy in this paper reduces the total benefit of CP by 2% and increases the total utility of NO by 13%, with the benefit of NO increased by 3% and the request delay reduced by 27%.