A Mobility Prediction-Based Relay Cluster Strategy for Content Delivery in Urban Vehicular Networks

Abstract: In recent years, cache-enabled vehicles have been introduced to improve the efficiency of content delivery in vehicular networks. However, because of the highly dynamic network topology, increasing the success probability of content delivery remains a major challenge. In this paper, we propose a relay strategy based on clusters' predicted trajectories for situations in which no cache exists near the request vehicle. In our strategy, the roadside unit (RSU) divides vehicles into clusters by their predicted trajectories and then proactively caches contents at a cluster that is about to meet the request vehicle. In order to decrease the probability of unsuccessful content delivery caused by an overly short communication duration between the request vehicle and content source vehicle, the RSU caches content chunks at multiple vehicles in a cluster. By letting the request vehicle communicate with the chunk-caching vehicles one by one, our strategy enlarges the communication duration and increases the success probability. Our strategy also maximizes the success probability by optimizing the number of vehicles selected to cache content chunks. Besides, based on the statistical characteristics of vehicle speed, we derive a formula for the success probability of content delivery. The simulation results show that our strategy can increase the success probability of content delivery, as well as decrease time delay; for example, we increase the success probability by about 20%. Since the trajectory prediction-based cluster-dividing mechanism can improve cluster stability at intersections, this method is well suited to urban road scenarios.


Introduction
With the dramatic development of intelligent transportation systems (ITS), various vehicular applications, including road safety, intelligent transportation, in-vehicle entertainment, and self-driving [1], have entered our daily lives. Smart roads can supply vehicles with autonomous driving support, but this requires a large quantity of data to be transmitted between road facilities and vehicles [2,3], which inevitably increases the traffic of the vehicular network. The increasing number of vehicles on the road sharply increases the traffic burden of vehicular networks, and the highly dynamic vehicular network makes vehicular communication links unstable. Consequently, the quality of experience of vehicle users is poor. In order to solve these problems, researchers have introduced in-network caching into the vehicular network.
In-network caching is a mechanism that caches contents at the nodes of the vehicular network, so that vehicles can obtain desired contents from neighboring nodes. Caching increases the redundancy of contents in the network, which shortens the communication links between content source nodes and request vehicles and decreases the traffic burden on content servers and the network. The authors of [4] introduced the Leave Copy Everywhere (LCE) mechanism, which lets every node cache each content passing through it. The authors of [5] introduced a probabilistic mechanism, which lets nodes cache passing contents with a certain probability. Although the mechanisms in the abovementioned research can weaken the influence of the highly dynamic topology, the researchers have only focused on past or current positions and have failed to take vehicles' future trajectories into account, so the resulting clusters have poor stability.
To sum up, existing cluster-dividing mechanisms fail to accommodate the highly dynamic topology well, especially in urban areas, and vehicle clusters can easily disintegrate at intersections. Therefore, we take vehicles' future trajectories into account to improve the stability of clusters. In order to increase the success probability of content delivery, this paper proactively caches requested contents at clusters that will meet the request vehicle on its future trajectory. This paper also adopts the multiuser multichannel transmission mode in the communication process.
The main contributions of our paper are summarized as follows:
• We propose a proactive caching strategy based on clusters' prediction trajectories, which divides vehicles into clusters according to the vehicles' prediction trajectories. RSUs proactively cache content chunks at a cluster that will meet the request vehicle on its prediction trajectory. By letting the request vehicle receive content chunks from vehicles on the opposite road one by one, we enlarge the communication duration between the request vehicle and content source vehicles and increase the success probability of content delivery. Besides, in order to increase the success probability of content delivery between the request cluster and content source cluster, we introduce the multiuser multichannel transmission mode into the communication process between clusters.
• Aiming at increasing the stability of clusters, we treat vehicles' prediction trajectories as one of the considerations of cluster division. By giving CMs the same prediction trajectory, we obtain the cluster's prediction trajectory, which is the same as that of the vehicles in the cluster.
• Based on the prediction trajectories of the request vehicle and clusters, as well as the vehicles' speeds, the RSU proactively caches content chunks at multiple vehicles in the corresponding cluster. During this process, the RSU computes the optimal number of content chunks to maximize the success probability of content delivery.
• Based on the statistical characteristics of vehicle speed, vehicle flow, and the number of request arrivals at an RSU, we theoretically derive the success probability of content delivery. The simulation results verify the validity of the derivation and demonstrate that our strategy improves the system's performance in terms of time delay, as well as the success probability of content delivery. In comparison with the results achieved by the authors of [19], we increase the success probability by about 20%.
The rest of this paper is organized as follows. The system model is presented in Section 2. In Section 3, we depict the content request and delivery process, introduce the cluster dividing algorithm, optimize the caching strategy, and derive a formula for the success probability of content delivery. Section 4 shows the simulation results and illustrates the performance of our proposed strategy. Section 5 summarizes the paper.

System Model
In this section, as shown in Figure 1, we consider a bidirectional urban road area with 2I lanes, each of the same size with length E and width D. Lanes i and i + I, i ∈ [1, I], denote the two directions of a dual carriageway. The traffic flow differs between lanes, and we assume that the number of vehicles in an area of lane i follows a Poisson Point Process with rate λ_i per unit area [22]. Let K denote the number of vehicles in lane i; the probability distribution function of K is

P(K = k) = ((λ_i E D)^k / k!) · e^{−λ_i E D}, k = 0, 1, 2, . . .

Assuming that the speed of vehicles on different streets is different, and that the speed of vehicles in the same lane follows the same Truncated Normal Distribution [23], the probability density function of the speed s_i of vehicles on lane i is given as follows:

f(s_i) = (c_i / (√(2π) σ_i)) · exp(−(s_i − μ_i)² / (2σ_i²)), s_i ∈ [s_min, s_max],

where μ_i and σ_i denote the mean and standard deviation of s_i, and the parameter c_i ensures that the accumulated probability of f(s_i) over the range [s_min, s_max] equals 1, which determines the value of c_i. In particular, s_i and s_{i+I} have the same probability density function. Because the speeds of any two vehicles are independent of each other, the joint probability density function of the speeds s_x^i and s_y^i of two vehicles in lane i is derived as follows:

f(s_x^i, s_y^i) = f(s_x^i) · f(s_y^i).

The distribution function of speed s_i can be derived from Formula (2) as follows:

F(s_i) = ∫_{s_min}^{s_i} f(u) du.

Each intersection in our system has an RSU. RSUs connect to the internet and are linked to their neighboring counterparts, whereby vehicles can obtain all desired content.
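The normalization constant c_i can be computed from the standard normal CDF. The sketch below (the lane parameters are illustrative values, not taken from the paper) computes c_i and numerically verifies that the truncated density integrates to 1 over [s_min, s_max]:

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncation_constant(mu, sigma, s_min, s_max):
    """c_i such that the truncated normal pdf integrates to 1 on [s_min, s_max]."""
    mass = phi((s_max - mu) / sigma) - phi((s_min - mu) / sigma)
    return 1.0 / mass

def speed_pdf(s, mu, sigma, s_min, s_max):
    """f(s_i) = c_i / (sqrt(2*pi)*sigma) * exp(-(s - mu)^2 / (2*sigma^2))."""
    c = truncation_constant(mu, sigma, s_min, s_max)
    return c / (math.sqrt(2.0 * math.pi) * sigma) * math.exp(-(s - mu) ** 2 / (2.0 * sigma ** 2))

# Illustrative lane parameters: mean 14 m/s, std 3 m/s, speeds limited to [5, 22] m/s.
mu, sigma, s_min, s_max = 14.0, 3.0, 5.0, 22.0
c = truncation_constant(mu, sigma, s_min, s_max)

# Midpoint-rule integration of the pdf over [s_min, s_max]; the result should be ~1.
n = 10000
h = (s_max - s_min) / n
total = sum(speed_pdf(s_min + (j + 0.5) * h, mu, sigma, s_min, s_max) for j in range(n)) * h
print(round(c, 4), round(total, 4))
```

Since truncation removes probability mass from the tails, c_i is always slightly greater than 1.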
Assuming that M RSUs exist in this area, an RSU is denoted by R_m, where the subscript m denotes the sequence number of R_m. All RSUs have the same communication range R_rsu and the same maximum transmission rate R_max, and the number of request arrivals at R_m, m ∈ [1, M], obeys a Poisson Process with parameter λ^rsu_m. Vehicles in our system are denoted by V_x, where the subscript x denotes the sequence number of V_x, and all vehicles have the same transmission range r_v.
The authors of [24] indicated that requests for content approximately follow a Zipf distribution. Accordingly, we assume that the content library contains L contents, each with the same size W. Vehicles' requests for content in the library obey the Zipf distribution, and content with higher popularity has a higher probability of being requested. The probability of requesting the content with popularity ranked τ-th is given as follows:

P(τ) = τ^{−γ} / Σ_{l=1}^{L} l^{−γ},

where γ is the parameter of the Zipf distribution; as the value of γ increases, vehicles request high-popularity content more frequently. τ denotes the popularity rank of the content in the library. Vehicles in our system have limited caching capacity, which is divided into two parts: caching space and relay space. The former is used to cache content that is of high popularity or attractive to the vehicle user, and it can accommodate at most N contents (N < L). The latter is used to temporarily cache content chunks that will be delivered to the request vehicle when the caching vehicle meets it; the vehicles selected to temporarily cache content are called relay vehicles. Because the relay space only caches part of a content, the amount of content that the relay space can cache was set to 2. The common symbols are shown in Table 1.
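The Zipf request probabilities above can be sketched directly; the library size and γ below are illustrative placeholders:

```python
def zipf_request_prob(tau, L, gamma):
    """P(tau) = tau^-gamma / sum_{l=1}^{L} l^-gamma: probability that the
    content ranked tau-th in popularity is requested."""
    norm = sum(l ** (-gamma) for l in range(1, L + 1))
    return tau ** (-gamma) / norm

L_lib, gamma = 100, 0.8  # illustrative library size and Zipf parameter
probs = [zipf_request_prob(t, L_lib, gamma) for t in range(1, L_lib + 1)]

assert abs(sum(probs) - 1.0) < 1e-9      # probabilities sum to 1
assert probs[0] > probs[1] > probs[-1]   # higher popularity -> higher request probability
```

A larger γ concentrates requests on the top-ranked contents, which is why popular content is worth caching proactively.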

Content Acquisition Process
In vehicular networks, vehicles can obtain desired content from neighboring vehicles as well as RSUs, and the content acquisition process is illustrated in Figure 2. In our strategy, as shown in Figure 3, there are five cases of content acquisition. In case ➂ and case ➄, the request vehicles obtain desired content from vehicles on the opposite lanes. Request vehicles and content source vehicles drive in opposite directions, which can easily lead to unsuccessful content transmission because the communication duration is too short. For example, if both vehicles drive at 50 km/h and the coverage range of a vehicle is 50 m, the communication duration is only about 3.6 s. If the transmission rate is 1 Mbit/s, then the transmission volume is no more than 3.6 Mbit. However, the volume of content desired by vehicle users usually exceeds 3.6 Mbit, which means that unsuccessful transmission occurs easily. Therefore, our strategy caches content chunks at multiple vehicles, which shrinks the volume of content transferred from each content source vehicle to the request vehicle and increases the probability of successful content transmission. The process of content acquisition for vehicle users is as follows.
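This back-of-the-envelope contact window can be checked directly. The sketch below assumes two vehicles approaching head-on, in contact while they are within the 50 m range of each other, i.e. over 2 × 50 m of relative travel:

```python
def communication_window(range_m, v1_kmh, v2_kmh):
    """Time (s) that two opposite-direction vehicles stay within range of each
    other: they are in contact over 2*range_m of relative travel."""
    rel_speed = (v1_kmh + v2_kmh) * 1000.0 / 3600.0  # km/h -> m/s
    return 2.0 * range_m / rel_speed

window = communication_window(50.0, 50.0, 50.0)  # both vehicles at 50 km/h
volume = window * 1.0                            # Mbit transferable at 1 Mbit/s
print(round(window, 2), round(volume, 2))
```

Splitting the content into k chunks divides the required per-contact volume by k, which is the motivation for the chunk-caching strategy.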

The request vehicle first searches for the desired content among vehicles that are a one-hop distance from itself in the same lane. If the desired content exists, then the content source vehicle directly transfers the content to the request vehicle. As shown in Figure 3, V_2 caches the content that V_1 requests, and V_2 directly transfers the content when it receives the request from V_1.
If the request vehicle fails to find desired content among vehicles that are a one-hop distance from itself in the same lane, then the request will be delivered to vehicles that are a two-hop distance from the request vehicle in the same lane. If the desired content exists, then the content source vehicle will transfer the content to the request vehicle that is a two-hop distance away, meaning that the content source vehicle will first transfer the content to a relay vehicle, and then the relay vehicle will transfer the content to the request vehicle. As shown in Figure 3, V 5 , which is a two-hop distance from V 3 , caches the content V 3 desires. V 5 first transfers the content to the relay vehicle V 4 , and then V 4 transfers the content to V 3 .
If the request vehicle fails to find the desired content among vehicles that are a one-hop or two-hop distance from itself in the same lane, then the request will be forward-delivered to vehicles that are a one-hop or two-hop distance from the request vehicle in the opposite lane. If the desired content exists, then the request vehicle and content source vehicle separately act as CHs and make vehicles that are a one-hop distance from themselves in the same lane become CMs, constructing the request cluster and the content source cluster. On this basis, according to the caching optimization algorithm depicted in Section 3.2, the content source vehicle evenly divides the content into multiple chunks and caches them at multiple relay vehicles in the content source cluster, and those vehicles transfer the content chunks to vehicles in the request cluster using the multiuser multichannel transmission mode. As shown in Figure 3, V_11 caches the content V_6 requests. Then, V_10, V_11, V_12, and V_13 constitute the content source cluster, and V_6, V_7, V_8, and V_9 constitute the request cluster after the request arrives at V_11. According to the caching optimization algorithm, V_11 divides the content into three chunks and separately caches them at V_11, V_12, and V_13. During the transmission process, V_11, V_12, and V_13 separately transfer the content chunks to V_6, V_7, and V_8, and then V_6 receives the remaining content chunks from V_7 and V_8.
If the request vehicle fails to find the desired content among vehicles that are a one-hop or two-hop distance away from the request vehicle in the same lane or in the front opposite lane, then the request will be delivered to an RSU. When the request vehicle is situated in the coverage area of the RSU, the request vehicle directly obtains desired content from the RSU. As shown in Figure 3, the request of V 14 is delivered to R 4 , and then R 4 directly transfers content to V 14 .
When the request vehicle is situated outside the coverage area of the RSU, the RSU ahead of the request vehicle divides the vehicles in its coverage area into clusters by their prediction trajectories. According to the caching optimization algorithm depicted in Section 3.2, the RSU evenly divides the content into multiple chunks and caches them at multiple relay vehicles in a cluster that is about to meet the request vehicle. As shown in Figure 3, R_2 delivers the request of V_15 to R_3. Vehicles going to drive on the opposite lane of V_15 constitute the caching cluster, and R_3 chooses V_16, V_17, and V_18 from the cluster to cache content chunks. During the communication process, V_15 receives the content chunks from V_16, V_17, and V_18 one by one.

Cluster-Dividing Mechanism
RSUs not only provide services for vehicles in their own coverage areas, but also proactively cache content at clusters that are going to meet request vehicles, based on the requests forwarded by neighboring RSUs. Because the content transmission process between the clusters and request vehicles happens on the prediction trajectory of the clusters, our mechanism makes use of a neural network to predict each vehicle's behavior at intersections: turning left, going straight, turning right, or turning around. According to the prediction results, vehicles in the coverage area of an RSU are divided into different clusters.
Our mechanism makes use of a three-layer neural network to predict vehicles' behavior, and the structure of the network is shown in Figure 4. The network uses the sigmoid activation function in the hidden layers and a SoftMax layer behind the output layer. Some attribute information of a vehicle can affect the driver's behavior; for example, departure places and destinations can be used by drivers to select the most convenient path, and drivers make different choices at different times and positions. Therefore, the input parameters of the neural network are defined as x_1, denoting the departure place; x_2, denoting the destination; x_3, denoting the current position; and x_4, denoting the current time. A four-length vector is input into the input layer. The hidden layer has two implicit layers with 17 and 21 nodes, respectively. The output layer also produces a four-length vector, in which y_1, y_2, y_3, and y_4 denote the probabilities of turning left, going straight, turning right, and turning around. In this paper, 9149 sets of driving data were obtained from an urban road environment simulated by SUMO (Simulation of Urban Mobility) [25], and the ratio of training data to test data was 8:2. In order to express prediction results in terms of probability, the SoftMax layer normalizes the network output. The SoftMax function [26] is expressed as follows:

softmax(y_j) = e^{y_j} / Σ_{k=1}^{4} e^{y_k}, j = 1, 2, 3, 4.

The SoftMax function maps the output to the range (0, 1), by which the neural network predicts the probabilities of turning left, going straight, turning right, and turning around, and the result with the maximum probability is treated as the predicted behavior of the vehicle. The neural network takes the cross-entropy between the behavior distribution of the training output and the historical actual data as the objective function.
By minimizing the value of the objective function, our mechanism obtains the network parameters and completes the training process. The objective function is formulated as follows:

J(θ) = −(1/N_s) Σ_{g=1}^{N_s} Σ_{j=1}^{4} ŷ_{g,j} log y_{g,j}(θ),

where N_s denotes the number of training samples, [ŷ_{g,1}, ŷ_{g,2}, ŷ_{g,3}, ŷ_{g,4}] denotes the actual result of the g-th sample, and [y_{g,1}(θ), y_{g,2}(θ), y_{g,3}(θ), y_{g,4}(θ)] denotes the training output of the g-th sample. The training process uses gradient descent to find the parameter set θ that minimizes J(θ), and the performance of the neural network can be illustrated by the prediction accuracy, derived by dividing the number of correct predictions by the volume of test data. Our paper compares the prediction accuracy when the activation function is the sigmoid function and the tanh function. The relationship between prediction accuracy and the number of training rounds is shown in Figure 5: prediction accuracy increased with training and reached nearly 1 after 30 rounds. Figure 5 also shows that the sigmoid function performed better than the tanh function, so we chose the sigmoid function as the activation function. After training, the neural network can predict the behavior of vehicles with high accuracy. The cluster-dividing algorithm based on the vehicles' prediction trajectory is discussed next. Based on the prediction trajectory of vehicles in the coverage area of the RSUs, the RSUs divide vehicles into different clusters, and the CMs in a cluster must have the same next prediction lane.
For example, as shown in Figure 6, vehicles with the same next prediction lane are grouped into one cluster. The procedure is given in Algorithm 1.

Algorithm 1 Cluster-Dividing Algorithm Based on Prediction Trajectory
Initialization: all vehicles in the coverage area of R_m constitute the set SET_m
1. For any V_x ∈ SET_m
2. input departure place x_1, destination x_2, current lane x_3, and current time x_4 of V_x into the neural network, and obtain the output [softmax(y_1), softmax(y_2), softmax(y_3), softmax(y_4)] and the next prediction lane of V_x
3. if the next prediction lane of V_x is lane i
4. make V_x become a CM of C_m,i
5. End For
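A forward pass of the behavior-prediction network can be sketched as follows. The weights here are random placeholders (a trained network would load the parameters θ found by gradient descent); only the 4-input, 17- and 21-node hidden layers, and 4-way SoftMax output match the structure described above:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())  # shift by max for numerical stability
    return e / e.sum()

# Placeholder weights: 4 inputs -> 17 -> 21 -> 4 outputs.
W1, b1 = rng.normal(size=(17, 4)), np.zeros(17)
W2, b2 = rng.normal(size=(21, 17)), np.zeros(21)
W3, b3 = rng.normal(size=(4, 21)), np.zeros(4)

def predict_behavior(x):
    """x = [departure place, destination, current lane, current time], encoded
    numerically. Returns the probabilities of [left, straight, right, u-turn]
    and the arg-max behavior label."""
    h1 = sigmoid(W1 @ x + b1)
    h2 = sigmoid(W2 @ h1 + b2)
    p = softmax(W3 @ h2 + b3)
    labels = ["left", "straight", "right", "u-turn"]
    return p, labels[int(np.argmax(p))]

p, behavior = predict_behavior(np.array([0.2, 0.7, 0.3, 0.5]))
print(p, behavior)
```

The SoftMax output is a valid probability vector, so the arg-max label is exactly the "result with the maximum probability" used to assign a vehicle to its next prediction lane.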

Caching optimization for number k of content chunks
In our strategy, case ➂ and case ➄ both cache content at relay vehicles. The probability that the relay vehicles meet the request vehicle is 1 in case ➂, which can be regarded as the particular case of case ➄ with a prediction probability of 1. Therefore, our paper only discusses the caching optimization of case ➄, whereby RSUs evenly divide content into k chunks and cache them at k vehicles. If the value of k increases, then the size of the content chunk cached at each vehicle decreases, and the success probability of content chunk transmission increases; however, the probability that the request vehicle meets all of those k vehicles decreases. Conversely, if the value of k decreases, the probability that the request vehicle meets all of those k vehicles increases, but the size of the content chunk cached at each vehicle increases and the success probability of content chunk transmission decreases. Therefore, our strategy optimizes the number k of content chunks to maximize the success probability of content transmission from the k vehicles in a cluster to the request vehicle.

Assuming that R_m needs to cache content chunks at multiple vehicles in cluster C_m,i+I for vehicle V_x on lane i, R_m chooses the vehicles with the top k prediction probabilities to cache the k content chunks. Every content chunk has the same size W/k, and the prediction probabilities of those k vehicles constitute the set {P^pre_{i+I,1}, P^pre_{i+I,2}, . . . , P^pre_{i+I,k}}. During the meeting between C_m,i+I and the request vehicle, the probability that all of those k vehicles successfully transfer their content chunks to the request vehicle is as follows: The success probability of content transmission from cluster C_m,i+I to the request vehicle is as follows: Our strategy maximizes the success probability of content transmission by optimizing the number of content chunks, and the optimization is formulated as follows: where, in case ➂, NV = H_x, with H_x denoting the number of vehicles that are a one-hop distance away from V_x, and in case ➄, NV = C_num_{i+I}, with C_num_{i+I} denoting the number of vehicles in C_m,i+I. By optimizing the number of content chunks, R_m maximizes the success probability of content transmission from the cluster to the request vehicle. The process by which R_m proactively caches content chunks at multiple vehicles in a cluster is shown in Algorithm 2.

Algorithm 2 Caching Algorithm for Optimizing the Number of Content Chunks
Initialization: All vehicles in the coverage area of R_m constitute the set SET_m, Max = φ, Pre = φ, NV
1. predict the behavior of all vehicles in set SET_m, make vehicles with prediction lane i + I become CMs of cluster C_m,i+I, and put their prediction probabilities into set Pre
2. for k = 1 to min(C_num_{m,i+I}, NV) do
3. use k to calculate P^trans_{i+I}(k) in (8)
4. use P^trans_{i+I}(k) to calculate P^suc_{i+I}(k) in (9)
5. put P^suc_{i+I}(k) into set Max
6. end For
7. make the k corresponding to the maximum value in Max become the number of content chunks
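The trade-off searched by Algorithm 2 can be illustrated with a toy model. The sketch below is NOT Formulas (8) and (9), which are not reproduced here; it simply assumes that a chunk transfer succeeds when the opposite-direction contact window at sampled speeds can carry W/k bits, and that the request vehicle meets all k relay vehicles with probability equal to the product of their prediction probabilities. All numeric parameters are illustrative:

```python
import random

random.seed(1)

def optimal_k(pre_probs, W, r_v, rate, mu, sigma, n_samples=2000):
    """Brute-force the chunk count k maximizing a toy success probability:
    P_suc(k) = (product of top-k prediction probs) * P(window * rate >= W/k)^k."""
    pre = sorted(pre_probs, reverse=True)
    best_k, best_p = 1, 0.0
    for k in range(1, len(pre) + 1):
        p_meet = 1.0
        for p in pre[:k]:
            p_meet *= p                       # meet all k relay vehicles
        ok = 0
        for _ in range(n_samples):
            s1 = max(5.0, random.gauss(mu, sigma))  # crude stand-in for the
            s2 = max(5.0, random.gauss(mu, sigma))  # truncated normal speeds
            window = 2.0 * r_v / (s1 + s2)          # opposite-direction contact (s)
            if window * rate >= W / k:              # chunk of size W/k fits
                ok += 1
        p_trans = ok / n_samples
        p_suc = p_meet * p_trans ** k
        if p_suc > best_p:
            best_k, best_p = k, p_suc
    return best_k, best_p

# Illustrative numbers: 40 Mbit content, 50 m range, 9 Mbit/s, ~14 m/s speeds.
k, p = optimal_k([0.95, 0.9, 0.85, 0.8, 0.7], W=40.0, r_v=50.0, rate=9.0, mu=14.0, sigma=3.0)
print(k, round(p, 3))
```

With these numbers, a single chunk rarely fits in one contact window, while many chunks multiply in too many uncertain meetings, so an intermediate k wins, which is exactly the tension Algorithm 2 resolves.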

Analysis of Success Probability of Content Acquisition
In our strategy, vehicles cache content in their caching space based on content popularity; the caching space can accommodate at most N contents, and the probability that a vehicle caches the content ranked τ-th is f_τ(N). Assume that a vehicle V_x on lane i requests the content ranked τ-th in popularity, and that V_x is within or is about to enter the coverage area of R_m. As described in Section 3.1, there are five cases for the request vehicle to obtain the desired content: ➀ obtaining content from vehicles that are a one-hop distance away from the request vehicle in the same lane; ➁ obtaining content from vehicles that are a two-hop distance away from the request vehicle in the same lane; ➂ obtaining content from vehicles that are a one-hop or two-hop distance away from the request vehicle on the front opposite lane; ➃ obtaining content directly from an RSU; and ➄ obtaining content from a cluster at which content was cached by an RSU. The analysis of those five cases is as follows.
Obtaining content from vehicles that are a one-hop distance away from the request vehicle. Assume that request vehicle V_x drives behind content source vehicle V_y, with speeds s_x^i and s_y^i, respectively. The average transmission rate between vehicles is V_a, and the distance between the request vehicle and the content source vehicle is r_v/2. When s_x^i > s_y^i, the vehicles stay within range over a relative travel of 3r_v/2, so the condition of successful content transmission is (3r_v/(2(s_x^i − s_y^i))) · V_a ≥ W. When s_x^i < s_y^i, the gap grows from r_v/2 to r_v, so the condition of successful content transmission is (r_v/(2(s_y^i − s_x^i))) · V_a ≥ W. According to Formula (3), the probability that a vehicle on lane i can obtain content from another vehicle that is a one-hop distance away from itself can then be derived. Let K_1 denote the number of vehicles that are a one-hop distance away from the request vehicle in the same lane. The probability that a request vehicle on lane i can obtain the desired content from a vehicle that is a one-hop distance away from itself is derived accordingly.

Obtaining content from vehicles that are a two-hop distance away from the request vehicle. Assume that the distances from the relay vehicle to the request vehicle and to the content source vehicle are equal, namely 3r_v/4, so the probabilities of successful content transmission from the relay vehicle to the request vehicle and to the content source vehicle are also equal. When s_x^i > s_y^i, the condition of successful content transmission is (7r_v/(4(s_x^i − s_y^i))) · V_a ≥ W, which determines the feasible region. When s_x^i < s_y^i, the condition of successful content transmission is (r_v/(4(s_y^i − s_x^i))) · V_a ≥ W. According to Formula (3), the probability that a vehicle on lane i can obtain content through a relay vehicle is derived accordingly. The probability that the desired content fails to be cached by vehicles that are a one-hop distance away from the request vehicle is as follows: Let K_2 denote the number of vehicles that are a two-hop distance away from the request vehicle.
The probability that the desired content manages to be cached by vehicles that are a two-hop distance away from the request vehicle is as follows: The desired content will then be transferred to the request vehicle by a vehicle that is a two-hop distance away. According to Formulas (13)-(15), the probability that a request vehicle on lane i can obtain the desired content from a vehicle that is a two-hop distance away from itself is derived as follows:

P^nei_{i,τ} = P^ns_{i,τ} · P^en_{i,τ} · (P^two_i)². (16)

Obtaining content from vehicles that are a one-hop or two-hop distance away from the request vehicle on the front opposite lane. Let K_3 denote the number of vehicles that are a one-hop or two-hop distance away from the request vehicle in the same lane. The probability that the desired content fails to be cached by those K_3 vehicles is as follows: Let K_4 denote the number of vehicles that are a one-hop or two-hop distance away from the request vehicle on the front opposite lane; the probability that the desired content manages to be cached at those K_4 vehicles is as follows: Assuming that the request vehicle needs to obtain the desired content from content source vehicle V_c on opposite lane i + I, V_c chooses K_s vehicles that are a one-hop distance away from V_c to evenly cache the content chunks. As depicted in Section 3.2, those K_s vehicles selected to cache content chunks will meet the request vehicle with a probability of 1.
Assuming that the speeds of two vehicles in the same and opposite lanes are s_x^i and s_y^{i+I}, respectively, the condition of successful content chunk transmission between the two vehicles is (2r_v · V_a)/(s_x^i + s_y^{i+I}) ≥ W/K_s, which determines the feasible region. According to Formulas (3), (17), and (18), the probability that a request vehicle on lane i can obtain the desired content from a vehicle that is a one-hop or two-hop distance away from itself on the front opposite lane is derived as follows: where K_s denotes the number of vehicles selected to cache content chunks, and V_a denotes the average transmission rate between vehicles. Based on the discussion above, the probability that a vehicle on lane i can obtain the desired content without the participation of an RSU is as follows:

Obtaining content directly from an RSU. The authors of [22] indicated that the number of vehicles on a lane obeys the Poisson Point Process. Thus, the average driving distance of a vehicle in the coverage area of an RSU is assumed to be R_rsu. Assuming that the RSU serves the request vehicles within its coverage area with equal probability, the probability that the RSU transfers the desired content to a request vehicle is as follows: Assuming that the average rate at which an RSU transfers content to a vehicle is V_per, the condition for the request vehicle to successfully obtain the desired content from an RSU is s_i ≤ (R_rsu · V_per)/W.
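Under the truncated-normal speed model, the RSU condition s_i ≤ R_rsu · V_per/W turns into a closed-form probability via the truncated normal CDF. The sketch below uses illustrative numbers (not the paper's parameter values) and cross-checks the closed form against rejection-sampled speeds:

```python
import math
import random

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def trunc_norm_cdf(s, mu, sigma, s_min, s_max):
    """F(s) for a normal(mu, sigma) truncated to [s_min, s_max]."""
    lo, hi = phi((s_min - mu) / sigma), phi((s_max - mu) / sigma)
    s = min(max(s, s_min), s_max)
    return (phi((s - mu) / sigma) - lo) / (hi - lo)

# Illustrative: R_rsu = 300 m, V_per = 20 Mbit/s, W = 400 Mbit -> s <= 15 m/s.
mu, sigma, s_min, s_max = 14.0, 3.0, 5.0, 22.0
threshold = 300.0 * 20.0 / 400.0
p_direct = trunc_norm_cdf(threshold, mu, sigma, s_min, s_max)

# Cross-check with rejection-sampled truncated-normal speeds.
random.seed(0)
samples = []
while len(samples) < 20000:
    s = random.gauss(mu, sigma)
    if s_min <= s <= s_max:
        samples.append(s)
mc = sum(s <= threshold for s in samples) / len(samples)
print(round(p_direct, 3), round(mc, 3))
```

Slower vehicles spend more time inside the RSU's coverage, which is why the success condition is an upper bound on speed.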
According to Formulas (4) and (21), the probability that the request vehicle directly obtains the desired content from an RSU is derived as the following: Obtaining content from cluster with content cached by an RSU Assuming that R m proactively caches content at cluster C m,i+I and chooses k vehicles from C m,i+I to respectively cache content chunks based on the caching optimization algorithm depicted in Section 3.2, the condition that RSU successfully transfers content chunks to a vehicle is s i ≤ R rsu ·V per ·k 2W , and the prediction probability of those k vehicles constitute is set as {P pre i+I,1 , P pre i+I,2 , . . . , P pre i+I,k }. According to Formulas (3), (4), and (21), the probability that the request vehicle can successfully obtain the desired content from C m,i+I is derived as the following:  [22], the probability that a vehicle is in the coverage area of an RSU is According to Formulas (22)- (24), the probability that a vehicle can obtain desired content with the participation of the RSU is derived as the following: The average probability that a vehicle on lane i can successfully obtain desired content ranked τ − th is derived as the following:
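The chunk-transmission condition above can be checked numerically. The sketch below uses the 9 Mbps average V2V rate from the simulation settings, but the content size, communication range, and vehicle speeds are hypothetical placeholders chosen only to illustrate why splitting the content into $K_s$ chunks helps:

```python
def contact_time(r_v, s_same, s_opp):
    """Contact duration between two vehicles on opposite lanes: they
    stay within range over a 2*r_v stretch while closing at the sum
    of their speeds."""
    return 2 * r_v / (s_same + s_opp)

def chunk_delivery_ok(W, K_s, V_a, r_v, s_same, s_opp):
    """A chunk of size W/K_s succeeds if it fits into one contact:
    W / (K_s * V_a) <= 2*r_v / (s_same + s_opp)."""
    return W / (K_s * V_a) <= contact_time(r_v, s_same, s_opp)

# Hypothetical numbers: 200 Mb content, 9 Mbps V2V rate (from the
# simulation settings), 300 m range, speeds 15 m/s and 20 m/s.
W, V_a, r_v = 200e6, 9e6, 300
s_x, s_y = 15.0, 20.0
# Without chunking (K_s = 1) the single contact is too short ...
print(chunk_delivery_ok(W, 1, V_a, r_v, s_x, s_y))   # False
# ... while splitting across two relays makes each chunk fit.
print(chunk_delivery_ok(W, 2, V_a, r_v, s_x, s_y))   # True
```

With these numbers the contact lasts about 17 s, while the whole content needs about 22 s at 9 Mbps; each half-sized chunk needs only about 11 s, so two successive relay contacts suffice.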

Parameter Settings
In order to evaluate the performance of the proposed strategy, we used SUMO to simulate an urban road environment, where vehicles moved along urban roads as in real scenes. The vehicle arrival rate was 600 vehicles/h, and the maximum vehicle speed was 80 km/h. An interface named TraCI (Traffic Control Interface) was used to connect SUMO to Python, and 3 hours of vehicle movement information was collected by Python through this interface. Python used the movement information to run the algorithm simulation and to compare the simulation results with those described by the authors of [19]. In our simulation, the channel model is expressed as $\beta \cdot P_t \cdot r^{-\alpha}$, with a path loss exponent $\alpha$ of 4, a channel fading gain $\beta$ of $10^{-2}$, a channel bandwidth $B$ of 1.5 MHz, and a Gaussian noise power $P_n$ of $-110$ dBm. The transmission rate of the network can be derived from the Shannon formula. The average rate between vehicles was set to 9 Mbps, and the average rate between a vehicle and an RSU was set to 20 Mbps. Other simulation parameters are shown in Table 2. The Average Delay describes the average time between the moment a vehicle sends a request and the moment it obtains the desired content. The Request Success Ratio describes the average ratio derived by dividing the number of successful requests by the total number of requests. Table 2. Simulation parameters.
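As a worked instance of the rate derivation, the sketch below plugs the stated channel parameters into the Shannon formula. The transmit power $P_t$ and the distance $r$ are not listed in the text, so the values used here are assumed placeholders:

```python
import math

def shannon_rate(B, P_t, r, alpha=4, beta=1e-2, P_n_dbm=-110):
    """Link rate under the stated channel model: the received power
    is beta * P_t * r**(-alpha), and the rate follows from the
    Shannon formula B * log2(1 + SNR)."""
    P_n = 10 ** (P_n_dbm / 10) / 1000   # noise power, dBm -> watts
    snr = beta * P_t * r ** (-alpha) / P_n
    return B * math.log2(1 + snr)

# B = 1.5 MHz, alpha = 4, beta = 1e-2, P_n = -110 dBm as in the text;
# P_t = 0.1 W and r = 100 m are assumed for illustration.
rate = shannon_rate(B=1.5e6, P_t=0.1, r=100)
print(f"{rate / 1e6:.2f} Mbps")   # roughly 15 Mbps at this distance
```

The rate falls quickly with distance because of the fourth-power path loss, which is consistent with the short V2V contact windows analyzed above.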


Results and Analysis
Figures 7 and 8 show the variations of the Average Delay and the Request Success Ratio in the case of different content sizes. As illustrated in the graphs, the request delay of vehicles increased and the probability of content success transmission decreased as the content size increased. This is because larger content requires more time to be transferred by V2V or I2V, and the transmission process fails more easily. Compared with the strategy described by the authors of [19], our strategy has a lower request delay and a higher probability of content success transmission, and the gap between the two strategies increases as content size increases. Because our strategy caches content chunks at multiple vehicles in a cluster, request vehicles can obtain content chunks from vehicles in the cluster one by one, which increases the transmission duration between vehicles. Our strategy also adopts the multiuser multichannel transmission mode in the communication process between clusters to solve the problem of the communication duration being too short to complete the content transmission. Figure 8 also demonstrates that the theory value and the simulation value of the Request Success Ratio were similar to each other. The former was slightly higher than the latter, and the gap increased as content size increased. This is because, in our theory analysis, we assumed that the processes of the request vehicle obtaining content chunks from different relay vehicles in the cluster were independent. However, as the content size increases, the process between the request vehicle and a relay vehicle may have to wait until the previous process is over, which makes the theory value higher than the simulation value. Figures 9 and 10 show the variations of the Average Delay and the Request Success Ratio in the case of different vehicle caching sizes. Vehicle caching size describes the maximum capacity of a vehicle's caching space.
As depicted in the figures, the request delay of vehicles decreased and the probability of content success transmission increased as the caching size increased. This is because increasing the caching size enlarges the probability that a vehicle can find desired content in neighboring vehicles. Compared with the strategy described by the authors of [19], our strategy has a lower request delay and a higher probability of content success transmission. Because our strategy caches content chunks at multiple vehicles in a cluster, request vehicles can obtain content chunks from vehicles in the cluster one by one, which increases the transmission duration between vehicles. Our strategy also adopts the multiuser multichannel transmission mode in the communication process between clusters to solve the problem of the communication duration being too short to complete the content transmission. The probability that a vehicle can obtain content with the participation of the RSU decreases as vehicle caching size increases, which weakens the advantages of our strategy, so the gap in performance between the two strategies decreases as caching size increases. Figure 10 also indicates that the theory value and the simulation value of the Request Success Ratio were similar to each other. The former was slightly higher than the latter, and the gap increased as the vehicle caching size increased. This is because the probability that a vehicle will obtain content from neighboring vehicles increases as vehicle caching size increases. In our theory analysis, we assumed that the relay vehicle was situated in the middle position between the request vehicle and the content source vehicle, which makes the theory value higher than the simulation value.
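The effect described here, namely that splitting the content so that each contact only needs to carry one chunk raises the success probability, can be illustrated with a small Monte Carlo sketch. The content size, speed range, and uniform speed draw below are illustrative simplifications, not the paper's statistical speed model:

```python
import random

def success_ratio(K_s, W=200e6, V_a=9e6, r_v=300, trials=10000, seed=1):
    """Monte Carlo estimate of delivery success when the content is
    split into K_s chunks cached at K_s relay vehicles met one by one.
    Each contact must be long enough to carry one chunk of size W/K_s."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        # contact durations of the K_s successive meetings, with the
        # relative closing speed drawn uniformly from 20-45 m/s
        contacts = [2 * r_v / rng.uniform(20, 45) for _ in range(K_s)]
        if all(t >= W / (K_s * V_a) for t in contacts):
            ok += 1
    return ok / trials

print(success_ratio(1))   # single source: the contact is often too short
print(success_ratio(4))   # chunked relays: each chunk fits far more often
```

Under these assumed parameters, a single contact frequently cannot carry the whole content, while four successive relay contacts almost always carry their quarter-sized chunks, mirroring the gap between the two strategies in the figures.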
Figures 11 and 12 show the variations of the Average Delay and the Request Success Ratio in the case of different Zipf distribution parameters. As depicted in the graphs, the request delay of vehicles decreased and the probability of content success transmission increased as the parameter increased. This is because increasing the Zipf distribution parameter increases the probability that a vehicle can find desired content in neighboring vehicles. Compared with the strategy described by the authors of [19], our strategy possesses a lower request delay and a higher probability of content success transmission. Because our strategy caches content chunks at multiple vehicles in a cluster, request vehicles can obtain content chunks from vehicles in the cluster one by one, which increases the transmission duration between vehicles. Our strategy also adopts the multiuser multichannel transmission mode in the communication process between clusters to solve the problem of the communication duration being too short to complete the content transmission. The probability that a vehicle can obtain content with the participation of the RSU decreased as the Zipf distribution parameter increased, which weakened the advantages of our strategy, so the gap in performance between the two strategies decreased as the parameter increased. Figure 12 also illustrates that the theory value and the simulation value of the Request Success Ratio were similar to each other. The former was slightly higher than the latter, and the gap increased as the Zipf distribution parameter increased. This is because the probability that a vehicle will obtain content from neighboring vehicles increases as the Zipf distribution parameter increases. In our theory analysis, we assumed that the relay vehicle was situated in the middle position between the request vehicle and the content source vehicle, which makes the theory value higher than the simulation value.
Figure 11. The relationship between Average Delay and Zipf distribution parameter.
Figure 12. The relationship between Request Success Ratio and Zipf distribution parameter.
Appl. Sci. 2021, 11, 2157

Conclusions
In this paper, we proposed the idea of predicting the moving trajectory of a cluster based on vehicle behavior prediction at intersections, whereby the driving behavior of the vehicles in a cluster determines the moving behavior of the cluster. On this basis, the RSU divides the content into multiple chunks and proactively caches those chunks at multiple relay vehicles in a cluster that is about to meet the request vehicle. By letting the request vehicle obtain content chunks from the relay vehicles one by one, our strategy enlarges the communication duration. Our paper also optimizes the number of chunks to maximize the probability that the request vehicle will successfully obtain the content from the cluster. Besides, our paper adopts the multiuser multichannel transmission mode in the communication process between clusters. Our simulation results demonstrate that the proposed strategy can improve the performance of the vehicular network.
Our paper uses trajectory-predicted vehicle clusters to deliver content. However, several issues remain; for example, the caching optimization algorithm does not consider the factor of distance. Future work should explore this issue.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data and codes presented in this study are available from the corresponding author by request.

Conflicts of Interest:
The authors declare no conflict of interest.