A Novel Base-Station Selection Strategy for Cellular Vehicle-to-Everything (C-V2X) Communications

Abstract: Cellular vehicle-to-everything (C-V2X) communication improves the safety, comfort, and efficiency of vehicles and mobility by exchanging information between vehicles and other entities. Conventionally, either the macrocell alone or the femtocell alone serves as the communication infrastructure for C-V2X. More recently, the macro-femtocell network has been used as a new C-V2X networking architecture. However, there are two unresolved problems for C-V2X in macro-femtocell networks. Firstly, vehicle mobility requires frequent switching of connections between different base stations, and invalid switching degrades communication quality. Secondly, unintelligent base station selection causes network congestion and network-load imbalance. To address these challenges, this paper proposes a base station selection strategy based on a Markov decision policy for a vehicle in a macro-femtocell system. Firstly, we present a mechanism to predict the received signal strength (RSS) for base station selection. Secondly, we present a comparison-based Markov decision policy algorithm for C-V2X. To the best of our knowledge, this is the first attempt to use predicted RSS with a Markov decision policy in C-V2X technology. To validate the proposed mechanism, we simulated the traditional base station selection and our proposal with the vehicle moving at different speeds. The simulation demonstrates that the effectiveness of the traditional base station selection policy is obvious only at high speeds, and that this weakness can be resolved by our proposal. We then compare our solution with the traditional base station selection policy. The simulation results show that our solution is effective at switching connections between base stations and can effectively prevent the overloading of network resources.


Introduction
Cellular vehicle-to-everything (C-V2X) wireless communication technology is a new generation of communication technology that connects vehicles to everything. Compared with traditional vehicle-to-vehicle communications, C-V2X could leverage the comprehensive coverage of secure and well-established networks. It could also support highly reliable, real-time communication services at high speeds and in high-density traffic. Ultimately, C-V2X will play a pivotal role in developing automated-driving technology [1].
C-V2X acquires information over wireless cellular technology and analyzes the collected information to derive correct instructions. Wireless cellular technology uses cellular base stations as relays for data transmission among vehicles, and between vehicles and data centers, over a wide area. Normally, there are two kinds of base stations in wireless cellular technology. Macrocell base stations have high transmission power and wide coverage and can meet a wide range of user requirements. However, macrocell signals are often blocked by buildings as the vehicle moves, and they also degrade in traffic jams and dense traffic because of the large number of users; as a result, the macrocell signal suffers heavy interference. Setting up a femtocell base station in a blind spot of the macrocell base station can therefore improve signal transmission quality in that area. Because the RSS decreases as the distance between the vehicle and the base station increases, the RSS obtained from a nearby femtocell base station is higher and more stable. However, the femtocell base station has a lower transmit power, so its coverage is limited. With the cooperation of the two kinds of base stations, a macro-femtocell system can provide stable signal transmission services to C-V2X. When the vehicle is in motion, we need to select the appropriate base station to provide an efficient transmission service.
In previous work, a reinforcement learning-based Markov decision policy for base station selection was studied in [2]. The differences between the previous work and this paper are as follows. Firstly, this paper considers that femtocell base stations have a usage limit. Secondly, the previous work did not consider the random mobility of vehicles, while we predict the position of the vehicles' next action. Finally, in this system, each vehicle's action is independent, and the choices made by each vehicle affect the communication quality of the system. We need to have an optimal base station selection strategy according to the Markov decision policy.
When choosing a base station, we face the following problems: (1) For automated vehicles, there is often more than one feasible path in a complex environment because of the randomness of automated movement. It is therefore difficult to predict a user's next position. (2) With inefficient base station selection, a femtocell base station can become overloaded, which generates unnecessary interference and signal interruption.
To address the above problems, we propose a method to formulate base station selection as a Markov decision problem. The contributions of our work can be summarized as follows: (1) Predict the RSS from two kinds of base stations based on the current state of the vehicle. (2) Use the Markov decision policy to predict all base station selection possibility sequences and obtain the most appropriate selection sequence.
In this paper, we mainly consider the prediction of multi-vehicle base station selection. If combined with the analysis of the existing state of the vehicle, according to our proposed base station selection strategy, it can also be used for the prejudgment of a handover option with each vehicle in the system. The remainder of this paper is organized as follows. In Section 2, we introduce the related work. In Section 3, background and preliminaries are presented. In Section 4, we propose a Markov decision policy mechanism for multiple vehicles. In Section 5, we put forward a method to predict RSS. In Section 6, the Markov decision policy topology is presented. The simulation results are shown and discussed in Section 7. Finally, we present our conclusions and outline future work.

Related Work
Many solutions have been proposed in related research. The work in [3] provided insight into the challenges posed by short-range vehicle-to-vehicle (V2V)-based vehicular sensor networks (VSNs), from economic motivations to requirements and related enabling technologies; it also presented an overview of potential mandatory rules that are being discussed at an international level to integrate connectivity on vehicles. The work in [4] described the development of vehicular communications. The development from dedicated short-range communications technology to C-V2X provides many solutions for improving vehicle communication systems. The work in [5] used sensors to obtain real-time traffic information to ensure vehicle safety. It is an optimization of dedicated short-range communications technology that can effectively avoid traffic jams in a limited area. However, short-range communication cannot handle long-distance congestion-avoidance problems. Therefore, we consider it necessary to supplement traffic information with cellular communications. The work in [6] provided long-distance transmissions using cellular communication systems, and an intelligent transportation system was realized by load-mapping and queuing theory, which can effectively cope with congestion problems. The work in [7] showed, through analysis and simulation, that C-V2X (the technology developed within the Third-Generation Partnership Project (3GPP) and designed to operate in both vehicle-to-vehicle and vehicle-to-network modes) is the prominent technology that can achieve V2X requirements and pave the most efficient way to connected and automated driving. The work in [8] analyzed the parameters of the algorithm designed by 3GPP and their impact on system performance. Through simulations in different large-scale scenarios, the authors showed that modifying some parameters has a negligible effect, that the proper choice of others can indeed improve quality of service, and that a group of parameters allows trading reliability for update delay.
Due to bandwidth limitations in the real world, network congestion may be unavoidable. Therefore, we need to improve the antenna to help expand the bandwidth. The work in [9] explained how to mount an antenna on a vehicle, as communication quality can be improved with the vehicle antenna. This improvement, however, is limited. Only antenna base stations can effectively improve a network environment. The macro-femtocell system provides us with a solution. How to allocate network resources effectively and avoid network delay in the whole system is introduced in [10].
The use of a biomimetic approach for self-organization in a macro-femtocell system was introduced in [11]. Docitive radios were used to train the system on their policies when a new node joined the network. The self-organization design could adapt macrocell base station tilts. The work in [12] provided a solution for femtocell base station handover; the authors advised expanding antenna coverage to improve system performance. A distance-based handover scheme was discussed in [13], in which the threshold could be calculated from the antennas' overlapping area. For the handover decision between femtocell base stations, the work in [14] formulated a Markov decision problem to obtain a channel allocation policy and proposed the optimal antenna handover decision.

Macro-Femtocell System
The femtocell was invented to compensate for blind spots and edges in the transmission coverage of the macrocell base station, and it improves the service quality of the network [15]. Hence, we chose a combined model with one macrocell base station and several femtocell base stations. The macrocell base station serves as one of the transmission stations that vehicles can select, and it also plays the role of overall control [16]. All uplink requests and resource allocations are processed through the macrocell base station. The femtocell base stations are placed within the coverage of the macrocell base station [17]. On the one hand, this approach can strengthen the surrounding signal strength. On the other hand, it can share the transmission load of the macrocell base station. Thus, we set the macrocell base station to occupy the dominant position; it also handles data statistics and the unified deployment of communication resources [18].

Game Theory
Game theory considers the predictive and actual behavior of individuals in a game and studies their optimization strategies [19]. Different interactions on the surface may exhibit similar incentive structures, so they are a special case of the same game [20]. As in the choice of base station, we need to consider the actual and predictive behavior of the vehicles and show similar incentive structures through different interactions between macrocell and femtocell base stations [21,22].

Markov Decision Policy
In probability theory and statistics, the Markov decision policy provides a mathematical framework for making decisions in situations where outcomes are partly random and partly controlled by the decision-maker [23]. For a system, there is a transition probability from one state to another, and this transition probability can be derived from the immediately preceding state [24,25]. Therefore, we determine the best base station selection strategy by combining the transfer function of the base station condition in the macro-femtocell system [26].

Assumptions
As shown in Figure 1, the system compares the RSS from all base stations when the vehicle is in motion. According to the compared RSS report from the vehicle, the system allocates the matching antenna resource to it. Although the base stations broadcast their signals, a user can select only one base station as its transmission base station [27]. In this paper, we adopt a dedicated bandwidth-allocation strategy. The bandwidths of the macrocell and femtocell base stations are different, and adjacent femtocell base stations are also allocated different bandwidths. Thus, the transmissions of the macrocell and femtocell base stations do not interfere with each other. We mainly discuss the base station selection of vehicles in the case of one macrocell and one femtocell base station [28].
Suppose there are $n$ vehicles in the macro-femtocell system. The transmission power of the macrocell base station is $E_M$, and the transmission power of the femtocell base station is $E_F$. The RSS of the $j$-th ($j = 1, 2, 3, \dots, n$) vehicle can be expressed as in Equation (1) [29], where $RSS_j^M$ represents the signal strength that the $j$-th vehicle receives when it treats the macrocell base station as its transmission station, and $RSS_j^F$ represents the signal strength that the vehicle receives when it selects the femtocell base station. $d_j^M$ represents the distance between the vehicle and the macrocell base station, and $d_j^F$ represents the distance between the vehicle and the femtocell base station; $h_M$ and $h_F$ denote the channel gains from the macrocell and femtocell base stations, respectively; $\beta$ denotes the path-loss exponent. In the path-loss model, we apply the Okumura-Hata model for the large-scale fading channel. $\xi$ represents the shadow fading and is a Gaussian-distributed random variable with a mean of zero and a standard deviation of up to 12 dB, where 12 dB is the fading tolerance [30].
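To make the roles of these symbols concrete, the following Python sketch assumes a simple log-distance form, $RSS = E \cdot h \cdot d^{-\beta} \cdot 10^{\xi/10}$ with $\xi$ in dB. This form, the function names, and the default values are assumptions for illustration only; they are not taken from Equation (1), and the Okumura-Hata model itself is not implemented here.

```python
import random

def received_signal_strength(tx_power, gain, distance, beta, shadow_db=0.0):
    """Assumed log-distance model: E * h * d**(-beta) * 10**(xi / 10).

    tx_power  -- transmission power E of the base station (linear units)
    gain      -- channel gain h
    distance  -- distance d between vehicle and base station (must be > 0)
    beta      -- path-loss exponent
    shadow_db -- shadow fading xi in dB (0 means no shadowing)
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return tx_power * gain * distance ** (-beta) * 10 ** (shadow_db / 10.0)

def sample_shadowing(sigma_db=12.0, rng=random):
    """Draw zero-mean Gaussian shadow fading in dB (sigma is an assumption)."""
    return rng.gauss(0.0, sigma_db)

# Hypothetical numbers: the same vehicle evaluated against both base stations.
rss_macro = received_signal_strength(tx_power=40.0, gain=1.0, distance=500.0, beta=3.0)
rss_femto = received_signal_strength(tx_power=0.1, gain=1.0, distance=20.0, beta=3.0)
```

Under this assumed form, a low-power femtocell can still deliver the higher RSS when the vehicle is much closer to it, which is the situation the macro-femtocell system is meant to exploit.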

Methodology of the Markov Decision Policy
The basic idea of base station selection is illustrated in Figure 2. The objective of this paper is to provide base station selection for vehicles in C-V2X, with a strategy based on the Markov decision policy. To achieve this goal, we divided the objective into two subproblems. The first is to provide a strategy that supports base station selection. The second is to provide an approach to the complicated decision problem in base station selection. To solve the first subproblem, we propose a method to predict the probability of the displacement in the next second and obtain the corresponding RSS. To address the second subproblem, we use the redundancy of the femtocell base station as the probability of selecting the femtocell base station.

[Figure 2. Basic idea of base station selection: the objective, Subproblems 1 and 2, and the corresponding Methods 1 and 2.]

Figure 3 illustrates the workflow of the Markov decision policy. Firstly, the speeds and directions of motion of all vehicles are transmitted to the data center. Then, the expected RSS from the two kinds of base stations is obtained from the current vehicle state using our proposed method. In the Markov decision policy, the two signal strengths serve as the reward for base station selection. Next, all vehicles are arranged in order and compete for the femtocell base station resources. Finally, the best selection strategy is obtained by comparing the different permutations and combinations.

Markov Decision Policy
In the Markov decision policy, we treat each vehicle as a competitor. We enumerate all orderings of the vehicles and let the vehicles select base stations in sequence. The choice of the previous vehicle affects the transition probability of the next vehicle's selection. In other words, vehicles that take femtocell base station resources reduce the opportunity of later vehicles to choose the femtocell base station. The process is shown in Algorithm 1.
For all vehicles, the opportunity to select the macrocell or femtocell base station is the same. We had all vehicles compete for femtocell base station resources. Vehicles that prefer the femtocell base station first have an advantage in receiving femtocell base station services. Other vehicles have fewer opportunities to use the femtocell base station. Signal strength obtained by vehicles from the macrocell and femtocell base stations is predicted by the vehicles' position in the next second. To reflect the priority of vehicle selection, we use the Markov decision policy to choose.
(1) Sort the $n$ vehicles in every possible order. The first vehicle in each ordering selects the femtocell base station, so there is a total of $n!$ orderings.
(2) The remaining vehicles have the same opportunity to choose the macrocell or femtocell base station. Therefore, there are $n! \cdot 2^{n-1}$ possible selection sequences.
(3) Obtain the transition probability of each user's choice.
(4) The corresponding predicted RSS and transition probability are treated as the value function of the vehicle.
(5) The sum of the value functions over a sequence is the sequence value of that selection sequence.
(6) The sequence with the highest value is the optimal selection sequence, and the base station assigned to each vehicle in that sequence is its best base station selection for the next second.
To clearly describe the above mechanism, we provide an example in Figure 4. The numbers denote the marked vehicles (e.g., 1, 2, 3, ..., $n$). M means that the vehicle selected the macrocell base station, and F means that the vehicle selected the femtocell base station. Here, we emphasize that the base station selected by the $j$-th vehicle determines the corresponding $RSS_j$.
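The enumeration described above can be sketched in a few lines of Python. This is only an illustration of the counting argument; the function name `enumerate_selection_sequences` and the tuple representation are assumptions, not part of Algorithm 1.

```python
from itertools import permutations, product

def enumerate_selection_sequences(n):
    """Yield (order, choices) pairs for n vehicles labelled 1..n.

    order   -- one permutation of the vehicles
    choices -- one base station per position; the first is always "F",
               the remaining n - 1 positions are free to be "M" or "F"
    """
    vehicles = range(1, n + 1)
    for order in permutations(vehicles):
        for rest in product("MF", repeat=n - 1):
            yield order, ("F",) + rest

sequences = list(enumerate_selection_sequences(3))
print(len(sequences))  # 3! * 2**2 = 24
```

Because the number of sequences grows as $n! \cdot 2^{n-1}$, exhaustive enumeration like this is only practical for small $n$.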

Predict Received Signal Strength
As the $j$-th vehicle moves randomly in a macro-femtocell system, we need to predict its location at the next step. As Figure 5 shows for the system-provided path, a vehicle is very likely to keep moving in the direction of its previous step and very unlikely to move in the opposite direction. Suppose the vehicle moves from Position a to Position b in one second; this displacement can be defined as the vector $\vec{ab}$. It is assumed that the average mobility velocity of the vehicle is $r$ m/s, according to the statistics of the macrocell base station database. In the following second, the possible Position c of the vehicle can then be described as a circle whose center is Position b and whose radius is $r$. We assume that the probability that the user moves forward along the direction of $\vec{ab}$ in the next second is high ($P_H$), and that the probability of moving opposite to $\vec{ab}$ is low ($P_L$). For the remaining directions, turning left or right, we set the probability $P_M$, which means that this area has normal probability. $P_H$, $P_M$, and $P_L$ are shown in Figure 5 and satisfy the relationship $P_H > P_M > P_L$.
Thus, for the $j$-th vehicle, the displacement probability function can be expressed as:
When the $j$-th vehicle chooses the femtocell base station, $F_j^H$ represents the high probability of the vehicle's displacement, $F_j^M$ represents the normal probability of the vehicle's displacement, and $F_j^L$ represents the low probability of the vehicle's displacement in the range of values. When the vehicle chooses the macrocell base station, the probability set is given by $M_j^H$, $M_j^M$, and $M_j^L$. In Figure 6, we define the current position of the $j$-th vehicle as the origin $(0, 0)$ and take any direction as the positive direction of the polar co-ordinates. $\theta$ represents the angle between the vehicle and the positive direction; $\theta_j^0$ represents the angle between the previous motion step and the positive direction; $\theta_j^M$ represents the angle between the macrocell base station and the positive direction; and $\theta_j^F$ represents the angle between the femtocell base station and the positive direction. The polar co-ordinates of the macrocell base station can be expressed as $(D_j^M, \theta_j^M)$, and the polar co-ordinates of the femtocell base station as $(D_j^F, \theta_j^F)$. The displacement vector to the next second can be expressed as $(r_j, \theta_j)$, and the displacement vector of the previous step as $(r_j, \theta_j^0)$.
According to Formulas (1) and (2), we can obtain the power-density function:
Thus, the transition-reward function can also be expressed as:
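The geometry in this polar frame can be sketched as follows. With the vehicle at the origin, the distance from its next position $(r_j, \theta_j)$ to a base station at $(D, \theta_{BS})$ follows from the law of cosines. The second function assigns $P_H$, $P_M$, or $P_L$ by angular sector; the 45-degree sector boundaries and both function names are assumptions for illustration, since the sector limits are defined in Figure 5 and Formula (2) rather than in the text.

```python
import math

def distance_after_move(bs_distance, bs_angle, step, step_angle):
    """Distance from the next position to a base station (law of cosines).

    The vehicle starts at the origin, the base station is at polar
    co-ordinates (bs_distance, bs_angle), and the vehicle moves by
    (step, step_angle). Angles are in radians.
    """
    return math.sqrt(
        bs_distance ** 2 + step ** 2
        - 2 * bs_distance * step * math.cos(step_angle - bs_angle)
    )

def direction_probability(step_angle, prev_angle, p_high, p_mid, p_low):
    """Assumed sectors: within 45 degrees of the previous heading -> p_high,
    within 45 degrees of the opposite heading -> p_low, otherwise p_mid."""
    # Wrap the angular deviation from the previous heading into [0, pi].
    deviation = abs((step_angle - prev_angle + math.pi) % (2 * math.pi) - math.pi)
    if deviation <= math.pi / 4:
        return p_high
    if deviation >= 3 * math.pi / 4:
        return p_low
    return p_mid

# Hypothetical case: base station 100 m ahead, vehicle moves 10 m straight at it.
d_next = distance_after_move(100.0, 0.0, 10.0, 0.0)
```

Feeding `d_next` into an RSS model and weighting by `direction_probability` over candidate angles is one way to approximate the expected RSS for the next second.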

Markov Decision Policy Topology
In the Markov decision policy, future states depend only on the current state and not on earlier states. To apply the Markov decision policy to the base station selection process, we have to define the network states, actions, transition probability, and value function. By comparing the combined value functions, we can make an optimal decision. Details are described as follows.

States
In state $S$, the $n$ vehicles are sorted in every possible order, and the first vehicle in each ordering selects the femtocell base station. Hence, state $S$ contains a total of $n!$ orderings.

Action
When the vehicle moves, the macrocell base station allocates the corresponding resource according to the current state $S$. In this paper, action $A$ can be expressed as $A = \{M, F\}$, where M means that the system selects the macrocell base station as the communication base station, while F indicates that the system selects the femtocell base station as the communication base station.

Transition Probability
The transition probability depends only on the current state and not on earlier states. Because of the low transmit power of the femtocell base station, the number of users it can connect is limited. To avoid the femtocell base station reaching its upper usage limit, we use the redundancy probability of the femtocell base station as the transition probability.
The $n$ vehicles are sorted, and the base stations are selected in order. To ensure that the usage rate of the femtocell base station is not zero, the first vehicle selects the femtocell base station. The utilization rate of the femtocell base station at the $i$-th ($i \in \{1, \dots, n\}$) choice is $P_i$, and the redundancy probability is $1 - P_i$. Then, at the $(i+1)$-th choice, the selection probability of the macrocell base station is $P_i$, and the selection probability of the femtocell base station is $1 - P_i$. The usage rate $P_i$ of the femtocell base station can be expressed as $P_i = f_i / F$, where $f_i$ indicates the number of users connected to the femtocell base station at the $i$-th choice and $F$ indicates the maximum number of vehicles that can connect to the femtocell base station. If the vehicle selects the macrocell base station at the $i$-th choice, then at the $(i+1)$-th choice the probability of selecting the macrocell base station is $P_{i+1} = P_i$, and the probability of selecting the femtocell base station is still $1 - P_i$. If the vehicle selects the femtocell base station at the $i$-th choice, then at the $(i+1)$-th choice the probability of selecting the macrocell base station is $P_{i+1} = \frac{f_{i-1} + 1}{F}$, and the probability of selecting the femtocell base station is $1 - \frac{f_{i-1} + 1}{F}$.
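The update rule can be sketched as a small Python helper. It assumes the usage rate is the ratio of connected vehicles to femtocell capacity and that the count used for a choice is the one in effect before that choice is made; the names `macro_probability` and `apply_choice` and the cap at full capacity are illustrative assumptions.

```python
def macro_probability(connected, capacity):
    """Femtocell usage rate, used as the probability of choosing the macrocell."""
    if capacity <= 0 or not 0 <= connected <= capacity:
        raise ValueError("need 0 <= connected <= capacity and capacity > 0")
    return connected / capacity

def apply_choice(connected, capacity, choice):
    """Return (probability of this choice, updated femtocell count).

    choice -- "M" for macrocell or "F" for femtocell
    """
    p_macro = macro_probability(connected, capacity)
    if choice == "M":
        # Choosing the macrocell leaves the femtocell usage rate unchanged.
        return p_macro, connected
    if choice == "F":
        # Choosing the femtocell uses up one unit of its redundancy.
        # Assumption: the count is capped once the femtocell is full.
        return 1.0 - p_macro, min(connected + 1, capacity)
    raise ValueError("choice must be 'M' or 'F'")

# Hypothetical femtocell with capacity 4 after the first vehicle has joined it.
prob, count = apply_choice(connected=1, capacity=4, choice="F")
```

In this example the second vehicle joins the femtocell with probability 0.75, after which the macrocell probability for the third vehicle rises to 0.5.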

Value Function
In the Markov decision policy, each state selection corresponds to a probability, and each selection made corresponds to a reward. Multiplying the probability by the reward therefore gives the expectation of that choice. The quality of a sequence of choices is judged by summing the expectation of each choice, which is the value function. In this paper, the value function can also be understood as the expected received-signal-strength sum of the vehicle-selection state in the system. The reward is obtained after the vehicle makes its choice, and the reward function can be treated as the RSS, as Formulas (1) and (2) show. Therefore, during vehicle operation, the best strategy $\pi$ chosen by the base station is the one that maximizes this value function. We use the predicted signal strength at the next location as the reward and the redundancy of the femtocell base station as the probability to determine the correct base station selection. This not only guarantees the quality of signal transmission, but also avoids excessive load on the femtocell base station.
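Putting the pieces together, the value of one selection sequence and the search for the best sequence can be sketched as below. The sketch assumes that the forced first femtocell choice has probability 1, that predicted RSS values are supplied as plain numbers, and that exhaustive search is acceptable; the function names and the sample values are hypothetical.

```python
from itertools import permutations, product

def sequence_value(order, choices, rss, capacity):
    """Sum of probability * predicted RSS over one selection sequence.

    rss maps each vehicle to {"M": predicted RSS, "F": predicted RSS}.
    """
    connected, value = 0, 0.0
    for position, (vehicle, choice) in enumerate(zip(order, choices)):
        if position == 0:
            prob = 1.0  # assumption: the first vehicle always takes the femtocell
        else:
            p_macro = connected / capacity  # femtocell usage rate so far
            prob = p_macro if choice == "M" else 1.0 - p_macro
        if choice == "F":
            connected = min(connected + 1, capacity)
        value += prob * rss[vehicle][choice]
    return value

def best_sequence(rss, capacity):
    """Exhaustively search all n! * 2**(n-1) sequences for the highest value."""
    vehicles = sorted(rss)
    n = len(vehicles)
    candidates = (
        (order, ("F",) + rest)
        for order in permutations(vehicles)
        for rest in product("MF", repeat=n - 1)
    )
    return max(candidates, key=lambda c: sequence_value(c[0], c[1], rss, capacity))

# Hypothetical predicted RSS values for two vehicles and a femtocell of capacity 1.
example_rss = {1: {"M": 1.0, "F": 5.0}, 2: {"M": 2.0, "F": 3.0}}
order, choices = best_sequence(example_rss, capacity=1)
```

With these sample values, the vehicle that gains the most from the femtocell is placed first and the other vehicle is steered to the macrocell once the femtocell is full. The factorial search space limits this brute-force form to small vehicle counts.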

Simulation Results
Due to the limited speed of the vehicle, if the vehicle is close to the macrocell base station, the probability of a handover to the femtocell base station in the next second is small. Similarly, if the vehicle is very close to the femtocell base station, it is unlikely to move toward the macrocell base station in the next second and will tend to prefer the femtocell base station.
In the existing selection method, the user selects whichever of the two base stations has the larger predicted RSS. Therefore, we mainly discuss the choice of vehicles at critical locations in the coverage of the macrocell and femtocell base stations. Since the vehicle reward function depends on the running speed and the motion direction of the vehicle, the base station selection of the vehicle varies with running speed and motion direction. In this critical area, we compare the average RSS in the system under the Markov decision policy and under the non-Markov decision policy when the number of vehicles selecting the femtocell base station exceeds the user limit.
We assumed that there were 150 randomly distributed vehicles in the coverage of the femtocell base station, which can serve up to 100 vehicles. We set vehicle speeds to 2, 4.5, and 30 m/s. For each speed setting, the marked dots in the figures represent the values obtained by the simulation, and the lines show the normalized values.
We simulated the base station selection problem using MATLAB. All parameters used in this simulation are presented in Table 1. Here, we set $h_M = h_F$. We first assumed a mobility speed of 2 m/s (low speed), with $\theta^0$, the user's displacement angle of the previous step, set randomly. In Figures 7-9, the three colors represent different locations of the macrocell and femtocell base stations. Figure 7 shows the expected RSS at low speed when the macrocell base station is chosen, and Figure 8 shows the expected RSS at low speed when the femtocell base station is chosen. Under the traditional base station selection policy, the system chooses the base station with the larger expected RSS. As Figure 9 shows, the expected signal strength of the macrocell and femtocell base stations remained stable at low speed. This means that the advantage of the traditional base station selection policy is not obvious in a low-speed environment: as the results in Figures 7 and 8 show, traditional base station selection cannot noticeably change the RSS under low-speed mobility.
We then changed $r$ to 4.5 m/s (middle speed), with $\theta^0$ again set randomly; the three colors represent different locations of the macrocell and femtocell base stations. Figure 10 shows the expected RSS at middle speed when the macrocell base station is chosen, and Figure 11 shows the expected RSS at middle speed when the femtocell base station is chosen. As shown in Figure 12, there was a small efficiency improvement compared with the traditional base station selection policy at middle speed.
Then, we assumed that the mobility speed reached 30 m/s (high speed), with $\theta^0$ again set randomly. The three colors represent different locations of the macrocell and femtocell base stations. Figure 13 shows the expected RSS at high speed when the macrocell base station is chosen, and Figure 14 shows the expected RSS at high speed when the femtocell base station is chosen. From Figure 15, we can see that, compared with the traditional base station selection policy, selection with base station choice was efficient. Therefore, at high speed, the priority of the vehicle selecting the femtocell base station was high.
Hence, from Figures 7-15, we can conclude that the traditional base station selection policy had difficulty making a selection at low and middle speeds, and that this can be solved by our proposed base station selection policy, which is based on the Markov decision policy.
To compare our proposed base station policy with the traditional base station policy, we randomly distributed 150 vehicles in the coverage of the femtocell base station. In each run, the initial motion angles of all vehicles were the same. Since the motion speeds differed, the expected RSS in the next second also differed. We considered different initial motion angles for which all vehicles were heading toward the femtocell base station. Under the traditional base station selection policy, the femtocell base station was the predicted selection for all vehicles. However, since the femtocell base station had an upper limit of 100 vehicles, 50 randomly chosen vehicles had to select the macrocell base station instead. Under our proposed Markov decision policy, we generated the base station selection sequences and obtained the best choice based on the value function. In Figure 16, the blue line is the average RSS obtained by the vehicles under the traditional base station selection strategy. In the next-second selection, this RSS did not change much, regardless of the movement angle. With the Markov decision policy, however, the higher-priority vehicles were selected to use the femtocell base station, and the average RSS of the vehicles improved.

Conclusions
In this paper, we studied a base station selection policy for a C-V2X system based on the Markov decision policy. According to the vehicle's moving direction and moving speed, we proposed a method to predict the probability of the next-step displacement. As vehicles compete for femtocell base station resources, sorting the vehicles in different orders yields different selection sequences. By comparing the value function of each sequence, we can obtain the best base station selection. The simulation compared the effectiveness of the Markov decision policy and the non-Markov decision policy. The Markov decision policy helps ensure good signal quality for transmissions in the system. At the same time, it guarantees that the usage rate of the femtocell base station does not reach the upper limit. When a resource conflict occurs in the femtocell base station area, resources are reasonably allocated to improve the system's service quality.