Using Real-Time Dynamic Prediction to Implement IoV-Based Collision Avoidance

Abstract: For some IoV-based collision-avoidance architectures, it is not necessary that all vehicles have communication abilities. Such architectures therefore need particular designs and extra components. One architecture in the literature uses a camera mounted on the infrastructure at an intersection to realize collision detection. Consequently, technologies for real-time object detection and dynamic prediction are required for collision avoidance. In this paper, we propose a method to predict the future position of a vehicle based on a well-known real-time object detection project, YOLOv3. Our algorithm utilizes the concepts of vehicle dynamics and the confidence region to predict the future position of vehicles. This helps us to realize real-time dynamic prediction and Internet of Vehicles (IoV)-based collision detection. Lastly, the experimental results show the performance of our design in predicting the future position of a vehicle.


Introduction
The vehicle has been one of the most widely used forms of transportation over the last few decades, and various studies on road safety have consequently been proposed. Passive safety technologies, which are designed to mitigate the effects of traffic accidents, have undergone extensive development. Owing to the increasing use of electronics, the automobile industry now also employs active safety technologies. The difference between passive and active safety is the operating time: an active safety system operates before an accident and thus attempts to avoid it [1]. Many advanced safety techniques have been developed for various scenarios, such as the Lane Departure Warning system (LDW) [2], Forward Collision Warning system (FCW) [3], Blind Spot Warning system (BSW) [4], Parking Assistant System (PAS) [5-7], and navigation [8,9]. Vehicular safety systems have therefore become increasingly indispensable for modern vehicles. Among these techniques, the Internet of Vehicles (IoV) is one of the most interesting and important technologies for the modern automobile industry; it is essentially a moving network built on the Internet of Things (IoT). A vehicle can communicate with other vehicles and infrastructures through the Internet for various applications. This enables more effective management in applications such as collision avoidance systems, navigation, intelligent transport systems, and entertainment systems.
Communication between vehicles and traffic infrastructures has been studied extensively [1,10-13]. In [10], the transmission of information at urban intersections was analyzed and evaluated. To improve its performance, the authors proposed a vehicle-assisted relaying scheme in which the relaying vehicle is selected autonomously. Moreover, since collision avoidance with multiple data resource reservations per schedule assignment is a critical issue for improving broadcast reliability, the authors in [11] proposed an enhanced method to address it. Their simulation results show that, by adopting their design, the network capacity in terms of supported vehicles under given service requirements is largely increased. Recently, a comprehensive survey on resource allocation schemes, covering Dedicated Short-Range Communications (DSRC) and cellular networks, was presented in [12]. On the other hand, IoV-based systems have been exploited in different traffic applications, such as collision avoidance. In [14], the Inertial Measurement Unit (IMU), DSRC, vehicle dynamics, computer vision, and Global Positioning System (GPS) were combined to improve both the accuracy and reliability of the vehicle's positioning system.
Unfortunately, the above studies assume that all vehicles in the scenario have some form of communication ability. This assumption is difficult to meet while IoV is still being deployed, so compensating for it is currently a significant topic in IoV-based safety. In [1], the authors proposed a collision-avoidance architecture based on computer vision, machine learning, vehicle dynamics, and a predictive algorithm. By adding some extra equipment, their architecture eliminates the assumption that all vehicles in the considered scenario have communication ability. Consider the following scenario: an infrastructure at the intersection observes every vehicle from each direction, predicts whether a collision will happen, and then warns vehicles via communication and/or the traffic sign. In this case, the collision avoidance system still works even if some vehicles have no communication ability. In particular, the authors designed a linear algorithm based on the output of an existing real-time object detection project, YOLOv3 [15,16], to predict the future position of a vehicle. Note that their algorithm can cooperate with different real-time object detection projects, YOLOv3 being one of the most feasible.
In this paper, we focus on obtaining the future position of a vehicle. As discussed above, the researchers in [1] designed a linear algorithm to realize the prediction. Unfortunately, it is neither accurate nor stable when estimating positions further into the future, because the movement of a vehicle is very difficult to describe with a linear function. Hence, instead of relying on a single linear function, we predict the position of a vehicle under two assumptions: vehicles present similar behaviors when entering the same road segment, and during a short time period the movement of a vehicle (such as its heading angle and absolute position) does not vary significantly. By capturing a video with higher resolution and frames per second (FPS), we can apply vehicle-dynamics techniques along the time axis to obtain the future confidence region. In addition, we use the concept of a confidence region to represent the possible future position, which helps us predict and present the future information clearly.
This paper is structured as follows. In Section 2, we give an overview of the IoV-based collision avoidance architecture proposed in [1], which can practically eliminate the assumption that all vehicles in the considered scenario have communication ability. In Section 3, we introduce our idea, which also cooperates with YOLOv3. We use our own video, recorded above a road, to simulate the view observed by an infrastructure. Experimental results are shown and discussed in Section 4. Finally, Section 5 concludes the paper and discusses future work.

IoV-Based Collision-Avoidance Architectures
The operating procedure for an infrastructure in Figure 1 is described as follows:
• For an IoV-based collision avoidance architecture, the real-time video is captured by the camera mounted on the infrastructure.
• By using the real-time object detection, we can obtain all objects that interest us, such as vehicles, motorcycles, and pedestrians.
• By utilizing the real-time dynamic prediction, we can get the future position of a moving object.
• Lastly, according to the future position of two moving objects, we can predict whether the collision will happen or not, and then warn them by using communication ability and/or the traffic sign.
Note that we only consider the scenario of moving vehicles in this paper. For instance, in Figure 2, two cars approach the same intersection from distinct directions. If a collision is predicted, the system alerts them to slow down via its communication ability and/or the traffic sign. In particular, a vehicle with communication ability can be warned by both the IoV message and the roadside traffic sign, whereas a vehicle without communication ability can be warned by the roadside traffic sign alone. This is a practical way to remove the assumption that all vehicles must have communication ability.
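The last step of the procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of [1]: predicted positions are taken to be pixel coordinates, one per future time step and aligned in time for both vehicles, and a collision is flagged when the two predictions come closer than a hypothetical threshold `min_gap`. The function names and the threshold value are assumptions.

```python
import math


def predict_collision(track_a, track_b, min_gap=50.0):
    """Return the first future step at which the two predicted tracks are
    closer than min_gap (pixels), or None if that never happens.

    track_a, track_b: lists of (x, y) predicted positions, time-aligned.
    min_gap: hypothetical safety distance, not taken from the paper.
    """
    for step, ((xa, ya), (xb, yb)) in enumerate(zip(track_a, track_b)):
        if math.hypot(xa - xb, ya - yb) < min_gap:
            return step
    return None


def warning_channels(vehicle_has_comm):
    """The traffic sign reaches every vehicle; an IoV message only reaches
    vehicles with communication ability."""
    channels = ["traffic_sign"]
    if vehicle_has_comm:
        channels.append("iov_message")
    return channels
```

A vehicle without communication ability is therefore still covered, since `warning_channels(False)` always contains the traffic sign.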

Method
In this section, we introduce our predictive method. The main idea comes from vehicle dynamics [17-19]: the movement of a vehicle, such as its heading angle and absolute position, will not change by a large amount within a short time interval. Hence, if a video has a high resolution and FPS, we can (i) judge whether a detected position is reasonable by calculating the difference between two consecutive positions, and (ii) apply vehicle dynamics along the time axis to obtain the future confidence region. More specifically, the former means that if a vehicle is travelling at a higher speed, as in the example shown in Figure 3, it is reasonable to estimate that its next position will not be in the left, rear-left, rear, rear-right, or right region. Conversely, if a vehicle has just started moving from a standstill, there are only two unreasonable regions, namely left and right.
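The plausibility rule described above can be sketched as follows. This is an illustrative reading rather than the authors' code: it assumes the surroundings of a vehicle are split into eight 45-degree sectors relative to its heading, that coordinates use a y-up convention (so a counterclockwise turn is "left"), and that a hypothetical speed threshold `fast_speed` separates the "higher speed" case from the "just started moving" case.

```python
import math

SECTORS = ["front", "front-left", "left", "rear-left",
           "rear", "rear-right", "right", "front-right"]


def sector(heading, displacement):
    """Name the sector that the displacement points to, relative to heading.

    Both arguments are (x, y) vectors in a y-up coordinate system.
    A zero displacement is reported as "front".
    """
    hx, hy = heading
    dx, dy = displacement
    # Signed angle from heading to displacement (positive = counterclockwise).
    angle = math.atan2(hx * dy - hy * dx, hx * dx + hy * dy)
    return SECTORS[round(angle / (math.pi / 4)) % 8]


def is_reasonable(heading, displacement, speed, fast_speed=5.0):
    """Apply the rule from the text; fast_speed is a hypothetical threshold."""
    if speed >= fast_speed:
        blocked = {"left", "rear-left", "rear", "rear-right", "right"}
    else:
        blocked = {"left", "right"}
    return sector(heading, displacement) not in blocked
```

With image coordinates, where y grows downward, the left and right sectors would be mirrored, so the sign of the cross product would have to be flipped.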
On the other hand, the latter means that a road segment should be considered, which is the specific representation of a portion of a road with uniform characteristics [20], where the traffic regulation of the entire road segment should be the same. Hence, we can assume that every car behaves similarly when entering the road segment, and we can use the information of previous vehicles that have passed through the segment to estimate the future position of a vehicle currently on it. In other words, when a car enters the road segment, we use its initial movement to find the most suitable model in the database within a short time, and thus predict its future position. Since the data in the database is collected from a particular road segment, such as a highway or an intersection, the algorithm can be applied to any road segment for which such data has been collected. Therefore, by applying our design to predict the positions of vehicles on two different roads that meet at the same crossroad, possible collisions can be predicted. Notice that in this paper, a confidence region is used to describe the future position of a vehicle at time t, and a confidence model is used to express the behavior of a vehicle on a road segment. A confidence model is thus composed of many confidence regions at different times.
Our design is composed of two parts: the training method and the predictive algorithm. First, we illustrate the predictive algorithm, which itself consists of two parts: the error modification and the future position prediction. This algorithm is shown in Table 1. The original current position of the moving object mo, provided by the real-time object detection module, is stored in p_ori (line 1).
Then, we declare two important variables, p_t and m_t, where p_t is the position of mo at time t, and m_t is the movement of mo at time t (lines 2-3). Note that m_t is the difference in position between p_t and p_{t−1}, and it is critical for describing the dynamics of a vehicle. As there is no reference information in the initial phase, the only thing we can do is wait for two consecutive p_ori and then use them to estimate the value of m_t (lines 4-6). In particular, upon receiving a new p_ori, we first project the original position onto the map data of the current road segment (line 4), and then apply the concept of vehicle dynamics to obtain a corrected one (line 5). These two actions help us to obtain reasonable position information. The former was also adopted in [1], on the grounds that a vehicle should generally drive on the road. The latter, discussed in the first paragraph of Section 3, avoids unreasonable position information. Upon receiving each subsequent p_ori, we calculate m_t from two consecutive positions (lines 7-9). Finally, using the latest positions and movements, we find the most similar model in the database and thus predict the vehicle's future information (line 10), that is, its future position a frames later. Note that in order to find the most similar model in the database, we designed different find functions with different properties, whose performances are compared in Section 4.
The training method is shown in Table 2. Most of the algorithm is the same as the predictive one illustrated in Table 1. In the last part, all the saved position information, that is, p_t and m_t, is output to the database in sequence (line 10). When we intend to predict the future position and/or movement of a vehicle, we can then use its latest information to find the most similar model in the database and predict its future dynamics.

Table 1. Algorithm: Predictive algorithm for moving object mo
input:
(01) p_ori: original current position of mo, which is obtained from the real-time object detection module directly.
variables:
(02) p_t: position of mo at time t;
(03) m_t: movement of mo at time t;
init:
(04) p_t ← map_matching(p_ori);
(05) p_t ← vehicle_dynamics(p_t);
(06) m_t ← 0;
upon receiving a new p_ori:
(07) p_t ← map_matching(p_ori);
(08) p_t ← vehicle_dynamics(p_t);
(09) m_t ← p_t − p_{t−1};
predict:
(10)

Table 2. Training method
(04) p_t ← map_matching(p_ori);
(05) p_t ← vehicle_dynamics(p_t);
(06) m_t ← 0;
upon receiving a new p_ori:
(07) p_t ← map_matching(p_ori);
(08) p_t ← vehicle_dynamics(p_t);
(09) m_t ← p_t − p_{t−1};
output:
(10) output all p_t and m_t;
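The control flow of Table 1 could be sketched in Python roughly as below. This is a skeleton under assumptions, not the authors' implementation: `map_matching`, `vehicle_dynamics`, and `find` are supplied by the caller as plain callables (the paper does not give their internals), positions are (x, y) tuples, and the class and method names are hypothetical. The prediction step simply delegates to `find`, since the text only says that the most similar model is looked up in the database.

```python
class Predictor:
    """Keeps the corrected positions p_t and movements m_t of one object."""

    def __init__(self, map_matching, vehicle_dynamics, find):
        self.map_matching = map_matching          # projects p_ori onto the road
        self.vehicle_dynamics = vehicle_dynamics  # rejects/corrects odd positions
        self.find = find                          # database lookup (assumed API)
        self.positions = []                       # history of p_t
        self.movements = []                       # history of m_t

    def update(self, p_ori):
        """Lines 04-09: correct the raw detection and derive the movement."""
        p_t = self.map_matching(p_ori)
        p_t = self.vehicle_dynamics(p_t)
        if self.positions:
            prev = self.positions[-1]
            m_t = (p_t[0] - prev[0], p_t[1] - prev[1])
        else:
            m_t = (0, 0)  # no reference yet, as in line 06
        self.positions.append(p_t)
        self.movements.append(m_t)
        return p_t, m_t

    def predict(self, a):
        """Line 10: ask the find function for the position a frames later."""
        return self.find(self.positions, self.movements, a)
```

For the training method of Table 2, the same `update` loop would apply, with the stored `positions` and `movements` written to the database instead of being passed to `find`.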

Experimental Results
In this section, we illustrate our experiments using real data extracted from a video recorded by a camera. Notably, this camera was set on a building above a road segment in order to observe the segment. The experimental settings are as follows:
• The resolution of the video captured by the camera is 1920×1080.
• The output frequency of the video captured by the camera is about 20 Hz.
• The resolution of a confidence region was set to 50×50.
• In the captured video, vehicles enter the scenario from the right side and leave from the left side, as shown in Figure 4.
The blue points in Figure 4 represent the trajectory of a vehicle stored in our database, which is called the dynamic model. The database consists of 194 trajectories collected in the experimental scenario, which compose the different dynamic models of our system. The database includes straight, parallel-parking, right-turn, and left-turn trajectories. Thus, after acquiring the required information of a vehicle entering the road segment, the proposed method can find the most similar dynamic model in the database. Figure 5 shows 50 different trajectories from our dataset.
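To make the settings above concrete, the sketch below maps a pixel position to a confidence region and converts a prediction horizon in seconds to a number of frames. It rests on assumptions that the text does not state explicitly: a confidence region is taken to be one cell of a fixed grid of 50×50 pixels, and the frame rate is taken to be exactly 20 Hz. All function names are hypothetical.

```python
REGION_SIZE = 50  # assumed: side of a confidence region, in pixels
FPS = 20          # the text says "about 20 Hz"


def region_of(position, size=REGION_SIZE):
    """Grid cell (column, row) that contains the pixel position (x, y)."""
    x, y = position
    return (int(x // size), int(y // size))


def same_region(predicted, actual, size=REGION_SIZE):
    """True if both positions fall into the same confidence region."""
    return region_of(predicted, size) == region_of(actual, size)


def frames_ahead(seconds, fps=FPS):
    """Number of frames that corresponds to a prediction horizon."""
    return round(seconds * fps)


def correct_ratio(pairs, size=REGION_SIZE):
    """Fraction of (predicted, actual) pairs that share a confidence region."""
    if not pairs:
        return 0.0
    hits = sum(same_region(p, a, size) for p, a in pairs)
    return hits / len(pairs)
```

Under these assumptions, a horizon of 0.25 s corresponds to 5 frames and a horizon of 1.5 s to 30 frames.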
In our experiments, we considered two different find functions, each with different amounts of past information, and used them to predict the future position 0.25 s, 0.5 s, 0.75 s, 1.0 s, 1.25 s, and 1.5 s later. More specifically, the first design of the find function uses the Euclidean distance with the same weighting value for all past x frames to find the most similar model in the database, where x is set to 3, 5, and 10, respectively. Figure 6 shows the six predicted positions (red) of a vehicle after 0.25, 0.5, 0.75, 1, 1.25, and 1.5 seconds, based on the vehicle's information acquired from three frames (blue). Figures 7 and 8 show the six predicted positions using the vehicle's information from five and ten frames, respectively. Since the length of each video is about five seconds, we could directly compare each predicted position with the actual one and determine whether the two positions were within the same confidence region. The results are shown in Figures 9-11. According to the experimental results, for predictions less than or equal to 0.5 s ahead, the correct ratio is more than 80 percent no matter how many past positions are used as references. On the whole, using five past positions with the same weighting value as references was the best choice among the three settings. The intuition is that considering too many past frames includes some obsolete information, which hinders finding the most suitable model in the database. Conversely, considering too few past frames provides insufficient information, which hinders the search as well.
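A find function of the first kind might look like the sketch below. It is an assumption-laden illustration: the database is taken to be a list of stored trajectories (each a list of (x, y) positions, one per frame), the distance between the recent positions and a window of a stored trajectory is the unweighted sum of per-frame Euclidean distances, and the prediction is read directly from the best-matching trajectory a frames after the matched window. The paper does not specify these details, and the names are hypothetical.

```python
import math


def distance(recent, window):
    """Sum of per-frame Euclidean distances, every frame weighted equally."""
    return sum(math.dist(p, q) for p, q in zip(recent, window))


def find(recent, database, a):
    """Predict the position a frames after the last position in recent.

    recent: the last x corrected positions of the vehicle, oldest first.
    database: list of stored trajectories (lists of (x, y) positions).
    Returns None if no stored trajectory is long enough.
    """
    x = len(recent)
    best, best_d = None, math.inf
    for traj in database:
        # Slide a window of x frames that still leaves a frames after it.
        for start in range(len(traj) - x - a + 1):
            d = distance(recent, traj[start:start + x])
            if d < best_d:
                best_d = d
                best = traj[start + x - 1 + a]
    return best
```

Here the parameter `x` of the text is simply the length of `recent`, so the settings x = 3, 5, and 10 correspond to passing the last 3, 5, or 10 positions.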
Furthermore, the second design of the find function also uses the Euclidean distance over all past x frames to find the most similar model in the database, but with different weighting values, where x is again set to 3, 5, and 10, respectively. In particular, the weighting value of the y-th past frame is set to 1/y. Hence, when the past 3 frames are used as references, the weighting value is 1/1 for the first past frame, 1/2 for the second past frame, and 1/3 for the third past frame. The results are shown in Figures 12-14. We can observe that assigning different weighting values to distinct past frames mitigates both problems: older frames contribute less, so obsolete information has less influence, while enough frames are still considered to avoid insufficient information. Thus, in Figure 14, we obtained the best result among all the experiments, where the accuracy is more than 71 percent after 1.25 s and 64 percent after 1.5 s.
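The weighting scheme can be illustrated with the short sketch below. It assumes that the "first past frame" is the most recent one, so the newest frame gets weight 1/1 and older frames get progressively smaller weights; this reading is consistent with the remark about obsolete information but is not stated explicitly in the text. Function names are hypothetical.

```python
import math


def weights(x):
    """Weight 1/y for the y-th past frame, y = 1 being the most recent."""
    return [1.0 / y for y in range(1, x + 1)]


def weighted_distance(recent, window):
    """Weighted sum of per-frame Euclidean distances.

    recent and window are lists of (x, y) positions ordered oldest first,
    so both are reversed to pair the newest frame with weight 1/1.
    """
    w = weights(len(recent))
    pairs = zip(reversed(recent), reversed(window))
    return sum(wy * math.dist(p, q) for wy, (p, q) in zip(w, pairs))
```

With three frames, a mismatch of 5 pixels in the newest frame contributes 5 to the distance, whereas the same mismatch in the oldest frame contributes only 5/3.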

Conclusions and Future Work
In this paper, we proposed a method to predict the future position of a vehicle based on a well-known real-time object detection project, YOLOv3. Our algorithm predicts the future position of a vehicle by finding the most suitable model in a database whose data is collected from a particular road segment, such as a highway or an intersection; it can therefore be applied to any road segment for which such data has been collected. By applying our design to predict the positions of vehicles on two different roads that meet at the same crossroad, possible collisions can be predicted. The experimental results have shown the ability of the proposed method to predict the vehicle's position, which makes IoV-based collision avoidance possible for vehicles without onboard communication systems. In addition, the proposed method can predict the vehicle's position using only a limited amount of past information.
In future work, we intend to design different comparison methods for finding the best model in the database for all kinds of moving objects, and to extend the prediction further into the future.