1. Introduction
Radar is an important part of contemporary intelligent transportation systems [1,2,3]. Multi-target tracking with radar is an active topic in intelligent transportation research [4,5,6]. By tracking passing vehicles, risky driving behavior can be predicted and an early warning signal can be issued [7,8]. Vehicle tracking thus helps to reduce traffic accidents and supports the development of intelligent transportation [9]. Many image-based multi-target tracking algorithms exist [10]. However, these methods adapt poorly to actual traffic scenes because they are sensitive to weather, environment, and lighting [11,12]. Since radar signals cope well with complex scenes [13], more and more researchers are using millimeter-wave radar to solve multi-target tracking problems in traffic [14,15].
The sampling points collected by the radar are scattered and contaminated with noise [16]. Clustering the sampling points before target tracking therefore leads to better tracking of the targets [17]. The experimental scene of this paper is a straight four-lane highway. In this scene, vehicles in adjacent lanes may come close to each other while driving, so that their sampling points lie close together and cover each other. Current clustering algorithms cannot distinguish adjacent and covered targets well, and their real-time performance is also unsatisfactory. Therefore, the purpose of this paper is to improve the clustering accuracy of adjacent vehicle sampling points in highway scenes.
Many clustering algorithms exist at present, for example, partition-based clustering algorithms [18], hybrid density clustering algorithms, graph clustering algorithms, and fuzzy clustering algorithms. The classic partition-based algorithm is the K-Means clustering algorithm [19,20]. It is widely applied and efficient, but it also has obvious limitations. The cluster center of each cluster must be chosen in advance, and this choice determines the quality of the clustering results. The algorithm is also sensitive to abnormal sample points and can only process numerical data sets. The FCM (Fuzzy C-Means) algorithm [21,22,23] is a clustering algorithm widely used in image segmentation. It uses a membership degree to express how strongly each sample point belongs to each cluster, and it is a fuzzy clustering method based on an objective function [24,25,26]. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is a density-based clustering method. It treats the data set as a collection of several high-density clusters separated by low-density regions. The main feature of this method is that clusters of any shape can be identified [27].
Researchers have proposed many improvements to these algorithms. The K-MODES algorithm proposed by Nguyen [28] overcomes the limitation that the K-Means algorithm can only process numerical data. The K-MEDOIDS algorithm does not calculate a cluster center but instead uses an actual sample point to represent each cluster, which allows it to handle abnormal data effectively [29,30]. Bezdek’s research team improved the FCM algorithm by globally optimizing the fuzzy objective function [31]. Birant et al. improved the DBSCAN algorithm and proposed the ST-DBSCAN (Spatial-Temporal DBSCAN) algorithm, which can find clusters according to non-spatial values, spatial values, and temporal values [32]. In 2014, the Italian researchers Rodriguez et al. proposed density peak fast clustering, a new and efficient clustering algorithm [33]. The main idea of the algorithm is that a cluster center has a higher density than its neighborhood and lies at a relatively large distance from any point of higher density.
To address the inaccurate clustering of adjacent vehicles in highway scenes, this paper constructs a spindle-based density peak fuzzy clustering (SDPFC) system using traffic radar. Our optimization goal is to increase the clustering accuracy of adjacent vehicles in highway scenes. To this end, the cluster centers and the number of clusters are first calculated by the initial clustering algorithm based on density peak, and the final clustering result is then calculated by the fuzzy clustering algorithm based on spindle update. The main diagram of the spindle-based density peak fuzzy clustering system using traffic radar is shown in Figure 1. The experimental results show that the SDPFC algorithm has advantages in clustering accuracy. The contributions of this paper can be summarized as follows:
This paper proposes a spindle-based density peak fuzzy clustering (SDPFC) algorithm. The algorithm is divided into two parts: initial clustering and quadratic correction clustering. The initial clustering determines the cluster centers and the number of clusters by finding the density peaks. The quadratic correction clustering corrects the clustering results by iteratively updating the fuzzy matrix and the spindle. In this way, the problem of inaccurate clustering of adjacent vehicles is solved.
SDPFC overcomes the weakness of the traditional fuzzy algorithm when clustering non-spherical sample sets. To improve the accuracy of the clustering algorithm, this paper replaces the iterative update of the cluster center with an iterative update of the spindle. In actual traffic scenes, SDPFC gives more reasonable results than other commonly used algorithms.
In order to accelerate the clustering algorithm, randomly generated initial cluster centers are no longer used in this paper. Instead, the ideal initial cluster centers are calculated by finding the density peaks. In this way, the structure of the SDPFC algorithm is optimized. Since the ideal initial cluster centers are close to the real target cluster centers, the number of iterations is greatly reduced.
The rest of this paper is organized as follows. Section 2 introduces the data acquisition method of the multi-target traffic microwave radar and the extraction of feature data for vehicles; the data collected include distance, velocity, and angle. Section 3 introduces the clustering algorithms related to this paper. Section 4 explains the spindle-based density peak fuzzy clustering algorithm. Section 5 describes the experimental results for several real highway scenes, compares the performance of several algorithms, and discusses the applicability of the proposed algorithm. Section 6 summarizes this paper.
2. Radar Signal Preprocessing
The traffic scene in this paper is a straight four-lane highway, as shown in Figure 2. On the highway, safe driving is paramount. Avoiding accidents therefore requires not only the driver’s driving experience but also strict supervision by the relevant departments [34]. Driving across lanes for a long time is extremely dangerous on the highway. In practice, however, when two vehicles drive in parallel at too small a distance, their sampling points gather and cover each other [35]. The two vehicles are then usually judged to be a single car driving on the lane line, which results in erroneous tracking of the vehicle targets. A false alarm is issued and the vehicle is photographed, so that the owner may be penalized and resources are wasted. Therefore, the algorithm in this paper aims to solve the problem of inaccurate clustering of adjacent vehicles in highway scenes.
The radar system used in this paper mainly consists of a radar, a camera, and an alarm. It is mounted on a beam 7 m above the ground at the side of the lane and can monitor vehicle targets between 50 and 300 m in the longitudinal direction and across 4–5 lanes in the lateral direction. The traffic radar used in this paper has high measurement accuracy. At a distance of about 150 m, the distance measurement error is about 0.15 m and the angle measurement error is about 0.1 degrees. By processing the data collected by the radar, information such as the position and speed of each vehicle can be obtained, so that the trajectories of all vehicles in the current monitoring scene can be tracked.
The raw data received by the radar are a time-domain signal, which needs to be converted by the following steps:
This paper establishes a plane rectangular coordinate system in the road plane, with its origin at the projection of the radar onto the road plane. By transforming the polar coordinates of each sampling point into coordinates in this rectangular system, the vehicle trajectories can be displayed visually. The algorithm in this paper mainly uses the coordinates of the sampling points as input information.
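The polar-to-rectangular conversion described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `polar_to_cartesian` is hypothetical, the angle is assumed to be measured from the longitudinal (y) axis, and the range is assumed to be already expressed in the road plane.

```python
import math

def polar_to_cartesian(range_m, angle_deg):
    """Convert one radar detection to road-plane coordinates (x, y).

    Assumptions (not stated in the paper): angle_deg is measured from the
    longitudinal y-axis, positive towards +x, and range_m is a ground range.
    """
    theta = math.radians(angle_deg)
    x = range_m * math.sin(theta)  # lateral offset from the radar projection
    y = range_m * math.cos(theta)  # longitudinal distance along the road
    return x, y

# A detection at 150 m range and 30 degrees off the road axis.
x, y = polar_to_cartesian(150.0, 30.0)
```

If the radar reports slant range instead, the 7 m mounting height would first have to be removed from the range; that step is omitted here.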
4. The Spindle-Based Density Peak Fuzzy Clustering (SDPFC) Algorithm
The SDPFC algorithm proposed in this paper is characterized by the idea of using quadratic clustering to correct clustering results. The cluster center of each cluster and the number of clusters are obtained by the initial clustering algorithm based on the density peak [43]. Then, the clustering result of the initial cluster is corrected by the fuzzy clustering algorithm based on the spindle update, and the final clustering result is obtained. The combination of these two clustering ideas will be explained in this section.
4.1. Initial Clustering Algorithm Based on Density Peak
Taking the sampling points of this paper as an example, let S be the set of N sample points to be clustered, with the corresponding indicator set of indices 1 to N [44]. Calculate the Euclidean distance $d_{ij}$ between all pairs of sample points, as shown in Formula (11). Thus, the number of distinct $d_{ij}$ is $N(N-1)/2$:
All $d_{ij}$ are sorted in ascending order and a percentage parameter p is set. The truncation distance $d_c$ is then defined as the r-th distance in this sorted list, where r is calculated by Formula (12) and $[\cdot]$ denotes rounding. In the experimental environment of this paper, the sampling points are relatively close together, so p is chosen to be 2%. Users can modify p according to their own experimental environment. The larger the selected p, the more clusters are filtered out. Therefore, p should be determined according to the experimental environment:
The density $\rho_i$ of each sample point i is calculated by Formula (13):
For each sample point i, find all sample points j that are denser than sample point i and select the smallest $d_{ij}$, denoted as $\delta_i$. If no denser sample point exists, select the largest $d_{ij}$ and record it as $\delta_i$. The significance of this is that the characteristics of $\rho_i$ and $\delta_i$ can be used to determine whether the sample point is a cluster center. The selection method of $\delta_i$ is shown in Formula (14):
where ∅ represents the empty set, and the expression of the indicator set is as shown in Formula (15):
To find the center point of each cluster, two threshold parameters must be set: a density threshold and a distance threshold. If both the density and the distance of a sample point exceed their respective thresholds, the sample point is considered to be the center of a cluster.
Figure 3 shows the so-called decision diagram. It can be clearly seen in the figure that the colored elements in the upper right corner have both a larger density and a larger distance, which means that they are more likely to be cluster centers. With the decision graph, we can easily determine which points qualify as center points and which do not by defining the density threshold and the distance threshold according to the experimental environment.
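The quantities behind the decision diagram can be sketched in a few lines of Python. This is a hedged illustration rather than the paper's code: the function names are hypothetical, the density uses the cut-off kernel of the original density peak method (the paper's Formula (13) may use a different kernel), and ties in density are broken by index order, which is also an assumption.

```python
import math

def density_peak_quantities(points, p=0.02):
    """Return (d_c, rho, delta) for a list of (x, y) sample points."""
    n = len(points)
    if n < 2:
        raise ValueError("at least two sample points are required")
    dist = [[math.dist(a, b) for b in points] for a in points]

    # Cutoff distance: the r-th smallest pairwise distance, r = round(M * p).
    pairs = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    r = max(1, round(len(pairs) * p))
    d_c = pairs[r - 1]

    # Assumed cut-off kernel: count the neighbours strictly closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # delta: distance to the nearest point that precedes i in descending
    # density order; the densest point gets its largest distance instead.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    delta[order[0]] = max(dist[order[0]])
    for rank in range(1, n):
        i = order[rank]
        delta[i] = min(dist[i][j] for j in order[:rank])
    return d_c, rho, delta

def select_centers(rho, delta, rho_min, delta_min):
    """Indices whose density and distance both exceed the thresholds."""
    return [i for i in range(len(rho))
            if rho[i] > rho_min and delta[i] > delta_min]
```

Plotting `delta` against `rho` gives the decision diagram; `rho_min` and `delta_min` play the role of the two thresholds and would be tuned to the experimental environment.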
Define, for each cluster, the index of the sample point that serves as its cluster center. In addition, define a clustering label for each sample point, which indicates the cluster to which that sample point in S belongs. Thus, the label satisfies the logic of Formula (16):
Define, for each sample point, the index of the nearest sample point in S among those whose local density is greater than its own, as defined by Formula (17):
Using the attribute defined by Formula (17), the sample points other than the cluster centers are processed one by one in descending order of local density. Each point falls into the cluster of its nearest sample point of higher density. Processing the points one by one in this way is much faster than loop iteration.
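The one-pass assignment can be sketched as below. The function name `assign_by_density` and its inputs (a full distance matrix, a density list, and a list of center indices) are assumptions made for illustration; the sketch also assumes that the densest sample point is one of the centers.

```python
def assign_by_density(dist, rho, centers):
    """Label every sample point with the index of its cluster.

    dist    -- n x n matrix of pairwise distances
    rho     -- local density of each sample point
    centers -- indices of the sample points chosen as cluster centers
    """
    n = len(rho)
    order = sorted(range(n), key=lambda i: -rho[i])  # descending density
    if order[0] not in centers:
        raise ValueError("the densest sample point must be a cluster center")

    labels = [-1] * n
    for k, c in enumerate(centers):
        labels[c] = k

    for rank, i in enumerate(order):
        if labels[i] != -1:
            continue  # cluster centers are already labelled
        # Nearest point among those processed earlier (higher density).
        nearest = min(order[:rank], key=lambda j: dist[i][j])
        labels[i] = labels[nearest]
    return labels
```

Because every point is visited exactly once after sorting, no iterative refinement is needed at this stage.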
Define, for each sample point, an identity that marks it as belonging to the cluster core or the cluster halo. The cluster core has a large local density and corresponds to the core part of the cluster. The cluster halo has a lower local density and corresponds to the edge of the cluster. The value of this identity is as shown in Formula (18):
If there is more than one cluster, an upper bound of the average local density is generated for each cluster. For a fixed cluster, first determine its boundary area, which consists of the sample points that belong to the cluster itself but lie within a distance of at most $d_c$ of sample points belonging to other clusters. Using the sample points in the boundary area, an average local density can be calculated to distinguish between the cluster core and the cluster halo.
The average density is calculated as shown in Formula (19):
The upper bound of the average local density is obtained by Formula (20):
The value of the core and halo identity is as shown in Formula (21):
The general calculation process of the initial clustering algorithm based on the density peak is as follows. Firstly, after initialization and preprocessing, calculate the Euclidean distance between all sample points and determine the cutoff distance according to Formula (12). Then calculate the density and the distance of each sample point. Secondly, determine the cluster centers and initialize the labels according to Formula (16). This yields the cluster centers and the number of clusters. Thirdly, the sample points that are not cluster centers are categorized one by one until every sample point has been assigned. Finally, if there is more than one cluster, the sample points in each cluster are further divided into cluster core and cluster halo.
The flow of the initial clustering algorithm based on density peak is shown in Algorithm 3:
Algorithm 3 Initial clustering algorithm based on density peak
Require: sample points z.
Ensure: cluster center , the number of clusters i, and .
1: initialization: cutoff distance, cluster center number i, density threshold, and distance threshold;
2: for each sample point do
3:   for each other sample point do
4:     calculate the Euclidean distance between all sample points and calculate the density and the distance for each sample point;
5:   end for
6: end for
7: repeat
8:   get the cluster center and its number i;
9: until classify all sample points that are not cluster centers;
10: for each cluster do
11:   if there is more than one cluster then
12:     the sample points in each cluster are further divided into cluster core and cluster halo according to Formula (21);
13:   end if
14: end for
15: return: cluster center , the number of clusters i, and ;
Taking a scene on the highway as an example, the clustering result of the initial clustering is shown in Figure 4. The horizontal and vertical coordinates in the figure indicate distance, and each point in the graph represents one sample point. The graph is generated from the distances between the sample points. The distances between the sampling points in the graph are therefore real, but their coordinate positions are not their real positions. The graph is only used to show the cluster centers and the number of clusters clearly.
4.2. Fuzzy Clustering Algorithm Based on Spindle Update
According to the results obtained in the previous section, the correction after initial clustering is performed to obtain the final clustering result. The results of the previous section used in this section are: position coordinate information of all sampling points, cluster centers, and the number of clusters. Define represents the cluster, and for the cluster center.
The radar used in this paper is a traffic scene monitoring radar mounted above the side of the lane, so the positional relationship between the radar and the lane does not change. Under normal driving conditions, a vehicle travels forward between the lane lines, and it is illegal to travel across the lane line for a long time. Whether driving normally or across the lane line, the vehicle travels in a direction parallel to the centerline of the lane. Therefore, based on the results obtained in the previous section, we use the straight line that passes through the cluster center and is parallel to the centerline of the lane as the basis for correcting the results of the initial clustering, so as to obtain better clustering results. A dedicated calibration vehicle was driven along the road in advance, and the lane centerline collected in this way is used as the initial spindle. The slope of this centerline is recorded as the slope of the initial spindle.
According to the above characteristics, the initial spindle is constructed based on the center of each cluster and the centerline of the lane. The initial spindle is a straight line parallel to the centerline of the lane and passing through the cluster center. Express the spindle of cluster
i as
, and its expression is:
, where
represents the
cluster and
represents the
sample point, and the value of
k is the number of all sample points except the cluster center. The expression with
as the slope and the initial spindle passing through the cluster center
of each cluster is:
The position coordinates of the sample points obtained in the plane rectangular coordinate system are defined as:
In the text, the distance between the
sample and the spindle of the
cluster is expressed by Formula (
24), and its value is
:
Define the objective function
T, and the expression is as shown in Formula (
25):
where
represents a fuzzy matrix, indicating the confidence that the sample points belong to a certain cluster.
m represents the factor of membership, that is, the weight. This value can be determined by the user. The constraint of
is:
In order to achieve the optimal solution of the objective function, it is necessary to make
T obtain a minimum value under the constraint condition. The Formula (
25) is expanded using the Lagrange method:
The proof that Formula (
25) is a continuously differentiable function is shown in
Appendix A. In addition, the result of the partial derivative of
T is:
Calculate
:
where
represents the membership between the
sampling point and the
clustering center. Substitute
into Formula (
26):
where
n represents the
cluster center. Simplify Formula (
30):
Substitute Formula (
31) into Formula (
29):
It can be clearly seen from Formula (32) that the membership of a sample point in a cluster is determined by the ratio of the distance from the sample point to that cluster center to the sum of the distances from the sample point to all cluster centers. The smaller this ratio, the higher the membership. Formula (32) is the update formula for the fuzzy matrix. While the fuzzy matrix is updated, the spindle also needs to be updated.
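A membership update of this kind can be sketched as follows. The sketch assumes the standard fuzzy C-means update rule with squared distances in the objective function, and it measures distance as the perpendicular distance from a sample point to a spindle written as y = a·x + b. The function names, the index convention `u[i][j]` (cluster i, sample j), and the small constant `eps` are illustrative choices, not taken from the paper.

```python
import math

def point_line_distance(x, y, a, b):
    """Perpendicular distance from (x, y) to the line y = a*x + b."""
    return abs(a * x - y + b) / math.sqrt(a * a + 1.0)

def update_membership(points, spindles, m=2.0, eps=1e-12):
    """Return u[i][j], the membership of sample j in cluster i.

    points   -- list of (x, y) sample points
    spindles -- list of (a, b) line parameters, one per cluster
    m        -- membership weight, must be greater than 1
    """
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    c = len(spindles)
    exponent = 2.0 / (m - 1.0)
    u = [[0.0] * len(points) for _ in range(c)]
    for j, (x, y) in enumerate(points):
        # eps avoids division by zero for points lying exactly on a spindle.
        d = [max(point_line_distance(x, y, a, b), eps) for a, b in spindles]
        for i in range(c):
            u[i][j] = 1.0 / sum((d[i] / d[k]) ** exponent for k in range(c))
    return u

# Two horizontal spindles y = 0 and y = 4; the sample (5, 1) is closer to
# the first one and therefore receives the larger membership there.
u = update_membership([(5.0, 1.0)], [(0.0, 0.0), (0.0, 4.0)])
```

The memberships of each sample point sum to one across the clusters, which is the usual constraint on the fuzzy matrix.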
Spindle update process:
The straight line formula of the spindle is $y = ax + b$, and the sample points are given by their coordinates $(x, y)$ in the plane rectangular coordinate system. The distance from a sample point to the spindle of a cluster is:
Next, we need to reconstruct the clusters of the sample points. If the ordinate of a sample point lies within a set distance of the ordinate of the center of a certain cluster, calculate the distance from the sample point to the spindle of that cluster. Otherwise, calculate the Euclidean distance from the sample point to the cluster center. The sample point is then classified into the cluster for which this distance is the minimum.
The following update takes place within the newly reconstructed clusters. Taking one cluster as an example, the spindle is updated as follows.
Define the sum of the squares of the errors between the sample points and the spindle. The expression is:
where m represents the number of sample points in the cluster other than its center. Calculate the partial derivatives with respect to a and b, respectively:
Setting both partial derivatives in Formula (35) to 0 gives:
The flow of the fuzzy clustering algorithm based on spindle update is shown in Algorithm 4:
Algorithm 4 Fuzzy clustering algorithm based on spindle update
Require: sample point set , the number of clusters i, cluster center .
Ensure: cluster center , and .
1: initialization: weighted index m, membership matrix, spindle slope, spindle of each cluster, and termination error;
2: repeat
3:   update membership matrix according to Formula (32);
4:   update the spindle of each cluster according to Formula (36);
5: until the objective function T tends to be stable according to the condition in Formula (37);
6: return: cluster center and ;
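The spindle update step of Algorithm 4 can be illustrated with an ordinary least-squares line fit, which is what setting the partial derivatives of a sum of squared errors with respect to a and b to zero yields. This is a sketch under that assumption; the function name `fit_spindle` is hypothetical and the errors are taken as vertical residuals, which the paper's Formula (36) may or may not match exactly.

```python
def fit_spindle(points):
    """Least-squares line y = a*x + b through the samples of one cluster."""
    m = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xx = sum(x * x for x, _ in points)
    sum_xy = sum(x * y for x, y in points)

    denom = m * sum_xx - sum_x * sum_x
    if denom == 0:
        # All x values coincide: the line is vertical and has no slope.
        raise ValueError("cannot fit y = a*x + b to a vertical point set")

    a = (m * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - a * sum_x) / m
    return a, b

# Samples lying exactly on y = 2x + 1 recover slope 2 and intercept 1.
a, b = fit_spindle([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)])
```

In a full iteration this fit would be applied to each reconstructed cluster in turn, alternating with the membership update until the objective function stabilizes.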
The initial clustering results of the previous section are then corrected by the quadratic correction clustering, and the final clustering result is shown in Figure 5.
5. Comparison of Experimental Results
The experimental scene in this paper is a straight four-lane highway with the radar mounted above the side of the lane, as shown in Figure 6. Vehicles on the highway travel fast and keep a large distance between front and rear. During driving, however, vehicles sometimes come close to each other, which causes their radar sampling points to approach and cover each other. The commonly used clustering algorithms cannot accurately distinguish between adjacent and covered vehicle targets. The spindle-based density peak fuzzy clustering algorithm proposed in this paper can better solve the clustering problem of adjacent vehicles in this scene.
Scene 1: Three vehicles are driving relatively close together on the straight four-lane highway. Two of them are large vehicles separated by a very small lateral distance, and the third is farther away from these two, as shown in Figure 7. At this moment, the radar returns a total of 117 valid sampling points, whose distribution is shown in Figure 8.
The comparison of the clustering results of the various algorithms in this scene is shown in Figure 9.
In this scene, the two large vehicles cover each other, resulting in an uneven distribution of sampling points. The DBSCAN algorithm classifies by density and therefore produces a large number of clusters. The FCM algorithm, which iteratively updates the fuzzy matrix and the cluster centers, cannot distinguish adjacent targets well. The K-Means algorithm also classifies adjacent sample points poorly. The clustering results obtained by the SDPFC algorithm correspond well to the real scene and are better than those of the other algorithms.
Scene 2: Three vehicles are driving close together on the straight four-lane highway. Two of them are laterally very close, and the large vehicle covers the small one. The third vehicle is farther away from these two, as shown in Figure 10. At this moment, the radar returns a total of 115 valid sampling points, whose distribution is shown in Figure 11.
The comparison of the clustering results of the various algorithms in this scene is shown in Figure 12.
In this scene, a large vehicle covers a small one, resulting in fewer sample points for the small vehicle. The DBSCAN algorithm classifies by density and therefore produces a large number of clusters. The FCM algorithm, which iteratively updates the fuzzy matrix and the cluster centers, cannot distinguish adjacent targets well. The K-Means algorithm also classifies adjacent sample points poorly. The clustering results obtained by the SDPFC algorithm correspond well to the real scene and are better than those of the other algorithms.
Scene 3: Five vehicles are driving relatively close together on the straight four-lane highway, and several of them block each other, as shown in Figure 13. At this moment, the radar returns a total of 202 valid sampling points, whose distribution is shown in Figure 14.
The comparison of the clustering results of the various algorithms in this scene is shown in Figure 15.
In this scene, the number of adjacent vehicles is relatively large and several vehicles cover each other, resulting in an uneven distribution of sampling points and fewer sampling points for small vehicles. The DBSCAN algorithm classifies by density and therefore produces a large number of clusters. The FCM algorithm, which iteratively updates the fuzzy matrix and the cluster centers, cannot distinguish adjacent targets well. The K-Means algorithm also classifies adjacent sample points poorly. The clustering results obtained by the SDPFC algorithm correspond well to the real scene and are better than those of the other algorithms.
Next, in order to compare the clustering accuracy of the algorithms, the clustering accuracy rate is defined as $K/N \times 100\%$ (Formula (38)), where K represents the number of correctly clustered sampling points and N represents the total number of sample points participating in the classification. The accuracy comparison of the algorithms is shown in Figure 16.
It can be seen from the figure that, for the three practical experimental scenes in this paper, the SDPFC algorithm achieves the best results, with an average accuracy of more than 95%.
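As a small worked illustration of this accuracy measure, the helper below computes the percentage of correctly clustered sampling points. The function name `clustering_accuracy` is hypothetical, and the counts in the usage line are made-up values chosen only to show the arithmetic, not results from the paper.

```python
def clustering_accuracy(correct, total):
    """Percentage of sampling points that were clustered correctly (K / N)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return 100.0 * correct / total

# Hypothetical example: 112 of 117 sampling points assigned correctly.
rate = clustering_accuracy(112, 117)
```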
In order to present the clustering results of the SDPFC algorithm more fully, four further scenes with adjacent vehicles are selected in addition to the three typical scenes above. The real situation of the four scenes is shown in Figure 17, and the clustering results of the SDPFC algorithm are shown in Figure 18.
To compare the real-time performance of the algorithms, ten different traffic scenes are selected and the same experiment is run with each of the four algorithms discussed in this paper. In each scene, all sampling points collected within about 0.2 s are used as the input data for the experiment.
As can be seen from Table 1, the SDPFC algorithm proposed in this paper is the fastest of the four algorithms, and it is about 2.04 times faster than the slowest algorithm. The K-Means algorithm ranks second because of its simple calculation method. The FCM algorithm is the slowest, mainly because the uncertainty of its initial points keeps the number of iterative updates at a high level.
After comparing the clustering accuracy and running speed of each algorithm in individual scenes, this paper presents statistics of the average accuracy of the algorithms over several specific scenes. The purpose is to show how stable the clustering of these algorithms is in the given type of scene. It can be seen from Figure 19 that the SDPFC algorithm proposed in this paper adapts well and maintains high accuracy. The K-Means algorithm is also fairly adaptable, but its accuracy is not high. The adaptability of DBSCAN and FCM is relatively poor, although the accuracy of DBSCAN is significantly higher than that of FCM.
To demonstrate the clustering results of the SDPFC algorithm further, a new experiment is performed that collects statistics on the accuracy of clustering adjacent vehicles in 1000 scenes. The vehicle sampling points collected within one 40 ms interval are defined as a scene. Each scene contains 2–5 passing vehicles, and processing the radar echo signals yields 100–300 sampling points. By comparing the real image of each scene with its clustering result, the clustering accuracy is calculated according to Formula (38). The experimental results are shown in Figure 20, where 1 on the abscissa represents the average clustering accuracy of scenes 1 to 100, 2 represents the average clustering accuracy of scenes 101 to 200, and so on. In total, 3627 passing vehicles are processed in the 1000 scenes, of which 3537 obtain correct clustering results. The correct clustering rate is about 97.52%.
The experiments in this section show that, in the real highway traffic scenes considered in this paper, the SDPFC algorithm can solve the problem of inaccurate clustering of adjacent vehicles. The algorithm completes the computation in a short time while maintaining high accuracy, and it adapts well to scene changes.