A Trajectory Regression Clustering Technique Combining a Novel Fuzzy C-Means Clustering Algorithm with the Least Squares Method

Rapidly growing GPS (Global Positioning System) trajectory data hide much valuable information about city road planning, urban travel demand, and population migration. To mine this hidden information and obtain better clustering results, a trajectory regression clustering method (an unsupervised trajectory clustering method) is proposed to reduce the loss of local trajectory information and to avoid getting stuck in a local optimum. First, we define our new concept of trajectory clustering and construct a novel angle-based method for partitioning trajectories into line segments; second, a Lagrange-based method and Hausdorff-based K-means++ are integrated into fuzzy C-means (FCM) clustering to maintain the stability and robustness of the clustering process; finally, a least squares regression model is employed to achieve regression clustering of the trajectory. In our experiments, the performance and effectiveness of the method are validated on real-world taxi GPS data. Compared with partition-based clustering algorithms (K-means, K-median, and FCM), the experimental results demonstrate that the presented method is more effective and generates more reasonable trajectories.


Introduction
In recent years, the increasing popularity of GPS (Global Positioning System)-enabled devices has made it easier for users to track moving objects over the internet. Typically, GPS-equipped taxis are widely used in many cities [1]; they record GPS information and movement trajectories that can reflect city states such as traffic congestion [2–4], urban travel demand and transport services [5,6], and population migration distribution [7,8]. How to mine the hidden information, understand the meaning of these states, and use trajectory information in urban development have become research hotspots. Consequently, many clustering-based approaches have been presented to describe the states of a city using the characteristics and trajectory pattern clustering of GPS data [1,6,9–16]. For example, Reference [10] presented a density-based line segment trajectory clustering algorithm built on a partition-and-group framework. The authors of Reference [9] presented a two-step density-based clustering algorithm consisting of segment clustering and trajectory clustering. The authors of Reference [12] presented a road-network-aware approach for the fast and effective clustering of road segment spatial trajectories, which replaces density-based clustering and Euclidean distance computation. Reference [11] presented a scalable and fast density clustering algorithm based on big data computing. The authors of Reference [16] presented an improved density-based algorithm for clustering stops in trajectories. In particular, Reference [17] proposed an anisotropic (angle-based standard deviation) density-based clustering algorithm for discovering spatial point patterns with noise.
In general, clustering algorithms fall into several categories [17,18]: density-based, partitioning-based, grid-based, hierarchical, and graph-based, all of which have a wide range of applications in spatial data processing [19,20]. Each category contains several well-known algorithms (e.g., the partitioning-based K-means, K-median, and fuzzy C-means (FCM)), with their specific pros and cons. Density-based clustering algorithms are commonly used to mine the hidden information of GPS datasets, as they are particularly suitable for discovering clusters with arbitrary shapes and finding mutually exclusive clusters [9–11,16,17]. However, they have difficulty handling overlapping clusters (e.g., trajectory crossover) when fuzzy clusters and the loss of local trajectory information are considered. In addition, they are sensitive to the chosen neighborhood and density (MinPts) settings. In this paper, we focus on partitioning-based approaches (e.g., FCM). However, these still have several shortcomings, including sensitivity to the selection of the initial cluster centers, slow convergence, and a tendency to become stuck in a local optimum.
Therefore, this paper proposes a novel trajectory regression clustering technique based on partition clustering. It combines a new angle-based method for producing line segments (AngPart), a Lagrange-based fuzzy C-means clustering (FCML) algorithm, and the least squares regression model (LSR), and it constructs an unsupervised trajectory clustering method that does not rely on a map-based knowledge base. Namely, FCML is a novel unsupervised partitioning regression clustering algorithm that combines AngPart and FCM with LSR, as shown in Figure 1. Firstly, a line segment partitioning method is constructed to efficiently produce line segments from three GPS data points (see Section 4.1) and to preserve the local information of trajectories. Secondly, the presented clustering algorithm combines a novel fuzzy C-means (NFCM) with the Lagrange operator [21] and Hausdorff-based K-means++ [22,23], which are used to capture the global optimum and to avoid getting stuck in a local optimum, respectively; NFCM performs the line segment clustering, and K-means++ produces the initial cluster centers of the line segments. In particular, the original fuzzy C-means (FCM) algorithm is a partitioning-based clustering method [24]; for line segments, the Hausdorff distance [25] must be used in place of the Euclidean distance. Finally, once the hidden information of the GPS data has been mined, LSR is employed to regress and generate trajectories from the clustering results without a map-based knowledge base; these trajectories can be used to explain and describe urban states (e.g., people, vehicles, roads, and traffic flow, and as a reference for road planning) around the produced trajectories. In fact, FCML improves line segment partitioning and preserves the local information of a trajectory by using the angle-based method before the clustering operation. For example, if line segments are generated from only two GPS data points, it is difficult to explain the local information among GPS data points and to capture the relationship between successive GPS points (e.g., the steering and intersection angle changes of successive GPS points).
In addition, the presented method (FCML) is an unsupervised learning technique. Therefore, a map-based knowledge base is unnecessary, and the least squares regression model (LSR) is used to produce the trajectories of the clustering results.
To verify the performance and effectiveness of FCML, a real-world GPS dataset from Beijing, China, is used as an experimental test (see Section 2), and the experiments compare FCML with the K-median, K-means, and FCM clustering methods using the PBM (Pakhira-Bandyopadhyay-Maulik) index [26] as the cluster evaluation criterion. The PBM index is an effective unsupervised evaluation technique [27,28]; note that the distances in the PBM index must be calculated between the cluster center line segment and the line segments using the Hausdorff method. LSR is then used to achieve the regression of the clustering results. The experimental results indicate that FCML achieves better-quality trajectory regression than the K-means, K-median, and FCM algorithms (see Section 5).
The main contributions of the paper are summarized as follows: (1) A novel line segment generation technique is proposed using the angle-based partitioning method (AngPart). (2) A novel fuzzy C-means (NFCM) clustering algorithm is put forth, combining the Lagrange operator with AngPart and K-means++. (3) A trajectory regression technique based on LSR is presented, which can be used to explain the state of population migration around trajectories and can serve as a reference for city road planning. (4) FCML is shown to work on real-world taxi GPS data in Beijing, China.
The rest of the paper is organized as follows. Section 2 describes the taxi GPS data in Beijing, China. Section 3 introduces the angle-based normalizing method that is used for the taxi GPS data. Section 4 proposes a trajectory regression technique combining FCML with LSR. Section 5 presents the experiments and the results for the performance evaluation of the proposed approaches. Finally, Section 6 concludes the paper and suggests further work.

Description of Real-World Taxi GPS Data
The trajectory dataset used in this paper was collected from taxi GPS data in Beijing, China [29]; the data were recorded by different GPS loggers as locations (latitude, longitude) and angles in a given region. The sampling interval was kept within two minutes (≤2 min); namely, location information was recorded whenever the interval between samples was less than or equal to two minutes. The dataset consists of the GPS data points of approximately 30 thousand taxis between 8:50 and 8:59 a.m. on 20 March 2016. After the origins and destinations (OD) were extracted and mined using the clustering algorithm in Reference [30], the dataset contained 71,375 OD points in total, as shown in Figure 2. In particular, OD points are usually used to describe trajectory patterns [8,31]. After the angle-based partitioning method in Section 4.1 is performed, the dataset contains 23,785 line segments in total, as shown in Figure 3.
Figure 2 shows the distribution of taxi OD points over the road structure of the given land area (0.18 × 0.3) within two minutes in Beijing, China. The overall distribution of OD points reflects the changing travel demand of citizens and the migration of the population that uses taxicabs as a means of transportation. As a result, traffic information and the population migration distribution can explain the city's situation. When a new road is planned or an old road is improved, it is necessary to consider the traffic status and population migration, with the aim of providing convenient travel and easing traffic congestion. Therefore, in this paper, we present a trajectory regression method that combines FCML with LSR.

Figure 3. Line segments produced from the taxi GPS data points (illustrated in Figure 2) using the angle-based approach.

Preliminary
In this section, we present the new concepts and operations of trajectory regression clustering that are used in our technique.
Trajectory: A trajectory is the user-defined sequence of GPS points describing the evolution of the position of an object that moves during a given time interval in order to achieve a given goal or solve a problem in a geographic information application. For example, a trajectory can be defined as $T_i = \{(p_1, t_1), (p_2, t_2), \ldots, (p_i, t_i)\}$, where $p$ denotes a GPS point given as a coordinate pair (longitude, latitude) and $t$ is the corresponding GPS time [11,20,32]. Alternatively, it is described as $T_i = p_1 \to p_2 \to \cdots \to p_i$, which is a sequence of GPS points in a given time interval [1,10]. In this paper, a taxi trajectory is defined as $T_i = \{(p, a)\} = \{(p_1, a_1), (p_2, a_2), \ldots, (p_i, a_i)\}$, which can also be described as $p_1 p_2 \ldots p_i : a_1 a_2 \ldots a_i \to p : a$ and represents a sequence of GPS points, where $p = (lng, lat)$ is a GPS point giving a longitude and latitude location, and $a$ denotes the angle (steering angle) of each taxi GPS point, as illustrated in Figure 4. In addition, we focus on low-sampling-rate taxi GPS trajectories with $\Delta t \le 2$ min in order to meet the regression test demand.

Sub-trajectory: A sub-trajectory $ST_j$ is a subset of a trajectory $T_i$. In this paper, three GPS points selected by the shortest candidate Euclidean distance constitute a line segment $L_j = \{(p_{j-1}, a_{j-1}), (p_j, a_j), (p_{j+1}, a_{j+1})\} = \{p_{j-1} p_j p_{j+1} : a_{j-1} a_j a_{j+1}\}$. The line segments of a trajectory can also be described as $\{L_1, L_2, \ldots, L_j\}$ ($j \le i/3$). Therefore, a combination of line segments $L_{1,\ldots,j}$ ($i/3 \ge j \ge 2$) is considered a sub-trajectory $ST_j$ of a trajectory $T_i$. Note that we still use the Euclidean method to calculate the distance between the GPS data points.
L-Similarity: Several methods exist to measure the similarity of line segments, such as those reported in References [13,33,34], as well as cosine similarity. Similarity results between line segments are usually used to achieve trajectory clustering and to produce sub-trajectories. In this paper, the given $L_j$ is relevant to angle changes (see Section 4.1). Therefore, the similarity method (similarity measure based on multiple information sources: SMIS) [35] is employed to measure the similarity of line segments, as shown in Equation (1), where $\alpha \ge 0$ is a constant and $\beta \ge 0$ is a smoothing factor; $l = \mathrm{siml}(L'_j, L_j)$ gives the cosine similarity value between $L'_j$ and $L_j$; and $h = \min(\mathrm{dist}(L'_j, L_j))$ is the minimum Hausdorff distance of GPS points between $L'_j$ and $L_j$.
Trajectory clustering: A cluster is a set of trajectory partitions. A trajectory partition is a line segment $L_j$, and the line segments that belong to the same cluster are close to each other in terms of the Hausdorff [25] distance measurement. According to Reference [10], a trajectory can belong to multiple clusters, since a trajectory is partitioned into multiple $L_j$ and trajectory clustering is performed over $L_j$. A clustering result that is based on line segments $L_{1,2,\ldots,j}$ can indicate a common sub-trajectory. Therefore, given a set of trajectory data (GPS data points) $T_i$ that is partitioned into many line segments $L_j$, a set of clusters can be defined as $C = \{C_1, C_2, \ldots, C_K\}$, where each cluster $C_k$ is a subset of the line segments $L_j$, and the cluster centers are defined as $c = \{c_1, c_2, \ldots, c_K\}$. Namely, a clustering result of line segments is a sequence of GPS points, just like an ordinary trajectory. In particular, in this paper, a cluster center segment is also considered a line segment, without applying a density-based method. An example of trajectory clustering is shown in Figure 5.
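The Hausdorff distance between line segments that the definitions above rely on can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each line segment is treated as a set of its three (lng, lat) points, uses the standard symmetric Hausdorff distance, and measures point distances with planar Euclidean geometry (ignoring Earth curvature). The function names `euclid`, `directed_hausdorff`, and `hausdorff` are hypothetical.

```python
import math

def euclid(p, q):
    """Planar Euclidean distance between two (lng, lat) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def directed_hausdorff(seg_a, seg_b):
    """Largest distance from a point of seg_a to its nearest point in seg_b."""
    return max(min(euclid(a, b) for b in seg_b) for a in seg_a)

def hausdorff(seg_a, seg_b):
    """Symmetric Hausdorff distance between two point sequences."""
    return max(directed_hausdorff(seg_a, seg_b),
               directed_hausdorff(seg_b, seg_a))

# Two hypothetical three-point line segments (made-up coordinates).
seg_1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
seg_2 = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(hausdorff(seg_1, seg_2))
```

Taking the maximum of both directed distances makes the measure symmetric, which matters when one segment's points lie close to the other's but not the reverse.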

Methodology
Our methodology proceeds as follows: (1) the angle-based partitioning and cosine-based constraint methods are used to generate line segments; (2) Hausdorff-based K-means++ is used to produce the initial cluster centers, and a Lagrange-based method is presented to improve FCM clustering; and (3) the least squares regression method is employed to achieve trajectory regression clustering, as shown in Figure 6.
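As a rough illustration of step (3), the sketch below fits a straight line to the points of one cluster by ordinary least squares. It is a simplified stand-in under stated assumptions: the paper's regression model is not detailed in this section, so the choice of regressing latitude on longitude with a first-degree polynomial, the function name `fit_line`, and the sample points are all hypothetical.

```python
def fit_line(points):
    """Ordinary least squares fit of lat = slope * lng + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0.0:
        raise ValueError("all points share the same lng; slope is undefined")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up (lng, lat) points standing in for one cluster of GPS points.
cluster_points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
slope, intercept = fit_line(cluster_points)
print(slope, intercept)
```

A single straight line cannot represent a north-south road (constant longitude) or a curved one; a practical version would need a piecewise or parametric fit for those cases.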

Angle-Based Partitioning and Cosine-Based Constraint
In this section, we present a steering-angle-based method for partitioning line segments, which distinguishes two cases: (i) $\theta \ge \pi$ and (ii) $\theta < \pi$, where $\theta$ denotes the steering angle. We also define a cosine-based method that restricts the intersection angle $\gamma$ of three GPS points once an angle threshold is given. The procedure is shown in Algorithm 1.

Algorithm 1. Angle-based partitioning and cosine-based constraint algorithm
Input: a given GPS dataset D including location information and angles, the number of iterations, the angle threshold T
Output: regression trajectories
Procedure: Divide D into taxi GPS data: location information and angles, which are set to numbers for each location and angle

Steering angle-based partitioning: In general, the angles of taxi GPS data points are recorded relative to the north direction. First, we calculate the distances between a selected first point and ten candidate GPS data points around it (as shown in Figure 7), where "ten" is a given setting (other numbers can also be used) chosen to capture the angles around the first data point; the candidates with the shortest distances to the first point are then selected for angle-based partitioning. For three selected taxi GPS points, if the larger angle $\theta$ between the points is greater than $\pi$, it indicates a steering angle in the counter-clockwise direction; if the larger angle $\theta$ is less than $\pi$, it indicates a steering angle in the clockwise direction. The angle is then converted using Equations (2) and (3), with the purpose of normalizing the angles to a uniform standard. Illustrations are shown in Figures 8 and 9.
ISPRS Int. J. Geo-Inf. 2018, 7, x FOR PEER REVIEW
Intersection angle-based constraint: Once the steering angle-based partitions are obtained, the intersection angles $\gamma_t$ ($t = 1, 2, \ldots, r$), which explain the movement tendency of the trajectory, need to be restricted, as shown in Figure 10. If three taxi GPS points $(P^-, P, P^+)$ are chosen and $P$ is considered the vertex, then the intersection angle is defined by Equation (4), according to the cosine theorem. First, $P$ is randomly selected from Lx in Algorithm 1. Second, $P$ is taken as the center, and two points around $P$ are chosen according to the shortest Euclidean distance. Third, $P^-$ and $P^+$ are selected. If $\gamma \le T$, where $T$ is a given angle threshold (e.g., $T = \pi/6$), then $(P^-, P, P^+)$ is stored in memory and removed from the taxi GPS dataset, and the fourth step is executed; otherwise, we return to the first step. In the fourth step, the GPS data points are traversed until the dataset is empty, which indicates that all line segments have been produced.
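The cosine theorem mentioned above gives the angle at the vertex $P$ from the three side lengths of the triangle $(P^-, P, P^+)$. The sketch below is an illustration under assumptions, not a reconstruction of Equation (4): it computes the interior angle at $P$ in planar coordinates, and whether the paper's $\gamma$ denotes this interior angle or its supplement (the deviation from a straight line) cannot be recovered from the text. The function name `vertex_angle` is hypothetical.

```python
import math

def vertex_angle(p_prev, p, p_next):
    """Interior angle at p (radians) of the triangle p_prev, p, p_next."""
    a = math.hypot(p[0] - p_prev[0], p[1] - p_prev[1])            # |P P-|
    b = math.hypot(p[0] - p_next[0], p[1] - p_next[1])            # |P P+|
    c = math.hypot(p_next[0] - p_prev[0], p_next[1] - p_prev[1])  # |P- P+|
    if a == 0.0 or b == 0.0:
        raise ValueError("vertex coincides with a neighbouring point")
    # Law of cosines: c^2 = a^2 + b^2 - 2ab cos(gamma).
    cos_gamma = (a * a + b * b - c * c) / (2.0 * a * b)
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.acos(max(-1.0, min(1.0, cos_gamma)))

right_turn = vertex_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
straight = vertex_angle((-1.0, 0.0), (0.0, 0.0), (1.0, 0.0))
print(right_turn, straight)
```

With this convention, three collinear points in travel order give an angle of $\pi$, so a threshold test such as $\gamma \le T$ would have to be applied to whichever of the two angle conventions the original Equation (4) intends.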
As shown in Figure 3, we partitioned the urban area of Beijing into road regions by applying these methods to the taxi GPS data points. In other words, the taxi GPS data points of Beijing were divided into line segments using the angle-based partitioning and cosine-based constraint methods. When comparing Figure 3 with Figure 2, we found that the local features in Figure 2 were preserved or only slightly lost. This indicates that the angle-based method in this paper is effective and feasible (see Section 5).

Fuzzy C-Means Measure Based on the Lagrange Equation
In general, the clustering step effectively corresponds to the grouping phase and aims to derive a partitioning that is as relevant as possible.However, when the basic partitioning algorithms are used to perform clustering, it is easy to get stuck to the local optima and to suffer from iterative hill-climbing [36][37][38].Therefore, we present a novel fuzzy C-means (FCML) clustering algorithm using the Lagrange method, which tries hard to repair the error rates of clustering processing, improve the global optimization, and to balance the iterative hill-climbing.FCM is a partitioning algorithm that allows each data point to belong to multiple clusters with varying degrees of membership [39,40].In this paper, we seek to improve the FCM in order to achieve line segments clustering and avoid getting struck in the local optimum, which involved dividing the data points into groups with the most similarities between line segments on the same cluster, and minimum similarities between different clusters, as shown in Equation (5).
where
(1) $\|\cdot\|$ is any norm expressing the similarity between any measured data and the center.
(2) $L$ is the number of line segments, which is defined in Section 3.
(3) $K$ is the number of clusters of line segments.
(4) $m$ is the fuzzy partition matrix exponent used to control the degree of fuzzy overlap, which in this paper is set to $m = 2$ according to Reference [41].
(5) $L_j$ and $c_k$ are defined in Section 3.
(6) $\mu_{jk}$ is the degree of membership of $L_j$ in the $k$th cluster, as shown in Equation (6).
To update cluster centers, we combined K-means++ with Equation (7), with the aim of initializing and calculating the cluster centers.
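For orientation, the textbook FCM formulation can be written with the symbols defined above. This is a sketch that assumes the squared norm and the usual constraint that the memberships of each line segment sum to one; the paper's own Equations (5)-(7), which also involve K-means++ initialization, may differ in detail.

```latex
\begin{align}
J_m &= \sum_{j=1}^{L} \sum_{k=1}^{K} \mu_{jk}^{m} \, \lVert L_j - c_k \rVert^{2},
  \qquad \sum_{k=1}^{K} \mu_{jk} = 1, \\
\mu_{jk} &= \left[ \sum_{i=1}^{K}
  \left( \frac{\lVert L_j - c_k \rVert}{\lVert L_j - c_i \rVert} \right)^{\frac{2}{m-1}}
  \right]^{-1}, \\
c_k &= \frac{\sum_{j=1}^{L} \mu_{jk}^{m} \, L_j}{\sum_{j=1}^{L} \mu_{jk}^{m}} .
\end{align}
```

With $m = 2$, the exponent $2/(m-1)$ equals 2, so each membership is inversely proportional to the squared distance between the line segment and the cluster center.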
When K-means++ determines the next line segment, the probability $P^{+}$ of K-means++ is given in Equation (8). In addition, we employed the Hausdorff distance to calculate the distance between line segments in K-means++.
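A minimal Python sketch of these two ingredients follows. It assumes that each line segment is a short sequence of 2D points, uses the discrete symmetric Hausdorff distance, and applies the standard K-means++ rule in which the next center is drawn with probability proportional to the squared distance to the nearest center already chosen. The paper's Equation (8) may differ in detail, and the function names are hypothetical.

```python
import math

def hausdorff(seg_a, seg_b):
    """Discrete symmetric Hausdorff distance between two point sequences."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(seg_a, seg_b), directed(seg_b, seg_a))

def seeding_probabilities(segments, centers):
    """Standard K-means++ weights: squared distance to the nearest chosen center."""
    d2 = [min(hausdorff(s, c) for c in centers) ** 2 for s in segments]
    total = sum(d2)
    if total == 0:
        # Every segment coincides with a center: fall back to a uniform choice.
        return [1.0 / len(segments)] * len(segments)
    return [d / total for d in d2]

# Usage: three parallel three-point segments, the first already chosen as a center.
seg_a = [(0, 0), (1, 0), (2, 0)]
seg_b = [(0, 1), (1, 1), (2, 1)]
seg_c = [(0, 3), (1, 3), (2, 3)]
probs = seeding_probabilities([seg_a, seg_b, seg_c], centers=[seg_a])
```

In this usage example the farthest segment receives the largest selection probability, which is what spreads the initial centers apart.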
To adjust the objective function $J_m$, the Lagrange operator is presented in Equations (8) and (9),
where $g$ is the number of iterations; $J_k^{\max}$ and $J_k^{\min}$ are, respectively, the maximum and minimum values of the $l$th quality constraint among the values of other available candidate solutions of the different clusters; $\rho$ is the weight factor of the penalty; and $g_{\max}$ is the maximum number of iterations. In addition, $g_{current}/g_{\max}$ indicates that the penalty values differ at each iteration, and its aim is to meet the constraint (10). The clustering operation ends when the improvement in the objective function falls below a specified threshold or the maximum number of iterations is reached.
The novel fuzzy C-means algorithm (FCML) that is based on Lagrange and K-means++ is shown in Algorithm 2.

Algorithm 2. The novel fuzzy C-means algorithm (FCML)
Input: line segments (see Algorithm 1), K
Output: clustering results of line segments
Procedure:
(1) Randomly initialize the cluster membership values µ_jk in terms of the line segment results;
(2) Use Equation (8) to produce cluster centers;
(3) Use Equation (6) to update membership values;
(4) Use Equation (5) to calculate objective function values;
(5) Use Equation (9) to repair the error rate of FCM, as well as to improve the global optimization and balance iterative hill-climbing;
(6) Repeat steps 2-5 until J_m improves by less than a specified threshold or the maximum number of iterations is reached.
Finally, once the line segments have been clustered, each cluster represents a distinct group made up of the line segments assigned to it.
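The iteration can be illustrated with a short Python sketch of plain FCM. This is a simplified illustration under stated assumptions, not the FCML implementation: each line segment is represented by a numeric feature vector (for example, its midpoint), the Euclidean norm is used, and the Lagrange-based repair of step (5) is omitted. The function name `fcm` and its parameters are hypothetical.

```python
import math
import random

def fcm(points, k, m=2.0, eps=1e-3, max_iter=100, seed=0):
    """Plain fuzzy C-means on feature vectors; returns (centers, memberships, objective)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Step 1: random memberships, normalized so that each row sums to one.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])
    power = 2.0 / (m - 1.0)
    prev_j = None
    for _ in range(max_iter):
        # Step 2: cluster centers as membership-weighted means.
        centers = []
        for c in range(k):
            w = [u[j][c] ** m for j in range(n)]
            tot = sum(w)
            centers.append(tuple(
                sum(w[j] * points[j][d] for j in range(n)) / tot for d in range(dim)))
        # Step 3: membership update from relative distances (clamped away from zero).
        dist = [[max(math.dist(p, c), 1e-12) for c in centers] for p in points]
        u = [[1.0 / sum((row[c] / row[i]) ** power for i in range(k))
              for c in range(k)] for row in dist]
        # Step 4: objective function value.
        j_val = sum(u[j][c] ** m * dist[j][c] ** 2 for j in range(n) for c in range(k))
        # Step 6: stop when the relative change of the objective is below eps.
        if prev_j is not None and abs(prev_j - j_val) / prev_j < eps:
            break
        prev_j = j_val
    return centers, u, j_val

# Usage: two well-separated groups of feature vectors.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, memberships, objective = fcm(pts, k=2, eps=1e-9, max_iter=200)
labels = [max(range(2), key=lambda c: row[c]) for row in memberships]
```

The hard labels are obtained by taking, for each vector, the cluster with the largest membership value.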

Trajectory Regression Clustering Based on the Least Squares Model
Least squares regression (LSR) is a standard technique in statistics that has been widely applied in pattern recognition, data mining, and machine learning [42][43][44][45], for tasks such as classification, clustering, and regression. It involves finding a hyperplane through a set of data points while minimizing an objective function. In this paper, LSR is employed to achieve trajectory regression using the line segment clustering results, as written in Equations (11) and (12), where $x_i \in R^{n \times m}$ is the clustering result of the $k$th cluster and $m$ denotes the dimensionality of each cluster. $A_k \in R^{m \times k}$ denotes the regression matrix, $k = 1, 2, \ldots, K$ indexes the clusters (with $K$ the total number of clusters), and $y_i \in R^{n \times k}$ is the target matrix of $x_i$. Equation (11) is used to minimize the objective $J_m$ between the regression $A_k x_i^k$ and $y_i$, which is usually represented as a continuous vector. Equation (12) is employed to calculate the regression results. Therefore, to obtain a regression curve, each trajectory clustering result can be put into Equations (11) and (12) in order to achieve trajectory regression clustering. The LSR-based regression clustering is shown in Algorithm 3.
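As a concrete illustration of fitting a regression curve to one cluster, the sketch below solves a polynomial least-squares problem through the normal equations in plain Python. It is a stand-in for a library routine such as Matlab's `polyfit`, written under the assumption that a cluster is given as x and y coordinate lists; the function name `fit_polynomial` is hypothetical, and coefficients are returned lowest degree first.

```python
def fit_polynomial(xs, ys, order=2):
    """Least-squares polynomial coefficients [a0, a1, ...], lowest degree first."""
    n = order + 1
    # Normal equations (V^T V) a = V^T y for the Vandermonde matrix V.
    ata = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    aty = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        if abs(ata[pivot][col]) < 1e-12:
            raise ValueError("singular system: more distinct x values are needed")
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(ata[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (aty[r] - tail) / ata[r][r]
    return coeffs

# Usage: points that lie exactly on y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]
coeffs = fit_polynomial(xs, ys, order=2)
```

For larger or poorly scaled coordinate ranges, a QR-based solver is numerically safer than the normal equations used here.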

Algorithm 3. LSR-based regression for clustering results from Algorithm 2
Input: clustering results (CR) of the line segments in Algorithm 2, number-order regression based on the least squares method
Output: regression trajectories
Procedure:
FOR 1 to K // K is the number of clusters
  FOR 1 to n // n is the number of taxi GPS data points
    OutputV(x, y) ← polyfit(CR_K(n), K, number) // polyfit is the regression function based on LSR
    // OutputV is the output function, and x and y denote the x axis and y axis, respectively
  END FOR
END FOR

If the number of taxi GPS data points is $n$, the dimensionality is $m$, the number of clusters is $K$, and the number of iterations of FCML is $g$, then the time complexity of the presented method (FCML) is approximately $O(mn \log n) + O(gmnK \log n) + O(m^2 nK)$, where $O(mn \log n)$ is the complexity of generating the line segments, $O(gmnK \log n)$ is the complexity of FCML, and $O(m^2 nK)$ is the complexity of the LSR computation. On the one hand, the time complexity of FCML is lower than that of K-median, which is $O(n^2)$ or higher; on the other hand, it is approximately equal to that of K-means and FCM when only the clustering algorithm differs (that is, when K-means or FCM replaces the improved FCM). Note that the complexity of FCML, $O(\mathrm{FCML})$, does not account for Matlab built-in functions. The measured run times are presented in Table 1. However, the run times of K-means and K-median are higher than that of FCML, because K-means and K-median tend to get stuck in the local optimum.

Experiment Results
In this paper, experiments are presented in order to measure the performance of the clustering results on real-world taxi GPS datasets. The simulations are conducted in Matlab (v.2016b) on an Intel(R) Xeon(R) CPU E5-2658 at 2 × 2.10 GHz with 32 GB of RAM under Windows Server 2008, running on a VMware-based cloud platform. According to References [27,28], the PBM-index is superior to the common DB-index [46], Dunn's index [47], and XB index [24] as a measure of the goodness of clustering on different partitions of a given dataset, and the PBM-index has been proposed as an indicator of the goodness/validity of a cluster solution in spatial data processing [48,49]. Therefore, the PBM-index is employed to compare the clustering performance of FCML with that of other partition-based clustering algorithms (K-means, K-median, and FCM (fuzzy C-means)), as shown in Table 2 and Figure 11. A larger PBM-index value implies a better clustering result; in its definition, $K$ is the number of clusters of line segments.
In addition, the PBM-index is regarded as an unsupervised clustering evaluation index, and therefore a knowledge base about the true partitioning of the location data is not necessary. In other words, a map-based knowledge base of location information is not required, which results in trajectories that are not consistent with the map roads. Moreover, the regression trajectory based on LSR is not consistent with urban roads. Furthermore, FCML is an unsupervised LSR-based regression clustering algorithm that does not employ a knowledge base as support.
The partitioning-based K-means, K-median, and FCM clustering algorithms are compared with FCML in this paper, and the cluster numbers $K$ of the clustering algorithms (K-means, K-median, FCM, and FCML) are set to 20, 40, 80, and 100, respectively. The number of iterations is set to 100 (terminating the local optimum of K-means and K-median), which is used as the termination condition of the clustering algorithms. In addition, the termination condition can be defined as $\frac{J(g+1) - J(g)}{J(g)} < \varepsilon$, where $J(g+1)$ is the objective value of the next generation, $J(g)$ is the objective value of the current generation, and $\varepsilon$ is a given value (which is used to denote the minimum distance between line segments and the cluster centers of the line segments, e.g., $\varepsilon = 0.001$). The convergence of the clustering results is shown in Figure 12; the convergence values ($J$) in Figure 11 are normalized using the formula $\frac{J - J_{\min}}{J_{\max} - J_{\min}}$ in order to reach the same comparison standard, where $J_{\max}$ and $J_{\min}$ are the maximum and minimum values of $J$ (convergence values). In particular, when K-means, K-median, and FCM are used, they only replace the improved FCM, while all other operating conditions remain unchanged. In this paper, second-order regression of the LSR is employed to produce trajectories in order to better explain the tendency of urban state changes, without considering first-order (straight line) or other orders, as shown in Figure 13; other regression orders of LSR can also be set according to different requirements. The weight factor of the Lagrange penalty is set to $\rho = 0.3$, and the values $\alpha = 0.05$ and $\beta = 1$ are employed following the work in Reference [35]. In other words, once cluster results are obtained, the LSR can be employed to produce smooth trajectories, which may be used to design and plan urban roads, support urban development, and identify traffic trends.
Table 2 and Figure 11 show the superiority of the FCML clustering algorithm, which produces better PBM-index values than the other partitioning-based clustering algorithms (K-means, K-median, and FCM). Moreover, FCM and FCML exhibited better stability than the other algorithms over 20 different runs without suffering from randomness impacts, but the PBM-index values of FCML are clearly better than those of FCM, indicating that the Lagrange-based method used to improve FCM is effective.
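The PBM-index is commonly written as $\mathrm{PBM}(K) = \left(\frac{1}{K} \cdot \frac{E_1}{E_K} \cdot D_K\right)^2$, where $E_1$ is the summed distance of all data to the overall center, $E_K$ is the summed distance of the data to their own cluster centers, and $D_K$ is the largest distance between two cluster centers. The Python sketch below follows this common form under simplifying assumptions: hard cluster labels, Euclidean distance, and data given as coordinate tuples rather than line segments. The function name `pbm_index` is hypothetical.

```python
import math

def pbm_index(points, centers, labels):
    """PBM validity index for a hard partition; larger values indicate better clustering."""
    k = len(centers)
    dim = len(points[0])
    # Overall center of the data (the single-cluster case).
    grand = tuple(sum(p[d] for p in points) / len(points) for d in range(dim))
    e1 = sum(math.dist(p, grand) for p in points)
    # Within-cluster scatter for the K-cluster partition.
    ek = sum(math.dist(p, centers[lab]) for p, lab in zip(points, labels))
    if ek == 0:
        raise ValueError("within-cluster scatter is zero; the index is undefined")
    # Maximum separation between any two cluster centers.
    dk = max(math.dist(a, b) for a in centers for b in centers)
    return ((1.0 / k) * (e1 / ek) * dk) ** 2

# Usage: the same four points under a compact and a poor two-cluster partition.
pts4 = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = pbm_index(pts4, centers=[(0, 1), (10, 1)], labels=[0, 0, 1, 1])
poor = pbm_index(pts4, centers=[(5, 0), (5, 2)], labels=[0, 1, 0, 1])
```

The compact partition scores far higher than the poor one, which matches the reading that a larger PBM-index value implies a better clustering result.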
In Figure 13, the red curves express the regression trajectories of the clustering results for GPS data, which can be used to describe the tendency of urban state changes; the black squares stand for the cluster centers (line segments) of the clustering results, which can be used to explain road hot spots (e.g., traffic flow and population aggregation segments) in general. Figure 13 demonstrates that FCML obtains better clustering and regression results; for example, when K = 100, the cluster centers of FCML are evenly distributed along the main roads of Beijing without deviating too much, and the smooth trajectories stretch around the roads and the cluster centers. In particular, if the cluster centers are linked together according to roads, the number of cluster centers is sufficient, and a map-based knowledge base is available, some real trajectories can be constructed. However, the regression trajectory generation method in this paper is used directly to express the hidden information of GPS data and to explain the state changes of the city without producing real trajectories.
Figure 12 shows that FCML obtains a better solution than the other algorithms (K-means, K-median, and FCM), and that its convergence process is smooth, fast, and robust, without getting stuck in the local optimum and with reduced local information loss. Meanwhile, the convergence of FCM is also smooth and robust (except before 10 iterations), but FCML converges faster than FCM; namely, FCML begins to converge at 20 iterations, whereas FCM begins to converge at about 30 iterations when K = 20 and 40 and at about 50 iterations when K = 80 and 100. The regression trajectories can be used as a reference for city road planning and other fields of urban development. However, it should be noted that FCML is an unsupervised clustering method that produces trajectories without a map-based knowledge platform (e.g., Google Maps). In contrast, the K-means and K-median clustering algorithms exhibit premature convergence and suffer from instability, resulting in many empty clusters and the regression of long trajectories (see Table 3), as shown in Figure 13. In other words, a great quantity of line segments is gathered together, so that a single cluster contains many line segments, which increases computing time. For example, when K = 20, the number of line segments in each cluster is shown in Table 3. In addition, because K-means and K-median easily become stuck in the local optimum, a lot of time is spent gathering line segments into a cluster, and when the number of clusters changes from 20 to 40, 80, and 100, time consumption increases with the number of clusters. However, the run times of FCML and FCM exhibit only a slight fluctuation across different cluster numbers. The test results of time consumption are shown in Table 1, revealing that the run time of FCML is slightly lower than that of FCM. Meanwhile, the trajectories in Figure 13 can be used to explain the state of population migration around the trajectories and can serve as a reference for road planning in the city.

Conclusions
In this paper, we presented the FCML algorithm with the aim of achieving better line segment clustering performance, without getting stuck in the local optimum or losing local information, in order to obtain a more effective regression trajectory. In the FCML algorithm, we first presented a new concept of trajectory and a new line segment generation method in order to reduce local information loss. A new Lagrange-based method and a Hausdorff-based K-means++ method were then presented in order to improve the original fuzzy C-means clustering algorithm, which helped to avoid getting stuck in the local optimum and to improve the convergence speed. In our improved fuzzy C-means method, the new Lagrange operator was used to adjust and control the similarity between line segments using Equation (5), as well as to achieve the clustering operations. The Hausdorff-based K-means++ was employed to produce cluster centers. Finally, LSR was employed to achieve the regression of the clustering results and produce trajectories. In the experiments, we compared our method with three other clustering algorithms: K-means, K-median, and FCM. The experimental results showed that FCML works better than K-means, K-median, and FCM.
However, our method requires the user to define the number of clusters in advance. Therefore, in future work we will study a method that automatically generates the number of clusters, based on PBM combined with a noise and density method. Meanwhile, when the FCML technique is used to support urban development, a large volume of GPS datasets must be analyzed; thus, we will study cloud-based analysis techniques in the future. In particular, when FCML is used in the context of real urban development, a map-based knowledge platform needs to be established in future work.

Figure 2. Road structure of the origins and destinations (OD) in Beijing using taxis' GPS (Global Positioning System) data.

Figure 3. Line segments are produced in terms of taxi GPS data points (illustrated in Figure 2) using the angle-based approach.

Figure 4. Illustration of the GPS trajectory description based on angle.

Sub-trajectory: A sub-trajectory is a subset of a trajectory. In this paper, three GPS points selected by the shortest candidate Euclidean distance constitute a line segment, so the number of line segments is at most one third of the number of GPS points. Therefore, a combination of two or more of these line segments is considered a sub-trajectory of the trajectory. Note that we still use the Euclidean method to calculate the distance between GPS data points.
L-Similarity: There exist several methods to measure the similarity of line segments, such as those reported in References [13,33,34], as well as cosine similarity. Similarity results between line segments are usually used to achieve trajectory clustering and to produce sub-trajectories. In this paper, the line segments depend on angle changes (see Section 3.1). Therefore, the similarity measure based on multiple information sources (SMIS) [35] is employed to measure the similarity of line segments, as shown in Equation (1).

Figure 6. The overall flowchart of our methodology.

Figure 7. Three GPS data points selection method.

Algorithm 1. Angle-based partitioning and cosine-based constraint algorithm
Input: a given GPS dataset D including location information and angles, the number of iterations, the angle threshold T
DO
  P_i: count → D // select a data point P_i from D, and record the angle of P_i; mark as count = i
  // "count" is used to count the number of selected data points
  Distance → Euclidean(P_i, P_{i+1}) // calculate distances between the selected P_i and P_{i+1}
  // select the second point P_{i+1} from D
  // call the Matlab built-in function pdist2 to calculate the Euclidean distance
  Sort → Descending(Distance) // sort the distances in descending order according to the Euclidean distance
  DO Angle_i: count != null // indicates that the angle is effective in D
    Calculate: Angle → θ // calculate the angle difference of the selected data point P_i
    IF θ ≥ π
      Use Equation (2) to normalize the angles
    ELSE
      Use Equation (3) to normalize the angles
    END IF
  WHILE count ≤ 10 // "10" is a given condition, which is used to handle taxi GPS selection
  // we select another 10 GPS data points around the first point
    IF θ ≤ T
      (P_i, P_{i+1}) → Lx // indicates that the two taxi GPS data points have been chosen
      // denotes the shortest distance between P_i and P_{i+1}
    ELSE
      count ← count + 1
    END IF // when the DO ... WHILE does not satisfy any given values (e.g., 10), then continue to loop
    // is shown in Figure 6
  END WHILE
  IF Judge(Lx)_i == 2 // whether two data points are selected from D
    Select the third GPS data point P_{i−1} following the above operation steps
    // the above operation steps stand for the method used to select the first and second points
  END IF
  (P_{i−1}, P_i, P_{i+1}) → Lx // three GPS data points are selected from D
  IF Length(Lx) == 3
    γ → Cosine(P_{i−1}, P_i, P_{i+1}) // calculate the intersection angle between P_{i−1}, P_i and P_i, P_{i+1}, where P_i is a vertex
    IF γ ≥ T
      LineSegment ← (D − Lx(P_{i−1}, P_i, P_{i+1}))

Table 1. The average computational time (in minutes) of the clustering algorithms for real-world taxi GPS datasets.

Table 2. The maximum (Max), mean, and minimum (Min) values of the PBM-index obtained by the K-means, K-median, and FCML for 20 different runs for four real-world taxi GPS datasets. The bold font indicates the best values for real-world taxi GPS data.

Table 3. The number of line segments in each cluster using K-means, K-median, and FCML (K = 20).