A Segment-Based Trajectory Similarity Measure in the Urban Transportation Systems

With the rapid spread of handheld smart devices with built-in GPS, trajectory data from GPS sensors has grown explosively. Trajectory data has spatio-temporal characteristics and carries rich information. Trajectory data processing techniques can mine the patterns of human activities and the movement patterns of vehicles in intelligent transportation systems. Trajectory similarity measurement is one of the most important issues in trajectory data mining (clustering, classification, frequent pattern mining, etc.). Unfortunately, the main similarity measure algorithms for trajectory data have been found to be inaccurate, highly sensitive to sampling methods, and not robust to noisy data. To solve these problems, three distances and their corresponding computation methods are proposed in this paper. The point-segment distance decreases the sensitivity to point sampling methods. The prediction distance optimizes the temporal distance using the features of trajectory data. The segment-segment distance introduces the trajectory shape factor into the similarity measurement to improve accuracy. The three distances are integrated with the traditional dynamic time warping (DTW) algorithm to produce a new segment-based dynamic time warping (SDTW) algorithm. The experimental results show that the SDTW algorithm exhibits about 57%, 86%, and 31% better accuracy than the longest common subsequence (LCSS) algorithm, the edit distance on real sequence (EDR) algorithm, and DTW, respectively, and that its sensitivity to noisy data is lower than that of those algorithms.


Introduction
With the rapid development of sensor technology and the popularization of personal smart devices, GPS sensors are widely used to track moving objects such as people, cars, and animals, and a large amount of trajectory data emerges every day. The trajectory data from GPS sensors are spatio-temporal data sequences that record how the positions of mobile objects vary over space and time. With the development of the Internet of Things, urban computing, and other research fields, the analysis of spatio-temporal data in transportation systems has become a hot topic in machine learning. Trajectory data analysis can be a great driving force for all of these fields. For example, a trajectory similarity measure algorithm can be used to compute a distance matrix, which can then be used to cluster the trajectories of people's activities to find popular routes and hot spots and to visualize them in OpenStreetMap [1,2]. In intelligent transportation systems, it is of great practical value to measure the similarity of the trajectories of moving objects in a real-time, accurate, and reliable way. Intelligent trajectory measurement can not only provide accurate location-based services, but also monitor and estimate traffic jams [3].
In trajectory data mining, one of the most important and fundamental tasks is to compute the similarity between different trajectories. Based on the similarity measurement of trajectory data, the trajectories can be clustered, classified, and retrieved [4]. The accuracy of the similarity measurement significantly affects the accuracy of the trajectory data mining. In recent years, some mainstream algorithms for trajectory similarity measurement have been proposed, such as the dynamic time warping algorithm (DTW) [5], the longest common subsequence algorithm (LCSS) [6], and the edit distance on real sequence algorithm (EDR) [7]. Those algorithms obtain the similarity measurement by computing spatial point-to-point distances or temporal distances. However, they share drawbacks that result in low accuracy. For example, the DTW algorithm directly calculates the point-point distance, ignoring the influence of different trajectory sampling methods on the generated trajectory sequence. The LCSS algorithm neglects to optimize the temporal distance of the trajectory data. The EDR algorithm does not consider the trajectory shape factor. To improve the accuracy, a segment-based dynamic time warping algorithm (SDTW) is proposed to measure trajectory similarity. First, the proposed SDTW adopts the point-segment distance to reduce the sensitivity to trajectory sampling methods. Then, to account for the temporal distance, SDTW introduces the prediction distance, which converts the temporal distance into a spatial distance. Finally, SDTW introduces the segment-segment distance to improve the computation accuracy by adjusting the parameters of the shape factors.
The remainder of this paper is organized as follows. Section 2 discusses the related work and analyzes their drawbacks. Some definitions and problem statements are described in Section 3. Section 4 presents the proposed SDTW algorithm, and the performance evaluations are given in Section 5. Discussion and conclusions are given in Section 6.

Related Work
Trajectory sequence data can be regarded as time sequence data, and many approaches to trajectory similarity measurement are adapted from similarity measures for time sequence data. The simplest trajectory similarity measure is the Euclidean distance, but its accuracy is poor under local time shifts or when the trajectories do not have the same length [8]. To improve the accuracy of similarity measurement, the dynamic time warping algorithm (DTW), the longest common subsequence algorithm (LCSS), and the edit distance on real sequence algorithm (EDR) were proposed and are widely applied.
Based on dynamic programming, which finds the optimal matching point pairs between the trajectory points, the DTW can effectively handle local time shifting and trajectories of different lengths [5]. The DTW algorithm was first introduced for speech recognition and later applied to time sequence analysis. The LCSS adopts a threshold ε to identify matching point pairs [6], but it is a coarse-grained similarity measure that does not consider non-matching pairs of points. The EDR is an edit distance-based algorithm that, unlike the LCSS, uses a threshold ε to identify both the matching point pairs and the non-matching points. These similarity measurement algorithms can be divided into two types [8]: those based on the L1 and L2 norms, such as the DTW, and those that compute similarity scores based on a matching threshold, such as the LCSS and the EDR.
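As a point of reference for the point-based baseline discussed above, the following is a minimal sketch of classic DTW over two sequences of 2D points. The function name `dtw` and the choice of the Euclidean (L2) point-point distance are illustrative assumptions, not code from the cited works.

```python
import math

def dtw(r, s):
    """Accumulated DTW distance between two sequences of (x, y) points."""
    n, m = len(r), len(s)
    inf = float("inf")
    # acc[i][j] = cost of the best warping path aligning r[:i] with s[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(r[i - 1], s[j - 1])  # point-point Euclidean distance
            acc[i][j] = cost + min(acc[i - 1][j],      # r advances, s repeats
                                   acc[i][j - 1],      # s advances, r repeats
                                   acc[i - 1][j - 1])  # both advance
    return acc[n][m]
```

Because a point may be matched to several points of the other sequence, a resampled copy of the same path with a repeated point still has distance zero, which is how DTW tolerates local time shifts and unequal lengths.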
Wang et al. evaluated the accuracy of the main similarity measurement algorithms, DTW, LCSS, ERP [9], EDR, and SpaDe [10], on different time sequence datasets [11]. The experimental results demonstrate that the DTW algorithm obtains the most accurate similarity measurements on the majority of datasets, although its computation is slow. Based on these evaluation results, many similarity measurement algorithms, such as Kim [12], Keogh [13], and Improved [14], have been proposed to reduce the computational complexity while keeping the same measurement accuracy as the DTW algorithm.
From the above analysis, these algorithms share common drawbacks that affect the accuracy of similarity measurement. (1) The DTW, LCSS, and EDR algorithms only compare individual points. In fact, different sampling methods can produce different trajectory sequences, which has a significant negative impact on the final measurement results [15]. As shown in Figure 1a, a curved trajectory (drawn with an arrow) may be sampled in two ways, giving the trajectory sequences T1 and T2. The two original trajectories are essentially identical, but their trajectory sequences are quite different. In Figure 1b, two trajectories intersect at point P, and their trajectory sequences T1 and T2 both sample the point P. The two trajectories are obviously different, but the difference is weakened by the shared intersection point P. Thus, computing the trajectory similarity purely from discrete trajectory points loses details of the trajectories, and it is necessary to find a way to keep those details to a certain extent. (2) Because they only consider the distances between pairs of points, the mentioned algorithms cannot take shape factors into account [16]. However, shape is an important feature of a natural trajectory, and ignoring it may reduce the computation accuracy. (3) Most similarity measurement algorithms are derived from time sequence similarity computation and do not consider the temporal distance between two trajectory points. Since measuring time is different from measuring space, it makes no sense to simply add the two as a weighted sum. To address this, Lee et al. proposed a trajectory distance measurement method based on the weighted addition of the parallel distance, the perpendicular distance, and the angle distance [1]. Unfortunately, the method proposed by Lee et al. cannot solve problems (1) and (2).
To solve the above three problems, a segment-based trajectory similarity measurement algorithm is proposed to improve the accuracy.
Sensors 2017, 17, 524

Problems and Definitions
Mobile objects generally have both time and space attributes. Space attributes can be three-dimensional or two-dimensional. Two-dimensional space is the most widely used, so "space" refers to two-dimensional space throughout this paper. A trajectory records the continuous movement trace of a mobile object. Due to the limitations of GPS sensors, a trajectory T is recorded as a series of points (x, y, t), where (x, y) is the recorded spatial position and t is the recorded time. For clarity, a natural trajectory and a trajectory sequence are strictly distinguished.

Definition 1 (natural trajectory). A continuous trajectory of a mobile object.
Definition 2 (trajectory sequence). In a given Euclidean space, a natural trajectory can be expressed as T = {P_1, P_2, ..., P_n}, where the discrete trajectory points are ordered by time, P_i = (x_i, y_i, t_i) is trajectory point i, and n is the number of points in the trajectory. T is the trajectory sequence recorded from the natural trajectory.
A trajectory sequence consists of a series of discrete points. Two adjacent discrete points are connected to form a sub-trajectory segment, as illustrated in Figure 2. Moreover, a real trajectory segment must exist between two adjacent discrete points.
Definition 4 (natural sub-trajectory segment). The part of the natural trajectory between two adjacent discrete trajectory points is a natural sub-trajectory segment.
Definition 5 (proxy natural sub-trajectory segment). In Figure 3, P_seg1 denotes the midpoint of the natural sub-trajectory segment between P_i and P_{i−1}, and P_seg2 denotes the midpoint of the natural sub-trajectory segment between P_i and P_{i+1}. The proxy natural sub-trajectory segment of trajectory point P_i is the part of the natural trajectory between P_seg1 and P_seg2.
Definition 6 (proxy sub-trajectory). P_mid1 is the midpoint of the sub-trajectory segment between P_i and P_{i−1}, and P_mid2 is the midpoint of the sub-trajectory segment between P_i and P_{i+1}. The sub-trajectory formed by P_mid1 P_i and P_i P_mid2 is the proxy sub-trajectory of P_i.
A discrete trajectory sequence represents a whole natural trajectory. Each trajectory point in the sequence represents a part of the natural trajectory, called the proxy natural sub-trajectory of the point. A natural trajectory can only be stored as a trajectory sequence; thus, the proxy natural sub-trajectory segment cannot be obtained, and only the proxy sub-trajectory of each trajectory point is available.
The problem to be solved in this paper is to compute the distance Dist(R, S) between two given trajectory sequences R and S, where R = {P_1, P_2, ..., P_n} and S = {SP_1, SP_2, ..., SP_m}. The larger the distance, the lower the similarity Sim(R, S).
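To make the proxy sub-trajectory of Definition 6 concrete, the sketch below builds it for one point of a trajectory sequence. The helper names `midpoint` and `proxy_sub_trajectory` are hypothetical, and the handling of the first and last points (where one neighbour is missing) is an assumption, since the definitions above do not cover the boundary case.

```python
def midpoint(p, q):
    """Midpoint of two (x, y) points."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def proxy_sub_trajectory(points, i):
    """Return (P_mid1, P_i, P_mid2) for the i-th point (0-based) of a sequence.

    points is a list of (x, y) tuples ordered by time.
    Assumption: at either end of the sequence the missing midpoint
    collapses to the point itself.
    """
    p = points[i]
    mid1 = midpoint(points[i - 1], p) if i > 0 else p
    mid2 = midpoint(p, points[i + 1]) if i < len(points) - 1 else p
    return mid1, p, mid2
```

For the sequence (0, 0), (2, 0), (2, 4), the proxy sub-trajectory of the middle point runs from (1, 0) through (2, 0) to (2, 2).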

SDTW Algorithm
Because they ignore the relationship between a trajectory sequence and its natural trajectory, current trajectory similarity measurement algorithms are sensitive to the sampling method. To reduce this sensitivity, the point-point distance can be converted to a distance from a point to a specific segment, which is defined as the point-segment distance. There is also a fundamental difference between the temporal distance and the spatial distance of trajectory points. In this paper, the time difference and the trajectory's shape are combined to convert a temporal distance into a spatial distance, called the prediction distance. Finally, the DTW algorithm only uses the point-point distance and does not consider shape, an important characteristic of a trajectory, which lowers its accuracy. If shape factors are included, the accuracy of the similarity measurement can be improved. A trajectory sequence is regarded as multiple continuous trajectory segments, and its shape is captured by the angles between trajectory segments. The included angle can therefore be incorporated into the similarity calculation, and the result is the segment-segment distance.
The above three distances are integrated with the traditional DTW algorithm to propose a new segment-based dynamic time warping algorithm (SDTW). SDTW adopts the point-segment distance, prediction distance, and segment-segment distance to compute the accumulative distance of two trajectory sequences, which can improve the accuracy.

Point-Segment Distance
The spatial distance of two trajectory points can be converted to the spatial distance of their proxy trajectories; the point-segment distance is, in effect, the spatial distance of a pair of trajectory points. The distance between the two proxy trajectories, S_true, is the area enclosed by them (Figure 4a). Because this region is an irregular polygon, its area is difficult to compute. The sum of S_1, the area enclosed by P_1 and Seg_2, and S_2, the area enclosed by P_2 and Seg_1 (Figure 4b), is positively correlated with S_true. That is, when the two proxy trajectories are displaced relative to each other, S_1 + S_2 changes in the same direction as S_true. So S_true can be replaced with the sum of S_1 and S_2.

However, a distance calculation based purely on the enclosed area is not an effective approach, especially for a trajectory point with a long proxy sub-trajectory, which results in a larger area enclosed by it and other proxy trajectories. From the above analysis, the lengths of Seg_1 and Seg_2 are positively correlated with this effect: the longer a trajectory is, the worse the result is. Dividing the area by the segment length, S/Seg, converts the spatial distance between P_1 and P_2 into the sum of the distance from P_1 to Seg_2 and the distance from P_2 to Seg_1. That is, S/Seg is the sum of point-segment distances.
Assume that P_i(x_i, y_i) is trajectory point i on the trajectory sequence R, and SP_j(x_j, y_j) is trajectory point j on the trajectory sequence S. Define dist_ps(P_i, SP_j) as the point-segment distance from P_i to SP_j, and dist_ps(SP_j, P_i) as the point-segment distance from SP_j to P_i; in general, dist_ps(P_i, SP_j) ≠ dist_ps(SP_j, P_i). Figure 5 illustrates the point-segment distance computation.
To compute dist_ps(P_i, SP_j), first compute the midpoint P_mid1(x_mid1, y_mid1) of SP_j and SP_{j−1} and the midpoint P_mid2(x_mid2, y_mid2) of SP_j and SP_{j+1}, as in Equation (1):
$$x_{mid1} = \frac{x_j + x_{j-1}}{2},\quad y_{mid1} = \frac{y_j + y_{j-1}}{2},\quad x_{mid2} = \frac{x_j + x_{j+1}}{2},\quad y_{mid2} = \frac{y_j + y_{j+1}}{2} \tag{1}$$
Then the shortest distance dist_ps(P_i, R_seg) between P_i and the segment R_seg constructed from these midpoints is computed with the point-to-segment formula of [17] (Equation (2)). dist_ps(SP_j, P_i) is computed in the same way as dist_ps(P_i, SP_j), and the spatial distance dist_p(P_i, SP_j) between P_i and SP_j used by the SDTW is given by Equation (3).
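The point-segment computation can be sketched as follows, using the standard projection-and-clamp formula for the shortest distance from a point to a line segment. The function names are hypothetical. Treating the proxy sub-trajectory as the two pieces P_mid1 to SP_j and SP_j to P_mid2 and taking the nearer one is an assumption about how R_seg is formed; the way the two directed distances are combined into dist_p is not shown here.

```python
import math

def point_to_segment(p, a, b):
    """Shortest Euclidean distance from point p to the closed segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide, fall back to point-point distance
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection of p onto line a-b, clamped to [0, 1]
    u = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    u = max(0.0, min(1.0, u))
    return math.hypot(px - (ax + u * dx), py - (ay + u * dy))

def dist_ps(p, mid1, sp, mid2):
    """Distance from p to the proxy sub-trajectory mid1 -> sp -> mid2.

    Assumption: the proxy sub-trajectory is a two-piece polyline and the
    distance is that of the nearer piece.
    """
    return min(point_to_segment(p, mid1, sp), point_to_segment(p, sp, mid2))
```

Because the distance is measured to a stretch of the other trajectory rather than to a single sample, shifting the sampling positions along the same path changes the result less than a point-point distance would.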

Prediction Distance
Most trajectory similarity measurement algorithms are adapted from time sequence similarity algorithms without being optimized for trajectory data. However, measuring time series data and measuring the space of a trajectory are essentially different, so a way to calculate the temporal distance together with the spatial distance is needed.
In Figure 6, the time distance between P_i on trajectory R and SP_j on trajectory S is computed. The timestamp of P_i is t_i, and the timestamp of SP_j is t_j. The difference between t_i and t_j can be reflected on a specific trajectory. Regard P_i as a mobile object. When t_i < t_j, its location after the time interval t_j − t_i is the location of R at the timestamp t_j, known as the prediction position of P_i and denoted P_i'. The temporal distance between P_i and SP_j is converted into the spatial distance between the prediction position P_i' and SP_j, known as the prediction distance. It converts a temporal distance into a spatial distance and reflects the time distance of trajectory points on the trajectory. The prediction distance therefore has good interpretability and can effectively improve the accuracy of similarity measurements. Because the natural trajectory cannot be recorded, the similarity measurement must be based on the trajectory sequence data.
Assume that P_i(x_i, y_i, t_i) is trajectory point i on trajectory R and SP_j(x_j, y_j, t_j) is trajectory point j on trajectory S. To compute the prediction distance between them, first compare their timestamps. The point with the earlier timestamp is set as A, and the point with the later timestamp as B; their time difference is Δt = t_B − t_A. The next step is to compute the prediction location of point B, named B'. Since the information stored in the trajectory sequence is limited and the position of the moving object is not available at every instant, B' is only an approximate position. The timestamps of the trajectory points are traversed to find the interval that contains the timestamp t_B + Δt. Suppose that at the timestamp t_B + Δt, point B lies between trajectory points i − 1 and i. Assuming uniform linear motion between any two adjacent points on the trajectory, the velocity between the two points is
$$v_x = \frac{x_i - x_{i-1}}{t_i - t_{i-1}},\quad v_y = \frac{y_i - y_{i-1}}{t_i - t_{i-1}},$$
and the spatial coordinates (x_{B'}, y_{B'}) of the prediction position B' are
$$x_{B'} = x_{i-1} + v_x\,(t_B + \Delta t - t_{i-1}),\quad y_{B'} = y_{i-1} + v_y\,(t_B + \Delta t - t_{i-1}).$$
If no recorded interval of the trajectory contains the timestamp t_B + Δt, B' is estimated by extrapolating from the end of the trajectory, where N is the total number of points on the trajectory on which point B is located. The prediction distance between A and B is then given by Equation (7), where dist(A, B') is the Euclidean distance between A and B' in the coordinate system. The prediction distance between A and B also represents the point-segment distance between point A and segment BB'.
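The search for the prediction position can be sketched as below, assuming uniform linear motion between adjacent samples as stated above. The function name `predict_position` is hypothetical, the use of Python's `bisect` for the interval search is an implementation choice, and the extrapolation rule outside the recorded time range (reuse the velocity of the nearest interval) is an assumption. The final prediction distance of Equation (7) is not reproduced here.

```python
from bisect import bisect_left

def predict_position(traj, t_query):
    """Estimate the (x, y) position on a trajectory at time t_query.

    traj is a list of (x, y, t) tuples with strictly increasing t and at
    least two points.
    """
    times = [p[2] for p in traj]
    # Binary search: first index k with times[k] >= t_query
    k = bisect_left(times, t_query)
    if k == 0:
        k = 1                   # at or before the first sample: use the first interval
    elif k == len(traj):
        k = len(traj) - 1       # after the last sample: extrapolate from the last interval
    (x0, y0, t0), (x1, y1, t1) = traj[k - 1], traj[k]
    # Velocity under the uniform linear motion assumption
    vx = (x1 - x0) / (t1 - t0)
    vy = (y1 - y0) / (t1 - t0)
    return (x0 + vx * (t_query - t0), y0 + vy * (t_query - t0))
```

Each lookup costs one binary search over the timestamps, so repeating it for every pair of points adds a logarithmic factor to the pairwise distance computation.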

Segment-Segment Distance
Suppose that S_i is segment i on the trajectory R, and SS_j is segment j on the trajectory S. The two endpoints of S_i are P_i(x_i, y_i) and P_{i+1}(x_{i+1}, y_{i+1}), and the two endpoints of SS_j are SP_j(x_j, y_j) and SP_{j+1}(x_{j+1}, y_{j+1}). The segment-segment distance dist_s(S_i, SS_j) is calculated as follows.
The point-point distance includes the spatial distance and the temporal distance. The spatial-temporal distance dist_st(P_i, SP_j) between P_i and SP_j is given by Equation (8), where t is the time sensitivity parameter. The larger t is, the more sensitive the distance is to the time dimension; when t = 0, the time dimension is neglected. The segment-segment spatial-temporal distance is the sum of the spatial-temporal distances between the corresponding ends of the two segments. dist_st(S_i, SS_j) denotes the segment-segment spatial-temporal distance of S_i and SS_j, as shown in Figure 7, and is calculated as in Equation (9):
$$dist_{st}(S_i, SS_j) = dist_{st}(P_i, SP_j) + dist_{st}(P_{i+1}, SP_{j+1}) \tag{9}$$
Then dist_st(S_i, SS_j) and the angle distance are combined to calculate the segment-segment distance. The included angle between S_i and SS_j, denoted θ, is computed by Equation (10). Under otherwise identical conditions, a larger included angle θ should enlarge dist_st(S_i, SS_j) by a corresponding factor. Thus, θ is integrated with dist_st(S_i, SS_j) in Equation (11) through a factor f(θ), which is computed by Equation (12). In Equation (12), ω is an adjustable parameter, the shape negative factor: the greater ω is, the less sensitive the distance is to the shape factor. If there are no special requirements, let ω = 1. dist_smid(S_i, SS_j) is the spatial-temporal distance between the midpoints of S_i and SS_j, and dist_max(R, S) is the maximum temporal distance between any two points of trajectory sequences R and S. Furthermore, it makes little sense to compare the shapes of two trajectory sequences that are far apart: the shorter the distance, the more important the shape factor. Thus, dist_max(R, S) is used to dynamically adjust the weight of the shape factor.
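Two ingredients of the segment-segment distance can be sketched without knowing the exact weighting function: the included angle θ between two segments and the sum of point distances at the segment ends. In this sketch the function names are hypothetical, the angle is taken as the arccosine of the normalized dot product of the direction vectors, and `point_dist` stands in for whatever spatial-temporal point distance is used. The factor f(θ) with ω and dist_max is deliberately left out.

```python
import math

def included_angle(seg_a, seg_b):
    """Angle in radians, in [0, pi], between the directions of two segments.

    Each segment is ((x_start, y_start), (x_end, y_end)).
    """
    (ax0, ay0), (ax1, ay1) = seg_a
    (bx0, by0), (bx1, by1) = seg_b
    ux, uy = ax1 - ax0, ay1 - ay0
    vx, vy = bx1 - bx0, by1 - by0
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    if norm == 0.0:
        return 0.0  # assumption: a zero-length segment carries no direction
    # Clamp to [-1, 1] to guard against floating-point rounding
    cos_theta = max(-1.0, min(1.0, (ux * vx + uy * vy) / norm))
    return math.acos(cos_theta)

def segment_st_distance(seg_a, seg_b, point_dist):
    """Sum of the point distances between the matching ends of two segments."""
    return point_dist(seg_a[0], seg_b[0]) + point_dist(seg_a[1], seg_b[1])
```

As a usage example, `segment_st_distance(seg_a, seg_b, math.dist)` gives the purely spatial version, which corresponds to ignoring the time dimension.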

SDTW Computation
After all of the segment-segment distances between trajectory sequences R and S have been calculated, the accumulative distance is computed following the idea of the DTW algorithm. Similar to the DTW algorithm, the SDTW accumulative distance is defined recursively in Equation (13):
$$SDTW(R, S) = \begin{cases} 0 & \text{if } n = 0 \text{ and } m = 0 \\ \infty & \text{if } n = 0 \text{ or } m = 0 \\ dist_s(Head(R), Head(S)) + \min\{SDTW(R, Rest(S)),\ SDTW(Rest(R), S),\ SDTW(Rest(R), Rest(S))\} & \text{otherwise} \end{cases} \tag{13}$$
where n is the number of line segments on the trajectory sequence R, m is the number of line segments on the trajectory sequence S, Head(R) is the first trajectory segment S_1 of R, and Rest(R) is the trajectory sequence that remains after Head(R) is removed from R. That is to say, dist_s(Head(R), Head(S)) represents the segment-segment distance between Head(R) and Head(S).
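The Head/Rest recursion can be sketched directly with memoization. This is a minimal illustration assuming the usual DTW base cases (zero when both sequences are exhausted, infinity when only one is); the function name `sdtw_accumulate` is hypothetical and `seg_dist` is a placeholder for the segment-segment distance.

```python
from functools import lru_cache

def sdtw_accumulate(r_segs, s_segs, seg_dist):
    """Accumulated distance over two segment sequences, DTW style."""
    n, m = len(r_segs), len(s_segs)

    @lru_cache(maxsize=None)
    def rec(i, j):
        # r_segs[i:] and s_segs[j:] play the roles of R and S;
        # r_segs[i] is Head(R) and r_segs[i + 1:] is Rest(R).
        if i == n and j == m:
            return 0.0
        if i == n or j == m:
            return float("inf")
        head = seg_dist(r_segs[i], s_segs[j])
        return head + min(rec(i, j + 1),      # SDTW(R, Rest(S))
                          rec(i + 1, j),      # SDTW(Rest(R), S)
                          rec(i + 1, j + 1))  # SDTW(Rest(R), Rest(S))

    return rec(0, 0)
```

Memoization keeps the number of evaluated (i, j) pairs at O(mn). For long trajectories an iterative table fill avoids Python's recursion depth limit, which this recursive form can hit.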
The computed accumulative distance is negatively correlated with the similarity between the trajectory sequences. The accumulative distances of different pairs of trajectories can differ greatly, so they cannot be compared directly. It is necessary to map the accumulative distance into the range [0, 1], where 0 means the two trajectories are irrelevant and 1 means the two trajectories are the same. The conversion uses the Gaussian kernel function shown in Equation (14), where D is the accumulative distance of R and S and σ describes the sensitivity of the similarity to the accumulative distance. For the same D, the similarity is higher when σ is larger and lower when σ is smaller. In Figure 8, with d = 10, the value of sim grows slowly as σ increases from 0 to 1.5, grows rapidly for σ between 1.5 and 6, and grows slowly again for σ greater than 6, approaching 1.
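The distance-to-similarity conversion can be illustrated with one common Gaussian-kernel form, sim = exp(−D² / (2σ²)). This exact exponent is an assumption: the text only states that a Gaussian kernel is used, so the paper's Equation (14) may be scaled differently. The function name `similarity` is hypothetical.

```python
import math

def similarity(d, sigma):
    """Map an accumulated distance d >= 0 into (0, 1] with a Gaussian kernel.

    Assumed form: exp(-d^2 / (2 * sigma^2)); sigma must be positive.
    """
    return math.exp(-(d * d) / (2.0 * sigma * sigma))
```

Under this form a distance of zero maps to 1, and for a fixed distance the similarity increases with σ, in line with the qualitative behaviour described above.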

To sum up, Algorithm 1 gives the pseudocode of the SDTW algorithm for computing the similarity of two trajectory sequences. As described in Algorithm 1, it first calculates the point-segment distance between each pair of track points in the two trajectories according to Equation (3). If the trajectory data has a temporal attribute, Equation (7) is also used to calculate the prediction distance. The segment-segment distance between each pair of segments is then calculated using Equations (11) and (12). The subsequent calculation is the same as in the DTW: after the accumulation distance matrix is initialized, the final result is calculated using Equations (13) and (14).
SDTW needs to traverse every trajectory point of the two trajectories when calculating the point-segment distance, the prediction distance, and the segment-segment distance. caclPSDistance calculates the point-segment distance of two points in the two different trajectory sequences based on Equation (3). caclTDistance calculates the prediction distance of two points in the two different trajectory sequences based on Equation (7). caclSDistance calculates the segment-segment distance based on Equation (11). caclPSDistance and caclSDistance only involve the points or segments being compared, without considering the other points or segments, so each call takes constant time and evaluating them over all point pairs of the two trajectories costs O(mn). In Equation (7), a binary search is used to find the trajectory point interval in which the predicted point is located, so the overall computational complexity of caclTDistance is O(mn log(m + n)). The computational complexity of DTW is also O(mn). Therefore, the computational complexity of SDTW is O(mn) for trajectory data without the timestamp attribute and O(mn log(m + n)) for data with the timestamp attribute.
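The logarithmic factor comes from the interval lookup. A minimal sketch of such a lookup is shown below, assuming that the timestamps of a trajectory are sorted in ascending order; the function name `locate_interval` and the clamping of out-of-range queries are assumptions for illustration, and the prediction distance of Equation (7) itself is not reproduced.

```python
from bisect import bisect_right
from typing import Sequence


def locate_interval(timestamps: Sequence[float], t: float) -> int:
    """Return i such that timestamps[i] <= t < timestamps[i + 1].

    `timestamps` must be sorted ascending and hold at least two entries.
    Queries before the first or at/after the last timestamp are clamped
    to the first or last interval, respectively.
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    # bisect_right runs in O(log n) on a sorted sequence.
    i = bisect_right(timestamps, t) - 1
    return max(0, min(i, len(timestamps) - 2))
```

For example, with timestamps `[0, 10, 20, 30]`, a query at `t = 25` falls in the interval with index 2.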
In this paper, the SDTW algorithm does not change the core concept of the DTW; it only replaces the distance computation of the DTW with the three types of distance. The SDTW can therefore also use the lower-bound techniques developed for the DTW distance to improve the execution efficiency. Moreover, the point-segment distance, prediction distance, and segment-segment distance can be integrated with the LCSS, EDR, and other algorithms to derive new approaches to similarity measurement.

Experimental Dataset and Metrics
The datasets used in the experiments are the GeoLife GPS Trajectories dataset from Microsoft Research [18] and the CVRR Trajectory Analysis Dataset [19].
The experiments use the GeoLife dataset to compare DTW and SDTW. The dataset consists of the GPS trajectory data of 182 users over five years, with a total length of 1,292,951 km. However, the individual trajectory sequences are too long, so highly similar trajectories are rare. The trajectory sequences in the dataset are therefore split into about 500,000 shorter ones, which are indexed with an R*-tree. The dataset does not provide the relationships between trajectory sequences, so the experimental results are evaluated through visual analysis.
The experiments use the CVRR dataset to quantitatively analyze the accuracy and robustness of the measurement algorithms and the effects of the parameters. The dataset is designed specifically for assessing trajectory analysis algorithms, and it mainly includes three types of trajectory data: the I5 dataset, the driving trajectories of cars on a two-way highway; the Labomni dataset, the data of people walking in the laboratory (Figure 9a); and the Cross dataset, the simulation of vehicles driving straight and turning at crossroads (Figure 9b). All of these datasets label the cluster of each trajectory. The datasets can be clustered based on a trajectory similarity measure algorithm, and the obtained clustering results can be compared with the correct clusters labeled in the dataset to analyze the accuracy of the proposed SDTW algorithm. It should be noted that the I5 dataset is comprised mainly of linear trajectories, on which most algorithms obtain good results. Therefore, the experiments only use the Cross and Labomni datasets.
In the experiments, the error rate is used as one of the metrics to evaluate the performance. Definition 7 (error rate). The error rate (ER) is the rate of wrongly-clustered trajectory sequences, which is different from CCR [20]. The lower the value, the higher the accuracy of the algorithm. Suppose the number of the known trajectory sequences is N, the total number of clusters is k, and the number of sequences correctly assigned to class c is p_c. The error rate is defined as follows:
$$ER = 1 - \frac{1}{N}\sum_{c=1}^{k} p_c$$
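As a small worked sketch, the error rate can be computed from the per-class counts of correctly clustered sequences. The sketch reads ER as the fraction of sequences that are not counted as correct, that is, ER = 1 - (1/N) Σ p_c; this reading is inferred from the definitions of N, k, and p_c, and the function name `error_rate` is hypothetical.

```python
from typing import Sequence


def error_rate(correct_per_class: Sequence[int], total: int) -> float:
    """Return ER = 1 - (sum of p_c) / N.

    correct_per_class[c] is p_c, the number of sequences correctly
    assigned to class c; total is N, the number of known sequences.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if any(p < 0 for p in correct_per_class):
        raise ValueError("counts must be non-negative")
    correct = sum(correct_per_class)
    if correct > total:
        raise ValueError("more correct sequences than sequences in total")
    return 1.0 - correct / total
```

For instance, with N = 100 sequences in k = 3 classes and 40, 35, and 20 sequences clustered correctly, 5 sequences are wrongly clustered and ER is 0.05.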

Search Similar Trajectory
The experiments use the same dataset of trajectory sequences for the trajectory queries. The SDTW and DTW algorithms are each executed to obtain the top 15 trajectory sequences in the dataset that are most similar to the original query trajectory, and the results are displayed visually on the map. The original query trajectory is shown in Figure 10a, and the query results of the SDTW algorithm and the DTW algorithm are shown in Figure 10b,c, respectively.
In Figure 10b, most of the returned trajectory sequences are close to the original query trajectory and have a high similarity in shape. In Figure 10c, many of the returned trajectory sequences have a low similarity in shape compared with the original query trajectory. The reason is that the SDTW algorithm considers the shape factor of the natural trajectory and uses the point-segment distance to reduce the accuracy loss caused by the sampling method, so the SDTW algorithm is more accurate than the DTW algorithm.
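The query step can be sketched as a top-k selection over similarity scores. This is only an illustration under stated assumptions: the similarity function `sim` is supplied by the caller (SDTW or DTW followed by the similarity conversion in the experiments), candidates are scanned exhaustively, and the names `top_k_similar` and `candidates` are hypothetical.

```python
import heapq
from typing import Callable, List, Mapping, Tuple, TypeVar

T = TypeVar("T")


def top_k_similar(query: T, candidates: Mapping[str, T],
                  sim: Callable[[T, T], float],
                  k: int = 15) -> List[Tuple[str, float]]:
    """Return the k candidate ids with the highest similarity to query.

    The result is sorted from most to least similar; ties are broken
    by the candidate id.
    """
    scored = ((sim(query, traj), key) for key, traj in candidates.items())
    best = heapq.nlargest(k, scored)
    return [(key, score) for score, key in best]
```

`heapq.nlargest` keeps only k entries in memory at a time, which is convenient when the candidate set is as large as the split GeoLife dataset.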

Clustering Error Rate Comparison
The experiment is based on the CVRR dataset, with CLUTO [21] as the clustering tool. CLUTO is a software package for clustering low-dimensional and high-dimensional data and for analyzing the characteristics of the resulting clusters. CLUTO provides a variety of optimized clustering algorithms and supports trajectory clustering based on a similarity matrix.
In order to evaluate the accuracy of the similarity measurement, four algorithms, LCSS, EDR, DTW, and SDTW, are selected to cluster the trajectory sequences. First, the trajectory similarity matrices of the two datasets are generated with each similarity measurement algorithm. Then, each matrix is clustered with agglomerative hierarchical clustering (AHC) and with rbr with global optimization. Finally, the maximum ER of each dataset is regarded as the final clustering error rate. The LCSS and EDR algorithms require a threshold ε; in the experiments, their maximum ER is calculated with ε varying from 1 to 5 in steps of 1.0. The SDTW algorithm requires the parameter ω; its maximum ER is calculated with ω varying from 1 to 10 in steps of 1.0. It should be noted that choosing the parameter σ of the Gaussian kernel function such that $\frac{1}{N^2}\sum_{i}\sum_{j} sim_{ij} = 0.1$ can produce very good clustering results [22]. Figure 11 compares the clustering error rates of the four algorithms.
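The choice of σ can be sketched as a one-dimensional search. The sketch below assumes the kernel sim = exp(-D / (2σ²)), which is an assumption because the exact Gaussian kernel of the paper is not reproduced in this text, and it assumes that N is the number of trajectories, so that the mean runs over the full N × N distance matrix. Because the mean similarity grows monotonically with σ under this kernel, a bisection is sufficient. The names `mean_similarity` and `calibrate_sigma` and the search bounds are hypothetical.

```python
import math
from typing import Sequence


def mean_similarity(dist: Sequence[Sequence[float]], sigma: float) -> float:
    """Mean of exp(-D_ij / (2 sigma^2)) over all N x N entries."""
    n = len(dist)
    total = sum(math.exp(-d / (2.0 * sigma ** 2)) for row in dist for d in row)
    return total / (n * n)


def calibrate_sigma(dist: Sequence[Sequence[float]], target: float = 0.1,
                    lo: float = 1e-3, hi: float = 1e3,
                    iters: int = 100) -> float:
    """Find sigma in [lo, hi] whose mean similarity equals target."""
    if not mean_similarity(dist, lo) <= target <= mean_similarity(dist, hi):
        raise ValueError("target is not reachable within [lo, hi]")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if mean_similarity(dist, mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

Note that the diagonal entries (D = 0) always contribute a similarity of 1, so a target of 0.1 is only reachable when the matrix has more than ten trajectories.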

As shown in Figure 11, the algorithms yield different error rates on different datasets, but the ordering of the error rates is the same. LCSS, DTW, and SDTW obtain good clustering results, whereas the clustering effect of EDR is poor. The error rate of the SDTW is the lowest, and the error rate of the LCSS is higher than that of the DTW. With the Cross dataset, the error rate of the SDTW is 80%, 96.12%, and 44% lower than that of the LCSS, EDR, and DTW, respectively; with the Labomni dataset, it is 35.82%, 77.01%, and 18.87% lower, respectively. To sum up, the SDTW algorithm obtains better accuracy than the DTW, LCSS, and EDR. The reason is that the SDTW algorithm introduces the prediction distance to convert the temporal distance into the spatial distance, thereby taking the temporal distance factor into account. Additionally, SDTW introduces the segment-segment distance to improve the computation accuracy by adjusting the parameters of the shape factors.

Noise Effect Analysis
To evaluate the robustness of the various algorithms, noisy data at different levels is superimposed onto the original trajectory sequence data. The noise rate reflects the ratio of deviated points in the trajectory sequence. A noise rate of λ (0 ≤ λ ≤ 1) indicates that 100λ% of the trajectory points in the original data have been deviated to a certain extent. Because the noise is random, the experiments are repeated 20 times and the average value is taken to ensure the accuracy of the results.
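The noise injection can be sketched as follows. The source only states that a fraction λ of the points is deviated by a random amount, so the uniform offset, the `max_offset` parameter, and the function name `add_noise` are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def add_noise(traj: Sequence[Point], rate: float, max_offset: float,
              rng: random.Random) -> List[Point]:
    """Return a copy of traj in which round(rate * len(traj)) points,
    chosen at random, are shifted by a uniform offset in each coordinate.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    count = round(rate * len(traj))
    chosen = set(rng.sample(range(len(traj)), count))
    noisy: List[Point] = []
    for i, (x, y) in enumerate(traj):
        if i in chosen:
            # Assumed noise model: independent uniform offsets per axis.
            x += rng.uniform(-max_offset, max_offset)
            y += rng.uniform(-max_offset, max_offset)
        noisy.append((x, y))
    return noisy
```

Passing a seeded `random.Random` instance makes each of the repeated runs reproducible while still letting the runs differ from one another through different seeds.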
In the experiment, λ varies in the range [0.1, 1] with a step of 0.1. The deviation degree is a random number, and the deviation degrees of most deviation points are greater than the maximum threshold ε in the LCSS and EDR. The other parameters are set to the same values as in Section 5.3. Figure 12a,b illustrates the experimental results with the Cross and Labomni datasets, respectively. The clustering error rates are roughly consistent with the results in Section 5.3. It can be seen that the LCSS, DTW, and SDTW exhibit better accuracy than EDR, even in the case of noisy data. On the other hand, as the amount of noisy data increases, the clustering error rates of all algorithms increase gradually. From Figure 12, the maximum error rate of the LCSS, DTW, and SDTW is below 0.15 when the noise ratio varies from 0.1 to 1.0, which indicates good robustness to noisy data for these three algorithms. However, EDR exhibits poor robustness as the amount of noisy data increases.

In the Cross dataset, the average change ratio of ER for the LCSS, EDR, DTW, and SDTW is 23.35%, 14.88%, 14.15%, and 13.64%, respectively. In the Labomni dataset, the corresponding values are 11.1%, 9.38%, 6.71%, and 4.21%, respectively. EDR and LCSS thus present poorer robustness than the DTW and SDTW. These results do not support the conclusion from [4] that the robustness of the LCSS and EDR algorithms is better than that of the DTW when the deviation degree is greater than ε; this finding is consistent with [23]. The SDTW algorithm exhibits the best robustness, which benefits from the point-segment distance: it decreases the effect of the sampling method on the accuracy and also improves the robustness to noise.

Parameter Effect Analysis
In the SDTW algorithm, ω is an important parameter that determines the weight of the shape factor in the similarity computation. The experiments evaluate the effect of the parameter ω varying from 0.3 to 20, as listed in Table 1. The experiments are based on the Labomni dataset, and two clustering methods, AHC and rbr, are used to compute the error rate. As shown in Table 1 and Figure 13, when the ω value is relatively small, the weight of the shape factor is quite large, which results in a high error rate. When the ω value is below 1.0, a small change in ω causes a great change in the error rate. When the ω value is larger than 1.0, its change has little effect on the results, which differ only slightly from the optimal results. The experimental results show that the SDTW algorithm can obtain the optimal results with an appropriate value of the parameter ω. Furthermore, the shape factors should be properly weighted; otherwise, improper weights may result in a poor error rate, as shown in Figure 13.

Conclusions
With the rapid development of sensor technology and the popularization of personal smart devices, GPS sensors are widely used to track moving objects. A trajectory similarity measure is one of the most important steps in the trajectory data mining of human activity and vehicle moving patterns. Unfortunately, the main similarity measure algorithms for trajectory data have been found to be inaccurate, highly sensitive to sampling methods, and not robust to noisy data. In order to solve these problems, a segment-based dynamic time warping algorithm (SDTW) is proposed to measure the trajectory similarity. First, the proposed SDTW adopts the point-segment distance to reduce the influence of the trajectory sampling method. Then, considering the temporal distance factor, SDTW introduces the prediction distance to convert the temporal distance into the spatial distance. Finally, SDTW introduces the segment-segment distance to improve the computation accuracy by adjusting the parameters of the shape factors. The experimental results indicate that the SDTW algorithm obtains about 57%, 86%, and 31% better accuracy than the LCSS, EDR, and DTW, respectively. Meanwhile, the SDTW algorithm exhibits better robustness to noise than the other algorithms.