MiPo: How to Detect Trajectory Outliers with Tabular Outlier Detectors
Abstract
1. Introduction
1.1. Detectors Using Mapping-Based Pre-Processing Methods
1.2. Detectors Using Sampling-Based Pre-Processing Methods
1.3. Detectors Using Partitioning-Based Pre-Processing Methods
1.4. The Remaining Trajectory Outlier Detectors
1.5. Limitations of Existing Outlier Detectors for Trajectory Data
1.6. Motivation and Contribution
2. Methods
2.1. Problem Statement
- (1) Processing the variable-length input;
- (2) Mitigating noise or outliers;
- (3) Addressing complicated road networks or real-time traffic.
- (4) Accuracy;
- (5) Complexity;
- (6) Robustness;
- (7) Generalizability;
- (8) Iterability;
- (9) Re-usability;
- (10) Compatibility;
- (11) Interpretability;
- (12) Scalability;
- (13) Reproducibility.
2.2. Middle Polar Coordinates (MiPo) for Tabular Feature Extraction from Trajectory Data
Algorithm 1: General idea of MiPo (raw idea only)
Input: Trajectory dataset G with source point S and destination point D
Output: Tabular data features F
Algorithm 2: MiPo (standard)
Input: Trajectory data G with source point S and destination point D
Output: Tabular data features F
Algorithm 3:
Input: Trajectory data G, , tabular outlier detector g
Output: Outlier scores O
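The algorithm bodies are not reproduced here, so the following is only a minimal sketch of the kind of mapping the inputs and outputs above describe: a variable-length trajectory with source S and destination D goes in, and a fixed-length feature row comes out, which any tabular detector g can then score. Everything specific in it is an assumption made for illustration: that M is the midpoint of S and D, that angles are measured at M relative to the direction from M to D, that the circle is split into k equal angular bins, and that each bin keeps the mean distance of its points to M. The names `mipo_features`, `score_trajectories`, and `median_distance_detector` are hypothetical, and coordinates are assumed to be planar (for example, projected metres).

```python
import math


def mipo_features(trajectory, source, destination, k=8):
    """Hypothetical MiPo-style mapping: any-length trajectory -> k numbers.

    Assumptions (not confirmed by the text): M is the midpoint of S and D,
    angles are measured at M relative to the direction M -> D, the full
    circle is split into k equal angular bins, and each bin keeps the mean
    distance of its points to M (0.0 for an empty bin).
    """
    mx = (source[0] + destination[0]) / 2.0
    my = (source[1] + destination[1]) / 2.0
    ref = math.atan2(destination[1] - my, destination[0] - mx)
    sums, counts = [0.0] * k, [0] * k
    for x, y in trajectory:
        r = math.hypot(x - mx, y - my)  # polar radius about M
        theta = (math.atan2(y - my, x - mx) - ref) % (2.0 * math.pi)
        b = min(int(theta / (2.0 * math.pi) * k), k - 1)  # angular bin index
        sums[b] += r
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def score_trajectories(trajectories, source, destination, detector, k=8):
    """Outline of Algorithm 3: extract features, then apply a detector g."""
    features = [mipo_features(t, source, destination, k) for t in trajectories]
    return detector(features)


def median_distance_detector(rows):
    """Stand-in for g: Euclidean distance to the column-wise median row."""
    center = [sorted(col)[len(col) // 2] for col in zip(*rows)]
    return [math.dist(row, center) for row in rows]
```

Each trajectory is visited once and independently of the others, so a mapping of this shape runs in time linear in the number of GPS points. The stand-in detector is deliberately trivial; any detector that accepts a list of equal-length rows could be passed instead.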
2.3. MiPo-Related Discussion
- (1) Processing the variable-length input: MiPo allows variable-length input;
- (2) Dealing with noise or outliers: MiPo is developed specifically for outlier detection;
- (3) Facing complicated road networks: This is verified by the experiments in Section 3;
- (4) Accuracy: This is verified by the experiments in Section 3;
- (5) Complexity: MiPo has linear time and space complexity;
- (6) Robustness: MiPo does not rely on additional parameters. MiPo is not sensitive to outliers or noise;
- (7) Generalizability: MiPo can be used for other data mining tasks as shown in Section 3.8;
- (8) Iterability: More features can be extracted from each bin defined by MiPo;
- (9) Re-usability: The feature is re-usable;
- (10) Compatibility: The feature extracted by MiPo can be jointly used with the feature extracted by other techniques;
- (11) Interpretability: The features extracted by MiPo are the distances to the point M, which is easy to understand. The outliers detected by LOF have significant density deviations from their neighbors;
- (12) Scalability: MiPo processes each trajectory without relying on other trajectories;
- (13) Reproducibility: MiPo is not a randomness-based technique.
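Item (11) relies on LOF [37] flagging rows whose local density deviates from that of their neighbors. As a rough illustration of that idea on tabular features, here is a brute-force sketch of the LOF score. It is not the implementation used in the experiments: it takes exactly k neighbors per row (ties at the k-distance are not expanded, unlike the original definition), handles duplicate rows only crudely, and costs quadratic time. The function name `lof_scores` is hypothetical.

```python
import math


def lof_scores(rows, k=3):
    """Brute-force LOF sketch on equal-length rows; needs more than k rows."""
    n = len(rows)
    dist = [[math.dist(a, b) for b in rows] for a in rows]

    # k nearest neighbors of every row, excluding the row itself.
    neighbors = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(others[:k])
    k_dist = [dist[i][neighbors[i][-1]] for i in range(n)]

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neighbors[i])
        lrd.append(k / reach if reach > 0 else math.inf)

    # LOF: mean ratio of the neighbors' density to the row's own density.
    scores = []
    for i in range(n):
        if math.isinf(lrd[i]):  # row sits on top of duplicates
            scores.append(1.0)
            continue
        scores.append(sum(lrd[j] / lrd[i] for j in neighbors[i]) / k)
    return scores
```

Scores near 1 indicate a row whose density matches its neighborhood; scores well above 1 indicate a row in a much sparser region than its neighbors, which is the density deviation the item refers to.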
3. Results
3.1. Datasets
- The source and destination pair: Considering the first two factors, datasets with three different source–destination pairs with the setting m for in Definition 2 were chosen, as shown in Figure 5. The first pair was from Oporto-Francisco de Sá Carneiro Airport to Mouzinho de Albuquerque Square (Ai2Sq), a distance of 9658 m, which was considered a long distance. For pre-processing, trajectories containing between 8 and 60 GPS points were selected, which covers most trajectories. The second pair was from São Bento Train Station to Dragão Stadium (St2St), a distance of 2878 m, which was considered an intermediate distance. For pre-processing, trajectories containing between 8 and 30 GPS points were selected, which covers most trajectories. The third pair was from the University of Porto to the Monument Church of St Francis (Un2Ch), a distance of 694 m, which was considered a short distance. For pre-processing, trajectories containing between 8 and 30 GPS points were selected, which covers most trajectories. To obtain these three raw datasets, trajectories from the source to the destination, together with reverse trajectories from the destination to the source, were selected. Then, all trajectories were sorted to ensure that they all have the same source and destination.
- Outlier types: Three experts in outlier detection manually labeled the trajectories in these three raw datasets as either organic outliers or normalities. The final label of each trajectory was decided by a majority vote of the three experts. Then, outliers were injected into these raw datasets with three different outlier generation models named detour, perturbation, and route-switching, as shown in Figure 6. These outlier generation models are also used by Zheng et al. [79], Liu et al. [25], Han et al. [24], Belhadi et al. [80], and Asma et al. [67]. The detour model created an outlier trajectory by shifting a sub-trajectory of a trajectory with a random direction and distance. The perturbation model created an outlier trajectory by moving a randomly selected point of a trajectory with a random direction and distance. The route-switching model created an outlier trajectory by connecting two sub-trajectories randomly selected from two different trajectories. Therefore, four types of outliers were considered in the experiment: one organic outlier type and three injected outlier types.
- Outlier ratio: Two levels of outlier ratio, 5% and 10%, were considered and controlled using random sampling.
- Ai2Sq: The trajectories labeled as normal by the experts were taken from the Ai2Sq raw dataset and defined as the organic normalities. Six test datasets were generated, as shown in Table 1 (No. 1∼6), by injecting two types of outliers:
- (1) Detour outlier: the shifting distance was set between 200 and 2000 m and the shifting direction was set to an angle between and with the moving direction of the trajectory. The sub-trajectory length was between and of the entire trajectory;
- (2) Perturb outlier: the moving distance was set between 0 and 200 m and the moving direction was set to an angle between 0 and . A total of of points in a trajectory were moved.
- St2St: The trajectories labeled as normal by the experts were taken from the St2St raw dataset and defined as the organic normalities. Four test datasets were generated, as shown in Table 1 (No. 7∼10), by considering two types of outliers:
- (1) Organic outliers: the trajectories labeled as outliers by the experts were taken from the St2St raw dataset;
- (2) Route-switching outliers: A sub-trajectory starting from the source point of a trajectory and having a length ratio between and was connected to another sub-trajectory ending at the destination point of another trajectory and having a length ratio between and .
- Un2Ch: The trajectories labeled as normal and as outliers by the experts were taken from the Un2Ch raw dataset and defined as the organic normalities and outliers, as shown in Table 1 (No. 11∼12).
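The three outlier generation models described above can be sketched as follows. This is a hedged illustration only, not the generator used to build Table 1. The distance ranges (200 to 2000 m for detour, 0 to 200 m for perturbation) come from the text; the length fractions, the fraction of moved points, and the use of a uniformly random direction are placeholder assumptions, because the corresponding values are not available here. Keeping the first and last points fixed is also an assumption. All function and parameter names are hypothetical, and coordinates are assumed to be planar metres.

```python
import math
import random


def _shift(point, distance, angle):
    """Move a planar point by `distance` in direction `angle` (radians)."""
    x, y = point
    return (x + distance * math.cos(angle), y + distance * math.sin(angle))


def detour(traj, rng, dist_range=(200.0, 2000.0), frac_range=(0.2, 0.5)):
    """Shift one contiguous interior sub-trajectory by a single random vector.

    frac_range is a placeholder; trajectories need at least 3 points.
    """
    n = len(traj)
    length = min(max(1, int(rng.uniform(*frac_range) * n)), n - 2)
    start = rng.randint(1, n - 1 - length)  # endpoints stay in place
    d = rng.uniform(*dist_range)
    a = rng.uniform(0.0, 2.0 * math.pi)
    out = list(traj)
    for i in range(start, start + length):
        out[i] = _shift(traj[i], d, a)
    return out


def perturb(traj, rng, dist_range=(0.0, 200.0), point_frac=0.1):
    """Move randomly chosen interior points, each by its own random vector.

    point_frac is a placeholder; trajectories need at least 3 points.
    """
    n = len(traj)
    count = min(max(1, int(point_frac * n)), n - 2)
    out = list(traj)
    for i in rng.sample(range(1, n - 1), count):
        out[i] = _shift(traj[i], rng.uniform(*dist_range),
                        rng.uniform(0.0, 2.0 * math.pi))
    return out


def route_switch(traj_a, traj_b, rng, head_frac=(0.3, 0.5), tail_frac=(0.3, 0.5)):
    """Join the head of one trajectory to the tail of another (placeholder ratios)."""
    head_len = max(1, int(rng.uniform(*head_frac) * len(traj_a)))
    tail_len = max(1, int(rng.uniform(*tail_frac) * len(traj_b)))
    return list(traj_a[:head_len]) + list(traj_b[-tail_len:])
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps the injected outliers repeatable across runs, which makes it easier to regenerate a test dataset with a fixed outlier ratio.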
3.2. Baselines and Evaluation
3.3. AUC Comparison with Baselines
3.4. MiPo with More Tabular Outlier Detectors
3.5. Effect of the Parameter k
3.6. Time Complexity
3.7. Potential Applications
3.8. Limitation
4. Discussion
- The time for pre-processing data;
- The storage for the pre-processed data;
- The time for outlier detectors’ training, storing, and updating;
- The time and memory for outlier detectors’ predictions.
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Method | Pre-Processing | Advantages | Disadvantages |
---|---|---|---|
TOP-EYE [17], 2010 | Mapping | Reduction in false positive rates and capturing the evolving nature of abnormal moving trajectories | Too sensitive to its own parameters |
iBAT [18], 2011 | Mapping | Linear time complexity | Poor accuracy for cluster outliers |
iBOAT [19], 2013 | Mapping | Real-time evaluation of outliers for active trajectories | Sensitive to noise or point outliers |
iBDD [20], 2015 | Mapping | Real-time detection of outliers | Limited to detection of disoriented behaviors of cognitively-impaired elders |
MT-MAD [21], 2016 | Mapping | Processing trajectories with uncertainty (i.e., trajectories not restricted by road network) | Being limited to the application of maritime vessels |
MCAT [22], 2021 | Mapping | Real-time detection of outliers | Additional parameter required: Time |
Ge et al. [23], 2011 | Mapping | Real-time detection of outliers | Limited to taxi fraud detection |
DeepTEA [24], 2022 | Mapping | Scalability | Limited to vehicles; additional parameter required: Time |
GM-VSAE [25], 2020 | Mapping | Scalability | Too sensitive to its own parameters |
TPRO [26], 2015 | Mapping | Detection of time-dependent outliers | Additional parameter required: Time; inability to detect non-time-dependent outliers |
ATDC [27], 2020 | Mapping | Analysis of different anomalous patterns | High time complexity |
RPat [29], 2013 | Mapping | Evaluation of real-time trajectories | Limited to video surveillance applications; additional parameter required: Traffic flow rate |
Detect [30], 2014 | Mapping | Detection of traffic-based outliers | Additional parameter required: Time |
MANTRA [31], 2016 | Mapping | Scalability | Additional parameter required: Time |
DB-TOD [32], 2017 | Mapping | Detection of outliers in an early stage | Limited to vehicles; additional parameter required: Road map |
RNPAT [33], 2022 | Mapping | Interpretability of the judgment for outliers | High time complexity; additional parameter required: Road map |
LoTAD [36], 2018 | Mapping | Interpretability of the judgment for outliers | High time complexity; additional parameter required: Road map |
Piciarelli et al. [38], 2008 | Sampling | Interpretability of the judgment for outliers | High time complexity |
Masciari [39], 2011 | Sampling | Interpretability of the judgment for outliers | High time complexity |
Maleki et al. [41], 2021 | Sampling | Robust performance in the presence of noise or outliers | Low scalability |
Oehling et al. [43], 2019 | Sampling | Interpretability of the judgment for outliers | High time complexity; limited to flight trajectories |
RapidLOF [45], 2019 | Sampling | Low time complexity; interpretability of the judgment for outliers | Low generality to other outlier types |
PN-Opt [48], 2014 | Sampling | Interpretability of the judgment for outliers | High time complexity |
Yu et al. [49], 2017 | Sampling | Interpretability of the judgment for outliers | High time complexity |
Ando et al. [50], 2015 | Sampling | Being relatively robust even in the presence of noise or outliers | High time complexity |
Maiorano et al. [51], 2016 | Sampling | Interpretability of the judgment for outliers | High time complexity; additional parameter required: Time |
STN-Outlier [53], 2018 | Sampling | Interpretability of the judgment for outliers | High time complexity |
TRAOD [54], 2008 | Partitioning | Interpretability of the judgment for outliers | Heavy reliance on expert knowledge to tune parameters |
Luan et al. [55], 2017 | Partitioning | Interpretability of the judgment for outliers | Heavy reliance on expert knowledge to tune parameters |
Pulshashi et al. [56], 2018 | Partitioning | Interpretability of the judgment for outliers | Heavy reliance on expert knowledge to tune parameters |
CTOD [57], 2009 | Partitioning | Recognition of new clusters to aid in identification of new roads | High time complexity |
RTOD [59], 2012 | Partitioning | Interpretability of the judgment for outliers | High time complexity |
TODS [60], 2017 | Partitioning | Detection of outliers regardless of distance to the clusters | High time complexity |
F-DBSCAN [61], 2018 | Partitioning | Interpretability of the judgment for outliers | High time complexity |
TODCSS [62], 2018 | Partitioning | Interpretability of the judgment for outliers | High time complexity |
CaD [63], 2019 | Partitioning | Interpretability of the judgment for outliers | Being limited to video surveillance |
TOD-SS [64], 2011 | Partitioning | Interpretability of the judgment for outliers | High time complexity |
TAD-FD [65], 2020 | Partitioning | Interpretability of the judgment for outliers | High time complexity |
STC [66], 2021 | Partitioning | Interpretability of the judgment for outliers | Limited to taxis; additional parameter required: Time |
TPBA [67], 2020 | Partitioning | Interpretability of the judgment for outliers | High time complexity |
TCATD [68], 2018 | - | Interpretability of the judgment for outliers | Being sensitive to noise or outliers |
OFF-ATPD [69], 2021 | - | Detection of traffic-based outliers | Limited to buses; additional parameters required: Time, velocity |
TODDT [70], 2020 | - | Interpretability of the judgment for outliers | Limited to ships; additional parameter required: Time |
MiPo (proposed), 2022 | - | See Section 2.3 | See Section 2.3 |
References
- Meng, F.; Yuan, G.; Lv, S.; Wang, Z.; Xia, S. An overview on trajectory outlier detection. Artif. Intell. Rev. 2019, 52, 2437–2456.
- Hawkins, D.M. Identification of Outliers; Springer: Dordrecht, The Netherlands, 1980; Volume 11.
- Yang, J.; Rahardja, S.; Rahardja, S. Click fraud detection: HK-index for feature extraction from variable-length time series of user behavior. In Proceedings of the Machine Learning for Signal Processing, Xi’an, China, 22–24 August 2022.
- Aggarwal, C.C. An introduction to outlier analysis. In Outlier Analysis; Springer: Dordrecht, The Netherlands, 2017; pp. 1–34.
- Alowayr, A.D.; Alsalooli, L.A.; Alshahrani, A.M.; Akaichi, J. A Review of Trajectory Data Mining Applications. In Proceedings of the 2021 International Conference of Women in Data Science at Taif University (WiDSTaif), Riyadh, Saudi Arabia, 30–31 March 2021; pp. 1–6.
- Cui, H.; Wu, L.; Hu, S.; Lu, R.; Wang, S. Recognition of urban functions and mixed use based on residents’ movement and topic generation model: The case of Wuhan, China. Remote Sens. 2020, 12, 2889.
- Qian, Z.; Liu, X.; Tao, F.; Zhou, T. Identification of urban functional areas by coupling satellite images and taxi GPS trajectories. Remote Sens. 2020, 12, 2449.
- Knorr, E.M.; Ng, R.T.; Tucakov, V. Distance-based outliers: Algorithms and applications. VLDB J. 2000, 8, 237–253.
- Porikli, F. Trajectory pattern detection by HMM parameter space features and eigenvector clustering. In Proceedings of the 8th European Conference on Computer Vision, Prague, Czech Republic, 11–14 May 2004.
- Stauffer, C.; Grimson, W.E.L. Learning patterns of activity using real-time tracking. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 747–757.
- D’Urso, P. Fuzzy clustering for data time arrays with inlier and outlier time trajectories. IEEE Trans. Fuzzy Syst. 2005, 13, 583–604.
- Piciarelli, C.; Foresti, G.L. On-line trajectory clustering for anomalous events detection. Pattern Recognit. Lett. 2006, 27, 1835–1842.
- Piciarelli, C.; Foresti, G.L. Anomalous trajectory detection using support vector machines. In Proceedings of the 2007 IEEE Conference on Advanced Video and Signal Based Surveillance, London, UK, 5–7 September 2007; pp. 153–158.
- Bradley, P.S.; Fayyad, U.M.; Reina, C.A. Scaling Clustering Algorithms to Large Databases; Microsoft Research Report; 1998. Available online: http://www.it.uu.se/edu/course/homepage/infoutv2/vt13/tr-98-37.pdf (accessed on 21 June 2022).
- Keogh, E.; Chakrabarti, K.; Pazzani, M.; Mehrotra, S. Locally adaptive dimensionality reduction for indexing large time series databases. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, Santa Barbara, CA, USA, 21–24 May 2001; pp. 151–162.
- Chu, S.; Keogh, E.; Hart, D.; Pazzani, M. Iterative deepening dynamic time warping for time series. In Proceedings of the 2002 SIAM International Conference on Data Mining, Arlington, VA, USA, 11–13 April 2002; pp. 195–212.
- Ge, Y.; Xiong, H.; Zhou, Z.h.; Ozdemir, H.; Yu, J.; Lee, K.C. Top-eye: Top-k evolving trajectory outlier detection. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, Toronto, Canada, 26–30 October 2010; pp. 1733–1736.
- Zhang, D.; Li, N.; Zhou, Z.H.; Chen, C.; Sun, L.; Li, S. iBAT: Detecting anomalous taxi trajectories from GPS traces. In Proceedings of the 13th International Conference on Ubiquitous Computing, Beijing, China, 17–21 September 2011; pp. 99–108.
- Chen, C.; Zhang, D.; Castro, P.S.; Li, N.; Sun, L.; Li, S.; Wang, Z. iBOAT: Isolation-based online anomalous trajectory detection. IEEE Trans. Intell. Transp. Syst. 2013, 14, 806–818.
- Lin, Q.; Zhang, D.; Connelly, K.; Ni, H.; Yu, Z.; Zhou, X. Disorientation detection by mining GPS trajectories for cognitively-impaired elders. Pervasive Mob. Comput. 2015, 19, 71–85.
- Lei, P.R. A framework for anomaly detection in maritime trajectory behavior. Knowl. Inf. Syst. 2016, 47, 189–214.
- Chen, D.; Du, Y.; Xu, S.; Sun, Y.E.; Huang, H.; Gao, G. Online Anomalous Taxi Trajectory Detection Based on Multidimensional Criteria. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–8.
- Ge, Y.; Xiong, H.; Liu, C.; Zhou, Z.H. A taxi driving fraud detection system. In Proceedings of the 2011 IEEE 11th International Conference on Data Mining, Vancouver, Canada, 11–14 December 2011; pp. 181–190.
- Han, X.; Cheng, R.; Ma, C.; Grubenmann, T. DeepTEA: Effective and efficient online time-dependent trajectory outlier detection. Proc. VLDB Endow. 2022, 15, 1493–1505.
- Liu, Y.; Zhao, K.; Cong, G.; Bao, Z. Online anomalous trajectory detection with deep generative sequence modeling. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 949–960.
- Zhu, J.; Jiang, W.; Liu, A.; Liu, G.; Zhao, L. Time-dependent popular routes based trajectory outlier detection. In Proceedings of the International Conference on Web Information Systems Engineering, Miami, FL, USA, 1–3 November 2015; pp. 16–30.
- Wang, J.; Yuan, Y.; Ni, T.; Ma, Y.; Liu, M.; Xu, G.; Shen, W. Anomalous trajectory detection and classification based on difference and intersection set distance. IEEE Trans. Veh. Technol. 2020, 69, 2487–2500.
- Lou, Y.; Zhang, C.; Zheng, Y.; Xie, X.; Wang, W.; Huang, Y. Map-matching for low-sampling-rate GPS trajectories. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Miami, FL, USA, 1–3 November 2009; pp. 352–361.
- Saleem, M.A.; Nawaz, W.; Lee, Y.K.; Lee, S. Road segment partitioning towards anomalous trajectory detection for surveillance applications. In Proceedings of the 2013 IEEE 14th International Conference on Information Reuse & Integration (IRI), San Francisco, CA, USA, 14–16 August 2013; pp. 610–617.
- Lan, J.; Long, C.; Wong, R.C.W.; Chen, Y.; Fu, Y.; Guo, D.; Liu, S.; Ge, Y.; Zhou, Y.; Li, J. A new framework for traffic anomaly detection. In Proceedings of the 2014 SIAM International Conference on Data Mining, Philadelphia, PA, USA, 24–26 April 2014; pp. 875–883.
- Banerjee, P.; Yawalkar, P.; Ranu, S. Mantra: A scalable approach to mining temporally anomalous sub-trajectories. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1415–1424.
- Wu, H.; Sun, W.; Zheng, B. A fast trajectory outlier detection approach via driving behavior modeling. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 837–846.
- Zhao, X.; Su, J.; Cai, J.; Yang, H.; Xi, T. Vehicle anomalous trajectory detection algorithm based on road network partition. Appl. Intell. 2022, 52, 8820–8838.
- Dijkstra, E.W. A note on two problems in connexion with graphs. Numer. Math. 1959, 1, 269–271.
- Qin, K.; Wang, Y.; Wang, B. Detecting anomalous trajectories using the Dempster-Shafer evidence theory considering trajectory features from taxi GNSS data. Information 2018, 9, 258.
- Kong, X.; Song, X.; Xia, F.; Guo, H.; Wang, J.; Tolba, A. LoTAD: Long-term traffic anomaly detection based on crowdsourced bus trajectory data. World Wide Web 2018, 21, 825–847.
- Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 16–18 May 2000; pp. 93–104.
- Piciarelli, C.; Micheloni, C.; Foresti, G.L. Trajectory-based anomalous event detection. IEEE Trans. Circuits Syst. Video Technol. 2008, 18, 1544–1554.
- Masciari, E. Trajectory outlier detection using an analytical approach. In Proceedings of the 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence, Boca Raton, FL, USA, 7–9 November 2011; pp. 377–384.
- Secker, A.; Taubman, D. Lifting-based invertible motion adaptive transform (LIMAT) framework for highly scalable video compression. IEEE Trans. Image Process. 2003, 12, 1530–1542.
- Maleki, S.; Maleki, S.; Jennings, N.R. Unsupervised anomaly detection with LSTM autoencoders using statistical data-filtering. Appl. Soft Comput. 2021, 108, 107443.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Oehling, J.; Barry, D.J. Using machine learning methods in airline flight data monitoring to generate new operational safety knowledge from existing data. Saf. Sci. 2019, 114, 89–104.
- Kriegel, H.P.; Kröger, P.; Schubert, E.; Zimek, A. LoOP: Local outlier probabilities. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, Hong Kong, China, 2–6 November 2009; pp. 1649–1652.
- Yang, J.; Mariescu-Istodor, R.; Fränti, P. Three rapid methods for averaging GPS segments. Appl. Sci. 2019, 9, 4899.
- Arasu, A.; Babu, S.; Widom, J. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 2006, 15, 121–142.
- Liu, H.; Taniguchi, T.; Tanaka, Y.; Takenaka, K.; Bando, T. Visualization of driving behavior based on hidden feature extraction by using deep learning. IEEE Trans. Intell. Transp. Syst. 2017, 18, 2477–2489.
- Yu, Y.; Cao, L.; Rundensteiner, E.A.; Wang, Q. Detecting moving object outliers in massive-scale trajectory streams. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 422–431.
- Yu, Y.; Cao, L.; Rundensteiner, E.A.; Wang, Q. Outlier detection over massive-scale trajectory streams. ACM Trans. Database Syst. (TODS) 2017, 42, 1–33.
- Ando, S.; Thanomphongphan, T.; Seki, Y.; Suzuki, E. Ensemble anomaly detection from multi-resolution trajectory features. Data Min. Knowl. Discov. 2015, 29, 39–83.
- Maiorano, F.; Petrosino, A. Granular trajectory based anomaly detection for surveillance. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancún, Mexico, 4–8 December 2016; pp. 2066–2072.
- Albanese, A.; Pal, S.K.; Petrosino, A. Rough sets, kernel set, and spatiotemporal outlier detection. IEEE Trans. Knowl. Data Eng. 2012, 26, 194–207.
- Zhu, Z.; Yao, D.; Huang, J.; Li, H.; Bi, J. Sub-trajectory-and trajectory-neighbor-based outlier detection over trajectory streams. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Melbourne, VIC, Australia, 3–6 June 2018; pp. 551–563.
- Lee, J.G.; Han, J.; Li, X. Trajectory outlier detection: A partition-and-detect framework. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, Cancún, Mexico, 7–12 April 2008; pp. 140–149.
- Luan, F.; Zhang, Y.; Cao, K.; Li, Q. Based local density trajectory outlier detection with partition-and-detect framework. In Proceedings of the 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Guilin, China, 29–31 July 2017; pp. 1708–1714.
- Pulshashi, I.R.; Bae, H.; Choi, H.; Mun, S. Smoothing of trajectory data recorded in harsh environments and detection of outlying trajectories. In Proceedings of the 7th International Conference on Emerging Databases, Melbourne, VIC, Australia, 3–6 June 2018; pp. 89–98.
- Ying, X.; Xu, Z.; Yin, W.G. Cluster-based congestion outlier detection method on trajectory data. In Proceedings of the 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery, Tianjin, China, 14–16 August 2009; Volume 5, pp. 243–247.
- Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, Oregon, 2–4 August 1996; Volume 96, pp. 226–231.
- Liu, L.; Qiao, S.; Zhang, Y.; Hu, J. An efficient outlying trajectories mining approach based on relative distance. Int. J. Geogr. Inf. Sci. 2012, 26, 1789–1810.
- Mao, J.; Wang, T.; Jin, C.; Zhou, A. Feature grouping-based outlier detection upon streaming trajectories. IEEE Trans. Knowl. Data Eng. 2017, 29, 2696–2709.
- Zhang, T.; Zhao, S.; Chen, J. Ship trajectory outlier detection service system based on collaborative computing. In Proceedings of the 2018 IEEE World Congress on Services (SERVICES), San Francisco, CA, USA, 2–7 July 2018; pp. 15–16.
- Yu, Q.; Luo, Y.; Chen, C.; Wang, X. Trajectory outlier detection approach based on common slices sub-sequence. Appl. Intell. 2018, 48, 2661–2680.
- San Román, I.; Martín de Diego, I.; Conde, C.; Cabello, E. Outlier trajectory detection through a context-aware distance. Pattern Anal. Appl. 2019, 22, 831–839.
- Yuan, G.; Xia, S.; Zhang, L.; Zhou, Y.; Ji, C. Trajectory outlier detection algorithm based on structural features. J. Comput. Inf. Syst. 2011, 7, 4137–4144.
- Wang, Z.; Yuan, G.; Pei, H.; Zhang, Y.; Liu, X. Unsupervised learning trajectory anomaly detection algorithm based on deep representation. Int. J. Distrib. Sens. Netw. 2020, 16, 1550147720971504.
- Kong, X.; Zhu, B.; Shen, G.; Workneh, T.C.; Ji, Z.; Chen, Y.; Liu, Z. Spatial-Temporal-Cost Combination Based Taxi Driving Fraud Detection for Collaborative Internet of Vehicles. IEEE Trans. Ind. Inform. 2021, 18, 3426–3436.
- Belhadi, A.; Djenouri, Y.; Srivastava, G.; Djenouri, D.; Cano, A.; Lin, J.C.W. A two-phase anomaly detection model for secure intelligent transportation ride-hailing trajectories. IEEE Trans. Intell. Transp. Syst. 2020, 22, 4496–4506.
- Wang, Y.; Qin, K.; Chen, Y.; Zhao, P. Detecting anomalous trajectories and behavior patterns using hierarchical clustering from taxi GPS data. ISPRS Int. J. Geo-Inf. 2018, 7, 25.
- Zhang, X.; Zheng, Y.; Zhao, Z.; Liu, Y.; Blumenstein, M.; Li, J. Deep learning detection of anomalous patterns from bus trajectories for traffic insight analysis. Knowl.-Based Syst. 2021, 217, 106833.
- Sun, S.; Chen, Y.; Zhang, J. Trajectory outlier detection algorithm for ship AIS data based on dynamic differential threshold. In Proceedings of the Journal of Physics: Conference Series, Xi’an, China, 18–19 October 2020; Volume 1437, pp. 12–13.
- Tan, X.; Yang, J.; Rahardja, S. Sparse random projection isolation forest for outlier detection. Pattern Recognit. Lett. 2022, 163, 65–73.
- Yang, J. Outlier Detection Techniques. Ph.D. Thesis, University of Eastern Finland, Kuopio, Finland, 2020.
- Yang, J.; Chen, Y.; Rahardja, S. Neighborhood Representative for Improving Outlier Detectors. arXiv 2022, arXiv:2010.12061.
- Yang, J.; Chen, Y.; Rahardja, S. Regional Ensemble for Improving Unsupervised Outlier Detectors. SSRN 2022. preprint.
- Yang, J.; Rahardja, S.; Fränti, P. Outlier detection: How to threshold outlier scores? In Proceedings of the International Conference on Artificial Intelligence, Information Processing and Cloud Computing, Sanya, China, 19–21 December 2019; pp. 1–6.
- Yang, J.; Rahardja, S.; Fränti, P. Mean-shift outlier detection and filtering. Pattern Recognit. 2021, 115, 107874.
- Yang, J.; Rahardja, S.; Fränti, P. Mean-Shift Outlier Detection. In Proceedings of the 4th International Conference on Fuzzy Systems and Data Mining, Bangkok, Thailand, 16–19 November 2018; pp. 208–215.
- Fränti, P.; Yang, J. Medoid-Shift for Noise Removal to Improve Clustering. In Proceedings of the International Conference on Artificial Intelligence and Soft Computing, Busan, Korea, 27–29 June 2018; Springer: Dordrecht, The Netherlands; pp. 604–614.
- Zheng, G.; Brantley, S.L.; Lauvaux, T.; Li, Z. Contextual spatial outlier detection with metric learning. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 2161–2170.
- Belhadi, A.; Djenouri, Y.; Lin, J.C.W.; Cano, A. Trajectory outlier detection: Algorithms, taxonomies, evaluation, and open challenges. ACM Trans. Manag. Inf. Syst. (TMIS) 2020, 11, 1–29.
- Hautamaki, V.; Karkkainen, I.; Franti, P. Outlier detection using k-nearest neighbour graph. In Proceedings of the 17th International Conference on Pattern Recognition (ICPR), Cambridge, UK, 26 August 2004; Volume 3, pp. 430–433.
- Li, X.; Lv, J.; Yi, Z. An efficient representation-based method for boundary point and outlier detection. IEEE Trans. Neural Netw. Learn. Syst. 2016, 29, 51–62.
- Ramaswamy, S.; Rastogi, R.; Shim, K. Efficient algorithms for mining outliers from large data sets. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 16–18 May 2000; pp. 427–438.
- Rousseeuw, P.J. Least median of squares regression. J. Am. Stat. Assoc. 1984, 79, 871–880.
- Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Washington, DC, USA, 15–19 December 2008; pp. 413–422.
- Schölkopf, B.; Platt, J.C.; Shawe-Taylor, J.; Smola, A.J.; Williamson, R.C. Estimating the support of a high-dimensional distribution. Neural Comput. 2001, 13, 1443–1471.
- Shyu, M.L.; Chen, S.C.; Sarinnapakorn, K.; Chang, L. A Novel Anomaly Detection Scheme Based on Principal Component Classifier; Technical Report; Miami Univ Coral Gables Fl Dept of Electrical and Computer Engineering: Coral Gables, FL, USA, 2003.
- Liu, Y.; Li, Z.; Zhou, C.; Jiang, Y.; Sun, J.; Wang, M.; He, X. Generative adversarial active learning for unsupervised outlier detection. IEEE Trans. Knowl. Data Eng. 2019, 32, 1517–1528.
- Burgess, C.P.; Higgins, I.; Pal, A.; Matthey, L.; Watters, N.; Desjardins, G.; Lerchner, A. Understanding disentangling in beta-VAE. arXiv 2018, arXiv:1804.03599.
- Li, Z.; Zhao, Y.; Botta, N.; Ionescu, C.; Hu, X. COPOD: Copula-based outlier detection. In Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Sorrento, Italy, 17–20 November 2020; pp. 1118–1123.
- Kriegel, H.P.; Schubert, M.; Zimek, A. Angle-based outlier detection in high-dimensional data. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, 24–27 August 2008; pp. 444–452.
- Almardeny, Y.; Boujnah, N.; Cleary, F. A novel outlier detection method for multivariate data. IEEE Trans. Knowl. Data Eng. 2020.
No. | Dataset | Size | Outliers (%) | Outlier Type | Trajectory Length (max, avg, min) |
---|---|---|---|---|---|
1 | AiSq5D | 1115 | 55 (5%) | Detour | 73, 49, 36 |
2 | AiSq10D | 1115 | 112 (10%) | Detour | 75, 50, 36 |
3 | AiSq5P | 1115 | 56 (5%) | Perturb | 61, 49, 36 |
4 | AiSq10P | 1115 | 109 (10%) | Perturb | 61, 49, 36 |
5 | AiSq5DP | 1115 | 56 (5%) | Detour + Perturb | 70, 49, 36 |
6 | AiSq10DP | 1115 | 112 (10%) | Detour + Perturb | 70, 49, 36 |
7 | StSt5M * | 192 | 10 (5%) | Manually labeled | 30, 27, 18 |
8 | StSt10M * | 202 | 20 (10%) | Manually labeled | 30, 27, 18 |
9 | StSt5R | 191 | 9 (5%) | Route-switching | 38, 27, 18 |
10 | StSt10R | 200 | 18 (10%) | Route-switching | 38, 27, 18 |
11 | UnCh5M * | 763 | 38 (5%) | Manually labeled | 30, 19, 8 |
12 | UnCh10M * | 806 | 81 (10%) | Manually labeled | 30, 19, 8 |
Dataset | MiPo+LOF | IBAT | IBOAT | RLOF | LSTM | ATDC | TPBA | ELSTM | RNPAT |
---|---|---|---|---|---|---|---|---|---|
Ref. | Proposed | [18] | [19] | [45] | [42] | [27] | [67] | [41] | [33] |
AiSq5D | 1.00 | 1.00 | 0.98 | 0.85 | 0.75 | 0.90 | 0.84 | 0.76 | - |
AiSq10D | 0.98 | 0.99 | 0.97 | 0.78 | 0.76 | 0.80 | 0.87 | 0.77 | - |
AiSq5P | 1.00 | 0.98 | 0.99 | 0.67 | 0.70 | 0.90 | 0.87 | 0.70 | - |
AiSq10P | 1.00 | 0.98 | 0.98 | 0.69 | 0.72 | 0.82 | 0.87 | 0.72 | - |
AiSq5DP | 1.00 | 0.98 | 0.98 | 0.80 | 0.71 | 0.92 | 0.80 | 0.71 | - |
AiSq10DP | 0.99 | 0.99 | 0.97 | 0.68 | 0.73 | 0.83 | 0.81 | 0.73 | - |
StSt5R | 0.99 | 0.90 | 0.80 | 0.79 | 0.56 | 0.56 | 0.61 | 0.56 | - |
StSt10R | 0.99 | 0.92 | 0.70 | 0.61 | 0.58 | 0.59 | 0.57 | 0.58 | - |
StSt5M | 1.00 | 1.00 | 0.85 | 0.87 | 0.71 | 0.43 | 0.87 | 0.71 | 0.62 |
StSt10M | 0.99 | 0.99 | 0.74 | 0.82 | 0.62 | 0.24 | 0.87 | 0.62 | 0.60 |
UnCh5M | 0.97 | 0.85 | 0.69 | 0.95 | 0.85 | 0.84 | 0.92 | 0.85 | 0.77 |
UnCh10M | 0.98 | 0.92 | 0.73 | 0.85 | 0.85 | 0.67 | 0.89 | 0.85 | 0.81 |
AVG | 0.99 | 0.96 | 0.87 | 0.78 | 0.71 | 0.71 | 0.82 | 0.71 | 0.70 |
STD | 0.01 | 0.05 | 0.12 | 0.10 | 0.09 | 0.21 | 0.11 | 0.09 | 0.10 |
MEDIAN | 0.99 | 0.98 | 0.91 | 0.79 | 0.72 | 0.81 | 0.87 | 0.72 | 0.70 |
Dataset | DOD+ | MOD+ | LOF | ODIN | NC | KNN | MCD | IForest | SVM | PCA | SOGA | BVAE | COP | ABOD | ROD |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Ref. | [76] | [76] | [37] | [81] | [82] | [83] | [84] | [85] | [86] | [87] | [88] | [89] | [90] | [91] | [92] |
AiSq5D | 0.97 | 0.98 | 0.99 | 0.81 | 0.60 | 1.00 | 0.96 | 0.95 | 0.94 | 0.96 | 0.26 | 0.97 | 0.93 | 1.00 | 0.91 |
AiSq10D | 0.96 | 0.97 | 0.98 | 0.80 | 0.55 | 0.99 | 0.92 | 0.93 | 0.86 | 0.89 | 0.30 | 0.84 | 0.89 | 0.99 | 0.88 |
AiSq5DP | 0.82 | 0.85 | 0.98 | 0.96 | 0.65 | 0.95 | 0.93 | 0.92 | 0.72 | 0.79 | 0.48 | 0.78 | 0.88 | 0.98 | 0.87 |
AiSq10DP | 0.86 | 0.89 | 0.97 | 0.90 | 0.52 | 0.95 | 0.89 | 0.89 | 0.71 | 0.80 | 0.33 | 0.77 | 0.86 | 0.98 | 0.88 |
AiSq5P | 0.76 | 0.77 | 0.97 | 0.99 | 0.67 | 0.93 | 0.94 | 0.93 | 0.63 | 0.89 | 0.35 | 0.87 | 0.90 | 0.98 | 0.90 |
AiSq10P | 0.75 | 0.77 | 0.97 | 0.97 | 0.66 | 0.92 | 0.93 | 0.92 | 0.63 | 0.86 | 0.44 | 0.83 | 0.90 | 0.98 | 0.89 |
StSt5R | 0.89 | 0.89 | 0.77 | 0.89 | 0.70 | 0.89 | 0.90 | 0.92 | 0.35 | 0.45 | 0.44 | 0.63 | 0.60 | 0.97 | 0.53 |
StSt10R | 0.87 | 0.86 | 0.85 | 0.88 | 0.52 | 0.88 | 0.87 | 0.89 | 0.29 | 0.36 | 0.36 | 0.66 | 0.43 | 0.92 | 0.46 |
StSt5M | 0.88 | 0.88 | 0.86 | 0.97 | 0.71 | 0.96 | 0.89 | 0.99 | 0.47 | 0.52 | 0.28 | 0.63 | 0.64 | 1.00 | 0.61 |
StSt10M | 0.88 | 0.88 | 0.82 | 0.87 | 0.62 | 0.93 | 0.92 | 0.97 | 0.58 | 0.62 | 0.42 | 0.72 | 0.70 | 0.96 | 0.61 |
UnCh5M | 0.95 | 0.96 | 0.94 | 0.66 | 0.49 | 0.96 | 0.95 | 0.95 | 0.95 | 0.94 | 0.12 | 0.94 | 0.95 | 0.94 | 0.93 |
UnCh10M | 0.95 | 0.95 | 0.80 | 0.64 | 0.53 | 0.96 | 0.97 | 0.97 | 0.94 | 0.96 | 0.10 | 0.93 | 0.96 | 0.96 | 0.93 |
AVG | 0.88 | 0.89 | 0.91 | 0.86 | 0.60 | 0.94 | 0.92 | 0.94 | 0.67 | 0.75 | 0.32 | 0.80 | 0.80 | 0.97 | 0.78 |
STD | 0.07 | 0.07 | 0.08 | 0.12 | 0.08 | 0.04 | 0.03 | 0.03 | 0.23 | 0.21 | 0.12 | 0.12 | 0.17 | 0.02 | 0.18 |
MEDIAN | 0.88 | 0.88 | 0.95 | 0.89 | 0.61 | 0.95 | 0.93 | 0.93 | 0.67 | 0.83 | 0.34 | 0.81 | 0.89 | 0.98 | 0.88 |
(n, m) | MiPo- | MiPo | MiPo+LOF | IBAT | IBOAT | RapidLOF | LSTM | ATDC | TPBA | ELSTMAE | RNPAT |
---|---|---|---|---|---|---|---|---|---|---|---|
Ref. | Proposed | Proposed | Proposed | [18] | [19] | [45] | [42] | [27] | [67] | [41] | [33] |
(100, 30) | <0.01 | 0.15 | 0.15 | 0.79 | 0.58 | 0.01 | 3.54 | 0.06 | 1.12 | 3.02 | 447.15 |
(100, 60) | 0.01 | 0.25 | 0.25 | 0.86 | 0.76 | 0.02 | 11.09 | 0.09 | 1.32 | 9.66 | 658.08 |
(100, 90) | 0.01 | 0.34 | 0.34 | 0.93 | 0.84 | 0.04 | 18.04 | 0.15 | 1.28 | 17.57 | 800.2 |
(1000, 30) | 0.04 | 1.07 | 1.09 | 16.39 | 28.98 | 0.13 | 40.07 | 3.57 | 105.61 | 37.26 | 2615.72 |
(1000, 60) | 0.07 | 1.94 | 1.96 | 17.04 | 34.44 | 0.24 | 86.71 | 4.05 | 114.95 | 73.64 | 4062 |
(1000, 90) | 0.09 | 2.86 | 2.88 | 17.58 | 39.24 | 0.37 | 158.25 | 4.91 | 119.04 | 127.03 | 5690.37 |
(10,000, 30) | 0.36 | 9.52 | 10.89 | 210.02 | 315.43 | 2.06 | 389.04 | 319.34 | 10,616.66 | 346.19 | 30,834.41 |
(10,000, 60) | 0.59 | 18.15 | 19.42 | 220.48 | 376.96 | 3.25 | 720.84 | 343.72 | 10,576.89 | 719.02 | 51,515.42 |
(10,000, 90) | 0.79 | 26.69 | 27.97 | 239.64 | 423.23 | 4.58 | 1218.9 | 377.93 | 10,897.32 | 1326.29 | 80,572.54 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, J.; Tan, X.; Rahardja, S. MiPo: How to Detect Trajectory Outliers with Tabular Outlier Detectors. Remote Sens. 2022, 14, 5394. https://doi.org/10.3390/rs14215394