Chain-Based Outlier Detection: Interpretable Theories and Methods for Complex Data Scenarios
Abstract
1. Introduction
2. Related Work
2.1. Nearest Neighbor-Based Methods
2.2. Clustering-Based Methods
2.3. Statistics-Based Methods
3. The Proposed Methods
- Step 1. Local error bound for the k-distance: For a pair of data points p and q within a neighborhood, the arc-chord estimation inequality of Riemannian geometry gives: Taking , , and denoting , we have: Therefore, the relative error of the k-distance satisfies:
- Step 2. Local error bound for the chain distance: The chain distance decomposes a long distance into multiple short segments, reaching the same endpoint through stepwise nearest-neighbor linking, so the distance between adjacent points and in the chain is sufficiently small. Taking , , the error of a local link set on the chain path is: where . The total error of the chain distance is: where . Therefore, the relative error of the chain distance is:
- Step 3. Comparison of the two methods: When there exists at least one segment satisfying , the upper bound on the relative error of the chain method is strictly smaller than that of the k-nearest-neighbor method:
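The comparison in Step 3 rests on a geometric fact: on a curved manifold the straight-line (chord) distance underestimates the geodesic distance, and the relative error shrinks roughly quadratically with segment length, so a chain of short hops loses far less accuracy than one long hop. A minimal numerical sketch on the unit circle (an illustration of ours, not the paper's formal bound):

```python
import math

def chord(theta):
    """Euclidean (chord) distance between two points on the unit circle
    separated by arc length theta."""
    return 2.0 * math.sin(theta / 2.0)

# Geodesic (arc) distance to approximate: a long arc of 1.2 radians.
arc = 1.2

# One-shot estimate: a single chord across the whole arc.
direct = chord(arc)

# Chain estimate: split the arc into m short hops and sum the short chords,
# mimicking stepwise nearest-neighbor linking.
m = 12
chained = m * chord(arc / m)

rel_err_direct = (arc - direct) / arc   # relative error of the long chord
rel_err_chain = (arc - chained) / arc   # relative error of the chained sum
```

Here the single chord is off by roughly 6%, while twelve chained hops are off by well under 0.1%: since the chord error of a segment of arc length s behaves like s²/24, halving the hop length cuts the relative error by about a factor of four.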
3.1. Cascaded Chain Outlier Detection
Algorithm 1: Outlier detection by cascaded chaining
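As a rough illustration of the cascaded idea, the sketch below grows a single greedy nearest-neighbor chain over the whole data set and scores each point by the length of the hop at which the chain first reaches it; the function name and scoring rule are our assumptions, not a reproduction of Algorithm 1:

```python
import math

def cascaded_chain_scores(points):
    """Grow one greedy nearest-neighbor chain over the data set and score
    each point by the length of the hop that first reaches it.
    Points only reachable via a long hop are likely outliers."""
    unvisited = set(range(len(points)))
    current = 0                      # start the chain at an arbitrary point
    unvisited.discard(current)
    scores = {current: 0.0}
    while unvisited:
        # next link: the nearest unvisited point to the current chain end
        nxt = min(unvisited, key=lambda j: math.dist(points[current], points[j]))
        scores[nxt] = math.dist(points[current], points[nxt])
        unvisited.discard(nxt)
        current = nxt                # cascade: the chain end moves forward
    return [scores[i] for i in range(len(points))]
```

On a tight cluster plus one distant point, the distant point can only be absorbed via a long final hop, so it receives by far the largest score.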
3.2. Parallel Chain Outlier Detection
Algorithm 2: Outlier detection by parallel chaining
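Chains grown from different starting points are mutually independent, which is what makes a parallel variant natural. A minimal sketch of this idea (our own construction, not the paper's Algorithm 2): grow a short greedy chain from every point concurrently and use the chain's total link length as that point's outlier score:

```python
import math
from concurrent.futures import ThreadPoolExecutor

def chain_length(points, start, hops=3):
    """Total length of a short greedy nearest-neighbor chain grown from
    `start`. In dense regions all hops are short; a chain started at an
    outlier must make at least one long hop to reach its neighbors."""
    unvisited = set(range(len(points))) - {start}
    current, total = start, 0.0
    for _ in range(min(hops, len(unvisited))):
        nxt = min(unvisited, key=lambda j: math.dist(points[current], points[j]))
        total += math.dist(points[current], points[nxt])
        unvisited.discard(nxt)
        current = nxt
    return total

def parallel_chain_scores(points, hops=3, workers=4):
    """Chains from different start points are independent, so they can be
    built concurrently; the per-start chain length is the outlier score."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: chain_length(points, s, hops),
                             range(len(points))))
```

Because each start point's chain touches only read-only data, the work partitions cleanly across workers with no coordination beyond the final gather.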
4. Results
4.1. Detection Performance on Synthetic Data Sets
4.2. Detection Performance on Real-Life Data Sets
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References






| Data Set | Contamination | Number of Instances | Number of Attributes | Number of Outliers |
|---|---|---|---|---|
| Example_l | 0.18% | 1100 | 2 | 2 |
| Example_c | 4.00% | 200 | 2 | 8 |
| Example_n | 0.18% | 1100 | 2 | 2 |
| Groups, Regressions, and Moons | 3.00% | 206 | 2 | 6 |
| | 5.00% | 210 | 2 | 10 |
| | 7.00% | 214 | 2 | 14 |

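For the three Example sets, the contamination column is simply the outlier count divided by the instance count; a quick check reproduces the listed percentages:

```python
# (number of outliers, number of instances) per data set, from the table above
rows = {
    "Example_l": (2, 1100),
    "Example_c": (8, 200),
    "Example_n": (2, 1100),
}

# contamination as a percentage, e.g. Example_c: 8 / 200 = 4.00%
contamination = {name: 100.0 * k / n for name, (k, n) in rows.items()}
```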
| Data Set | LOF | COF | LoOP | LOCI | NOF | RDOS | ECOD | CCOD | PCOD |
|---|---|---|---|---|---|---|---|---|---|
| Example_l | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.810 | 1.000 | 1.000 |
| Example_c | 1.000 | 0.998 | 0.992 | 1.000 | 1.000 | 1.000 | 0.998 | 1.000 | 1.000 |
| Example_n | 1.000 | 1.000 | 1.000 | 0.994 | 0.988 | 1.000 | 0.606 | 1.000 | 1.000 |

| Data Set | LOF | COF | LoOP | LOCI | NOF | RDOS | ECOD | CCOD | PCOD |
|---|---|---|---|---|---|---|---|---|---|
| Example_l | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 |
| Example_c | 0.998 | 0.875 | 0.625 | 1.000 | 1.000 | 1.000 | 0.875 | 1.000 | 1.000 |
| Example_n | 1.000 | 1.000 | 1.000 | 0.500 | 0.000 | 1.000 | 0.500 | 1.000 | 1.000 |

| Data Set | LOF | COF | LoOP | LOCI | NOF | RDOS | ECOD | CCOD | PCOD |
|---|---|---|---|---|---|---|---|---|---|
| Groups_e | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.722 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.762 | 1.000 | 1.000 |
| | 0.978 | 0.929 | 0.948 | 0.974 | 0.956 | 0.986 | 0.781 | 0.964 | 0.994 |
| Groups_u | 1.000 | 1.000 | 1.000 | 1.000 | 0.994 | 1.000 | 0.782 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 | 0.827 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 0.994 | 1.000 | 0.840 | 1.000 | 1.000 |
| Groups_g | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.830 | 1.000 | 1.000 |
| | 0.980 | 1.000 | 0.998 | 1.000 | 0.997 | 1.000 | 0.871 | 0.982 | 0.987 |
| | 0.929 | 0.996 | 0.995 | 0.998 | 0.989 | 1.000 | 0.806 | 0.918 | 0.897 |
| Regressions_e | 0.999 | 1.000 | 0.999 | 0.967 | 0.906 | 0.999 | 0.949 | 0.985 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.964 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.950 | 1.000 | 1.000 |
| Regressions_u | 0.998 | 1.000 | 0.998 | 0.944 | 0.827 | 0.998 | 0.951 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.949 | 1.000 | 1.000 |
| | 0.992 | 1.000 | 0.999 | 0.977 | 0.934 | 0.999 | 0.917 | 0.985 | 1.000 |
| Regressions_g | 0.990 | 0.997 | 0.999 | 0.998 | 0.992 | 0.999 | 0.971 | 0.958 | 0.968 |
| | 0.989 | 0.998 | 0.999 | 0.999 | 0.995 | 1.000 | 0.960 | 0.989 | 0.981 |
| | 0.936 | 0.998 | 0.998 | 0.999 | 0.965 | 0.999 | 0.939 | 0.953 | 0.886 |
| Moons_e | 1.000 | 1.000 | 1.000 | 0.977 | 0.965 | 1.000 | 0.673 | 1.000 | 1.000 |
| | 1.000 | 0.990 | 1.000 | 0.988 | 0.975 | 1.000 | 0.768 | 1.000 | 1.000 |
| | 1.000 | 0.950 | 0.999 | 0.991 | 0.924 | 0.949 | 0.753 | 1.000 | 0.919 |
| Moons_u | 1.000 | 1.000 | 1.000 | 1.000 | 0.998 | 1.000 | 0.645 | 1.000 | 1.000 |
| | 0.999 | 0.999 | 1.000 | 0.996 | 0.984 | 1.000 | 0.737 | 1.000 | 1.000 |
| | 0.998 | 0.998 | 0.999 | 0.997 | 0.978 | 0.994 | 0.675 | 0.964 | 0.980 |
| Moons_g | 0.899 | 1.000 | 1.000 | 0.982 | 0.879 | 0.999 | 0.693 | 1.000 | 0.999 |
| | 0.895 | 0.999 | 0.999 | 0.989 | 0.899 | 0.999 | 0.805 | 0.999 | 0.740 |
| | 0.939 | 0.999 | 0.999 | 0.994 | 0.949 | 0.998 | 0.765 | 0.980 | 0.699 |


| Data Set | LOF | COF | LoOP | LOCI | NOF | RDOS | ECOD | CCOD | PCOD |
|---|---|---|---|---|---|---|---|---|---|
| Groups_e | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.500 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.600 | 1.000 | 1.000 |
| | 0.929 | 0.929 | 0.929 | 0.929 | 0.929 | 0.929 | 0.571 | 1.000 | 1.000 |
| Groups_u | 1.000 | 1.000 | 1.000 | 1.000 | 0.833 | 1.000 | 0.333 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 0.900 | 1.000 | 0.500 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 0.929 | 1.000 | 0.500 | 1.000 | 1.000 |
| Groups_g | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.667 | 1.000 | 1.000 |
| | 0.955 | 0.989 | 0.900 | 1.000 | 0.967 | 1.000 | 0.500 | 0.893 | 0.980 |
| | 0.903 | 0.905 | 0.881 | 0.929 | 0.881 | 0.976 | 0.571 | 0.785 | 0.818 |
| Regressions_e | 0.800 | 1.000 | 0.900 | 0.900 | 0.800 | 0.900 | 0.667 | 0.973 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.800 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.643 | 1.000 | 1.000 |
| Regressions_u | 0.833 | 1.000 | 0.833 | 0.833 | 0.667 | 0.833 | 0.333 | 1.000 | 1.000 |
| | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.300 | 1.000 | 1.000 |
| | 0.857 | 1.000 | 0.929 | 0.929 | 0.857 | 0.929 | 0.429 | 0.791 | 1.000 |
| Regressions_g | 0.889 | 0.889 | 0.889 | 0.944 | 0.889 | 0.944 | 0.333 | 0.894 | 0.792 |
| | 0.967 | 0.933 | 0.967 | 0.967 | 0.900 | 0.967 | 0.500 | 0.967 | 0.917 |
| | 0.926 | 0.929 | 0.929 | 0.953 | 0.857 | 0.905 | 0.500 | 0.944 | 0.864 |
| Moons_e | 1.000 | 1.000 | 1.000 | 0.833 | 0.667 | 1.000 | 0.333 | 1.000 | 1.000 |
| | 1.000 | 0.867 | 1.000 | 0.900 | 0.700 | 1.000 | 0.600 | 1.000 | 1.000 |
| | 1.000 | 0.929 | 0.929 | 0.929 | 0.714 | 0.929 | 0.500 | 1.000 | 0.750 |
| Moons_u | 1.000 | 1.000 | 1.000 | 1.000 | 0.833 | 1.000 | 0.167 | 1.000 | 1.000 |
| | 0.900 | 0.900 | 1.000 | 0.900 | 0.800 | 0.900 | 0.400 | 1.000 | 1.000 |
| | 0.929 | 0.857 | 0.857 | 0.929 | 0.714 | 0.857 | 0.286 | 1.000 | 0.636 |
| Moons_g | 0.744 | 0.944 | 0.944 | 0.944 | 0.722 | 0.889 | 0.333 | 1.000 | 0.944 |
| | 0.833 | 0.933 | 0.900 | 0.933 | 0.733 | 0.900 | 0.400 | 0.933 | 0.740 |
| | 0.640 | 0.953 | 0.929 | 0.929 | 0.786 | 0.929 | 0.500 | 0.833 | 0.729 |

| Data Set | Contamination | Number of Instances | Number of Outliers | Original Dim. | Reduced Dim. |
|---|---|---|---|---|---|
| Annthyroid | 7.42% | 7200 | 534 | 6 | 3 |
| BCancer | 2.46% | 366 | 9 | 30 | 3 |
| Cardiot | 29.73% | 2126 | 632 | 22 | 5 |
| Glass | 4.21% | 214 | 9 | 9 | 5 |
| HeartDis | 45.87% | 303 | 139 | 14 | 3 |
| Hepatitis | 20.65% | 155 | 32 | 20 | 3 |
| Ionosphere | 35.90% | 351 | 126 | 33 | 24 |
| Letter | 6.25% | 1599 | 100 | 32 | 19 |
| Musk | 3.17% | 3062 | 97 | 166 | 27 |
| Optdigits | 2.88% | 5216 | 150 | 64 | 29 |
| Page_blocks | 10.23% | 5473 | 560 | 11 | 3 |
| Pendigits | 2.27% | 6870 | 156 | 16 | 9 |
| Pen-global | 11.14% | 808 | 90 | 16 | 9 |
| Pen-local | 0.15% | 6723 | 10 | 16 | 9 |
| Satellite | 1.45% | 5099 | 74 | 36 | 6 |
| Shuttle | 1.89% | 46,463 | 878 | 9 | 3 |
| Spambase | 39.40% | 4601 | 1813 | 58 | 3 |
| Vowels | 3.43% | 1456 | 50 | 12 | 8 |
| Waveform | 2.90% | 3443 | 100 | 22 | 19 |
| Wine | 7.75% | 129 | 10 | 13 | 3 |

| Data Set | LOF | COF | LoOP | LOCI | NOF | RDOS | ECOD | CCOD | PCOD |
|---|---|---|---|---|---|---|---|---|---|
| Annthyroid | 0.62(4) | 0.63(3) | 0.62(4) | 0.61(7) | 0.62(4) | 0.60(8) | 0.60(8) | 0.64(2) | 0.66(1) |
| BCancer | 0.83(1) | 0.77(5) | 0.77(5) | 0.77(5) | 0.60(9) | 0.79(3) | 0.66(8) | 0.79(3) | 0.83(1) |
| Cardiot | 0.60(3) | 0.58(6) | 0.61(1) | 0.60(3) | 0.50(8) | 0.53(7) | 0.50(8) | 0.59(5) | 0.61(1) |
| Glass | 0.70(1) | 0.68(3) | 0.66(6) | 0.68(3) | 0.60(8) | 0.63(7) | 0.58(9) | 0.67(5) | 0.69(2) |
| HeartDis | 0.62(5) | 0.64(2) | 0.62(5) | 0.63(3) | 0.51(9) | 0.55(8) | 0.56(7) | 0.63(3) | 0.65(1) |
| Hepatitis | 0.61(4) | 0.64(1) | 0.60(6) | 0.61(4) | 0.52(8) | 0.55(7) | 0.50(9) | 0.62(3) | 0.64(1) |
| Ionosphere | 0.68(4) | 0.68(4) | 0.68(4) | 0.69(3) | 0.60(9) | 0.65(7) | 0.61(8) | 0.71(2) | 0.72(1) |
| Letter | 0.70(2) | 0.70(2) | 0.72(1) | 0.65(6) | 0.48(9) | 0.57(8) | 0.58(7) | 0.70(2) | 0.70(2) |
| Musk | 0.60(6) | 0.63(4) | 0.65(3) | 0.61(5) | 0.55(7) | 0.52(9) | 0.53(8) | 0.74(2) | 0.76(1) |
| Optdigits | 0.58(5) | 0.60(3) | 0.59(4) | 0.58(5) | 0.53(7) | 0.50(8) | 0.50(8) | 0.67(2) | 0.69(1) |
| Page_blocks | 0.68(1) | 0.65(5) | 0.66(3) | 0.64(6) | 0.60(7) | 0.59(8) | 0.58(9) | 0.66(3) | 0.67(2) |
| Pen-global | 0.76(4) | 0.82(2) | 0.71(6) | 0.58(8) | 0.57(9) | 0.70(7) | 0.76(4) | 0.82(2) | 0.84(1) |
| Pen-local | 0.56(5) | 0.65(3) | 0.55(6) | 0.51(8) | 0.51(8) | 0.58(4) | 0.55(6) | 0.73(1) | 0.71(2) |
| Pendigits | 0.60(5) | 0.58(6) | 0.62(4) | 0.63(3) | 0.55(8) | 0.56(7) | 0.50(9) | 0.65(2) | 0.67(1) |
| Satellite | 0.78(3) | 0.77(4) | 0.72(6) | 0.62(9) | 0.75(5) | 0.69(7) | 0.68(8) | 0.79(2) | 0.84(1) |
| Shuttle | 0.62(4) | 0.62(4) | 0.63(2) | 0.61(6) | 0.57(8) | 0.55(9) | 0.60(7) | 0.63(2) | 0.65(1) |
| Spambase | 0.58(4) | 0.58(4) | 0.56(6) | 0.55(7) | 0.50(9) | 0.52(8) | 0.61(3) | 0.65(2) | 0.67(1) |
| Vowels | 0.62(6) | 0.68(1) | 0.64(4) | 0.63(5) | 0.58(8) | 0.59(7) | 0.54(9) | 0.66(3) | 0.67(2) |
| Waveform | 0.64(3) | 0.65(2) | 0.60(6) | 0.61(5) | 0.55(7) | 0.55(7) | 0.54(9) | 0.64(3) | 0.66(1) |
| Wine | 0.80(6) | 0.80(6) | 0.82(3) | 0.81(4) | 0.78(8) | 0.84(1) | 0.60(9) | 0.81(4) | 0.84(1) |
| Avg. | 0.66(4) | 0.67(3) | 0.65(5) | 0.63(6) | 0.57(9) | 0.60(7) | 0.58(8) | 0.69(2) | 0.71(1) |

| Data Set | LOF | COF | LoOP | LOCI | NOF | RDOS | ECOD | CCOD | PCOD |
|---|---|---|---|---|---|---|---|---|---|
| Annthyroid | 0.28(4) | 0.30(3) | 0.28(4) | 0.25(7) | 0.28(4) | 0.24(8) | 0.23(9) | 0.35(1) | 0.35(1) |
| BCancer | 0.67(1) | 0.56(4) | 0.56(4) | 0.56(4) | 0.22(9) | 0.44(7) | 0.33(8) | 0.67(1) | 0.67(1) |
| Cardiot | 0.41(3) | 0.40(6) | 0.42(1) | 0.41(3) | 0.36(8) | 0.38(7) | 0.36(8) | 0.41(3) | 0.42(1) |
| Glass | 0.36(1) | 0.35(3) | 0.34(6) | 0.35(3) | 0.31(8) | 0.33(7) | 0.30(9) | 0.35(3) | 0.36(1) |
| HeartDis | 0.46(3) | 0.46(3) | 0.44(5) | 0.33(8) | 0.26(9) | 0.37(6) | 0.36(7) | 0.53(1) | 0.50(2) |
| Hepatitis | 0.46(3) | 0.46(3) | 0.43(5) | 0.32(8) | 0.27(9) | 0.37(6) | 0.33(7) | 0.52(1) | 0.49(2) |
| Ionosphere | 0.50(3) | 0.48(4) | 0.47(5) | 0.35(8) | 0.29(9) | 0.42(6) | 0.38(7) | 0.58(1) | 0.54(2) |
| Letter | 0.44(2) | 0.44(2) | 0.47(1) | 0.34(6) | 0.02(9) | 0.27(7) | 0.21(8) | 0.44(2) | 0.44(2) |
| Musk | 0.31(6) | 0.33(4) | 0.34(3) | 0.32(5) | 0.29(7) | 0.27(9) | 0.28(8) | 0.38(2) | 0.39(1) |
| Optdigits | 0.40(5) | 0.41(3) | 0.41(3) | 0.40(5) | 0.38(7) | 0.36(8) | 0.36(8) | 0.45(2) | 0.46(1) |
| Page_blocks | 0.35(1) | 0.34(3) | 0.34(3) | 0.33(6) | 0.31(7) | 0.31(7) | 0.30(9) | 0.34(3) | 0.35(1) |
| Pen-global | 0.57(5) | 0.68(2) | 0.49(6) | 0.24(8) | 0.23(9) | 0.49(6) | 0.58(4) | 0.66(3) | 0.70(1) |
| Pen-local | 0.11(5) | 0.22(4) | 0.11(5) | 0.00(8) | 0.00(8) | 0.29(3) | 0.11(5) | 0.60(1) | 0.60(1) |
| Pendigits | 0.31(5) | 0.30(6) | 0.32(4) | 0.33(3) | 0.29(7) | 0.29(7) | 0.26(9) | 0.34(2) | 0.35(1) |
| Satellite | 0.57(1) | 0.54(2) | 0.45(5) | 0.26(9) | 0.51(3) | 0.39(6) | 0.32(7) | 0.51(3) | 0.30(8) |
| Shuttle | 0.32(4) | 0.32(4) | 0.33(2) | 0.32(4) | 0.30(8) | 0.29(9) | 0.31(7) | 0.33(2) | 0.34(1) |
| Spambase | 0.44(3) | 0.42(4) | 0.41(5) | 0.30(9) | 0.36(7) | 0.36(7) | 0.38(6) | 0.55(1) | 0.52(2) |
| Vowels | 0.46(4) | 0.48(3) | 0.45(5) | 0.33(8) | 0.28(9) | 0.39(6) | 0.35(7) | 0.50(2) | 0.53(1) |
| Waveform | 0.33(3) | 0.34(1) | 0.31(6) | 0.32(5) | 0.29(7) | 0.29(7) | 0.28(9) | 0.33(3) | 0.34(1) |
| Wine | 0.42(4) | 0.42(4) | 0.43(3) | 0.42(4) | 0.41(8) | 0.44(1) | 0.31(9) | 0.42(4) | 0.44(1) |
| Avg. | 0.41(3) | 0.41(3) | 0.39(5) | 0.32(7) | 0.28(9) | 0.35(6) | 0.32(7) | 0.46(1) | 0.45(2) |
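
Ranking quality of the kind tabulated above is commonly summarized by ROC AUC, which needs no explicit ROC curve: by the Mann-Whitney formulation, it equals the probability that a randomly chosen outlier is scored above a randomly chosen inlier, with ties counted as one half. A minimal helper of ours (not the paper's evaluation code):

```python
def auc(scores, labels):
    """ROC AUC = P(score of a random outlier > score of a random inlier),
    ties counted as 0.5 (Mann-Whitney U formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]   # outlier scores
    neg = [s for s, y in zip(scores, labels) if y == 0]   # inlier scores
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A detector that scores every outlier above every inlier reaches 1.0, and one that ranks them no better than chance hovers around 0.5, which is why values near 0.5 in the tables indicate a failure to separate the contamination.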
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Dong, H.; Liu, M.; Wu, S.; Wang, Q.-G.; Zhao, Z. Chain-Based Outlier Detection: Interpretable Theories and Methods for Complex Data Scenarios. Machines 2025, 13, 1040. https://doi.org/10.3390/machines13111040

