Machines
  • Article
  • Open Access

11 November 2025

Chain-Based Outlier Detection: Interpretable Theories and Methods for Complex Data Scenarios †

1 School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
2 Guangdong Provincial/Zhuhai Key Laboratory of IRADS, and Department of Computer Science, BNU-HKBU United International College, Zhuhai 519087, China
3 Hong Kong Baptist University, Hong Kong
4 School of Mechanical Engineering, Wuhan University of Science and Technology, Wuhan 430081, China
This article belongs to the Section Machines Testing and Maintenance

Abstract

Outlier detection is a critical task in the intelligent operation and maintenance (O&M) of transportation equipment, as it helps ensure the safety and reliability of systems such as high-speed trains, aircraft, and intelligent vehicles. Nearest neighbor-based detectors generally offer good interpretability, but often struggle with complex data scenarios involving diverse data distributions and various types of outliers, including local, global, and cluster outliers. Moreover, these methods typically rely on a predefined contamination level, a critical parameter that directly determines detection accuracy and can significantly impact system reliability in O&M environments. In this paper, we propose a novel chain-based theory for outlier detection with the aim of providing an interpretable and transparent solution for fault detection. We introduce two methods based on this theory: Cascaded Chain Outlier Detection (CCOD) and Parallel Chain Outlier Detection (PCOD). Both methods identify outliers through sudden increases in chaining distances, with CCOD being more sensitive to local data distributions and PCOD offering higher computational efficiency. Experimental results on synthetic and real-world datasets demonstrate the superior performance of our methods compared to existing state-of-the-art techniques, with average improvements of 11.3% for CCOD and 14.5% for PCOD.

1. Introduction

Outlier detection (OD) aims to identify data points that deviate from the general data distribution. It has been proven particularly valuable in the intelligent operation and maintenance (O&M) of transportation equipment, where identifying abnormal patterns in sensor data, operational parameters, and system behaviors is essential for ensuring safety and reliability [1,2,3].
In transportation equipment monitoring, outlier detection enables early identification of mechanical anomalies such as bearing degradation, gear wear, and structural fatigue through vibration analysis and multi-sensor data fusion [4,5]. The detection of such outliers in operational parameters—including temperature fluctuations, pressure variations, and rotational speed irregularities—indicates potential mechanical failures that require immediate attention to prevent catastrophic system breakdowns [6]. Beyond traditional condition monitoring, outlier detection has been successfully applied to assess component health in rotating machinery, where abnormal vibration signatures can reveal incipient faults in bearings, shafts, and gearboxes [7,8]. In transportation-specific applications, outlier detection serves to isolate anomalous operational patterns that indicate equipment degradation or performance deviation [9,10]. This capability has proven critical in railway systems for detecting wheel-rail interaction anomalies [11,12], in aviation for identifying engine performance deviations, and in automotive systems for monitoring drivetrain irregularities [13,14].
Despite widespread applications, existing outlier detection methods face significant challenges when applied to the complex data scenarios commonly encountered in transportation O&M. First, such datasets typically contain multiple outlier types simultaneously—local outliers that deviate from their neighborhoods, global outliers that are distant from the entire dataset, and cluster outliers, where the outliers themselves form small clusters [15,16,17,18]. Detection strategies optimized for different outlier types may conflict with each other, leading to suboptimal performance. Second, normal operational data often exhibits non-uniform distributions, where spatially close clusters may be distant in the underlying manifold space. Existing distance-based detectors that search for neighbors in ambient space fail to capture manifold similarity, causing points in sparse operational modes to include neighbors from nearby dense clusters, thereby disrupting the underlying data structure [19]. Third, most detectors require prior specification of contamination—the expected proportion of outliers—which directly determines detection accuracy and can significantly impact system reliability in O&M environments [20,21]. Estimating contamination in dynamic transportation systems is particularly challenging due to varying operational conditions and evolving fault patterns.
This paper extends our preliminary work with substantial improvements and comprehensive evaluations [22]. We present a chain-based theory with interpretable distance analysis and manifold-aware connectivity principles for complex data scenarios. Based on this theory, we propose two chaining methods: Cascaded Chain Outlier Detection (CCOD) and Parallel Chain Outlier Detection (PCOD). CCOD employs sequential chaining where each point has one predecessor and successor, providing higher sensitivity to local distributions and better manifold structure preservation, while PCOD allows multiple successors for parallel expansion, offering superior computational efficiency and scalability for large-scale transportation monitoring applications. Both methods identify outliers through sudden increases in chaining distances without requiring contamination parameters, making them particularly suitable for dynamic O&M environments.
The remainder of this paper is organized as follows: Section 2 reviews the related work on outlier detection and the evolution of the local outlier detectors. Section 3 presents the chain-based theory of outlier detection, as well as the proposed cascaded and parallel chaining methods. Section 4 demonstrates the superiority of the proposed model using synthetic and real-life data sets. In the end, concluding remarks are given in Section 5.

3. The Proposed Methods

Empirical observations reveal that normal data points are typically concentrated in larger numbers and have sufficient neighbors within the cluster, whereas outlier data points do not conform to this pattern. To formalize this observation, we propose a chain-based theory that respects the intrinsic manifold structure of data by computing similarity via geodesic paths that approximate manifold distances. The chain distance method decomposes long distances into consecutive short paths, better adapting to the intrinsic geometric structure of the manifold. The following theorem establishes the theoretical advantage of chain distance.
Theorem 1.
Chain distance has a tighter error bound relative to k-distance. Suppose the dataset $X = \{x_i\}, i = 1, 2, \ldots, n$, is distributed on an $m$-dimensional smooth manifold $\mathcal{M} \subset \mathbb{R}^d$ with bounded sectional curvature, i.e., there exists a constant $\kappa > 0$ such that the absolute value of the sectional curvature in any direction at any point does not exceed $\kappa$. When both methods find the same k-th nearest neighbor, the error bound of the chain method is smaller than that of the k-nearest neighbor method:
$$E_c = \frac{\left| d_c(x_i, x_j) - d_M(x_i, x_j) \right|}{d_M(x_i, x_j)} \le E_k = \frac{\left| d_k(x_i, x_j) - d_M(x_i, x_j) \right|}{d_M(x_i, x_j)},$$
where $x_i, x_j \in X$ and $x_j$ is the k-th nearest neighbor of $x_i$; $d_k(x_i, x_j) = \lVert x_i - x_j \rVert$ is the k-nearest-neighbor distance estimate, i.e., the Euclidean distance from $x_i$ to its k-th nearest neighbor $x_j$; $d_c(x_i, x_j) = \sum_{t=1}^{k-1} \lVert x_t - x_{t+1} \rVert$ is the chain distance from $x_i$ to $x_j$, i.e., the cumulative length of the piecewise path through $\{x_1 = x_i, x_2, \ldots, x_k = x_j\}$ constructed by stepwise nearest-neighbor linking; and $d_M(x_i, x_j)$ is the geodesic distance between $x_i$ and $x_j$.
Proof. 
We will analyze the local error bounds for k-nearest neighbor distance and chain distance separately, and then compare them.
  • Step 1. Local error bound for k-distance: For data point pairs $p$ and $q$ within the neighborhood, by the arc-chord estimation inequality in Riemannian geometry:
    $$\lVert p - q \rVert \le d_M(p, q) \le \lVert p - q \rVert \left( 1 + \frac{\kappa \lVert p - q \rVert^2}{6} \right).$$
    Taking $p = x_i$, $q = x_j$, and denoting $s = \lVert x_i - x_j \rVert$, we have:
    $$0 \le d_M(x_i, x_j) - d_k(x_i, x_j) \le \frac{\kappa}{6} s^3.$$
    Therefore, the relative error $E_k$ of the k-distance satisfies:
    $$E_k = \frac{\left| d_k(x_i, x_j) - d_M(x_i, x_j) \right|}{d_M(x_i, x_j)} \le \frac{\kappa s^2}{6} \cdot \frac{1}{1 + O(\kappa s^2)}.$$
  • Step 2. Local error bound for chain distance: The chain distance $d_c(x_i, x_j)$ decomposes a long distance into multiple short segments, reaching the same $x_j$ through stepwise nearest-neighbor linking, so the distance between adjacent chain points $x_t$ and $x_{t+1}$ is sufficiently small. Taking $p = x_t$, $q = x_{t+1}$, the error of each local link in the chain path is:
    $$d_M(x_t, x_{t+1}) - \lVert x_t - x_{t+1} \rVert \le \frac{\kappa \varepsilon_t^3}{6},$$
    where $\varepsilon_t = \lVert x_t - x_{t+1} \rVert$. The total error of the chain distance is:
    $$\left| d_c(x_i, x_j) - d_M(x_i, x_j) \right| \le \sum_{t=1}^{k-1} \frac{\kappa \varepsilon_t^3}{6} \le \frac{\kappa \bar{\varepsilon}_t^2}{6},$$
    where $\bar{\varepsilon}_t = \frac{1}{k-1} \sum_{t=1}^{k-1} \varepsilon_t$. Therefore, the relative error $E_c$ of the chain distance satisfies:
    $$E_c = \frac{\left| d_c(x_i, x_j) - d_M(x_i, x_j) \right|}{d_M(x_i, x_j)} \le \frac{\kappa \bar{\varepsilon}_t^2}{6}.$$
  • Step 3. Comparison of the two methods: When every chain segment satisfies $\varepsilon_t \le s$ and at least one segment satisfies $\varepsilon_t < s$, so that $\bar{\varepsilon}_t < s$, the relative error upper bound of the chain method is strictly smaller than that of the k-nearest neighbor method:
    $$\frac{\kappa \bar{\varepsilon}_t^2}{6} < \frac{\kappa s^2}{6},$$
    i.e., the bound on $E_c$ is strictly smaller than the bound on $E_k$. $\square$
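Before moving on, the following is a minimal numerical sketch (not from the paper) of the bound comparison in Theorem 1: points are sampled along a circular arc, a one-dimensional manifold of constant curvature, and the relative error of the straight chord (the k-distance estimate) is compared with that of the summed short chords (the chain distance). The radius, arc length, and number of points are arbitrary illustrative choices.

import numpy as np

# Points sampled along a circular arc of radius R (curvature kappa = 1/R).
R, k = 1.0, 8                                   # illustrative radius and chain length
theta = np.linspace(0.0, 0.8, k)                # arc parameters of x_1, ..., x_k
pts = np.c_[R * np.cos(theta), R * np.sin(theta)]

d_geo = R * (theta[-1] - theta[0])              # geodesic (arc-length) distance d_M(x_1, x_k)
d_knn = np.linalg.norm(pts[-1] - pts[0])        # k-distance: straight chord from x_1 to x_k
d_chain = np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))  # chain distance: sum of short chords

E_k = abs(d_knn - d_geo) / d_geo
E_c = abs(d_chain - d_geo) / d_geo
print(f"E_k = {E_k:.5f}, E_c = {E_c:.5f}")      # the chained estimate has the smaller relative error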
The chain-based theory enables us to distinguish normal points from outliers based on their chaining behavior. It is assumed that for a normal data point, there should exist a sufficient number of points in its neighborhood such that a series of chaining distances starting from this point does not increase excessively. Conversely, for an outlier data point, the number of points in its neighborhood is insufficient to prevent excessive increases in chaining distances.
As shown in Figure 1, the chain-based theory aims to chain normal points (green dots) following the manifold structure, while leaving outliers unchained as they cause excessive increases in chaining distances. However, traditional density-based methods search for neighbors in ambient space, and outliers can be selected as neighbors due to their proximity in ambient distance.
Figure 1. An illustration of chain-based theory.
Therefore, we utilize the chaining distance as an estimator of data distribution, searching neighbors via consecutive short paths that approximate manifold distances. Points whose chaining distances exhibit excessive increases are identified as outliers, indicating significant separation from neighbors in manifold geometry. Conversely, chained data points with stable chaining distance are identified as the potential normal data. Based on this chain-based theory, two methods are proposed for outlier detection.

3.1. Cascaded Chain Outlier Detection

Suppose $X = \{x_i\}, i = 1, 2, \ldots, n$, is the given data set, $X_N = \{\}$ is the data set of normal points, and $X_O = \{\}$ is the data set of outlier points. The Euclidean distance is used as the distance measure between any two points, i.e., $d(x_i, x_{\bar{i}}) = \lVert x_i - x_{\bar{i}} \rVert_2$, where $x_i, x_{\bar{i}} \in X$ and $i \ne \bar{i}$, and each pairwise distance is calculated only once. At the beginning of a chaining process in the search for outliers, a random data point $x_i$ is selected from $X_S$ as the start point, and the current chain is initialized as $X_C = \{x_i^j\}$ with $x_i^1 = x_i$. This is the first point in the current chain, and any point can serve as a start point, provided it has not already been identified as normal or used as a start point. Thus, we define the data set of remaining start points as $X_S = X \setminus X_N \setminus \{x_i^j\}$. Then, the second data point $x_i^2$ to be included in the current chain is chosen from $X \setminus \{x_i^j\}$ as the point with the smallest distance to the start point $x_i^1$. If multiple points attain the minimum distance from the start point, only one is selected. The smallest distance between $x_i^1$ and $x_i^2$ is defined as a chaining distance $d_1^2$, i.e., $d_1^2 = d(x_i^1, x_i^2)$. This becomes the second point in the current chain, i.e., $X_C = X_C \cup \{x_i^2\}$. For $j \ge 2$, the $(j+1)$-th data point is selected from $X \setminus \{x_i^j\}$ as the point with the smallest Euclidean distance to the j-th data point $x_i^j$. The ratio of the chaining distance $d_j^{j+1}$ over the average chaining distance in the current chain is calculated as
$$r_i^{j+1} = \frac{d_j^{j+1}}{\frac{1}{j-1} \sum_{t=1}^{j-1} d_t^{t+1}}.$$
For $j < k$, if $x_i^{j+1}$ satisfies
$$\frac{1}{\gamma} \le r_i^{j+1} \le \gamma,$$
it means that including $x_i^{j+1}$ in the series of chaining distances does not cause an excessive increase, and $x_i^{j+1}$ is a potential normal data point that can be added to the current chain, i.e., $X_C = X_C \cup \{x_i^{j+1}\}$; let $j = j + 1$ to chain the next point.
If $r_i^{j+1} < \frac{1}{\gamma}$ (while $r_i^{j+1} \le \gamma$) in the case of $j < k$, the possible merging of two chains is considered. Set the pre-chain data set $\tilde{X}_C = X_C$ and $x_i^{\tilde{j}+1} = x_i^{j+1}$, and start a new chain from $x_i^{j+1}$: $X_C = \{\}$, $j = 1$, $x_i^1 = x_i^{\tilde{j}+1}$. Once $|X_C| + |\tilde{X}_C| \ge k$ and the following condition is satisfied:
$$\frac{1}{\gamma} \le \frac{\frac{1}{\tilde{j}-1} \sum_{t=1}^{\tilde{j}-1} \tilde{d}_t^{\,t+1}}{\frac{1}{j-1} \sum_{t=1}^{j-1} d_t^{\,t+1}} \le \gamma, \tag{11}$$
where $\tilde{d}_t^{\,t+1}$ denotes the chaining distances in the pre-chain, the two chains have similar densities and can be combined as $X_C = X_C \cup \tilde{X}_C$, $j = |\tilde{X}_C| + j$; let $j = j + 1$ to chain the next point. If $|X_C| + |\tilde{X}_C| \ge k$ and Equation (11) is not satisfied, the two chains have different densities and cannot be combined; thus the pre-chain is reset as $\tilde{X}_C = \{\}$, and let $j = j + 1$ to chain the next point.
If $r_i^{j+1} > \gamma$, it means that the chaining distance to $x_i^{j+1}$ increases excessively, and the current chaining process ends at $x_i^j$. In the case of $j < k$, $X_C$ is not considered a normal cluster and is discarded. A new chaining process begins with $X_C = \emptyset$ from a random point in $X_S$.
Note that the proposed method imposes stricter constraints in the case of $j < k$: the upper constraint $\gamma$ aims to prevent the chain from cascading to outliers or other clusters, while the lower constraint $1/\gamma$ prevents the start point from being an outlier. If the start point were an outlier, the chaining distances of $x_i^j$ $(j > 2)$ would increase excessively compared to that of $x_i^2$. However, if $j \ge k$, the start point cannot be an outlier, because the proposed method considers chains containing more than $k$ elements to be normal, making the lower bound constraint unnecessary.
For $j \ge k$, if $x_i^{j+1}$ satisfies $r_i^{j+1} \le \gamma$, it means that including $x_i^{j+1}$ in the series of chaining distances does not cause an excessive increase, and $x_i^{j+1}$ is a potential normal data point that can be added to the current chain, i.e., $X_C = X_C \cup \{x_i^{j+1}\}$; let $j = j + 1$ to chain the next point. If $r_i^{j+1} > \gamma$, it means that the chaining distance to $x_i^{j+1}$ increases excessively, and the current chaining process ends at $x_i^j$. Then $X_N = X_N \cup X_C$, and the data set of start points is updated as $X_S = X_S \setminus X_N$. A new chaining process begins with $X_C = \emptyset$ from a random point in $X_S$.
The search stops when no new clusters can be formed, i.e., $|X_S| = 0$. Finally, a point is considered normal if it belongs to $X_N$ and is considered an outlier otherwise, i.e., $X_O = X \setminus X_N$. In the proposed method, every chained point has only one predecessor and one successor, excluding the first and last. Hence, we refer to the method as outlier detection by cascaded chaining. The process of detecting outliers by CCOD is summarized in Algorithm 1.
Algorithm 1: Outlier detection by cascaded chaining
Input: The given data set $X = \{x_i\}, i = 1, 2, \ldots, n$; an integer parameter $k$ $(2 < k \le n)$; a real parameter $\gamma$ $(\gamma \ge 1)$; and a real parameter $\alpha$ $(0 < \alpha < 1)$.
 1: Initialize: chain data set $X_C = \emptyset$, pre-chain data set $\tilde{X}_C = \emptyset$, start point data set $X_S = X$, normal data set $X_N = \emptyset$, and outlier data set $X_O = \emptyset$;
 2: while $|X_S| > 0$ do
 3:   Take $x_i$ from $X_S$ randomly as a start point, $j = 1$, $x_i^j = x_i$, $X_C = \{x_i^j\}$, $X_S = X_S \setminus X_N \setminus \{x_i^j\}$
 4:   $x_i^{j+1} = \arg\min_{x_{\bar{i}} \in (X \setminus \{x_i^j\})} d(x_i^j, x_{\bar{i}})$
 5:   if $j < 2$ then
 6:     $X_C = X_C \cup \{x_i^{j+1}\}$, $j = j + 1$, go to 4
 7:   else
 8:     if $|X_C| \ge k$ and $r_i^{j+1} \le \gamma$ then
 9:       $X_C = X_C \cup \{x_i^{j+1}\}$, $j = j + 1$,
10:       $X_N = X_N \cup X_C$, go to 4
11:     else if $|X_C| < k$ and $1/\gamma \le r_i^{j+1} \le \gamma$ then
12:       $X_C = X_C \cup \{x_i^{j+1}\}$, $j = j + 1$, go to 4
13:     else if $|X_C| < k$ and $r_i^{j+1} < 1/\gamma$ then
14:       $\tilde{X}_C = X_C$, $x_i^{\tilde{j}+1} = x_i^{j+1}$,
15:       $X_C = \{\}$, $j = 1$, $x_i^1 = x_i^{\tilde{j}+1}$, go to 4
16:     else if $\tilde{X}_C \ne \{\}$ and $|X_C| + |\tilde{X}_C| \ge \alpha \times k^2$ then
17:       if Equation (11) is satisfied then
18:         $X_C = X_C \cup \tilde{X}_C$, $j = |\tilde{X}_C| + j$, $j = j + 1$,
19:         $X_N = X_N \cup X_C$, go to 4
20:       else
21:         $\tilde{X}_C = \{\}$, $j = j + 1$, go to 4
22:       end if
23:     else
24:       $X_C = \{\}$
25:     end if
26:   end if
27: end while
Output: $X_O = X \setminus X_N$
To illustrate how CCOD forms chains, we present an example on a two-dimensional data set in Figure 2. The parameters are $k = 5$ and $\gamma = 1.5$. Starting at the point $x_1$, the second point in the chain is identified as $x_3$, as it has the smallest distance from $x_1$, i.e., $x_3^2 = x_3$. Next, the point $x_7$ is cascaded as the successor of $x_3^2$, i.e., $x_7^3 = x_7$, because there are sufficient points in its neighborhood and the chaining distance condition $1/\gamma \le r_7^3 \le \gamma$ is satisfied. Similarly, $x_9^4$ is cascaded as the successor of $x_7^3$. The current chain $X_C$ is $\{x_1^1, x_3^2, x_7^3, x_9^4\}$. Then, we focus on the nearest point to $x_9^4$, which is $x_{15}$ and has a much shorter chaining distance than before, i.e., $1/\gamma > r_{15}^5$ and $r_{15}^5 \le \gamma$. At this point, we consider merging two chains. We copy the current chain into a pre-chain denoted as $\tilde{X}_C = \{x_1^1, x_3^2, x_7^3, x_9^4\}$, and a new current chain starts from $x_{15}$. The search stops once $|X_C| + |\tilde{X}_C| \ge k$, with $X_C = \{x_{15}^1, x_{18}^2, x_{19}^3\}$. The current chain and pre-chain can be combined because they have similar densities according to Equation (11), i.e., $X_C = X_C \cup \tilde{X}_C$, and the order of the data points in $X_C$ is updated as $\{x_1^1, x_3^2, x_7^3, x_9^4, x_{15}^5, x_{18}^6, x_{19}^7\}$. For $|X_C| > k$, $x_{20}^8$ is cascaded as the successor of $x_{19}^7$, but $x_{21}$, the nearest point to $x_{20}^8$, cannot be cascaded, because it would increase the chaining distance excessively, i.e., $r_{21}^9 > \gamma$. At this point, the current chain is complete, as it cannot be extended any further. The current chain is considered a normal data set, as it contains more than $k$ data points. However, $x_{21}$ fails to be chained as a successor, cannot form a normal data set as a start point, and is thus considered an outlier.
Figure 2. CCOD in a 2D example.
It should be noted that in Figure 2, a substantial number of normal data points remain unchained even after a chaining process. These data points possess adequate neighbors but are restricted by the expansion direction of the current chain. Hence, we propose a variant approach that allows each point to have multiple successors. The CCOD approach is a special case of this variant, in which the number of successor points is restricted to one. While the two methods differ in detail, they share the same underlying concept: identifying outliers by excessive increases in chaining distances, and collecting the chained data points into a cluster of potential normal data when the chaining process exhibits no excessive distance change. In both methods, a potential normal data cluster is defined as a normal data set if it contains more than $\alpha \times k^2$ $(0 < \alpha < 1)$ data points.
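As a concrete companion to Algorithm 1, the following is a simplified Python sketch (not the authors' implementation) of a single cascaded chaining pass: it applies the two-sided ratio rule before the chain reaches k points and the one-sided rule afterwards, and it simply terminates the chain where the full method would instead attempt the pre-chain merging of Equation (11). The function name and structure are illustrative only.

import numpy as np

def cascaded_chain(X, start, k=10, gamma=3.6):
    """Sketch of one CCOD chaining pass (pre-chain merging omitted).

    X is an (n, d) array and start is the index of the start point.
    Returns the list of chained indices; the chain ends when the ratio rule fails.
    """
    chain = [start]
    dists = []                                   # chaining distances d_1^2, d_2^3, ...
    remaining = set(range(len(X))) - {start}
    while remaining:
        last = chain[-1]
        # nearest unchained point to the current chain tail
        cand = min(remaining, key=lambda i: np.linalg.norm(X[i] - X[last]))
        d = np.linalg.norm(X[cand] - X[last])
        if dists:                                # ratio of the new distance to the running average
            r = d / np.mean(dists)
            if len(chain) < k and not (1.0 / gamma <= r <= gamma):
                break                            # stricter two-sided rule before the chain reaches k points
            if len(chain) >= k and r > gamma:
                break                            # only the upper bound applies afterwards
        chain.append(cand)
        dists.append(d)
        remaining.remove(cand)
    return chain

A chain with enough points (at least alpha * k**2 in the full method) would then be marked as normal, and points never absorbed by any such chain would be reported as outliers.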

3.2. Parallel Chain Outlier Detection

Suppose $X = \{x_i\}, i = 1, 2, \ldots, n$, is the given data set, $X_N = \{\}$ is the data set of normal points, and $X_O = \{\}$ is the data set of outlier points. We still use the Euclidean distance as the distance measure between any two points. At the beginning of a chaining process, a random data point $x_i$ is selected from $X_S$ as the start point, and the current chain is initialized as $X_C = \{x_i^j\}$ with $x_i^1 = x_i$. This is the first point in the current chain, and any point can serve as a start point, provided it has not already been identified as normal or used as a start point. Thus, we define the data set of remaining start points as $X_S = X \setminus X_N \setminus \{x_i^j\}$. Then, the next batch of points is chosen from $X \setminus \{x_i^j\}$ as the points with the $k$ smallest distances to the start point $x_i^1$. If a data point has more than $k$ following points, only the first $k$ points are included. The distance between $x_i^1$ and $x_i^j$ is defined as a chaining distance $d_1^j$, i.e., $d_1^j = d(x_i^1, x_i^j)$. Let $X_B^z$ represent the z-th batch. The first batch contains only the start point, i.e., $X_B^1 = \{x_i^1\}$, and the second batch is recorded as $X_B^2$, i.e., $X_B^2 = \{x_i^2, \ldots, x_i^{k+1}\}$, where $x_i^j \in X \setminus \{x_i^1\}$ and $d(x_i^1, x_i^j) \le d(x_i^1, x_i^{j+1})$. These two batches are included in the current chain, i.e., $X_C = X_B^1 \cup X_B^2$. For $z \ge 3$, the current points are selected from $X_B^z$ in ascending order of superscript, and the $k$ nearest points to be included in the $(z+1)$-th batch are selected from $X \setminus X_C$ based on their $k$ smallest Euclidean distances to the current point. The $t$-th nearest point to the current point $x_i^j$ is $x_i^{j+t}$, where $t = 1, \ldots, k$. The ratio of the chaining distance $d_j^{j+t}$ over the average chaining distance in the current chain is calculated as
$$r_i^{j+t} = \frac{d_j^{j+t}}{\frac{1}{j-1} \sum_{u=1}^{j-1} d_u^{u+1}}.$$
Note that points with smaller distances to their predecessors have higher priority in selecting successors, since each successor can only be selected once within the same batch or cluster.
For $t = 1, \ldots, k$, if any $x_i^{j+t}$ satisfies
$$\frac{1}{\gamma} \le r_i^{j+t} \le \gamma, \tag{13}$$
it means that a series of chaining distances containing $x_i^{j+t}$ does not increase excessively, and $x_i^{j+t}$ is a potential normal data point that can be taken into the $(z+1)$-th batch as one of the successor points of $x_i^j$, i.e., $x_i^{j+1} = x_i^{j+t}$, $X_B^{z+1} = X_B^{z+1} \cup \{x_i^{j+1}\}$; let $j = j + 1$ to chain the next point. Then, $X_C = X_C \cup X_B^{z+1}$, and let $z = z + 1$ to chain the next batch. The current chaining process stops when there are no new points to be chained, i.e., $X_B^{z+1} = \{\}$. Then, $X_C$ is a normal cluster if $j \ge \alpha \times k^2$ $(0 < \alpha < 1)$, such that $X_N = X_N \cup X_C$, and the data set of start points is updated as $X_S = X_S \setminus X_N$. A new chaining process begins with $X_C = \emptyset$ from a random point in $X_S$.
The search stops when no new clusters can be formed, i.e., $|X_S| = 0$. Finally, a point is considered normal if it belongs to $X_N$; otherwise, it is considered an outlier, i.e., $X_O = X \setminus X_N$. In the proposed variant approach, each chained point except the start point has only one predecessor but can have more than one successor. Hence, we refer to the method as outlier detection by parallel chaining. The process of detecting outliers by PCOD is summarized in Algorithm 2.
To illustrate how PCOD forms chains, we present an example on a two-dimensional data set in Figure 3. The parameters are $k = 5$ and $\gamma = 1.5$. Starting at the point $x_1$ as the first batch $X_B^1 = \{x_1^1\}$, the second batch of points in the chain is identified as $x_3$, $x_4$, $x_2$, $x_6$, and $x_5$, as they have the $k$ smallest distances from $x_1$ and $d_1^3 < d_1^4 < d_1^2 < d_1^6 < d_1^5$, i.e., $X_B^2 = \{x_3^2, x_4^3, x_2^4, x_6^5, x_5^6\}$. The current chain is $X_C = X_B^1 \cup X_B^2$. Next, $x_7$ is chained as the successor of $x_3^2$ from $X \setminus X_C$, i.e., $x_7^7 = x_7$, and $x_8$ is chained as the successor of $x_6^5$ from $X \setminus X_C$, i.e., $x_8^8 = x_8$, since $1/\gamma \le r_7^7 \le \gamma$ and $1/\gamma \le r_8^8 \le \gamma$. However, the other points in $X_B^2$ have no new successors, so the 3rd batch has two elements, i.e., $X_B^3 = \{x_7^7, x_8^8\}$. The current chain is $X_C = X_C \cup X_B^3$. Similarly, $x_9^9$, $x_{10}^{10}$, and $x_{11}^{11}$ are chained as successors of $x_7^7$ from $X \setminus X_C$, and $x_{12}^{12}$, $x_{13}^{13}$, and $x_{14}^{14}$ are chained as successors of $x_8^8$ from $X \setminus X_C$. The 4th batch $X_B^4$ is $\{x_9^9, x_{10}^{10}, x_{11}^{11}, x_{12}^{12}, x_{13}^{13}, x_{14}^{14}\}$, and the current chain is $X_C = X_C \cup X_B^4$. Continuing in the same way, $x_{15}^{15}$ is chained as the successor of $x_9^9$, $x_{18}^{16}$ as the successor of $x_{10}^{10}$, $x_{19}^{17}$ as the successor of $x_{11}^{11}$, and $x_{16}^{18}$ and $x_{17}^{19}$ as the successors of $x_{14}^{14}$. The 5th batch $X_B^5$ is $\{x_{15}^{15}, x_{18}^{16}, x_{19}^{17}, x_{16}^{18}, x_{17}^{19}\}$, and the current chain is $X_C = X_C \cup X_B^5$. The 6th batch $X_B^6$ is $\{x_{20}^{20}\}$, which has only one element, the successor of $x_{19}^{17}$. However, $x_{21}$ fails to be chained as a successor, cannot form a normal data set as a start point, and is thus considered an outlier.
Figure 3. PCOD in a 2-D example [22].
Both CCOD and PCOD have a theoretical time complexity of O ( n 2 ) , but their practical performance differs due to chain formation strategies. CCOD requires a large number of small chains, providing higher sensitivity to local distributions but making it unsuitable for large data sets. PCOD divides observations into fewer chains with more points in each chain, offering superior computational efficiency by greatly reducing the time required to form new chains, though more time is spent processing each individual chain. Therefore, CCOD is more suitable for small data sets requiring precise local analysis, while PCOD is ideal for large data sets prioritizing computational efficiency.
Algorithm 2: Outlier detection by parallel chaining
Input: The given data set $X = \{x_i\}, i = 1, 2, \ldots, n$; an integer parameter $k$ $(2 < k \le n)$; a real parameter $\gamma$ $(\gamma \ge 1)$; and a real parameter $\alpha$ $(0 < \alpha < 1)$.
 1: Initialize: chain data set $X_C = \emptyset$, start point data set $X_S = X$, normal data set $X_N = \emptyset$, outlier data set $X_O = \emptyset$;
 2: while $|X_S| > 0$ do
 3:   Take $x_i$ from $X_S$ randomly as a start point, $j = 1$, $x_i^j = x_i$, $z = 1$, $X_B^z = \{x_i^j\}$, $X_C = X_B^z$, $X_S = X_S \setminus X_N \setminus X_B^z$
 4:   while $|X_B^z| > 0$ do
 5:     for $\hat{t} = 1 : |X_B^z|$ do
 6:       for $t = 1 : k$ do
 7:         $x_i^{j+t} = \arg\min_{x_i^{j+t} \in (X \setminus X_C),\ x_i^{\hat{t}} \in X_B^z} d(x_i^{\hat{t}}, x_i^{j+t})$
 8:         if $|X_C| == 1$ or $x_i^{j+t}$ satisfies Equation (13) then
 9:           $X_B^{z+1} = X_B^{z+1} \cup \{x_i^{j+1}\}$, $j = j + 1$
10:         end if
11:       end for
12:     end for
13:     $X_C = X_C \cup X_B^{z+1}$, $z = z + 1$, $X_B^{z+1} = \{\}$
14:   end while
15:   if $|X_C| \ge \alpha \times k^2$ then
16:     $X_N = X_N \cup X_C$
17:   else
18:     $X_C = \{\}$
19:   end if
20: end while
Output: $X_O = X \setminus X_N$
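For orientation, the following is a simplified Python sketch of one parallel chaining pass mirroring the loop structure of Algorithm 2; function and variable names are illustrative, it always accepts the first batch (as in line 8 of the algorithm), and it omits bookkeeping such as tracking the remaining start points.

import numpy as np

def parallel_chain(X, start, k=15, gamma=3.6):
    """Sketch of one PCOD chaining pass.

    Each chained point may recruit up to k successors per batch; a candidate is
    accepted only if its chaining-distance ratio stays within [1/gamma, gamma].
    Returns the set of chained indices.
    """
    chained = {start}
    dists = []                                       # accepted chaining distances so far
    batch = [start]                                  # the first batch contains only the start point
    while batch:
        next_batch = []
        for cur in batch:
            # up to k nearest unchained candidates of the current point
            cand = sorted(set(range(len(X))) - chained,
                          key=lambda i: np.linalg.norm(X[i] - X[cur]))[:k]
            for idx in cand:
                d = np.linalg.norm(X[idx] - X[cur])
                r = d / np.mean(dists) if dists else 1.0   # the very first batch is always accepted
                if 1.0 / gamma <= r <= gamma:              # the condition of Equation (13)
                    chained.add(idx)
                    next_batch.append(idx)
                    dists.append(d)
        batch = next_batch
    return chained

A chain is kept as a normal cluster when it contains at least alpha * k**2 points; unchained points that also fail to seed such a cluster are flagged as outliers.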

4. Results

In this section, we compare the proposed models with seven state-of-the-art algorithms. To ensure a comprehensive comparison, we chose LOF [24] and its most-cited variants COF [25], LoOP [34], LOCI [35], NOF [26], RDOS [36], and ECOD [39] as representative local outlier detectors. Unlike existing methods requiring predefined contamination parameters, our approaches automatically identify outliers by detecting abrupt increases in chaining distances.
The performance metrics employed in our study are the Area Under the Receiver Operating Characteristic Curve (AUC) and Precision (PRN). The AUC evaluates the trade-off between correctly identifying true outliers and minimizing false detections on normal points, where AUC = 1.00 indicates perfect separation of normal data and outliers. Unlike conventional detectors that produce continuous outlier scores and require a predefined contamination, the proposed methods output binary scores, assigning 1 to predicted normal points and 0 to predicted outliers. The PRN measures the proportion of correctly identified outliers among all predicted outliers, where PRN = 1.00 indicates that all predicted outliers are true outliers. For each method, results are reported under the parameter setting that achieves its highest AUC. All reported results are averaged over 5 independent runs.
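As a small illustration of how these two metrics can be computed from a detector's binary output, the snippet below uses scikit-learn with hypothetical ground-truth labels (1 for true outliers) and a hypothetical prediction vector; it is not the paper's evaluation code.

import numpy as np
from sklearn.metrics import roc_auc_score, precision_score

y_true = np.array([0, 0, 0, 0, 1, 1, 0, 1])           # hypothetical labels (1 = true outlier)
y_pred_normal = np.array([1, 1, 1, 1, 0, 0, 1, 0])     # hypothetical detector output (1 = predicted normal)

outlier_score = 1 - y_pred_normal                      # higher score = more outlying
auc = roc_auc_score(y_true, outlier_score)             # AUC over the binary scores
prn = precision_score(y_true, outlier_score)           # fraction of predicted outliers that are true outliers
print(f"AUC = {auc:.2f}, PRN = {prn:.2f}")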
Experiments are conducted on a 2.40 GHz Core i5-1135G7 CPU with 16 GB of RAM. The experiments are programmed in Python, and the software version is Python 3.8.

4.1. Detection Performance on Synthetic Data Sets

To illustrate the intuition behind the proposed methods, experiments are conducted on a variety of synthetic data sets. As shown in Figure 4a–c, Example_l, Example_c, and Example_n are designed for different complex data scenarios. Example_l is constructed to evaluate the effectiveness of outlier detection algorithms in identifying both local and global outliers. Example_c aims to evaluate the performance of detecting outlier clusters. Example_n investigates the ability of detectors to identify local and global outliers when two normal clusters with different densities are located near each other. As shown in Figure 4d–l, the Groups, Regressions, and Moons data sets are designed for various random distributions. For the Groups data sets in Figure 4d–f, we generate Groups_e, Groups_u, and Groups_g following equidistant (_e), uniform (_u), and Gaussian (_g) subtype distributions, respectively. The Regressions data sets in Figure 4g–i and the Moons data sets in Figure 4j–l are generated in the same way. Each subtype includes three different random variations, and the experimental results are averaged across these variations. In addition, we add 3%, 5%, and 7% outliers to these distributions, where the outliers follow a uniform distribution. The characteristics of the synthetic data sets are listed in Table 1.
Figure 4. Synthetic data sets for performance evaluation of outlier detection methods.
Table 1. The characteristics of synthetic data sets.
The performance of each detector on the Example_l, Example_c, and Example_n data sets in terms of AUC and PRN is presented in Table 2 and Table 3, respectively. For the Example_l data set, the proposed methods and most algorithms achieve perfect AUC and PRN values of 1.000, whereas ECOD exhibits significant limitations in identifying outliers in this small-scale data set. For the Example_c data set, COF, LoOP, and ECOD achieve AUC and PRN values consistently lower than average, indicating that these methods fail to accurately detect clustered outliers due to their high similarity with normal clusters. For the Example_n data set, LOCI, NOF, and ECOD all fail because the sparse cluster members located near the dense clusters have low local density, causing normal data to be considered outliers. The experiments demonstrate that only the proposed methods and RDOS achieve perfect performance across these scenarios.
Table 2. AUC of various outlier detection methods in Example_l, Example_c, and Example_n data sets.
Table 3. PRN of various outlier detection methods in Example_l, Example_c, and Example_n data sets.
The performance of each detector on the Groups, Regressions, and Moons data sets in terms of AUC and PRN is presented in Table 4 and Table 5.
Table 4. AUC of various outlier detection methods in Groups, Regressions and Moons data sets.
Table 5. PRN of various outlier detection methods in synthetic data sets.
For the Groups data sets, most methods achieve perfect AUC and PRN values of 1.000 on Group_e and Group_u when the contamination is 3 % and 5 % , except for NOF and ECOD. However, as the contamination increases to 7 % , only CCOD and PCOD maintain perfect AUC and PRN values of 1.000 on Group_u, while on Group_e, PCOD achieves the highest AUC of 0.994 . Particularly, on Group_e with 7 % contamination, PCOD outperforms LOF, COF, LoOP, LOCI, NOF, RDOS, and ECOD by 0.016 , 0.065 , 0.046 , 0.020 , 0.038 , 0.008 , and 0.213 , respectively. For the more irregular Group_g data set with 7 % contamination, CCOD achieves an AUC of 0.918 and PRN of 0.785 , while PCOD obtains an AUC of 0.897 and PRN of 0.818 , demonstrating CCOD’s superior sensitivity to local distribution variations in terms of AUC.
On the Regressions data sets, LOF, LoOP, LOCI, and RDOS show relatively unstable AUC values and correspondingly lower PRN values on linearly distributed data, while COF consistently demonstrates strong performance through its shortest-path linkage mechanism. Specifically, under 3 % contamination in Regression_e, LOF and NOF achieve PRN values of only 0.800 , ECOD obtains 0.667 , whereas CCOD reaches 0.973 . When the contamination increases to 5 % , CCOD and PCOD both achieve perfect scores with AUC and PRN of 1.000 on Regression_e and Regression_u, indicating that the chaining theory effectively adapts to linear manifold structures. However, at 5 % contamination on the Gaussian-distributed Regression_g, the performance of both methods degrades to AUC values of 0.989 and 0.981 , with corresponding PRN values of 0.967 and 0.917 , respectively, suggesting room for improvement in scenarios with more complex density distributions.
For the Moons data sets, most methods show significant performance degradation due to difficulties in capturing the diverse data distribution structures. CCOD and PCOD maintain perfect AUC and PRN values of 1.000 across all contamination levels on Moon_e, except that PCOD’s AUC drops to 0.919 at 7 % contamination, and achieve the best AUC and PRN performance at 3 % and 5 % contaminations on Moon_u and Moon_g. Particularly, on Moon_g with 3 % contamination, CCOD outperforms LOF, LOCI, NOF, and ECOD by 0.101 , 0.018 , 0.121 , and 0.307 , respectively. Under 7 % contamination on Moon_g, CCOD achieves a PRN of 0.833 while PCOD obtains 0.729 , demonstrating its superior precision within complex distribution scenarios. The experiments demonstrate the superior robustness of CCOD and PCOD across complex data scenarios with varying contamination levels.
Furthermore, we investigate the parameter tuning and sensitivity analysis for CCOD and PCOD. The minimum cluster size is determined by α × k 2 , where the parameter α is set to a fixed value of 0.2 in both CCOD and PCOD, whereas γ and k are tunable parameters. In the experiments on synthetic data sets, γ ranges from 1 to 5 with an interval of 0.2, and k ranges from 5 to 50 with an interval of 5. Empirically, when CCOD and PCOD achieve optimal AUC, the value of γ is approximately 3.6. Therefore, we recommend setting γ = 3.6 when prior statistical information is unavailable. Similarly, the optimal value of k is approximately 10 for CCOD and 15 for PCOD. It is recommended that k be increased moderately with growing data set size. Additionally, we evaluate the sensitivity of parameter γ . As shown in Figure 5, the values of γ that achieve optimal AUC in CCOD and PCOD are presented across varying values of k. It is observed that the correlation between γ and k is weak, with the variation in γ remaining insignificant as k increases from 5 to 50. In Figure 5d, the value of γ for both CCOD and PCOD consistently remains at 3.6 across all values of k.
Figure 5. Sensitivity of parameter γ to varying k in synthetic data sets.
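The grid described above can be written as a simple search loop. The sketch below assumes a hypothetical detector function pcod_predict(X, k, gamma) returning binary labels (1 = predicted normal); it stands in for a PCOD implementation and is not part of the paper.

import numpy as np
from sklearn.metrics import roc_auc_score

def tune(X, y_true, pcod_predict):
    """Search the gamma and k grids used in the experiments for the best AUC."""
    best = (None, None, -np.inf)                       # (gamma, k, AUC)
    for gamma in np.arange(1.0, 5.2, 0.2):             # gamma in [1, 5] with step 0.2
        for k in range(5, 55, 5):                      # k in [5, 50] with step 5
            scores = 1 - pcod_predict(X, k=k, gamma=gamma)   # convert binary labels to outlier scores
            auc = roc_auc_score(y_true, scores)
            if auc > best[2]:
                best = (gamma, k, auc)
    return best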

4.2. Detection Performance on Real-Life Data Sets

To evaluate the predictive performance of different models in real-life scenarios, experiments are conducted on 20 real-life data sets with varying sample sizes, dimensionalities, and contamination. The data sets span diverse applications, including healthcare, sensor readings, handwriting, and tabular benchmarks [40,41]. In the medical domain, the breast cancer Wisconsin data set (abbr. BCancer) contains benign and malignant cases extracted from Fine Needle Aspirate (FNA) images of nuclei. By treating malignant cases as outliers, a detector is evaluated on how well it identifies them. In pattern recognition, the pen-based recognition of handwritten digits data set is a collection of data extracted from digitized handwritten images of different writers. Of these, Pen-global treats digit 8 as normal and all other digits as outliers, while in Pen-local the samples of digit 4 form an outlier cluster. In remote sensing, the Landsat satellite data set (abbr. Satellite) contains scenes extracted from satellite imagery, where semantic scenes corresponding to cotton crops and soil with vegetation stubble are selected as outliers. All numeric attributes are normalized before being used in our experiments. To mitigate the curse of dimensionality in high-dimensional data sets, we apply principal component analysis for dimensionality reduction while preserving at least 95% of the original variance to maintain detection capability and ensure fair comparison. Table 6 summarizes the data set characteristics.
Table 6. The characteristics of real-life data sets.
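The normalization and variance-preserving reduction described above map directly onto scikit-learn. The snippet below is a sketch of that preprocessing; min-max normalization is assumed here, as the exact scaler is not specified in the text.

from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

def preprocess(X):
    """Normalize numeric attributes, then keep components explaining >= 95% of the variance."""
    X_norm = MinMaxScaler().fit_transform(X)
    return PCA(n_components=0.95).fit_transform(X_norm)   # a float n_components is the retained variance ratio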
Table 7 and Table 8 present the AUC and PRN performance of each detector on 20 real-life data sets, with rankings shown in parentheses. Overall, PCOD achieves the highest average AUC of 0.71 , while CCOD attains the highest average PRN of 0.46 and the second-best AUC of 0.69 .
Table 7. AUC of various outlier detection methods in real-life data sets.
Table 8. PRN of various outlier detection methods in real-life data sets.
For low contamination data sets (≤5%), we analyze 10 data sets with contamination ranging from 0.15 % to 4.21 % . In this case, PCOD demonstrates exceptional performance, achieving first rank in 7 out of 10 data sets, while CCOD consistently delivers competitive results. For the BCancer data set, PCOD and LOF achieve the highest AUC of 0.83 with a PRN of 0.67 , surpassing COF, LoOP, and LOCI by 0.06 , NOF by 0.23 , RDOS by 0.04 , and ECOD by 0.17 . For the Musk data set, PCOD achieves the best performance with an AUC of 0.76 and PRN of 0.39 , significantly outperforming LOF by 0.16 . For the Pen-local data set, both CCOD and PCOD achieve a PRN of 0.60 , while CCOD achieves the highest AUC of 0.73 compared to PCOD’s 0.71 , benefiting from its more refined cascaded local search mechanism that effectively identifies small outlier clusters. For the Shuttle and Optdigits data sets, PCOD achieves the highest AUC, and CCOD is the second highest performer. The superior performance is particularly notable on large-scale data sets, where PCOD achieves substantial improvements over other methods.
For medium contamination data sets (5–20%), we evaluate 5 data sets with contamination between 6.25 % and 11.14 % . In this case, PCOD ranks first in 3 out of 5 data sets, while CCOD consistently ranks in the top-4 across these data sets. For the Annthyroid data set, PCOD achieves the best performance with an AUC of 0.66 and PRN of 0.35 , its AUC outperforms LOF by 0.04 , and significantly surpasses NOF and ECOD by 0.06 . For the Pen-global data set, PCOD achieves the highest AUC of 0.84 , demonstrating superior effectiveness over LOCI by 0.26 and NOF by 0.27 , showcasing excellent performance in pattern recognition tasks such as handwriting recognition. However, on the Letter data set, LoOP achieves the highest AUC of 0.72 , with PCOD and CCOD at 0.70 . This occurs because traditional methods can use exact contamination levels as prior knowledge, whereas CCOD and PCOD determine outliers based on chaining distances without prior contamination knowledge, which may reduce precision in some cases.
For high contamination data sets (≥20%), we test 5 challenging data sets with contamination from 20.65 % to 45.87 % . Despite the challenging conditions with dense outlier distributions, PCOD maintains first ranking in all 5 data sets, while CCOD maintains competitive performance. For the HeartDis data set with the highest contamination ( 45.87 % ), PCOD achieves the best AUC of 0.65 and CCOD achieves 0.63 , with PCOD obtaining a PRN of 0.50 and CCOD achieving a PRN of 0.53 , demonstrating strong performance. For the Spambase data set, PCOD attains the highest AUC of 0.67 with a PRN of 0.52 , with CCOD achieving the second-best AUC of 0.65 and a PRN of 0.55 , substantially outperforming traditional methods like LOF and COF. In contrast, traditional methods show significant performance degradation in high contamination scenarios. For instance, ECOD achieves only 0.60 AUC with a PRN of 0.31 on the Wine data set, while NOF obtains AUC values below 0.55 with PRN values generally below 0.40 on multiple high contamination data sets. Although COF and LOF maintain competitiveness on some data sets, their overall performance remains unstable. The ensemble method ECOD demonstrates poor performance on multiple data sets, highlighting the limitations of feature independence assumptions. In this case, contamination parameters have a significant impact on outlier detection performance, and even minor contamination variations can cause significant changes in results. However, the proposed methods remain effective when dealing with highly contaminated data.
We present the running time analysis on real-life data sets. Detectors ranked by average execution time are: ECOD (1.73s), LOF (1.80s), PCOD (2.68s), NOF (2.82s), CCOD (3.95s), RDOS (7.27s), LOCI (23.97s), LoOP (39.27s), and COF (51.67s). ECOD achieves linear complexity O ( n ) , while LOCI has cubic complexity O ( n 3 ) . Other methods, including LOF, NOF, RDOS, LoOP, COF, and the proposed CCOD and PCOD, have quadratic complexity O ( n 2 ) . Among the O ( n 2 ) methods, LOF is computationally the most efficient, involving only distance calculations and simple density ratios. In contrast, NOF requires bidirectional verification of mutual k-nearest neighbors, RDOS and LoOP incur overhead from kernel density estimation and probabilistic computations, and COF suffers from expensive shortest path construction across connectivity patterns. Our CCOD and PCOD methods achieve practical efficiency comparable to LOF through streamlined contrastive computations, while delivering superior detection accuracy, effectively balancing cost and performance.
Experimental results demonstrate that our chain-based approaches consistently outperform state-of-the-art methods across both synthetic and real-life data sets, validating the effectiveness of chain-based theory in tackling complex outlier detection scenarios.
Furthermore, we investigate the parameter tuning and sensitivity analysis for CCOD and PCOD. The minimum cluster size is determined by α × k 2 , where the parameter α is set to a fixed value of 0.2 in both CCOD and PCOD, whereas γ and k are tunable parameters. In the experiments on real-life data sets, γ ranges from 1 to 5 with an interval of 0.2, and k ranges from 5 to 50 with an interval of 5. Empirically, when CCOD and PCOD achieve optimal AUC, the value of γ is approximately 3.6. Therefore, we recommend setting γ = 3.6 when prior statistical information is unavailable. Similarly, the optimal value of k is approximately 15 for CCOD and 25 for PCOD. It is recommended that k be increased moderately with growing data set size. Additionally, we evaluate the sensitivity of parameter γ . As shown in Figure 6, the values of γ that achieve optimal AUC in CCOD and PCOD are presented across varying values of k. It is observed that the correlation between γ and k is weak, with the variation in γ remaining insignificant as k increases from 5 to 50. In Figure 6b, the value of γ for both CCOD and PCOD consistently remains at 3.6 across all values of k.
Figure 6. Sensitivity of parameter γ to varying k in real-life data sets.

5. Conclusions

This paper proposes a novel chain-based theory for outlier detection that simultaneously addresses multiple challenges in complex data scenarios: handling different outlier types, adapting to varying cluster densities, and eliminating dependence on contamination parameters. We present CCOD and PCOD, two interpretable methods that utilize chaining distance as a manifold-aware estimator of the data distribution, identifying outliers through sudden distance increases along geodesic connectivity paths while providing intuitive explanations for detection decisions. Through extensive experiments on both synthetic and real-life data sets, we demonstrate performance comparable to or better than state-of-the-art methods, establishing chain-based theory as a promising solution for addressing complex data scenarios in outlier detection. The contamination-free design and computational efficiency make these methods particularly suitable for intelligent O&M applications where outlier proportions are unknown and large-scale monitoring is required.
Future work will extend our approach to real-time applications by developing online learning capabilities that can adapt to evolving operational conditions and integrate with existing maintenance decision support systems. We also aim to validate our methods across diverse transportation equipment types and develop comprehensive multi-modal sensor fusion frameworks that can provide holistic fault diagnosis for complex mechanical systems in dynamic operational environments.

Author Contributions

Conceptualization, H.D. and Q.-G.W.; methodology, H.D. and Q.-G.W.; validation, H.D., M.L. and S.W.; formal analysis, H.D. and Z.Z.; investigation, M.L. and S.W.; data curation, M.L. and S.W.; writing—original draft preparation, H.D.; writing—review and editing, H.D., M.L., S.W., Q.-G.W. and Z.Z.; visualization, H.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 62373060.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yao, Y.; Zhang, X.; Cui, W. A LOF-IDW based data cleaning method for quality assessment in intelligent compaction of soils. Transp. Geotech. 2023, 42, 101101. [Google Scholar] [CrossRef]
  2. Bałdyga, M.; Barański, K.; Belter, J.; Origlia, R.; Rossi, B. Anomaly detection in railway sensor data environments: State-of-the-art methods and empirical performance evaluation. Sensors 2024, 24, 2633. [Google Scholar] [CrossRef] [PubMed]
  3. Bianchi, G.; Fanelli, C.; Freddi, F.; Giuliani, F.; La Placa, A. Systematic review railway infrastructure monitoring: From classic techniques to predictive maintenance. Adv. Mech. Eng. 2025, 17, 16878132241285631. [Google Scholar] [CrossRef]
  4. Xing, Z.; Liu, Y.; Wang, Q.; Li, J. Multi-sensor signals with parallel attention convolutional neural network for bearing fault diagnosis. AIP Adv. 2022, 12, 075020. [Google Scholar] [CrossRef]
  5. Wang, X.; Mao, D.; Li, X. Bearing fault diagnosis based on vibro-acoustic data fusion and 1D-CNN network. Measurement 2021, 173, 108518. [Google Scholar] [CrossRef]
  6. Li, X.; Wang, Y.; Yao, J.; Zhang, H. Multi-sensor fusion fault diagnosis method of wind turbine bearing based on adaptive convergent viewable neural networks. Reliab. Eng. Syst. Saf. 2024, 245, 109980. [Google Scholar] [CrossRef]
  7. Wan, H.; Gu, X.; Yang, S.; Liu, P. A sound and vibration fusion method for fault diagnosis of rolling bearings under speed-varying conditions. Sensors 2023, 23, 3130. [Google Scholar] [CrossRef]
  8. Pacheco-Chérrez, J.; Fortoul-Diaz, J.A.; Cortés-Santacruz, F.; Aloso-Valerdi, L.M.; Ibarra-Zarate, D.I. Bearing fault detection with vibration and acoustic signals: Comparison among different machine leaning classification methods. Eng. Fail. Anal. 2022, 139, 106515. [Google Scholar] [CrossRef]
  9. Chen, S.; Wang, W.; van Zuylen, H. A comparison of outlier detection algorithms for ITS data. Expert Syst. Appl. 2010, 37, 1169–1178. [Google Scholar] [CrossRef]
  10. Ribeiro, R.P.; Pereira, P.; Gama, J. Sequential anomalies: A study in the railway industry. Mach. Learn. 2016, 105, 127–153. [Google Scholar] [CrossRef]
  11. Ghiasi, R.; Khan, M.A.; Sorrentino, D.; Friswell, M.I. An unsupervised anomaly detection framework for onboard monitoring of railway track geometrical defects using one-class support vector machine. Eng. Appl. Artif. Intell. 2024, 133, 108167. [Google Scholar] [CrossRef]
  12. Wan, T.H.; Tsang, C.W.; Hui, K.; Ho, I.W.H. Anomaly detection of train wheels utilizing short-time Fourier transform and unsupervised learning algorithms. Eng. Appl. Artif. Intell. 2023, 122, 106037. [Google Scholar] [CrossRef]
  13. Phusakulkajorn, W.; Núñez, A.; Wang, H.; Dollevoet, R. Artificial intelligence in railway infrastructure: Current research, challenges, and future opportunities. Intell. Transp. Infrastruct. 2023, 2, liad016. [Google Scholar] [CrossRef]
  14. Shaikh, M.Z.; Jatoi, S.; Baro, E.N.; Ahmed, A.; Memon, T.D. FaultSeg: A Dataset for Train Wheel Defect Detection. Sci. Data 2025, 12, 309. [Google Scholar] [CrossRef]
  15. Liu, H.; Zhang, S.; Wu, Z. Outlier detection using local density and global structure. Pattern Recognit. 2025, 157, 110947. [Google Scholar] [CrossRef]
  16. Omar, M.; Sukthankar, G. Text-Defend: Detecting Adversarial Examples using Local Outlier Factor. In Proceedings of the IEEE International Conference on Semantic Computing, Laguna Hills, CA, USA, 27 February–1 March 2023; pp. 118–122. [Google Scholar]
  17. Ruff, L.; Kauffmann, J.R.; Vandermeulen, R.A.; Montavon, G.; Samek, W.; Kloft, M.; Dietterich, T.G.; Müller, K.R. A unifying review of deep and shallow anomaly detection. Proc. IEEE 2021, 109, 756–795. [Google Scholar] [CrossRef]
  18. Liu, B.; Tan, P.N.; Zhou, J. Unsupervised anomaly detection by robust density estimation. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 22 February–1 March 2022; Volume 36, pp. 4101–4108. [Google Scholar]
  19. Romano, M.; Faiella, G.; Bifulco, P.; Cesarelli, M. Outliers detection and processing in CTG monitoring. In XIV Mediterranean Conference on Medical and Biological Engineering and Computing 2016; Springer: Cham, Switzerland, 2014; pp. 651–654. [Google Scholar]
  20. Perini, L.; Bürkner, P.C.; Klami, A. Estimating the contamination factor’s distribution in unsupervised anomaly detection. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 27668–27679. [Google Scholar]
  21. Pearson, R.K. Outliers in process modeling and identification. IEEE Trans. Control Syst. Technol. 2002, 10, 55–63. [Google Scholar] [CrossRef]
  22. Dong, H.; Wang, Q.G.; Ding, W. Chain-Based Outlier Detection for Complex Data Scenarios. In Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 15–18 December 2023; pp. 504–509. [Google Scholar]
  23. Yang, J.; Zhou, K.; Li, Y.; Liu, Z. Generalized out-of-distribution detection: A survey. Int. J. Comput. Vis. 2024, 132, 5635–5662. [Google Scholar] [CrossRef]
  24. Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 16–18 May 2000; pp. 93–104. [Google Scholar]
  25. Tang, J.; Chen, Z.; Fu, A.W.C.; Cheung, D.W. Enhancing effectiveness of outlier detections for low density patterns. In Advances in Knowledge Discovery and Data Mining; Chen, M.S., Yu, P.S., Liu, B., Eds.; Springer: Berlin/Heidelberg, Germany, 2002; pp. 535–548. [Google Scholar]
  26. Huang, J.; Zhu, Q.; Yang, L.; Feng, J. A non-parameter outlier detection algorithm based on natural neighbor. Knowl.-Based Syst. 2016, 92, 71–77. [Google Scholar] [CrossRef]
  27. Jin, W.; Tung, A.K.H.; Han, J.; Wang, W. Ranking outliers using symmetric neighborhood relationship. In Advances in Knowledge Discovery and Data Mining; Ng, W.K., Kitsuregawa, M., Li, J., Chang, K., Eds.; Springer: Berlin/Heidelberg, Germany, 2006; pp. 577–593. [Google Scholar]
  28. He, Z.; Xu, X.; Deng, S. Discovering cluster-based local outliers. Pattern Recognit. Lett. 2003, 24, 1641–1650. [Google Scholar] [CrossRef]
  29. Amer, M.; Goldstein, M. Nearest-neighbor and clustering based anomaly detection algorithms for RapidMiner. In Proceedings of the 3rd RapidMiner Community Meeting and Conference (RCOMM’12), Shenzhen, China, 9–11 May 2012; pp. 1–12. [Google Scholar]
  30. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; pp. 226–231. [Google Scholar]
  31. Campello, R.J.G.B.; Moulavi, D.; Zimek, A.; Sander, J. Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection. ACM Trans. Knowl. Discov. Data 2015, 10, 1–51. [Google Scholar] [CrossRef]
  32. Du, H.; Zhao, S.; Zhang, D.; Wu, J. Novel clustering-based approach for local outlier detection. In Proceedings of the 2016 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), San Francisco, CA, USA, 10–14 April 2016; pp. 802–811. [Google Scholar]
  33. Yamanishi, K.; Takeuchi, J.I.; Williams, G.; Milne, P. On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 26–29 August 2001; pp. 320–324. [Google Scholar]
  34. Kriegel, H.P.; Kröger, P.; Schubert, E.; Zimek, A. LoOP: Local outlier probabilities. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, Hong Kong, China, 2–6 November 2009; pp. 1649–1652. [Google Scholar]
  35. Papadimitriou, S.; Kitagawa, H.; Gibbons, P.B.; Faloutsos, C. LOCI: Fast outlier detection using the local correlation integral. In Proceedings of the 19th International Conference on Data Engineering, Bangalore, India, 5–8 March 2003; pp. 315–326. [Google Scholar]
  36. Tang, B.; He, H. A local density-based approach for outlier detection. Neurocomputing 2017, 241, 171–180. [Google Scholar] [CrossRef]
  37. Qin, X.; Cao, L.; Rundensteiner, E.A.; Madden, S. Scalable kernel density estimation-based local outlier detection over large data streams. In Proceedings of the 22nd International Conference on Extending Database Technology (EDBT), Lisbon, Portugal, 26–29 March 2019. [Google Scholar]
  38. Li, Z.; Zhao, Y.; Botta, N.; Ionescu, C.; Hu, X. COPOD: Copula-Based Outlier Detection. In Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Sorrento, Italy, 17–20 November 2020; pp. 1118–1123. [Google Scholar]
  39. Li, Z.; Zhao, Y.; Hu, X.; Botta, N.; Ionescu, C.; Chen, H.G. ECOD: Unsupervised Outlier Detection Using Empirical Cumulative Distribution Functions. IEEE Trans. Knowl. Data Eng. 2022, 14, 71–77. [Google Scholar] [CrossRef]
  40. Papers with Code. ODDS (Outlier Detection Data Sets). Available online: https://paperswithcode.com/dataset/odds (accessed on 3 June 2025).
  41. Campos, G.O.; Zimek, A.; Sander, J.; Campello, R.J.; Micenková, B.; Schubert, E.; Assent, I.; Houle, M.E. On the evaluation of unsupervised outlier detection: Measures, data sets, and an empirical study. Data Min. Knowl. Discov. 2016, 30, 891–927. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
