A Double-Density Clustering Method Based on “Nearest to First in” Strategy

Abstract: The existing density clustering algorithms have high error rates when processing data sets with mixed-density clusters. To overcome the shortcomings of these algorithms, a double-density clustering method based on a Nearest-to-First-in strategy, DDNFC, is proposed, which calculates two densities for each point by using its reverse k nearest neighborhood and its local spatial position deviation, respectively. Points whose densities are both greater than the respective average densities over all points are core objects. By searching the strongly connected subgraphs of the graph constructed over the core objects, the data set is clustered initially. Each non-core object is then classified to its nearest cluster by a strategy dubbed 'Nearest-to-First-in': the distance of each unclassified point to its nearest cluster is calculated first; only the points with the minimum distance are placed into their nearest cluster; this procedure is repeated until all unclassified points are clustered or the minimum distance is infinite. To test the proposed method, experiments on several artificial and real-world data sets are carried out. The results show that DDNFC is superior to state-of-the-art methods such as DBSCAN, DPC, and RNN-DBSCAN.

DBSCAN is widely used and needs two parameters: eps and minpts. The density of each point is estimated by counting the points inside a hypersphere of radius eps around it, and each point is then recognized as a core or non-core object according to whether its density is greater than minpts. An obvious weakness of DBSCAN is that selecting two appropriate parameters for different data sets is very difficult. Rodriguez and Laio proposed a new clustering approach by fast search and find of density peaks (DPC) [18]. It is simpler and more effective than DBSCAN because only one parameter is needed and the cluster centers can be selected visually by projecting points into the ρ–δ decision graph. But DPC cannot correctly partition a data set with adjacent clusters of different densities. To improve the performance of DPC, many methods have tried to update the density calculation, the mode of cluster-center selection, and so on. Vadapalli et al. introduced the reverse k nearest neighbors (RNN) model into density-based clustering (RECORD) [19] and adopted the same way as DBSCAN to define the reachability of points, but required only one parameter, k. Though RECORD often confuses points lying on the borders of clusters with noise, because of its ability to correctly detect the cores of clusters with different densities, several improved methods based on it have been proposed.
The density-based approaches mentioned above use only one way to measure point densities. So, though they perform well on data sets whose clusters have arbitrary shapes but are distributed separately, they cannot separate low-density clusters from adjacent high-density clusters in mixed-density data sets. In this paper, a density clustering method based on double density and a Nearest-to-First-in strategy, dubbed DDNFC, is proposed. Compared with the above state-of-the-art density clustering algorithms, this method has the following advantages: (1) it depicts the distribution of data in a data set more accurately by using two densities for each point, the reverse k-nearest-neighbor count and the local offset of position; (2) according to the two densities and their respective thresholds, a data set is initially divided into several core areas of high-density points and a low-density boundary surrounding them, so low-density clusters are not easily over-segmented; (3) by adopting the Nearest-to-First-in strategy, only the unclassified boundary points closest to the existing clusters are classified in each iteration, which reduces the sensitivity of the clustering to the input parameter and the impact of the storage order of data points on the classification results.
The remainder of this paper is organized as follows. In Section 2, we redefine the core object, first introduced in DBSCAN, and give the procedure for clustering the core objects initially; the complexity of the initial clustering procedure is also analyzed there. Section 3 describes the Nearest-to-First-in strategy proposed by this paper in detail and analyzes its complexity. In Section 4, experimental results on synthetic and real data sets are presented and discussed, and the choice of an appropriate k is examined. Finally, conclusions are drawn in Section 5.

Core Objects and Initial Clustering
In this section, we give two densities to represent the distribution of data, introduce some notions defined in DBSCAN and RECORD and improve them, and describe the initial clustering of core objects.

Abbreviations
• X: a data set of N d-dimensional points;
• | |: the counting function returning the size of a set;
• x, y, z: points in the data set X;
• d(x, y): a distance function returning the distance between points x and y (Euclidean in this paper);
• k: the input parameter giving the number of nearest neighbors of a point;
• N_k(x): the set of k nearest neighbors of point x;
• R_k(x): the set of reverse k nearest neighbors of point x, defined as R_k(x) := {y | y ∈ X ∧ x ∈ N_k(y)};
• c: the number of clusters in data set X;
• C_i, C_j: the i-th and j-th clusters, 1 ≤ i, j ≤ c, i ≠ j, C_i ∩ C_j = ∅;
• L: the label array, in which the value of each element indicates the cluster the corresponding point belongs to.
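To make the notation concrete, the sets N_k and R_k can be computed directly from a pairwise Euclidean distance matrix. The following Python sketch is our own illustration (the function name `knn_and_rknn` is hypothetical, not from the paper) and assumes tie-free distances:

```python
import numpy as np

def knn_and_rknn(X, k):
    """Compute the k-nearest-neighbor sets N_k and the reverse sets R_k.

    X is an (N, d) array; distances are Euclidean, as in the paper.
    Returns two lists of index sets: Nk[i] = N_k(i) and Rk[i] = R_k(i).
    """
    N = len(X)
    # Pairwise Euclidean distances (O(N^2) space, matching the paper's analysis).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)          # a point is not its own neighbor
    order = np.argsort(dist, axis=1)
    Nk = [set(order[i, :k]) for i in range(N)]
    Rk = [set() for _ in range(N)]
    for i in range(N):
        for j in Nk[i]:
            Rk[j].add(i)                    # i has j among its k nearest, so i is in R_k(j)
    return Nk, Rk
```

Note that |N_k(x)| = k always, while |R_k(x)| varies from point to point, which is exactly why it can serve as a density estimate.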

Density of Each Point
In some cases, a single density cannot portray the real distribution of data. For example, as shown in Figure 1, Compound [20] is a typical data set composed of several clusters with different shapes and densities, where clusters are adjacent to or surround each other. Dividing it into the correct clusters is a difficult task. In this paper, we use two densities to detect the true core areas of clusters. For each point in a data set, apart from using the reverse k nearest neighbors model to compute a density, we introduce the offset of the local position of a point to compute a second density.

Definition 1.
Offset of local position. The offset of the local position of point x is the distance between x and the geometric center x̄ of all its k nearest neighbors, defined as offset(x) = d(x, x̄), where x̄ = (1/k) Σ_{y ∈ N_k(x)} y. As displayed in Figure 1, the original position of each point is a red ball, the semi-transparent green cross connected with it is the geometric center of its k nearest neighbors, and the length of the arrowed line connecting a red ball to its green cross is the offset of local position. Figure 1a is the entire view of the points of Compound and their offsets, and Figure 1b-d are partial views. From these figures we can see that the offset of a point is small when it lies in the center of a local region but large on a boundary, and the more sparsely the data are distributed locally, the larger the offset. So it is reasonable to estimate the density of a point by the offset of local position. But it also has defects: as shown in Figure 1b, the two boundary points circled by a blue ellipse have very small offset values because they lie at the local center of their neighbors. We therefore use two density computations to estimate the true distribution of a point in the data set: (1) the density computed from the offset of local position, and (2) the density computed from the reverse k nearest neighbors model.
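Definition 1 translates into code directly: the offset is the distance from a point to the mean of its k nearest neighbors. A minimal sketch (ours, with `local_position_offset` a hypothetical helper name; the exact density formulas (2) and (3) built on top of this offset are not reproduced here):

```python
import numpy as np

def local_position_offset(X, Nk):
    """Offset of local position (Definition 1): the Euclidean distance from
    each point to the geometric center of its k nearest neighbors."""
    offsets = np.empty(len(X))
    for i, neigh in enumerate(Nk):
        center = X[list(neigh)].mean(axis=0)   # geometric center of N_k(i)
        offsets[i] = np.linalg.norm(X[i] - center)
    return offsets
```

For a point in the middle of a locally symmetric neighborhood, the neighbor mean coincides with the point and the offset is near zero; for a boundary point, all neighbors lie to one side and the offset grows.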

Core Object
Given the two densities of all points, two corresponding thresholds are set, and each point can then be classified into one of two types: core or non-core. Definition 2. Core object. A point x in data set X is a core object if both of its densities exceed the respective thresholds; points that do not satisfy this condition are non-core objects. After extracting the core objects from a data set, we improve two relations first defined in DBSCAN, density direct-reachability and density reachability. With these relations we construct a directed graph and find its maximal connected components, which contain the cores of all clusters. Definition 3. Density directly-reachable. A point x is density directly-reachable from another point y if the following two conditions hold: (1) x and y are both core objects; (2) x ∈ R_k(y).
That x is one of the reverse k nearest neighbors of y does not guarantee that y is also a reverse k nearest neighbor of x, so y is not necessarily density directly-reachable from x. That is, density direct-reachability is not symmetric.

Definition 4. Density reachable.
A point x is density reachable from another point y if there is a sequence of points x_1, ..., x_m with x = x_1 and y = x_m satisfying the following conditions: (1) for all i, 1 ≤ i ≤ m, x_i is a core object; (2) for all i, 1 ≤ i < m, x_i is density directly-reachable from x_{i+1}. Density reachability is not symmetric because density direct-reachability is not.
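Definition 2's core-object test can be sketched in a few lines. Following the abstract, we assume the two thresholds are the average values of the respective densities over all points; this is our reading, since the explicit condition is missing from this copy:

```python
import numpy as np

def find_core_objects(rho_off, rho_R):
    """Core objects per Definition 2 (as stated in the abstract): a point is a
    core object when BOTH of its densities exceed the respective averages
    over all points, which serve as the two thresholds."""
    thr_off = rho_off.mean()   # threshold on the offset-based density
    thr_R = rho_R.mean()       # threshold on the reverse-kNN density
    return (rho_off > thr_off) & (rho_R > thr_R)
```

Requiring both densities to clear their thresholds is what keeps sparse-but-regular cluster interiors in the core while excluding boundary points whose offset density is accidentally high.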

Initial Clustering
In Algorithm 1, instead of constructing a directed graph of core objects explicitly, we start from a randomly chosen unclassified core object and iteratively visit all unclassified core objects that are density directly-reachable or density reachable from the points classified before.

Algorithm 1: Initial clustering.
Input: data set X, k, N k , R k . Output: label array L.
Step 1. Initialize each element of L to 0 and set Cid to 0;
Step 2. for each x_i in X:
Step 3.     Calculate ρ_off(x_i) and ρ_R(x_i) by (2) and (3), respectively;
Step 5. Calculate the threshold ρ̄_off by (4);
Step 6. for each x_i in X:
Step 7.     if L(x_i) = −1
Step 9. Initialize an empty queue Q;
Step 10. for each x_i in X:
Step 11.
    while Q is not empty:
Step 14.
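Several steps of Algorithm 1 are lost in this copy. Based on the surviving steps and the complexity analysis, its queue-based expansion phase can be reconstructed as the following hedged Python sketch (ours, not the authors' implementation; it assumes the core-object flags and reverse neighbor sets are precomputed):

```python
from collections import deque

def initial_clustering(is_core, Rk):
    """Sketch of Algorithm 1's expansion phase: starting from an unclassified
    core object, visit all core objects connected to it through density
    direct-reachability (Definition 3), using a queue instead of an explicit
    graph.  Returns a label array L (0 = still-unclassified non-core point)."""
    N = len(is_core)
    L = [0] * N
    cid = 0
    for seed in range(N):
        if not is_core[seed] or L[seed] != 0:
            continue                       # take only unclassified core objects as seeds
        cid += 1
        L[seed] = cid
        Q = deque([seed])
        while Q:
            y = Q.popleft()
            # x is density directly-reachable from y when both are core
            # objects and x is in R_k(y)
            for x in Rk[y]:
                if is_core[x] and L[x] == 0:
                    L[x] = cid
                    Q.append(x)
    return L
```

The outer loop supplies one seed per core region, so the labels 1..c correspond to the c initial clusters; non-core points keep label 0 for the Nearest-to-First-in phase.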
In Algorithm 1, 2kN spaces are needed to store the elements of N_k and R_k, and L, ρ_off and ρ_R need N memory spaces each. The storage for the queue Q is dynamic but never exceeds N. Thus the space complexity of the algorithm is O(kN). The loop from steps 2 to 4 repeats kN times to calculate the two densities of all points. Computing ρ̄_off in step 5 and finding all core objects in steps 6 to 8 each take N operations, according to the corresponding definitions. Steps 10 to 16 form a three-layer nested loop. The innermost loop, steps 15-16, repeats more than k but fewer than 2k times on average. The iteration count of the second loop depends on how many core objects are density reachable from the seed point taken in step 11, but is clearly no more than N. The outer loop calls the inner loops c times if the data set has c core regions, so the time complexity of the nested loop is O(ckN); the larger c is, the smaller the core regions are, and the less time the second loop needs. Finding the k-nearest-neighbor set N_k and the reverse k-nearest-neighbor set R_k of each point before calling the algorithm is a common step for all related density-based approaches and not specific to our method; we directly estimate its time complexity as O(N²). Therefore, the total time complexity of the initial clustering procedure is O(N²). Figure 2 shows the result of initial clustering on the data set Compound with k = 7: Algorithm 1 obtains 6 initial clusters, and the blue ball in the bottom-right corner is the core object of the sparse cluster on the right.

Nearest-to-First-in Strategy
The initial clustering procedure groups all core objects, while a large number of non-core objects remain unclustered. Many density-based clustering methods, such as RECORD, IS-DBSCAN, ISB-DBSCAN and RNN-DBSCAN, adopt different approaches to classify each non-core object into one of the clusters or to treat it as noise. But these methods share a weakness: the clustering result suffers from the storage order of the points. The Nearest-to-First-in strategy introduced in this section solves this problem. We illustrate its basic idea, give the implementation steps of the algorithm, and analyze its time and space complexity.

Basic Idea of the Strategy
The Nearest-to-First-in strategy is based on two simple ideas: (1) the closer two points are, the higher the probability that they belong to the same category; (2) by the higher-probability-event-first principle, the cost of clustering is minimized. To illustrate the advantages of our strategy, we first introduce several definitions.

Definition 5.
Distance from an unclassified point to a cluster. The distance between an unclassified point x and a cluster C_i, dubbed d(x, C_i), is defined as follows. To compute it, we first check whether the intersection of the reverse k nearest neighbor set R_k(x) and C_i is empty. If the intersection is not empty, in which case we say x is adjacent to C_i, the distance between x and C_i is the distance from x to the point of the intersection nearest to x. Otherwise, the distance from x to C_i is infinite.
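Definition 5 can be sketched directly: the distance is taken over the intersection of R_k(x) with the cluster, and is infinite when x is not adjacent to the cluster. A minimal illustration (ours; `dist` is assumed to be a precomputed pairwise distance matrix, and the function name is hypothetical):

```python
def point_to_cluster_distance(x, Ci, Rk_x, dist):
    """Definition 5: d(x, C_i) is the distance from x to the nearest point of
    the intersection R_k(x) ∩ C_i, or infinity when the intersection is empty
    (x is then not adjacent to C_i)."""
    shared = Rk_x & Ci                       # reverse neighbors of x inside C_i
    if not shared:
        return float("inf")                  # x is not adjacent to C_i
    return min(dist[x][y] for y in shared)
```

Restricting the candidates to R_k(x) rather than all of C_i is what keeps the later per-round scans below kN operations.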
Definition 6. Nearest cluster. Given an unclassified point x, its nearest cluster C_N(x) is the cluster at minimum distance from x. In particular, if x has the same distance to several clusters, or even if x is not adjacent to any cluster, C_N(x) is set to the first cluster in the ordered list of distances from x.

Definition 7.
Smallest distance to clusters. The smallest distance to clusters, d_min, is the smallest among the distances from all unclassified points to their respective nearest clusters, where L(x) denotes the cluster label of x, whose value is 0 if x is unclassified. The value of d_min is ∞ if the distances from all unclassified points to their nearest clusters are ∞.
Definition 8. Nearest neighbors of clusters. Given the smallest distance d_min, the nearest neighbors of clusters, NN, is defined as the set of all unclassified points whose distance to their nearest cluster equals d_min. NN is empty if d_min is infinite.

Main Procedures of Nearest-to-First-in Strategy
The Nearest-to-First-in strategy is a greedy strategy. As shown in Algorithm 2, the unclassified points are clustered iteratively. In each iteration, the nearest neighbors of clusters NN are calculated using (5)-(8), and the points in NN are allocated to their corresponding nearest clusters. This process repeats until NN becomes empty; at that point, either all points have been classified, or the remaining unclassified points are not adjacent to any existing cluster and are regarded as noise.
It should be noted that, according to Definition 6, the nearest cluster of an unclassified point x may be affected by the storage order of the reverse k nearest neighbors of x, but Definitions 7 and 8 minimize this impact.
Algorithm 2: Nearest-to-First-in clustering. Input: X, k, R_k, L. Output: label array L.
Step 1. Put all unclassified points (L(x) = 0) into an array unL and create two arrays D and C_N, each of the same size as unL; set each element of C_N to −1 and each element of D to ∞;
Step 2. while unL is not empty:
    for each x in unL:
        for each y in R_k(x):
Step 6.             if L(y) > 0 and dist(x, y) < D(x):
    if d_min = ∞: break;
Step 10.
    for each x in unL:
        Remove x from unL, D, C_N;
Step 14. Return L.
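Since several steps of Algorithm 2 are garbled in this copy, the following Python sketch reconstructs the round-based procedure from Definitions 5-8 and the surviving steps. It is a hedged reconstruction (ours, not the authors' code), recomputing D and C_N from scratch each round rather than using the incremental update discussed below:

```python
def nearest_to_first_in(L, Rk, dist):
    """Sketch of Algorithm 2.  Each round computes every unclassified point's
    nearest cluster (Definitions 5-6) through its reverse k nearest neighbors,
    then admits only the points at the globally smallest distance
    (Definitions 7-8).  Stops when all points are classified or the smallest
    distance is infinite; remaining points are left as noise (label 0)."""
    INF = float("inf")
    unL = [x for x, lab in enumerate(L) if lab == 0]
    while unL:
        D, CN = {}, {}
        for x in unL:
            D[x], CN[x] = INF, -1
            for y in Rk[x]:
                # only already-classified reverse neighbors define d(x, C_i)
                if L[y] > 0 and dist[x][y] < D[x]:
                    D[x], CN[x] = dist[x][y], L[y]
        d_min = min(D.values())
        if d_min == INF:          # no unclassified point is adjacent to a cluster
            break
        # Nearest neighbors of clusters: only points at distance d_min enter.
        for x in unL:
            if D[x] == d_min:
                L[x] = CN[x]
        unL = [x for x in unL if L[x] == 0]
    return L
```

Because admission is gated by the global d_min rather than by scan order, the result no longer depends on how the points are stored, which is the point of the strategy.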
In the classification process, the strategy selects points by a global priority each time but classifies them according to a locality principle, so it obtains better results on unclassified points than other density-based clustering methods.
In the Nearest-to-First-in procedure, the main memory required is: 1. the distance matrix, with N² real-valued entries; 2. the reverse k nearest neighbor sets, of size kN; 3. a label array of N entries indicating the cluster label of each element; 4. the arrays of unclassified points unL, D and C_N, each needing fewer than N units. Therefore, the space complexity of this algorithm is O(N²).
When running the strategy, the number of unclassified points is the key factor determining the time complexity, and it is of course less than N. The most time-consuming part is the three-layer nested loop from steps 2 to 8, which calculates the nearest cluster C_N and the corresponding distance D of every unclassified point according to (6) and (5). Since an unclassified point x is a non-core object, the size of R_k(x) is less than k, so the two inner loops, steps 5 to 7, iterate fewer than kN times. The other inner loop, steps 10 to 13, classifies the unlabeled points nearest to the existing clusters, and its iteration count also depends on the number of unclassified points.
Step 13 removes all points newly labelled in an iteration, so the next round loops fewer times, while step 9 prevents the algorithm from looping endlessly when some points cannot be classified. The total time complexity is thus kN². In fact, after a round of iteration it is not necessary to recalculate C_N and D for all remaining unclassified points; we only need to update the unclassified points among the k nearest neighbors of the points classified in that round, which takes time of order k². In the worst case, only one point is classified per iteration over N iterations, so the worst-case time complexity of this algorithm is O(k²N).

Experiments and Results Analysis
To verify the effectiveness of DDNFC, we compare it with six other algorithms, DPC, RNN-DBSCAN [21], DBSCAN, KMeans [2], Ward-link and Meanshift [4], on artificial and real data sets. Using three cluster validity indexes [1,22,23], F-measure with α = 1 (F1), adjusted mutual information (AMI) and adjusted Rand index (ARI), we evaluate and discuss the performance of these seven methods. All algorithms are implemented in Python (version 3.7). DPC and RNN-DBSCAN are coded according to the original articles. The other five methods are the built-in algorithms or models provided by the scikit-learn library [24]. The parameters of the algorithms are set as follows: (1) Meanshift uses the default parameters of the scikit-learn model; (2) the input parameters of KMeans and Ward-link are set to the true number of clusters, and KMeans adopts KMeans++ to initialize the cluster centers; (3) to get the best result of DBSCAN, we use grid search to find the optimal values of the two parameters eps and minpts, with eps varying from 0.1 to 0.5 in steps of 0.1 and minpts from 5 to 30 in steps of 5; (4) the cutoff distance required by DPC is calculated from an input percentage, whose optimal value is searched from 1.0 to 5.0 percent in steps of 0.5; to simplify the implementation of DPC, we directly select the c points with the top values of gamma, defined as gamma = ρ * δ in [18], as the cluster centers, and the density of DPC is calculated with a Gaussian kernel; (5) the optimal value of the parameter k of our method and of RNN-DBSCAN is searched in the range [2, √N). In addition, spectral clustering [25] is a famous algorithm that has also been widely used.
Since it uses k nearest neighbors to construct an adjacency matrix, just as our method does, we ran spectral clustering on the data sets Spiral, Swiss Roll, Compound, Pathbased, and Aggregation. Table 1 displays the basic information of the artificial data sets used in this paper. These data sets are all two-dimensional and composed of clusters with different densities, shapes and orientations [20,26]. Table 2 shows the parameter settings (dubbed par) and the test results of the seven algorithms on the artificial data sets, including the number of clusters each method obtains versus the true number (dubbed c/C) and the values of F1, AMI and ARI.
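The grid search used to tune DBSCAN above can be sketched with scikit-learn's built-in implementation and AMI as the selection criterion. This is our own illustration of the protocol, not the authors' script; the function name `grid_search_dbscan` is hypothetical:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_mutual_info_score

def grid_search_dbscan(X, y_true, eps_grid, minpts_grid):
    """Try every (eps, minpts) pair on the grids described in the paper
    (eps 0.1-0.5 step 0.1, minpts 5-30 step 5) and keep the setting with the
    best adjusted mutual information against the reference labels."""
    best = (-1.0, None, None)
    for eps in eps_grid:
        for minpts in minpts_grid:
            labels = DBSCAN(eps=eps, min_samples=minpts).fit_predict(X)
            ami = adjusted_mutual_info_score(y_true, labels)
            if ami > best[0]:
                best = (ami, eps, minpts)
    return best   # (best AMI, best eps, best minpts)
```

The same loop structure applies to tuning k for DDNFC and RNN-DBSCAN over [2, √N), with the clustering call swapped out.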

Experiments on Artificial Data Sets and Results Analysis
Meanwhile, we display some results of the seven algorithms on these data sets. Figures 3 and 4 show the results on Compound and Pathbased; other results are shown in Appendix A, Figures A1-A5. Different clusters are distinguished by marks of different shapes and colors, and small black dots represent unrecognized data, i.e., noise points. Compound is composed of six classes with different densities. In the upper-left corner, two classes are adjacent to each other and both subject to a Gaussian distribution. On the right side, an irregularly shaped class is surrounded by a sparsely distributed class, and the two classes overlap spatially. In the bottom-left corner, a small disk-like class is encompassed by a ring-shaped class. Distinguishing all the classes in Compound correctly is a big challenge because they have different densities, various shapes and complex spatial relationships. As shown in Figure 3, our method classified all points correctly except one point on the border between the two classes in the upper-left corner. RNN-DBSCAN also found six clusters, but nearly 75 percent of the points of the sparse class on the right side were misclassified into the dense cluster. DBSCAN detected only five clusters, and a large number of points were assigned to noise as the sixth group. DPC distinguished the two Gaussian classes and the disk class almost correctly, but partitioned the ring class into two parts and mingled the sparse class with the dense one. The results of Ward-link and KMeans++ show that these two methods are unable to cluster data sets of arbitrary shape. Meanshift cannot partition two clusters when one contains another, so it recognized only four clusters in Compound. Pathbased has 3 classes: one of 110 points forms an unclosed thin ring, and the other two, containing 97 and 93 points respectively, are enclosed in the ring.
It is easy to misclassify the points that belong to the ring class but lie near the other classes into their adjacent class. As shown in Figure 4, DDNFC and RNN-DBSCAN got nearly correct results: a few points in the space between the ring and the left class were misclassified, and one point was left unrecognized by RNN-DBSCAN. DBSCAN found two clusters but treated points of the ring class as noise. The other four algorithms divided the ring cluster into three parts: the top of the arc was regarded as one class, and the other two parts were clustered into the other two clusters. Flame has two classes with similar densities but different shapes, and they are very close to each other. As shown in Figure A2 and Table 2, the density-based algorithms found the two clusters rather accurately, except for some classification errors in the adjacent parts of the two clusters. DDNFC got a completely correct result; DBSCAN treated two outliers in the upper-left corner as noise. On this data set, the density-based algorithms are clearly superior to the other three clustering algorithms.
The characteristics of t8.8k and t7.10k are similar to Compound, in which the clusters have different shapes and some are embraced or semi-embraced by others. But the two data sets are seriously contaminated by noise. On these two data sets, DDNFC and RNN-DBSCAN got more accurate clustering results than the other methods.
The seven clusters of different sizes in Aggregation are independent of each other, except for two pairs that are slightly connected. The four density-based methods achieved better results than the others. DDNFC misclassified three points in the adjacent area between the two right clusters; RNN-DBSCAN misclassified one point in the adjacent area between the two connected left clusters; DBSCAN regarded one edge point of the upper-left cluster as noise. DPC performed best on this data set.
In addition to the above 7 data sets, Table 2 also lists the test results of the 7 algorithms on the other 5 data sets.
The data set t5.8k contains 7 clusters, 6 of which form the word "GEORGE", while a bar-shaped cluster runs through them; the data set also contains noise. As shown in Table 2, the results of all methods are unsatisfactory because the bar-shaped cluster is hard to separate from the data set.
Clusters in the data sets Unbalance, R15, S1 and A3 are in general independent of each other, though some may be adjacent, and they have different densities and arbitrary shapes. On these data sets, DDNFC performed as well as the other algorithms.
The spectral clustering algorithm in the scikit-learn library [24] needs two parameters: the number of clusters and the number of nearest neighbors.
On Spiral, the k of DDNFC was set to 4, the two parameters of spectral clustering were 3 and 4 respectively, and the results shown in Figure 5 are entirely correct.
Swiss Roll has 1500 points. The parameter settings of the two methods were 13 for DDNFC and (6, 13) for spectral clustering. From Figure 6 we can see that the widths of the clusters obtained by spectral clustering are more uniform than those of DDNFC. On the remaining three data sets, Compound, Pathbased, and Aggregation, the parameter settings of spectral clustering were (6, 5), (3, 6), and (7, 9). As shown in Figure A6, the algorithm cannot classify the clusters in these data sets correctly. Table 3 lists 12 real-world data sets, widely used for testing clustering and classification methods and downloaded from the UCI machine learning repository [27]. We preprocessed the data sets where needed:

Experiments on Real-World Data Sets and Results Analysis
• Data that had null values, uncertain values, or duplicates were removed.
• Most of the data sets have a class attribute; Table 3 gives only the number of non-class attributes.
• We kept only the third to ninth features of the data in Echocardiogram because some of its data have missing values.
As shown in Table 3, Ionosphere, SPECT-train and Sona are sparse data sets because of their high ratio of dimensions to number of instances. Ionosphere has 351 radar records; each record is composed of 17 pulse numbers described by 2 attributes each, corresponding to complex values, but in our tests we treated the 34 attributes as independent. SPECT-train is a subset of SPECT, which has 22 binary attributes and 2 groups of 40 points each. Sona has 208 records with 60 real features. From Table 4 we can see that DDNFC outperformed the other six algorithms on all benchmarks on Ionosphere, by a large margin, and also got the best results on two benchmarks each on SPECT-train and Sona. On these three sparse data sets, DPC was better than the other 5 methods, Meanshift could not distinguish the data, KMeans++ performed well on SPECT-train, and RNN-DBSCAN got the highest F1 on Sona.
Page-block is a data set about classifying the blocks of the page layout of a document. Each record has 4 real-valued and 6 integer-valued features, and there are 5 classes with 4913, 329, 28, 88 and 115 records, respectively. Haberman contains 306 cases on the survival status of patients who had undergone breast cancer surgery; the 3 attributes of each record are integers representing the patient's age at the time of operation, the year of operation, and the number of positive axillary nodes detected, and the set is divided into 2 groups of 225 and 81 instances, respectively. Wilt-train consists of two groups, one with 4265 points and the other with only 74. These three data sets are clearly unbalanced. The test results on them show that the density-based clustering methods were better than the others; DDNFC and RNN-DBSCAN got the best results. The two methods performed similarly because both use the reverse k nearest neighbors model to determine core objects, and the Nearest-to-First-in strategy of our method made little difference from RNN-DBSCAN on these unbalanced data sets. DBSCAN performed badly.
The attributes of the data in Breast-cancer-Wisconsin are of integer category type. The original data set has 458 benign and 241 malignant cases; after deleting the cases with missing values, 444 benign and 239 malignant cases remain. Each record of Chess has 36 text-category attributes, each attribute taking its value from one of four small groups of labels (f, t; g, l; b, n, w; n, t). To calculate the distance between two records, we replaced the text labels with integers such as 0, 1, 2. The attributes of Breast-cancer-Wisconsin, Chess and Pendigits-train are categories or integers. On Breast-cancer-Wisconsin, Ward-link, KMeans++ and DDNFC got much higher benchmark values than the other methods, and DDNFC performed best among the density-based methods. On the other two data sets, the performances of DDNFC, Ward-link, DPC and RNN-DBSCAN were close. Table 4 shows the parameter settings (dubbed par) and the test results of the seven algorithms on the real-world data sets, including the number of clusters each method obtains versus the true number (dubbed c/C) and the values of F1, AMI and ARI.
Contraceptive-Method-Choice is a subset of the 1987 National Indonesia Contraceptive Prevalence Survey and has attributes of several types, including integer, category and binary. The attributes of Echocardiogram are also composed of different numeric types with different value scales. On Echocardiogram, all methods except DBSCAN got similar results; KMeans++ was the best overall, while DDNFC was the best among the density-based methods. On Contraceptive-Method-Choice, only four methods got the correct number of clusters, and DDNFC achieved the best F1.
Segmentation-test is a subset of Segmentation, with 19 real attributes and 7 groups of 300 points each. It is an ordinary data set. RNN-DBSCAN and DBSCAN did not identify all 7 clusters; DDNFC outperformed all the other methods.

Conclusions
In this paper, to deal with data sets with mixed-density clusters, we proposed a density clustering method, DDNFC, based on double density and the Nearest-to-First-in strategy. DDNFC uses two density calculations to estimate the data distribution in a data set. By thresholding the data set with the two densities, DDNFC initially partitions the data into high-density core areas and a low-density boundary area more accurately. In the subsequent classification of the low-density points, the sensitivity of the clustering to the input parameter and to the data storage order is reduced by applying the Nearest-to-First-in strategy. Comparisons of the proposed algorithm with other classical algorithms on artificial and real data sets show that DDNFC is better than the other six algorithms overall.

Conflicts of Interest:
The authors declare no conflict of interest.