Algorithms
  • Article
  • Open Access

25 November 2020

Unsupervised Clustering of Neighborhood Associations and Image Segmentation Applications

1 Chengdu Computer Application Institute, Chinese Academy of Sciences, Chengdu 610041, China
2 University of Chinese Academy of Sciences, Beijing 100864, China
3 Chengdu Customs of China, Chengdu 610041, China
* Author to whom correspondence should be addressed.
This article belongs to the Section Evolutionary Algorithms and Machine Learning

Abstract

Irregular shape clustering is always a difficult problem in cluster analysis. In this paper, by analyzing the advantages and disadvantages of existing clustering algorithms, we propose a new neighborhood density correlation clustering (NDCC) algorithm for quickly discovering clusters of arbitrary shape. Because the density of the central region of any cluster in a sample dataset is greater than that of the edge region, the data points can be divided into core, edge, and noise points, and the density correlation of the core points within their neighborhoods can then be used to form clusters. Furthermore, by constructing an objective function and optimizing its parameters automatically, a locally optimal result that is close to the globally optimal solution can be obtained. The algorithm avoids the clustering errors caused by iso-density points between clusters. We compare the algorithm with five other clustering algorithms and verify it on two common remote sensing image datasets. The results show that it clusters the same ground objects in remote sensing images into one class and distinguishes different ground objects. NDCC is robust to irregularly scattered datasets and can solve the clustering problem for remote sensing images.

1. Introduction

Cluster analysis is the most commonly used static data analysis method. It refers to the process of grouping a collection of physical or abstract objects into multiple classes composed of similar objects. Objects in the same cluster are highly similar to one another, while objects in different clusters diverge greatly. In general, clustering methods can be divided into mean-shift, density-based, hierarchical, spectral [1], and grid-based [2] methods.
Different algorithms have different strengths and weaknesses. Centroid-based algorithms, such as K-means (Kmeans) [3,4], K-medoid [5,6], fuzzy c-means (FCM), mean shift [7,8], and some improved methods [9,10], have the advantages of simple principles, convenient implementation, and fast convergence. Because these algorithms find a centroid and cluster the points close to it, they are especially suitable for compact, roughly spherical clusters, on which they produce good results with low time complexity. However, real clustering samples usually contain many clusters of arbitrary shape. Consequently, centroid-based clustering algorithms, which group the points around a centroid into one class, perform poorly on irregularly shaped clusters and misclassify many points.
Clusters of arbitrary shape can be detected easily by methods based on local data point density. Density-based spatial clustering of applications with noise (DBSCAN) [11] is robust for clusters of any shape with uniform density. However, it is not easy to select a suitable threshold, and for clusters with large density differences the threshold selection is very difficult. Moreover, the circular radius must be adjusted constantly to adapt to different cluster densities, with no reference value to guide the choice. At the same time, for clusters without obvious boundaries, DBSCAN can easily assign two clusters of different classes to the same class. Because DBSCAN uses the global density threshold MinPts, it can only find clusters composed of points whose density satisfies this threshold; that is, it has difficulty finding clusters with different densities. Clustering algorithms based on hierarchy, spectral features, and density likewise suffer from serious parameter-selection difficulties.
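To make this parameter sensitivity concrete, the following Python sketch (not from the paper; the dataset and parameter values are illustrative only) runs scikit-learn's DBSCAN with several values of eps on two blobs of very different density:

# A minimal sketch illustrating how sensitive DBSCAN is to its global parameters
# eps and min_samples; the values below are illustrative only.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Two blobs with very different densities.
X_dense, _ = make_blobs(n_samples=300, centers=[[0, 0]], cluster_std=0.3, random_state=0)
X_sparse, _ = make_blobs(n_samples=300, centers=[[5, 5]], cluster_std=1.5, random_state=0)
X = np.vstack([X_dense, X_sparse])

for eps in (0.2, 0.5, 1.0):
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    n_noise = int(np.sum(labels == -1))
    # A small eps fragments the sparse blob into noise; a large eps can merge
    # clusters when the gap between them shrinks.
    print(f"eps={eps:.1f}: {n_clusters} clusters, {n_noise} noise points")

Printing the cluster and noise counts for each radius shows how strongly the result depends on the chosen eps, which is the threshold-selection problem described above.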
In real clustering problems, there are many clusters with arbitrary shapes, and a center of mass cannot represent the nature of the data in such a cluster. Moreover, not all data have a real clustering center; in some cases, the centroids of clusters with completely different distributions essentially coincide, so clustering data based on centroid points often leads to misjudgments. Parameter selection for density-based algorithms is also very difficult and often causes poor clustering results. Regardless of the type of clustering algorithm, parameter selection is difficult, and methods such as the Silhouette Coefficient and the sum of squared errors cannot fully realize unsupervised parameter selection. Our aims are to avoid the shortcomings of centroid- and density-based algorithms and to address the challenge of clustering datasets whose clusters have different point densities.
The neighborhood density correlation clustering (NDCC) algorithm does not calculate a centroid within each cluster. Instead, it incorporates the idea of density clustering, but it is not limited to the density of a fixed region and does not use a fixed distance or a fixed density as the measure that separates different classes. It takes the k-nearest neighborhood of each point as the analysis object and treats each point and its k neighboring points as data of the same cluster. By adjusting the value of k, different clustering results are obtained. Although such parameters can be set manually, it is difficult to find suitable values without sufficient prior knowledge and multiple trials. By setting an appropriate objective function and minimizing it, NDCC can adjust the parameters automatically, obtain the optimal solution, and cluster a sample dataset without supervision. The method can detect irregularly shaped clusters and automatically find the correct number of clusters. It does not let iso-density points between clusters influence the cluster assignment, which improves its generalization performance and robustness. (The iso-density points between clusters in this paper are similar to the noise points or edge points connecting two clusters in other work; their existence leads to misclassification when a density-based clustering algorithm is used to distinguish such clusters.)
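As a rough illustration of this neighborhood idea, the following Python sketch classifies points into core, edge, and noise points from the distance to their m-th nearest neighbor. It is a simplified sketch under assumed rules (the thresholds, the noise_factor parameter, and the helper name split_core_edge_noise are ours for illustration), not the authors' NDCC implementation.

# A simplified, hedged sketch of the neighborhood idea described above; it is NOT the
# authors' NDCC implementation. The rule "core if the distance to the m-th neighbour is
# below the dataset-wide mean of that distance" is an assumption made for illustration.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def split_core_edge_noise(X, m=10, noise_factor=2.0):
    """Label points as core (0), edge (1) or noise (2) from their m-th neighbour distance."""
    nn = NearestNeighbors(n_neighbors=m + 1).fit(X)
    dist, _ = nn.kneighbors(X)                   # dist[:, 0] is the point itself
    d_m = dist[:, m]                             # distance to the m-th neighbour
    mean_d_m = d_m.mean()                        # plays the role of the adaptive radius
    labels = np.full(len(X), 1)                  # default: edge
    labels[d_m <= mean_d_m] = 0                  # denser than average -> core
    labels[d_m > noise_factor * mean_d_m] = 2    # far from everything -> noise
    return labels

In NDCC, the core points identified in this way would then be linked through their overlapping neighborhoods to form clusters, with the parameters adjusted automatically by minimizing the objective function.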
The contributions of this paper are as follows:
  • By relying on the adaptive distance radius to distinguish core, edge, and noise points, the method that is proposed in this paper overcomes the problem that a distance threshold is difficult to select in a density-based clustering algorithm, eliminates the influence of some non-core points on the clustering process, and enhances the generalization performance of the algorithm.
  • Neighborhood density correlation is used instead of a real distance in order to measure the correlation between the core points. In addition, the method clusters a certain number of neighboring core points around a core point as a class. This approach can adapt to clustering problems with different density clusters in the same dataset.
  • An appropriate objective function is adopted in order to minimize the distance between some core points, achieve the local optimal clustering results, completely avoid the subjective factors of manually set parameters, improve the efficiency and objectivity of the algorithm, and realize unsupervised clustering.

3. Experiment

The experiment in this paper consists of two parts. The first part compares the algorithms mentioned in this paper on common clustering datasets and shows the visual effects and index differences. The second part compares the visual effects and index differences of the algorithms on two public remote sensing datasets.

3.1. Index

This paper compares the following indicators for the different algorithms: Accuracy (Acc), Normalized Mutual Information (NMI), Rand Index (RI), Adjusted Rand Index (ARI), Mirkin Index (MI), and Hubert Index (HI). The symbols used below are self-contained and are not associated with symbols used earlier in the paper.

3.1.1. Acc

The formula for calculating the Acc of sub-datasets is as follows:
$$ Acc = \frac{1}{n} \sum_{i=1}^{n} \delta\left(s_i, map(r_i)\right) $$
where r_i and s_i represent the obtained label and the real label corresponding to data point x_i, respectively; n represents the total number of data points, and δ represents the indicator function, as follows:
$$ \delta(x, y) = \begin{cases} 1 & \text{if } x = y \\ 0 & \text{otherwise} \end{cases} $$
The map in the equation represents the optimal re-allocation of cluster labels to classes, which ensures that the statistics are computed over the best match between the obtained clusters and the real classes.
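As a hedged illustration (not necessarily the authors' code), the map re-allocation is commonly realized with the Hungarian algorithm on the cluster/class confusion matrix; the following Python sketch computes Acc this way:

# One common way (an assumption, not necessarily the authors' implementation) to realise
# the optimal label re-allocation "map": the Hungarian algorithm on the confusion matrix.
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(true_labels, pred_labels):
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    classes = np.unique(true_labels)
    clusters = np.unique(pred_labels)
    # Confusion matrix: rows = predicted clusters, columns = true classes.
    w = np.zeros((len(clusters), len(classes)), dtype=np.int64)
    for i, c in enumerate(clusters):
        for j, t in enumerate(classes):
            w[i, j] = np.sum((pred_labels == c) & (true_labels == t))
    row, col = linear_sum_assignment(-w)     # maximise the number of matched points
    return w[row, col].sum() / len(true_labels)

For example, clustering_accuracy([0, 0, 1, 1], [1, 1, 0, 0]) returns 1.0, since the cluster labels only need to match the true classes up to a permutation.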

3.1.2. NMI

Mutual information is a useful measure in information theory. It can be regarded as the information contained in one random variable about another random variable, or as the reduction in uncertainty of one random variable due to knowledge of another. The formula for NMI can be derived as follows. Suppose that (X, Y) are two random variables with the same number of elements, with joint distribution P(x, y) and marginal distributions P(x) and P(y). Then MI(X, Y), the mutual information, is the relative entropy of the joint distribution P(x, y) with respect to the product distribution P(x)P(y). Therefore, we have
$$ MI(X, Y) = \sum_{i=1}^{N} \sum_{j=1}^{N} P(x_i, y_j) \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)} $$
Here, P(x) is the probability distribution function of X and P(y) is the probability distribution function of Y. With the joint probability distribution $P(x_i, y_j) = \frac{|x_i \cap y_j|}{N}$, the normalized mutual information can be expressed as follows:
$$ NMI(X, Y) = \frac{2\, MI(X, Y)}{H(X) + H(Y)} $$
Here, H(X) and H(Y) are the information entropies of the random variables X and Y:
$$ H(X) = -\sum_{i=1}^{N} P(x_i) \log P(x_i), \qquad H(Y) = -\sum_{j=1}^{N} P(y_j) \log P(y_j) $$
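For concreteness, the following Python sketch evaluates NMI directly from the formulas above (natural logarithm assumed). It should agree with library implementations such as scikit-learn's normalized_mutual_info_score, whose default arithmetic-mean normalization is equivalent to 2MI/(H(X)+H(Y)).

# A direct, from-scratch rendering of the NMI formulas above; for illustration only.
import numpy as np

def nmi(x_labels, y_labels):
    x_labels, y_labels = np.asarray(x_labels), np.asarray(y_labels)
    n = len(x_labels)
    xs, ys = np.unique(x_labels), np.unique(y_labels)
    # Marginal distributions and entropies H(X), H(Y).
    px = np.array([np.mean(x_labels == x) for x in xs])
    py = np.array([np.mean(y_labels == y) for y in ys])
    hx = -np.sum(px * np.log(px))
    hy = -np.sum(py * np.log(py))
    # Mutual information MI(X, Y) with P(x_i, y_j) = |x_i ∩ y_j| / N.
    mi = 0.0
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            pxy = np.sum((x_labels == x) & (y_labels == y)) / n
            if pxy > 0:
                mi += pxy * np.log(pxy / (px[i] * py[j]))
    return 2.0 * mi / (hx + hy)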

3.1.3. RI, ARI, MI, HI, and JI

Let the clustering result be C = {C_1, C_2, ..., C_m} and the known partition be P = {P_1, P_2, ..., P_m}. The Rand Index (RI) [28] and Jaccard Index (JI) [28] are then given as follows:
$$ RI = \frac{a + d}{a + b + c + d} $$
$$ JI = \frac{a}{a + b + c} $$
where a is the number of point pairs that belong to the same cluster in C and the same group in P; b is the number of pairs that belong to the same cluster in C but to different groups in P; c is the number of pairs that belong to different clusters in C but to the same group in P; and d is the number of pairs that belong to different clusters in C and to different groups in P. The higher the value of these two indexes, the closer the clustering result is to the real partition, and the better the clustering effect.
For the ARI, the model distribution is assumed to be random; that is, the divisions P and C are generated randomly, with the number of data points in each category and cluster held fixed.
$$ ARI = \frac{RI - E(RI)}{\max(RI) - E(RI)} $$
Here E(RI) refers to the expected value of RI under this random model and max(RI) to its maximum value.
Acc is a simple and transparent evaluation measure, and NMI has an information-theoretic interpretation. RI and ARI penalize both false-positive and false-negative decisions during clustering. The formulas for MI and HI are available in Lawrence Hubert's paper [28]. Larger values of Acc, NMI, RI, ARI, JI, and HI indicate better clustering, whereas a smaller MI indicates better clustering; MI is therefore used as a reverse index to evaluate the performance of the algorithms.
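The pair-counting quantities a, b, c, and d, and the RI and JI built from them, can be computed directly; the following sketch does so on a toy labeling and takes ARI from scikit-learn, which implements the expected-index adjustment (this is an illustration, not the evaluation code used in the paper).

# Pair-counting indices RI and JI from the definitions above; ARI from scikit-learn.
import numpy as np
from itertools import combinations
from sklearn.metrics import adjusted_rand_score

def pair_counts(labels_c, labels_p):
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_c)), 2):
        same_c = labels_c[i] == labels_c[j]
        same_p = labels_p[i] == labels_p[j]
        if same_c and same_p:
            a += 1          # same cluster in C, same group in P
        elif same_c and not same_p:
            b += 1          # same cluster in C, different groups in P
        elif not same_c and same_p:
            c += 1          # different clusters in C, same group in P
        else:
            d += 1          # different clusters in C, different groups in P
    return a, b, c, d

labels_c = [0, 0, 1, 1, 2]          # clustering result C (toy example)
labels_p = [0, 0, 1, 2, 2]          # known partition P (toy example)
a, b, c, d = pair_counts(labels_c, labels_p)
ri = (a + d) / (a + b + c + d)
ji = a / (a + b + c)
ari = adjusted_rand_score(labels_p, labels_c)
print(ri, ji, ari)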

3.2. Effect Evaluation of Scattered-Point Data Clustering

Six different algorithms are used to complete the clustering experiments on seven two-dimensional public datasets: Flame [29], Jain [30], Spiral [31], Aggregation [32], Compound [33], D31 [31], and R15 [31]. NDCC adopted the completely unsupervised objective-function convergence method proposed in this paper to complete the clustering, while the other algorithms used manual parameter tuning to achieve the best clustering effect possible. The experimental results show that NDCC completes the clustering well without intervention, and its effect is better than that of the other algorithms. The index comparison is shown in Table 2 and the clustering results are shown in Figure 4.
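Such a comparison can be reproduced in outline with scikit-learn. The following sketch is a stand-in only: it uses a synthetic two-moons dataset instead of the seven benchmarks (which must be downloaded separately), illustrative parameters for the baselines, and omits NDCC, which is not a library algorithm.

# A reproducibility sketch under stated assumptions: synthetic data, illustrative parameters.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

X, y = make_moons(n_samples=500, noise=0.06, random_state=0)
models = {
    "Kmeans": KMeans(n_clusters=2, n_init=10, random_state=0),
    "DBSCAN": DBSCAN(eps=0.2, min_samples=5),
    "HC (single link)": AgglomerativeClustering(n_clusters=2, linkage="single"),
    "GMM": GaussianMixture(n_components=2, random_state=0),
}
for name, model in models.items():
    pred = model.fit_predict(X)
    print(f"{name:18s} ARI={adjusted_rand_score(y, pred):.3f} "
          f"NMI={normalized_mutual_info_score(y, pred):.3f}")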
Table 2. Indexes and Time (s) of various algorithms on seven datasets. HC1: shortest-distance HC, HC2: weighted HC.
Figure 4. Clustering effects of various algorithms on seven different datasets. Shown from top to bottom are results for FCM, Kmeans, DBSCAN, HC1: shortest-distance HC, HC2: weighted average HC, GMM, and NDCC on (a) Flame, (b) Jain, (c) Spiral, (d) Aggregation, (e) Compound, (f) D31, and (g) R15.

3.3. Evaluation of Clustering Effect of Remote Sensing Data

In the field of remote sensing, it is expensive and difficult to obtain labeled data for training. Different ground features and different weather conditions make remote sensing images substantially different, so it is difficult to apply supervised learning methods. In contrast, an unsupervised machine learning algorithm does not need training samples: it can cluster the data according to their natural distribution characteristics, based on the spectral information given by the electromagnetic radiation intensity in remote sensing images, and is therefore a good way to group similar objects together. In this paper, we compare several effective unsupervised clustering algorithms with NDCC on two datasets: the labeled remote sensing dataset 'UCMerced-LandUse' [34] and the '2015 high-resolution remote sensing image of a city in southern China' dataset [35]. The evaluation on both datasets is divided into two steps: preprocessing and evaluation.
Step 1 Preprocessing: super-pixel segmentation (the simple linear iterative clustering (SLIC) super-pixel segmentation algorithm [36,37,38]) is adopted as a pre-processing step for remote sensing image clustering in order to reduce the amount of computation. The number of super-pixels in each image is kept between 1000 and 3000. Figure 5c,d show the image and scatter-plot views of the NDCC remote sensing clustering. As can be seen from Figure 5d, the super-pixel data points of a remote sensing image are distributed in an irregular shape with no definite clustering center.
Figure 5. Clustering segmentation effect of the NDCC algorithm on the 'UCMerced-LandUse' remote sensing dataset. (a) is the original image. (d) shows the distribution of super-pixel scatter points of the different types of ground objects in the R and G channels, corresponding to the segmentation of the different ground objects in (c). Our algorithm finds the number of image clusters in a completely unsupervised manner and realizes clustering segmentation.
Step 2 Evaluation: a comparative experiment with seven clustering algorithms is carried out, taking the image super-pixel (RGB value) data points as the objects.
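A hedged sketch of the Step 1 preprocessing is given below: SLIC superpixels are computed with scikit-image, and the mean RGB value of each superpixel becomes one data point for clustering. The segment count and compactness are illustrative, and the file name and the ndcc() call are hypothetical placeholders.

# A hedged sketch of the preprocessing: per-superpixel mean RGB values as data points.
import numpy as np
from skimage import io
from skimage.segmentation import slic

def superpixel_points(image_path, n_segments=2000):
    img = io.imread(image_path)                          # H x W x 3 RGB image
    segments = slic(img, n_segments=n_segments, compactness=10, start_label=0)
    points = np.array([img[segments == s].mean(axis=0)   # mean RGB per superpixel
                       for s in np.unique(segments)])
    return points                                        # one 3-D point per superpixel

# points = superpixel_points("ucmerced_example.tif")     # hypothetical file name
# labels = ndcc(points)                                  # hypothetical NDCC call on the points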
The 'UCMerced-LandUse' remote sensing dataset is used to verify the clustering effect of the algorithms. It is a 21-class land-use image dataset intended for research purposes, with 100 images per class, each measuring 256 × 256 pixels. The images were manually extracted from larger images in the USGS National Map Urban Area Imagery collection for various urban areas around the country, and the pixel resolution of this public-domain imagery is one foot. This experiment compared the clustering effects of the algorithms cited in this paper on this dataset and verified the different clustering effects with the indexes. Eighty images of the 21 classes of ground objects are randomly selected for cluster comparison, and this selection is repeated 30 times. Table 3 shows the clustering comparison. The clustering segmentation effects of six algorithms on the 'UCMerced-LandUse' remote sensing dataset are shown in Figure 6.
Table 3. Index performance of various methods on the 'UCMerced-LandUse' dataset. The bold data are the maximum values. All programs were run 30 times. Statistically significant maximum values in the table are indicated with '*'. The mean deviations of the clustering indexes are also given. The table shows that NDCC achieved good results on this labeled dataset using unsupervised methods.
Figure 6. Clustering segmentation effect of six algorithms on the 'UCMerced-LandUse' remote sensing dataset. Our algorithm accurately separates the different ground objects.
The '2015 high-resolution remote sensing image of a city in southern China' dataset from the CCF Big Data competition is used to verify the clustering effect of the algorithms. It includes 14,999 original geological remote sensing images and ground-truth images, each with a size of 256 × 256 pixels. Because the images in the dataset are not pre-divided, in order to better verify the clustering discrimination of the algorithms, we randomly selected 14,000 images and divided them into 20 groups of 700 sample images each. Clustering was executed 30 times to generate 30 groups of comparative data for the different algorithms. The clustering comparison is shown in Table 4, and the average running times are shown in Table 5.
Table 4. Index performance of various methods on the '2015 high-resolution remote sensing image of a city in southern China' dataset. The bold data are the maximum values. All programs were run 30 times. Statistically significant maximum values in the table are indicated with '*'. The mean deviations of the clustering indexes are also given. The table shows that NDCC achieved good results on this labeled dataset using unsupervised methods.
Table 5. Comparison of runtimes for various algorithms.

3.4. Discussion of Experimental Results

As can be seen from Table 2, for the positive indicators NMI, ARI, RI, and HI, NDCC was slightly lower than DBSCAN only on the Compound dataset and achieved the maximum value on the other six datasets, which was the best overall result. The main reason is that when we used DBSCAN to verify the data, we selected the optimal result after several rounds of manual adjustment; in addition, the shape of the Compound dataset makes it well suited to density clustering. On the other datasets, despite multiple rounds of manual tuning, the other methods were still unable to surpass the clustering effect of NDCC, which is completely adaptive and requires no manual tuning. In terms of the reverse index, MI obtained the minimum value in every case, indicating that NDCC can quickly find the optimal clusters on these scatter datasets. As can be seen from Table 5, compared with the other algorithms, NDCC is in the middle range in terms of running time: it is generally faster than Kmeans, FCM, and GMM, and slightly slower than DBSCAN and HC. However, the overall difference is on the order of $10^{-2}$ s, which barely affects the running speed of the clustering algorithms.
By randomly extracting images from the datasets and executing the clustering 30 times with the six algorithms, the mean values and deviations of the measures were obtained, as presented in Table 3 and Table 4. Across the five indicators, NDCC showed better values on the two large remote sensing image datasets than the other five algorithms, and its deviations are not noticeably larger than those of the other algorithms, which indicates that NDCC is more robust across different images. Through this statistical test, we can fairly verify the comparison of the clustering effect between NDCC and the other algorithms on remote sensing images.

4. Conclusions

In this paper, we proposed the NDCC algorithm, a clustering method based on the local density of data points. As the experimental results in Table 2 and Figure 4 show, NDCC achieved the best clustering results on the seven datasets, such as Flame and Aggregation. Our algorithm obtained these results without any supervision, whereas the other algorithms obtained relatively good results only with manually adjusted parameters. Moreover, the clustering effect of the algorithm was further evaluated on the 'UCMerced-LandUse' remote sensing dataset and the '2015 high-resolution remote sensing image of a city in southern China' dataset. The Acc, NMI, ARI, MI, HI, RI, and JI values obtained show that the clustering effect of the proposed method is better than that of the five other existing algorithms. On the other hand, because the time complexity of the algorithm is at a general level, the computation time is relatively long when processing extremely large datasets (over 100,000 data points). For each data point we focus only on its neighborhood points, so it is not necessary to calculate the distance between data points that differ too much. We can also expect NDCC to perform well in natural language processing and text clustering [39,40].
In future work, we plan to optimize the structure of the algorithm according to the neighborhood characteristics of the data points, omit the calculation of the distance between data points with large differences, and reduce the time complexity of the algorithm.

Author Contributions

Z.W. and J.J.; methodology, software, validation, writing–original draft preparation, formal analysis, Z.L.; investigation, resources, W.L.; writing–review and editing, X.L.; project administration, funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Sichuan Provincial Department of Science and Technology Program of China, Sichuan Innovation Talent Platform Support Plan (2020JDR0330), and the APC was funded by the same project (2020JDR0330).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Summary of notations.
S: Ordered distance matrix.
A: Ordered serial number matrix.
A′: The core points' ordered serial number matrix.
$\bar{D}_m$: The mean of the distance to the m-th point in the overall ordered distance matrix S.
D_ij: The distance between x_i and x_j.
x_i: The i-th point.
x_j: The j-th point.
m: The threshold used to distinguish core, edge, and noise points.
E_i: Edge point.
C_i: Core point.
B_i: Noise point.
N_i(x_i): The number of neighborhood points of the i-th point.
J: The objective function.
G: The number of clusters.
k: The number of neighborhood core points used for objective function evaluation.
P(x): The probability distribution function of X.
P(y): The probability distribution function of Y.
P(x, y): The joint probability distribution of X and Y.
δ: The indicator function.
Y: Random variable named Y.
X_i: The i-th random variable named X.
Y_j: The j-th random variable named Y.
N: The number of random variables.
MI(X, Y): The relative entropy of the joint distribution P(x, y) and the product distribution P(x)P(y) (mutual information).
Acc: The clustering accuracy.
NMI: The Normalized Mutual Information.
ARI: The Adjusted Rand Index.
RI: The Rand Index.
MI: The Mirkin Index.
HI: The Hubert Index.
JI: The Jaccard Index.

References

  1. Borjigin, S. Non-unique cluster numbers determination methods based on stability in spectral clustering. Knowl. Inf. Syst. 2013, 36, 439–458. [Google Scholar] [CrossRef]
  2. Wang, W. STING: A Statistical Information Grid Approach to Spatial Data Mining. In Proceedings of the 23rd Very Large Database Conference, Athens, Greece, 25–29 August 1997. [Google Scholar]
  3. Jin, X.; Han, J. K-Means Clustering. In Encyclopedia of Machine Learning and Data Mining; Springer: Berlin, Germany, 2017. [Google Scholar]
  4. Liu, X.; Zhu, X.; Li, M.; Wang, L.; Zhu, E.; Liu, T.; Kloft, M.; Shen, D.; Yin, J.; Gao, W. Multiple Kernel k-means with Incomplete Kernels. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 1191–1204. [Google Scholar] [CrossRef] [PubMed]
  5. Zeidat, N.M.; Eick, C.F. K-me Generation. In Proceedings of the International Conference on International Conference on Artificial Intelligence, Louisville, KY, USA, 18 December 2004. [Google Scholar]
  6. Mohit, N.; Kumari, A.C.; Sharma, M. A novel approach to text clustering using shift k-me. Int. J. Soc. Comput. Cyber-Phys. Syst. 2019, 2, 106. [Google Scholar] [CrossRef]
  7. Yamasaki, R.; Tanaka, T. Properties of Mean Shift. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 2273–2286. [Google Scholar] [CrossRef] [PubMed]
  8. Ghassabeh, Y.A.; Linder, T.; Takahara, G. On the convergence and applications of mean shift type algorithms. In Proceedings of the 2012 25th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), Montreal, QC, Canada, 29 April–2 May 2012. [Google Scholar]
  9. Sanchez, M.A.; Castillo, O.; Castro, J.R.; Melin, P. Fuzzy granular gravitational clustering algorithm for multivariate data. Inf. Sci. 2014, 279, 498–511. [Google Scholar] [CrossRef]
  10. Defiyanti, S.; Jajuli, M.; Rohmawati, N. K-Me. Sci. J. Inform. 2017, 4, 27. [Google Scholar]
  11. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise; AAAI Press: Cambridge, MA, USA, 1996. [Google Scholar]
  12. Chacon, J.E. Mixture model modal clustering. Adv. Data Anal. Classif. 2019, 13, 379–404. [Google Scholar] [CrossRef]
  13. Jia, W.; Tan, Y.; Liu, L.; Li, J.; Zhang, H.; Zhao, K. Hierarchical prediction based on two-level Gaussian mixture model clustering for bike-sharing system. Knowl.-Based Syst. 2019, 178, 84–97. [Google Scholar] [CrossRef]
  14. Madan, S.; Dana, K.J. Modified balanced iterative reducing and clustering using hierarchies (m-BIRCH) for visual clustering. Pattern Anal. Appl. 2016, 19, 1023–1040. [Google Scholar] [CrossRef]
  15. Agarwal, P.; Alam, M.A.; Biswas, R. A Hierarchical Clustering Algorithm for Categorical Attributes. In Proceedings of the Second International Conference on Computer Engineering & Applications, Bali, Island, 26–29 March 2010. [Google Scholar]
  16. Karypis, G.; Han, E.H.; Kumar, V. Chameleon: Hierarchical Clustering Using Dynamic Modeling. Computer 2002, 32, 68–75. [Google Scholar] [CrossRef]
  17. Fop, M.; Murphy, T.B.; Scrucca, L. Model-based Clustering with Sparse Covariance Matrices. Stat. Comput. 2018, 29, 791–819. [Google Scholar] [CrossRef]
  18. Ankerst, M.; Breunig, M.M.; Kriegel, H.P.; Sander, J. OPTICS: Ordering Points to Identify the Clustering Structure. In Proceedings of the ACM Sigmod International Conference on Management of Data, Philadelphia, PA, USA, 1–3 June 1999. [Google Scholar]
  19. Moraes, E.C.C.; Ferreira, D.D.; Vitor, G.B.; Barbosa, B.H.G. Data clustering based on principal curves. Adv. Data Anal. Classif. 2019, 14, 77–96. [Google Scholar] [CrossRef]
  20. Abin, A.A.; Bashiri, M.A.; Beigy, H. Learning a metric when clustering data points in the presence of constraints. Adv. Data Anal. Classif. 2019, 14, 29–56. [Google Scholar] [CrossRef]
  21. Rodriguez, A.; Laio, A. Machine learning. Clustering by fast search and find of density peaks. Science 2014, 344, 1492. [Google Scholar] [CrossRef] [PubMed]
  22. Corizzo, R.; Pio, G.; Ceci, M.; Malerba, D. DENCAST: Distributed density-based clustering for multi-target regression. J. Big Data 2019, 6. [Google Scholar] [CrossRef]
  23. Hosseini, B.; Kiani, K. A big data driven distributed density based hesitant fuzzy clustering using Apache spark with application to gene expression microarray. Eng. Appl. Artif. Intell. 2019, 79, 100–113. [Google Scholar] [CrossRef]
  24. Zhao, Z.; Luo, Z.; Li, J.; Chen, C.; Piao, Y. When Self-Supervised Learning Meets Scene Classification: Remote Sensing Scene Classification Based on a Multitask Learning Framework. Remote Sens. 2020, 12, 3276. [Google Scholar] [CrossRef]
  25. Petrovska, B.; Zdravevski, E.; Lameski, P.; Corizzo, R.; Štajduhar, I.; Lerga, J. Deep Learning for Feature Extraction in Remote Sensing: A Case-Study of Aerial Scene Classification. Sensors 2020, 20, 3906. [Google Scholar] [CrossRef]
  26. Kushary, D. The EM Algorithm and Extensions. Technometrics 1997, 40, 260. [Google Scholar] [CrossRef]
  27. Rodriguez, M.Z.; Comin, C.H.; Casanova, D.; Bruno, O.M.; Amancio, D.R.; da Costa, F.L.; Rodrigues, F.A. Clustering algorithms: A comparative approach. PLoS ONE 2019, 14, e0210236. [Google Scholar] [CrossRef]
  28. Hubert, L.; Arabie, P. Comparing partitions. J. Classif. 2006, 2, 193–218. [Google Scholar] [CrossRef]
  29. Fu, L.; Medico, E. FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data. BMC Bioinform. 2007, 8, 3. [Google Scholar] [CrossRef]
  30. Jain, A.K.; Law, M.H.C. Data Clustering: A Users Dilemma. In Proceedings of the International Conference on Pattern Recognition & Machine Intelligence, Kolkata, India, 20–22 December 2005. [Google Scholar]
  31. Chang, H.; Yeung, D.Y. Robust path-based spectral clustering. Pattern Recognit. 2008, 41, 191–203. [Google Scholar] [CrossRef]
  32. Gionis, A.; Mannila, H.; Tsaparas, P. Clustering Aggregation. ACM Trans. Knowl. Discov. Data 2007, 1, 4. [Google Scholar] [CrossRef]
  33. Zahn, C. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans. Comput. 1971, 100, 68–86. [Google Scholar] [CrossRef]
  34. Cheng, G.; Han, J.; Lu, X. Remote Sensing Image Scene Classification: Benchmark and State of the Art. Proc. IEEE 2017, 105, 1865–1883. [Google Scholar] [CrossRef]
  35. CCF Big Data Competition. High-Resolution Remote Sensing Images of a City in Southern China in 2015. Available online: https://drive.google.com/drive/folders/1SwfEZSc2FuI-q9CNsxU5OWjVmcZwDR0s?usp=sharing (accessed on 24 November 2020).
  36. Guo, Y.; Jiao, L.; Wang, S.; Wang, S.; Liu, F.; Hua, W. Fuzzy Superpixels for Polarimetric SAR Images Classification. IEEE Trans. Fuzzy Syst. 2018, 26, 2846–2860. [Google Scholar] [CrossRef]
  37. den Bergh, M.V.; Boix, X.; Roig, G.; de Capitani, B.; Gool, L.V. SEEDS: Superpixels Extracted via Energy-Driven Sampling. In Computer Vision–ECCV 2012; Springer: Berlin, Germany, 2012; pp. 13–26. [Google Scholar]
  38. Boemer, F.; Ratner, E.; Lendasse, A. Parameter-free image segmentation with SLIC. Neurocomputing 2018, 277, 228–236. [Google Scholar] [CrossRef]
  39. Silva, T.C.; Amancio, D.R. Word sense disambiguation via high order of learning in complex networks. EPL (Europhys. Lett.) 2012, 98, 58001. [Google Scholar] [CrossRef]
  40. Rosa, K.D.; Shah, R.; Lin, B.; Gershman, A.; Frederking, R. Topical clustering of tweets. In Proceedings of the ACM SIGIR: SWSM, Barcelona, Spain, 17–21 July 2011. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
