Euclidean distance between instances is widely used to capture the manifold structure of data and to build the graphs used in graph-based dimensionality reduction. However, in some circumstances, plain Euclidean distance cannot accurately capture the similarity between instances: instances from different classes that lie near the decision boundary may be close to each other, which can mislead graph-based dimensionality reduction and degrade its performance. To mitigate this issue, we propose an approach called Laplacian Eigenmaps based on Clustering-Adjusted Similarity (LE-CAS). LE-CAS first clusters all instances to explore their global structure and discrimination, and quantifies the similarity between cluster centers. It then adjusts the similarity between each pair of instances by multiplying it by the similarity between the centers of the clusters to which the two instances belong. In this way, if two instances are from different clusters, the similarity between them is reduced; otherwise, it is unchanged. Finally, LE-CAS performs graph-based dimensionality reduction (via Laplacian Eigenmaps) on the adjusted similarity. Comprehensive empirical studies on UCI datasets show that LE-CAS not only outperforms the related methods it is compared with, but is also more robust to its input parameters.
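The similarity adjustment described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the use of cluster means as centers, and all function names are assumptions made for illustration, and the cluster labels are taken as given (for example, from k-means). The clustering step and the final Laplacian Eigenmaps embedding are omitted.

```python
import math


def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian (heat-kernel) similarity; an assumed choice of base similarity."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def cluster_centers(points, labels):
    """Mean of the points in each cluster, keyed by cluster label (assumed center definition)."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in groups.items()
    }


def adjusted_similarity(points, labels, sigma=1.0):
    """Multiply each pairwise similarity by the similarity of the two cluster centers."""
    centers = cluster_centers(points, labels)
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            base = gaussian_similarity(points[i], points[j], sigma)
            # Same cluster: the centers coincide, so this factor is exactly 1
            # and the base similarity is left unchanged.
            center_sim = gaussian_similarity(
                centers[labels[i]], centers[labels[j]], sigma
            )
            W[i][j] = base * center_sim
    return W


# Hypothetical toy data: two clusters of two points each on a line.
points = [[0.0, 0.0], [0.2, 0.0], [1.0, 0.0], [1.2, 0.0]]
labels = [0, 0, 1, 1]
W = adjusted_similarity(points, labels)
```

In this toy example, the similarity between the two points of cluster 0 keeps its original value, while the similarity between `points[1]` and `points[2]`, which sit in different clusters, is scaled down by the similarity of the two cluster centers. The resulting matrix `W` would then serve as the weight matrix of the graph passed to Laplacian Eigenmaps.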
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.