Abstract
Although a substantial body of work has extended clustering to large-scale data using anchor-based strategies, these methods still incur complexity linear in the sample size at each iteration, making them time-consuming. Moreover, because feature dimensionality reduction is often overlooked in this procedure, most of them suffer from the “curse of dimensionality”. To address both issues simultaneously, we introduce a novel paradigm based on superpixel encoding and data projection, which learns a small-scale bi-stochastic graph from a data matrix with large-scale pixels and high-dimensional spectral features to achieve effective clustering. In addition, a symmetric neighbor search strategy is integrated into our framework to ensure the sparsity of the graph and further improve computational efficiency. For optimization, we design a simple yet effective strategy that simultaneously satisfies all bi-stochastic constraints while ensuring convergence to the optimal solution. To validate our model’s effectiveness and scalability, we conduct extensive experiments on hyperspectral images (HSIs) of various scales. The results demonstrate that our method achieves state-of-the-art clustering performance and extends better to large-scale and high-dimensional HSIs.