Kernelized Manifold-Optimized Linear KNN for Nonlinear Data Classification
Abstract
1. Introduction
- Because the linear KNN method relies on the linear representation assumption, the representation it learns cannot accurately reconstruct a testing sample when the dataset contains nonlinear relationships. The representation then loses its physical meaning, which weakens the generalization capability.
- The simplification of the linear KNN method may neglect the complex structures in the data, weakening its generalization capability.
- In linear models, each element of the weight vector represents the degree of influence of the corresponding feature on the target variable. If the data do not conform to the linear assumption, nonlinear relationships can make the physical meaning of the weights ambiguous, so this way of interpreting the weights is no longer valid.
- In nonlinear relationships, the interactions between different features lead to bias in the weights. Linear models cannot capture these complex interactions, so the optimized weights may become distorted and fail to accurately reflect the relationships between the features.
- If nonlinear relationships exist in the data, using linear models for prediction may result in performance degradation. Linear models cannot accurately fit nonlinear relationships, potentially leading to significant errors when predicting new data.
- By incorporating a manifold-preserving regularization term with an adaptive Laplacian matrix, KMOLNN ensures that the data retain their original local structure after high-dimensional mapping, significantly enhancing its adaptability to nonlinear datasets and outperforming traditional weighted KNN methods.
- This study provides, for the first time, a mathematical proof of the group effect of the nearest neighbors for the kernel LLK method, revealing the mechanism by which the model assigns higher weights to the training samples close to the testing sample, thereby offering greater theoretical depth compared to existing kernel KNN approaches.
- Through mathematical derivation, KMOLNN implicitly implements the Bayesian decision rule in weight optimization, approximating the maximum a posteriori probability estimation, which enhances the robustness and interpretability of KMOLNN on nonlinear datasets.
- The experimental results for all adopted datasets confirm that KMOLNN demonstrates enhanced generalization capability and acceptable runtime under our experimental protocol compared to the existing KNN variants.
2. Related Work and Preliminaries
2.1. Related Work
- Some linear KNN methods [6,7] focus on introducing regularization terms to learn an enhanced representation. For instance, Tibshirani [6] proposed the Least Absolute Shrinkage and Selection Operator (LASSO), which learns a sparse representation for each testing sample by imposing an ℓ1-norm regularization constraint on the representation vector. McDonald [18] presented Ridge regression, which prevents overfitting and enhances the generalization capability by imposing an ℓ2-norm regularization constraint on the representation vector. Based on the LASSO, Zhong et al. [7] proposed the Elastic Net to learn a sparse and robust representation of each testing sample by combining ℓ1-norm and ℓ2-norm regularization constraints on the representation vector. Wang et al. [8] proposed the Locality-constrained Linear Coding (LLC) method, which links the linear representation weights to distance metrics, yielding more accurate weights and more precise classification. Liu et al. [9,19] then developed the LLC method into the Local Linear KNN method (LLKNN), which establishes a more stable relationship between the representation weights and distance metrics. LLKNN also provides a solid theoretical foundation and demonstrates that its prediction behavior can be interpreted through the Bayesian decision rule.
- Some linear KNN methods [4,10] focus on weighting the neighbors to achieve enhanced classification performance. For instance, Xu et al. [10] proposed the Weighted Local Linear KNN method (WLLKNN), which weights the norm-based regularization term on the linear representation vector to obtain prior weights for the nearest neighbors. Gou et al. [4] proposed the Weighted Local Mean Representation-based k-nearest neighbor method (WLMRKNN), which uses the weights of the k-local mean vectors from each class to constrain the representation coefficients. These weights provide more information for reconstructing the testing samples and thereby enable a more accurate representation. The method is also less sensitive to the choice of the k-value, which makes the KNN algorithm more robust. Zhao et al. [20] proposed the Local Centroid Distance-Constrained Representation-based KNN (LCDR-KNN), which improves the accuracy of the KNN classifier by selecting the k-nearest training samples for each class, using their centroid distance as a constraint, and combining it with a collaborative representation to calculate the weight of each neighbor in the classification.
- Some linear KNN methods [21,22] focus on adaptively selecting the k-value for each testing sample to achieve enhanced classification performance. For instance, Li et al. [21] demonstrated that employing different k-values for different classes yields better generalization performance than using a fixed k-value for all classes. Mullick et al. [22] utilized neural networks to learn the density information around testing samples from the training data, thereby determining the appropriate k-values. Zhang et al. [17] proposed using a decision tree method to predict the optimal k-value for testing samples. First, during the training phase, the optimal k-value for all samples is learned through sparse modeling. Then, using the training samples and the optimal k-values, a k-tree is constructed to rapidly predict the optimal k-value for the testing samples during prediction. Wang et al. [23] proposed adjusting the local k-value of testing samples based on confidence intervals. Manocha et al. [24] used Bayesian optimization to adaptively select and infer the optimal k-value for testing samples from training samples. Cheng et al. [25] proposed a sparse learning-based KNN (S-KNN), which introduces a correlation matrix between the training and testing samples. Based on the S-KNN, Zhang et al. [11] proposed the Graph Sparse KNN (GS-KNN) method, which reduces the impact of noise on the predicted labels of testing samples by introducing sparsity into the linear expression matrix. Additionally, Zhang et al. [12] proposed a One-Step KNN, which transforms the k-nearest neighbor search in a linear KNN into matrix operations, thereby computing the adaptive k-value. This approach aims to enhance the prediction accuracy and efficiency by simplifying the computational processes. Recent research has further extended these ideas. For instance, Amer et al. [37] proposed an efficient k-nearest neighbor model that introduces three KNN variants (PRKNN, EPRKNN, and WPRKNN) by combining preprocessing techniques and weighting schemes, significantly improving performance for big data classification. Zhang et al. [26] introduced a shared-style linear k-nearest neighbors classification method, which integrates style information across multi-view data to optimize the classification accuracy. Furthermore, Fan et al. [27] proposed a multi-view adaptive k-nearest neighbors classification, which further enhances the model’s adaptability to heterogeneous data by dynamically adjusting the k-value for multi-view data. These recent advances highlight the potential of kernelization and adaptive mechanisms for improving KNN performance, providing a solid foundation for the kernel mapping and manifold optimization of the proposed KMOLNN method.
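For reference, the regularized representation problems named in the first item of this list are commonly written as follows. The notation is an assumption made here for illustration and is not taken from the cited papers: $x$ is the testing sample, $X$ is the matrix whose columns are the training samples, $w$ is the representation vector, and $\lambda, \lambda_1, \lambda_2 \ge 0$ are regularization parameters.

```latex
\begin{aligned}
\text{LASSO:}       \quad & \min_{w} \; \lVert x - Xw \rVert_2^2 + \lambda \lVert w \rVert_1 \\
\text{Ridge:}       \quad & \min_{w} \; \lVert x - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2 \\
\text{Elastic Net:} \quad & \min_{w} \; \lVert x - Xw \rVert_2^2 + \lambda_1 \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2^2
\end{aligned}
```

The ℓ1 term drives many entries of $w$ to exactly zero, while the ℓ2 term shrinks all entries smoothly. The Elastic Net combines both effects.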
2.2. Preliminaries
2.2.1. Linear KNN Method
2.2.2. Graph Theory and Laplacian Matrix
Graph Theory
Laplacian Matrix
3. The Objective Function and Prediction Function of the Proposed KMOLNN Method
3.1. Objective Function of the KMOLNN Method
3.2. Prediction Function of the Proposed KMOLNN Method
- The kernel function allows us to implicitly map the data onto a high-dimensional or even infinite-dimensional feature space by computing the kernel functions in the original feature space rather than directly calculating the mapped data. The purpose of this mapping is to reveal the intrinsic structure of data in the new feature space, making the linearly inseparable data in the original feature space linearly separable in the mapped feature space. For example, through the radial basis function (RBF) kernel, data can be mapped onto a higher-dimensional feature space where similar samples are closer together and dissimilar samples are more dispersed.
- The proposed KMOLNN method utilizes similarity probabilities computed by the kernel function for prediction. For a given testing sample, it calculates the sum of its similarity to the training samples in each class. The similarity is computed directly in the high-dimensional feature space via the kernel function, reflecting the proximity probability of the testing sample to the training samples of each class in the mapped high-dimensional feature space. Finally, the proposed KMOLNN method assigns the testing sample to the class with the highest sum of similarity, indicating that the testing sample is most similar to the training samples of that class in the high-dimensional feature space.
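The class-wise similarity rule described above can be sketched as follows. This is a minimal illustration, not the prediction function of KMOLNN itself: the Gaussian kernel parameterization with `sigma`, the function names, and the optional `weights` argument (a hypothetical hook for learned representation weights) are all assumptions introduced here.

```python
import math
from collections import defaultdict

def rbf_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel k(x, z) = exp(-||x - z||^2 / (2 * sigma^2)).

    The 2 * sigma^2 scaling is one common convention, assumed here.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def predict_by_class_similarity(x_test, X_train, y_train, sigma=1.0, weights=None):
    """Assign x_test to the class with the largest summed kernel similarity.

    weights is a hypothetical per-training-sample weight; None means that
    every training sample contributes equally.
    """
    if weights is None:
        weights = [1.0] * len(X_train)
    scores = defaultdict(float)
    for x_i, y_i, w_i in zip(X_train, y_train, weights):
        scores[y_i] += w_i * rbf_kernel(x_test, x_i, sigma)
    best = max(scores, key=scores.get)
    return best, dict(scores)

# Toy data: two well-separated classes in two dimensions.
X_train = [[0.0, 0.0], [0.1, 0.2], [2.0, 2.0], [2.1, 1.9]]
y_train = ["a", "a", "b", "b"]
label, scores = predict_by_class_similarity([0.2, 0.1], X_train, y_train)
print(label, scores)
```

Because the kernel is evaluated in the original feature space, the mapped high-dimensional representation is never computed explicitly.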
3.3. Theoretical Analysis of Generalization Capability
3.3.1. Definition of Rademacher Complexity
3.3.2. Generalization Error Bound
3.3.3. Rademacher Complexity Bound for the Proposed KMOLNN Method
3.4. Proof of the Nearest Neighbor Group Effect
3.5. Optimization of the Objective Function for the Proposed Method
3.5.1. Parameter Initialization
- For the initialization of the probability weight vector, each element is assigned an identical value. This rests on a reasonable assumption: in the absence of any prior information, the testing sample is equally likely to be associated with each training sample. It also ensures that the elements sum to one, which satisfies the basic requirement of a probability distribution.
- For the adaptive adjacency matrix, we employ the k-means clustering method [35] to cluster the data points into an appropriate number of clusters, which typically depends on the characteristics of the problem or is determined through a criterion such as the elbow method. The connection weights within the same cluster are then differentiated from those between clusters.
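The two initialization steps above can be sketched as follows. The cluster labels are assumed to come from a k-means run that is not shown, and the within-cluster and between-cluster weights `w_same` and `w_diff`, as well as the zero diagonal, are hypothetical choices for illustration rather than the values used in the paper.

```python
def init_uniform_weights(n):
    """Uniform probability weights: each entry is 1/n, so the entries sum to 1."""
    return [1.0 / n] * n

def init_adjacency_from_clusters(labels, w_same=1.0, w_diff=0.0):
    """Build a symmetric adjacency matrix from cluster labels.

    w_same and w_diff are hypothetical connection weights for pairs in the
    same cluster and pairs in different clusters. The diagonal is left at
    zero (no self-loops), which is also an assumption.
    """
    n = len(labels)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                S[i][j] = w_same if labels[i] == labels[j] else w_diff
    return S

cluster_labels = [0, 0, 1, 1, 1]  # assumed output of a k-means run
w0 = init_uniform_weights(len(cluster_labels))
S0 = init_adjacency_from_clusters(cluster_labels)
print(w0)
print(S0)
```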
3.5.2. Alternating Optimization
- When the adjacency matrix is fixed, optimize the probability weight vector.
Algorithm 1: Optimization process in Equation (21).
Input: Training sample matrix; testing sample; adjacency matrix; distance vector; regularization parameters.
Output: Probability weight vector of the testing sample.
Procedure:
- Step 1: Calculate and .
- Step 2: While .
- Step 3: For to do the following.
- Step 4: Calculate .
- Step 5: Add to .
- Step 6: Update .
- Step 7: .
- Step 8: Project to make it a probability distribution.
- Step 9: .
- Step 10: .
- Step 11: Return the probability weight vector.
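Step 8 of Algorithm 1 projects the updated vector so that it is a probability distribution. The exact projection used by the authors is not recoverable from the text, so the sketch below shows one common choice as an assumption: the Euclidean projection onto the probability simplex computed by the sort-and-threshold procedure. The function name is hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_i >= 0, sum(p) = 1}.

    Sort-and-threshold procedure: find the largest index rho such that
    u_rho - (sum(u_1..u_rho) - 1) / rho > 0 for the sorted vector u, then
    shift every entry by the corresponding threshold and clip at zero.
    """
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        candidate = (cumulative - 1.0) / i
        if u_i - candidate > 0:
            theta = candidate  # keep the threshold of the largest valid index
    return [max(x - theta, 0.0) for x in v]

# A vector after a gradient step may have negative entries or the wrong sum.
p = project_to_simplex([0.3, -0.1, 0.9])
print(p, sum(p))
```

A simpler alternative is to clip negative entries and renormalize, but that is not the closest point on the simplex in the Euclidean sense.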
- When the probability weight vector is fixed, optimize the adjacency matrix.
Algorithm 2: Optimizing Equation (27).
Input: Training samples; testing sample probability vector; regularization parameters.
Output: Optimal adjacency matrix.
Procedure:
- Step 1: Randomly initialize the adjacency matrix.
- Step 2: While .
- Step 3: .
- Step 4: Update .
- Step 5: Apply projection to ensure non-negativity.
- Step 6: Return the adjacency matrix.
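The update-then-project pattern of Algorithm 2 can be illustrated with a generic projected gradient loop. This is a sketch under stated assumptions: `grad_fn` is a hypothetical stand-in for the gradient of the objective in Equation (27), which is not reproduced here, and the toy objective, step size, and stopping rule are chosen only for demonstration.

```python
def nonneg_projected_gradient(S, grad_fn, step=0.1, max_iter=100, tol=1e-8):
    """Minimize over matrices with non-negative entries by projected gradient.

    grad_fn(S) must return the gradient as a list of lists shaped like S.
    Each iteration takes a gradient step and then clips negative entries
    to zero, which is the projection onto the non-negative orthant.
    """
    for _ in range(max_iter):
        G = grad_fn(S)
        S_new = [[max(s - step * g, 0.0) for s, g in zip(row_s, row_g)]
                 for row_s, row_g in zip(S, G)]
        change = max(abs(a - b)
                     for row_a, row_b in zip(S_new, S)
                     for a, b in zip(row_a, row_b))
        S = S_new
        if change < tol:
            break
    return S

# Toy objective 0.5 * ||S - T||_F^2, whose gradient is S - T.
T = [[0.0, 0.8], [-0.5, 0.0]]

def toy_grad(S):
    return [[s - t for s, t in zip(row_s, row_t)] for row_s, row_t in zip(S, T)]

S_opt = nonneg_projected_gradient([[0.5, 0.5], [0.5, 0.5]], toy_grad,
                                  step=0.5, max_iter=200)
print(S_opt)
```

In the toy problem the entry with a negative target is held at zero by the projection, while the remaining entries converge to their targets.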
Algorithm 3: Alternating optimization process of the objective function in the proposed KMOLNN method.
Input: Training samples; testing sample; kernel function; regularization parameters; kernel width; maximum iterations; convergence threshold tol.
Output: Optimized weight vector and adjacency matrix.
Procedure:
- Step 1: Initialization: , .
- Step 2: Initialize the adjacency matrix using k-means clustering; if and are in the same cluster, set , otherwise .
- Step 3: Compute similarity vector , where for .
- Step 4: Compute kernel matrices , and .
- Step 5: Iterative optimization.
- Step 6: For to do the following.
- Step 7: Fix the adjacency matrix; optimize the weight vector. Compute Laplacian , where .
- Step 8: Fix the weight vector; optimize the adjacency matrix.
- Step 9: Update , where .
- Step 10: If .
- Step 11: Break.
- Step 12: , .
- Step 13: Return the optimized weight vector and adjacency matrix.
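Step 7 of Algorithm 3 computes a Laplacian from the current adjacency matrix. The defining formula did not survive in the text, so the sketch below assumes the standard unnormalized form, L = D - S, where D is the diagonal degree matrix. It also checks the usual identity that makes the Laplacian a smoothness penalty: for symmetric S, w^T L w equals half the weighted sum of squared differences between entries of w. All function names are hypothetical.

```python
def graph_laplacian(S):
    """Unnormalized Laplacian L = D - S, with D the diagonal degree matrix."""
    n = len(S)
    degrees = [sum(row) for row in S]
    return [[(degrees[i] if i == j else 0.0) - S[i][j] for j in range(n)]
            for i in range(n)]

def quadratic_form(w, L):
    """Compute w^T L w."""
    n = len(w)
    return sum(w[i] * L[i][j] * w[j] for i in range(n) for j in range(n))

def pairwise_smoothness(w, S):
    """Compute 0.5 * sum_ij S_ij * (w_i - w_j)^2."""
    n = len(w)
    return 0.5 * sum(S[i][j] * (w[i] - w[j]) ** 2
                     for i in range(n) for j in range(n))

# Toy symmetric adjacency matrix over three training samples.
S = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
w = [0.5, 0.3, 0.2]
L = graph_laplacian(S)
print(quadratic_form(w, L), pairwise_smoothness(w, S))
```

The penalty is small when strongly connected samples receive similar weights, which is how a manifold-preserving term encourages neighboring training samples to be weighted alike.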
4. Experimental Studies
4.1. Comparative Methods and Parameter Settings
4.2. The Benchmark Datasets
4.3. Evaluation Metrics
4.4. Generalization Capability Analysis
- KMOLNN outperformed the other methods in both accuracy and Macro F1-score on the majority of datasets; in accuracy, it ranked first on 10 of the 15 datasets. For instance, on datasets with complex structures, such as Sonar and Ionosphere, the proposed KMOLNN method achieved remarkable performance. Specifically, on the Sonar dataset, KMOLNN achieved an accuracy of 90.71%, surpassing the runner-up MVAKNN (90.47%) and significantly outperforming the traditional KNN (82.61%). On the Pendigits dataset (with 16 dimensions and 10 classes), KMOLNN achieved an accuracy of 99.49%, outperforming MVAKNN’s 99.09% and Elastic Net’s 97.79%. This highlights the efficacy of kernelized manifold optimization at capturing complex data manifolds and enhancing generalization capability.
- While advanced methods like MVAKNN and SSL-KNN were strongly competitive, KMOLNN maintained the leading position. MVAKNN emerged as the strongest competitor, achieving the best results on datasets such as Wine and Vowel. However, KMOLNN provided a more robust mechanism overall, particularly on larger or more diverse datasets. For example, on the Letter Recognition dataset (sample size 20,000), KMOLNN attained an accuracy of 98.30%, distinctly higher than that of MVAKNN (97.80%) and Elastic Net (95.10%). In terms of overall mean accuracy across all 15 datasets, KMOLNN achieved approximately 90.76%, whereas the traditional KNN scored 84.23%, representing a substantial improvement.
- The Macro F1-score results further confirm the balanced precision–recall capabilities of KMOLNN. On the datasets with potential multi-class challenges, such as Ecoli (8 classes) and Yeast (10 classes), KMOLNN maintained superior Macro F1-scores. For instance, on the Yeast dataset, KMOLNN achieved a Macro F1-score of 63.6%, which was the highest among all methods, surpassing MVAKNN (62.5%) and significantly outperforming KNN (50.5%). The mean Macro F1-score of KMOLNN across all datasets reached 88.62%, demonstrating that the implicit Bayesian decision rule in our method effectively handles class separability even in difficult scenarios.
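To make the two reported metrics concrete, the following sketch computes accuracy and the Macro F1-score from true and predicted labels using their standard definitions. The function names are hypothetical, and the convention of scoring a class as 0 when it has no true positives, false positives, or false negatives is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 over the classes in y_true."""
    f1_scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2 * TP / (2 * TP + FP + FN); defined as 0 if the class is empty.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Imbalanced toy example: four samples of class 0 and two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Because every class contributes equally to the Macro F1-score, a method that neglects small classes is penalized more than it would be under accuracy alone, which is why the two metrics can rank methods differently on datasets such as Yeast.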
4.5. Runtime Analysis
4.6. Statistical Analysis
- Contrary to the assumption that simple linear models might suffice, KMOLNN demonstrates a statistically significant advantage over KNN, FKNN, PRKNN, Elastic Net, and LCDR-KNN. Notably, the performance gap is widest between KMOLNN (average rank 1.83) and the standard KNN (average rank 8.4), with a rank difference of 6.57, well above the critical difference (CD) of 2.7243. This statistically validates that our kernelization and manifold-preserving strategies successfully overcome the limitations of traditional linear and neighbor-based methods.
- When compared to the most advanced methods (MVAKNN, SSL-KNN, and WLMRKNN), the rank differences fall below the critical difference, so the null hypothesis of equal performance is not rejected. It is particularly noteworthy that KMOLNN ties with MVAKNN for the top average rank (both at 1.83). This indicates that while KMOLNN is not statistically superior to these top-tier methods, it reaches a comparable level, offering a highly competitive alternative that matches the performance of complex multi-view and style-based approaches.
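The rank-based comparison above can be reproduced in outline as follows. The sketch ranks the methods on one dataset with tied scores sharing the mean rank, and computes a critical difference from the usual formula CD = q * sqrt(k(k + 1) / (6N)) for k methods and N datasets. The critical value `q_alpha = 2.724` is an assumption: the text does not name the post hoc test, and this value is chosen because it reproduces the reported CD for nine methods and 15 datasets and appears to match the Bonferroni-Dunn value tabulated by Demšar [39].

```python
import math

def average_ranks(scores):
    """Rank scores so that the highest gets rank 1; ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def critical_difference(q_alpha, n_methods, n_datasets):
    """CD = q_alpha * sqrt(k * (k + 1) / (6 * N))."""
    return q_alpha * math.sqrt(n_methods * (n_methods + 1) / (6.0 * n_datasets))

# Iris accuracies in the column order of the accuracy table (KNN ... KMOLNN).
iris = [0.9600, 0.9733, 0.9533, 0.9667, 0.9733, 0.9733, 0.9800, 0.9867, 0.9733]
iris_ranks = average_ranks(iris)
cd = critical_difference(q_alpha=2.724, n_methods=9, n_datasets=15)
print(iris_ranks, cd)
```

With nine methods and 15 datasets the square-root factor equals 1, so the critical difference coincides with the critical value itself. A pair of methods is declared significantly different only when their average ranks differ by more than this amount.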
4.7. Limitations and Future Work
4.8. Qualitative Analysis of Interpretability
4.9. Ablation Study
4.10. Parameter Sensitivity and Robustness
4.10.1. Interaction of and
4.10.2. Interaction of and
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
References
1. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
2. Altman, N.S. An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. Am. Stat. 1992, 46, 175–185.
3. Jiang, J.; Wu, J.; Luo, J.; Meng, X.; Qian, L.; Li, K. KATSA: KNN Ameliorated Tree Seed Algorithm for complex optimization problems. Expert Syst. Appl. 2025, 280, 127465.
4. Gou, J.; Qiu, W.; Yi, Z.; Shen, X.; Zhan, Y.; Ou, W. Locality constrained representation-based K-nearest neighbor classification. Knowl. Based Syst. 2019, 167, 38–52.
5. He, X.; Deng, K.; Wang, X.; Li, Y.; Zhang, Y.; Wang, M. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval; ACM: New York, NY, USA, 2020.
6. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
7. Zhong, L.W.; Kwok, J.T. Efficient sparse modeling with automatic feature grouping. IEEE Trans. Neural Netw. Learn. Syst. 2012, 23, 1436–1447.
8. Wang, J.; Yang, J.; Yu, K.; Lv, F.; Huang, T.; Gong, Y. Locality-constrained linear coding for image classification. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2010; pp. 3360–3367.
9. Liu, Q.; Liu, C. A novel locally linear KNN method with applications to visual recognition. IEEE Trans. Neural Netw. Learn. Syst. 2016, 28, 2010–2021.
10. Xu, Y.-L.; Chen, S.; Luo, B. A weighted locally linear KNN model for image recognition. In Proceedings of the CCF Chinese Conference on Computer Vision; Springer: Singapore, 2017; pp. 567–578.
11. Zhang, S.; Zong, M.; Sun, K.; Liu, Y.; Cheng, D. Efficient kNN algorithm based on graph sparse reconstruction. In Proceedings of the International Conference on Advanced Data Mining and Applications; Springer: Cham, Switzerland, 2014; pp. 356–369.
12. Zhang, S.; Li, J. KNN classification with one-step computation. IEEE Trans. Knowl. Data Eng. 2021, 35, 2711–2723.
13. Cao, J.; Li, Z.; Li, J. Financial time series forecasting model based on CEEMDAN and LSTM. Phys. A Stat. Mech. Its Appl. 2019, 519, 127–139.
14. Min, S.; Lee, B.; Yoon, S. Deep learning in bioinformatics. Brief. Bioinform. 2017, 18, 851–869.
15. Bian, Z.; Zhang, J.; Chung, F.L.; Wang, S. Residual Sketch Learning for a Feature-Importance-Based and Linguistically Interpretable Ensemble Classifier. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 10461–10474.
16. Venkata Krishna Reddy, V.; Vijaya Kumar Reddy, R.; Siva Krishna Munaga, M.; Karnam, B.; Maddila, S.K.; Sekhar Kolli, C. Deep learning-based credit card fraud detection in federated learning. Expert Syst. Appl. 2024, 255, 124493.
17. Zhang, S.; Li, X.; Zong, M.; Zhu, X.; Wang, R. Efficient kNN Classification With Different Numbers of Nearest Neighbors. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 1774–1785.
18. McDonald, G.C. Ridge regression. Wiley Interdiscip. Rev. Comput. Stat. 2010, 1, 93–100.
19. Liu, Q.; Liu, C. A novel locally linear KNN model for visual recognition. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2015; pp. 1329–1337.
20. Zhao, Y.; Liu, Y.; Liu, X.; Zou, E. Local Centroid Distance Constrained Representation-Based K-Nearest Neighbor Classifier. In Proceedings of the China Conference on Wireless Sensor Networks; Springer: Singapore, 2020.
21. Li, B.; Chen, Y.W.; Chen, Y.Q. The Nearest Neighbor Algorithm of Local Probability Centers. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2008, 38, 141–154.
22. Mullick, S.S.; Datta, S.; Das, S. Adaptive Learning-Based k-Nearest Neighbor Classifiers With Resilience to Class Imbalance. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 5713–5725.
23. Wang, J.; Neskovic, P.; Cooper, L.N. Neighborhood size selection in the k-nearest-neighbor rule using statistical confidence. Pattern Recognit. 2006, 39, 417–423.
24. Manocha, S.; Girolami, M.A. An empirical analysis of the probabilistic K-nearest neighbour classifier. Pattern Recognit. Lett. 2007, 28, 1818–1824.
25. Cheng, D.; Zhang, S.; Deng, Z.; Zhu, Y.; Zong, M. kNN algorithm with data-driven k value. In Proceedings of the International Conference on Advanced Data Mining and Applications; Springer: Cham, Switzerland, 2014; pp. 499–512.
26. Zhang, J.; Bian, Z.; Wang, S. Shared style linear k nearest neighbor classification method. Expert Syst. Appl. 2024, 241, 122702.
27. Fan, Z.; Huang, Y.; Xi, C.; Liu, Q. Multiview Adaptive K-Nearest Neighbor Classification. IEEE Trans. Artif. Intell. 2024, 5, 1221–1234.
28. Zhang, Z.; Lai, Z.; Xu, Y.; Shao, L.; Wu, J.; Xie, G.S. Discriminative Elastic-Net Regularized Linear Regression. IEEE Trans. Image Process. 2017, 26, 1466–1481.
29. Ortega, A.; Frossard, P.; Kovačević, J.; Moura, J.M.F.; Vandergheynst, P. Graph Signal Processing: Overview, Challenges, and Applications. Proc. IEEE 2018, 106, 808–828.
30. Monga, V.; Li, Y.; Eldar, Y.C. Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing. IEEE Signal Process. Mag. 2021, 38, 18–44.
31. Bousquet, O.; Elisseeff, A. Stability and generalization. J. Mach. Learn. Res. 2002, 2, 499–526.
32. Bartlett, P.L.; Mendelson, S. Rademacher and gaussian complexities: Risk bounds and structural results. J. Mach. Learn. Res. 2002, 3, 463–482.
33. Sultana, T.; Dumitrescu, S. Globally optimal max-min rate joint channel and power allocation for hybrid NOMA-OMA downlink systems. IEEE Trans. Signal Process. 2025, 73, 1674–1690.
34. Bian, Z.; Chung, F.-L.; Wang, S. Enhanced fuzzy random forest by using doubly randomness and copying from dynamic dictionary attributes. IEEE Trans. Fuzzy Syst. 2022, 30, 4369–4383.
35. Xie, J.; Girshick, R.; Farhadi, A. Unsupervised deep embedding for clustering analysis. In Proceedings of the International Conference on Machine Learning; PMLR: Cambridge, MA, USA, 2016; pp. 478–487.
36. Keller, J.M.; Gray, M.R.; Givens, J.A. A fuzzy K-nearest neighbor algorithm. IEEE Trans. Syst. Man Cybern. 1985, SMC-15, 580–585.
37. Amer, A.A.; Ravana, S.D.; Habeeb, R.A.A. Effective k-nearest neighbor models for data classification enhancement. J. Big Data 2025, 12, 86.
38. Li, G.; Jung, J.J. Deep learning for anomaly detection in multivariate time series: Approaches, applications, and challenges. Inf. Fusion 2023, 91, 93–102.
39. Demšar, J. Statistical Comparisons of Classifiers over Multiple Data Sets. J. Mach. Learn. Res. 2006, 7, 1–30.




| Symbol | Description |
|---|---|
| | The set of training samples |
| | The testing sample |
| | The representation weight vector for the testing sample |
| | The nonlinear mapping function (kernel mapping) |
| | The kernel function (e.g., Gaussian kernel) |
| | The Laplacian matrix |
| | The adaptive adjacency matrix |
| | The degree matrix of the graph |
| | Regularization coefficients for different terms |
| | The hyperparameter controlling the kernel width |
| Method | Objective Function | Kernel Usage | Graph/Manifold Reg. | Theoretical Basis |
|---|---|---|---|---|
| LLKNN | Distance constrained | No | No | Bayesian rule |
| WLMRKNN | Local mean repr. | No | No | Weighted neighbors |
| GS-KNN | Sparse repr. | No | Sparse graph | Correlation matrix |
| MVAKNN | Multi-view adaptive | Yes | Laplacian reg. | Multi-view learning |
| KMOLNN (Ours) | Kernel-based error | Yes | Adaptive Laplacian | Group effect & Bayesian |
| Methods | Objective Functions | Prediction Rules |
|---|---|---|
| KNN | - | - |
| FKNN [36] | - | - |
| PRKNN [37] | - | - |
| Elastic Net [7] | | |
| WLMRKNN [4] | | |
| LCDR-KNN [20] | | |
| SSL-KNN [26] | | |
| MVAKNN [27] | | |
| KMOLNN | | |
| Method | Year | Key Idea | Kernelization | Manifold Learning | Adaptive Weighting |
|---|---|---|---|---|---|
| KNN [1] | 1967 | Majority voting based on Euclidean distance | No | No | No |
| FKNN [36] | 1985 | Fuzzy membership-based weighted voting | No | No | No |
| Elastic Net [7] | 2012 | Sparse modeling with ℓ1 and ℓ2 regularization | No | No | No |
| WLMRKNN [4] | 2019 | Local mean representation with weighted neighbors | No | No | Yes |
| LCDR-KNN [20] | 2020 | Local centroid distance-constrained representation | No | No | Yes |
| SSL-KNN [26] | 2024 | Shared style multi-view linear classification | Yes | No | Yes |
| MVAKNN [27] | 2024 | Multi-view adaptive k-nearest neighbor classification | Yes | Yes | Yes |
| PRKNN [37] | 2025 | Proximal ratio-based noise and overlap handling | No | No | No |
| KMOLNN | 2026 | Kernelized manifold-optimized linear KNN | Yes | Yes | Yes |
| No. | Dataset | Number of Samples | Number of Dimensions | Number of Classes |
|---|---|---|---|---|
| 1 | Sonar | 208 | 60 | 2 |
| 2 | Wine | 178 | 13 | 3 |
| 3 | Iris | 150 | 4 | 3 |
| 4 | Breast Cancer Wisconsin | 569 | 30 | 2 |
| 5 | Pima Indians Diabetes | 768 | 8 | 2 |
| 6 | Glass Identification | 214 | 9 | 6 |
| 7 | Ionosphere | 351 | 34 | 2 |
| 8 | Heart Disease (Cleveland) | 303 | 13 | 2 |
| 9 | Vowel | 990 | 10 | 11 |
| 10 | Ecoli | 336 | 7 | 8 |
| 11 | Yeast | 1484 | 8 | 10 |
| 12 | Pendigits | 10,992 | 16 | 10 |
| 13 | Satimage | 6435 | 36 | 6 |
| 14 | Vehicle Silhouettes | 846 | 18 | 4 |
| 15 | Letter Recognition | 20,000 | 16 | 26 |
| Dataset | KNN | FKNN | PRKNN | Elastic Net | LCDR-KNN | WLMRKNN | SSL-KNN | MVAKNN | KMOLNN |
|---|---|---|---|---|---|---|---|---|---|
| Sonar | 0.8262 | 0.8548 | 0.7643 | 0.8429 | 0.8857 | 0.8786 | 0.8929 | 0.9048 | 0.9071 |
| Wine | 0.9667 | 0.9861 | 0.9944 | 0.9833 | 0.9889 | 0.9917 | 0.9889 | 0.9944 | 0.9917 |
| Iris | 0.9600 | 0.9733 | 0.9533 | 0.9667 | 0.9733 | 0.9733 | 0.9800 | 0.9867 | 0.9733 |
| Breast Cancer Wisconsin | 0.9561 | 0.9754 | 0.9684 | 0.9719 | 0.9781 | 0.9807 | 0.9798 | 0.9851 | 0.9754 |
| Pima Indians Diabetes | 0.7247 | 0.7597 | 0.7649 | 0.7779 | 0.7682 | 0.7721 | 0.7753 | 0.7818 | 0.7948 |
| Glass Identification | 0.7047 | 0.7442 | 0.6209 | 0.7372 | 0.7512 | 0.7605 | 0.7651 | 0.7860 | 0.7977 |
| Ionosphere | 0.8629 | 0.9057 | 0.8857 | 0.8914 | 0.9186 | 0.9257 | 0.9357 | 0.9414 | 0.9529 |
| Heart Disease | 0.8148 | 0.8475 | 0.8377 | 0.8443 | 0.8623 | 0.8754 | 0.8705 | 0.8902 | 0.9115 |
| Vowel | 0.9051 | 0.9449 | 0.8942 | 0.9212 | 0.9621 | 0.9581 | 0.9601 | 0.9848 | 0.9747 |
| Ecoli | 0.8119 | 0.8507 | 0.7851 | 0.8448 | 0.8657 | 0.8716 | 0.8687 | 0.8910 | 0.9045 |
| Yeast | 0.5650 | 0.6051 | 0.5822 | 0.5980 | 0.6209 | 0.6350 | 0.6300 | 0.6519 | 0.6721 |
| Pendigits | 0.9720 | 0.9810 | 0.8950 | 0.9780 | 0.9920 | 0.9890 | 0.9900 | 0.9910 | 0.9950 |
| Satimage | 0.9060 | 0.9200 | 0.8420 | 0.9150 | 0.9350 | 0.9380 | 0.9420 | 0.9390 | 0.9470 |
| Vehicle Silhouettes | 0.7249 | 0.7852 | 0.6479 | 0.7621 | 0.8012 | 0.8148 | 0.8201 | 0.8450 | 0.8337 |
| Letter Recognition | 0.9420 | 0.9580 | 0.7650 | 0.9510 | 0.9650 | 0.9700 | 0.9720 | 0.9780 | 0.9830 |
| W/T/L (Ours vs. competing) | 15/0/0 | 13/2/0 | 14/0/1 | 15/0/0 | 13/1/1 | 12/2/1 | 13/0/2 | 10/0/5 | - |
| Dataset | KNN | FKNN | PRKNN | Elastic Net | LCDR-KNN | WLMRKNN | SSL-KNN | MVAKNN | KMOLNN |
|---|---|---|---|---|---|---|---|---|---|
| Sonar | 0.8143 | 0.8476 | 0.7571 | 0.8357 | 0.8786 | 0.8714 | 0.8857 | 0.9000 | 0.9000 |
| Wine | 0.9639 | 0.9833 | 0.9944 | 0.9833 | 0.9889 | 0.9889 | 0.9861 | 0.9861 | 0.9944 |
| Iris | 0.9567 | 0.9733 | 0.9533 | 0.9667 | 0.9733 | 0.9733 | 0.9800 | 0.9867 | 0.9867 |
| Breast Cancer Wisconsin | 0.9482 | 0.9693 | 0.9623 | 0.9649 | 0.9719 | 0.9763 | 0.9754 | 0.9807 | 0.9860 |
| Pima Indians Diabetes | 0.6851 | 0.7299 | 0.7240 | 0.7448 | 0.7383 | 0.7448 | 0.7500 | 0.7578 | 0.7669 |
| Glass Identification | 0.6256 | 0.6953 | 0.5070 | 0.6814 | 0.7093 | 0.7256 | 0.7302 | 0.7558 | 0.7323 |
| Ionosphere | 0.8457 | 0.8914 | 0.8714 | 0.8786 | 0.9057 | 0.9143 | 0.9286 | 0.9357 | 0.9286 |
| Heart Disease | 0.7951 | 0.8328 | 0.8197 | 0.8279 | 0.8443 | 0.8607 | 0.8574 | 0.8754 | 0.8393 |
| Vowel | 0.8848 | 0.9318 | 0.9220 | 0.9081 | 0.9551 | 0.9480 | 0.9520 | 0.9798 | 0.9712 |
| Ecoli | 0.7522 | 0.8104 | 0.6851 | 0.7955 | 0.8254 | 0.8343 | 0.8299 | 0.8627 | 0.8761 |
| Yeast | 0.5051 | 0.5650 | 0.4822 | 0.5519 | 0.5801 | 0.6051 | 0.5949 | 0.6249 | 0.6360 |
| Pendigits | 0.9680 | 0.9790 | 0.8820 | 0.9750 | 0.9900 | 0.9860 | 0.9880 | 0.9890 | 0.9730 |
| Satimage | 0.8850 | 0.9050 | 0.8150 | 0.8980 | 0.9200 | 0.9250 | 0.9300 | 0.9280 | 0.9470 |
| Vehicle Silhouettes | 0.7018 | 0.7680 | 0.6148 | 0.7450 | 0.7852 | 0.8018 | 0.8083 | 0.8320 | 0.8278 |
| Letter Recognition | 0.9380 | 0.9540 | 0.7420 | 0.9480 | 0.9610 | 0.9680 | 0.9700 | 0.9760 | 0.9670 |
| Dataset | KNN | FKNN | PRKNN | Elastic Net | LCDR-KNN | WLMRKNN | SSL-KNN | MVAKNN | KMOLNN |
|---|---|---|---|---|---|---|---|---|---|
| Sonar | 0.0012 | 0.0015 | 0.0030 | 0.0255 | 0.0312 | 0.0280 | 4.9155 | 0.0595 | 2.1886 |
| Wine | 0.0010 | 0.0013 | 0.0027 | 0.0210 | 0.0291 | 0.0255 | 3.1367 | 0.0528 | 2.5487 |
| Iris | 0.0079 | 0.0011 | 0.0023 | 0.0207 | 0.0239 | 0.0211 | 3.8646 | 0.0492 | 1.9123 |
| Breast Cancer | 0.0015 | 0.0018 | 0.0037 | 0.0297 | 0.0368 | 0.0340 | 5.0191 | 0.0712 | 2.9516 |
| Pima Indians | 0.0018 | 0.0021 | 0.0046 | 0.0363 | 0.0440 | 0.0407 | 5.7688 | 0.0807 | 3.5123 |
| Glass | 0.0011 | 0.0014 | 0.0028 | 0.0221 | 0.0307 | 0.0282 | 4.9450 | 0.0550 | 2.0859 |
| Ionosphere | 0.0013 | 0.0016 | 0.0031 | 0.0274 | 0.0333 | 0.0304 | 4.0524 | 0.0621 | 2.3719 |
| Heart Disease | 0.0014 | 0.0018 | 0.0033 | 0.0267 | 0.0347 | 0.0333 | 4.0825 | 0.0702 | 2.3566 |
| Vowel | 0.0052 | 0.0038 | 0.0079 | 0.0411 | 0.0491 | 0.0469 | 6.7292 | 0.0936 | 4.1845 |
| Ecoli | 0.0016 | 0.0020 | 0.0038 | 0.0333 | 0.0402 | 0.0380 | 7.7151 | 0.0794 | 3.6233 |
| Yeast | 0.0054 | 0.0058 | 0.0118 | 0.0510 | 0.0641 | 0.0575 | 9.4203 | 0.1163 | 5.6277 |
| Pendigits | 0.0148 | 0.0209 | 0.0405 | 0.1009 | 0.1235 | 0.1155 | 19.9106 | 0.2368 | 10.3476 |
| Satimage | 0.0129 | 0.0149 | 0.0304 | 0.0904 | 0.1168 | 0.1033 | 16.9335 | 0.2273 | 8.7759 |
| Vehicle | 0.0011 | 0.0027 | 0.0057 | 0.0443 | 0.0581 | 0.0545 | 8.1484 | 0.1087 | 4.3475 |
| Letter Rec. | 0.0174 | 0.0204 | 0.0420 | 0.1474 | 0.1911 | 0.1722 | 40.7081 | 0.3631 | 17.2134 |
| Dataset | KNN | FKNN | PRKNN | Elastic Net | LCDR-KNN | WLMRKNN | SSL-KNN | MVAKNN | KMOLNN |
|---|---|---|---|---|---|---|---|---|---|
| Sonar | 8 | 6 | 9 | 7 | 4 | 5 | 3 | 2 | 1 |
| Wine | 9 | 7 | 1.5 | 8 | 5.5 | 3.5 | 5.5 | 1.5 | 3.5 |
| Iris | 8 | 4.5 | 9 | 7 | 4.5 | 4.5 | 2 | 1 | 4.5 |
| Breast Cancer Wisconsin | 9 | 5.5 | 8 | 7 | 4 | 2 | 3 | 1 | 5.5 |
| Pima Indians Diabetes | 9 | 8 | 7 | 3 | 6 | 5 | 4 | 2 | 1 |
| Glass Identification | 8 | 6 | 9 | 7 | 5 | 4 | 3 | 2 | 1 |
| Ionosphere | 9 | 6 | 8 | 7 | 5 | 4 | 3 | 2 | 1 |
| Heart Disease | 9 | 6 | 8 | 7 | 5 | 3 | 4 | 2 | 1 |
| Vowel | 8 | 6 | 9 | 7 | 3 | 5 | 4 | 1 | 2 |
| Ecoli | 8 | 6 | 9 | 7 | 5 | 3 | 4 | 2 | 1 |
| Yeast | 9 | 6 | 8 | 7 | 5 | 3 | 4 | 2 | 1 |
| Pendigits | 8 | 6 | 9 | 7 | 2 | 5 | 4 | 3 | 1 |
| Satimage | 8 | 6 | 9 | 7 | 5 | 4 | 2 | 3 | 1 |
| Vehicle Silhouettes | 8 | 6 | 9 | 7 | 5 | 4 | 3 | 1 | 2 |
| Letter Recognition | 8 | 6 | 9 | 7 | 5 | 4 | 3 | 2 | 1 |
| Average ranking | 8.4 | 6.07 | 8.1 | 6.8 | 4.6 | 3.93 | 3.43 | 1.83 | 1.83 |
| Method | Average Rank Difference | CD | Null Hypothesis |
|---|---|---|---|
| KNN/KMOLNN | 6.57 | 2.724 | Reject |
| FKNN/KMOLNN | 4.24 | 2.724 | Reject |
| PRKNN/KMOLNN | 6.27 | 2.724 | Reject |
| Elastic Net/KMOLNN | 4.97 | 2.724 | Reject |
| LCDR-KNN/KMOLNN | 2.77 | 2.724 | Reject |
| WLMRKNN/KMOLNN | 2.1 | 2.724 | Accept |
| SSL-KNN/KMOLNN | 1.6 | 2.724 | Accept |
| MVAKNN/KMOLNN | 0 | 2.724 | Accept |
| Dataset | Metric | KMOLNN (Linear) | KMOLNN (w/o Manifold)/KKNN | KMOLNN (Fixed Graph) | KMOLNN (Full) |
|---|---|---|---|---|---|
| Sonar | Accuracy | 0.8125 | 0.8654 | 0.881 | 0.9071 |
| Sonar | Macro F1 | 0.795 | 0.851 | 0.875 | 0.900 |
| Ionosphere | Accuracy | 0.8714 | 0.9125 | 0.931 | 0.9529 |
| Ionosphere | Macro F1 | 0.858 | 0.895 | 0.91 | 0.9286 |
| Wine | Accuracy | 0.9611 | 0.9722 | 0.9833 | 0.9917 |
| Wine | Macro F1 | 0.958 | 0.975 | 0.985 | 0.9944 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Zhang, J.; Bian, Z.; Zhang, L.; Wang, F. Kernelized Manifold-Optimized Linear KNN for Nonlinear Data Classification. Electronics 2026, 15, 2572. https://doi.org/10.3390/electronics15122572

