Comparative Analysis of Machine Learning–Kriging Integrative Approaches for Enhanced Spatial Prediction of Mineral Exploration Data
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area and Dataset
2.2. Integrated ML–GS Prediction Model
2.2.1. ML Prediction Model
2.2.2. GS Interpolation Method
2.2.3. Integrated ML–GS Model
- Trend prediction: In each outer fold, the deterministic component was represented by the ML model selected and tuned as described in Section 2.2.1. The model was fitted using the training folds only, and predictions were generated for the held-out fold and for spatial mapping over the study area.
- Residual modeling and kriging: An empirical variogram was constructed from the cross-fitted training residuals and fitted using the same variogram-selection procedure described in Section 2.2.2. The resulting OK or UK model was then used to predict the residual component for the held-out fold and to generate the residual surface over the study area.
- Integration: Final predictions were obtained by adding the kriged residual surface to the ML trend prediction, thereby combining nonlinear trend learning from ML with spatial error correction from GS.
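The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: a least-squares linear model stands in for the tuned ML backbone, the exponential variogram parameters are set by hand rather than selected by the procedure of Section 2.2.2, residuals are computed in-sample rather than cross-fitted, and all function names (`exp_variogram`, `ordinary_kriging`, `fit_linear_trend`, `hybrid_predict`) are hypothetical.

```python
import numpy as np

def exp_variogram(h, sill=1.0, range_=50.0, nugget=0.0):
    """Exponential semivariogram; parameters are illustrative, not fitted."""
    return nugget + sill * (1.0 - np.exp(-3.0 * h / range_))

def ordinary_kriging(xy_obs, z_obs, xy_new, variogram=exp_variogram):
    """Ordinary kriging of z_obs at xy_new by solving the OK linear system."""
    n = len(xy_obs)
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    # Left-hand side: semivariances between observations, bordered by the
    # Lagrange row/column that forces the weights to sum to one.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d_obs)
    A[n, n] = 0.0
    # Right-hand side: semivariances between observations and targets.
    d_new = np.linalg.norm(xy_obs[:, None, :] - xy_new[None, :, :], axis=2)
    b = np.ones((n + 1, xy_new.shape[0]))
    b[:n, :] = variogram(d_new)
    weights = np.linalg.solve(A, b)[:n, :]  # drop the Lagrange multiplier
    return weights.T @ z_obs

def fit_linear_trend(X, y):
    """Stand-in for the ML trend model (assumption: plain least squares)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def hybrid_predict(xy_train, X_train, y_train, xy_new, X_new):
    """Trend prediction, residual kriging, and additive integration."""
    trend = fit_linear_trend(X_train, y_train)       # step 1: trend
    residuals = y_train - trend(X_train)             # step 2: residuals
    kriged = ordinary_kriging(xy_train, residuals, xy_new)
    return trend(X_new) + kriged                     # step 3: integration
```

With a zero nugget the kriging step is an exact interpolator, so predictions at training locations reproduce the observed values. In a cross-validation setting, `hybrid_predict` would be called once per outer fold with the training folds as inputs and the held-out fold as targets. A library solver such as PyKrige, which the reference list cites, would likely replace the hand-written system in practice.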
2.3. Performance Metrics and Spatial Cross-Validation Assessment
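As a hedged sketch of how the pooled out-of-fold (OOF) metrics could be computed, the function below derives RMSE, relRMSE, and predictive R2 from SSE and SST. The normalizer for relRMSE is an assumption: RMSE is divided by the population standard deviation of the observations. The reported RF values (RMSE 93.66, relRMSE 0.835, Al standard deviation 112.23) are consistent with that reading, but the exact definition is not reproduced here. The function name `oof_metrics` is hypothetical.

```python
import math

def oof_metrics(y_true, y_pred):
    """RMSE, relRMSE (assumed: RMSE / SD of observations), predictive R2."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # squared errors
    sst = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    rmse = math.sqrt(sse / n)
    sd = math.sqrt(sst / n)  # population SD, an assumed normalizer
    return {"rmse": rmse, "rel_rmse": rmse / sd, "r2": 1.0 - sse / sst}

# A held-out fold with little variation of its own can yield a strongly
# negative R2 even when the absolute errors are modest.
fold = oof_metrics([10.0, 11.0], [12.0, 13.0])
print(fold["rmse"], fold["r2"])
```

Under this normalization, predictive R2 equals 1 minus the squared relRMSE, so the two metrics carry the same information on a given set of points. Fold-wise values can still differ sharply from pooled values because SST is recomputed within each fold.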
3. Results
3.1. EDA of Original Dataset
3.2. Prediction of Al Concentrations Distribution
3.2.1. Standalone ML and GS Results
3.2.2. Residual Diagnostics and Hybrid Prediction Maps
3.2.3. Comparative Prediction Performance of Standalone and Integrated Models
4. Discussion
4.1. Bias-Aware Evaluation of Hybrid ML–GS Prediction Under Spatial Cross-Validation
4.2. Uncertainty Quantification, Statistical Testing, and Data Limitations
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| ML | Machine Learning |
| Al | Aluminum |
| RF | Random Forest |
| XGB | XGBoost |
| ADA | AdaBoost |
| STN | Spatial Transformer Network |
| OK | Ordinary Kriging |
| UK | Universal Kriging |
| CV | Cross-validation |
| OOF | Out-Of-Fold |
| GS | Geostatistics |
| FSE | Feature Space Enhancement |
| RK | Regression Kriging |
| NN | Neural Network |
| EDA | Exploratory Data Analysis |
| TiO2 | Titanium Dioxide |
| UTM | Universal Transverse Mercator |
| MSE | Mean Squared Error |
| RMSE | Root Mean Squared Error |
| relRMSE | Relative Root Mean Squared Error |
| Predictive R2 | Predictive Coefficient of Determination |
| SSE | Sum of Squared Residuals |
| SST | Total Sum of Squares |
Appendix A. Fold-Wise Paired Statistical Comparisons
Appendix A.1. OK Versus UK Within Each Backbone (Difference = UK − OK, n = 10 Spatial Folds)
| Backbone | Mean Diff in Fold-Wise RMSE (UK − OK) | t | p | Holm p | 95% CI |
|---|---|---|---|---|---|
| RF | −0.000 | −0.000 | 1.000 | 1.000 | [−0.003, 0.003] |
| XGB | −2.056 | −0.853 | 0.416 | 1.000 | [−7.508, 3.396] |
| ADA | −2.114 | −0.643 | 0.536 | 1.000 | [−9.550, 5.322] |
| ResNet | +6.363 | +1.894 | 0.091 | 0.544 | [−1.235, 13.961] |
| U-Net | −2.556 | −1.046 | 0.323 | 1.000 | [−8.084, 2.972] |
| STN | −2.902 | −0.367 | 0.722 | 1.000 | [−20.783, 14.979] |
| Backbone | Mean Diff in Fold-Wise R2 (UK − OK) | t | p | Holm p | 95% CI |
|---|---|---|---|---|---|
| RF | −0.00010 | −0.557 | 0.591 | 1.000 | [−0.00051, 0.00031] |
| XGB | +0.3040 | +1.266 | 0.237 | 1.000 | [−0.239, 0.847] |
| ADA | −0.1734 | −0.294 | 0.775 | 1.000 | [−1.506, 1.159] |
| ResNet | −0.7445 | −1.466 | 0.177 | 1.000 | [−1.894, 0.405] |
| U-Net | +0.5321 | +1.144 | 0.282 | 1.000 | [−0.520, 1.585] |
| STN | +0.4670 | +0.642 | 0.537 | 1.000 | [−1.178, 2.112] |
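The columns of the tables above follow the usual paired t-test arithmetic on fold-wise differences. The sketch below assumes n = 10 folds (9 degrees of freedom) and hard-codes the corresponding two-sided 95% critical value of about 2.262. The function names are hypothetical, and the p-value is omitted because it needs the Student t distribution function, which the standard library does not provide.

```python
import math
import statistics

T_CRIT_DF9 = 2.262  # two-sided 95% critical value of Student's t, 9 d.o.f.

def paired_t_summary(diffs, t_crit=T_CRIT_DF9):
    """Mean difference, t statistic, and 95% CI from fold-wise differences."""
    n = len(diffs)
    mean = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean
    return mean, mean / se, (mean - t_crit * se, mean + t_crit * se)

def ci_from_mean_and_t(mean, t_stat, t_crit=T_CRIT_DF9):
    """Recover the CI from a reported mean difference and t statistic."""
    se = abs(mean / t_stat)
    return mean - t_crit * se, mean + t_crit * se

# Consistency check against the XGB row of the first table above.
low, high = ci_from_mean_and_t(-2.056, -0.853)
print(round(low, 3), round(high, 3))
```

Applied to the XGB row (mean difference −2.056, t = −0.853), the second helper gives an interval that agrees with the tabulated [−7.508, 3.396] to rounding, which suggests the intervals are ordinary t-based intervals on the 10 fold-wise differences.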
Appendix A.2. Standalone ML Versus Hybrid Models Within Each Backbone (Difference = Hybrid − ML, n = 10 Spatial Folds)
| Comparison | Mean Diff in Fold-Wise RMSE (Hybrid − ML) | t | p | Holm p | Wilcoxon p | Holm (Wilcoxon) |
|---|---|---|---|---|---|---|
| RF vs. RF–OK | −0.003 | −0.282 | 0.785 | 1.000 | 0.750 | 1.000 |
| RF vs. RF–UK | −0.003 | −0.300 | 0.771 | 1.000 | 0.625 | 1.000 |
| XGB vs. XGB–OK | +0.546 | +0.461 | 0.656 | 1.000 | 0.375 | 1.000 |
| XGB vs. XGB–UK | −1.510 | −0.553 | 0.594 | 1.000 | 0.557 | 1.000 |
| ADA vs. ADA–OK | −4.283 | −1.280 | 0.232 | 1.000 | 0.557 | 1.000 |
| ADA vs. ADA–UK | −6.397 | −1.404 | 0.194 | 1.000 | 0.232 | 1.000 |
| ResNet vs. ResNet–OK | +3.692 | +1.061 | 0.317 | 1.000 | 0.322 | 1.000 |
| ResNet vs. ResNet–UK | +10.055 | +1.627 | 0.138 | 1.000 | 0.492 | 1.000 |
| U-Net vs. U-Net–OK | −10.201 | −3.063 | 0.0135 | 0.1486 | 0.0195 | 0.2148 |
| U-Net vs. U-Net–UK | −12.757 | −4.471 | 0.00155 | 0.0186 | 0.00391 | 0.0469 |
| STN vs. STN–OK | −1.483 | −0.352 | 0.733 | 1.000 | 0.695 | 1.000 |
| STN vs. STN–UK | −4.385 | −0.442 | 0.669 | 1.000 | 0.432 | 1.000 |
| Comparison | Mean Diff in Fold-Wise R2 (Hybrid − ML) | t | p | Holm p | Wilcoxon p | Holm (Wilcoxon) |
|---|---|---|---|---|---|---|
| RF vs. RF–OK | +0.0003 | +0.896 | 0.394 | 1.000 | 0.750 | 1.000 |
| RF vs. RF–UK | +0.0002 | +0.802 | 0.443 | 1.000 | 0.750 | 1.000 |
| XGB vs. XGB–OK | −0.2189 | −1.361 | 0.207 | 1.000 | 0.211 | 1.000 |
| XGB vs. XGB–UK | +0.0851 | +0.445 | 0.667 | 1.000 | 0.846 | 1.000 |
| ADA vs. ADA–OK | −0.0078 | −0.117 | 0.910 | 1.000 | 0.922 | 1.000 |
| ADA vs. ADA–UK | −0.1812 | −0.317 | 0.758 | 1.000 | 0.432 | 1.000 |
| ResNet vs. ResNet–OK | −0.6500 | −1.350 | 0.210 | 1.000 | 0.432 | 1.000 |
| ResNet vs. ResNet–UK | −1.3945 | −1.442 | 0.183 | 1.000 | 0.625 | 1.000 |
| U-Net vs. U-Net–OK | +0.3868 | +0.531 | 0.608 | 1.000 | 0.105 | 1.000 |
| U-Net vs. U-Net–UK | +0.9189 | +1.901 | 0.0898 | 1.000 | 0.0371 | 0.445 |
| STN vs. STN–OK | −0.3622 | −1.725 | 0.119 | 1.000 | 0.193 | 1.000 |
| STN vs. STN–UK | +0.1048 | +0.156 | 0.879 | 1.000 | 0.322 | 1.000 |
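The Holm-adjusted columns can be reproduced with a short step-down routine. The sketch below assumes the correction is applied across all 12 comparisons within one table, which matches the tabulated values. The input list holds the rounded t-test p-values from the first table of Appendix A.2 in row order, and `holm_adjust` is a hypothetical helper name.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment, returned in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is scaled by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Rounded t-test p-values, first table of Appendix A.2 (RF-OK ... STN-UK).
p_ttest = [0.785, 0.771, 0.656, 0.594, 0.232, 0.194,
           0.317, 0.138, 0.0135, 0.00155, 0.733, 0.669]
holm_p = holm_adjust(p_ttest)
```

For the two U-Net comparisons this gives 12 × 0.00155 = 0.0186 and 11 × 0.0135 = 0.1485. The small gap to the tabulated 0.1486 comes from rounding of the input p-value. Every other comparison is capped at 1.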
References
1. Gu, A. Geostatistical approaches for resource estimation in mining and exploration. J. Environ. Risk Assess. Remediat. 2023, 7, 182.
2. Hack, D.R. Issues and Challenges in the Application of Geostatistics and Spatial-Data Analysis to the Characterization of Sand-And-Gravel Resources; U.S. Geological Survey: Reston, VA, USA, 2005.
3. Cellmer, R. The possibilities and limitations of geostatistical methods in real estate market analyses. Real Estate Manag. Valuat. 2014, 22, 54–62.
4. Silva, V.M. On the classification and treatment of outliers in a spatial context: A Bayesian updating approach. REM-Int. Eng. J. 2021, 74, 379–389.
5. Battalgazy, N.; Valenta, R.; Gow, P.; Spier, C.; Forbes, G. Addressing geological challenges in mineral resource estimation: A comparative study of deep learning and traditional techniques. Minerals 2023, 13, 982.
6. Sowińska-Botor, J.; Mastej, W.; Maćkowski, T. Ranking of the utility of selected geostatistical interpolation methods in conditions of highly skewed seismic data distributions: A case study of the Baltic Basin (Poland). Gospod. Surowc. Min.-Miner. Resour. Manag. 2023, 39, 149–172.
7. Heaton, M.J.; Millane, A.; Rhodes, J.S. Adjusting for spatial correlation in machine and deep learning. arXiv 2024, arXiv:2410.04312.
8. Frank, J.K.; Suesse, T.; Brenning, A. An assessment of spatial random forests for environmental mapping: The case of groundwater nitrate concentration. Environ. Model. Softw. 2025, 193, 106626.
9. Patelli, L.; Cameletti, M.; Golini, N.; Ignaccolo, R. A Path in Regression Random Forest Looking for Spatial Dependence: A Taxonomy and a Systematic Review. In Advanced Statistical Methods in Process Monitoring, Finance, and Environmental Science; Knoth, S., Okhrin, Y., Otto, P., Eds.; Springer: Cham, Switzerland, 2024.
10. Chen, L.; Ren, C.; Li, L.; Wang, Y.; Zhang, B.; Wang, Z.; Li, L. A comparative assessment of geostatistical, machine learning, and hybrid approaches for mapping topsoil organic carbon content. ISPRS Int. J. Geo-Inf. 2019, 8, 174.
11. Song, Y.Q.; Yang, L.A.; Li, B.; Hu, Y.M.; Wang, A.L.; Zhou, W.; Cui, X.S.; Liu, Y.L. Spatial prediction of soil organic matter using a hybrid geostatistical model of an extreme learning machine and ordinary kriging. Sustainability 2017, 9, 754.
12. Mohammadpour, M.; Roshan, H.; Arashpour, M.; Masoumi, H. Machine learning assisted kriging to capture spatial variability in petrophysical property modelling. Mar. Pet. Geol. 2024, 167, 106967.
13. Su, H.; Shen, W.; Wang, J.; Ali, A.; Li, M. Machine learning and geostatistical approaches for estimating aboveground biomass in Chinese subtropical forests. For. Ecosyst. 2020, 7, 64.
14. Adeniyi, O.D.; Brenning, A.; Maerker, M. Spatial prediction of soil organic carbon: Combining machine learning with residual kriging in an agricultural lowland area (Lombardy region, Italy). Geoderma 2024, 448, 116953.
15. Han, H.; Suh, J. Spatial prediction of soil contaminants using a hybrid random forest–ordinary kriging model. Appl. Sci. 2024, 14, 1666.
16. Sun, W.; Cao, S.; Cai, C.; Kong, F.; Liu, J. Biomass distribution characteristics of Picea schrenkiana var. tianschanica by integrating ordinary kriging and machine learning. In Proceedings of the 2024 Asia-Pacific Conference on Software Engineering, Social Network Analysis and Intelligent Computing (SSAIC); IEEE: New York, NY, USA, 2024; pp. 691–694.
17. Wu, Z.; Yao, F.; Zhang, J.; Liu, H. Estimating forest aboveground biomass using a combination of geographical random forest and empirical Bayesian kriging models. Remote Sens. 2024, 16, 1859.
18. Murphy, B.; Yurchak, R.; Müller, S. GeoStat-Framework/PyKrige: V1.7.2, v1.7.2; Zenodo: Geneva, Switzerland, 2024.
19. Korea Institute of Geoscience and Mineral Resources. Exploration and Utilization Technology Development for Rare Metal Resources in Korea: Excerpt from the Myeonsan Formation in the Taebaek Area (Research Report). 2022. Available online: https://www.kigam.re.kr/board.es?mid=a10704000000&bid=0028&list_no=51290&act=view (accessed on 11 April 2026).
20. Kim, Y.; Moscoso-Pinto, F.; Seo, J.; Cho, K.; Cho, J.; Lee, S.; Kim, H. Mineral processing characteristics of titanium ore mineral from Myeonsan Layer in domestic Taebaek area. Resour. Recycl. 2023, 32, 54–66.
21. Park, Y.; Rim, H.; Lim, M.; Shin, Y. The magnetic anomaly map of Korea. Geophys. Geophys. Explor. 2019, 22, 29–36.
22. Shin, Y.; Ko, I. Gravity anomaly in the Taebaeksan mineralized zone. J. Geol. Soc. Korea 2019, 55, 403–413.
23. Korea Institute of Geoscience and Mineral Resources. Geo Big Data Open Platform. Available online: https://data.kigam.re.kr/ (accessed on 16 March 2025).
24. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
25. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; ACM: New York, NY, USA, 2016; pp. 785–794.
26. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
27. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2016; pp. 770–778.
28. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III; Springer: Cham, Switzerland, 2015; pp. 234–241.
29. Jaderberg, M.; Simonyan, K.; Zisserman, A. Spatial transformer networks. Adv. Neural Inf. Process. Syst. 2015, 28.
30. Cressie, N. The origins of kriging. Math. Geol. 1990, 22, 239–252.
31. Journel, A.G.; Huijbregts, C.J. Mining Geostatistics; Academic Press: London, UK, 1976.
32. Goovaerts, P. Geostatistics for Natural Resources Evaluation; Oxford University Press: New York, NY, USA, 1997.
33. Hyndman, R.J.; Koehler, A.B. Another look at measures of forecast accuracy. Int. J. Forecast. 2006, 22, 679–688.

| References | ML Model | GS Method | Hybrid Strategy | Target |
|---|---|---|---|---|
| [10] | Support vector regression, Artificial neural network | Ordinary kriging, Geographically weighted regression | RK | Soil organic carbon |
| [11] | Extreme learning machine | Ordinary kriging | RK | Soil organic matter |
| [12] | Least squared support vector regression | Simple kriging | RK | Petrophysical property modeling |
| [13] | Random forest | Ordinary kriging, Co-kriging | RK | Aboveground biomass |
| [14] | Artificial neural network, Extreme learning machine, Random forest | Ordinary kriging | RK | Soil organic carbon |
| [15] | Random forest | Ordinary kriging | RK | Soil contaminant |
| [16] | Random forest | Ordinary kriging | RK | Forest biomass |
| [17] | Geographical random forest | Empirical Bayesian kriging | RK | Aboveground biomass |
| Model | Option and Parameter |
|---|---|
| RF | |
| XGB | |
| ADA | |
| ResNet | |
| U-Net | |
| STN | |
| Method | Option and Parameter |
|---|---|
| OK (GS-only) | |
| UK (GS-only) | |
| Hybrid RK (OK/UK) | |
| Variable | Mean | Median | Min. | Q1 | Q3 | Max. | Standard Deviation | Skewness | Kurtosis | Spearman Correlation |
|---|---|---|---|---|---|---|---|---|---|---|
| Al | 100.76 | 62.50 | 0.00 | 33.00 | 121.55 | 500.00 | 112.23 | 2.53 | 6.23 | NaN |
| Gravity anomaly | −0.58 | −1.58 | −16.62 | −8.27 | 7.46 | 19.98 | 9.54 | 0.26 | −1.00 | −0.13 |
| Magnetic anomaly | −65.98 | −69.59 | −192.22 | −94.19 | −46.00 | 114.28 | 45.20 | 0.43 | 1.44 | −0.11 |
| Distance from faults | 1565.66 | 894.43 | 0.00 | 409.23 | 2042.66 | 8228.00 | 1700.34 | 1.69 | 2.45 | −0.46 |
| Distance from deposits | 7608.10 | 7024.48 | 300.00 | 4669.57 | 9719.69 | 21,253.23 | 4265.09 | 0.76 | 0.48 | −0.08 |
| Fault density | 620.22 | 182.25 | 0.00 | 0.00 | 1003.90 | 6106.23 | 919.43 | 2.20 | 6.50 | 0.38 |
| Model | Fold-Wise RMSE (Mean ± std *) | Fold-Wise relRMSE (Mean ± std *) | Fold-Wise R2 (Mean ± std *) | OOF-ALL RMSE | OOF-ALL relRMSE | OOF-ALL R2 |
|---|---|---|---|---|---|---|
| RF | 82.09 ± 41.92 | 0.731 ± 0.373 | −0.879 ± 1.875 | 93.66 | 0.835 | 0.301 |
| XGB | 84.90 ± 41.21 | 0.756 ± 0.367 | −1.345 ± 3.187 | 95.78 | 0.853 | 0.269 |
| ADA | 96.00 ± 43.81 | 0.855 ± 0.390 | −2.326 ± 4.360 | 107.14 | 0.955 | 0.085 |
| ResNet | 95.53 ± 40.64 | 0.851 ± 0.362 | −2.656 ± 5.243 | 107.78 | 1.070 | 0.074 |
| U-Net | 92.68 ± 51.88 | 0.826 ± 0.462 | −0.935 ± 1.485 | 103.70 | 1.029 | 0.143 |
| STN | 95.00 ± 46.43 | 0.846 ± 0.414 | −1.845 ± 3.223 | 106.95 | 1.061 | 0.089 |
| Hybrid Model | Fold RMSE (Mean ± std *) | Fold relRMSE (Mean ± std *) | Fold R2 (Mean ± std *) | OOF-ALL RMSE | OOF-ALL relRMSE | OOF-ALL R2 |
|---|---|---|---|---|---|---|
| RF–OK | 82.087 ± 41.924 | 0.731 ± 0.374 | −0.879 ± 1.875 | 93.662 | 0.835 | 0.301 |
| RF–UK | 82.087 ± 41.925 | 0.731 ± 0.374 | −0.879 ± 1.875 | 93.663 | 0.835 | 0.301 |
| XGB–OK | 85.444 ± 38.382 | 0.761 ± 0.342 | −1.564 ± 3.661 | 95.009 | 0.847 | 0.281 |
| XGB–UK | 83.387 ± 40.017 | 0.743 ± 0.357 | −1.259 ± 3.153 | 93.922 | 0.837 | 0.297 |
| ADA–OK | 91.722 ± 34.934 | 0.817 ± 0.311 | −2.334 ± 4.407 | 99.360 | 0.885 | 0.213 |
| ADA–UK | 89.607 ± 38.247 | 0.798 ± 0.341 | −2.508 ± 5.969 | 98.545 | 0.878 | 0.226 |
| ResNet–OK | 86.808 ± 38.661 | 0.773 ± 0.344 | −1.286 ± 2.209 | 96.709 | 0.862 | 0.255 |
| ResNet–UK | 93.169 ± 40.864 | 0.830 ± 0.364 | −2.030 ± 3.638 | 100.171 | 0.893 | 0.200 |
| U-Net–OK | 93.889 ± 40.657 | 0.837 ± 0.362 | −2.617 ± 5.478 | 100.107 | 0.892 | 0.201 |
| U-Net–UK | 91.331 ± 42.485 | 0.814 ± 0.379 | −2.085 ± 4.171 | 99.178 | 0.884 | 0.216 |
| STN–OK | 92.498 ± 36.078 | 0.824 ± 0.321 | −2.184 ± 4.034 | 99.820 | 0.889 | 0.206 |
| STN–UK | 89.594 ± 42.806 | 0.798 ± 0.381 | −1.718 ± 3.292 | 106.779 | 0.951 | 0.092 |
| Approach | Group | Model | OOF RMSE (mg/kg) | OOF relRMSE | OOF Predictive R2 |
|---|---|---|---|---|---|
| ML | Tree-based | RF | 93.66 | 0.835 | 0.301 |
| ML | Tree-based | XGB | 95.78 | 0.853 | 0.269 |
| ML | Tree-based | ADA | 107.14 | 0.955 | 0.085 |
| ML | NN-based | ResNet | 109.25 | 0.973 | 0.049 |
| ML | NN-based | U-Net | 109.03 | 0.971 | 0.053 |
| ML | NN-based | STN | 106.64 | 0.950 | 0.094 |
| GS | | OK | 99.53 | 0.887 | 0.211 |
| GS | | UK | 107.48 | 0.958 | 0.079 |
| ML Model | RMSE (mg/kg), OK | RMSE (mg/kg), UK | relRMSE, OK | relRMSE, UK | Predictive R2, OK | Predictive R2, UK |
|---|---|---|---|---|---|---|
| RF | 93.662 | 93.663 | 0.835 | 0.835 | 0.301 | 0.301 |
| XGB | 95.009 | 93.922 | 0.847 | 0.837 | 0.281 | 0.297 |
| ADA | 99.360 | 98.545 | 0.885 | 0.878 | 0.213 | 0.226 |
| ResNet | 96.709 | 100.171 | 0.862 | 0.893 | 0.255 | 0.200 |
| U-Net | 100.107 | 99.178 | 0.892 | 0.884 | 0.201 | 0.216 |
| STN | 99.820 | 106.779 | 0.889 | 0.951 | 0.206 | 0.092 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Han, H.; Suh, J. Comparative Analysis of Machine Learning–Kriging Integrative Approaches for Enhanced Spatial Prediction of Mineral Exploration Data. ISPRS Int. J. Geo-Inf. 2026, 15, 175. https://doi.org/10.3390/ijgi15040175

