An Explainable CatBoost Model for Crater Classification Based on Digital Elevation Model
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Preparation
2.2. Variable Extraction
2.2.1. Crater Density
- Define the central region: For each central crater, define a circular region centered on the crater, with a radius n times the crater’s radius. The value of n ranges from 10 to 100, with a step size of 10.
- Filter craters with similar radii: within this circular region, identify all craters whose radii are similar to that of the central crater (within a 20% radius difference).
- Calculate Gaussian summation: compute the Gaussian-weighted value for each crater relative to the central crater and sum these values to obtain the cluster density. The formula for the Gaussian summation is as follows (Equation (1)):
- Define the rectangular region: starting from the center of the central crater, define a rectangular region along specific directions (0°, 30°, 60°, 90°, 120°, 150°). The width of the region is 40 times the radius of the central crater, and the length is 100 times the radius.
- Filter craters with similar radii: within this rectangular region, identify all craters with radii similar to that of the central crater.
- Calculate Gaussian summation: similar to cluster density, apply Gaussian weighting to the craters in each direction to compute the chain density. The formula for chain density is as follows (Equation (3)):
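
The cluster-density and chain-density steps listed above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: Equations (1) and (3) are not reproduced in this text, so the kernel `exp(-d^2 / (2 * sigma^2))`, the choice of `sigma`, the planar `(x, y, radius)` crater layout, and the reading of the rectangle as centered on the crater with its long axis along the given direction are all hypothetical.

```python
import math
from typing import List, Tuple

# Assumed layout: (x, y, radius) in a planar coordinate system.
Crater = Tuple[float, float, float]


def similar_radius(r_center: float, r_other: float, tol: float = 0.2) -> bool:
    """True when the radii differ by at most tol (20% in the text) of the central radius."""
    return abs(r_other - r_center) <= tol * r_center


def gaussian_weight(distance: float, sigma: float) -> float:
    """Assumed Gaussian kernel; the paper's exact formula is not shown in this text."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2))


def cluster_density(craters: List[Crater], i: int, n: int = 10) -> float:
    """Gaussian-weighted count of similar-sized craters within n radii of crater i."""
    cx, cy, cr = craters[i]
    region_radius = n * cr
    sigma = region_radius / 2.0  # hypothetical choice of kernel width
    total = 0.0
    for j, (x, y, r) in enumerate(craters):
        if j == i:
            continue
        d = math.hypot(x - cx, y - cy)
        if d <= region_radius and similar_radius(cr, r):
            total += gaussian_weight(d, sigma)
    return total


def chain_density(craters: List[Crater], i: int, angle_deg: float,
                  length_factor: float = 100.0, width_factor: float = 40.0) -> float:
    """Gaussian-weighted count inside a rectangle aligned with angle_deg.

    Assumption: the rectangle is centered on crater i.
    """
    cx, cy, cr = craters[i]
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)
    half_len = length_factor * cr / 2.0
    half_wid = width_factor * cr / 2.0
    sigma = half_len / 2.0  # hypothetical choice of kernel width
    total = 0.0
    for j, (x, y, r) in enumerate(craters):
        if j == i:
            continue
        dx, dy = x - cx, y - cy
        along = dx * ux + dy * uy      # coordinate along the rectangle's long axis
        across = -dx * uy + dy * ux    # coordinate across the rectangle
        if abs(along) <= half_len and abs(across) <= half_wid and similar_radius(cr, r):
            total += gaussian_weight(math.hypot(dx, dy), sigma)
    return total
```

Calling `cluster_density` for n = 10, 20, ..., 100 and `chain_density` for each of the six directions would yield one feature per radius multiple and per direction for every central crater.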
2.2.2. Depth-to-Diameter Ratio
2.2.3. Voronoi Diagram
2.2.4. Slope
2.3. CatBoost for Crater Classification
2.4. Shapley Additive Explanations (SHAP)
2.5. Partial Dependence Plot (PDP)
2.6. Details of Training and Test Sets
3. Results
3.1. Comparison of Training and Test Set Metrics
3.2. Test Set Evaluation Results
3.3. Variable Ablation Study
4. Discussion
4.1. Interpretation of CatBoost Model with SHAP
4.1.1. SHAP Summary Plot
4.1.2. SHAP Dependence Plot
4.1.3. SHAP Decision Plot and Analysis of Misclassified Samples
4.2. Comparison of the Variable Importance Derived from CatBoost and SHAP
4.3. Partial Dependence Plot
4.4. Effectiveness Verification of Label Correction Approach
- Sampling Class 1 Instances: For each fold in the cross-validation, 10% of the samples originally labeled as class 1 (secondary craters), which were not modified during the cross-validation process, are randomly selected. The labels of these randomly selected samples are then changed from class 1 to class 0 (primary craters). These samples are used as a representative subset for further evaluation.
- Model Training and Prediction: A classifier is trained on the remaining portion of the data, excluding the selected 10%, and predictions are made on the high-confidence class 1 samples.
- Confidence Thresholding: The CatBoost model assigns a sample to class 1 (secondary crater) when its predicted class 1 probability exceeds 0.5. Accordingly, samples whose predicted probability is greater than 0.5 are treated as high-confidence predictions.
- Calculation of High-Confidence Ratio: For each fold, the ratio of high-confidence samples to the total number of samples in the 10% subset is computed. This ratio quantifies how well the cross-validation strategy recovers secondary craters whose labels were deliberately changed, and therefore how effectively the label correction approach can identify and correct mislabeled or missed craters.
- Average High-Confidence Ratio Across Folds: After evaluating each fold, the average ratio of high-confidence samples across all folds is calculated. This provides an aggregated measure of the overall impact of the pseudo-labeling approach on model performance.
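
The verification procedure above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the function names, the `fit_predict` callable signature, and the use of repeated random draws in place of the paper's fold structure are hypothetical, and a toy nearest-centroid scorer stands in for the CatBoost classifier so that the example runs with the standard library only.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical signature: train on (X_train, y_train), return P(class 1) for X_eval.
FitPredict = Callable[[List[List[float]], List[int], List[List[float]]], List[float]]


def centroid_fit_predict(X_train, y_train, X_eval):
    """Toy stand-in for CatBoost: scores by distance to class means on feature 0."""
    m0 = sum(x[0] for x, y in zip(X_train, y_train) if y == 0) / y_train.count(0)
    m1 = sum(x[0] for x, y in zip(X_train, y_train) if y == 1) / y_train.count(1)
    # Logistic of the distance difference: closer to the class 1 mean gives p > 0.5.
    return [1.0 / (1.0 + math.exp(-(abs(x[0] - m0) - abs(x[0] - m1)))) for x in X_eval]


def high_confidence_ratio(X: Sequence[List[float]], y: Sequence[int],
                          fit_predict: FitPredict, n_folds: int = 5,
                          flip_fraction: float = 0.1, threshold: float = 0.5,
                          seed: int = 0) -> float:
    """Average share of deliberately flipped class 1 samples recovered with p > threshold."""
    rng = random.Random(seed)
    class1_idx = [i for i, label in enumerate(y) if label == 1]
    n_flip = max(1, round(flip_fraction * len(class1_idx)))
    ratios = []
    for _ in range(n_folds):
        # Select 10% of class 1 samples; these play the role of the relabeled subset.
        held_out = sorted(rng.sample(class1_idx, n_flip))
        held_out_set = set(held_out)
        # Train on the remaining data, excluding the selected subset.
        train_idx = [i for i in range(len(y)) if i not in held_out_set]
        probs = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in held_out])
        # Count predictions above the 0.5 threshold and normalise by subset size.
        ratios.append(sum(p > threshold for p in probs) / n_flip)
    return sum(ratios) / len(ratios)
```

In practice `fit_predict` would wrap the trained classifier's probability output; the toy scorer here only demonstrates the bookkeeping of selection, training on the remainder, thresholding, and averaging.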
4.5. Comparison with Previous Work
4.6. Considerations Regarding Small Craters
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A

Crater Type | Kepler Crater | Copernicus Crater |
---|---|---|
Primary craters | 2028 | 17,815 |
Secondary craters | 1255 | 4974 |
Total | 3283 | 22,789 |

Study Area | Training Set: Primary Crater | Training Set: Secondary Crater | Test Set: Primary Crater | Test Set: Secondary Crater |
---|---|---|---|---|
Kepler | 1838 | 1154 | 190 | 101 |
Copernicus | 15,056 | 4113 | 2759 | 861 |

Parameter | Value |
---|---|
iterations | 100 |
learning_rate | 0.1 |
class_weights | {0:1.0, 1:2.5} |
depth | 10 |
loss_function | Logloss |
random_seed | 123 |
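
As a hedged illustration, the parameters in the table above map directly onto keyword arguments of `CatBoostClassifier` in the `catboost` Python package. The sketch assumes that package is the implementation used; the helper name `build_classifier` is hypothetical, and the import is deferred so that the parameter dictionary can be inspected without `catboost` installed.

```python
# Hyperparameters copied from the table above.
CATBOOST_PARAMS = {
    "iterations": 100,
    "learning_rate": 0.1,
    "class_weights": {0: 1.0, 1: 2.5},  # class 1 (secondary craters) weighted 2.5x
    "depth": 10,
    "loss_function": "Logloss",
    "random_seed": 123,
}


def build_classifier():
    """Hypothetical helper: construct an untrained classifier with the tabulated settings."""
    from catboost import CatBoostClassifier  # assumes the catboost package is installed
    return CatBoostClassifier(**CATBOOST_PARAMS)
```

A typical use would be `model = build_classifier()` followed by `model.fit(X_train, y_train)`, where `X_train` and `y_train` are placeholders for the extracted variables and the crater labels.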

Class | Recall | Precision | F1 Score | AUPRC | Kappa (overall) | Accuracy (overall) |
---|---|---|---|---|---|---|
Primary Crater | 0.9366 | 0.9173 | 0.9268 | 0.9655 | 0.6927 | 0.8885 |
Secondary Crater | 0.7412 | 0.7922 | 0.7658 | 0.8626 | | |

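
To make the relationship between the tabulated metrics explicit, the sketch below derives recall, precision, F1 score, accuracy, and Cohen's kappa from a 2 x 2 confusion matrix. The counts are an assumption: they are back-calculated from the reported recall values and the test-set sizes (2949 primary and 962 secondary craters), not taken from the paper, and the function name `binary_metrics` is hypothetical. AUPRC is omitted because it requires the predicted probabilities.

```python
from typing import Dict, List


def binary_metrics(cm: List[List[int]]) -> Dict[str, object]:
    """Per-class and overall metrics from cm[true_class][predicted_class]."""
    total = sum(sum(row) for row in cm)
    recall, precision, f1 = [], [], []
    for k in range(2):
        true_k = sum(cm[k])                      # samples whose true class is k
        pred_k = cm[0][k] + cm[1][k]             # samples predicted as class k
        rec = cm[k][k] / true_k
        pre = cm[k][k] / pred_k
        recall.append(rec)
        precision.append(pre)
        f1.append(2 * pre * rec / (pre + rec))
    accuracy = (cm[0][0] + cm[1][1]) / total
    # Expected agreement by chance, from the row and column marginals.
    p_chance = sum(sum(cm[k]) * (cm[0][k] + cm[1][k]) for k in range(2)) / total ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {"recall": recall, "precision": precision, "f1": f1,
            "accuracy": accuracy, "kappa": kappa}


# Assumed counts (back-calculated, not from the paper).
# Row 0: true primary craters; row 1: true secondary craters.
ASSUMED_CM = [[2762, 187],
              [249, 713]]

metrics = binary_metrics(ASSUMED_CM)
print({name: value for name, value in metrics.items()})
```

Kappa and accuracy are single values for the whole test set, which is why they appear once in the table rather than once per class.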
| Variable | Density with Gaussian | Average Slope | Slope Angle | Class | Recall | Precision | F1 Score | AUPRC | Kappa (overall) | Accuracy (overall) |
|---|---|---|---|---|---|---|---|---|---|---|
| baseline | - | - | - | Primary Crater | 0.9173 | 0.9154 | 0.9163 | 0.9593 | 0.6588 | 0.8737 |
| | | | | Secondary Crater | 0.7401 | 0.7448 | 0.7424 | 0.8378 | | |
| | ✓ | - | - | Primary Crater | 0.9332 | 0.9192 | 0.9261 | 0.9651 | 0.6926 | 0.8878 |
| | | | | Secondary Crater | 0.7484 | 0.7852 | 0.7664 | 0.8648 | | |
| | - | ✓ | ✓ | Primary Crater | 0.9179 | 0.9114 | 0.9147 | 0.9559 | 0.6493 | 0.8709 |
| | | | | Secondary Crater | 0.7266 | 0.7428 | 0.7346 | 0.8308 | | |
| | ✓ | - | ✓ | Primary Crater | 0.9298 | 0.9152 | 0.9225 | 0.9640 | 0.6769 | 0.8821 |
| | | | | Secondary Crater | 0.7360 | 0.7738 | 0.7544 | 0.8600 | | |
| | ✓ | ✓ | - | Primary Crater | 0.9271 | 0.9212 | 0.9241 | 0.9636 | 0.6884 | 0.8852 |
| | | | | Secondary Crater | 0.7568 | 0.7720 | 0.7643 | 0.8621 | | |
| | ✓ | ✓ | ✓ | Primary Crater | 0.9366 | 0.9173 | 0.9268 | 0.9655 | 0.6928 | 0.8885 |
| | | | | Secondary Crater | 0.7412 | 0.7922 | 0.7658 | 0.8626 | | |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhu, M.; Lai, J.; Zhang, X.; Xu, Y.; He, W. An Explainable CatBoost Model for Crater Classification Based on Digital Elevation Model. Remote Sens. 2025, 17, 1236. https://doi.org/10.3390/rs17071236