Ship Equipment Order Target Price Prediction: An Interpretable Model Based on Boruta–Lasso and CatBoost-SHAP

Li, Kai; Sun, Shengxiang; Zhu, Chen; Zhang, Ying

doi:10.3390/jmse14100949

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article

¹ Naval University of Engineering, Wuhan 430033, China
² Nanjing Centralized Fund Collection and Payment Management Center, Nanjing 210016, China
* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2026, 14(10), 949; https://doi.org/10.3390/jmse14100949 (registering DOI)

Submission received: 23 April 2026 / Revised: 17 May 2026 / Accepted: 18 May 2026 / Published: 20 May 2026

(This article belongs to the Special Issue Machine Learning Methodologies and Ocean Science, Second Edition)

Abstract

The target price of naval equipment orders is driven by coupled, multidimensional technical and economic factors, and the underlying data are typically high-dimensional, strongly nonlinear, multicollinear, and subject to small-sample fluctuations. Traditional cost estimation methods struggle to deliver both high-precision fitting and interpretable decision support. To address these issues, this paper constructs an integrated prediction model that combines Boruta–Lasso two-stage feature selection, grid search-optimized CatBoost, and SHAP interpretability analysis. First, the Boruta algorithm performs a coarse screening of feature significance, and Lasso regression then performs a sparse fine screening; together they eliminate redundant features and mitigate multicollinearity. Next, grid search with repeated five-fold cross-validation is used to optimize the CatBoost hyperparameters, and 10 repeated experiments with random seeds are conducted to verify the robustness of model generalization. Finally, SHAP is used to quantify the marginal contribution of each feature, revealing nonlinear associations and statistical response transition points between core features and price. The study is based on 33 publicly available real records of main combat vessels, from which 198 modeling samples were generated through interpolation-based small-sample data augmentation. The interpolated samples were used only for data augmentation and were not treated as independent empirical samples; all core conclusions were validated on the 33 original real samples. The dataset has no missing values, is constructed entirely from publicly traceable data, and does not include confidential information such as internal shipyard costs. Experimental results show that the best single-run results of the proposed model on the test set were a coefficient of determination R² = 0.8949, a root mean square error RMSE = 0.0554, and a mean absolute error MAE = 0.0476. Across the 10 repeated robustness experiments, the average results were R² = 0.8828, RMSE = 0.0586, and MAE = 0.0529, an overall performance better than that of comparison models such as XGBoost, random forest, and standard CatBoost. Ablation experiments validated the effectiveness of the two-stage Boruta–Lasso selection strategy in improving model accuracy and stability. SHAP attribution analysis shows that full-load displacement, number of vertical missile launch cells, number of phased array radars, and combat capability are the core features most highly correlated with price, all showing significant nonlinear positive correlations and clear statistical response transition points. These findings reflect statistical associations rather than causal effects. Because the sample size and ship-type coverage are limited, the model’s applicability is constrained, and its generalization ability needs further verification on larger, independent datasets covering multiple ship types. The model combines high prediction accuracy, strong robustness, and good interpretability, providing reliable technical support for ship equipment procurement pricing demonstration, full lifecycle cost management, and scientific procurement decision-making.
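
To make the notion of a feature's marginal contribution concrete, the following minimal sketch computes exact Shapley values by brute force for a toy model. It is not the paper's method: the study applies SHAP to a fitted CatBoost model, whereas `toy_price_model`, its coefficients, the threshold of 48 launch cells, the feature values, and the single-baseline value function used here are all hypothetical choices made purely for illustration.

```python
from itertools import combinations
from math import factorial


def toy_price_model(x):
    """Hypothetical price function; NOT the paper's fitted CatBoost model."""
    displacement, vls_cells, radars = x
    # Assumed threshold: mimics a nonlinear response transition point.
    step = 1.0 if vls_cells > 48 else 0.0
    return 0.5 * displacement + 0.25 * radars + step * (1.0 + 0.25 * radars)


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value
    (an assumed, simple value function chosen for this sketch).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal effect of adding feature i to coalition s.
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi


# Hypothetical feature vectors (scaled units), not taken from the dataset.
x = [8.0, 64, 4]
baseline = [4.0, 32, 2]
phi = shapley_values(toy_price_model, x, baseline)
for name, p in zip(["displacement", "vls_cells", "radars"], phi):
    print(f"{name}: {p:+.3f}")
```

The attributions sum to `toy_price_model(x) - toy_price_model(baseline)`, the efficiency property that SHAP attributions also satisfy. Brute-force enumeration grows exponentially with the number of features, which is why tree-specific algorithms are used in practice for models such as CatBoost.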

Keywords: ship equipment; order target price prediction; interpretable machine learning; two-stage feature selection; Boruta–Lasso; CatBoost; SHAP
