This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Open Access Article
Ship Equipment Order Target Price Prediction: An Interpretable Model Based on Boruta–Lasso and CatBoost-SHAP
by Kai Li 1,2, Shengxiang Sun 1,*, Chen Zhu 1 and Ying Zhang 1
1 Naval University of Engineering, Wuhan 430033, China
2 Nanjing Centralized Fund Collection and Payment Management Center, Nanjing 210016, China
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2026, 14(10), 949; https://doi.org/10.3390/jmse14100949 (registering DOI)
Submission received: 23 April 2026 / Revised: 17 May 2026 / Accepted: 18 May 2026 / Published: 20 May 2026
Abstract
The target price for naval equipment orders is driven by coupled multidimensional technical and economic factors and exhibits high dimensionality, strong nonlinearity, multicollinearity, and small-sample fluctuations. Traditional cost estimation methods struggle to provide both high-precision fitting and interpretable decision support. To address these issues, this paper constructs an integrated prediction model that combines Boruta–Lasso two-stage feature selection, grid search-optimized CatBoost, and SHAP interpretability analysis. First, the Boruta algorithm performs a coarse screening of feature significance, and Lasso regression then performs a sparse fine screening, which eliminates redundant features and mitigates multicollinearity. Grid search with repeated five-fold cross-validation is used to optimize the CatBoost hyperparameters, and 10 repeated random-seed experiments are conducted to verify the robustness of the model's generalization. SHAP is used to quantify the marginal contribution of each feature, revealing nonlinear associations and statistical response transition points between core features and price. The study is based on 33 publicly available real data records of main combat vessels, from which 198 modeling samples were generated through interpolation-based small-sample data augmentation. The interpolated samples were used only for data augmentation and were not treated as independent empirical samples. All core conclusions were validated on the 33 original real samples, and the dataset contains no missing values. Experimental results show that the best single-run results of the proposed model on the test set were a coefficient of determination R² = 0.8949, a root mean square error RMSE = 0.0554, and a mean absolute error MAE = 0.0476.
Across 10 repeated robustness experiments, the average results were R² = 0.8828, RMSE = 0.0586, and MAE = 0.0529, with overall performance better than that of comparison models such as XGBoost, random forest, and standard CatBoost. Ablation experiments validated the effectiveness of the two-stage Boruta–Lasso selection strategy in improving model accuracy and stability. SHAP attribution analysis shows that full-load displacement, number of vertical missile launch cells, number of phased array radars, and combat capability are core features highly correlated with price, all showing significant nonlinear positive correlations and clear statistical response transition points. The dataset is constructed entirely from publicly traceable data and does not include confidential information such as internal shipyard costs. The findings reflect statistical associations rather than causal effects. Because the sample size and ship-type coverage are limited, the model’s applicability is constrained, and its generalization ability needs further verification on larger, independent datasets covering multiple ship types. The model combines high prediction accuracy, strong robustness, and good interpretability, providing technical support for ship equipment procurement pricing demonstration, full lifecycle cost management, and scientific procurement decision-making.
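The three reported metrics have standard definitions, and the averages over repeated runs are plain means of the per-run values. The following is a minimal Python sketch of that computation, not the authors' code: the function names `regression_metrics` and `average_over_runs` are hypothetical, and the abstract does not state how the target price was scaled before the metrics were computed.

```python
import math
from statistics import mean


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


def average_over_runs(runs):
    """Average each metric over several (y_true, y_pred) pairs.

    Assumption: each pair holds the test-set targets and predictions of one
    repeated experiment (for example, one random seed).
    """
    scores = [regression_metrics(y_true, y_pred) for y_true, y_pred in runs]
    return tuple(mean(column) for column in zip(*scores))
```

Calling `regression_metrics` once per repeated experiment and passing all runs to `average_over_runs` mirrors the distinction the abstract draws between the best single-run result and the 10-run average.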