Abstract
Soil salinization is an escalating global concern that threatens agricultural productivity and ecological sustainability, particularly in coastal regions where complex interactions among hydrological, climatic, and anthropogenic factors govern salt accumulation. However, the vertical differentiation and spatial heterogeneity of salinity drivers remain poorly resolved. We present an integrated modeling framework that combines ensemble machine learning with spatial statistics to investigate the depth-specific dynamics of soil salinity in the Yellow River Delta, a vulnerable coastal agroecosystem. Using multi-source environmental predictors and 220 field samples harmonized to 30 m resolution, the hybrid Gray Wolf Optimizer–Random Forest–XGBoost model achieved high predictive accuracy for surface salinity (R² = 0.91, RMSE = 0.03 g/kg, MAE = 0.02 g/kg). Spatial autocorrelation analysis (Global Moran’s I = 0.25, p < 0.01) revealed significant clustering of high-salinity hotspots associated with seawater intrusion pathways and capillary rise. The results reveal distinct vertical control mechanisms: vegetation indices and soil water content dominate surface salinity, whereas total dissolved solids (TDS), pH, and groundwater depth become increasingly influential in the middle and deep layers. Using SHAP (SHapley Additive Explanations), we quantified nonlinear feature contributions and ranked key predictors across layers, offering mechanistic insights beyond conventional correlation analysis. Our findings highlight the importance of depth-specific monitoring and intervention strategies and demonstrate how explainable machine learning can bridge the gap between black-box prediction and process understanding. The framework is generalizable and can be adapted to other coastal agroecosystems with similar hydro-environmental conditions.
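For readers unfamiliar with the spatial autocorrelation statistic reported above, the following is a minimal sketch of how a Global Moran's I value can be computed from point samples. It is illustrative only and is not the authors' implementation: the function name `global_morans_i`, the row-standardized k-nearest-neighbour weight matrix, and the default `k=4` are all assumptions, since the abstract does not state which spatial weights were used. It also omits the permutation test that a reported p-value would require.

```python
import numpy as np

def global_morans_i(values, coords, k=4):
    """Global Moran's I using row-standardized k-nearest-neighbour weights.

    values : 1-D array of the variable of interest (e.g. salinity in g/kg)
    coords : (n, 2) array of sample coordinates in a projected system
    k      : number of neighbours per sample (hypothetical default)
    """
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    n = x.size

    # Pairwise Euclidean distances between all sampling sites
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a site is never its own neighbour

    # Binary weights for the k nearest neighbours, then row-standardize
    w = np.zeros((n, n))
    nearest = np.argsort(d, axis=1)[:, :k]
    w[np.arange(n)[:, None], nearest] = 1.0
    w /= w.sum(axis=1, keepdims=True)

    # I = (n / S0) * (z' W z) / (z' z), with z the mean-centred values
    z = x - x.mean()
    return (n / w.sum()) * (z @ w @ z) / (z @ z)
```

Values near +1 indicate that similar values cluster in space, values near 0 indicate no spatial structure, and negative values indicate that dissimilar values tend to be neighbours. Under this reading, the reported I = 0.25 corresponds to moderate positive clustering.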