In recent years, the Yellow River Basin has experienced frequent extreme climate events, with droughts increasing in intensity and frequency. This has exacerbated regional water scarcity and severely constrained agricultural irrigation efficiency and sustainable water resource utilization. Accurate estimation of reference crop evapotranspiration (ET₀) is crucial for developing scientifically sound irrigation strategies and enhancing water resource management capabilities. This study used daily meteorological data from 31 stations across the Yellow River Basin for the period 1960–2023 to construct four machine learning models: random forest (RF), support vector machine (SVM), gradient boosting (GB), and ridge regression (Ridge). The models take as inputs the meteorological variables required by the Priestley–Taylor (PT) and Hargreaves (HG) equations. They represent a range of algorithmic structures, from nonlinear ensemble methods (RF, GB) to kernel-based regression (SVM) and linear regularized regression (Ridge). The objective was to comprehensively evaluate their performance and robustness in estimating ET₀ across different climatic zones and drought conditions and to compare them with traditional empirical formulas. The main findings are as follows. Machine learning models, particularly nonlinear approaches, significantly outperformed the PT and HG methods across all climatic regions. Among them, the RF model demonstrated the highest simulation accuracy, achieving an R² of 0.77 and reducing the mean daily ET₀ estimation error by 0.057 mm/day and 0.076 mm/day compared to the PT and HG models, respectively. Under drought-year scenarios, all models showed slight performance degradation, but nonlinear machine learning models still surpassed traditional formulas; the R² of the RF model decreased only marginally, from 0.77 to 0.73, indicating strong robustness. In contrast, linear models such as Ridge exhibited greater sensitivity to changes in feature distributions during drought years, with estimation accuracy dropping significantly below that of the PT and HG methods. The results indicate that in data-sparse regions, machine learning approaches with simplified inputs can serve as effective alternatives to empirical formulas, offering superior adaptability and estimation accuracy. This study provides theoretical foundations and methodological support for regional water resource management, agricultural drought mitigation, and climate-resilient irrigation planning in the Yellow River Basin.
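For readers unfamiliar with the empirical baseline, the following is a minimal sketch of the widely used Hargreaves temperature-based ET₀ formula, together with a coefficient of determination of the kind reported above. It is an illustration only, not the study's implementation: the function names, the use of (Tmax + Tmin)/2 as the mean temperature, the assumption that extraterrestrial radiation is supplied in MJ m⁻² day⁻¹, and the particular R² definition (1 − SS_res/SS_tot) are all assumptions made here.

```python
import math


def hargreaves_et0(t_max, t_min, ra_mj):
    """Hargreaves ET0 estimate in mm/day (illustrative sketch).

    t_max, t_min: daily maximum and minimum air temperature (deg C).
    ra_mj: extraterrestrial radiation in MJ m^-2 day^-1 (assumed to be
           supplied by the caller; it is not computed here).
    """
    if t_max < t_min:
        raise ValueError("t_max must be >= t_min")
    # Assumption: mean temperature approximated as the min/max average.
    t_mean = (t_max + t_min) / 2.0
    # 0.408 converts MJ m^-2 day^-1 to equivalent evaporation in mm/day.
    ra_mm = 0.408 * ra_mj
    return 0.0023 * ra_mm * (t_mean + 17.8) * math.sqrt(t_max - t_min)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A call such as `hargreaves_et0(30.0, 15.0, 40.0)` gives a daily estimate in mm/day that could then be compared, via `r_squared`, against a reference ET₀ series in the same way a machine learning model's predictions would be.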