Article

Simulating Daily Evapotranspiration of Summer Soybean in the North China Plain Using Four Machine Learning Models

1 School of Agriculture and Biomanufacturing, Zhengzhou University, Zhengzhou 450001, China
2 Key Laboratory of Crop Water Use and Regulation, Ministry of Agriculture and Rural Affairs, Farmland Irrigation Research Institute, Chinese Academy of Agricultural Sciences, Xinxiang 453002, China
* Author to whom correspondence should be addressed.
Agronomy 2026, 16(3), 315; https://doi.org/10.3390/agronomy16030315
Submission received: 6 December 2025 / Revised: 21 January 2026 / Accepted: 23 January 2026 / Published: 26 January 2026
(This article belongs to the Special Issue Water and Fertilizer Regulation Theory and Technology in Crops)

Abstract

Accurate estimation of crop evapotranspiration (ET) is essential for achieving efficient agricultural water use in the North China Plain. Although machine learning techniques have demonstrated considerable potential for ET simulation, a systematic evaluation of model-architecture suitability and hyperparameter optimization strategies specifically for summer soybean ET estimation in this region is still lacking. To address this gap, we systematically compared several machine learning architectures and their hyperparameter optimization schemes to develop a high-accuracy daily ET model for summer soybean in the North China Plain. Synchronous observations from a large-scale weighing lysimeter and an automatic weather station were first used to characterize the day-to-day dynamics of soybean ET and to identify the key driving variables. Four algorithms—support vector regression (SVR), Random Forest (RF), extreme gradient boosting (XGBoost), and a stacking ensemble—were then trained for ET simulation, while Particle Swarm Optimization (PSO), Genetic Algorithms (GAs), and Randomized Grid Search (RGS) were employed for hyperparameter tuning. Results show that solar radiation (RS), maximum air temperature (Tmax), and leaf area index (LAI) are the dominant drivers of ET. The Stacking-PSO-F3 combination, forced with RS, Tmax, LAI, maximum relative humidity (RHmax), and minimum relative humidity (RHmin), achieved the highest accuracy, yielding R2 values of 0.948 on the test set and 0.900 in interannual validation, thereby demonstrating excellent precision, stability, and generalizability. The proposed model provides a robust technical tool for precision irrigation and regional water resource optimization.

1. Introduction

Soybean is an important oilseed and grain crop in China, with an annual production of approximately 20 million tons, ranking fourth globally. However, the self-sufficiency rate is less than 20%, and China needs to import 90–100 million tons annually, mainly because of limitations in planting area and yield levels [1,2,3]. Enhancing soybean self-sufficiency and expanding the planting area, especially in the Northeast and Huang-Huai-Hai regions, have therefore become key priorities of agricultural development. The average yield in the Northeast is 150–200 kg per mu, while in the Huang-Huai-Hai region it is 125–130 kg per mu. The disparity is primarily due to factors such as planting patterns, soil conditions, and irrigation management. Through variety improvement, the optimization of irrigation techniques, and the implementation of a sound irrigation schedule, soybean yields in the Huang-Huai-Hai region can be effectively increased [4,5,6].
Crop evapotranspiration (ET), which includes both soil evaporation and plant transpiration, is a key indicator of water productivity. Accurate estimation of ET is crucial for precision irrigation and the optimal allocation of water resources [7]. However, traditional empirical models such as Hargreaves and Priestley–Taylor are primarily designed to estimate reference evapotranspiration (ETo), a climatic parameter based on a standardized crop surface. When applied to estimate actual crop ET (ETc), which is governed by complex, nonlinear interactions among crop characteristics (e.g., leaf area index, phenology) and dynamic environmental factors, these models often present significant limitations in accuracy and adaptability [8,9,10,11,12]. Therefore, developing new methods that can adapt to multi-source data and possess strong nonlinear fitting capabilities, as well as cross-year generalization ability, is essential for improving the accuracy of ET estimation.
With the development of computer technology and artificial intelligence, machine learning models have been widely applied in ET estimation due to their strong nonlinear fitting capabilities. For example, Chia et al. [13] and Yamaç [14] found that SVM outperforms empirical models in crop ET estimation, and the ensemble model by Saggi and Jain [15] also demonstrated superior performance in estimating ET for maize and wheat. Additionally, Zhao et al. [16] significantly improved the prediction accuracy of winter wheat ET by optimizing machine learning models using PSO, while Kavzoglu et al. [17] and Jiang et al. [18] demonstrated the superior performance of RF and XGBoost models in ET estimation, respectively. Research shows that machine learning models (such as SVM, RF, XGBoost, Stacking, etc.) have surpassed traditional empirical models in ET estimation. However, the performance of these models depends on the selection of hyperparameters, and improper hyperparameter settings may lead to a decline in model performance [17,19,20].
To fully harness the potential of machine learning models, an increasing number of studies have introduced intelligent optimization algorithms (IOA) to automatically search for optimal hyperparameters. Petković et al. [21] found that the accuracy of a radial basis function neural network model optimized by PSO was significantly higher than that of traditional SVM. Wu et al. [22] compared the performance of GA and PSO in optimizing extreme learning machine models and found that PSO had a significant advantage. Zhao et al. [16] achieved high-precision winter wheat ET prediction by using PSO to optimize machine learning models. The combination of intelligent optimization algorithms and machine learning models can significantly enhance model performance in complex nonlinear systems, providing strong technical support for precision agriculture simulations [23,24,25,26].
Although the combination of machine learning and optimization algorithms has been widely explored, most studies focus on the combined verification of a single model and a single optimization algorithm [16,21,22]. For example, they examine the improvement effect of an optimization algorithm (such as PSO) on a specific model (such as SVM or ELM) [16,21,22], or compare the performance between ensemble models (such as Stacking) and meta-learners [15,23]. There remains a lack of systematic comparative evaluation of multiple machine learning models and optimization algorithms under different feature combinations. Therefore, the following points remain unclear in the estimation of summer soybean ET: (1) whether the introduction of complex optimization and integration frameworks can bring significant performance gains; (2) among numerous model-optimization algorithm combinations, which architecture achieves the optimal accuracy–robustness balance; and (3) how strong the generalization ability of these models is.
Therefore, this study aims to establish a multi-dimensional evaluation framework to fill the aforementioned research gaps. Instead of merely applying Stacking and PSO to summer soybean ET estimation, we develop an evaluation framework that integrates four machine learning models (RF, SVM, XGBoost, Stacking), three optimization algorithms (PSO, GA, RGS), and multiple input factor combinations (F1–F4). Through this systematic comparative analysis, the core innovation of this study lies in, for the first time, providing an empirically based, comprehensive optimization scheme (ranging from model architectures to optimization strategies) for daily-scale ET estimation of summer soybeans in the North China Plain, while revealing the underlying key driving factors and reliable generalization ability.
To this end, this study establishes a comprehensive ET estimation framework based on four machine learning models (RF, SVM, XGBoost, Stacking), three optimization algorithms (PSO, GA, RGS), and multiple input factor combinations (F1–F4). The specific objectives are as follows: (1) analyze the dynamic variation characteristics and influencing factors of the daily ET of summer soybeans; (2) construct daily ET estimation models for summer soybeans and evaluate the performance of different input features, machine learning models, and optimization algorithms; and (3) explore the model decision-making mechanism and feature contribution patterns by integrating SHAP analysis and global sensitivity analysis, and assess the accuracy and generalization ability of the models. This study can provide a scientific basis for the water management of summer soybeans in the North China Plain.

2. Materials and Methods

2.1. Experimental Station Overview and Data Collection

2.1.1. Experimental Overview and Meteorological Data

The experiment was conducted at the large weighing lysimeter experimental field of the Xinxiang Comprehensive Experimental Base of the Chinese Academy of Agricultural Sciences, located in Qiliying Town, Xinxiang County, Henan Province (35°9′ N, 113°48′ E; 72.7 m a.s.l.) (Figure 1). The region has a warm temperate continental monsoon climate, with a mean annual temperature of 14 °C, average precipitation of 582 mm, 2399 h of sunshine, a frost-free period of about 220 days, and a long-term potential evapotranspiration of ~1650 mm.
The test crop was summer soybean, with a row spacing of 40 cm and a plant spacing of 17 cm (9804 plants per mu, ≈147,000 plants ha⁻¹). The crops were planted in June and harvested in October. Drip irrigation was used, with three replicates, each planted in one of three lysimeters. Each lysimeter (Xi’an Bishui Environmental New Technology Co., Ltd., Xi’an, China) consists of a soil box, a high-precision weighing lever system, an infiltration measurement system, a centralized data processing and storage system, and a backup power system. The specifications of the instrument are 2.0 m × 2.0 m × 3.0 m (length × width × height), and it automatically records weighing data every 60 min with an accuracy of 0.01 kg. Soil moisture consistency was ensured in the three replicates through pre-sowing soil preparation. The daily evapotranspiration (ET) for each lysimeter after sowing was determined using the water balance method. These data were recorded and used to schedule irrigation: when the cumulative ET reached 30 mm, 30 mm of water was applied via drip irrigation as measured by a water meter. The actual volume per irrigation event deviated slightly, within approximately ±5 mm. Prior to soybean sowing, 40 mm of irrigation water was applied to establish sufficient soil moisture, resulting in an initial relative soil water content above 90%. With the planned wetted depth for drip irrigation set at 30 cm, the 30 mm of evapotranspiration loss reduced the relative soil water content by only 15.4%. This indicates that the soil water content remained above 70%, meaning the crop did not reach a state of water stress. An automatic weather station was installed in the experimental site protection area to monitor meteorological parameters, including wind speed (u2), air temperature (T), relative humidity (RH), and solar radiation (RS) at 2 m above the ground at 30 min intervals.
The data were recorded by a CR1000 data acquisition system: the LI-200R solar radiation sensor (LI-COR Biosciences, Lincoln, NE, USA) has a typical error of ±3%, while the air temperature accuracy of the HMP155A temperature and humidity probe is ±(0.226 − 0.0028 × temperature) °C for −80 to +20 °C and ±(0.055 + 0.0057 × temperature) °C for +20 to +60 °C. Both solar radiation and temperature–humidity data were synchronously recorded by this system. To ensure the reliability of the observations and to provide high-quality input for subsequent machine learning modeling, we implemented systematic quality control procedures for both the lysimeter and meteorological data. The weighing system of each lysimeter was calibrated before and after each growing season. The ET data were calculated with the water balance method, and transient outliers caused by factors such as maintenance were automatically removed. The observed data from the three replicated lysimeters were consistent in both trend and magnitude, indicating good spatial representativeness. Meteorological sensors were calibrated annually, and the collected raw data were automatically screened using physical plausibility thresholds (e.g., solar radiation ≥ 0). After synchronizing the timestamps from all data sources, the information was integrated at a daily scale to construct the final dataset for modeling. The daily maximum and minimum temperatures (Tmax and Tmin), daily maximum and minimum relative humidity (RHmax and RHmin), daily average solar radiation (RS), and daily average wind speed (u2) were calculated. Table 1 shows the growth stage timeline for summer soybean in 2023 and 2024. Figure 2 is a flowchart for simulating evapotranspiration using machine learning models.
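Given the lysimeter geometry above, the mass-to-depth conversion behind the water-balance ET calculation is straightforward: 1 mm of water over the 4 m² soil box weighs 4 kg. A minimal sketch (the function name and numbers are illustrative, not the station's actual processing code):

```python
LYSIMETER_AREA_M2 = 2.0 * 2.0  # 2.0 m x 2.0 m soil box (Section 2.1.1)
KG_PER_MM = LYSIMETER_AREA_M2  # 1 mm of water over 4 m^2 weighs 4 kg

def daily_et_mm(mass_start_kg, mass_end_kg, irrigation_mm=0.0,
                precipitation_mm=0.0, drainage_mm=0.0):
    """Daily ET (mm) from a simplified water balance: mass loss converted to
    water depth, plus water inputs, minus measured deep drainage."""
    storage_change_mm = (mass_start_kg - mass_end_kg) / KG_PER_MM
    return storage_change_mm + irrigation_mm + precipitation_mm - drainage_mm

# A 16 kg daily mass loss with no inputs or drainage corresponds to 4 mm of ET.
print(daily_et_mm(12000.0, 11984.0))  # -> 4.0
```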

2.1.2. Measurement and Calculation of Leaf Area Index (LAI)

(1) Leaf Area Measurement
In each lysimeter, five uniformly grown plants representative of the average condition of the crop population were selected. The length and maximum width of all green leaves on these representative plants were measured. The area of a single leaf was calculated as “leaf area = leaf length × leaf width × coefficient”, with a correction coefficient of 0.7296 taken from the literature [27]. The total leaf area of a single plant was the sum of the areas of all its individual leaves. The total leaf area in each lysimeter was then obtained by multiplying the average leaf area per plant of the five representative plants by the total number of plants. Soybean leaf area was measured every 7–10 days.
(2) Calculation of Leaf Area Index (LAI)
The leaf area index (LAI) is calculated as the ratio of total leaf area to land area, i.e., LAI = total leaf area/land area. In this study, the land area is the surface area of the lysimeter (4 m²).
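The per-leaf and plot-level calculations above can be sketched as follows (the function names and sample values are illustrative; the coefficient 0.7296 and the 4 m² land area come from the text):

```python
LEAF_SHAPE_COEF = 0.7296  # correction coefficient from the cited literature [27]
PLOT_AREA_M2 = 4.0        # lysimeter surface area (land area for LAI)

def single_leaf_area_cm2(length_cm, width_cm):
    """Leaf area = leaf length x leaf width x coefficient."""
    return length_cm * width_cm * LEAF_SHAPE_COEF

def lai(leaf_dims_cm, n_sampled_plants, total_plants):
    """LAI = total green leaf area / land area (dimensionless).
    leaf_dims_cm: (length, width) tuples for every leaf on the sampled plants."""
    sampled_area_cm2 = sum(single_leaf_area_cm2(l, w) for l, w in leaf_dims_cm)
    mean_area_per_plant_cm2 = sampled_area_cm2 / n_sampled_plants
    total_area_m2 = mean_area_per_plant_cm2 * total_plants / 1e4  # cm^2 -> m^2
    return total_area_m2 / PLOT_AREA_M2

# One 10 cm x 5 cm leaf per plant, 100 plants on the 4 m^2 plot (illustrative).
print(round(lai([(10.0, 5.0)], 1, 100), 4))  # -> 0.0912
```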
LAI is an important indicator reflecting the growth condition of a plant population. Leaf area changes continuously during the growth period, with slower changes in the early stages and faster growth in the later stages. LAI is directly related to crop evapotranspiration. Studies have shown that the Logistic equation (Equation (1)) can accurately model the relationship between LAI and effective accumulated temperature [28,29].
Y = Ym / (1 + a·e^(−b·x))   (1)
In the equation, Y represents the leaf area index (LAI); Ym represents the theoretical maximum value of LAI (under optimal growth conditions with sufficient resources such as light, water, and nutrients, and no significant stress, LAI gradually increases and eventually reaches this potential maximum); x represents the effective accumulated temperature; and a and b are fitted growth parameters.
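A sketch of fitting the logistic LAI–accumulated-temperature relationship of Equation (1) with scipy.optimize.curve_fit. The "observations" below are synthetic, generated from assumed parameter values, purely to show the fitting step:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic_lai(x, ym, a, b):
    """Logistic LAI model of Equation (1): ym is the theoretical maximum LAI,
    x the effective accumulated temperature, a and b fitted growth parameters."""
    return ym / (1.0 + a * np.exp(-b * x))

# Synthetic "measurements" generated from assumed parameters (illustrative only).
gdd = np.linspace(0.0, 1500.0, 30)
lai_obs = logistic_lai(gdd, 5.2, 60.0, 0.008)

(ym_fit, a_fit, b_fit), _ = curve_fit(logistic_lai, gdd, lai_obs,
                                      p0=(5.0, 50.0, 0.01))
```

On noiseless data the fitted parameters recover the generating values; with real measurements, sensible starting values (p0) help convergence.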

2.2. Machine Learning Models

2.2.1. Random Forest

Random Forest (RF) is an algorithm proposed by Breiman [30], based on supervised learning and ensemble methods. It constructs many uncorrelated decision trees through bootstrapped sampling and feature random subspaces, and final predictions are made by averaging or voting the outputs of these trees [31]. Each tree is trained on bootstrapped samples and considers only a random subset of features at each split node. The final prediction is the simple average of all tree outputs. This double randomization significantly reduces variance, while the bias of individual trees remains almost unchanged. The model provides out-of-bag error estimation without requiring an additional validation set. It can assess feature importance through permutation or Gini importance measures and is robust to missing values and monotonic transformations. The complexity of training and prediction grows linearly with the number of trees (B) and sample size (n). The RF model can handle high-dimensional regression and complex nonlinear problems, and by using regression, it helps to reduce overfitting [32]. Due to its outstanding performance and ease of use, this algorithm has been widely applied in various fields [33,34].
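The RF properties described above (bootstrap sampling, random feature subsets at each split, out-of-bag error, impurity-based importance) can be sketched with scikit-learn. The toy drivers loosely mimic RS, Tmax, and LAI but are synthetic:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
# Toy stand-ins for the daily drivers [RS, Tmax, LAI] and an ET-like target (mm/d).
X = rng.uniform(size=(300, 3)) * np.array([30.0, 15.0, 6.0]) + np.array([0.0, 20.0, 0.0])
y = 0.15 * X[:, 0] + 0.08 * (X[:, 1] - 20.0) + 0.3 * X[:, 2] + rng.normal(0.0, 0.2, 300)

rf = RandomForestRegressor(
    n_estimators=200,     # B trees; training cost grows linearly with B and n
    max_features="sqrt",  # random feature subset considered at each split
    oob_score=True,       # out-of-bag R^2, no separate validation set needed
    random_state=0,
).fit(X, y)

print(rf.oob_score_)            # OOB R^2 estimate
print(rf.feature_importances_)  # impurity-based importance per driver
```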

2.2.2. Support Vector Machine

Support Vector Machine (SVM) constructs a supervised learning model based on the principle of structural risk minimization [35]. Its core mechanism is to use kernel tricks (such as the RBF or polynomial kernel) to project samples into a high-dimensional Hilbert space, thereby transforming nonlinear separable problems in the original space into linear separable issues. The analytical form of the decision function indicates that its final expression depends only on the linear combination of boundary samples (i.e., support vectors). The advantage of the Support Vector Regression (SVR) model is that it is based on a series of kernel functions that are independent of the input space dimensionality, allowing effective modeling of nonlinear relationships in high-dimensional feature spaces [11]. The linear kernel is suitable for explicitly separable scenarios, the polynomial kernel controls the interaction order, the sigmoid kernel generates an equivalent mapping for neural networks, and the RBF kernel achieves local response through Gaussian transformation. SVM has a strong ability to solve complex nonlinear problems and has been widely used in ET prediction [24,36]. Research shows that the feature space estimation results after RBF function transformation have high accuracy [24]. Therefore, this study uses an SVM model based on the RBF kernel to predict daily soybean ET.

2.2.3. Gradient Boosting Model

XGBoost is an efficient gradient boosting decision tree framework that achieves high-precision predictions by iteratively constructing an ensemble of regression trees [37]. Its core mechanism involves generating new decision trees in each iteration to minimize the objective function. As an optimized implementation of gradient boosting decision trees, XGBoost introduces regularization terms and second-order derivative approximations, effectively controlling overfitting risks while maintaining model performance [38]. XGBoost optimizes the gradient boosting decision tree approach by adding a new regression tree in each iteration, minimizing the “loss + regularization” objective function to approximate the true labels. It constructs second-order approximations using the first and second derivatives of the loss function, transforming the tree-splitting problem into a gain maximization problem that can be solved analytically. This model, with its strong nonlinear fitting ability, performs exceptionally well in evapotranspiration prediction and can effectively handle complex variations in meteorological time series features (such as radiation and humidity). To address data challenges in practical applications, XGBoost is often embedded as a core component in more complex simulation frameworks to enhance the robustness of estimations in scenarios involving missing data and other uncertainties [18].
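The boosting mechanics described above (one shallow regression tree per round, shrinkage via a learning rate, row subsampling) can be sketched with scikit-learn's GradientBoostingRegressor, used here as a dependency-light stand-in for the xgboost library; it does not reproduce XGBoost's second-order approximation or explicit regularization terms. Data and settings are illustrative:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.uniform(size=(400, 3))
y = X[:, 0] * X[:, 1] + np.sin(3.0 * X[:, 2]) + rng.normal(0.0, 0.05, 400)

gbr = GradientBoostingRegressor(
    n_estimators=300,    # one shallow regression tree added per boosting round
    learning_rate=0.05,  # shrinkage applied to each tree's contribution
    subsample=0.8,       # stochastic boosting, a mild regularizer
    max_depth=3,
    random_state=0,
).fit(X, y)
print(gbr.score(X, y))
```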

2.2.4. Stacking Ensemble Model

Stacking is a multi-layer heterogeneous ensemble model. In the first layer, multiple base learners, such as Random Forest (RF), Support Vector Machine (SVM), and XGBoost, are trained simultaneously using cross-validation, and their out-of-fold predictions are output as meta-features. The second layer comprises a meta-learner that integrates and re-learns these meta-features to generate the final prediction. This structure was first proposed by Wolpert [39], and it effectively corrects the biases of the base models, combines the advantages of multiple models, and avoids information leakage by using out-of-fold predictions. Theoretically, it can approximate complex nonlinear relationships. Stacking has shown excellent performance in crop evapotranspiration prediction, especially in complex agricultural scenarios involving multi-factor coupling and frequent extreme weather conditions, providing reliable technical support for precision irrigation and water resource management [40,41,42].
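The two-layer structure can be sketched with scikit-learn's StackingRegressor, which generates out-of-fold predictions for the meta-learner exactly as described; the base models, meta-learner, and data here are illustrative, not the study's configuration:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("svr", make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0))),
    ],
    final_estimator=Ridge(),  # meta-learner re-learns from the meta-features
    cv=5,                     # out-of-fold predictions avoid information leakage
).fit(X, y)
print(stack.score(X, y))
```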

2.3. Optimization Algorithms

To fully leverage the potential of the Random Forest (RF), Support Vector Machine (SVM), XGBoost, and Stacking models, this study introduces Particle Swarm Optimization (PSO), the Genetic Algorithm (GA), and Randomized Grid Search (RGS) for automatic hyperparameter tuning to achieve optimal performance. Traditional manual hyperparameter tuning relies on experience and is prone to local optima, whereas PSO utilizes collaborative learning among particles, GA employs a biomimetic evolutionary mechanism, and RGS randomly selects candidate combinations from a pre-set hyperparameter grid and evaluates their performance. All three methods possess efficient global search capabilities, making them well suited to the high-dimensional nonlinear features in evapotranspiration prediction (such as the interactive effects between LAI and ET), thus enhancing model accuracy and robustness.

2.3.1. Particle Swarm Optimization (PSO) Algorithm

Particle Swarm Optimization (PSO) is a population-based optimization algorithm inspired by foraging behavior, proposed by Kennedy and Eberhart [43]. It searches for the global optimum in continuous space through the collaborative iteration of individual and collective experiences [44]. Each particle represents a candidate solution: its position vector encodes the solution and its velocity vector encodes the search direction. In each iteration, the algorithm updates the velocity and position: the new velocity is the sum of three components (inertia, attraction toward the individual best, pBest, and attraction toward the global best, gBest), and the position is then updated according to the new velocity. Due to its simple concept, few parameters, and ease of implementation, PSO has been widely applied to various complex optimization problems [45].
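The velocity and position updates can be written in a few dozen lines. This minimal PSO (inertia w and acceleration constants c1, c2 are illustrative values) minimizes the 2-D sphere function as a sanity check:

```python
import random

def pso_minimize(f, bounds, n_particles=20, n_iter=60,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO: velocity = inertia + pull toward pBest + pull toward gBest."""
    rnd = random.Random(seed)
    dim = len(bounds)
    pos = [[rnd.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rnd.random(), rnd.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d],
                                    bounds[d][0]), bounds[d][1])
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Sphere function: global minimum 0 at the origin.
best, best_val = pso_minimize(lambda p: sum(x * x for x in p), [(-5.0, 5.0)] * 2)
```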

2.3.2. Genetic Algorithm (GA)

The Genetic Algorithm (GA) is an evolutionary optimization algorithm based on natural selection and genetic mechanisms. It iteratively approaches the global optimum through selection, crossover, and mutation operations within a population. Initially, candidate solutions are encoded as chromosomes (binary strings or real-number vectors), and the fitness function is calculated. Parents are then selected based on their fitness proportion, offspring are generated through crossover, and mutation occurs with a small probability to form a new generation. GA is commonly used for nonlinear optimization problems [46]. In the evaluation, absolute error was used to assess the performance of GA in achieving the global optimal solution [24,47]. Through multi-generation evolution, GA selects robust hyperparameter combinations, effectively improving the model’s evapotranspiration prediction accuracy under extreme weather conditions (such as drought and high temperatures) [46]. Compared to heuristic parameter tuning, GA exhibits stronger generalization ability in complex nonlinear systems [24].
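A minimal real-coded GA sketch with tournament selection, arithmetic crossover, Gaussian mutation, and elitism; the operator choices and rates are illustrative, not the study's configuration. The sphere function again serves as the test problem:

```python
import random

def ga_minimize(f, bounds, pop_size=40, n_gen=80,
                cx_rate=0.9, mut_rate=0.1, seed=0):
    """Minimal real-coded GA: tournament selection, blend crossover,
    Gaussian mutation, and one-individual elitism."""
    rnd = random.Random(seed)
    dim = len(bounds)
    pop = [[rnd.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(n_gen):
        fit = [f(ind) for ind in pop]

        def tournament():
            i, j = rnd.randrange(pop_size), rnd.randrange(pop_size)
            return pop[i] if fit[i] < fit[j] else pop[j]

        new_pop = [min(pop, key=f)]  # elitism: carry the current best forward
        while len(new_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rnd.random() < cx_rate:  # arithmetic (blend) crossover
                a = rnd.random()
                child = [a * u + (1 - a) * v for u, v in zip(p1, p2)]
            else:
                child = p1[:]
            for d in range(dim):        # Gaussian mutation, clipped to bounds
                if rnd.random() < mut_rate:
                    lo, hi = bounds[d]
                    child[d] = min(max(child[d] + rnd.gauss(0.0, 0.1 * (hi - lo)),
                                       lo), hi)
            new_pop.append(child)
        pop = new_pop
    best = min(pop, key=f)
    return best, f(best)

best, best_val = ga_minimize(lambda p: sum(x * x for x in p), [(-5.0, 5.0)] * 2)
```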

2.3.3. Randomized Grid Search (RGS)

Randomized Grid Search (RGS) is a commonly used method for automatic hyperparameter tuning. Unlike traditional Grid Search (GS), which exhaustively evaluates all possible combinations in a pre-set parameter grid, RGS reduces computational complexity and time cost by randomly sampling a subset of candidate combinations in the parameter space for evaluation. The core idea is to use randomness to cover representative areas of the high-dimensional parameter space, avoiding the waste of computational resources on irrelevant parameters [48].
In each iteration, RGS randomly selects several sets of hyperparameter combinations from the pre-defined parameter range and evaluates model performance using cross-validation (CV). As the number of samples increases, RGS can approximate the global optimum at lower computational cost, therefore balancing efficiency and accuracy. Compared to exhaustive search, RGS demonstrates stronger adaptability and scalability in high-dimensional nonlinear problems (such as multivariable interactions in evapotranspiration prediction) [48,49].
In this study, PSO, GA, and RGS are used to optimize the hyperparameters of the Random Forest (RF), Support Vector Machine (SVM), XGBoost, and Stacking models; the tuned hyperparameters include, for example, the tree depth and minimum samples for splitting in RF, the kernel parameter γ in SVM, the learning rate and subsample ratio in XGBoost, and the weight coefficients of the base learners in Stacking. By quickly exploring the parameter space under a limited computational budget, these algorithms effectively enhance the model’s generalization ability and computational efficiency in evapotranspiration prediction, particularly for modeling tasks involving large-scale samples and high-dimensional input features.
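As a concrete illustration of the RGS procedure, scikit-learn's RandomizedSearchCV implements exactly this randomized sampling with cross-validated scoring. The data and search ranges below are illustrative, not those used in the study:

```python
import numpy as np
from scipy.stats import randint
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.default_rng(2)
X = rng.uniform(size=(200, 3))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(0.0, 0.1, 200)

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions={
        "n_estimators": randint(50, 300),
        "max_depth": randint(2, 12),
        "min_samples_split": randint(2, 10),
    },
    n_iter=10,  # evaluate only 10 random combinations instead of the full grid
    cv=5,       # cross-validated scoring (R^2 for regressors)
    random_state=0,
).fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```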
To ensure the reproducibility of the experimental results, all model training and validation in this study were conducted in a unified computing environment. The specific hardware and software configurations are as follows:
Hardware: CPU i5-12500H (Intel Corporation, Santa Clara, CA, USA) with a base frequency of 3.10 GHz, 16 GB RAM, and GPU RTX 3060 (NVIDIA Corporation, Santa Clara, CA, USA).
Software: Windows 10 OS, Python 3.9.6 programming language, PyCharm Professional IDE (Version 2023.3.7, JetBrains s.r.o., Prague, Czech Republic), machine learning framework scikit-learn 1.5.1.

2.4. Model Input Parameter Combination Design

To identify the key meteorological and growth factors affecting summer soybean ET, Pearson correlation coefficient analysis was performed to examine the relationships between ET and various variables (Table 2). The results indicated that all factors were positively correlated with ET except for minimum relative humidity (RHmin) and wind speed at 2 m height (u2), which showed negative correlations. Among these, RS, Tmax, LAI, and Tmin exhibited extremely significant positive correlations with ET (p < 0.01). Of all the factors, RS showed the strongest correlation with ET. This was followed by Tmax and LAI, whose correlation coefficients were both greater than 0.4 and very close to each other. Tmin ranked next, whereas humidity and wind speed exhibited relatively weak correlations with ET.
Based on the correlation analysis between daily evapotranspiration of summer soybeans and meteorological factors and growth indicators, this study constructs four model input parameter combinations (Table 3). Among them, F1 consists of the top three ranked factors, F2 consists of the four factors with the most significant correlations, F3 replaces Tmin with humidity parameters (RHmax, RHmin) based on F2, and F4 includes all factors except wind speed. In this study, 80% of the input data from 2023 and 2024 is used for training the machine learning models, with the remaining 20% used for validation. To verify the model’s generalization ability and cross-year adaptability, the model is trained with 2023 data and then used to simulate the results for 2024. These two datasets (training and validation data) are used independently during the model evaluation process.
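The feature-set definitions and the 80/20 split can be sketched as follows; the data are random placeholders standing in for the 2023–2024 daily records, and only the column bookkeeping and split are the point:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Candidate drivers (Table 3); wind speed u2 is excluded throughout.
columns = ["RS", "Tmax", "LAI", "Tmin", "RHmax", "RHmin"]
feature_sets = {
    "F1": ["RS", "Tmax", "LAI"],
    "F2": ["RS", "Tmax", "LAI", "Tmin"],
    "F3": ["RS", "Tmax", "LAI", "RHmax", "RHmin"],
    "F4": ["RS", "Tmax", "LAI", "Tmin", "RHmax", "RHmin"],
}

# Random placeholder data in place of the observed daily records.
rng = np.random.default_rng(0)
data = rng.uniform(size=(100, len(columns)))
et = rng.uniform(1.0, 6.0, size=100)

cols = [columns.index(c) for c in feature_sets["F3"]]
X_train, X_test, y_train, y_test = train_test_split(
    data[:, cols], et, test_size=0.2, random_state=0)  # 80/20 split
print(X_train.shape, X_test.shape)
```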

2.5. Model Interpretability and Sensitivity Analysis Methods

To gain deeper insight into the decision-making mechanism of the optimal model, clarify how variations in meteorological variables influence evapotranspiration estimation, and validate the model’s robustness to input uncertainty, this section employs two complementary analytical frameworks: SHAP (SHapley Additive exPlanations) analysis and global sensitivity analysis. These approaches investigate the problem from two dimensions: “feature contribution interpretation” and “feature fluctuation response.” The specific methodologies are detailed as follows:

2.5.1. SHAP Analysis

To reveal the decision-making mechanism of the optimal machine learning model, this study leverages the game theory-based SHAP method for post hoc interpretability analysis [50]. This approach quantifies feature importance by calculating the marginal contribution of each feature to individual predictions and fairly decomposes the model output into the sum of contributions from each input feature. Specifically, SHAP analysis was applied to the best-performing Stacking-PSO model (based on the F3 and F4 feature sets) to identify the core features driving model predictions and to interpret how different feature values interactively influence the final ET estimation. From the perspective of “feature contribution allocation,” this analysis provides insights into the decision process of high-precision models.
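The marginal-contribution idea behind SHAP can be made concrete with an exact Shapley-value computation for a hypothetical three-feature model. Brute-force enumeration is feasible only for a handful of features; the shap package uses efficient approximations for real models:

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for a small feature count: average each feature's
    marginal contribution over all subsets, holding 'absent' features at a
    baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for s in combinations(others, r):
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                phi[i] += w * (value(set(s) | {i}) - value(set(s)))
    return phi

# Additive toy model: the prediction decomposes exactly, so the Shapley values
# recover each term's contribution relative to the baseline.
model = lambda z: 3.0 * z[0] + 2.0 * z[1] + z[2]
phi = shapley_values(model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
print(phi)  # approximately [3.0, 2.0, 1.0]; their sum equals f(x) - f(baseline)
```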

2.5.2. Global Sensitivity Analysis

To evaluate how changes in meteorological variables affect ET estimation and to assess the model’s robustness to input uncertainty, this study employs global sensitivity analysis methods [51,52]: perturbation analysis, variance-based Sobol analysis, and the Morris screening method. Perturbation analysis characterizes local sensitivity by applying perturbations and calculating the relative rate of change in model output. Sobol analysis adopts the Saltelli sampling strategy to construct high-dimensional parameter samples and computes the first-order sensitivity indices (measuring independent contributions) and total effect indices (measuring overall contributions including interactions) of each input variable through variance decomposition, thereby quantifying the proportion of output variance explained by each variable. The Morris screening method generates a large number of random trajectories to compute the mean absolute elementary effect and its standard deviation for each parameter, simultaneously assessing the global importance of parameters as well as the degree of nonlinearity and interaction in their effects. These three approaches, from the complementary perspectives of “local response,” “variance contribution,” and “nonlinearity/interaction effects,” together constitute a comprehensive evaluation framework for the input–output response relationships of machine learning models.
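The Morris elementary-effects idea can be sketched in a few lines: move one input at a time along random trajectories and average the absolute effects (the mu* statistic). The response function below is a hypothetical stand-in; dedicated packages such as SALib implement the full method, including the trajectory design and the standard-deviation statistic:

```python
import random

def morris_mu_star(f, bounds, n_traj=30, delta_frac=0.1, seed=0):
    """Mean absolute elementary effect (mu*) per input from one-at-a-time
    moves along random trajectories; effects are scaled to a full input range."""
    rnd = random.Random(seed)
    dim = len(bounds)
    effects = [[] for _ in range(dim)]
    for _ in range(n_traj):
        x = [rnd.uniform(lo, hi) for lo, hi in bounds]
        order = list(range(dim))
        rnd.shuffle(order)
        for d in order:
            lo, hi = bounds[d]
            step = delta_frac * (hi - lo)
            x_new = x[:]
            # step up unless that would leave the input range
            x_new[d] = x[d] + step if x[d] + step <= hi else x[d] - step
            ee = abs(f(x_new) - f(x)) / step * (hi - lo)
            effects[d].append(ee)
            x = x_new
    return [sum(e) / len(e) for e in effects]

# Hypothetical response: input 0 dominates, input 2 is inert.
mu_star = morris_mu_star(lambda z: 5.0 * z[0] + z[1], [(0.0, 1.0)] * 3)
print(mu_star)  # approximately [5.0, 1.0, 0.0]
```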
Initially, correlation analysis was conducted to identify the main factors influencing summer soybean evapotranspiration (ET) in the North China Plain. Based on this, multiple sets of input parameter combinations were designed. After model training, SHAP interpretability analysis and global sensitivity analysis were further integrated to systematically validate the impact of core driving factors on ET. This approach provides dual validation for the rationality of the input combinations and the reliability of the results.

2.6. Evaluation Metrics

To evaluate the performance of the ET estimation models, this study uses the coefficient of determination (R2), Nash–Sutcliffe efficiency coefficient (NSE), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE) as the primary evaluation metrics. These metrics have been widely used in previous research. The calculation methods for these statistical measures are provided in Equations (2)–(5):
R^2 = \frac{\left[\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)\left(y_i - \bar{y}\right)\right]^{2}}{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^{2}\,\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}} (2)
NSE = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}} (3)
MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| (4)
RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}} (5)
In the equations, y_i represents the observed ET values from the weighing lysimeter (mm/d); \hat{y}_i represents the ET values estimated by the model (mm/d); \bar{y} represents the mean of the observed ET values (mm/d); \bar{\hat{y}} represents the mean of the estimated ET values (mm/d); and n is the number of samples, where i = 1, 2, …, n (dimensionless).
The coefficient of determination (R2) measures the linear fit between the model’s predicted values and the observed values. A value closer to 1 indicates stronger explanatory power of the model. The Nash–Sutcliffe efficiency coefficient (NSE) is commonly used to assess the simulation accuracy of the model, with a value closer to 1 indicating higher model reliability. The Root Mean Square Error (RMSE) reflects the dispersion of the prediction errors. Due to its squared error calculation, it is particularly sensitive to larger errors (outliers), and its units are consistent with ET (mm/d), making it a key reference for irrigation decisions. The Mean Absolute Error (MAE) directly represents the average absolute deviation between the predicted and observed values, and it is less sensitive to outliers. Its units are also in mm/d.
Generally, the higher the R2 and NSE values (closer to 1), and the lower the RMSE and MAE values (closer to 0), the better the model’s simulation performance and prediction accuracy.
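The four metrics defined in Equations (2)–(5) can be computed directly in NumPy. The snippet below is a minimal sketch; the observed and simulated ET arrays are illustrative placeholders, not data from this study.

```python
import numpy as np

def r2(y, yhat):
    """Coefficient of determination, Eq. (2): squared Pearson correlation."""
    yc, pc = y - y.mean(), yhat - yhat.mean()
    return (yc @ pc) ** 2 / ((yc @ yc) * (pc @ pc))

def nse(y, yhat):
    """Nash–Sutcliffe efficiency, Eq. (3)."""
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

def mae(y, yhat):
    """Mean absolute error (mm/d), Eq. (4)."""
    return np.mean(np.abs(y - yhat))

def rmse(y, yhat):
    """Root mean square error (mm/d), Eq. (5)."""
    return np.sqrt(np.mean((y - yhat) ** 2))

y    = np.array([2.1, 3.4, 5.0, 6.2, 4.8])   # observed ET (mm/d), illustrative
yhat = np.array([2.0, 3.6, 4.8, 6.0, 5.1])   # simulated ET (mm/d), illustrative
```

Note that RMSE is never smaller than MAE on the same sample, which is why the two are reported together to gauge the influence of large errors.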

3. Results

3.1. Dynamic Characteristics of Summer Soybean Evapotranspiration and Its Influencing Factors

3.1.1. Construction and Validation of the Leaf Area Index (LAI) Model

To obtain daily LAI data that matches the observed ET, this study constructed a segmented Logistic model based on the correlation between the 2024 measured LAI data and effective accumulated temperature (GDD). The fitted relationship is shown in Equation (6):
LAI = \begin{cases} \dfrac{7.310}{1 + 1187\,e^{-0.01\,\mathrm{GDD}}}, & \mathrm{GDD} < 1422 \ (\text{ascending phase}) \\[4pt] \dfrac{7.229}{1 + 0.983\,e^{0.1\,(\mathrm{GDD} - 1531.3)}}, & \mathrm{GDD} \geq 1422 \ (\text{descending phase}) \end{cases} (6)
To verify the model’s generalizability and simulation ability, the 2023 GDD data was substituted into Equation (6) to obtain the simulated values, which were then compared with the observed values. The results indicate that the LAI model constructed based on effective accumulated temperature shows high accuracy, with an R2 of 0.88 between the estimated and observed LAI values (Figure 3a). This model effectively enables the daily continuous estimation of summer soybean LAI, compensating for the low temporal resolution and long observation intervals of the measured LAI data and providing an important foundation for the subsequent construction of the ET model.
As shown in Figure 3b, the model accurately simulated the dynamic process of summer soybean LAI changes in response to effective accumulated temperature in 2023 and 2024. The LAI follows a unimodal trend of first increasing and then decreasing throughout the growing season: it rises slowly after emergence, enters a typical “S”-shaped rapid growth phase at the end of flowering, reaches its peak, and then gradually declines as the leaves mature and age. Throughout the growing season, the simulated values closely match the trends of the observed values.
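Using the coefficients reported in Equation (6), the daily LAI can be interpolated from a GDD series as follows. This is a direct transcription of the fitted piecewise logistic model; the breakpoint (1422 degree-days) and all coefficients come from the text, while the function name and vectorized form are our own.

```python
import numpy as np

def lai_from_gdd(gdd):
    """Piecewise logistic LAI model of Eq. (6), fitted to the 2024 data.
    Ascending phase below GDD = 1422; descending phase at or above it."""
    gdd = np.asarray(gdd, dtype=float)
    ascending  = 7.310 / (1 + 1187.0 * np.exp(-0.01 * gdd))
    descending = 7.229 / (1 + 0.983 * np.exp(0.1 * (gdd - 1531.3)))
    return np.where(gdd < 1422, ascending, descending)
```

Feeding the cumulative daily GDD for a season through this function yields the continuous daily LAI series used as a model input: LAI stays near zero early in the season, rises through the S-shaped growth phase toward the plateau around 7.3, and declines after the breakpoint.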

3.1.2. Dynamic Changes and Influencing Factors of Evapotranspiration (ET) in Summer Soybean at Different Growth Stages

Daily average meteorological conditions, LAI, and ET of summer soybean across different growth stages are presented in Table 4. Overall, summer soybean ET followed a unimodal pattern of first increasing and then decreasing throughout the entire growth period; yet, interannual differences existed in the peak timing, occurring at the pod-filling stage in 2023 and the grain-filling stage in 2024. This discrepancy primarily stemmed from variations in the matching between LAI development and meteorological energy supply during critical growth phases. Additionally, RH, an indicator of atmospheric aridity or evaporative demand, exhibited annual fluctuations. For instance, RHmin occurred at the seedling stage in 2023 but shifted to the milk-ripe stage in 2024, further shaping ET dynamics across the two years.
ET remained low at the seedling stage due to the small LAI, with soil evaporation accounting for the majority of ET loss. By the harvest stage, leaf yellowing, withering, abscission, and reduced physiological activity led to a substantial decline in leaf area, essentially halting transpiration. The formation of the ET peak, however, relied on the synergistic interaction between canopy development (characterized by LAI) and meteorological driving forces during critical growth stages. Specifically, while the 2023 pod-filling stage LAI (5.47) had not yet peaked, favorable meteorological conditions—sufficient solar radiation (17.423 MJ/m2) and high temperature (34.099 °C)—provided adequate energy input to fully unlock canopy transpiration potential, driving ET to a peak of 7.53 mm/d. Even though air humidity decreased further at the subsequent grain-filling stage (RHmin fell from 72.862% to 67.082%, corresponding to higher vapor pressure deficit (VPD) and stronger evaporative demand), ET declined due to a marked reduction in solar radiation. This finding indicates that once LAI reaches its maximum, energy supply—primarily solar radiation—replaces atmospheric evaporative demand as the dominant limiting factor for ET.
In contrast, the 2024 pod-filling stage recorded the strongest energy input of the year (maximum temperature: 35.479 °C; solar radiation: 18.156 MJ/m2), but the relatively low LAI (4.62) deterministically restricted canopy transpiration capacity, preventing ET from peaking. It was not until the grain-filling stage that LAI reached its maximum value (7.01). Although the energy conditions at this stage—including a maximum temperature (31.435 °C) and solar radiation (14.494 MJ/m2)—were weaker than those at the pod-filling stage, they remained suitable. The efficient coupling between optimal LAI and favorable energy conditions ultimately drove ET to its peak of 7.28 mm/d during the grain-filling stage.
In summary, the occurrence of the ET peak is not absolutely controlled by a single factor. Instead, it depends on the dynamic matching between canopy development (LAI) and meteorological driving conditions, primarily solar radiation and temperature, during critical growth stages.

3.2. Comparison and Evaluation of Summer Soybean ET Simulation Performance Between the FAO-56 Penman–Monteith (PM) Model and Optimized Machine Learning Models

3.2.1. Performance Analysis of Summer Soybean ET Simulation Using the FAO-56 Penman–Monteith Model

The FAO-56 Penman–Monteith (PM) model is an internationally recognized standard method in the field of ET calculation. It integrates key meteorological variables such as air temperature, relative humidity, solar radiation, and wind speed, and is based on a well-defined physical mechanism, making it generally more accurate and reliable than other empirical formulas [53,54]. Accordingly, this study employed the FAO-56 PM model to estimate the actual ET of summer soybean.
Figure 4 presents the in situ simulation results for 2023 and cross-year prediction results for 2024 of the daily evapotranspiration (ET) of summer soybeans in the North China Plain using the FAO-56 PM model. Calibrated with measured meteorological data from 2023, the model exhibited good fitting performance in the 2023 simulation (R2 = 0.791, MAE = 1.002), demonstrating its capability to simulate ET variations under the conditions of “complete data and parameter adaptation”. However, when applying the 2023 data for cross-year prediction of summer soybean ET in 2024, the fitting accuracy decreased significantly (R2 = 0.550, MAE = 1.428), especially during the mid-growth stage (40–80 days after sowing), where the consistency between predicted and measured values dropped sharply.
The FAO-56 PM model is a traditional physics-based model grounded in climate responsiveness. While it can effectively capture ET variation trends within a single year, its generalization ability is constrained by factors such as interannual meteorological fluctuations (e.g., radiation, wind speed) and crop growth differences. Failing to fully adapt to interannual environmental and crop growth variations, the model lacks stability in cross-year predictions. Furthermore, the discrepancy between the 2023-based cross-year prediction results for 2024 provides an important reference for subsequent comparisons of the multi-factor nonlinear learning capabilities and cross-year generalization advantages of machine learning models.
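For reference, the daily FAO-56 Penman–Monteith reference ET can be computed as below. This is a standard-form sketch of the FAO-56 ET0 equation, not the calibrated model of this study: the input values, station elevation, and the soil heat flux default are illustrative assumptions, and actual crop ET would additionally require a crop coefficient.

```python
import math

def fao56_pm_et0(t_mean, t_max, t_min, rh_mean, u2, rn, g=0.0, z=75.0):
    """Daily FAO-56 Penman-Monteith reference ET (mm/d).
    Temperatures in deg C, rh_mean in %, u2 = wind speed at 2 m (m/s),
    rn = net radiation (MJ m-2 d-1), g = soil heat flux (~0 at daily scale),
    z = station elevation in m (illustrative value here)."""
    p = 101.3 * ((293 - 0.0065 * z) / 293) ** 5.26       # atmospheric pressure, kPa
    gamma = 0.000665 * p                                  # psychrometric constant
    es_t = lambda t: 0.6108 * math.exp(17.27 * t / (t + 237.3))
    es = (es_t(t_max) + es_t(t_min)) / 2                  # saturation vapour pressure
    ea = es * rh_mean / 100.0                             # actual vapour pressure
    delta = 4098 * es_t(t_mean) / (t_mean + 237.3) ** 2   # slope of sat. vapour curve
    num = 0.408 * delta * (rn - g) + gamma * 900 / (t_mean + 273) * u2 * (es - ea)
    return num / (delta + gamma * (1 + 0.34 * u2))

# illustrative mid-season day (all forcing values are assumptions)
et0 = fao56_pm_et0(t_mean=28, t_max=34, t_min=22, rh_mean=70, u2=1.5, rn=14)
```

Because all terms in this equation are fixed physical relationships and static parameters, any interannual shift in crop development must be absorbed by the crop coefficient, which is exactly the limitation discussed above.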

3.2.2. Overall Model Performance and Comparison of Optimization Algorithm Effects

Multi-dimensional evaluations of the daily evapotranspiration (ET) simulation performance of summer soybean were conducted using four machine learning models (RF, SVM, XGBoost, Stacking) and three optimization algorithms (PSO, GA, RGS)—with an additional unoptimized group (None)—based on four input parameter combinations. The evaluation metrics included the coefficient of determination (R2), Nash–Sutcliffe efficiency (NSE), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE) (Figure 5).
As shown in Figure 5, the R2 of all models (RF, XGB, SVM, and Stacking) improved after hyperparameter optimization using the PSO, GA, or RGS algorithm. The PSO algorithm stood out the most: after optimization, the model’s R2 and NSE increased by an average of 5.8–9.2%, while the error metrics (MAE and RMSE) decreased by 20–29% on average. These results indicate that hyperparameter optimization plays a crucial role in maximizing model performance and generalization ability.
Because Figure 5 shows that models optimized with PSO, GA, or RGS significantly outperform their unoptimized counterparts, the remainder of this study omits the unoptimized results and focuses on in-depth comparison and evaluation of the optimized model combinations, keeping the subsequent analysis concise and centered on the research question.
In terms of overall model trends, the PSO-optimized Stacking model achieved an R2 greater than 0.944 across all four input parameter combinations, with an average NSE of 0.943. Compared to PSO-optimized single models, Stacking-PSO significantly outperformed RF-PSO (R2 = 0.921), SVM-PSO (R2 = 0.907), and XGBoost-PSO (R2 = 0.918).
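The stacking architecture compared above can be sketched with scikit-learn’s StackingRegressor. Everything below is illustrative rather than the tuned models of this study: the data are synthetic stand-ins for the five F3 inputs, the hyperparameters are defaults rather than PSO optima, and GradientBoostingRegressor substitutes for XGBoost to keep the sketch dependency-free.

```python
import numpy as np
from sklearn.ensemble import (RandomForestRegressor, GradientBoostingRegressor,
                              StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# synthetic stand-in for (Rs, Tmax, LAI, RHmax, RHmin) -> ET
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(200, 5))
y = 3 * X[:, 0] + 2 * X[:, 1] + 4 * X[:, 2] + rng.normal(0, 0.1, 200)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("svr", make_pipeline(StandardScaler(), SVR(C=10))),
        ("gb", GradientBoostingRegressor(random_state=0)),  # XGBoost stand-in
    ],
    final_estimator=Ridge(),   # meta-learner combining base predictions
    cv=5,                      # out-of-fold base predictions avoid leakage
)
stack.fit(X[:160], y[:160])
score = stack.score(X[160:], y[160:])   # R^2 on the held-out split
```

The `cv=5` argument is what makes the meta-features honest: each base learner’s predictions for the meta-learner come from folds it was not trained on.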
To systematically evaluate the impacts of different optimization algorithms and input parameter combinations on model performance, this study generated three types of comparative figures (Figure 6) based on all simulation results: an optimization-algorithm performance comparison, an input-parameter-combination performance comparison, and a model-type performance comparison. The distribution characteristics of each combination across the four evaluation metrics (R2, NSE, MAE, and RMSE) were compared from three key dimensions (optimization strategy, input parameter combination, and model architecture) to reveal their differences in accuracy and stability.
(1) Comparison of Optimization Algorithm Performance
As shown in Figure 6a–d, models optimized with the PSO algorithm achieved the overall best performance, exhibiting the highest median values for R2 and NSE and the lowest median values for MAE and RMSE, along with a compact box distribution and fewer outliers. This indicates that PSO-optimized models not only deliver high accuracy but also strong stability. The GA ranked second, while models optimized with RGS showed relatively poor performance stability, with simulation results displaying a wider fluctuation range across different parameter combinations.
(2) Comparison of Input Parameter Combination Performance
From the perspective of input parameter combinations (Figure 6e–h), the four feature sets (F1–F4) showed small performance differences but with distinct strengths. Based on the aggregated results of all models and optimization algorithms, F1 achieved a slightly higher R2 (0.920) and NSE (0.917), along with the smallest RMSE (0.897), demonstrating superior performance in fitting accuracy and overall error control. F4, by contrast, had the lowest median MAE (0.739), indicating better performance in mean absolute deviation reduction. Overall, F1 and F4 exhibited comparable comprehensive performance, outperforming F2 and F3. Among them, F4 featured a more compact box distribution, reflecting stronger stability. Although F3 had slightly lower overall metrics, the “Stacking-PSO-F3” configuration still achieved the highest R2 in this study. This suggests that the streamlined feature set centered on LAI, Tmax, and RS retains potential for performance improvement when paired with optimal model-optimization combinations.
(3) Comparison of Model Type Performance
Figure 6i–l further illustrate the overall performance distribution of different machine learning models. The Stacking ensemble model significantly outperformed single models (RF, SVM, XGBoost) across all four evaluation metrics, showing the most favorable central tendency and the smallest degree of dispersion. This highlights the distinct advantage of ensemble learning in integrating the strengths of multiple models and enhancing generalization ability.
In summary, the PSO-optimized Stacking ensemble model, offering the best accuracy and robustness, is verified as the preferred choice for simulating daily evapotranspiration of summer soybean in the North China Plain.

3.2.3. Analysis of Summer Soybean Evapotranspiration Simulation Models Based on SHAP and Global Sensitivity Analysis

To further compare the model response mechanisms to ET under different parameter combinations, this study employed the SHAP (SHapley Additive Explanations) method to perform interpretability analysis on the optimal model combination (Stacking-PSO). This method can consistently quantify the contribution value of each parameter to the simulation results and is more capable of uncovering complex nonlinear relationships than traditional correlation analysis [55]. Two model combinations with relatively high coefficients of determination (Stacking-PSO-F3 and Stacking-PSO-F4) were selected, and their SHAP values were calculated as presented in Figure 7.
Comparing Figure 7a,b, under the two input parameter combinations, the following key findings emerged: (1) Leaf area index (LAI), maximum temperature (Tmax), and solar radiation (RS) exhibited higher SHAP values, serving as the dominant factors determining ET variations. (2) Maximum relative humidity (RHmax) and minimum relative humidity (RHmin) showed moderate contributions, reflecting secondary and stable regulatory effects. (3) In the F4 combination (which includes Tmin), Tmin had the lowest contribution. (4) The relative importance rankings of parameters in F3 and F4 were essentially consistent, with F3 displaying a more concentrated distribution of importance. This indicates that the model’s interpretive structure remains stable after feature streamlining.
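The paper’s attribution results come from the SHAP library; as a dependency-light illustration of the same idea (quantifying each feature’s contribution to a trained model), the sketch below uses scikit-learn’s permutation importance instead. The synthetic data, the coefficient choices, and the use of the paper’s variable names as labels are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# synthetic stand-in for the F4 inputs; feature names follow the paper,
# but the data and response are illustrative only
rng = np.random.default_rng(1)
names = ["Rs", "Tmax", "LAI", "RHmax", "RHmin", "Tmin"]
X = rng.uniform(0, 1, size=(300, 6))
y = (4 * X[:, 2] + 3 * X[:, 1] + 3 * X[:, 0]      # LAI, Tmax, Rs dominate
     + 0.5 * X[:, 3]                               # RHmax: weak effect
     + rng.normal(0, 0.1, 300))                    # noise

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
# importance = mean drop in score when one feature is shuffled
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranking = sorted(zip(names, result.importances_mean), key=lambda t: -t[1])
```

On data constructed this way, the three strongly weighted drivers dominate the ranking while the humidity and Tmin stand-ins fall to the bottom, mirroring the qualitative structure of the SHAP findings.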
To further verify the feature importance ranking identified by SHAP analysis and quantify the impact of input variable uncertainty on the daily ET simulation results of summer soybeans, this study employs three complementary global sensitivity analysis methods, namely perturbation analysis, Sobol variance decomposition, and Morris screening method, to analyze the optimal Stacking-PSO model from three dimensions, with the results shown in Figure 8.
(1) Results of Perturbation Analysis
When perturbations are applied around the sample mean, the model exhibits significant local sensitivity differences (Figure 8a). Results indicate that the normalized sensitivity index of LAI is the highest (0.301), followed by Tmin (0.216) and Tmax (0.159); in addition, the sensitivity indices of humidity variables (RHmin: 0.139, RHmax: 0.087) are all at relatively low levels.
(2) Results of Sobol Variance Decomposition
Based on the principle of variance decomposition, Sobol analysis clarifies the key physical mechanisms governing ET simulation (Figure 8b), and its results are highly consistent with the global importance ranking from SHAP analysis: the total effect indices (ST) of Rs (0.59) and LAI (0.37) collectively explain 96.0% of the ET variance, making them the core driving factors for ET changes; the total effect indices of other variables are all less than 0.1, among which Tmin’s ST value is only 0.020—much lower than its performance in perturbation analysis—indicating that Tmin’s impact on ET is more reflected in local fluctuations rather than global variation.
(3) Results of Morris Screening
The Morris method further verifies the effects from the perspectives of the intensity and nonlinearity of variable impacts (Figure 8c). Results show that Rs has the highest μ* value (3.16), followed by LAI (1.67), and these two are identified as the most critical driving factors with mean absolute effects significantly higher than those of other variables. Temperature variables (Tmax, Tmin) exert a moderate impact, while the remaining variables have weak effects—this ranking is generally consistent with the conclusions from perturbation analysis and Sobol analysis.
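The elementary-effects computation behind these μ* and σ values can be sketched in pure NumPy. This is a minimal one-at-a-time variant for illustration, with an assumed fixed step and a toy surrogate function; the study itself applied the method to the trained Stacking-PSO model.

```python
import numpy as np

def morris_screening(f, bounds, r=50, seed=0):
    """Minimal Morris method: r random one-at-a-time trajectories.
    Returns mu* (mean |elementary effect|) and sigma per input.
    Inputs are scaled to their [low, high] bounds; the step delta is a
    fixed fraction of the unit hypercube for simplicity."""
    rng = np.random.default_rng(seed)
    k = len(bounds)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    delta = 0.25
    effects = np.zeros((r, k))
    for t in range(r):
        x = rng.uniform(0, 1 - delta, size=k)      # trajectory start point
        y_prev = f(lo + (hi - lo) * x)
        for j in rng.permutation(k):               # move one input at a time
            x[j] += delta
            y_new = f(lo + (hi - lo) * x)
            effects[t, j] = (y_new - y_prev) / delta
            y_prev = y_new
    return np.abs(effects).mean(axis=0), effects.std(axis=0)

# toy surrogate: strong linear driver, weak driver, nonlinear driver
g = lambda v: 3.0 * v[0] + 0.1 * v[1] + 2.0 * v[2] ** 2
mu_star, sigma = morris_screening(g, bounds=[(0, 1)] * 3)
```

On this surrogate, the linear drivers show σ near zero (monotone effect) while the squared term shows elevated σ, the same diagnostic used above to flag the nonlinearity of LAI versus the near-monotone behavior of Rs.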

3.2.4. Evaluation of Model Robustness and Generalization Ability

To verify the predictive performance of the models in cross-annual experiments, this study evaluated model stability using Taylor diagrams based on an 80% training set and a 20% test set, aiming to select models with both stable performance and high accuracy. Figure 9 comprehensively presents the simulation results of the four machine learning models (RF, SVM, XGBoost, Stacking) combined with the three parameter optimization algorithms (PSO, GA, RGS) across the four input parameter combinations (F1–F4). Through three key metrics, namely the correlation coefficient (Corr), standard deviation (Std.R), and coefficient of determination (R2), the diagram quantifies the differences in simulation accuracy and stability among the various “parameter + model + optimization” combinations.
As shown in Figure 9, the Stacking-PSO-F2 combination exhibited the lowest standard deviation (Std.R = 0.899), indicating that the fluctuation of its simulated values is closest to the measured values and highlighting its outstanding stability. In contrast, Stacking-PSO-F3 achieved the highest coefficient of determination (R2 = 0.948), reflecting the best goodness of fit. Thus, Stacking-PSO-F3 has a slight advantage in explanatory power, while Stacking-PSO-F2 performs better in simulation stability—resulting in each combination having its own strengths in different scenarios.
Through internal validation (random division of the mixed dataset), the above results evaluated the model’s fitting performance under the historical data distribution. To systematically assess its cross-annual robustness and generalization ability, based on the results in Figure 9, the top three best-performing “parameter + model + optimization” combinations were selected for each machine learning algorithm (RF, SVM, XGBoost, Stacking), and cross-annual validation was conducted with the setup “training in 2023 and testing in 2024.” Since the training and test sets are completely independent in time, this setup can more realistically simulate the model’s predictive ability for future scenarios.
Figure 10 presents a comparison of summer soybean ET simulation performance among different input parameter combinations and optimization algorithms in cross-annual prediction. The Stacking-PSO-F3 combination exhibited outstanding performance across multiple evaluation metrics, with its comprehensive performance outperforming other comparative combinations. It demonstrated strong generalization ability across different years, reflecting its adaptability to changes in experimental environmental conditions. In contrast, SVM-based models exhibited relatively low performance—consistent with previous studies—indicating that SVM faces challenges in handling the strong nonlinear relationships inherent in ET simulation.
Under the same computing environment, the training and prediction processing times for each model combination are listed in Table 5 (corresponding one-to-one with the accuracy metrics in Figure 10). The processing times were recorded using Python’s time module: the training time denotes the total duration from model initialization to completion of training, and the prediction time denotes the time required to run inference on the entire test set. All reported times are averages of three independent repeated runs.
Table 5 makes the differences in time cost among the model combinations directly visible: the Stacking + PSO series requires relatively long training times (4–6.5 min) because it integrates multiple base models and requires PSO iterative optimization, whereas single models such as RF and XGBoost train more efficiently (1–4.5 min). The number of features also affects time consumption; for example, models using the largest feature set (F4) train slightly longer than the F2/F3 combinations of the same algorithm. Combined with the accuracy indicators in Figure 8, Figure 9 and Figure 10, the table provides time-dimensional support for model selection that balances accuracy and efficiency.
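The timing protocol described above can be expressed as a small helper. The helper name and the dummy model are illustrative; any estimator exposing `fit`/`predict`, including the combinations in Table 5, could be passed in.

```python
import time

def timed_fit_predict(model, X_train, y_train, X_test, repeats=3):
    """Average training and inference wall-clock times over repeated runs,
    mirroring the protocol in the text (Python time module, mean of three
    independent runs)."""
    train_t, pred_t = [], []
    for _ in range(repeats):
        t0 = time.perf_counter()
        model.fit(X_train, y_train)
        train_t.append(time.perf_counter() - t0)
        t0 = time.perf_counter()
        model.predict(X_test)
        pred_t.append(time.perf_counter() - t0)
    return sum(train_t) / repeats, sum(pred_t) / repeats

# trivial stand-in estimator to show the call pattern
class Dummy:
    def fit(self, X, y):
        return self
    def predict(self, X):
        return [0.0] * len(X)

tt, pt = timed_fit_predict(Dummy(), [[1.0]] * 10, [0.0] * 10, [[1.0]] * 5)
```

`time.perf_counter()` is used rather than `time.time()` because it is monotonic and has higher resolution, which matters for sub-second prediction times.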
Figure 11 presents a comparison between the predicted and measured values of summer soybean ET across different model combinations in cross-annual simulation. The Stacking-PSO-F3 combination exhibited minimal deviations between simulated and measured values, achieving high accuracy (R2 = 0.9, RMSE = 0.85). Other Stacking-PSO-optimized combinations also performed excellently. Additionally, both RF-PSO-F3 and XGBoost-PSO-F3 demonstrated strong correlations between their simulated and measured values.
Figure 12 illustrates the performance of different model combinations in cross-annual evapotranspiration (ET) simulation for summer soybean. As observed in the figure, the Stacking-PSO-F3 and Stacking-PSO-F2 models showed the closest alignment with measured values in terms of simulation accuracy, with Stacking-PSO-F3 in particular demonstrating strong stability and accuracy. In contrast, while RF-PSO-F3 and XGBoost-PSO-F3 also performed well, they exhibited slight lag or overestimation during the early and late growth stages, likely reflecting the models’ limited adaptability to changes in climatic conditions.
As shown earlier in Figure 9, the Stacking-PSO-F2 model had the smallest standard deviation (Std.R), indicating that it outperformed other combinations in terms of simulation stability. When model performance is weighed against the other metrics (e.g., R2 and MAE), however, the Stacking-PSO-F3 model performs better overall, with higher comprehensive prediction accuracy (R2 = 0.948 and 0.900 in internal and cross-annual validation, respectively).
Although Stacking-PSO-F2 exhibited superior stability, the comprehensive performance evaluation indicates that Stacking-PSO-F3—relying on the synergistic effects of core driving factors such as LAI, Tmax, and Rs—delivered higher estimation accuracy and cross-annual robustness.
This analysis result is consistent with previous experimental findings, demonstrating that Stacking—especially when optimized with PSO—offers significant advantages in both accuracy and stability for ET prediction.

4. Discussion

4.1. Performance Comparison Between Different Machine Learning Models and the Penman–Monteith Model and Analysis of Ensemble Advantages

The PSO-Stacking ensemble framework constructed in this study performs excellently in ET estimation of summer soybean in the North China Plain, with an R2 of 0.948 on the test set and a cross-year validation R2 of 0.900, demonstrating strong fitting ability and spatiotemporal stability. Comparative studies on soybean ET models under humid continental climates [56] have confirmed that models adapted to crop and regional climatic characteristics outperform general-purpose models. Against the background of the semi-humid monsoon climate in the North China Plain, the PSO-Stacking-F3 combination accurately captures the core driving factors (LAI, Tmax, Rs), and its cross-year validation R2 (0.900) is significantly higher than that of the other models in the study, further verifying the advantage of the “ensemble framework + regional adaptation” approach.
This result shows significant advantages compared with existing machine learning-based ET estimation studies, with detailed comparisons provided in Table 6. As can be seen from Table 6, the PSO-Stacking framework in this study outperforms all the comparative studies in all core accuracy indicators: its R2 (0.948) is significantly higher than that of similar optimized models (R2 ≤ 0.909 for RF-PSO, XGBoost-PSO, SVM-PSO, and DNN-PSO) [16], and also obviously superior to unoptimized single machine learning models, with R2 improved by approximately 4–9% (R2 ≤ 0.85 for SVM, XGBoost, LSTM, and RF) [57,58,59]. In this study, the comprehensive performance of the Stacking ensemble model is significantly better than all single models (Figure 5, Figure 6i–l, Figure 7, Figure 8 and Figure 9), a finding consistent with the conclusions of Saggi & Jain and Liang et al. in ET estimation studies of different crops [15,60], which collectively confirm the general rule that ensemble learning enhances model generalization ability by integrating the advantages of heterogeneous base learners.
In this study, the FAO-56 P-M model exhibits a certain degree of adaptability in ET simulation based on modified crop coefficients in 2023 (R2 = 0.791, MAE = 1.002). However, its accuracy significantly decreases when applied cross-year for summer soybean ET prediction in 2024 (R2 = 0.414–0.550, MAE = 1.428–1.489), revealing the inherent limitations of traditional empirical models: they rely on fixed growth stage parameters, making it difficult to adapt to the interannual dynamic differences in meteorological conditions and crop growth, and fail to capture the nonlinear relationships between ET and its influencing factors.
The RF, SVM, XGBoost, and Stacking ensemble models constructed in this study exhibit significantly superior cross-year ET simulation performance compared to the FAO-56 P-M model. Further comparative verification shows that, even after calibration with measured data from 2023, the accuracy of the P-M model in estimating summer soybean ET in 2024 remains low (R2 = 0.550, MAE = 1.428 mm/d), consistent with existing conclusions that machine learning models outperform mechanistic models in complex farmland environments [53,62,63]. This result reveals the core flaws of the P-M model [58]: first, static empirical crop coefficients have difficulty reflecting the dynamic changes in evapotranspiration caused by interannual differences in meteorological conditions, management practices, and crop varieties; second, its linear physical structure cannot capture the complex nonlinear synergistic and antagonistic effects between LAI, canopy resistance, and meteorological factors [62,64].
A study on evapotranspiration estimation in subhumid-semiarid climates [65] also pointed out that the linear structure of the P-M model cannot capture the nonlinear synergistic effects of multiple factors, and the RMSE of hybrid machine learning models is reduced by more than 30% on average compared with it. By directly learning the complex mapping relationships among multiple years and factors, the Stacking-PSO model constructed in this study does not require pre-set fixed parameters and exhibits strong adaptability to interannual fluctuations. In the cross-year validation, the Stacking-PSO-F3 model maintains an R2 of 0.900, and its RMSE (0.85 mm/d) is much lower than the cross-year prediction error of the P-M model. This not only confirms the flexibility driven by the machine learning framework [16,24] but also highlights its stronger robustness, stability, and prediction reliability compared with traditional mechanistic models with fixed parameters in agricultural hydrological simulation against the background of climate change [62,66]. Therefore, the optimized ensemble framework constructed in this study not only achieves accuracy breakthroughs through heterogeneous integration and intelligent optimization but also demonstrates clear application advantages in the practical and critical dimension of cross-temporal generalization ability.
The superior performance of the Stacking model is attributed to its two-layer learning architecture [39]. The first layer comprises multiple base learners (RF, SVM, XGBoost), forming a diverse model pool: RF excels at capturing complex interactive relationships [30], SVM effectively constructs separating hyperplanes in high-dimensional spaces [35], and XGBoost accurately fits residuals through a gradient boosting mechanism [37]. These models fit the complex nonlinear relationships of the ET process from distinct perspectives, and their predictions form complementary “meta-features.” The second-layer meta-learner further learns from these meta-features, automatically assigning higher weights to different base models under optimal scenarios, thereby effectively correcting potential systematic biases of single models [39,67]. This mechanism enables the Stacking model to exhibit stronger adaptability and robustness when coping with drastic fluctuations in meteorological conditions and dynamic changes in leaf area index (LAI) during the growth period of summer soybean in the North China Plain, which is consistent with the findings of Liu et al. (2024) in ensemble learning-based estimation of reference crop evapotranspiration [41].

4.2. Enhancement Effect of Optimization Algorithms on Model Performance

This study introduced three intelligent optimization algorithms—Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and Random Grid Search (RGS)—to perform an automated global search for the hyperparameters of the models [43,44]. Results showed that the performance of all optimized models was significantly improved, with the PSO algorithm delivering the most notable optimization effect (Figure 5 and Figure 6a–d). After PSO, the average R2 and NSE of the models increased by 5.8–9.2%, while the MAE and RMSE decreased by an average of 20–29%. A hyperparameter optimization study on evapotranspiration prediction [68] has confirmed that the PSO algorithm performs the best in improving accuracy (an average R2 increase of 7.3%) and reducing variance, and adapts well to complex ensemble models. This is consistent with the optimization effect of PSO on Stacking in this study, confirming the universality of PSO as a preferred optimization algorithm for evapotranspiration simulation.
A notable advancement of this study lies in the systematic integration of optimization algorithms (PSO, GA, RGS) with ensemble learning. Consistent results demonstrate that PSO performs best in optimizing the hyperparameters of the Stacking model [45,69]. Leveraging its cooperative evolution mechanism, which simulates swarm intelligence, PSO guides hyperparameter “particles” to approach the global optimal solution in a complex space in an efficient and directional manner—thus exhibiting significant advantages in addressing high-dimensional nonlinear optimization problems such as ET estimation [21]. This study found that even for complex ensemble structures, such as Stacking (which integrates RF, SVM, and XGBoost), PSO still maintains excellent search efficiency and stability. It not only significantly improves model estimation accuracy but, more critically, ensures robust performance in cross-annual simulations. This conclusion corroborates the findings of Zhao et al. [21] regarding PSO’s performance in winter wheat ET modeling [16], collectively verifying the versatility and effectiveness of PSO in agricultural ET estimation tasks. In contrast, RGS exhibits large result fluctuations due to its random sampling characteristics [48], while GA—despite possessing global search capability—lags slightly behind PSO in terms of convergence speed and accuracy [24]. Therefore, this study explicitly recommends PSO as the preferred optimization algorithm for driving high-precision ET ensemble models.
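The cooperative search mechanism of PSO described above can be condensed into a short NumPy routine. This is a generic sketch under assumed default coefficients (inertia w, cognitive c1, social c2), not the study’s tuned optimizer; in the paper’s setting the objective `f` would be a cross-validated model error over the hyperparameter space, while here it is an arbitrary callable demonstrated on a convex surrogate.

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=30, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal Particle Swarm Optimization. Particles track personal bests
    and the swarm's global best, moving under inertia (w) plus cognitive
    (c1) and social (c2) pulls toward those bests."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    x = rng.uniform(lo, hi, size=(n_particles, len(bounds)))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([f(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)                 # keep particles in bounds
        vals = np.array([f(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[pbest_val.argmin()].copy()
    return g, float(pbest_val.min())

# sanity check on a convex surrogate objective (shifted sphere function)
best_x, best_val = pso_minimize(lambda p: float(np.sum((p - 0.5) ** 2)),
                                bounds=[(0, 1)] * 3)
```

The same loop applies unchanged to hyperparameter tuning: each particle position encodes a candidate hyperparameter vector, and `f` returns the validation error of a model trained with it.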

4.3. Core Driving and Regulatory Mechanisms of Summer Soybean Evapotranspiration

Through the framework of “correlation analysis → SHAP analysis → global sensitivity analysis”, this study clarifies the core mechanism underlying the daily-scale ET simulation model for summer soybean. SHAP analysis of the optimal Stacking-PSO combinations (F3, F4), combined with preliminary correlation screening, identifies LAI, Tmax, and Rs as the core driving factors of summer soybean ET in the North China Plain. As shown by the comprehensive comparison table and SHAP plots in Figure 8d, the three global sensitivity analysis methods agree with the SHAP conclusions and differ only in ranking: fluctuations in the SHAP values of LAI and Tmax are more prominent in the SHAP plots, whereas the global effect of Rs is more significant in the perturbation analysis, Sobol analysis, and Morris screening (ranking second). This difference in ranking arises because SHAP captures “local fluctuation impacts”, whereas global sensitivity analysis emphasizes “global variation contributions”. Although Tmax ranks high in the SHAP analysis, its global ranking (third in both the perturbation and Sobol analyses) is lower than that of Rs, echoing the Sobol conclusion that its influence is expressed mainly through local fluctuations rather than as a core driver of the global variation in ET.
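SHAP attributes a prediction to features by averaging each feature’s marginal contribution over all coalitions of the remaining features. For a model small enough to enumerate, the exact Shapley values can be computed directly; the three-feature toy response below (an LAI × radiation interaction plus a linear temperature term, with illustrative coefficients) is an assumption for demonstration, not the fitted Stacking model of this study.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution for a single prediction.

    predict: model taking a full feature vector.
    x: the instance to explain; baseline: reference values used for
    features absent from a coalition.
    """
    n = len(x)
    phi = [0.0] * n
    idx = list(range(n))
    for i in idx:
        others = [j for j in idx if j != i]
        for size in range(n):
            for coal in combinations(others, size):
                # Shapley kernel weight for a coalition of this size.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in coal or j == i) else baseline[j]
                          for j in idx]
                without = [x[j] if j in coal else baseline[j] for j in idx]
                phi[i] += weight * (predict(with_i) - predict(without))
    return phi

# Toy ET-like response: v = [LAI, Rs, Tmax]; coefficients illustrative.
predict = lambda v: 0.8 * v[0] * v[1] + 0.1 * v[2]
x, base = [3.0, 20.0, 30.0], [0.0, 0.0, 0.0]
phi = shapley_values(predict, x, base)
```

By the efficiency property, the attributions sum exactly to f(x) − f(baseline); the LAI × Rs interaction is split evenly between the two interacting features, while the additive Tmax term is credited only to Tmax.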
As the Morris screening results in Figure 8c show, the σ value of LAI exceeds the median (0.54), exhibiting clearly nonlinear behavior. This is corroborated by the SHAP finding that the contribution of LAI stabilizes in the high-LAI range: after canopy closure, evapotranspiration no longer increases linearly with leaf area. The σ value of Rs is close to the median (0.54), indicating a weak nonlinear influence consistent with a monotonic driving characteristic (in Morris analysis, a smaller σ means the variable’s influence is closer to monotonic), in line with the energy-driven physical mechanism. This agrees with the conclusions of Li et al. (2025), who used solar-induced chlorophyll fluorescence combined with machine learning to predict actual ET of winter wheat and found that “vegetation and energy factors dominate ET changes”, and of Yang (2010), who quantified the response of ET to climate change in the Yellow River Basin and concluded that “energy factors drive linear changes in ET” [62,70].
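The link between σ and nonlinearity invoked above can be demonstrated with a minimal elementary-effects computation. This is a simplified one-at-a-time design (not the full Morris trajectory scheme), and the toy surrogate model is an assumption: input 0 acts linearly, like an energy driver, while input 1 saturates, like a canopy-closure response.

```python
import random

def elementary_effects(model, n_inputs, n_samples=50, delta=0.25, seed=1):
    """Simplified Morris-style screening on the unit hypercube.

    Returns (mu_star, sigma): the mean absolute elementary effect and
    its standard deviation per input. A large sigma flags nonlinear or
    interacting behaviour; a small sigma suggests a near-monotonic,
    linear effect.
    """
    rng = random.Random(seed)
    effects = [[] for _ in range(n_inputs)]
    for _ in range(n_samples):
        x = [rng.uniform(0, 1 - delta) for _ in range(n_inputs)]
        y0 = model(x)
        for i in range(n_inputs):
            xp = x[:]
            xp[i] += delta
            effects[i].append((model(xp) - y0) / delta)
    mu_star, sigma = [], []
    for e in effects:
        mean = sum(e) / len(e)
        mu_star.append(sum(abs(v) for v in e) / len(e))
        sigma.append((sum((v - mean) ** 2 for v in e) / len(e)) ** 0.5)
    return mu_star, sigma

# Toy surrogate: input 0 is linear, input 1 saturates (peaks mid-range).
model = lambda v: 2.0 * v[0] + 4.0 * v[1] * (1.0 - v[1])
mu_star, sigma = elementary_effects(model, 2)
```

The linear input yields an essentially constant elementary effect (σ ≈ 0), while the saturating input’s effect depends on where in its range it is perturbed, producing a large σ — the same signature used above to separate Rs from LAI.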
SHAP analysis further reveals that the humidity variables (RHmax, RHmin) act as regulators, whereas Tmin contributes minimally, consistent with the conclusion of Sanikhani et al. [9] that nighttime temperature exerts a weak driving effect. Although Tmin is strongly correlated with ET, its SHAP values and sensitivity indices (Sobol ST = 0.020, Morris μ* = 0.41) are lower than those of RH. This stems essentially from its collinearity with Tmax and Rs, which renders much of its information redundant. In contrast, RH nonlinearly regulates evaporative demand and canopy resistance, giving it a higher effective contribution [64,71].
In summary, LAI-Tmax-Rs constitutes the core driving chain of summer soybean ET, with RH serving as a secondary regulatory factor, consistent with the conclusions of the FAO-56 P-M model and evapotranspiration studies in China [53,62,63]. The F3 feature set, which excludes Tmin and retains RH, achieves a balance between accuracy (test set R2 = 0.948, cross-year validation R2 = 0.900) and complexity, verifying the rationality of the “core driving + auxiliary regulation” minimum feature set [72,73].

4.4. Model Robustness and Cross-Annual Generalization Ability

In this study, the Stacking-PSO-F3 combination maintained high accuracy in cross-annual validation (R2 = 0.900), demonstrating excellent generalization ability. This is consistent with Zhao et al.’s findings [16] on the stability of PSO-optimized models in cross-annual ET prediction for winter wheat, indicating that optimization algorithms not only improve model fitting performance but also enhance adaptability to interannual climatic fluctuations by mitigating overfitting. Furthermore, building on the advantages of fusing multiple base models, the Stacking model was further enhanced by PSO-based hyperparameter optimization. This enabled it to effectively adapt to typical meteorological variations between different years in the North China Plain—such as differences in temperature, humidity, and radiation conditions—thereby delivering reliable prediction performance and exhibiting substantial potential for agricultural applications [64,74,75].
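The cross-annual validation scheme used above amounts to a grouped, leave-one-year-out split: the model is trained on one season’s data and scored on a season it has never seen. The sketch below is illustrative; the toy “model” is a least-squares slope through the origin standing in for the Stacking-PSO estimator, and the data values are invented for demonstration.

```python
def leave_one_year_out(records, fit, predict):
    """Cross-annual validation: hold out each year in turn.

    records: list of (year, feature, et) samples.
    fit(train) -> model; predict(model, feature) -> estimated ET.
    Returns {held_out_year: R^2 on that year}.
    """
    years = sorted({r[0] for r in records})
    scores = {}
    for held in years:
        train = [r for r in records if r[0] != held]
        test = [r for r in records if r[0] == held]
        model = fit(train)
        obs = [et for _, _, et in test]
        sim = [predict(model, f) for _, f, _ in test]
        mean_obs = sum(obs) / len(obs)
        ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
        ss_tot = sum((o - mean_obs) ** 2 for o in obs)
        scores[held] = 1.0 - ss_res / ss_tot
    return scores

# Toy data: ET roughly proportional to a single driver in both years.
records = [(2023, 1.0, 2.0), (2023, 2.0, 4.1),
           (2024, 1.5, 3.0), (2024, 2.5, 5.1)]
fit = lambda train: (sum(x * et for _, x, et in train)
                     / sum(x * x for _, x, _ in train))  # slope via least squares
predict = lambda slope, x: slope * x
scores = leave_one_year_out(records, fit, predict)
```

Because each test year contributes no samples to training, the resulting R2 reflects generalization under interannual climatic variation rather than in-sample fit.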

4.5. Limitations and Future Directions

Despite the favorable results achieved in this study, certain limitations remain. First, the study area is confined to the North China Plain, so the model’s applicability in other climatic zones requires further verification. Second, the Stacking-PSO model does not explicitly incorporate the direct effects of irrigation and rainfall, and its high computational complexity hinders direct implementation in the short term. Future research can build on this work in three directions. First, integrate multi-source remote sensing data (e.g., NDVI, soil moisture) [61,76] to further enhance the model’s spatiotemporal adaptability. Second, introduce models that couple deep learning with physical mechanisms (e.g., combining LSTM with crop growth models) [77] to improve simulation robustness under extreme meteorological conditions. Third, building on the high-precision ET simulations of this study, combine the water demand characteristics of summer soybean across its growth period with dynamic soil moisture monitoring to establish a multi-factor, synergistic irrigation decision mechanism of “cumulative ET consumption—soil moisture surplus/deficit—growth-stage water demand”. By coupling key processes and state variables to set differentiated irrigation thresholds, water demand, water supply, and irrigation can be precisely matched, further improving the efficiency of agricultural water resource utilization.
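The threshold-based irrigation decision mechanism proposed for future work could take the following shape. Every threshold, stage name, and numeric value below is hypothetical, chosen only to illustrate how the three signals (cumulative ET depletion, soil moisture status, and stage-specific demand) might be combined into a trigger rule.

```python
def irrigation_decision(cum_et_mm, rain_mm, soil_moisture, stage,
                        depletion_limit_mm=30.0):
    """Hypothetical irrigation trigger combining cumulative ET
    depletion, soil-moisture status, and growth-stage water demand.

    cum_et_mm: ET accumulated since the last irrigation/rain reset.
    rain_mm: rainfall over the same window.
    soil_moisture: root-zone moisture as a fraction of field capacity.
    stage: growth-stage key selecting a differentiated threshold.
    """
    # Stage-specific critical moisture fractions (illustrative values:
    # stricter during flowering/pod setting, looser near harvest).
    critical = {"seedling": 0.55, "flowering": 0.65, "pod_setting": 0.65,
                "seed_filling": 0.60, "milk": 0.55, "harvest": 0.45}
    net_depletion = cum_et_mm - rain_mm
    # Irrigate only when cumulative depletion exceeds the limit AND the
    # soil has fallen below the stage's critical moisture fraction.
    return net_depletion > depletion_limit_mm and soil_moisture < critical[stage]

# During flowering, 35 mm net depletion with soil at 60% of field
# capacity triggers irrigation; the same depletion at 70% does not.
```

Coupling the rule to daily ET simulations from the model would let the depletion term accumulate from simulated rather than measured ET, which is the precision-irrigation use case the study targets.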

5. Conclusions

Based on synchronous observation data from large-scale weighing lysimeters and automatic weather stations, this study systematically analyzed the dynamic variation characteristics and influencing factors of daily evapotranspiration (ET) in summer soybean. Results indicate that summer soybean ET follows a unimodal curve over the entire growth period, peaking during the pod-setting or seed-filling stage. The timing of this peak depends primarily on the dynamic matching between the development of the leaf area index (LAI) and the meteorological energy supply (dominated by solar radiation (Rs) and maximum temperature (Tmax)) during key growth stages.
This study confirms that Rs, Tmax, and LAI are the primary driving factors of ET. For simulating daily ET of summer soybean in the North China Plain, we recommend the Stacking ensemble model with Particle Swarm Optimization (PSO) for hyperparameter tuning, using LAI, Tmax, Rs, maximum relative humidity (RHmax), and minimum relative humidity (RHmin) as inputs.

Author Contributions

Conceptualization, L.H. and N.S.; Formal analysis, L.H., S.D. and F.G.; Data curation, L.H., Y.S. and S.D.; Validation, L.H., Y.S. and N.S.; Funding acquisition, N.S.; Investigation, L.H., S.D. and Y.S.; Methodology, L.H., H.L. and N.S.; Software, L.H., F.G. and Y.S.; Writing—original draft, L.H.; Writing—review and editing, F.G., H.L. and N.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Agricultural Science and Technology Innovation Program (ASTIP) of the Chinese Academy of Agricultural Sciences.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon reasonable request. The data are not publicly available due to copyright restrictions.

Acknowledgments

The authors used generative artificial intelligence (AI) language models during the writing process to assist with drafting sections of the manuscript, generating ideas, and refining the structure, clarity, and readability of the text. All AI-generated content was thoroughly reviewed, edited, and validated by the authors to ensure that it aligns with the research objectives, data, and findings, and that it does not compromise the accuracy or authenticity of the work. The authors maintained full control over the final content and conclusions presented in this paper and take full responsibility for them; any errors or inaccuracies are solely attributable to the authors. The use of generative AI is disclosed here transparently to maintain trust in the scientific community.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, H.Y. Study on the Formation Mechanism and Avoidance of Soybean Import Dependence Risk in China. Master’s Thesis, Yangzhou University, Yangzhou, China, 2023. [Google Scholar]
  2. Cai, H.J.; Zhai, L.Q.; Shen, Q.H.; Cao, J.Q. Technical Path for Increasing Soybean Yield under Grain-Soybean Rotation in the Huang Huai Hai Region. Anhui Agric. Sci. Bull. 2025, 31, 16–19. [Google Scholar] [CrossRef]
  3. Yang, S.; Li, H.; Xu, Y.; Wang, T.; Hu, Y.; Zhao, Y.; Qian, X.; Li, Z.; Sui, P.; Gao, W.; et al. The Yield Performance of Maize-Soybean Intercropping in the North China Plain: From 172 Sites Empirical Investigation. Field Crops Res. 2024, 315, 109467. [Google Scholar] [CrossRef]
  4. Fan, Y.; Wang, E.; Gong, W.; Xu, L.; Zhao, Z.; He, D.; Yang, F.; Wang, X.; Yong, T.; Liu, J.; et al. Soybean Yield Variations and the Potential of Intercropping to Increase Production in China. Field Crops Res. 2023, 297, 108771. [Google Scholar] [CrossRef]
  5. Ren, T.H.; Li, Z.Y.; Du, B.; Zhang, X.H.; Xu, Z.; Gao, D.P.; Zheng, B.; Zhao, W.; Li, G.; Ning, T.Y. Improving Photosynthetic Performance and Yield of Summer Soybean by Organic Fertilizer Application and Increasing Plant Density. J. Plant Nutr. Fertil. 2021, 27, 1361–1375. [Google Scholar] [CrossRef]
  6. Liao, Z.; Ding, X.; Zhang, H.; Zhang, H.; Li, Z.; Zhang, F.; Fan, J. Multi-objective optimization of soil water-nitrogen management practice and seeding rate for sustainable soybean production on the Loess Plateau of China. Agric. Water Manag. 2025, 312, 109414. [Google Scholar] [CrossRef]
  7. Karbasi, M.; Jamei, M.; Ali, M.; Malik, A.; Yaseen, Z.M. Forecasting Weekly Reference Evapotranspiration Using Auto Encoder Decoder Bidirectional LSTM Model Hybridized with a Boruta-CatBoost Input Optimizer. Comput. Electron. Agric. 2022, 198, 107121. [Google Scholar] [CrossRef]
  8. Fan, J.L.; Ma, X.; Wu, L.F.; Zhang, F.C.; Yu, X.; Zeng, W.Z. Light Gradient Boosting Machine: An Efficient Soft Computing Model for Estimating Daily Reference Evapotranspiration with Local and External Meteorological Data. Agric. Water Manag. 2019, 225, 105758. [Google Scholar] [CrossRef]
  9. Sanikhani, H.; Kisi, O.; Maroufpoor, E.; Yaseen, Z.M. Temperature-Based Modeling of Reference Evapotranspiration Using Several Artificial Intelligence Models: Application of Different Modeling Scenarios. Theor. Appl. Climatol. 2019, 135, 449–462. [Google Scholar] [CrossRef]
  10. Wang, L.; Kisi, O.; Zounemat-Kermani, M.; Li, H. Pan Evaporation Modeling Using Six Different Heuristic Computing Methods in Different Climates of China. J. Hydrol. 2017, 544, 407–427. [Google Scholar] [CrossRef]
  11. Fan, J.L.; Yue, W.J.; Wu, L.F.; Zhang, F.C.; Cai, H.J.; Wang, X.K.; Lu, X.H.; Xiang, Y.Z. Evaluation of SVM, ELM and Four Tree-Based Stacking-ensemble Models for Predicting Daily Reference Evapotranspiration Using Limited Meteorological Data in Different Climates of China. Agric. For. Meteorol. 2018, 263, 225–241. [Google Scholar] [CrossRef]
  12. Kisi, O. Modeling Reference Evapotranspiration Using Three Different Heuristic Regression Approaches. Agric. Water Manag. 2016, 169, 162–172. [Google Scholar] [CrossRef]
  13. Chia, M.Y.; Huang, Y.F.; Koo, C.H. Support Vector Machine Enhanced Empirical Reference Evapotranspiration Estimation with Limited Meteorological Parameters. Comput. Electron. Agric. 2020, 175, 105577. [Google Scholar] [CrossRef]
  14. Yamaç, S.S. Artificial Intelligence Methods Reliably Predict Crop Evapotranspiration with Different Combinations of Meteorological Data for Sugar Beet in a Semiarid Area. Agric. Water Manag. 2021, 254, 106968. [Google Scholar] [CrossRef]
  15. Saggi, M.K.; Jain, S. Application of Fuzzy-Genetic and Regularization Random Forest (FG-RRF) for Estimating Crop Evapotranspiration (ETc) of Maize and Wheat. Agric. Water Manag. 2020, 229, 105907. [Google Scholar] [CrossRef]
  16. Zhao, X.; Zhang, L.; Zhu, G.; Chen, C.G.; He, J.; Traore, S.; Singh, V.P. Exploring Interpretable and Non-Interpretable Machine Learning Models for Estimating Winter Wheat Evapotranspiration Using Particle Swarm Optimization with Limited Climatic Data. Comput. Electron. Agric. 2023, 212, 108140. [Google Scholar] [CrossRef]
  17. Kavzoglu, T.; Teke, A. Advanced Hyperparameter Optimization for Improved Spatial Prediction of Shallow Landslides Using Extreme Gradient Boosting (XGBoost). Bull. Eng. Geol. Environ. 2022, 81, 229. [Google Scholar] [CrossRef]
  18. Jiang, X.M.; Wang, G.Q.; Wang, Y.T.; Yao, J.P.; Xue, B.L. A Hybrid Framework for Simulating Actual Evapotranspiration in Data-Deficient Areas: A Case Study of the Inner Mongolia Section of the Yellow River Basin. Remote Sens. 2023, 15, 2234. [Google Scholar] [CrossRef]
  19. Li, T.T. Research on Hyperparameter Optimization Based on Improved Particle Swarm Optimization. Master’s Thesis, Xidian University, Xi’an, China, 2019. [Google Scholar]
  20. Gong, L.Q. Research and Implementation of Hyperparameter Optimization Methods in Machine Learning. Master’s Thesis, Xidian University, Xi’an, China, 2024. [Google Scholar] [CrossRef]
  21. Petković, D.; Gocic, M.; Shamshirband, S.; Qasem, S.N.; Trajkovic, S. Particle Swarm Optimization-Based Radial Basis Function Network for Estimation of Reference Evapotranspiration. Theor. Appl. Climatol. 2016, 125, 555–563. [Google Scholar] [CrossRef]
  22. Wu, Z.J.; Cui, N.B.; Hu, X.T.; Gong, D.Z.; Wang, Y.S.; Feng, Y.; Jiang, S.Z.; Lv, M.; Han, L.; Xing, L.W.; et al. Optimization of Extreme Learning Machine Model with Biological Heuristic Algorithms to Estimate Daily Reference Crop Evapotranspiration in Different Climatic Regions of China. J. Hydrol. 2021, 598, 127028. [Google Scholar] [CrossRef]
  23. Zhang, Y.; Cui, N.B.; Feng, Y.; Gong, D.Z.; Hu, X.T. Comparison of BP, PSO-BP and Statistical Models for Predicting Daily Global Solar Radiation in Arid Northwest China. Comput. Electron. Agric. 2019, 164, 104905. [Google Scholar] [CrossRef]
  24. Tang, D.H.; Feng, Y.; Gong, D.Z.; Hao, W.P.; Cui, N.B. Evaluation of Artificial Intelligence Models for Actual Crop Evapotranspiration Modeling in Mulched and Non-Mulched Maize Croplands. Comput. Electron. Agric. 2018, 152, 375–384. [Google Scholar] [CrossRef]
  25. Mun, H.; Seo, S.; Son, B.; Yun, J. Black-Box Audio Adversarial Attack Using Particle Swarm Optimization. IEEE Access 2022, 10, 21685–21695. [Google Scholar] [CrossRef]
  26. Xu, Y.H.; Hu, C.H.; Wu, Q.; Jian, S.Q.; Li, Z.C.; Chen, Y.Q.; Zhang, G.D.; Zhang, Z.X.; Wang, S.L. Research on Particle Swarm Optimization in LSTM Neural Networks for Rainfall-Runoff Simulation. J. Hydrol. 2022, 608, 127553. [Google Scholar] [CrossRef]
  27. Yu, J.Y.; He, Y.; Zhao, Z.F.; Wang, D. Correction Coefficients for Leaf Area Estimation of Crops Using Length-Width Method. Jiangsu Agric. Sci. 2007, 2, 37–39. [Google Scholar] [CrossRef]
  28. Wang, Q.J.; Liu, Y.H.; Su, L.J. Relative Leaf Area Index of Typical Crops Based on Single Parameter Logistic Model. Trans. Chin. Soc. Agric. Mach. 2020, 51, 210–219. [Google Scholar] [CrossRef]
  29. Yin, X.Y.; Kropff, M.J.; McLaren, G.; Visperas, R.M. A Nonlinear Model for Crop Development as a Function of Temperature. Agric. For. Meteorol. 1995, 77, 1–16. [Google Scholar] [CrossRef]
  30. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  31. Liaw, A.; Wiener, M. Classification and Regression by Random Forest. R News 2002, 2, 18–22. [Google Scholar]
  32. Feng, Y.; Jia, Y.; Cui, N.B.; Zhao, L.; Li, C.; Gong, D.Z. Calibration of Hargreaves Model for Reference Evapotranspiration Estimation in Sichuan Basin of Southwest China. Agric. Water Manag. 2017, 181, 1–9. [Google Scholar] [CrossRef]
  33. Biau, G.; Scornet, E. A Random Forest Guided Tour. Test 2016, 25, 197–227. [Google Scholar] [CrossRef]
  34. Probst, P.; Bischl, B.; Boulesteix, A.L. Tunability: Importance of Hyperparameters of Machine Learning Algorithms. arXiv 2018, arXiv:1802.09596. [Google Scholar] [CrossRef]
  35. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: Berlin, Germany, 1995. [Google Scholar] [CrossRef]
  36. Ferreira, L.B.; da Cunha, F.F.; de Oliveira, R.A.; Filho, E.I.F. Estimation of Reference Evapotranspiration in Brazil with Limited Meteorological Data Using ANN and SVM–A New Approach. J. Hydrol. 2019, 572, 556–570. [Google Scholar] [CrossRef]
  37. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar] [CrossRef]
  38. Bentejac, C.; Csorgo, A.; Martinez-Munoz, G. A Comparative Analysis of Gradient Boosting Algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967. [Google Scholar] [CrossRef]
  39. Wolpert, D.H. Stacked Generalization. Neural Netw. 1992, 5, 241–259. [Google Scholar] [CrossRef]
  40. Wu, T.N.; Zhang, W.; Jiao, X.Y.; Guo, W.H.; Hamoud, Y.A. Evaluation of Stacking and Blending Ensemble Learning Methods for Estimating Daily Reference Evapotranspiration. Comput. Electron. Agric. 2021, 184, 106039. [Google Scholar] [CrossRef]
  41. Liu, A.; Zhao, D.B.; Wei, Y.C.; Xiao, L. Ensemble-Learning Estimation of Reference Crop Evapotranspiration Incorporating Spatiotemporal Features. J. Drain. Irrig. Mach. Eng. 2024, 42, 179–186 + 193. [Google Scholar] [CrossRef]
  42. Chen, Z.J.; Zhu, Z.C.; Sun, S.J.; Wang, Q.Y.; Su, T.Y.; Fu, Y.J. Estimation of Daily Evapotranspiration and Crop Coefficient of Maize under Mulched Drip Irrigation by Stacking Ensemble Learning Model. Trans. CSAE 2021, 37, 95–104. [Google Scholar] [CrossRef]
  43. Kennedy, J.; Eberhart, R.C. Particle Swarm Optimization. In Proceedings of the IEEE International Conference on Neural Networks (ICNN), Perth, WA, Australia, 27 November–1 December 1995; IEEE: Piscataway, NJ, USA, 1995; Volume 4, pp. 1942–1948. [Google Scholar] [CrossRef]
  44. Poli, R.; Kennedy, J.; Blackwell, T. Particle Swarm Optimization. Swarm Intell. 2007, 1, 33–57. [Google Scholar] [CrossRef]
  45. Gad, A.G. Correction to: Particle Swarm Optimization Algorithm and Its Applications: A Systematic Review. Arch. Comput. Methods Eng. 2023, 30, 2299. [Google Scholar] [CrossRef]
  46. Zhou, Y.; Zhou, N.R.; Gong, L.H.; Jiang, M.L. Prediction of Photovoltaic Power Output Based on Similar Day Analysis, Genetic Algorithm and Extreme Learning Machine. Energy 2020, 204, 117894. [Google Scholar] [CrossRef]
  47. Zhang, Z.; Gu, J.; Luo, J. Evaluation of Genetic Algorithm on Grasp Planning Optimization for 3D Object: A Comparison with Simulated Annealing Algorithm. In Proceedings of the 2013 IEEE International Symposium on Industrial Electronics (ISIE), Taipei, Taiwan, 28–31 May 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1896–1901. [Google Scholar] [CrossRef]
  48. Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
  49. Yang, L.; Shami, A. On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice. Neurocomputing 2020, 415, 295–316. [Google Scholar] [CrossRef]
  50. Geng, J.; Li, H.; Luan, W.; Shi, Y.; Pang, J.; Zhang, W. Estimation of Daily Actual Evapotranspiration of Tea Plantations Using Ensemble Machine Learning Algorithms and Six Available Scenarios of Meteorological Data. Appl. Sci. 2023, 13, 12961. [Google Scholar] [CrossRef]
  51. Aschale, T.M.; Peres, D.J.; Gullotta, A.; Sciuto, G.; Cancelliere, A. Trend Analysis and Identification of the Meteorological Factors Influencing Reference Evapotranspiration. Water 2023, 15, 470. [Google Scholar] [CrossRef]
  52. Kuhnt, S.; Kalka, A. Global Sensitivity Analysis for the Interpretation of Machine Learning Algorithms. In Artificial Intelligence, Big Data and Data Science in Statistics; Springer: Berlin/Heidelberg, Germany, 2022; pp. 155–169. Available online: https://link.springer.com/chapter/10.1007/978-3-031-07155-3_6 (accessed on 15 November 2022).
  53. Allen, R.G. Crop Evapotranspiration: Guidelines for Computing Crop Water Requirements. FAO Irrig. Drain. Pap. 1998, 56, 1–300. [Google Scholar]
  54. Mossie Aschale, T.; Sciuto, G.; Peres, D.J.; Gullotta, A.; Cancelliere, A. Comparison of Different Methods for Reference Evapotranspiration Estimation in Semi-Arid Climates. Water 2022, 14, 2268. [Google Scholar] [CrossRef]
  55. Lundberg, S.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. arXiv 2017, arXiv:1705.07874. [Google Scholar] [CrossRef]
  56. Da Figueiredo Moura Silva, E.H.; Kothari, K.; Pattey, E.; Battisti, R.; Boote, K.J.; Archontoulis, S.V.; Cuadra, S.V.; Faye, B.; Grant, B.; Hoogenboom, G.; et al. Inter-comparison of soybean models for the simulation of evapotranspiration in a humid continental climate. Agric. For. Meteorol. 2025, 365, 110463. [Google Scholar] [CrossRef]
  57. Wu, L.; Fan, J. Comparison of Neuron-Based, Kernel-Based, Tree-Based and Curve-Based Machine Learning Models for Predicting Daily Reference Evapotranspiration. PLoS ONE 2019, 14, e0217520. [Google Scholar] [CrossRef]
  58. Fan, J.; Wu, L.; Zheng, J.; Zhang, F. Medium-Range Forecasting of Daily Reference Evapotranspiration across China Using Numerical Weather Prediction Outputs Downscaled by Extreme Gradient Boosting. J. Hydrol. 2021, 601, 126664. [Google Scholar] [CrossRef]
  59. Zhang, L.; Zhao, X.; Zhu, G.; He, J.; Chen, J.; Chen, Z.; Traore, S.; Liu, J.; Singh, V.P. Short-Term Daily Reference Evapotranspiration Forecasting Using Temperature-Based Deep Learning Models in Different Climate Zones in China. Agric. Water Manag. 2023, 289, 108498. [Google Scholar] [CrossRef]
  60. Liang, Y.F.; Feng, D.P.; Sun, Z.J.; Zhu, Y.N. Evaluation of Empirical Equations and Machine Learning Models for Daily Reference Evapotranspiration Prediction Using Public Weather Forecasts. Water 2023, 15, 3954. [Google Scholar] [CrossRef]
  61. Inoubli, R.; Constantino-Recillas, D.E.; Monsiváis-Huertero, A.; Farah, L.B.; Farah, I.R. Computational Methods to Retrieve Soil Moisture Using Remote Sensing Data: A Review. In Proceedings of the 2024 IEEE 7th International Conference on Advanced Technologies, Signal and Image Processing (ATSIP), Sousse, Tunisia, 11–13 July 2024; pp. 77–82. [Google Scholar] [CrossRef]
  62. Li, Y.; Liu, X.N.; Zhang, X.G.; Gu, X.B.; Yu, L.Y.; Cai, H.J.; Pen, X.B. Using Solar-Induced Chlorophyll Fluorescence to Predict Winter Wheat Actual Evapotranspiration through Machine Learning and Deep Learning Methods. Agric. Water Manag. 2025, 309, 109322. [Google Scholar] [CrossRef]
  63. Du, H.; Zeng, S.; Liu, X.; Xia, J. An Improved Budyko Framework Incorporating Water-Carbon Relationships for Estimating Evapotranspiration Under Climate and Vegetation Changes. Ecol. Indic. 2024, 161, 112887. [Google Scholar] [CrossRef]
  64. Monteith, J.L. Evaporation and Surface Temperature. Q. J. R. Meteorol. Soc. 1981, 107, 1–27. [Google Scholar] [CrossRef]
  65. Acharki, S.; Raza, A.; Vishwakarma, D.K.; Amharref, M.; Bernoussi, A.S.; Singh, S.K.; Al-Ansari, N.; Dewidar, A.Z.; Al-Othman, A.A.; Mattar, M.A. Comparative assessment of empirical and hybrid machine learning models for estimating daily reference evapotranspiration in sub-humid and semi-arid climates. Sci. Rep. 2025, 15, 2542. [Google Scholar] [CrossRef]
  66. Ding, N.; Tang, X.; Wu, H.J.; Kong, L.; Dao, X.; Wang, Z.F.; Zhu, J. Development of an integrated machine learning model to improve the secondary inorganic aerosol simulation over the Beijing–Tianjin–Hebei region. Atmos. Environ. 2024, 327, 120483. [Google Scholar] [CrossRef]
  67. Dietterich, T.G. Ensemble Methods in Machine Learning. In Proceedings of the International Workshop on Multiple Classifier Systems, Cagliari, Italy, 21–23 June 2000; Springer: Berlin, Germany, 2000; pp. 1–15. [Google Scholar] [CrossRef]
  68. Liyew, C.M.; Di Nardo, E.; Ferraris, S.; Meo, R. Hyperparameter optimization of machine learning models for predicting actual evapotranspiration. Heliyon 2025, 11, e2500441. [Google Scholar] [CrossRef]
  69. Zhang, Y.; Wang, S.; Ji, G. A Comprehensive Survey on Particle Swarm Optimization Algorithm and Its Applications. Math. Probl. Eng. 2015, 2015, 931256. [Google Scholar] [CrossRef]
  70. Yang, L.Z. Quantitative Estimation of the Impact of Climate Change on Actual Evapotranspiration in the Yellow River Basin, China. J. Hydrol. 2010, 500, 233–241. [Google Scholar] [CrossRef]
  71. Vicente-Serrano, S.M.; Beguería, S.; López-Moreno, J.I. A Multiscalar Drought Index Sensitive to Global Warming: The Standardized Precipitation Evapotranspiration Index (SPEI). J. Clim. 2010, 23, 1696–1718. [Google Scholar] [CrossRef]
  72. Perni, S.; Vanderpuye-Orgle, J.; Poirrier, J.E. EE300 Looking Sharp! Applying Cutting-Edge Shapley Additive Explanation (SHAP) Approaches to Cost-Effectiveness Modelling in Contrast to One-Way Sensitivity Analysis. Value Health 2023, 26, S1. [Google Scholar] [CrossRef]
  73. Fan, M.; Zhang, L.; Liu, S.; Yang, T.; Lu, D. Investigation of Hydrometeorological Influences on Reservoir Releases Using Explainable Machine Learning Methods. Front. Water 2023, 5, 1112970. [Google Scholar] [CrossRef]
  74. Leng, Z.; Chen, L.; Yang, B.; Lia, S.; Yi, B. An Extreme Forecast Index-Driven Runoff Prediction Approach Using Stacking Ensemble Learning. J. Hydrol. Eng. 2024, 29, 04024045. [Google Scholar] [CrossRef]
  75. Islam, M.R.; Ferdous, M.S.; Bhowmik, A.K. Comparison of RNN-LSTM, TFDF and Stacking Model Approach for Weather Forecasting in Bangladesh Using Historical Data from 1963 to 2022. Comput. Electron. Agric. 2025, 228, 109678. [Google Scholar] [CrossRef]
  76. Babaeian, E.; Paheding, S.; Siddique, N.; Devabhaktuni, V.; Tuller, M. Estimation of Root Zone Soil Moisture from Ground and Remotely Sensed Soil Information with Multisensor Data Fusion and Automated Machine Learning. Remote Sens. Environ. 2021, 260, 112434. [Google Scholar] [CrossRef]
  77. Liu, Y.; Zhang, S.; Zhang, J.; Tang, L.; Bai, Y. Assessment and Comparison of Six Machine Learning Models in Estimating Evapotranspiration over Croplands Using Remote Sensing and Meteorological Factors. Remote Sens. 2021, 13, 3838. [Google Scholar] [CrossRef]
Figure 1. Schematic Diagram of the Geographic Location of the Experimental Station.
Figure 2. Flowchart for simulating summer soybean evapotranspiration in the North China Plain based on RF, SVM, XGBoost, and Stacking Models.
Figure 3. (a) Validation results of the soybean leaf area index for 2023 based on the 2024 model; (b) changes in soybean leaf area index in 2023 and 2024.
Figure 4. Comparison of ET estimates for 2023 and 2024 based on the FAO 56-PM model.
Figure 5. Simulated daily evapotranspiration of summer soybean in the North China Plain under four machine learning models, three optimization algorithms, and four input parameter combinations.
Figure 6. Performance comparison among different models, optimization algorithms, and input parameter combinations. (ad) Comparison of optimization algorithms. (eh) Comparison of input parameter combinations. (il) Comparison of model types.
Figure 7. SHAP-based feature importance analysis. (a) F3 (precise feature set) model; (b) F4 (full feature set) model.
Figure 8. Global Sensitivity Analysis of the Stacking-PSO Model. (a) Perturbation analysis showing the normalized sensitivity index of each input variable. (b) Sobol variance-based analysis illustrating first-order (S1) and total-order (ST) indices. (c) Morris screening method displaying the mean absolute effect (μ*) and its standard deviation (σ). (d) Comprehensive comparison of variable ranks across the three sensitivity methods.
Figure 9. Simulated evapotranspiration of summer soybean in the North China Plain under different “parameter–algorithm–optimization” combinations.
Figure 10. Interannual simulation performance of summer soybean evapotranspiration in the North China Plain based on “parameter–algorithm–optimization” combinations.
Figure 11. Comparison of simulated and observed interannual daily evapotranspiration of summer soybean in the North China Plain using “parameter–algorithm–optimization” combinations.
Figure 12. Daily variations of simulated versus observed interannual summer soybean evapotranspiration in the North China Plain under “parameter–algorithm–optimization” combinations.
Table 1. The start and end dates of the summer soybean growth period in 2023–2024.
| Growth Stage | 2023 | 2024 |
|---|---|---|
| I | 21 Jun–10 Jul | 20 Jun–10 Jul |
| II | 11 Jul–30 Jul | 11 Jul–30 Jul |
| III | 31 Jul–25 Aug | 31 Jul–25 Aug |
| IV | 26 Aug–15 Sep | 26 Aug–15 Sep |
| V | 16 Sep–1 Oct | 16 Sep–1 Oct |
| VI | 2 Oct–12 Oct | 2 Oct–12 Oct |
| VII | 21 Jun–12 Oct | 21 Jun–12 Oct |
Note: I, II, III, IV, V, VI, VII represent the seedling, flowering, pod-setting, seed-filling, milk, and harvesting stages and the entire growth period, respectively.
Table 2. Correlation coefficients between daily ET of summer soybean and meteorological variables as well as growth indicators across growth stages.
| Influencing Factor | LAI | Tmax | Tmin | Rs | RHmin | RHmax | u2 |
|---|---|---|---|---|---|---|---|
| Correlation Coefficient | 0.40 ** | 0.42 ** | 0.23 ** | 0.50 ** | −0.02 | 0.07 | −0.02 |
Note: ** indicates extremely significant correlation (p < 0.01).
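The correlation coefficients in Table 2 are standard Pearson coefficients between daily ET and each driver. A minimal numpy sketch of the computation, using short illustrative series (not the lysimeter data):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xd, yd = x - x.mean(), y - y.mean()
    return float((xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum()))

# Illustrative daily Rs (MJ/m2) and ET (mm/d) values, chosen for the example:
rs = np.array([18.2, 15.5, 20.1, 9.8, 17.4, 12.6, 16.9, 14.1])
et = np.array([6.1, 5.0, 7.2, 2.9, 6.4, 4.1, 5.8, 4.6])
r = pearson_r(rs, et)
```

In practice a significance test (e.g. `scipy.stats.pearsonr`, which also returns the p-value) would be used to flag the ** entries at p < 0.01.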
Table 3. Four Input Combinations for RF, SVM, XGBoost, and Stacking Models.
| Input Combination | Input Data |
|---|---|
| F1 | Rs, Tmax, LAI |
| F2 | Rs, Tmax, LAI, Tmin |
| F3 | Rs, Tmax, LAI, RHmax, RHmin |
| F4 | Rs, Tmax, LAI, Tmin, RHmax, RHmin |
Table 4. Daily Average Meteorological Parameters, LAI, and ET of summer soybean across different growth stages in 2023–2024.
| Year | Growth Stage | LAI | Rs (MJ/m²) | Tmax (°C) | Tmin (°C) | RHmax (%) | RHmin (%) | u2 (m/s) | ET (mm/d) |
|---|---|---|---|---|---|---|---|---|---|
| 2023 | I | 0.8 | 20.294 | 37.687 | 24.385 | 89.95 | 43.117 | 1.203 | 4.45 |
| 2023 | II | 0.93 | 17.993 | 34.846 | 24.897 | 97.043 | 63.985 | 1.21 | 6.53 |
| 2023 | III | 5.47 | 17.423 | 34.099 | 24.033 | 99.987 | 72.862 | 0.836 | 7.53 |
| 2023 | IV | 7.03 | 15.477 | 30.963 | 19.715 | 100 | 67.082 | 0.922 | 6.07 |
| 2023 | V | 3.97 | 10.591 | 26.587 | 17.073 | 98.838 | 68.417 | 0.869 | 3.15 |
| 2023 | VI | 3.69 | 9.007 | 23.652 | 14.185 | 98.7 | 59.761 | 0.687 | 1.8 |
| 2023 | VII | 3.63 | 15.897 | 32.219 | 21.524 | 97.426 | 63.133 | 0.972 | 5.41 |
| 2024 | I | 0.77 | 17.337 | 34.415 | 23.648 | 89.582 | 54.971 | 1.347 | 3.19 |
| 2024 | II | 0.79 | 16.274 | 34.523 | 25.595 | 99.043 | 74.855 | 1.065 | 4.4 |
| 2024 | III | 4.62 | 18.156 | 35.479 | 24.535 | 99.848 | 74.098 | 0.759 | 6.37 |
| 2024 | IV | 7.01 | 14.494 | 31.435 | 22.044 | 99.013 | 70.832 | 1.086 | 7.28 |
| 2024 | V | 3.81 | 12.62 | 28.813 | 11.394 | 98.278 | 42.987 | 0.833 | 4.79 |
| 2024 | VI | 3.69 | 12.75 | 26.9 | 10.598 | 99.827 | 45.789 | 0.612 | 1.49 |
| 2024 | VII | 3.45 | 15.723 | 32.632 | 20.941 | 97.4 | 63.104 | 0.976 | 4.93 |
Table 5. Processing time of each parameter–algorithm–optimization combination.
| Model Combination | 2023 Training Time | 2024 Testing Time | Notes |
|---|---|---|---|
| Stacking PSO F2 | 4–6 min | 5–8 s | Stacking requires training multiple base models plus PSO iterative optimization, making it the most time-consuming |
| Stacking PSO F3 | 4–6 min | 5–8 s | Same as Stacking + PSO; fewer input features, so the time is similar |
| Stacking PSO F4 | 4.5–6.5 min | 5–8 s | Most features; PSO search space is slightly larger, so training takes a bit longer |
| RF PSO F2 | 2–3 min | 1–3 s | RF trains quickly; PSO hyperparameter tuning increases the time |
| RF RGS F1 | 1–2 min | 1–3 s | RGS uses random search, which is faster than PSO/GA |
| RF GA F2 | 2.5–4 min | 1–3 s | GA iteration is slightly slower than PSO, but RF itself trains quickly |
| XGBoost PSO F1 | 3–4.5 min | 2–4 s | XGBoost trains relatively quickly (GPU-accelerated); PSO hyperparameter tuning increases the time |
| XGBoost PSO F2 | 3–4.5 min | 2–4 s | Same as above; similar number of features |
| XGBoost PSO F4 | 3.5–5 min | 2–4 s | More features, so training takes a bit longer |
| SVM GA F1 | 2–3.5 min | 1–2 s | SVM trains quickly; GA hyperparameter tuning increases the time |
| SVM GA F2 | 2–3.5 min | 1–2 s | Same as above |
| SVM GA F4 | 2.5–4 min | 1–2 s | More features; SVM kernel computation slightly increases the time |
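Table 5's speed advantage of RGS comes from evaluating only a fixed random sample of the hyperparameter grid rather than every combination. A toy sketch of that trade-off, where `val_rmse` is a hypothetical stand-in for the validation error of a fitted model (not the actual training pipeline):

```python
import itertools
import random

# Hypothetical validation-RMSE surface over two RF-style hyperparameters;
# its minimum is at n_estimators=300, max_depth=8 by construction.
def val_rmse(n_estimators, max_depth):
    return (n_estimators - 300) ** 2 / 1e5 + (max_depth - 8) ** 2 / 10

grid = {"n_estimators": range(50, 501, 50), "max_depth": range(2, 13)}

# Exhaustive grid search: evaluates all 10 x 11 = 110 combinations.
full = min(itertools.product(*grid.values()), key=lambda p: val_rmse(*p))

# Randomized grid search: a fixed budget of 20 sampled combinations.
random.seed(42)
sampled = [(random.choice(list(grid["n_estimators"])),
            random.choice(list(grid["max_depth"]))) for _ in range(20)]
best_rgs = min(sampled, key=lambda p: val_rmse(*p))
```

The randomized search fits roughly a fifth as many models, which is why the RF-RGS rows in Table 5 are the fastest, at the cost of possibly missing the exact optimum.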
Table 6. Comparison of ET estimation accuracy between the optimal model in this study and existing machine learning models reported in the literature.
| Methods | R2 | NSE | MAE | RMSE |
|---|---|---|---|---|
| RF + PSO [16] | 0.906 | 0.905 | 0.401 | 0.578 |
| XGBoost + PSO [16] | 0.906 | 0.905 | 0.406 | 0.576 |
| SVM + PSO [16] | 0.907 | 0.906 | 0.416 | 0.573 |
| DNN + PSO [16] | 0.909 | 0.908 | 0.402 | 0.570 |
| SVM [61] | 0.829 | - | 0.508 | 0.718 |
| XGB [57] | 0.85 | - | - | 0.600 |
| LSTM [58] | 0.81 | - | 0.418 | 0.564 |
| RF [59] | 0.81 | - | - | - |
| Stacking + PSO + F3 (this study) | 0.948 | 0.946 | 0.618 | 0.721 |
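The four accuracy metrics compared in Table 6 follow their standard definitions. A minimal numpy sketch, using illustrative observed/simulated daily ET values rather than the study's data:

```python
import numpy as np

def regression_metrics(obs, sim):
    """R2 (squared Pearson correlation), Nash-Sutcliffe efficiency (NSE),
    MAE, and RMSE for paired observed/simulated series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    resid = obs - sim
    mae = np.abs(resid).mean()
    rmse = np.sqrt((resid ** 2).mean())
    nse = 1.0 - (resid ** 2).sum() / ((obs - obs.mean()) ** 2).sum()
    r = np.corrcoef(obs, sim)[0, 1]
    return {"R2": r ** 2, "NSE": nse, "MAE": mae, "RMSE": rmse}

# Illustrative observed vs. simulated daily ET (mm/d):
obs = [4.5, 6.5, 7.5, 6.1, 3.2, 1.8]
sim = [4.2, 6.8, 7.1, 6.4, 3.0, 2.1]
m = regression_metrics(obs, sim)
```

Note that R2 and NSE coincide only when the simulation is unbiased with unit slope against the observations, which is why the two columns in Table 6 differ slightly.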

Han, L.; Gao, F.; Dong, S.; Song, Y.; Liu, H.; Song, N. Simulating Daily Evapotranspiration of Summer Soybean in the North China Plain Using Four Machine Learning Models. Agronomy 2026, 16, 315. https://doi.org/10.3390/agronomy16030315