Evaluation of Eight Decomposition-Hybrid Models for Short-Term Daily Reference Evapotranspiration Prediction

Chen, Yunfei; Liu, Zuyu; Long, Ting; Liu, Xiuhua; Gao, Yaowei; Wang, Sibo

doi:10.3390/atmos16050535


by Yunfei Chen ¹,², Zuyu Liu ¹,²,³, Ting Long ¹,²,³, Xiuhua Liu ¹,²,³,*, Yaowei Gao ¹,²,³ and Sibo Wang ¹,²,³

¹ School of Water and Environment, Chang’an University, Xi’an 710054, China

² Key Laboratory of Subsurface Hydrology and Ecological Effect in Arid Region of Ministry of Education, Chang’an University, Xi’an 710054, China

³ Key Laboratory of Eco-hydrology and Water Security in Arid and Semi-Arid Regions of Ministry of Water Resources, Chang’an University, Xi’an 710054, China

* Author to whom correspondence should be addressed.

Atmosphere 2025, 16(5), 535; https://doi.org/10.3390/atmos16050535

Submission received: 18 March 2025 / Revised: 27 April 2025 / Accepted: 28 April 2025 / Published: 30 April 2025

(This article belongs to the Special Issue Challenges in Weather and Climate Modelling: Model Development, Validation, and Perspectives)


Abstract

Accurate reference evapotranspiration (ET_o) prediction is important for water resource management, particularly in arid regions where water availability is highly variable. However, the nonlinear and non-stationary characteristics of ET_o time series pose challenges for conventional prediction models. Given this, in this study we evaluate eight decomposition-hybrid models that integrate various decomposition techniques with a long short-term memory (LSTM) network to enhance short-term (5-day, 7-day, and 10-day) ET_o forecasting. Using a 40-year dataset from a meteorological station, we employ the Penman-Monteith equation to calculate ET_o and systematically compare model performance. Results show that VMD-LSTM and EWT-LSTM achieve the highest accuracy in the testing set (R² = 0.983 and 0.992, respectively) but exhibit reduced robustness in the prediction phase due to excessive high-frequency components. In contrast, EMD-LSTM and ESMD-LSTM demonstrate superior predictive stability, with no significant differences from actual values (p > 0.05). These findings underscore the importance of selecting appropriate decomposition methods to balance high-frequency information and predictive accuracy, offering insights for improving ET_o forecasting in arid regions.

Keywords:

reference crop evapotranspiration; hybrid forecasting model; decomposition algorithm; deep learning; arid regions

1. Introduction

Global warming, rapid population growth, and urbanization have intensified the competition for agricultural water resources [1,2], posing significant challenges to the precise management of these resources. Reference evapotranspiration (ET_o) plays a critical role in agricultural water resources management and irrigation scheduling. Its estimation and accurate prediction are essential components of agricultural irrigation planning, crop modeling, and ecologically sustainable water use in arid regions, and can assist policymakers in making informed decisions regarding agricultural development planning and water resource management [3].

FAO-56 defines ET_o as the amount of water that crops must consume to achieve maximum productivity at different growth stages under specific environmental conditions and current agricultural practices [4]. It is not only a key link connecting soil water, crop water, and atmospheric water in the Soil-Plant-Atmosphere Continuum (SPAC) system but also a vital component of the hydrological cycle in arid and semi-arid regions [5]. Because the ET_o time series is influenced by the interaction of multiple variables (soil, vegetation, atmosphere, etc.) during multi-interface transmission and conversion from soil to atmosphere, it exhibits strong nonlinear and non-stationary characteristics [6,7,8]. These characteristics are regulated jointly by meteorological parameters (radiation, air humidity, wind speed, air temperature, and saturation vapor pressure deficit) and vegetation parameters (stomatal conductance, leaf area index (LAI), vegetation coverage, and growth conditions) [9,10,11]. Specifically, solar radiation and air temperature provide the energy for the ET_o process [12] and further control its rate by affecting vegetation stomatal conductance [13]. Wind speed and saturation vapor pressure deficit enhance water vapor diffusion in the atmosphere [14], and the LAI determines the crop’s absorption of radiation and its transpiration water demand [15]. These factors substantially increase the difficulty of accurately predicting ET_o. Therefore, accurately forecasting the dynamics of hydrological system variables (such as rainfall, evaporation, and percolation) over a future period through advanced technical means has long been a research hotspot in agriculture, forestry, and hydrology [16,17].

In the past few decades, data-driven techniques such as artificial neural networks (ANN), convolutional neural networks (CNN), and recurrent neural networks (RNN) have been shown, owing to their powerful nonlinear mapping and learning capabilities, to effectively replace empirical models (such as the Hargreaves-Samani and Blaney-Criddle equations), and have become some of the most popular ET_o prediction algorithms in recent years [18,19,20]. Among deep learning (DL) algorithms, the Long Short-Term Memory (LSTM) network, a special form of RNN, is particularly adept at handling time series because its gate structure enables it to capture long-term dependencies [21,22]. Therefore, LSTM remains an effective choice of ET_o prediction model among current techniques [23]. Models that use LSTM for ET_o prediction fall into two main categories: multivariate and univariate forecasting methods [24]. The multivariate forecasting method takes several variables as input, that is, it accounts for the complex relationships between variables and extracts features from them to obtain accurate ET_o predictions in the medium to long term (time scales of 10 days and above) [25]. By contrast, the univariate forecasting method takes a single time series as input and is better suited to short-term prediction within 10 days [26]. Although the multivariate forecasting method can provide a more comprehensive understanding of the ET_o process, meteorological data availability is severely limited in most drylands, making it difficult to meet the input requirements for calculating ET_o accurately with the FAO-56 formula. In this context, the univariate forecasting method offers a clearer, simpler, and more accurate prediction approach.

However, standalone deep learning models often ignore the periodic and trend components of a time series when predicting [27,28]. In particular, the nonlinearity, uncertainty, and randomness of ET_o make it difficult for these models to capture embedded features fully [29,30], which severely degrades prediction accuracy. To overcome these obstacles, researchers have begun to explore advanced decomposition-DL hybrid prediction models that provide more accurate ET_o predictions [18,31,32]. The advantage of a decomposition algorithm is that it can linearize and stabilize the nonlinear and non-stationary ET_o signal step by step and decompose it into several subsequences with different frequencies that serve as model inputs, making deep learning models easier to calibrate and thereby improving performance [33,34]. Popular decomposition methods include Empirical Mode Decomposition (EMD) and its variants [35,36,37], Variational Mode Decomposition (VMD) [38], Extreme Point Symmetric Mode Decomposition (ESMD) [39], and Empirical Wavelet Transform (EWT) [40]. These methods have been applied to soil moisture and runoff prediction, but their application to ET_o prediction is still rare.

Despite their high predictive skill, a key issue with decomposition-DL hybrid models is that decomposition algorithms come in multiple forms and variants, which may confuse users. Furthermore, the practicality of these models in arid regions still needs to be verified. It is therefore urgent to systematically assess and explain the differences in accuracy among hybrid decomposition models in short-term ET_o prediction and the underlying reasons for these differences. This study adopts the coupling idea of “decomposition-prediction-reconstruction,” combining eight widely used decomposition preprocessing techniques with a deep learning method (LSTM) to construct eight decomposition-hybrid models (EMD-LSTM, EEMD-LSTM, CEEMDAN-LSTM, VMD-LSTM, LMD-LSTM, ESMD-LSTM, DWT-LSTM, and EWT-LSTM) as short-term ET_o prediction models at the daily scale. By combining the nonlinear approximation ability of the LSTM model with the ability of the decomposition algorithms to handle the nonlinear ET_o signal, accurate predictions of crop water demand during the prediction period were obtained. It should be emphasized that, before the hybrid models are constructed, an artificial sequence of known components must be used to verify the accuracy of the eight decomposition algorithms. The purposes of this study are to (1) predict daily ET_o values over different prediction periods (5, 7, and 10 days) using the eight hybrid models, (2) evaluate the performance differences among the eight hybrid models in predicting daily ET_o and compare them with the standalone LSTM model, and (3) identify the most suitable hybrid prediction model for ET_o in arid regions and reveal the reasons for the differences in prediction accuracy. The results could help improve the accuracy of hybrid forecasting models based on multiple variables and clarify the practicality of hybrid models in dryland hydrological and agricultural systems, making short-term ET_o prediction more convenient and transparent.

2. Materials and Methods

2.1. Study Area

The study site, Yulin meteorological station (109°41′ E, 38°21′ N) in Yulin City, Shaanxi Province (Figure 1), is located in the area between the Mu Us Sandy Land and the Loess Plateau, a transitional zone between desert and steppe. Vegetation coverage is low, and the ecological environment is highly sensitive and vulnerable [41]. The climate is semi-arid continental monsoon, with an average annual temperature of 6.4 °C and extreme temperatures ranging from −32.7 to 38.6 °C. According to the station's meteorological records since 1980, annual precipitation over the past 40 years has ranged from 250 mm to 730 mm, with an average of 420 mm. More than 65% of rainfall occurs from July to September as short-duration rainstorms, with pronounced seasonal and interannual variability [42]. In addition, the diurnal temperature range in the region is relatively large, with sunshine duration exceeding 2700 h and total radiation of 608.37 kJ/cm². Evaporation is intense, with a potential annual evaporation of around 2300 mm, 70% to 80% of which is concentrated from April to September. The vegetation and crops in the region rely mainly on rainfall and irrigation water for growth [43].

2.2. Data Collection and PM-Equation

In this study, it is necessary to verify that the eight decomposition algorithms do not lose any information when extracting relevant features (that is, that the decomposition algorithms are accurate), which is an important prerequisite for comparing the accuracy of the different decomposition-LSTM hybrid models. Therefore, an artificial sequence (AS) with known components was generated by superimposing a set of sine functions and a segment of low-frequency noise (Figure 2a). The completeness of each decomposition algorithm was then used to verify its accuracy in processing the AS data. Here, completeness refers to the property that the components obtained by a decomposition, for example the Intrinsic Mode Functions (IMFs) and residual component obtained by EMD, can be summed to recover the initial sequence. This “decomposition-reconstruction-reduction” property is called the completeness of the decomposition [34,44]. Only after confirming that a decomposition algorithm meets the accuracy requirement is a decomposition-hybrid model constructed to predict the calculated ET_o values.
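The completeness check described above can be sketched in a few lines. The following Python example is only an illustration under stated assumptions: the sine periods, amplitudes, offset, and noise smoothing are made-up values rather than the components of the AS in Figure 2a, and the moving-average decomposition is a simple additive stand-in, not any of the eight algorithms evaluated in this study (which were run in MATLAB).

```python
import numpy as np


def make_artificial_sequence(n=1000, seed=0):
    """Build an artificial sequence from known components (illustrative values)."""
    t = np.arange(n)
    rng = np.random.default_rng(seed)
    sines = [
        3.0 * np.sin(2 * np.pi * t / 365.0),  # slow, annual-like cycle
        1.5 * np.sin(2 * np.pi * t / 30.0),   # intermediate cycle
        0.5 * np.sin(2 * np.pi * t / 7.0),    # fast cycle
    ]
    # Low-frequency noise: white noise smoothed by a 25-point moving average.
    noise = np.convolve(rng.normal(size=n), np.ones(25) / 25, mode="same")
    return 12.0 + sum(sines) + noise


def moving_average_decompose(x, windows=(5, 25, 125)):
    """Stand-in additive decomposition: peel off detail layers, keep a residual."""
    components, current = [], np.asarray(x, dtype=float)
    for w in windows:
        smooth = np.convolve(current, np.ones(w) / w, mode="same")
        components.append(current - smooth)  # detail removed at this scale
        current = smooth
    components.append(current)  # final smooth part plays the role of the residual
    return components


def reconstruction_error(x, components):
    """Largest absolute gap between the original and the summed components."""
    return float(np.max(np.abs(np.asarray(x) - np.sum(components, axis=0))))


series = make_artificial_sequence()
parts = moving_average_decompose(series)
print(len(parts), reconstruction_error(series, parts))
```

A decomposition is complete in the sense used here when the reconstruction error is at the level of floating-point rounding. The same `reconstruction_error` helper (a hypothetical name) could be applied to the components returned by any additive decomposition.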

The daily meteorological data from 1980 to 2019 for Yulin meteorological station, including daily maximum temperature, daily minimum temperature, sunshine hours, relative humidity, and wind speed at 10 m height, were sourced from the China Meteorological Administration (http://data.cma.cn/, accessed on 17 March 2025) and used to establish the ET_o sequence dataset (Equation (1)). Although the website does not specify the measuring instruments, these data are measured according to Chinese national standards [45], and the measured data are internationally recognized [46,47]. All data have been rigorously reviewed and are of good quality. The checking and correction process was as follows [9]: values that failed any of the following checks were deleted and replaced with monthly average values. (1) Climatological boundary value check: whether the value is physically possible. (2) Climate extreme value check: whether the value of an element exceeds the maximum or minimum recorded for that month in the station's history. (3) Internal consistency check: whether the physical relationships between different elements at the station hold. (4) Time consistency check: whether the rate of change of an element stays within a certain range over a certain interval.

The calculated daily ET_o dataset (14,600 points in total) from 1 January 1980 to 26 December 2019 is divided into three segments:

Training set (80%): used to fit the model and estimate its parameters.
Testing set (20%): used to evaluate the model’s predictive performance.
Out-of-sample validation set: the final 10 days of the dataset (referred to as the prediction set in the figures) were withheld and used exclusively to simulate a real-world forecasting scenario. This held-out segment enabled assessment of the practical prediction capability of the decomposition-hybrid models by comparing predicted ET_o values with the actual values.

This dataset partitioning strategy preserves the temporal structure of the time series and is consistent with best practices in time series forecasting evaluation [48,49,50], ensuring both robust model training and meaningful application-oriented validation [51].
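The partitioning can be illustrated with a short Python sketch. It assumes that the final 10 days are withheld first and that the 80%/20% split is then applied chronologically to the remaining points; the text does not state the exact order of these two steps, so this ordering and the function name `chronological_split` are assumptions made for illustration.

```python
def chronological_split(series, train_fraction=0.8, horizon=10):
    """Split a time-ordered series into train, test, and held-out prediction parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    if horizon <= 0 or horizon >= len(series):
        raise ValueError("horizon must be positive and shorter than the series")
    # Withhold the last `horizon` points, then cut the rest without shuffling.
    modelling, prediction = series[:-horizon], series[-horizon:]
    cut = int(len(modelling) * train_fraction)
    return modelling[:cut], modelling[cut:], prediction


values = list(range(14600))  # placeholder for the daily ET_o values
train, test, prediction = chronological_split(values)
print(len(train), len(test), len(prediction))
```

Because no shuffling takes place, every test point lies later in time than every training point, and the prediction segment lies later than both.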

At present, the traditional physics-based model (FAO-56 Penman-Monteith equation) is still considered the most effective method for estimating ET_o because it takes into account both aerodynamics and thermodynamics [4], and the formula is as follows:

ET_{o} = \frac{0.408 Δ (R_{n} - G) + γ \frac{900}{T + 273} u_{2} (e_{s} - e_{a})}{Δ + γ (1 + 0.34 u_{2})}

(1)

where ET_o (mm·d⁻¹) is reference evapotranspiration, R_n (MJ·m⁻²·d⁻¹) is net radiation, G (MJ·m⁻²·d⁻¹) is the soil heat flux density, which is assumed to be 0 at the daily time step, T (°C) is the air temperature, u₂ (m·s⁻¹) is the wind speed at 2 m height, e_s and e_a (kPa) are the saturation and actual vapor pressure, respectively, Δ (kPa·°C⁻¹) is the slope of the vapor pressure curve at air temperature, and γ (kPa·°C⁻¹) is the psychrometric constant.
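As a minimal sketch, Equation (1) can be evaluated directly once the intermediate quantities are available. The Python function below assumes that Δ, R_n, γ, e_s, and e_a have already been derived from the station records; the derivation of these quantities (and the conversion of wind speed from 10 m to 2 m height) is not shown, and the input values are illustrative numbers, not data from the Yulin station.

```python
def penman_monteith_eto(delta, r_n, gamma, temp_c, u2, e_s, e_a, g=0.0):
    """Daily ET_o (mm/day) from Equation (1); g defaults to 0 at the daily step."""
    radiation_term = 0.408 * delta * (r_n - g)
    aerodynamic_term = gamma * 900.0 / (temp_c + 273.0) * u2 * (e_s - e_a)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (not taken from the Yulin record).
eto = penman_monteith_eto(delta=0.122, r_n=13.28, gamma=0.0666,
                          temp_c=16.9, u2=2.078, e_s=1.997, e_a=1.409, g=0.14)
print(round(eto, 2))
```

Splitting the numerator into a radiation term and an aerodynamic term mirrors the two physical contributions that the equation combines.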

2.3. Decomposition Algorithms and LSTM

2.3.1. EMD, EEMD, and CEEMDAN

The EMD algorithm is essentially a process of repeatedly “filtering” (sifting) the data, which decomposes complex original data into a finite set of Intrinsic Mode Functions (IMFs) and a residual component (Figure 3). During this process, each IMF must meet two conditions: (1) over the entire data interval, the number of extrema and the number of zero crossings must be equal or differ by at most 1, and (2) at any data point, the mean of the envelope defined by the local maxima and the envelope defined by the local minima is zero [44]. The decomposition is expressed as follows:

s (t) = \sum_{i = 1}^{n} IMF_{i} (t) + r_{n} (t)

(2)

where s(t) is the original data, IMF_i(t) is the i-th IMF component, and r_n(t) is the residual component.
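The first IMF condition, which compares the number of extrema with the number of zero crossings, is easy to test numerically. The Python sketch below is an illustrative check only, with hypothetical function names; it does not implement the sifting procedure or the envelope-mean condition.

```python
import numpy as np


def count_extrema(x):
    """Count local maxima and minima via sign changes of the first difference."""
    d = np.sign(np.diff(np.asarray(x, dtype=float)))
    d = d[d != 0]  # ignore flat steps
    return int(np.sum(d[:-1] != d[1:]))


def count_zero_crossings(x):
    """Count sign changes of the series itself."""
    s = np.sign(np.asarray(x, dtype=float))
    s = s[s != 0]  # ignore exact zeros
    return int(np.sum(s[:-1] != s[1:]))


def satisfies_imf_count_condition(x):
    """Condition (1): numbers of extrema and zero crossings differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


t = np.linspace(0.0, 1.0, 1000, endpoint=False)
pure_tone = np.sin(2 * np.pi * 5 * t)   # oscillates around zero
offset_tone = pure_tone + 2.0           # same oscillation, never crosses zero
print(satisfies_imf_count_condition(pure_tone),
      satisfies_imf_count_condition(offset_tone))
```

The offset tone has the same extrema as the pure tone but no zero crossings, so it fails the condition. This illustrates why a trend or mean level has to be separated from the oscillatory components during decomposition.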

In order to solve the mode mixing phenomenon caused by the uneven distribution of extreme points, Wu and Huang [37] proposed the Ensemble Empirical Mode Decomposition (EEMD) method. As an improved algorithm of EMD, it is essentially a multiple empirical mode decomposition with Gaussian white noise superimposed on it, using the statistical characteristics of the uniform distribution of the white noise frequency, and making the distribution of signal extreme points more uniform to reduce the “overshoot” and “undershoot” phenomena when the cubic spline is used for envelope fitting [52]. At the same time, the zero-mean property of white noise is used to make the noise cancel each other after multiple averages, thereby suppressing its impact. However, due to the added white noise not being completely neutralized, residual white noise will always be in the IMF, affecting the subsequent signal analysis and processing [53].
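The noise-cancellation argument can be illustrated numerically. The Python sketch below covers only the averaging step, not EEMD itself: it averages independent zero-mean white-noise realizations and reports the standard deviation of what remains. The ensemble sizes and noise level are illustrative assumptions.

```python
import numpy as np


def residual_noise_std(ensemble_size, n_samples=2000, noise_std=1.0, seed=0):
    """Standard deviation of white noise left after averaging an ensemble of trials."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_std, size=(ensemble_size, n_samples))
    return float(np.std(noise.mean(axis=0)))


for size in (1, 10, 100):
    print(size, round(residual_noise_std(size), 3))
```

The residual shrinks roughly in proportion to one over the square root of the ensemble size, so it becomes small for large ensembles but never reaches exactly zero for a finite number of trials.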

Therefore, Torres et al. [36] proposed Complete EEMD with Adaptive Noise (CEEMDAN), which improves on the EEMD algorithm. The advantage of CEEMDAN is that it removes the residual white noise from the IMFs. Compared with EEMD, (1) CEEMDAN does not directly add Gaussian white noise to the original signal s(t), which avoids the problem that different noisy realizations of the signal yield different numbers of modes; and (2) after obtaining the first-order IMF component of each realization, CEEMDAN averages over the ensemble to obtain the final first-order IMF component. The same operations are then repeated on the residual, which effectively prevents white noise from being transferred from high-frequency to low-frequency bands [54].

2.3.2. VMD

Variational Mode Decomposition (VMD) is an EMD-like time-frequency analysis method [38], but it differs fundamentally from EMD in principle. It transfers the decomposition process to a variational framework and achieves adaptive signal decomposition by constructing a variational problem and searching for its optimal solution. It is a completely non-recursive model [55].

Therefore, different from the concept of IMF definition and the constraint conditions set by EMD, the VMD algorithm redefines the intrinsic mode function with a more stringent constraint of finite bandwidth. Each intrinsic mode component is characterized by amplitude and frequency modulation. The expression is as follows:

s_{k} (t) = A_{k} (t) \cos (ϕ_{k} (t))

(3)

where A_k(t) is the envelope amplitude of the signal s_k(t) and ϕ_k(t) is the instantaneous phase.

In addition to satisfying the EMD constraint, the VMD also adds two new constraints: (1) the sum of the bandwidths of the center frequencies of each modal component is minimized; (2) The sum of all modal components is equal to the original signal. Therefore, the VMD not only overcomes the endpoint effect and modal aliasing problem of EMD but also has better robustness and a more solid mathematical theoretical basis, which is very suitable for nonlinear and non-stationary time series analysis [56,57].

2.3.3. LMD

The Local Mean Decomposition (LMD) was proposed by Smith [58], which gradually decomposes a complex multi-component signal into a sum of several product functions (PF) and a residual component by multiple cyclic iterations. Each product function is a product of an envelope function and a pure frequency modulation function, and the product function component is a single-component modulation signal [59].

The basic idea of the LMD decomposition algorithm is to remove the local mean function from the original signal and demodulate it using the envelope estimation function until the standard pure frequency modulation function is obtained [60]. The envelope function is obtained by multiplying all the envelope estimation functions generated during the iterative process. This envelope function is then multiplied by the final pure frequency modulation function to derive the first-order PF component [61]. After separating the first-order PF component from the original signal, the above steps are repeated to decompose each order PF component and residual component R in sequence [62,63].

s (t) = \sum_{p = 1}^{k} PF_{p} (t) + u_{k} (t)

(4)

where PF_p(t) is the p-th product function and u_k(t) is the residual component.

2.3.4. ESMD

Extreme Point Symmetric Mode Decomposition (ESMD) is a development of the EMD algorithm proposed by Wang and Li [39]. In essence, it replaces the cubic spline interpolation of the outer envelopes in the EMD algorithm with an internal extreme-point symmetric interpolation. Accordingly, the IMF component of ESMD is redefined as follows: (1) local maximum and minimum points are distinguished, and adjacent equal extreme points are also counted as extreme points during signal decomposition, where a maximum must be positive and a minimum must be negative; (2) in a broad sense, the IMF component should exhibit envelope symmetry or extreme-point symmetry. The advantage of ESMD is that the residual component reflects the overall trend of the data, serving as an “adaptive global mean”, and the least squares method is used to optimize this “adaptive global mean” and thereby determine the optimal number of sifting iterations [64]. ESMD not only retains the advantages of EMD but also effectively solves the problem of modal aliasing and allows for time-frequency analysis. However, its application is currently limited, mainly to fields such as climate and ocean studies [65].

2.3.5. DWT and EWT

The basic idea of the Discrete Wavelet Transform (DWT) is to decompose a signal into multiple wavelet sub-bands, each representing a wavelet component at a different frequency. These sub-bands can be processed separately through operations such as filtering and down-sampling, and then reconstructed to recover the original signal [66,67].

Generally, the wavelet coefficients produced by DWT are based on a dyadic procedure. Two sets of coefficients are produced during signal decomposition by DWT: approximations (representing the high-scale, low-frequency components of the data) and details (representing the low-scale, high-frequency components of the data). The mathematical formulation can be given as:

ψ_{m, n} (\frac{t - b}{a}) = a_{0}^{- m / 2} ψ (\frac{t - n b_{0} a_{0}^{m}}{a_{0}^{m}})

(5)

where ψ represents the mother wavelet, and m and n are integers that control wavelet dilation (scaling) and translation, respectively. In addition, b₀ is the location parameter (> 0) and a₀ is the dilation step (> 1). Usually, a₀ = 2 and b₀ = 1 are preferred in practical applications. This power-of-two logarithmic scaling of the translations and dilations is called the dyadic grid arrangement. Finally, the wavelet coefficients are computed as:

W_{(m, n)} = 2^{- m / 2} \sum_{i = 0}^{N - 1} ψ (2^{- m} i - n) x_{i}

(6)

where W_(m,n) is the wavelet coefficient of the DWT at scale a = 2^m and location b = 2^m n, and x_i is a finite time series (i = 0, 1, 2, …, N − 1, with N = 2^M). In this way, DWT can perform multi-scale modelling.
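To make the dyadic approximation/detail split concrete, the following Python sketch performs one decomposition level and its inverse. It assumes the Haar wavelet purely for simplicity; the mother wavelet used in this study is not stated in this section, and the function names are hypothetical.

```python
import numpy as np


def haar_dwt_level(x):
    """One dyadic Haar step: return (approximation, detail) at half the length."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("length must be even for a dyadic split")
    even, odd = x[0::2], x[1::2]
    approximation = (even + odd) / np.sqrt(2.0)  # low-frequency part
    detail = (even - odd) / np.sqrt(2.0)         # high-frequency part
    return approximation, detail


def haar_idwt_level(approximation, detail):
    """Invert haar_dwt_level and interleave the samples back into one series."""
    x = np.empty(2 * approximation.size)
    x[0::2] = (approximation + detail) / np.sqrt(2.0)
    x[1::2] = (approximation - detail) / np.sqrt(2.0)
    return x


signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a1, d1 = haar_dwt_level(signal)
print(np.allclose(haar_idwt_level(a1, d1), signal))
```

Applying `haar_dwt_level` again to the approximation yields the next, coarser level, which is how the multi-scale structure of the dyadic grid arises.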

By contrast, the Empirical Wavelet Transform (EWT) is a non-stationary signal processing method proposed by Gilles [40]. It combines the adaptive decomposition concept of the EMD method with the compact support framework of wavelet transform theory, providing a new adaptive time-frequency analysis approach for signal processing. Based on wavelet analysis and adaptive filtering, the EWT algorithm decomposes the signal into wavelet components of multiple local frequencies to achieve efficient processing and analysis of the signal [68]. The basic idea of the EWT method is to decompose the signal into multiple local bandpass (local frequency band) wavelet components and achieve local time-frequency analysis by adaptively selecting the band boundaries and frequency intervals. Each sub-signal is then denoised and smoothed using the Hilbert transform. Finally, the resulting time-frequency diagrams are superimposed to reconstruct the time-frequency diagram of the original signal. Compared with the EMD method, EWT can select the frequency bands adaptively, overcoming the modal aliasing caused by the discontinuous time-frequency scale of the signal. It also has a complete and reliable mathematical foundation and low computational complexity, and it can overcome the over-envelope and under-envelope problems of the EMD method. Therefore, it is becoming increasingly popular in signal processing, especially in fault diagnosis.

The main steps of the EWT method include [69]:

Step 1. Initialization: determine the proportion of signal decomposition and the scaling parameters;

Step 2. Decompose the signal into predefined local bandpass signals, achieving the decomposition by solving for the adaptive boundaries of the bandpass signals;

Step 3. Perform the Hilbert transform on each local bandpass signal to obtain a time-frequency map;

Step 4. Sum or average the local features on the time-frequency maps to obtain the time-frequency map of the original signal, thereby achieving signal decomposition.

2.3.6. LSTM

Long short-term memory (LSTM) is a special RNN architecture composed of LSTM units, proposed by Hochreiter and Schmidhuber [70]. It has a strong memory capacity and is suitable for processing time series data with long-term dependence. Compared with the traditional RNN, the LSTM network overcomes the vanishing gradient problem and often exhibits higher prediction performance [71,72].

Traditional neural networks are composed of neurons, whereas LSTM networks are composed of memory blocks connected across successive layers. Each block has gates that manage its state and output. The structure of the basic memory unit is shown in Figure 4. The memory unit has three gates, the input gate (i_t), the forget gate (f_t), and the output gate (o_t), which are used to forget and remember key information. Both i_t and f_t act on the internal state of the unit: f_t controls how much of the internal state from the previous time step is retained, and i_t controls how much input information is absorbed at the current time step. A gate value of 0 blocks the corresponding information entirely (the previous state is completely forgotten, or no new input is absorbed), whereas a gate value of 1 passes it entirely (the previous state is fully retained, or the new input is fully absorbed). By contrast, the output gate determines the output content from the current input and the memory of the block (Figure 4).

The detailed data processing is as follows. The LSTM process begins with f_t, which evaluates the previous cell output h_{t−1} and the current input x_t; based on Equation (7), it determines what information from the previous cell state C_{t−1} should be retained or discarded. The input gate i_t and the candidate state u_t then control which information is updated, as in Equations (8) and (9). Equation (10) determines the extent to which new information is added to the current cell state. Through o_t, the σ layer decides which part of the current cell state should be output, as in Equation (11). Finally, the output of the current cell is determined by combining C_t with o_t according to Equation (12). The overall LSTM operation is as follows:

f_{t} = σ (W_{f x} x_{t} + W_{f h} h_{t - 1} + b_{f})

(7)

i_{t} = σ (W_{i x} x_{t} + W_{i h} h_{t - 1} + b_{i})

(8)

u_{t} = \tanh (W_{u x} x_{t} + W_{u h} h_{t - 1} + b_{u})

(9)

C_{t} = f_{t} C_{t - 1} + i_{t} u_{t}

(10)

o_{t} = σ (W_{o x} x_{t} + W_{o h} h_{t - 1} + b_{o})

(11)

h_{t} = o_{t} \tanh (C_{t})

(12)

where x_t is the input vector, h_t is the output of the memory cell, C_t is the cell state, u_t is the candidate state, σ and tanh are activation functions, and W and b represent the weights and biases of the neural network, respectively.
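The gate equations numbered (7) to (12) above translate almost line by line into code. The Python sketch below implements a single forward step of one memory cell with NumPy. The parameter names, the hidden size, and the random initialization are assumptions made for illustration; this is not the network configuration used in the study, which was implemented in MATLAB.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of an LSTM memory cell following the gate equations."""
    f_t = sigmoid(params["W_fx"] @ x_t + params["W_fh"] @ h_prev + params["b_f"])
    i_t = sigmoid(params["W_ix"] @ x_t + params["W_ih"] @ h_prev + params["b_i"])
    u_t = np.tanh(params["W_ux"] @ x_t + params["W_uh"] @ h_prev + params["b_u"])
    c_t = f_t * c_prev + i_t * u_t          # element-wise state update
    o_t = sigmoid(params["W_ox"] @ x_t + params["W_oh"] @ h_prev + params["b_o"])
    h_t = o_t * np.tanh(c_t)                # gated output
    return h_t, c_t


def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases (hypothetical initialisation)."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in "fiuo":
        params[f"W_{gate}x"] = 0.1 * rng.normal(size=(n_hidden, n_in))
        params[f"W_{gate}h"] = 0.1 * rng.normal(size=(n_hidden, n_hidden))
        params[f"b_{gate}"] = np.zeros(n_hidden)
    return params


params = init_params(n_in=1, n_hidden=4)
h, c = np.zeros(4), np.zeros(4)
for value in [0.2, 0.5, 0.9]:               # a short univariate input sequence
    h, c = lstm_step(np.array([value]), h, c, params)
print(h.shape, c.shape)
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved at each step. This is a convenient sanity check on the state-update line.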

2.4. Parameter Setting for Hybrid Forecasting Algorithms

In recent years, hybrid models based on decomposition techniques and machine learning have been widely used in hydrological forecasting and have gained wide recognition in the academic community [75,76]. At the same time, using the “decomposition-prediction-reconstruction” coupling approach to build hybrid prediction models that improve the accuracy of hydrological prediction has become a research hotspot (Figure 5).

Establishing a hybrid model involves three main parts: (1) Decomposition: the data are preprocessed with EMD or one of its extended algorithms and decomposed into multiple relatively stable time series (IMFs), which not only reduces noise but also enables the model to better capture the characteristics of hydrological series; (2) Prediction: the IMFs are predicted with algorithms suited to stationary sequences, such as multiple linear regression (MLR), autoregressive (AR), and autoregressive moving average (ARMA) models, or with machine learning algorithms suited to nonlinear problems, such as artificial neural networks (ANN), support vector machines (SVM), and long short-term memory networks (LSTM); (3) Reconstruction: finally, the completeness of EMD is used to sum the component predictions, which can yield better prediction performance than modelling the original sequence directly.
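The prediction and reconstruction parts can be sketched as follows. To keep the example short and self-contained, a least-squares autoregressive model stands in for the LSTM predictor, and two synthetic series stand in for decomposed components; both are assumptions for illustration and do not reproduce the models or data of this study.

```python
import numpy as np


def fit_ar(series, order):
    """Least-squares autoregressive fit; stand-in for the per-component predictor."""
    series = np.asarray(series, dtype=float)
    rows = [series[i:i + order] for i in range(len(series) - order)]
    design = np.column_stack([np.array(rows), np.ones(len(rows))])
    coeffs, *_ = np.linalg.lstsq(design, series[order:], rcond=None)
    return coeffs


def forecast_ar(series, coeffs, horizon):
    """Recursive multi-step forecast: each prediction is fed back as an input."""
    order = len(coeffs) - 1
    window = list(np.asarray(series, dtype=float)[-order:])
    out = []
    for _ in range(horizon):
        nxt = float(np.dot(coeffs[:-1], window) + coeffs[-1])
        out.append(nxt)
        window = window[1:] + [nxt]
    return np.array(out)


def hybrid_forecast(components, horizon, order=3):
    """Predict every component separately, then sum the predictions."""
    forecasts = [forecast_ar(c, fit_ar(c, order), horizon) for c in components]
    return np.sum(forecasts, axis=0)


t = np.arange(400)
components = [np.sin(2 * np.pi * t / 20.0), 0.01 * t]   # toy "IMF" and trend
print(hybrid_forecast(components, horizon=5))
```

Each component here is regular enough for a low-order linear model to extrapolate it, which illustrates why decomposing a signal into simpler subsequences can make the prediction step easier.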

Figure 6 shows the schematic flowchart of the modelling strategy for this study.

In this study, all experiments were implemented in MATLAB R2022b on a 64-bit Windows 11 system with a 2.3 GHz Intel Core i7-12700H processor and 32 GB of RAM. All prediction methods were executed independently 10 times, and the different prediction models were assigned the same parameters. The detailed parameter settings of some of the prediction models involved in this study are listed in Table S1.

2.5. Statistical Analysis

The mean absolute error (MAE) in Equation (13), the mean square error (MSE) in Equation (14), the root mean square error (RMSE) in Equation (15), the mean absolute percentage error (MAPE) in Equation (16), and the coefficient of determination (R²) in Equation (17) were used to evaluate the performance of the models:

MAE = \frac{1}{n} \sum_{i = 1}^{n} |S_{i} - O_{i}|

(13)

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(S_{i} - O_{i})}^{2}

(14)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(S_{i} - O_{i})}^{2}}

(15)

MAPE = \frac{100\%}{n} \sum_{i = 1}^{n} |\frac{S_{i} - O_{i}}{O_{i}}|

(16)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(S_{i} - O_{i})}^{2}}{\sum_{i = 1}^{n} {(\bar{O} - O_{i})}^{2}}

(17)

where O_i and S_i are the values calculated by the PM equation and the values simulated by the hybrid models, respectively; Ō is the mean of the calculated values; and n is the number of samples. MAE, MSE, RMSE, and MAPE measure the average magnitude of the error, whereas R² measures the degree of agreement between the simulated and calculated values. R² cannot exceed 1, and a value close to 1 indicates that the model describes the data accurately. Low values of MAE, MSE, RMSE, and MAPE likewise indicate that the predictions are close to the calculated values.
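For reference, the five metrics can be computed together as in the Python sketch below. It assumes that MAPE is normalized by the reference values O_i, which is the conventional definition, and the three-point example series is made up for illustration.

```python
import numpy as np


def evaluate(observed, simulated):
    """Return MAE, MSE, RMSE, MAPE (%) and R^2 for paired series."""
    o = np.asarray(observed, dtype=float)
    s = np.asarray(simulated, dtype=float)
    err = s - o
    mse = float(np.mean(err ** 2))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        # Normalised by the reference values O_i; undefined if any O_i is zero.
        "MAPE": float(100.0 * np.mean(np.abs(err / o))),
        "R2": float(1.0 - np.sum(err ** 2) / np.sum((o.mean() - o) ** 2)),
    }


scores = evaluate(observed=[2.0, 4.0, 6.0], simulated=[1.0, 5.0, 6.0])
print(scores)
```

Because MAPE divides by the reference values, it becomes unstable when ET_o is close to zero, which is worth keeping in mind when interpreting it for winter days.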

3. Results

3.1. Accuracy Evaluation of Eight Decomposition Algorithms on AS and ET_o

Figures S1–S8 show the decomposition results of the AS data for the eight decomposition algorithms (EMD, EEMD, CEEMDAN, VMD, LMD, ESMD, DWT, and EWT), which decomposed the data into multiple high- and low-frequency components as well as a residual component. However, because the algorithms differ in their principles and adaptive characteristics, the number of IMFs obtained from the same AS also varies. For instance, EMD and EEMD decomposed the AS sequence into 8 IMFs and a residual component, whereas VMD and ESMD produced 6 IMFs and 5 IMFs, respectively, each with one residual component. DWT produced 5 high-frequency components and 5 low-frequency components. By contrast, EWT decomposed the AS into 11 IMF components.

The completeness of each decomposition algorithm was used to check the accuracy of the eight decomposition methods on the AS with known components, which was crucial for the subsequent construction of the decomposition hybrid ET_o prediction models. Figure 7 shows the mean and variance of the AS and ET_o sequences reconstructed by the eight decomposition methods; none differed significantly from the original sequences (all p-values were greater than 0.05). For the AS sequences (Figure 7a), the means of six algorithms were consistent with that of the AS, the exceptions being EEMD and LMD (11.979 and 11.968). Likewise, the variances of six algorithms were consistent with that of the AS, the exceptions being VMD and EWT (51.016 and 51.659), and these differences were not significant. Similarly, the reconstructed means and variances of the ET_o sequence did not differ significantly from the original (p-values greater than 0.05; Figure 7b). Except for the variances of VMD and EWT, which were 3.490 and 3.815, the values of the other six decomposition algorithms were consistent with the original ET_o sequence. These results indicate that the eight decomposition algorithms were accurate and satisfactory both for the relatively simple AS sequences and for the more complex ET_o sequence.

3.2. Accuracy Analysis of Eight Decomposition Hybrid Models in the ET_o Test Sets

Based on the above decomposition algorithms, eight LSTM-based decomposition hybrid models were established, their prediction performance on the ET_o testing set and prediction set was evaluated, and they were compared with the standalone deep learning model, LSTM. Model accuracy was evaluated using five performance indicators: MAE, MSE, RMSE, MAPE, and R². Specifically, Figure 8, Figure 9 and Figure 10 show the R² values of the testing set (black dots) and prediction set (red dots) for 5-day, 7-day, and 10-day ahead forecasting, respectively; the values for the training set are given in Table S2. Figure 11 further compares the other performance indicators of the M0–M8 models in the testing set. The testing-set results showed that the decomposition hybrid models could improve accuracy by up to 27.6% compared with the standalone prediction model. Among them, M4 (VMD-LSTM) and M8 (EWT-LSTM) had the highest accuracy, with R² reaching 0.977 and 0.989, respectively. The remaining models, M1–M3 and M5–M7, also performed well, with R² values of about 0.9 or higher. By contrast, the standalone LSTM (M0) had the lowest accuracy, with an R² of 0.779.

3.2.1. 5-Days Ahead Forecasting

In the testing set (black dots) for 5-day ahead forecasting, the fits between simulated and calculated daily ET_o are presented in Figure 8a for the standalone LSTM model and in Figure 8b–i for the decomposition hybrid models M1–M8. The R² values of M4 and M8 were the highest, reaching 0.983 and 0.992, respectively, and those of M3, M5, M6, and M7 also exceeded 0.9. Only M0, M1, and M2 had relatively low R² values: 0.899 for M1, 0.889 for M2, and only 0.779 for M0. This study also tested whether the outputs of the eight decomposition hybrid models differed significantly from the calculated ET_o sequence in the testing set. The results in Table 1 showed no significant difference between the hybrid model predictions and the calculated ET_o, confirming the high accuracy of these models in the testing set.

For the other evaluation indicators (MAE, MSE, RMSE, and MAPE), smaller values indicate a better model. The M0 model had the largest errors, with MAE, MSE, and RMSE of 0.660, 0.873, and 0.934, respectively (Figure 11). In contrast, the errors of the hybrid decomposition models were all small. Specifically, M4 had the smallest MAE, MSE, and RMSE, at 0.069, 0.009, and 0.097, respectively, followed by M8, at 0.128, 0.030, and 0.174. For M1–M3 and M5–M7, these values ranged from 0.335 to 0.477, 0.238 to 0.453, and 0.488 to 0.673, respectively. For MAPE, M4 and M8 again had the smallest values among the M0–M8 models, at 2.397% and 5.117%, respectively, and M0 again had the highest value, reaching 25.926%.

3.2.2. 7-Days Ahead Forecasting

The prediction accuracy of 7-day ahead forecasting (Figure 9) was generally lower than that of 5-day ahead forecasting in the ET_o testing set. Specifically, the R² values of M4 and M8 were the highest, reaching 0.977 and 0.989, respectively, whereas that of M0 was the lowest, at only 0.784. M4 again had the smallest MAE, MSE, and RMSE, at 0.109, 0.021, and 0.145, respectively, followed by M8, at 0.156, 0.044, and 0.210. The M1 model had the highest MAE, MSE, and RMSE, at 0.663, 0.865, and 0.930, respectively. Similarly, the MAPE values of M4 and M8 were the smallest, at only 4.076% and 6.494%, respectively, while M1 had the highest value, reaching 25.504%.

3.2.3. 10-Days Ahead Forecasting

Similar to the 7-day results (Figure 9), in 10-day ahead forecasting (Figure 10) the R² values of M4 and M8 were still the highest, reaching 0.977 and 0.989, respectively, and that of M0 was the lowest, at only 0.784. M4 again had the smallest MAE, MSE, and RMSE, at 0.109, 0.021, and 0.145, respectively, followed by M8, at 0.156, 0.044, and 0.210. The M1 model had the highest MAE, MSE, and RMSE, at 0.663, 0.865, and 0.930, respectively. The MAPE values of M4 and M8 were again the smallest, at 4.076% and 6.494%, respectively, and that of M1 was the largest, reaching 25.504%.

3.3. Out-of-Sample Evaluation of Eight Decomposition Hybrid Models in Short-Term ET_o Prediction

The R² values between predicted and calculated ET_o in the prediction set for 5-day, 7-day, and 10-day ahead forecasting are shown in each sub-panel (red dots) of Figure 8, Figure 9 and Figure 10. For the 5-day ahead prediction period (Figure 8), the R² values of M0, M1, M4, M5, M6, and M8 were all above 0.6, with M8 the highest at 0.841. By contrast, the R² values of M2 and M7 were between 0.4 and 0.6 (0.533 and 0.446, respectively), and M3 had the lowest R², at only 0.027. The R² values for 7-day and 10-day ahead prediction decreased markedly (Figure 9 and Figure 10). In the 7-day ahead results, only the M1, M5, and M6 decomposition hybrid models achieved R² above 0.7, while the rest of the hybrid models had R² below 0.5. In the 10-day ahead results, only M1, M5, and M6 achieved R² above 0.35, while the rest of the models had R² below 0.3. To further evaluate the robustness of each model over longer forecasting windows, we calculated the relative decrease in R² from 5-day to 10-day predictions, referred to as R² degradation. Table 2 shows that the accuracy of the eight hybrid decomposition models decreased substantially with increasing prediction time. The R² values of M1, M2, and M6 decreased by about 45% from 5 to 10 days, and M5 decreased the least, by only 28.6%. By contrast, the remaining models (M0, M3, M4, M7, and M8) had poor stability; M0 and M8 in particular showed R² degradation of over 90%, indicating poor robustness.
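The R² degradation defined above is a relative change between two horizons. A minimal Python sketch follows; the function name is hypothetical and the input values are illustrative, not taken from Table 2.

```python
def r2_degradation(r2_short, r2_long):
    """Relative decrease in R^2 (percent) from the shorter to the longer horizon."""
    if r2_short <= 0:
        raise ValueError("relative degradation needs a positive short-horizon R^2")
    return (r2_short - r2_long) / r2_short * 100.0

# Hypothetical R^2 values for 5-day and 10-day ahead predictions.
print(round(r2_degradation(0.80, 0.60), 1))  # 25.0
```

A value near 0% indicates that accuracy is retained over the longer horizon, while a value near 100% indicates that almost all explanatory power is lost.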

In addition, the accuracy of each model in the prediction set was further analyzed (Figure 12 and Table 3). Although none of the eight decomposition hybrid models differed significantly from the calculated ET_o in the testing set, significant differences appeared in the prediction set. Specifically, only EMD-LSTM (M1) and ESMD-LSTM (M6) showed no significant difference from the calculated ET_o in the prediction set, with M1 having the best stability, whereas the other decomposition hybrid models differed significantly from ET_o in the three forecast periods. Therefore, EMD-LSTM (M1) and ESMD-LSTM (M6) could serve as the optimal decomposition hybrid models for ET_o prediction in arid areas.

4. Discussion

4.1. The Influence of Sequence Complexity on Model Prediction Accuracy

Improving the short-term prediction of ET_o has long been a research focus in hydrology, meteorology, and agriculture. In this study, decomposition was used as a preprocessing step before prediction, which significantly reduced the non-stationarity of the ET_o time series and the complexity of the LSTM calculation. Compared with the standalone LSTM model, the decomposition hybrid models improved accuracy by a minimum of 14.1% (EEMD-LSTM) and a maximum of 27.6% (EWT-LSTM), consistent with the results of Heddam, et al. [77], Mehdizadeh, et al. [78], and Lu, et al. [79]. Although the decomposition algorithms performed well on both the simple AS and the complex ET_o sequences (Figure 7), the complexity and length of a sequence affect the number of components obtained by decomposition, which may in turn affect model accuracy. For instance, the short and simple AS was decomposed into 4 and 11 components by LMD and EWT, respectively (Figures S1–S8), whereas the longer and more complex ET_o data were decomposed into 10 and 22 components by LMD and EWT, respectively. With increasing sequence complexity and length, the eight decomposition hybrid models showed marked changes in the fluctuation of their ET_o predictions, leading to a significant accuracy difference between the testing and prediction sets (Table 1 and Table 3).
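As a worked check of the lower bound, and assuming that the improvement is expressed relative to the R² of the standalone LSTM (the formula is not stated explicitly in the text), the 5-day testing-set values reported in Section 3.2.1 for EEMD-LSTM (M2, R² = 0.889) and LSTM (M0, R² = 0.779) give:

\frac{R^{2}_{\mathrm{M2}} - R^{2}_{\mathrm{M0}}}{R^{2}_{\mathrm{M0}}} \times 100\% = \frac{0.889 - 0.779}{0.779} \times 100\% \approx 14.1\%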

4.2. The Reasons for Accuracy Difference of Hybrid Models in Testing and Prediction Sets

On the testing set, VMD-LSTM (M4) and EWT-LSTM (M8) had the highest accuracy among the proposed decomposition hybrid models. In the prediction set, however, their accuracy dropped off rapidly. The existing literature has recorded similar testing results for these models. For example, Lu, et al. [79] compared the prediction accuracy of three decomposition hybrid models based on VMD, EMD, and EEMD for daily ET_o and found that the model using VMD had the highest accuracy. Özger, et al. [80] found that the DWT hybrid model was more accurate than the EMD decomposition hybrid model, and Karbasi, et al. [25] reported that EWT outperformed DWT in all prediction intervals. These results are broadly consistent with our testing-set results, in which the VMD and EWT hybrid models had the highest prediction accuracy (Figure 8, Figure 9, Figure 10 and Figure 11).

We attribute the difference in model accuracy between the two datasets (testing set and prediction set) to two factors. First, the working principles differ: VMD determines the modal functions by minimizing the average information between the signal and modal components [81], while EMD is an adaptive local signal decomposition method that fits envelopes to the local maxima and minima of the signal through cubic spline interpolation to form an IMF [82,83]. By contrast, DWT and EWT rest on more rigorous mathematical foundations than EMD. Selecting an appropriate wavelet basis function [78] can not only overcome the mode aliasing and boundary effects of EMD but also avoid its over-envelope and under-envelope problems [33]. Second, the decomposition effects differ: because the working principles differ, the decomposition results also differ markedly. Although this study confirmed that the decomposition-reconstruction accuracy of each algorithm meets the requirements (Figure 7), the decomposition effects of the eight methods clearly differ (Figures S1–S8). For instance, EMD decomposed the AS sequence into two high-frequency components, six low-frequency components, and one residual component (Figure S1). By contrast, VMD produced five high-frequency components, one low-frequency component, and one residual component (Figure S4), while EWT produced nine high-frequency components, one low-frequency component, and one residual component (Figure S8). Therefore, we speculate that the quality of the IMFs obtained by decomposition determines the prediction model’s performance: the higher the quality of the IMF components, the higher the accuracy of the ensemble output [84].
Specifically, the IMFs obtained by VMD and EWT not only capture the low-frequency periodic part of the AS but also decompose the high-frequency part more comprehensively, yielding many high-frequency signal details.

It should be noted that this study compared model accuracy not only in the testing set but also on the last ten ET_o values in the sequence, which were held out as the prediction set to evaluate how well the hybrid models’ predictions match the actual ET_o values in practical applications. Although the eight decomposition hybrid models performed well in the testing set, with no significant difference from the calculated values, their predictions differed significantly in the prediction set, where the advantages and disadvantages of each model were revealed (Table 1 and Table 3). Specifically, EMD-LSTM (M1) and ESMD-LSTM (M6) had the highest accuracy in actual application, with no significant difference between the predicted and calculated ET_o (Figure 12 and Table 3). The results for VMD-LSTM (M4) and EWT-LSTM (M8) were the opposite. A possible explanation for why the prediction accuracy of M4 and M8 was much lower than that of M1 and M6 is that VMD and EWT produced numerous high-frequency IMF components, which were difficult for the LSTM model to calibrate, resulting in an overestimation of ET_o in practical applications (Figure 12) [85]. In other words, an excess of high-frequency signals increases the cumulative error of the predicted values [86]. M2 (EEMD-LSTM) and M3 (CEEMDAN-LSTM) were clear cases in this study. Although EEMD and CEEMDAN address the mode aliasing and boundary effects of EMD by adding white noise, the added signals may increase the cumulative error of the IMF components. Therefore, the ET_o accuracy of EEMD-LSTM and CEEMDAN-LSTM was lower than that of EMD-LSTM.
In addition, the M2–M5 models overestimated the ET_o value (Figure 12 and Table 3), while EWT-LSTM underestimated it, which may be related to the robustness of the models. As mentioned earlier, the R² values of M1, M2, M5, and M6 decreased the least as the prediction time increased (Table 2). This may mean that the remaining models (M0, M3, M4, M7, and M8) are unstable with increasing prediction time, leading to overestimation or underestimation of ET_o. We also found that the accuracy of all models decreased as the prediction time increased. For the M1 and M6 models in particular, R² decreased relatively slowly with increasing prediction time (Table 2), which suggests that these two models may be more suitable for predictions over longer horizons.
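The separation between testing set and prediction set described above can be sketched as a chronological split. In the following Python illustration the ten-value hold-out follows the text, whereas the function name and the 80/20 training/testing ratio are assumptions made only for the example.

```python
def chronological_split(series, n_holdout=10, train_fraction=0.8):
    """Split a time series in order: training, testing, then a final hold-out block."""
    if len(series) <= n_holdout:
        raise ValueError("series is too short for the requested hold-out")
    modelling = series[:-n_holdout]   # used for training and testing
    holdout = series[-n_holdout:]     # prediction set, never seen during fitting
    cut = int(len(modelling) * train_fraction)
    return modelling[:cut], modelling[cut:], holdout

values = list(range(110))  # placeholder for a daily ET_o series
train, test, prediction = chronological_split(values)
print(len(train), len(test), len(prediction))  # 80 20 10
```

Keeping the hold-out block strictly after the testing block preserves temporal order, so the prediction set behaves like genuinely unseen future values.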

4.3. Research Inspiration on Decomposed Hybrid Models

These findings suggest that special attention should be paid to how the high-frequency details of the decomposition components are processed, which is an important step in further improving prediction accuracy. In particular, decomposition techniques such as VMD and EWT may generate many high-frequency components, which are difficult for deep learning models to calibrate. Several scholars have therefore proposed further decomposing the difficult-to-calibrate IMF into sub-series that are easier to calibrate. For instance, Prasad, et al. [87], Ahmed, et al. [88], and Karbasi, et al. [89] used a bi-decomposition model to reduce the high-frequency IMF components required for reconstruction, decomposing the IMF with the highest frequency and strongest oscillation into further IMFs and thereby simplifying the learning process of the model. However, these studies did not compare bi-decomposition with single decomposition techniques, so the effectiveness of the method has not been fully evaluated and requires further research. In addition to input optimization, many scholars also optimize the hyperparameters of each sub-model individually, adding optimization algorithms to tune the sub-model learning rate and number of hidden neurons before prediction to further improve accuracy [25,90,91]. Popular optimization algorithms include the Cuckoo Search Algorithm (CS), Grey Wolf Optimizer (GWO), Whale Optimization Algorithm (WOA), and Particle Swarm Optimization (PSO). The purpose of adding the optimization algorithm is to select and train the decomposition sequence features that are more relevant to the original sequence, thereby enhancing the accuracy of the hybrid model.
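The bi-decomposition idea can be illustrated with a short Python sketch. It is not the procedure of the cited studies: a moving-average split stands in for the secondary decomposition, the most oscillatory component is selected by counting zero crossings (a heuristic assumed here), and all names and data are hypothetical.

```python
def split_smooth_detail(x, window=3):
    """Stand-in decomposition: centred moving average (smooth) plus remainder (detail)."""
    half = window // 2
    smooth = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        smooth.append(sum(x[lo:hi]) / (hi - lo))
    detail = [xi - si for xi, si in zip(x, smooth)]
    return [detail, smooth]

def zero_crossings(x):
    """Count sign changes; a crude proxy for how oscillatory a component is."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)

def bi_decompose(components):
    """Replace the most oscillatory component with its own sub-components."""
    idx = max(range(len(components)), key=lambda i: zero_crossings(components[i]))
    sub = split_smooth_detail(components[idx])
    return components[:idx] + sub + components[idx + 1:]

fast = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # hypothetical high-frequency IMF
slow = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]      # hypothetical low-frequency IMF
refined = bi_decompose([fast, slow])
print(len(refined))  # 3
```

The refined set still sums to the same series, so the reconstruction stage is unaffected; only the number and smoothness of the sub-series passed to the predictor change.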

Although this study showed that integrating decomposition preprocessing into traditional deep learning can greatly improve prediction accuracy, the accumulated error caused by excessive high-frequency components can also lead to serious overestimation of ET_o and negatively affect prediction accuracy. Therefore, optimizing and screening the high-frequency components after decomposition (through bi-decomposition and hyperparameter optimization) is one of the important ways to improve the prediction accuracy of decomposition hybrid models in the future. In addition, this study suggests that the robustness of the decomposition hybrid model and the length of the sequence significantly affect prediction accuracy. Because this study was applied to only one meteorological station, more regions, stations, and meteorological elements are needed to verify these conclusions. Our next research will therefore focus on determining the sequence length threshold for high-precision prediction and on developing more robust prediction models for arid regions.

5. Conclusions

The results indicate that the decomposition hybrid models constructed with decomposition preprocessing have higher accuracy than the standalone model, improving accuracy by a minimum of 14.1% (EEMD-LSTM) and a maximum of 27.6% (EWT-LSTM). In the testing set, the decomposition hybrid models M4 (VMD-LSTM) and M8 (EWT-LSTM) had the highest accuracy, but they produced excessive high-frequency components, which increased the cumulative error of the sub-IMFs and resulted in an overestimation of ET_o in practical applications. By contrast, M1 (EMD-LSTM) and M6 (ESMD-LSTM) showed satisfactory performance in the testing set, and statistical testing found no significant difference between their predicted ET_o and the actual values in practical applications. Moreover, the R² values of M1 and M6 were among the most stable as the prediction time increased. Therefore, this study recommends EMD-LSTM and ESMD-LSTM as the preferred options for short-term prediction of ET_o in arid areas, followed by VMD-LSTM and EWT-LSTM. Given the significant impact of sub-IMF high-frequency components on ET_o prediction accuracy, this study suggests that strengthening the optimization and screening (bi-decomposition and hyperparameter optimization) of high-frequency components after decomposition is one of the important ways to improve the prediction accuracy of decomposition hybrid models in the future.
The strength of this study is that ET_o prediction performance was evaluated using eight decomposition-hybrid models, showing that selecting an appropriate decomposition method can balance high-frequency information against prediction accuracy, which is promising for agrometeorology, hydrological forecasting, and other fields that deal with complex fluctuating signals. The limitation of this study is that the model evaluation was applied to only one meteorological station; more regions, stations, and meteorological elements are needed to validate the accuracy of the different models.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/atmos16050535/s1. Figure S1. The intrinsic mode functions (IMFs) and residual component decomposed by EMD on the AS sequences. Figure S2. The intrinsic mode functions (IMFs) and residual component decomposed by EEMD on the AS sequences. Figure S3. The intrinsic mode functions (IMFs) and residual component decomposed by CEEMDAN on the AS sequences. Figure S4. The intrinsic mode functions (IMFs) and residual component decomposed by VMD on the AS sequences. Figure S5. The intrinsic mode functions (IMFs) and residual component decomposed by LMD on the AS sequences. Figure S6. The intrinsic mode functions (IMFs) and residual component decomposed by ESMD on the AS sequences. Figure S7. The high-frequency and low-frequency components decomposed by DWT on the AS sequences. Figure S8. The intrinsic mode functions (IMFs) and residual component decomposed by EWT on the AS sequences. Table S1. Parameter setting for hybrid forecasting algorithms. Table S2. Comparison of R², MAE, MSE, RMSE, and MAPE of eight models on training and testing datasets.

Author Contributions

Y.C.: Conceptualization, Methodology, Data curation, Writing—Original draft, Project administration. Z.L. and T.L.: Data curation, Methodology, Software, Writing—review & editing, Visualization. X.L.: Conceptualization, Writing—review & editing, Supervision, Funding acquisition, Project administration. Y.G. and S.W.: Writing-review, Investigation, Visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by grants from the National Natural Science Foundation of China [grant number 42372288]; the Scientific Innovation Practice Project of Postgraduates of Chang’an University [grant number 300103724063].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

EMD	Empirical mode decomposition
EEMD	Ensemble EMD
CEEMDAN	Complete EEMD with adaptive noise
VMD	Variational mode decomposition
LMD	Local mean decomposition
ESMD	Extreme point symmetric mode decomposition
DWT	Discrete wavelet transformation
EWT	Empirical wavelet transformation
LSTM (M0)	Long short-term memory neural network
M1	EMD-LSTM hybrid model
M2	EEMD-LSTM hybrid model
M3	CEEMDAN-LSTM hybrid model
M4	VMD-LSTM hybrid model
M5	LMD-LSTM hybrid model
M6	ESMD-LSTM hybrid model
M7	DWT-LSTM hybrid model
M8	EWT-LSTM hybrid model
IMFs	Intrinsic mode functions
ETo	Reference crop evapotranspiration
AS	Artificial sequence

References

Babaeian, F.; Delavar, M.; Morid, S.; Jamshidi, S. Designing climate change dynamic adaptive policy pathways for agricultural water management using a socio-hydrological modeling approach. J. Hydrol. 2023, 627, 130398. [Google Scholar] [CrossRef]
Nouri, M.; Homaee, M.; Pereira, L.S.; Bybordi, M. Water management dilemma in the agricultural sector of Iran: A review focusing on water governance. Agric. Water Manag. 2023, 288, 108480. [Google Scholar] [CrossRef]
Zhao, R.; Wang, H.; Chen, J.; Fu, G.; Zhan, C.; Yang, H. Quantitative analysis of nonlinear climate change impact on drought based on the standardized precipitation and evapotranspiration index. Ecol. Indic. 2021, 121, 107107. [Google Scholar] [CrossRef]
Allen, R.; Pereira, L.; Raes, D.; Smith, M. FAO Irrigation and drainage paper No. 56. Rome Food Agric. Organ. United Nations 1998, 56, 26–40. [Google Scholar]
Chen, Y.; Liu, X.; Ma, Y.; Zheng, C.; Zeng, Y.; Gao, W.; He, J.; Hao, L.; Liu, Z.; Shi, C.; et al. Regulating and remolding of soil water flux by sparse shrubs in arid desert regions. Catena 2024, 245, 108285. [Google Scholar] [CrossRef]
Chen, Y.; Liu, X.; Ma, Y.; He, J.; He, Y.; Zheng, C.; Gao, W.; Ma, C. Variability analysis and the conservation capacity of soil water storage under different vegetation types in arid regions. Catena 2023, 230, 107269. [Google Scholar] [CrossRef]
Dang, C.; Zhang, H.; Yao, C.; Mu, D.; Lyu, F.; Zhang, Y.; Zhang, S. IWRAM: A hybrid model for irrigation water demand forecasting to quantify the impacts of climate change. Agric. Water Manag. 2024, 291, 108643. [Google Scholar] [CrossRef]
Pour, S.H.; Wahab, A.K.A.; Shahid, S.; Ismail, Z.B. Changes in reference evapotranspiration and its driving factors in peninsular Malaysia. Atmos. Res. 2020, 246, 105096. [Google Scholar] [CrossRef]
Chen, Y.; Liu, X.; Zheng, C.; Ma, Y.; Gao, W.; He, J.; Hao, L.; Liu, Z.; Shi, C.; Cao, Q. Estimation of water budget components and its driving factors analysis in arid grassland. Sci. Total Environ. 2024, 906, 167654. [Google Scholar] [CrossRef]
Hayat, M.; Zha, T.; Jia, X.; Iqbal, S.; Qian, D.; Bourque, C.P.A.; Khan, A.; Tian, Y.; Bai, Y.; Liu, P.; et al. A multiple-temporal scale analysis of biophysical control of sap flow in Salix psammophila growing in a semiarid shrubland ecosystem of northwest China. Agric. For. Meteorol. 2020, 288–289, 107985. [Google Scholar] [CrossRef]
Li, X.; Zhai, J.; Sun, M.; Liu, K.; Zhao, Y.; Cao, Y.; Wang, Y. Characteristics of Changes in Sap Flow-Based Transpiration of Poplars, Locust Trees, and Willows and Their Response to Environmental Impact Factors. Forests 2024, 15, 90. [Google Scholar] [CrossRef]
Guermoui, M.; Gairaa, K.; Ferkous, K.; Santos, D.S.d.O.; Arrif, T.; Belaid, A. Potential assessment of the TVF-EMD algorithm in forecasting hourly global solar radiation: Review and case studies. J. Clean. Prod. 2023, 385, 135680. [Google Scholar] [CrossRef]
Chen, D.; Wang, Y.; Liu, S.; Wei, X.; Wang, X. Response of relative sap flow to meteorological factors under different soil moisture conditions in rainfed jujube (Ziziphus jujuba Mill.) plantations in semiarid Northwest China. Agric. Water Manag. 2014, 136, 23–33. [Google Scholar] [CrossRef]
Wu, J.; Liu, H.; Zhu, J.; Gong, L.; Xu, L.; Jin, G.; Li, J.; Hauer, R.; Xu, C. Nocturnal sap flow is mainly caused by stem refilling rather than nocturnal transpiration for Acer truncatum in urban environment. Urban For. Urban Green. 2020, 56, 126800. [Google Scholar] [CrossRef]
Chen, Z.; Zhang, Z.; Sun, G.; Chen, L.; Xu, H.; Chen, S. Biophysical controls on nocturnal sap flow in plantation forests in a semi-arid region of northern China. Agric. For. Meteorol. 2020, 284, 107904. [Google Scholar] [CrossRef]
Núñez, J.; Rivera, D.; Oyarzún, R.; Arumí, J.L. On the use of Standardized Drought Indices under decadal climate variability: Critical assessment and drought policy implications. J. Hydrol. 2014, 517, 458–470. [Google Scholar] [CrossRef]
Mouatadid, S.; Raj, N.; Deo, R.C.; Adamowski, J.F. Input selection and data-driven model performance optimization to predict the Standardized Precipitation and Evaporation Index in a drought-prone region. Atmos. Res. 2018, 212, 130–149. [Google Scholar] [CrossRef]
Mandal, N.; Chanda, K. Performance of machine learning algorithms for multi-step ahead prediction of reference evapotranspiration across various agro-climatic zones and cropping seasons. J. Hydrol. 2023, 620, 129418. [Google Scholar] [CrossRef]
Ahmadi, A.; Daccache, A.; Sadegh, M.; Snyder, R.L. Statistical and deep learning models for reference evapotranspiration time series forecasting: A comparison of accuracy, complexity, and data efficiency. Comput. Electron. Agric. 2023, 215, 108424.
Attri, I.; Awasthi, L.K.; Sharma, T.P.; Rathee, P. A review of deep learning techniques used in agriculture. Ecol. Inform. 2023, 77, 102217.
Coşkun, Ö.; Citakoglu, H. Prediction of the standardized precipitation index based on the long short-term memory and empirical mode decomposition-extreme learning machine models: The Case of Sakarya, Türkiye. Phys. Chem. Earth Parts A/B/C 2023, 131, 103418.
Zhang, Y.; Li, C.; Jiang, Y.; Sun, L.; Zhao, R.; Yan, K.; Wang, W. Accurate prediction of water quality in urban drainage network with integrated EMD-LSTM model. J. Clean. Prod. 2022, 354, 131724.
Goyal, P.; Kumar, S.; Sharda, R. A review of the Artificial Intelligence (AI) based techniques for estimating reference evapotranspiration: Current trends and future perspectives. Comput. Electron. Agric. 2023, 209, 107836.
Valipour, M.; Khoshkam, H.; Bateni, S.M.; Jun, C.; Band, S.S. Hybrid machine learning and deep learning models for multi-step-ahead daily reference evapotranspiration forecasting in different climate regions across the contiguous United States. Agric. Water Manag. 2023, 283, 108311.
Karbasi, M.; Jamei, M.; Malik, A.; Kisi, O.; Yaseen, Z.M. Multi-steps drought forecasting in arid and humid climate environments: Development of integrative machine learning model. Agric. Water Manag. 2023, 281, 108210.
Cao, Y.; Liu, S.; Cao, X.; Liu, X.; Hu, H.; Zhang, T.; Yu, L. EMD-based multi-algorithm combination model of variable weights for oil well production forecast. Energy Rep. 2022, 8, 13389–13398.
Wu, Y.; Meng, X.; Zhang, J.; He, Y.; Romo, J.A.; Dong, Y.; Lu, D. Effective LSTMs with seasonal-trend decomposition and adaptive learning and niching-based backtracking search algorithm for time series forecasting. Expert Syst. Appl. 2024, 236, 121202.
He, R.; Zhang, L.; Chew, A.W.Z. Modeling and predicting rainfall time series using seasonal-trend decomposition and machine learning. Knowl. Based Syst. 2022, 251, 109125.
Bazrkar, M.H.; Chu, X. Ensemble stationary-based support vector regression for drought prediction under changing climate. J. Hydrol. 2021, 603, 127059.
Ohana-Levi, N.; Munitz, S.; Ben-Gal, A.; Schwartz, A.; Peeters, A.; Netzer, Y. Multiseasonal grapevine water consumption—Drivers and forecasting. Agric. For. Meteorol. 2020, 280, 107796.
Fu, T.; Li, X.; Jia, R.; Feng, L. A novel integrated method based on a machine learning model for estimating evapotranspiration in dryland. J. Hydrol. 2021, 603, 126881.
Sharma, G.; Singh, A.; Jain, S. DeepEvap: Deep reinforcement learning based ensemble approach for estimating reference evapotranspiration. Appl. Soft Comput. 2022, 125, 109113.
Ghozat, A.; Sharafati, A.; Babak Haji Seyed Asadollah, S.; Motta, D. A novel intelligent approach for predicting meteorological drought based on satellite-based precipitation product: Application of an EMD-DFA-DBN hybrid model. Comput. Electron. Agric. 2023, 211, 107946.
Wu, Z.; Huang, N.E. A study of the characteristics of white noise using the empirical mode decomposition method. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 2004, 460, 1597–1611.
Liu, K.; Chen, Y.; Wu, B.; Gao, F.; Waheed, A.; Han, F.; Cao, Y.; Wu, J.; Xu, H. Multiple temporal scale variation characteristics and driving factors of arid inland runoff: A case study of Urumqi River, China. J. Hydrol. Reg. Stud. 2025, 58, 102298.
Torres, M.E.; Colominas, M.A.; Schlotthauer, G.; Flandrin, P. A complete ensemble empirical mode decomposition with adaptive noise. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Prague, Czech Republic, 22–27 May 2011; pp. 4144–4147.
Wu, Z.; Huang, N. Ensemble Empirical Mode Decomposition: A Noise-Assisted Data Analysis Method. Adv. Adapt. Data Anal. 2009, 1, 1–41.
Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544.
Wang, J.-L.; Li, Z.-J. Extreme-Point Symmetric Mode Decomposition Method for Data Analysis. Adv. Adapt. Data Anal. 2013, 5, 1350015.
Gilles, J. Empirical Wavelet Transform. IEEE Trans. Signal Process. 2013, 61, 3999–4010.
Liu, X.; Du, H.; Li, S.; Liu, X.; Fan, Y.; Wang, T. Dynamics of soil wind erosion in the Mu Us sandy land (in northern China) affected by cropland reclamation from 2000 to 2020. Ecol. Indic. 2023, 154, 110717.
Chen, Y.; He, J.; He, Y.; Gao, W.; Zheng, C.; Liu, X. Seasonal hydrological traits in Salix psammophila and its responses to soil moisture and meteorological factors in desert areas. Ecol. Indic. 2022, 136, 108626.
Zheng, C.; Chen, Y.; Gao, W.; Liang, X.; Šimůnek, J.; Liu, X. Water transfer mechanisms and vapor flow effects in seasonally frozen soils. J. Hydrol. 2023, 627, 130401.
Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.-C.; Tung, C.C.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 1998, 454, 903–995.
Qiu, R.; Li, L.; Liu, C.; Wang, Z.; Zhang, B.; Liu, Z. Evapotranspiration estimation using a modified crop coefficient model in a rotated rice-winter wheat system. Agric. Water Manag. 2022, 264, 107501.
Deng, N.; Grassini, P.; Yang, H.; Huang, J.; Cassman, K.G.; Peng, S. Closing yield gaps for rice self-sufficiency in China. Nat. Commun. 2019, 10, 1725.
Feng, X.; Klingaman, N.P.; Hodges, K.I. Poleward migration of western North Pacific tropical cyclones related to changes in cyclone seasonality. Nat. Commun. 2021, 12, 6210.
Cerqueira, V.; Torgo, L.; Mozetič, I. Evaluating time series forecasting models: An empirical study on performance estimation methods. Mach. Learn. 2020, 109, 1997–2028.
Marquez-Grajales, A.; Villegas-Vega, R.; Salas-Martinez, F.; Acosta-Mesa, H.G.; Mezura-Montes, E. Characterizing drought prediction with deep learning: A literature review. MethodsX 2024, 13, 102800.
Salas-Martínez, F.; Valdés-Rodríguez, O.A.; Palacios-Wassenaar, O.M.; Márquez-Grajales, A. Analysis of the Evolution of Drought through SPI and Its Relationship with the Agricultural Sector in the Central Zone of the State of Veracruz, Mexico. Agronomy 2021, 11, 2099.
Salas-Martínez, F.; Márquez-Grajales, A.; Valdés-Rodríguez, O.-A.; Palacios-Wassenaar, O.-M.; Pérez-Castro, N. Prediction of agricultural drought behavior using the Long Short-Term Memory Network (LSTM) in the central area of the Gulf of Mexico. Theor. Appl. Climatol. 2024, 155, 7887–7907.
Wang, W.-c.; Chau, K.; Xu, D.-M.; Chen, X.-Y. Improving Forecasting Accuracy of Annual Runoff Time Series Using ARIMA Based on EEMD Decomposition. Water Resour. Manag. 2015, 29, 2655–2675.
Tan, Q.; Wang, X.; Wang, H.; Wen, X.; Ji, Y.; Kang, A.-q. An adaptive middle and long-term runoff forecast model using EEMD-ANN hybrid approach. J. Hydrol. 2018, 567, 767–780.
Cao, J.; Li, Z.; Li, J. Financial time series forecasting model based on CEEMDAN and LSTM. Phys. A Stat. Mech. Appl. 2018, 519, 127–139.
Parri, S.; Teeparthi, K. VMD-SCINet: A hybrid model for improved wind speed forecasting. Earth Sci. Inform. 2023, 17, 329–350.
Chen, J.; Che, A.; Wang, L. Cumulative damage evolution rule of rock slope based on shaking table test using VMD-HT. Eng. Geol. 2023, 314, 107003.
Mondal, A.; Le, M.-H.; Lakshmi, V. Land use, climate, and water change in the Vietnamese Mekong Delta (VMD) using earth observation and hydrological modeling. J. Hydrol. Reg. Stud. 2022, 42, 101132.
Smith, J. The local mean decomposition and its application to EEG perception data. J. R. Soc. Interface 2005, 2, 443–454.
Pham, H.T.H.; Bui, L.T. Mechanism of erosion zone formation based on hydrodynamic factor analysis in the Mekong Delta coast, Vietnam. Environ. Technol. Innov. 2023, 30, 103094.
Lu, T.; Yu, F.; Wang, J.; Wang, X.; Mudugamuwa, A.; Wang, Y.; Han, B. Application of adaptive complementary ensemble local mean decomposition in underwater acoustic signal processing. Appl. Acoust. 2021, 178, 107966.
Li, Y.; Xu, M.; Haiyang, Z.; Wei, Y.; Huang, W. A new rotating machinery fault diagnosis method based on improved local mean decomposition. Digit. Signal Process. 2015, 46, 201–214.
Liu, H.; Han, M. A fault diagnosis method based on local mean decomposition and multi-scale entropy for roller bearings. Mech. Mach. Theory 2014, 75, 67–78.
Ngoc-Lan Huynh, A.; Deo, R.C.; Ali, M.; Abdulla, S.; Raj, N. Novel short-term solar radiation hybrid model: Long short-term memory network integrated with robust local mean decomposition. Appl. Energy 2021, 298, 117193.
Wang, X.; Li, X.; Li, S. Point and interval forecasting system for crude oil price based on complete ensemble extreme-point symmetric mode decomposition with adaptive noise and intelligent optimization algorithm. Appl. Energy 2022, 328, 120194.
Gao, Y.; Wang, B.; Chen, F.; Zhang, W.; Zhou, D.; Wu, F.; Chen, D. Multi-step wind speed prediction based on LSSVM combined with ESMD and fractional-order beetle swarm optimization. Energy Rep. 2023, 9, 6114–6134.
Geetha, K.; Hota, M.K.; Karras, D.A. A novel approach for seismic signal denoising using optimized discrete wavelet transform via honey badger optimization algorithm. J. Appl. Geophys. 2023, 219, 105236.
Li, Y.; Peng, T.; Zhang, C.; Sun, W.; Hua, L.; Ji, C.; Muhammad Shahzad, N. Multi-step ahead wind speed forecasting approach coupling maximal overlap discrete wavelet transform, improved grey wolf optimization algorithm and long short-term memory. Renew. Energy 2022, 196, 1115–1126.
Ni, C.; Peng, W. An integrated approach using empirical wavelet transform and a convolutional neural network for wave power prediction. Ocean Eng. 2023, 276, 114231.
Gu, Q.; Chang, Y.; Xiong, N.; Chen, L. Forecasting Nickel futures price based on the empirical wavelet transform and gradient boosting decision trees. Appl. Soft Comput. 2021, 109, 107472.
Hochreiter, S.; Schmidhuber, J. Long Short-term Memory. Neural Comput. 1997, 9, 1735–1780.
Bian, L.; Qin, X.; Zhang, C.; Guo, P.; Wu, H. Application, interpretability and prediction of machine learning method combined with LSTM and LightGBM-a case study for runoff simulation in an arid area. J. Hydrol. 2023, 625, 130091.
Tripathy, K.P.; Mishra, A.K. Deep learning in hydrology and water resources disciplines: Concepts, methods, applications, and research directions. J. Hydrol. 2023, 628, 130458.
Bian, J.; Hou, T.; Ren, D.; Lin, C.; Qiao, X.; Ma, X.; Ma, J.; Wang, Y.; Wang, J.; Liang, X. Predicting mine water inflow volumes using a decomposition-optimization algorithm-machine learning approach. Sci. Rep. 2024, 14, 17777.
Dong, J.; Xing, L.; Cui, N.; Zhao, L.; Guo, L.; Gong, D. Standardized precipitation evapotranspiration index (SPEI) estimated using variant long short-term memory network at four climatic zones of China. Comput. Electron. Agric. 2023, 213, 108253.
Nourani, V.; Hosseini Baghanam, A.; Adamowski, J.; Kisi, O. Applications of hybrid wavelet–Artificial Intelligence models in hydrology: A review. J. Hydrol. 2014, 514, 358–377.
Zhu, X.; Guo, H.; Huang, J.J.; Tian, S.; Zhang, Z. A hybrid decomposition and Machine learning model for forecasting Chlorophyll-a and total nitrogen concentration in coastal waters. J. Hydrol. 2023, 619, 129207.
Heddam, S.; Merabet, K.; Difi, S.; Kim, S.; Ptak, M.; Sojka, M.; Zounemat-Kermani, M.; Kisi, O. River water temperature prediction using hybrid machine learning coupled signal decomposition: EWT versus MODWT. Ecol. Inform. 2023, 78, 102376.
Mehdizadeh, S.; Ahmadi, F.; Danandeh Mehr, A.; Safari, M.J.S. Drought modeling using classic time series and hybrid wavelet-gene expression programming models. J. Hydrol. 2020, 587, 125017.
Lu, Y.; Li, T.; Hu, H.; Zeng, X. Short-term prediction of reference crop evapotranspiration based on machine learning with different decomposition methods in arid areas of China. Agric. Water Manag. 2023, 279, 108175.
Özger, M.; Başakın, E.E.; Ekmekcioğlu, Ö.; Hacısüleyman, V. Comparison of wavelet and empirical mode decomposition hybrid models in drought prediction. Comput. Electron. Agric. 2020, 179, 105851.
Chen, C.; Hao, P.; Liu, J.; Lei, N.; Jiang, J.; Diao, X.; Gu, W. Pipeline Leak AE Signal Denoising Based on Improved SSA-K-α Index-VMD-MD. IEEE Sens. J. 2023, 23, 26177–26194.
Sahani, M.; Dash, P.K.; Samal, D. A real-time power quality events recognition using variational mode decomposition and online-sequential extreme learning machine. Measurement 2020, 157, 107597.
He, W.; Hao, T.; Ke, H.; Zheng, W.; Lin, K. Joint time-frequency analysis of ground penetrating radar data based on variational mode decomposition. J. Appl. Geophys. 2020, 181, 104146.
Ni, L.; Wang, D.; Wu, J.; Wang, Y.; Tao, Y.; Zhang, J.; Liu, J. Streamflow forecasting using extreme gradient boosting model coupled with Gaussian mixture model. J. Hydrol. 2020, 586, 124901.
Jamei, M.; Ali, M.; Malik, A.; Karbasi, M.; Rai, P.; Yaseen, Z.M. Development of a TVF-EMD-based multi-decomposition technique integrated with Encoder-Decoder-Bidirectional-LSTM for monthly rainfall forecasting. J. Hydrol. 2023, 617, 129105.
Ng, K.W.; Huang, Y.F.; Koo, C.H.; Chong, K.L.; El-Shafie, A.; Najah Ahmed, A. A review of hybrid deep learning applications for streamflow forecasting. J. Hydrol. 2023, 625, 130141.
Prasad, R.; Ali, M.; Xiang, Y.; Khan, H. A double decomposition-based modelling approach to forecast weekly solar radiation. Renew. Energy 2020, 152, 9–22.
Ahmed, A.A.M.; Deo, R.C.; Ghahramani, A.; Feng, Q.; Raj, N.; Yin, Z.; Yang, L. New double decomposition deep learning methods for river water level forecasting. Sci. Total Environ. 2022, 831, 154722.
Karbasi, M.; Jamei, M.; Ali, M.; Malik, A.; Chu, X.; Farooque, A.A.; Yaseen, Z.M. Development of an enhanced bidirectional recurrent neural network combined with time-varying filter-based empirical mode decomposition to forecast weekly reference evapotranspiration. Agric. Water Manag. 2023, 290, 108604.
Zheng, Z.; Ali, M.; Jamei, M.; Xiang, Y.; Karbasi, M.; Yaseen, Z.M.; Farooque, A.A. Design data decomposition-based reference evapotranspiration forecasting model: A soft feature filter based deep learning driven approach. Eng. Appl. Artif. Intell. 2023, 121, 105984.
Ali, M.; Deo, R.C.; Maraseni, T.; Downs, N.J. Improving SPI-derived drought forecasts incorporating synoptic-scale climate indices in multi-phase multivariate empirical mode decomposition model hybridized with simulated annealing and kernel ridge regression algorithms. J. Hydrol. 2019, 576, 164–184.

Figure 1. The geographical location of Yulin Meteorological Station in Mu Us Sandy Land.

Figure 2. The artificial sequence (a) and ET_o data series (b), and data splitting of ET_o.

Figure 3. Flowchart and schematic diagram of the EMD algorithm decomposition (right panel adapted from http://perso.ens-lyon.fr/patrick.flandrin/emd.html, accessed on 18 April 2021).

Figure 4. Structure of the basic memory cell in the LSTM. I^t, F^t, U^t, O^t, S^t, and Y^t refer to the input gate, forget gate, update gate, output gate, cell state, and output variable, respectively. (Adapted from Bian et al. [73] and Dong et al. [74].)

Figure 5. Flowchart of the hybrid forecast model.

Figure 6. Schematic flowchart of the modelling strategy.

Figure 7. Accuracy comparison of the artificial sequences (a) and ET_o sequences (b) with the reconstructed sequences obtained by the eight decomposition methods. Shared lowercase letters indicate no significant difference between decomposition algorithms (significance level 0.05).

Figure 8. Scatter plots of daily ET_o simulated by (a) the single LSTM and (b–i) the eight decomposition-hybrid models. Each sub-panel plots 5-day-ahead forecasted vs. observed ET_o values.

Figure 9. Scatter plots of daily ET_o simulated by (a) the single LSTM and (b–i) the eight decomposition-hybrid models. Each sub-panel plots 7-day-ahead forecasted vs. observed ET_o values.

Figure 10. Scatter plots of daily ET_o simulated by (a) the single LSTM and (b–i) the eight decomposition-hybrid models. Each sub-panel plots 10-day-ahead forecasted vs. observed ET_o values.

Figure 11. Radar diagrams of accuracy results on the ET_o testing set for (a) 5-day, (b) 7-day, and (c) 10-day forecasts.

Figure 12. Comparison of forecasted and actual ET_o for the eight decomposition models at (a) 5 days, (b) 7 days, and (c) 10 days.

Table 1. Statistical comparison of testing-set results for the eight decomposition models at the 5-day, 7-day, and 10-day horizons.

Model	Number	Sum	Average	Variance	F-Value	p-Value	Difference Significant?
ET_o	2923 (5-days)	9122.493	3.121	3.942	-	-	-
M0		8992.026	3.076	2.951	0.845	0.358	N
M1		9165.941	3.136	3.242	0.090	0.764	N
M2		9196.112	3.146	3.093	0.264	0.608	N
M3		9181.587	3.141	3.423	0.162	0.687	N
M4		9036.122	3.091	3.424	0.347	0.556	N
M5		9175.254	3.139	3.361	0.130	0.718	N
M6		9226.005	3.156	3.536	0.490	0.484	N
M7		9212.231	3.152	3.602	0.365	0.546	N
M8		9035.326	3.091	3.835	0.334	0.563	N
ET_o	2925 (7-days)	9124.484	3.120	3.942	-	-	-
M0		8853.740	3.027	2.850	3.689	0.055	N
M1		9168.946	3.135	3.201	0.095	0.758	N
M2		9149.429	3.128	3.077	0.030	0.862	N
M3		9132.602	3.122	3.318	0.003	0.956	N
M4		9066.354	3.100	3.334	0.159	0.690	N
M5		9139.003	3.124	3.296	0.010	0.921	N
M6		9135.891	3.123	3.477	0.006	0.938	N
M7		9112.408	3.115	3.522	0.007	0.935	N
M8		9083.527	3.106	3.744	0.075	0.785	N
ET_o	2928 (10-days)	9126.990	3.117	3.944	-	-	-
M0		8856.531	3.025	2.852	3.676	0.055	N
M1		9171.911	3.133	3.202	0.097	0.756	N
M2		9153.073	3.126	3.077	0.033	0.856	N
M3		9136.846	3.121	3.318	0.005	0.946	N
M4		9070.584	3.098	3.334	0.149	0.699	N
M5		9143.889	3.123	3.295	0.014	0.908	N
M6		9139.766	3.122	3.476	0.008	0.931	N
M7		9115.694	3.113	3.523	0.006	0.939	N
M8		9084.852	3.103	3.747	0.079	0.779	N

Note: “Sum” and “Average” indicate the total and mean values of the corresponding dataset, respectively. “N” represents “NO”.
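
The F-values in Table 1 can be approximately reproduced from the tabulated sums and variances alone. The following is a minimal sketch, not the authors' procedure: it assumes the test is a two-group one-way ANOVA comparing observed ET_o with one model's output, that both groups have the same sample size, and that the variances are sample variances. The function name `f_from_summary` is hypothetical.

```python
def f_from_summary(n, mean_a, var_a, mean_b, var_b):
    """Two-group one-way ANOVA F statistic from summary statistics.

    Assumption: both groups contain n samples, and var_a / var_b are
    sample variances (n - 1 denominator).
    """
    grand_mean = (mean_a + mean_b) / 2.0
    # Between-group mean square: two groups give one degree of freedom.
    ms_between = n * ((mean_a - grand_mean) ** 2 + (mean_b - grand_mean) ** 2)
    # Within-group mean square: pooled variance, 2 * (n - 1) degrees of freedom.
    ms_within = ((n - 1) * var_a + (n - 1) * var_b) / (2 * (n - 1))
    return ms_between / ms_within


# 5-day rows of Table 1: observed ET_o vs. M0.
# Means are recomputed from Sum / Number to limit rounding error.
n = 2923
f_m0 = f_from_summary(n, 9122.493 / n, 3.942, 8992.026 / n, 2.951)
print(round(f_m0, 3))  # close to the tabulated F-value of 0.845
```

Under these assumptions, the result agrees with the tabulated value to within rounding, which suggests the "Difference Significant?" column reflects a comparison of group means rather than of pointwise errors.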

Table 2. Comparison of forecast-set R² values for the eight decomposition models at the 5-day and 10-day horizons.

Model	R² for 5-Days	R² for 10-Days	R² Degradation (%)
M0	0.638	0.017	97.30%
M1	0.718	0.381	46.90%
M2	0.533	0.294	44.80%
M3	0.027	0.012	55.50%
M4	0.754	0.204	72.90%
M5	0.626	0.447	28.60%
M6	0.723	0.400	44.70%
M7	0.446	0.209	53.10%
M8	0.841	0.073	91.30%

Note: R² Degradation (%) = ((R²₅ − R²₁₀)/R²₅) × 100.
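
The degradation formula in the note can be checked directly against the tabulated R² values. The sketch below is illustrative only; the function name `r2_degradation` and the dictionary layout are hypothetical, and the R² pairs are copied from Table 2.

```python
def r2_degradation(r2_5day, r2_10day):
    """Relative loss in R² from the 5-day to the 10-day horizon, in percent."""
    if r2_5day <= 0:
        raise ValueError("5-day R² must be positive for a relative change")
    return (r2_5day - r2_10day) / r2_5day * 100.0


# (R² for 5 days, R² for 10 days), copied from Table 2.
table2 = {
    "M0": (0.638, 0.017),
    "M1": (0.718, 0.381),
    "M5": (0.626, 0.447),
    "M8": (0.841, 0.073),
}

for model, (r2_5, r2_10) in table2.items():
    print(f"{model}: {r2_degradation(r2_5, r2_10):.1f}%")
```

Because the tabulated R² values are rounded to three decimals, recomputed percentages may differ from the table in the last digit.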

Table 3. Statistical comparison of forecast-set results for the eight decomposition models at the 5-day, 7-day, and 10-day horizons.

Model	Number	Sum	Average	Variance	F-Value	p-Value	Difference Significant?
ET_o	5-days	3.557	0.711	0.003	-	-	-
M0		4.998	1.000	0.002	87.278	0.000	Y
M1		3.506	0.701	0.010	0.040	0.846	N
M2		4.864	0.973	0.004	49.171	0.000	Y
M3		7.069	1.414	0.008	232.407	0.000	Y
M4		5.379	1.076	0.043	14.436	0.005	Y
M5		5.891	1.178	0.007	114.245	0.000	Y
M6		3.989	0.798	0.035	0.999	0.347	N
M7		4.330	0.866	0.011	8.573	0.019	Y
M8		2.415	0.483	0.027	8.701	0.018	Y
ET_o	7-days	5.547	0.792	0.022	-	-	-
M0		7.164	1.023	0.011	11.491	0.005	Y
M1		5.167	0.738	0.012	0.615	0.448	N
M2		7.525	1.075	0.006	20.374	0.000	Y
M3		9.798	1.400	0.005	95.336	0.000	Y
M4		8.135	1.162	0.036	16.432	0.002	Y
M5		9.695	1.385	0.040	40.010	0.000	Y
M6		6.143	0.878	0.050	0.703	0.418	N
M7		6.884	0.983	0.017	6.608	0.025	Y
M8		3.506	0.501	0.033	10.778	0.007	Y
ET_o	10-days	8.051	0.805	0.020	-	-	-
M0		9.950	0.995	0.009	12.203	0.003	Y
M1		8.141	0.814	0.023	0.019	0.892	N
M2		11.178	1.118	0.009	33.441	0.000	Y
M3		14.041	1.404	0.004	148.168	0.000	Y
M4		12.361	1.236	0.048	27.086	0.000	Y
M5		14.572	1.457	0.040	70.572	0.000	Y
M6		10.017	1.002	0.074	4.113	0.058	N
M7		10.182	1.018	0.015	12.978	0.002	Y
M8		4.829	0.483	0.024	23.685	0.000	Y

Note: “Sum” and “Average” indicate the total and mean values of the corresponding dataset, respectively. “N” represents “NO” and “Y” represents “YES”.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, Y.; Liu, Z.; Long, T.; Liu, X.; Gao, Y.; Wang, S. Evaluation of Eight Decomposition-Hybrid Models for Short-Term Daily Reference Evapotranspiration Prediction. Atmosphere 2025, 16, 535. https://doi.org/10.3390/atmos16050535

