Application of Entropy Spectral Method for Streamflow Forecasting in Northwest China

Streamflow forecasting is vital for reservoir operation, flood control, power generation, river ecological restoration, irrigation and navigation. Although monthly streamflow time series are statistic, they also exhibit seasonal and periodic patterns. Using maximum Burg entropy, maximum configurational entropy and minimum relative entropy, the forecasting models for monthly streamflow series were constructed for five hydrological stations in northwest China. The evaluation criteria of average relative error (RE), root mean square error (RMSE), correlation coefficient (R) and determination coefficient (DC) were selected as performance metrics. Results indicated that the RESA model had the highest forecasting accuracy, followed by the CESA model. However, the BESA model had the highest forecasting accuracy in a low-flow period, and the prediction accuracies of RESA and CESA models in the flood season were relatively higher. In future research, these entropy spectral analysis methods can further be applied to other rivers to verify the applicability in the forecasting of monthly streamflow in China.


Introduction
Accurate streamflow forecasting is vital for flood control, reservoir management, restoration of river environment, irrigation, and navigation, among other uses [1]. Moreover, it can also provide guidelines for policy makers in the utilization and management of water resources and the formulation of water environmental health protection plans. So far, the simulation of monthly streamflow is a hotspot for hydrologic researchers but is still in exploration and development due to the limitations of forecasting methods. As a traditional method, time series analyses such as autoregressive (AR) or autoregressive moving average (ARMA) models are often used to simulate streamflow, but they cannot address the issue of seasonality that exists in the monthly streamflow series [2]. Fortunately, entropy spectral analysis can extract significant information from streamflow process and forecast monthly streamflow accurately coupled with the time series analysis method. Actually, the spectral method has been successfully used by some researchers for monthly streamflow forecasting with different types of entropy including Burg entropy [3], configuration entropy [1,2], and minimum relative entropy [4,5].
Burg [6] proposed Burg entropy theory (BET) in the frequency domain and then further developed the maximum Burg entropy spectral method (BESA) for time series forecasting. As a classic method for hydrologic forecasting, BESA has been widely used in groundwater level forecasting [7], flood

Methods
Suppose there is a streamflow series, y(t). Convert it to the frequency domain f. If f is considered a random variable, the spectral density function is normalized as a probability function. Burg entropy can then be expressed as: where W = 1/2·∆t is the Nyquist fold-over frequency and ∆t is the sampling period. The definition of configuration entropy is similar to Burg entropy and is defined as: With the given prior spectral density function q(f ), the relative entropy can be defined as: The prior spectral density is like background noise in the peak of observed periodicity. When spectral density is uniform, the relative entropy reduces to a configuration entropy.
The processes shown in Figure 1 mainly include (1) calculating parameters; (2) determining spectral density function; (3) extending autocorrelation function; and (4) forecasting streamflow and comparing the three methods for the selection of the most appropriate method.

Deriving Spectral Density Function
In order to obtain the least biased spectral density, under some given constraints, the Burg and configuration entropies are maximized while the relative entropy is minimized before spectral density estimation. According to the relationship of spectral density function and autocorrelation function, the constraints could be given as: where  -1 i , ( ) n  is the autocorrelation function of n-th lag; N usually equals 1/4 to 1/2 of the streamflow length series.
Subject to the constraints, entropy can be maximized or minimized using the Lagrangian function, which can be formulated as: where λn is the Lagrangian multiplier and H(f) is the entropy function. The partial derivatives of L to the spectral density are taken and then equated to zero. The least biased spectral densities obtained by maximizing Burg entropy and configuration entropy and minimizing relative entropy respectively, are expressed as follows:

Deriving Spectral Density Function
In order to obtain the least biased spectral density, under some given constraints, the Burg and configuration entropies are maximized while the relative entropy is minimized before spectral density estimation. According to the relationship of spectral density function and autocorrelation function, the constraints could be given as: where i = √ −1, ρ(n) is the autocorrelation function of n-th lag; N usually equals 1/4 to 1/2 of the streamflow length series.
Subject to the constraints, entropy can be maximized or minimized using the Lagrangian function, which can be formulated as: where λ n is the Lagrangian multiplier and H(f ) is the entropy function. The partial derivatives of L to the spectral density are taken and then equated to zero. The least biased spectral densities obtained by maximizing Burg entropy and configuration entropy and minimizing relative entropy respectively, are expressed as follows:

Calculating Lagrangian Multipliers
The methods of calculating Lagrangian multipliers are different due to the variation in the forms of spectral densities. For Burg entropy, Levinson-Burg algorithms [6,17] are applied to determine Lagrangian multipliers. While in the case of configuration entropy and relative entropy, cepstrum Take the prior and posterior cepstrum of autocorrelations which are transformed from the prior and posterior spectral densities and expressed as e q (n) and e p (n) in the following equations: Then Equation (9) can be abbreviated as: where δ n is a delta function. Lagrangian multipliers can be solved using N linear functions from Equation (12) of the relative entropy For configuration entropy, e q = 0, Lagrangian multipliers can then be calculated by: λ n = −1 + e p (n); n = 0 e p (n); n = 0 (14)

Forecasting Streamflow
BESA allows autocorrelation to expand as a linear combination of previous autocorrelation parameters with predicted coefficients. ρ N+k can be expressed as: where a j is obtained using the reflection recursion method proposed by Burg [18]. For the configurational and relative entropies, the autocorrelations are extended as: and According to the extended autocorrelation functions, the forecasting equations of the three spectral entropies methods are obtained as follows: where C p (T + k) is the cepstrum of streamflow series, and it always equals 1 2 e p (N + k). m is the order of the model, which is determined by BIC criteria.
where N is the length of streamflow series and σ ε 2 is the variance of residual of observed and forecasted streamflow.

Evaluating the Precision of Forecasting Results
In this paper, we selected average relative error (RE), root mean square error (RMSE), correlation coefficient (R) and determination coefficient (DC) as evaluation indicators for the forecasted results. The RE, RMSE, R and DC are expressed as: where x represents the average value of observed streamflow x(t), f represents the average value of the forecasted streamflow f (t), and n is forecasting period (month). According to the "forecasting norm for hydrology intelligence", the determination coefficient (DC) is classified into three levels as shown in Table 1.

Data Preprocessing
Observed streamflow data from five hydrological stations, Yingluoxia, Zamusi, Jiutiaoling, Xiangtang and Tangnaihai, in Northwest China were selected to verify these three spectral entropy methods. These five hydrological stations are located in the Yellow River, Heihe River and Shiyang River, respectively. Tangnaihai station is located at the mainstream of the Yellow River, while Xiangtang is located at the tributary of the Yellow river. Zamusi and Jiutiaoling stations are situated on the Shiyang River. Yingluoxia station is located at the Heihe River and it marks the boundary between the upstream and middle reaches [1]. Basic information on the five hydrological stations are shown in Figure 2 and Table 2.

Data Preprocessing
Observed streamflow data from five hydrological stations, Yingluoxia, Zamusi, Jiutiaoling, Xiangtang and Tangnaihai, in Northwest China were selected to verify these three spectral entropy methods. These five hydrological stations are located in the Yellow River, Heihe River and Shiyang River, respectively. Tangnaihai station is located at the mainstream of the Yellow River, while Xiangtang is located at the tributary of the Yellow river. Zamusi and Jiutiaoling stations are situated on the Shiyang River. Yingluoxia station is located at the Heihe River and it marks the boundary between the upstream and middle reaches [1]. Basic information on the five hydrological stations are shown in Figure 2 and Table 2.  The entropy spectral analysis model belongs to the autocorrelation methods, and the input data should be a standardized stationary random sequence. To meet the requirement, the streamflow sequences should be transformed using the Box-Cox method. Box-Cox transformation can eliminate data skewness and make data errors present a normal distribution [17]. In addition, standardized transformation was also performed on the sequences.
To test whether transformed sequences were stable, we verified the unit root of sequences. If the unit root exists in the sequence, it is not a stationary random sequence and vice versa [19]. The adftest function in the econometric toolbox of MATLAB 2010b (2010b, MathWorks, Beingjing, China, https://ww2.mathworks.cn/products/matlab-online.html) was used to test whether the unit root exists. The adftest function assumes that the unit root does not exist in the sequence. If the hypothesis is true, the logical value of H is 1 and the confidence can be returned. If the hypothesis is false, the logical value of H is 0. The test results of all transformed streamflow sequences for five hydrological stations show that all the sequences are stable and homogeneous (Table 3). Table 3. Adftest test results of monthly streamflow in each hydrologic station.  The entropy spectral analysis model belongs to the autocorrelation methods, and the input data should be a standardized stationary random sequence. To meet the requirement, the streamflow sequences should be transformed using the Box-Cox method. Box-Cox transformation can eliminate data skewness and make data errors present a normal distribution [17]. In addition, standardized transformation was also performed on the sequences.
To test whether transformed sequences were stable, we verified the unit root of sequences. If the unit root exists in the sequence, it is not a stationary random sequence and vice versa [19]. The adftest function in the econometric toolbox of MATLAB 2010b (2010b, MathWorks, Beingjing, China, https://ww2.mathworks.cn/products/matlab-online.html) was used to test whether the unit root exists. The adftest function assumes that the unit root does not exist in the sequence. If the hypothesis is true, the logical value of H is 1 and the confidence can be returned. If the hypothesis is false, the logical value of H is 0. The test results of all transformed streamflow sequences for five hydrological stations show that all the sequences are stable and homogeneous (Table 3).

Determining Training Period
In previous research, the training periods were always less than 100 months, and very few papers discussed the influence of the training period on the forecasted results. In this paper, we selected the observed streamflow data from the years 2008 to 2012 as the validation period. Additionally, observed data from 3 to 40 months were selected as a training period to evaluate the influence of the training period on forecasted results. In order to determine the period of models, the relationship between the different training periods and the optimal order of the models are explained in Figure 3. As seen in Figure 3, when the training period is short, the optimal fitting order of models is lower, and then the optimal order of models tends to be stable with the increase of training period.

Determining Training Period
In previous research, the training periods were always less than 100 months, and very few papers discussed the influence of the training period on the forecasted results. In this paper, we selected the observed streamflow data from the years 2008 to 2012 as the validation period. Additionally, observed data from 3 to 40 months were selected as a training period to evaluate the influence of the training period on forecasted results. In order to determine the period of models, the relationship between the different training periods and the optimal order of the models are explained in Figure 3. As seen in Figure 3, when the training period is short, the optimal fitting order of models is lower, and then the optimal order of models tends to be stable with the increase of training period. Beyond that, the relationship between the training period and the DC of the validation period were explored (Figure 4). The forecasting effect is weak and not stable enough when the training period is less than 15 years. However, increase in the training period increases and stabilizes the DC. In order to make use of the expert opinion to increase the forecasting precision, the calibration period was determined as 26 years.  Beyond that, the relationship between the training period and the DC of the validation period were explored (Figure 4). The forecasting effect is weak and not stable enough when the training period is less than 15 years. However, increase in the training period increases and stabilizes the DC. In order to make use of the expert opinion to increase the forecasting precision, the calibration period was determined as 26 years.

Determining Training Period
In previous research, the training periods were always less than 100 months, and very few papers discussed the influence of the training period on the forecasted results. In this paper, we selected the observed streamflow data from the years 2008 to 2012 as the validation period. Additionally, observed data from 3 to 40 months were selected as a training period to evaluate the influence of the training period on forecasted results. In order to determine the period of models, the relationship between the different training periods and the optimal order of the models are explained in Figure 3. As seen in Figure 3, when the training period is short, the optimal fitting order of models is lower, and then the optimal order of models tends to be stable with the increase of training period. Beyond that, the relationship between the training period and the DC of the validation period were explored (Figure 4). The forecasting effect is weak and not stable enough when the training period is less than 15 years. However, increase in the training period increases and stabilizes the DC. In order to make use of the expert opinion to increase the forecasting precision, the calibration period was determined as 26 years.  Spectral analysis is a powerful method employed to check the periodicity by finding out the frequency of spectrum peaks. The spectral densities estimated by these three spectral entropy methods were compared to the spectral density estimated by fast Fourier transform (FFT) ( Figure 5). Five representative rivers were chosen to show the ability of BESA, CESA, and RESA to estimate the spectral densities. For RESA, a prior spectral density function was hypothesized from data information. The determination process of prior spectral density functions is described in Appendix A. It can be discovered from Figure 5 that all of the stations displayed a peak at frequency 1/12. On the other hand, there were other peaks near frequency 1/4th and 1/6th in the spectral density at Zamusi, Jiutiaoling and Tangnaihai stations. For uni-peak streamflow series, the BESA, CESA, and RESA can check the periodicity equally as well as FFT. However, for multi-peak streamflow series, the BESA did not perform as effectively in detecting the principal periodicity. On the contrary, the CESA and RESA correctly checked the most significant peak at the 1/12th frequency. However, CESA always neglects all secondary spectral peaks to keep the peak at 1/12th most significant. The RESA detected less significant peaks, and was consistent with the FFT results.
In order to examine whether this variation would affect the forecasting precision, we used these three methods to forecast streamflow in five hydrological stations for selecting the optimal model in northwest China.

Estimating Spectral Density
Spectral analysis is a powerful method employed to check the periodicity by finding out the frequency of spectrum peaks. The spectral densities estimated by these three spectral entropy methods were compared to the spectral density estimated by fast Fourier transform (FFT) ( Figure 5). Five representative rivers were chosen to show the ability of BESA, CESA, and RESA to estimate the spectral densities. For RESA, a prior spectral density function was hypothesized from data information. The determination process of prior spectral density functions is described in Appendix A. It can be discovered from Figure 5 that all of the stations displayed a peak at frequency 1/12. On the other hand, there were other peaks near frequency 1/4th and 1/6th in the spectral density at Zamusi, Jiutiaoling and Tangnaihai stations. Spectral analysis is a powerful method employed to check the periodicity by finding out the frequency of spectrum peaks. The spectral densities estimated by these three spectral entropy methods were compared to the spectral density estimated by fast Fourier transform (FFT) ( Figure 5). Five representative rivers were chosen to show the ability of BESA, CESA, and RESA to estimate the spectral densities. For RESA, a prior spectral density function was hypothesized from data information. The determination process of prior spectral density functions is described in Appendix A. It can be discovered from Figure 5 that all of the stations displayed a peak at frequency 1/12. On the other hand, there were other peaks near frequency 1/4th and 1/6th in the spectral density at Zamusi, Jiutiaoling and Tangnaihai stations. For uni-peak streamflow series, the BESA, CESA, and RESA can check the periodicity equally as well as FFT. However, for multi-peak streamflow series, the BESA did not perform as effectively in detecting the principal periodicity. On the contrary, the CESA and RESA correctly checked the most significant peak at the 1/12th frequency. However, CESA always neglects all secondary spectral peaks to keep the peak at 1/12th most significant. The RESA detected less significant peaks, and was consistent with the FFT results.
In order to examine whether this variation would affect the forecasting precision, we used these three methods to forecast streamflow in five hydrological stations for selecting the optimal model in northwest China. For uni-peak streamflow series, the BESA, CESA, and RESA can check the periodicity equally as well as FFT. However, for multi-peak streamflow series, the BESA did not perform as effectively in detecting the principal periodicity. On the contrary, the CESA and RESA correctly checked the most significant peak at the 1/12th frequency. However, CESA always neglects all secondary spectral peaks to keep the peak at 1/12th most significant. The RESA detected less significant peaks, and was consistent with the FFT results. In order to examine whether this variation would affect the forecasting precision, we used these three methods to forecast streamflow in five hydrological stations for selecting the optimal model in northwest China.

Streamflow Forecasting Analysis
Streamflow was forecasted using three spectral entropy methods for five hydrological stations (Figures 6 and 7) with a validation period of five years. The results indicated that the forecasting accuracy was worse in Tangnaihai station where the DC is less than 0.6 ( Table 4) and belongs to level C compared with the other four stations. The reason for this may be that the catchment area of Tangnaihai station is much wider than other stations. Moreover, the intensive anthropogenic activities might also have a severe impact on the streamflow of Tangnaihai station. Therefore, it is difficult to accurately forecast streamflow with only streamflow from previous months using autoregression-based models.

Streamflow Forecasting Analysis
Streamflow was forecasted using three spectral entropy methods for five hydrological stations ( Figure 6 and Figure 7) with a validation period of five years. The results indicated that the forecasting accuracy was worse in Tangnaihai station where the DC is less than 0.6 ( Table 4) and belongs to level C compared with the other four stations. The reason for this may be that the catchment area of Tangnaihai station is much wider than other stations. Moreover, the intensive anthropogenic activities might also have a severe impact on the streamflow of Tangnaihai station. Therefore, it is difficult to accurately forecast streamflow with only streamflow from previous months using autoregression-based models.  By comparing the forecasting accuracy of the three models for five hydrological stations during the validation period, we discovered that the rank of forecasting accuracy with the evaluation criteria of DC, RMSE and R for the three models was in the order RESA > CESA > BESA for Yingluoxia station (Table 4). However, for Zamusi station of Shiyang River, the accuracy of the CESA model was higher than the other models (Table 4, Figure 6). For the remaining three hydrological stations, the accuracy was similar for the three models, and the RESA model was more accurate than CESA and BESA models using DC, RMSE and R criteria. However, the RE between the observed streamflow and forecasted streamflow using BESA was smaller than other methods. The reason for this is that RE reflects the linear error between observed values and forecasted values, while the RMSE, R and DC reflect the quadratic power error between observed values and forecasted values. When the forecasting error of the flood season was smaller, the RMSE, R and DC would be effective. However, when the forecasting error of the non-flood season were smaller, RE would be better.  To verify this conjecture, the whole period was divided into flood season from July to October and low-flow season from January to June, November, and December in each year. We extracted the forecasted streamflow of the non-flood season and compared it with the observed streamflow in the five stations (Table 5). As shown, BESA performs better than other methods. During the low flow season, the advantage of BESA over the others was significant, where the streamflow was forecasted close to the observation. However, the overall forecasting accuracy of the RESA model and the CESA model was higher. At the same time, because the streamflow forecasting itself serves as the optimal allocation of water resources, the annual or flood runoff prediction was more meaningful. As a whole, the RESA model can better adapt to the streamflow forecasting for the five hydrological stations in northwest China. Combining precipitation as a predictor, selecting one or more models with high accuracy in the flood season, and using the entropy spectrum model and its combination [1] to forecast streamflow could be a future research direction.

Conclusions
In this paper, the BESA, CESA, and RESA models were applied for spectral analysis and streamflow forecasting in northwest China using monthly streamflow data from five hydrological stations. The estimated spectral density and prediction accuracy of the three methods were compared based on the optimal length of the training period. The spectral density functions of the BESA, CESA and RESA was smoother than that of FFT, and all of them can clearly estimate the 12 month primary period of monthly streamflow sequence without deviation.
However, the spectral density function of BESA could not detect the other significant secondary periods, while that of CESA could detect the secondary periods for multi-period sequences despite a certain degree of leakage. By comparing these three entropy spectral methods, we discovered that all of these methods could forecast streamflow accurately. Among them, the RESA model has the highest prediction accuracy, followed by the CESA model.
Due to the lack of data, this paper only applied the entropy spectral theory to the monthly streamflow forecasting of few rivers in northwest China. In future research, three entropy spectral analysis methods can further be applied to other rivers to verify the applicability of the three entropy spectral analysis methods in the forecasting of monthly streamflow in China.
In this paper, the spectral density of the monthly streamflow sequence in each hydrological station was estimated to determine the major periods and their intensity. The prior spectral density for RESA was determined based on the estimated spectral density of CESA. Through the spectrum analysis of BESA and CESA, it can be found that all sequences have significant period of 12 months. There are 6 months and 4 months minor periods in most hydrological stations. Based on the spectral estimation values of each hydrological station, the prior spectral density functions mainly consist of six types. The specific settings are shown in Table A1. Table A1. Hypothesis on the prior spectral density.

Number
Period Spectral Density Function To find the optimal prior spectral density function for RESA, the Itakura-Saito distance (I-S) between the CESA spectral density and the prior RESA spectral density are calculated. The formula for solving I-S distance of two discrete sequences is as follows: where P 1 ( f ) and P 2 ( f ) are spectral functions of two different sequences, D I-S represents I-S distance of two spectral functions. The smaller the I-S distance values are, the smaller the difference between the two spectral density functions is. The distance between the estimated CESA spectral function of each hydrological station and the assumed I-S is shown in Table A2. For each hydrological station, the hypothesis corresponding to the minimum I-S distance is the closest to the spectral density estimated by CESA, which should be taken as the prior spectral density function. Results are shown in Table A2. Table A2. Itakura-Saito distance between CESA spectral density and each hypothesis spectral density for RESA. Note: Boldface represents the optimal spectral density functions for RESA in five hydrologic stations.