Applying Multiscale Entropy to the Complexity Analysis of Rainfall-runoff Relationships

This paper presents a novel framework for the complexity analysis of rainfall, runoff, and runoff coefficient (RC) time series using multiscale entropy (MSE). The MSE analysis of RC time series was used to investigate changes in the complexity of rainfall-runoff processes due to human activities. First, a coarse-graining process was applied to each time series. The sample entropy was then computed for each coarse-grained time series and plotted as a function of the scale factor. The proposed method was tested in a case study of daily rainfall and runoff data for the upstream Wu-Tu watershed. Results show that the entropy measures of rainfall time series are higher than those of runoff time series at all scale factors. The entropy measures of the RC time series lie between those of the rainfall and runoff time series at various scale factors. Results also show that the entropy values of rainfall, runoff, and RC time series increase as scale factors increase. The changes in the complexity of RC time series indicate changes in rainfall-runoff relations due to human activities and provide a reference for the selection of rainfall-runoff models: models that are capable of dealing with great complexity and that take obvious self-similarity into account can be suggested for the modeling of rainfall-runoff processes. Moreover, the robustness of the MSE results was tested, confirming that MSE analysis is consistent and gives the same results when 25% of the data are removed, which makes this approach suitable for the complexity analysis of rainfall, runoff, and RC time series.


Introduction
Heavy rainfall and flooding are among the disasters that cause the greatest loss of property and life in Taiwan. It is therefore essential to study the relation between the rainfall and runoff processes. Usually, observed rainfall and runoff data are applied to rainfall-runoff models to investigate the relation between rainfall and runoff. Jakeman and Hornberger [1] indicated that the information content in a rainfall-runoff record is sufficient to support models of only very limited complexity. This raises the question of what limits observed data should place on the allowable complexity of rainfall-runoff models [1]. In practice, it is difficult to identify the underlying mechanisms that produced the data. The runoff coefficient (RC), defined as the ratio of the runoff to the rainfall, can demonstrate how much rainfall is transformed into runoff for a specific duration, and represents the transformed relation of rainfall-runoff processes in a river basin. Therefore, this study applies the RC time series to investigate the rainfall-runoff relationship.
The concept of entropy originated in the field of thermodynamics. After more than 100 years of development, the applications of entropy have gone far beyond thermodynamics and statistical physics. Entropy has directly or indirectly permeated the domains of information theory, mathematics, geosciences, life science, social science, and more [2]. In 1948, Shannon carried the concept of Boltzmann entropy from statistical thermodynamics into information theory, treating entropy as a measure of the uncertainty or information quantity of events. This established the concept of information entropy, which is the basic theory of modern information theory. Information entropy has become an important tool to measure and analyze the complexity of various physical and non-physical systems [2].
Entropy is commonly used to characterize the complexity of a signal. Kolmogorov-Sinai (KS) entropy can be used to represent complexity, and is computed from the average production rate of new information. KS entropy is the basis of approximate entropy [3]. Approximate entropy is effective for analyzing the complexity of short time series. Richman and Moorman [4] defined sample entropy, which is a modification of approximate entropy. However, traditional sample entropy measures only quantify the regularity of time series. Each type of observation data may have different characteristics at various scales because of its complexity. When the data analysis is limited to a single scale, it is therefore difficult to explore the characteristics of the data, and the processing results lack stability [2].
Both KS entropy and approximate entropy are determined using a single scale, so these methods neglect the characteristics that appear when the number of scales exceeds one. Costa et al. [5,6] defined and applied multiscale entropy (MSE), which is based on sample entropy and is effective in analyzing the complexity of physiological signals. Their results demonstrate that the MSE curves of diseased hearts show a significant decrease in sample entropy on multiple time scales, indicating a lower degree of complexity. Valuable hidden information may be obtained by excavating the characteristics of signals at various scales [7]. Traditional methods, which are determined using a single scale, ignore sequential properties across scales, and may thereby misinterpret reductions in complexity and the complexity ordering of different time series [8]. Greater entropy values are not always associated with an increase in complexity [8]. For example, although there is no underlying dynamical structure in a white Gaussian noise time series, the time series returns a high entropy value [8]. The MSE, an aggregate analysis method, has been suggested to overcome the limitations of single-scale entropy measures by using an averaging window that varies in scale to compute entropy over different variations of the same data set [8]. For this purpose, this study applies MSE to the complexity analysis of rainfall, runoff, and RC time series.
Hydrological time series, such as rainfall, runoff, and RC time series, usually operate across time scales and have more complexity than those on a single scale. Hence, using traditional entropy measures to describe the complexity of hydrological time series will be incomplete [9]. Compared to traditional entropy measures, MSE curves are used to compare the relative complexity of time series and are more effective for analyzing series with multiple temporal scale characteristics [7]. In this study, the MSE is applied to RC time series to investigate the complexity of rainfall-runoff processes.
MSE analysis, which is a method of measuring the complexity of a finite length time series, has been applied to the analysis of multiscale hydrological time series. Li and Zhang [9] applied MSE analysis to long-term (131 year) daily flow rates (Q) of the Mississippi River (MR) to investigate possible changes in the complexity of the MR system caused by human activities since the 1940s. They found that the sample entropy for Q of the MR and its two components, overland flow (OF) and base flow (BF), generally increases as the time scale increases, and that the MR may have been losing its complexity since the 1940s. Chou [10] presented a novel framework to determine the number of resolution levels in the application of a wavelet transformation to a rainfall time series. He decomposed rainfall time series using the à trous wavelet transform. Then, he applied MSE analysis, which helps elucidate some hidden characteristics of the original rainfall time series, to the decomposed rainfall time series. His results show that the Mann-Kendall (MK) rank correlation test of MSE curves of residuals at various resolution levels can determine the number of resolution levels in the wavelet decomposition. The complexity of rainfall time series at four stations was compared across multiple scales. These results reveal that the suggested number of resolution levels can be obtained using MSE analysis and the MK test. The complexity of rainfall time series at various locations can also be analyzed to provide a reference for water resource planning and application. Zhou et al. [11] used MSE analysis to study the effects of water reservoirs on river flow records based on long streamflow series at four representative hydrological stations. They found that before the construction of the water reservoirs, the complexity of the streamflow series decreased from the upper reaches to the lower reach of the East River. The construction of water reservoirs greatly increased the degree of complexity of the hydrological processes. Their results have theoretical and scientific merit for conservation of the ecological environment and water resource management under the influences of climate changes and intensifying human activities.
The central objectives of this study were to apply MSE analysis to rainfall, runoff, and RC time series. The entropy measures of these time series at various scale factors can be compared and used for complexity analysis. The following sections describe the structure of the MSE, present a case study in Taiwan to demonstrate the effectiveness of the presented method, and discuss the results to draw the conclusions.

Sample Entropy
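Sample entropy provides a measure of an "orderly structure" in a time series, by testing if there are any repeated patterns of various lengths. The sample entropy is the exact value of the negative average natural log of the conditional probability. If the data length of the original time series $\{x(1), x(2), \ldots, x(N)\}$ is N, then the sample entropy of the time series is calculated as follows [4,12].
(I) Construct the m-dimensional vectors $X(i) = [x(i), x(i+1), \ldots, x(i+m-1)]$, $i = 1, \ldots, N-m+1$. Each vector comprises a time series in which m data points are consecutive, from $x(i)$ to $x(i+m-1)$.
(II) Define the distance between X(i) and X(j) as follows: $d[X(i), X(j)] = \max_{0 \le k \le m-1} |x(i+k) - x(j+k)|$
(III) For each i, count the number of distances $d[X(i), X(j)]$, $j \ne i$, that are smaller than a given threshold r. Then, compute the ratio of this number to the total number N − m, as follows: $B_i^m(r) = \frac{1}{N-m} \, \#\{\, j \ne i : d[X(i), X(j)] < r \,\}$
(IV) Compute the average of $B_i^m(r)$ for all i, as follows: $B^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} B_i^m(r)$
(V) Increase the dimension to m + 1 and repeat steps (I) to (IV) to obtain $B^{m+1}(r)$.
(VI) The theoretical sample entropy of the time series is given by: $\mathrm{SampEn}(m, r) = \lim_{N \to \infty} \left\{ -\ln \left[ B^{m+1}(r) / B^m(r) \right] \right\}$. For a finite data length N, it is estimated as $\mathrm{SampEn}(m, r, N) = -\ln \left[ B^{m+1}(r) / B^m(r) \right]$.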
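The sample entropy calculation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `sample_entropy`, the default tolerance of 0.20 times the standard deviation, and the use of the same N − m template start points for both lengths m and m + 1 (a common convention following Richman and Moorman [4]) are choices made here, and the normalisation may differ marginally from other formulations for short series.

```python
import numpy as np


def sample_entropy(x, m=2, r=None):
    """Estimate SampEn(m, r, N) = -ln(A / B) for a one-dimensional series.

    B counts template pairs of length m whose Chebyshev distance is < r,
    A counts the same for length m + 1; self-matches are excluded.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < m + 2:
        raise ValueError("series is too short for the chosen m")
    if r is None:
        r = 0.2 * np.std(x)  # assumed default tolerance: 0.20 SD

    def count_matches(length):
        # Assumption: use the first n - m start points for both lengths,
        # so the counts for m and m + 1 refer to the same templates.
        k = n - m
        dist = np.zeros((k, k))
        for offset in range(length):
            seg = x[offset:offset + k]
            # Running maximum of |x(i+offset) - x(j+offset)| over offsets.
            dist = np.maximum(dist, np.abs(seg[:, None] - seg[None, :]))
        np.fill_diagonal(dist, np.inf)  # exclude self-matches (i == j)
        return np.count_nonzero(dist < r)

    b = count_matches(m)
    a = count_matches(m + 1)
    if b == 0:
        return float("nan")  # undefined: no matches of length m
    if a == 0:
        return float("inf")  # no matches of length m + 1
    return float(-np.log(a / b))
```

A strictly periodic series should give a value of zero, because every pattern that matches for m points also matches at the next point, whereas an uncorrelated random series gives a clearly positive value.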

Multiscale Entropy
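Sample entropy is evaluated at a single scale factor, so it does not include the characteristics of the time series that appear when the scale factor exceeds one. Costa et al. [5-7] developed a multiscale entropy analysis based on the definition of the sample entropy. Suppose that $\{x(1), x(2), \ldots, x(N)\}$ is a time series of length N. For a given scale factor τ, the coarse-grained time series $\{y^{(\tau)}\}$ is constructed by averaging the data points within non-overlapping windows of length τ:
$$y^{(\tau)}(j) = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x(i), \quad 1 \le j \le N/\tau$$
The sample entropy is then computed for each coarse-grained time series and plotted as a function of the scale factor. This procedure, called multiscale entropy analysis, is used to analyze entropy when the scale factor exceeds one. When τ = 1, $\{y^{(1)}\}$ equals the original time series.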
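The coarse-graining and multiscale steps can be sketched as follows. This is an illustrative sketch rather than the study's code: the function names `coarse_grain`, `sample_entropy`, and `multiscale_entropy` are hypothetical, and fixing the tolerance r from the standard deviation of the original series for all scales follows the convention of Costa et al. [5], which is assumed here rather than stated in the text.

```python
import numpy as np


def coarse_grain(x, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    x = np.asarray(x, dtype=float)
    n_blocks = x.size // scale  # incomplete trailing window is dropped
    return x[:n_blocks * scale].reshape(n_blocks, scale).mean(axis=1)


def sample_entropy(x, m, r):
    """Compact SampEn estimate (-ln(A / B)) with absolute tolerance r."""
    x = np.asarray(x, dtype=float)
    k = x.size - m  # number of template start points used for m and m + 1
    if k < 2:
        return float("nan")

    def count_matches(length):
        dist = np.zeros((k, k))
        for offset in range(length):
            seg = x[offset:offset + k]
            dist = np.maximum(dist, np.abs(seg[:, None] - seg[None, :]))
        np.fill_diagonal(dist, np.inf)  # exclude self-matches
        return np.count_nonzero(dist < r)

    b, a = count_matches(m), count_matches(m + 1)
    return float(-np.log(a / b)) if a > 0 and b > 0 else float("nan")


def multiscale_entropy(x, max_scale=20, m=2, r_factor=0.2):
    """Sample entropy of the coarse-grained series for scales 1..max_scale."""
    x = np.asarray(x, dtype=float)
    # Assumption: r is fixed from the SD of the original series.
    r = r_factor * np.std(x)
    return np.array([
        sample_entropy(coarse_grain(x, scale), m, r)
        for scale in range(1, max_scale + 1)
    ])
```

Calling `multiscale_entropy(series, max_scale=20)` returns one entropy value per scale factor, which can be plotted against the scale factor to obtain an MSE curve. For uncorrelated white noise the curve is expected to decrease with the scale factor, in line with the behaviour reported by Costa et al. [5].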

Application and Analysis
This study demonstrates the feasibility of applying the proposed MSE method to the complexity analysis of rainfall-runoff processes using the Wu-Tu watershed located in northern Taiwan as a case study. The watershed area is 203 km² (Figure 1). Because of the topography of this watershed, the runoff pathlines are short and steep, and rainfall is nonuniform in both time and space. Large floods develop quickly in the middle-to-downstream reaches of the watershed, causing severe damage. The daily rainfall and runoff data for 1966-1994 were collected. The basin average rainfall data were obtained from the Jui-Fang, Huo-Shao-Liao, and Wu-Tu weather stations using the block kriging method. The daily runoff data were obtained from the Wu-Tu hydrological station. The ratio of the runoff to the rainfall was defined as the RC. The RC demonstrates how much rainfall is transformed into runoff at a specific duration, and represents the transformed relationship of rainfall-runoff processes in a river basin. This study applies MSE analysis to the rainfall, runoff, and RC time series to extract some of their hidden characteristics. The basin average rainfall, runoff, and RC time series, along with a magnified view of one year, are shown in Figure 2.
Figure 3 shows that the entropy measures of the three rainfall stations are similar at various scale factors. The complexities of these three rainfall stations in the same watershed are almost the same. In addition, the entropy measures of average rainfall are similar to those of the rainfall time series at the three stations at various scale factors. This finding confirms that the average rainfall is a suitable input for the rainfall-runoff system, and does not enhance the complexity of rainfall time series. The MSE calculates the sample entropy at various scale factors. When the entropy measures increase as the scale factors increase, this indicates that the time series have obvious self-similarity and great complexity [13]. Figure 3 confirms that the entropy measures of the rainfall time series at the three stations and of the average rainfall time series increase when scale factors increase. All of the above time series have obvious self-similarity and great complexity.
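The definition of the RC as the ratio of runoff to rainfall can be illustrated with a short sketch. The function name `runoff_coefficient` and the `min_rain` threshold are hypothetical, the sketch assumes that rainfall and runoff are expressed in the same depth units (for example, mm per day), and marking days without rainfall as missing is an assumption made here because the text does not describe how such days were treated.

```python
import numpy as np


def runoff_coefficient(rainfall, runoff, min_rain=1e-6):
    """Return runoff / rainfall per time step; steps with no rain become NaN.

    Both inputs are assumed to be in the same depth units. `min_rain` is a
    hypothetical threshold below which rainfall is treated as zero.
    """
    rainfall = np.asarray(rainfall, dtype=float)
    runoff = np.asarray(runoff, dtype=float)
    if rainfall.shape != runoff.shape:
        raise ValueError("rainfall and runoff must have the same shape")
    rc = np.full(rainfall.shape, np.nan)
    wet = rainfall > min_rain  # the ratio is only defined where rain fell
    rc[wet] = runoff[wet] / rainfall[wet]
    return rc
```

How the undefined steps are handled before the entropy calculation (for example, dropping them or aggregating over a longer duration) affects the resulting RC series, so that choice should be reported alongside the MSE results.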

Results and Discussion
When the entropy measures of one time series are higher than those of another time series at most scale factors, the former is more complex than the latter [7]. Figure 4 shows that the entropy measures of the rainfall time series are higher than those of the runoff time series at all scale factors. This suggests that the rainfall time series is more complex than the runoff time series. In addition, these data provide evidence that the rainfall time series with high complexity is transformed into the runoff time series with low complexity via the response of rainfall-runoff processes. Figure 4 shows that the entropy measures of rainfall, runoff, and RC time series increase as the scale factors increase. These results imply that these time series have self-similarity and great complexity. Figure 4 also shows that the entropy measures of RC are between those of the rainfall and runoff time series at all scale factors. The RC demonstrates how much rainfall is transformed into runoff at a specific duration, and represents the transformed relationship of rainfall-runoff processes in a river basin. The RC time series, which can be used to represent the rainfall-runoff relationship, are less complex than the rainfall time series and more complex than the runoff time series. The increase in the entropy values of rainfall, runoff, and RC time series with the scale factor is indicative of the obvious self-similarity and great complexity contained in the data [13]. The changes in the complexity of RC time series indicate changes in rainfall-runoff relations due to human activities and provide a reference for the selection of rainfall-runoff models: models that are capable of dealing with great complexity and that take obvious self-similarity into account can be suggested for the modeling of rainfall-runoff processes.
In this study, the finding that the sample entropy for runoff time series generally increases with the scale factor is similar to that of Li and Zhang [9]. To test the robustness of MSE results, the same analyses were performed after removing the last 25% of the time series (Tables 1 and 2). Greater entropy values are not always associated with an increase in complexity when traditional single-scale methods are used [8]. The average entropy measure, proposed by Chou, is used to compare the average complexity of rainfall time series at four different stations [10]. For example, Costa et al. [5] tested the MSE method on simulated white and 1/f noises. They found that for scale one, a higher value of entropy (about 2.48) is assigned to the white noise time series than to the 1/f time series (about 1.80). However, while the value of entropy for the coarse-grained white noise time series monotonically decreases, the value of entropy for the coarse-grained 1/f series remains almost constant for all scales [5]. To further test the robustness of MSE results, the same analyses were performed for r = 0.15 SD (Figures 5 and 6) and r = 0.25 SD (Figures 7 and 8). The same analyses were also performed for m = 3 and r = 0.20 SD (Figures 9 and 10). Figures 5-10 indicate that the entropy values of RC time series also increase with scale factors. These results again confirm the changes in the complexity of rainfall-runoff relations. Table 3 shows the average entropy measures of the rainfall time series for average rainfall and the three stations obtained from different m and r values. Table 4 shows the average entropy measures of the rainfall, runoff, and RC time series obtained from different m and r values. These results confirm that MSE analysis is consistent for various values of m and r. That is, if one time series has higher entropy measures than another time series, the results will be the same for various values of m and r. In this study, the MSE values with m = 2 and r = 0.20 SD can be used to robustly perform the complexity analysis of the rainfall-runoff relationship.
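The consistency criterion described above, namely that the complexity ordering of the time series is the same for various values of m and r, can be checked with a small sketch. Several points are assumptions: the average entropy measure is taken here as the arithmetic mean of the sample entropy over the scale factors (the text does not spell out the formula), the function names are hypothetical, and the curves below are synthetic placeholders, not values from Tables 3 and 4 or from any figure.

```python
import numpy as np


def average_entropy(curve):
    """Assumed definition: mean sample entropy over all scale factors."""
    return float(np.mean(np.asarray(curve, dtype=float)))


def complexity_ranking(curves):
    """Order series names from highest to lowest average entropy."""
    return sorted(curves, key=lambda name: average_entropy(curves[name]),
                  reverse=True)


def ranking_is_consistent(curves_by_setting):
    """True if every (m, r) setting yields the same complexity ordering."""
    rankings = [complexity_ranking(c) for c in curves_by_setting.values()]
    return all(rank == rankings[0] for rank in rankings)


# Synthetic placeholder MSE curves (NOT results from the paper), used only
# to show the expected data layout: one curve per series per setting.
scales = np.arange(1, 21)
trend = 0.02 * scales
curves_by_setting = {
    "m=2, r=0.20SD": {"rainfall": 0.6 + trend, "RC": 0.4 + trend,
                      "runoff": 0.2 + trend},
    "m=2, r=0.15SD": {"rainfall": 0.7 + trend, "RC": 0.5 + trend,
                      "runoff": 0.3 + trend},
}
```

With real MSE curves in place of the placeholders, `ranking_is_consistent(curves_by_setting)` returning `True` corresponds to the statement that a series with higher entropy measures than another keeps that ordering across the tested values of m and r.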

Conclusions
This study applies MSE, which has the ability to represent the complexity of signals, to the complexity analysis of rainfall, runoff, and RC time series. The results show that the entropy measures of rainfall, runoff, and RC time series increase when scale factors increase. All of the above time series have self-similarity and great complexity. In addition, the entropy measures of rainfall time series are higher than those of runoff time series at all scale factors. These results demonstrate that the rainfall time series is more complex than the runoff time series. The RC time series, which can be used to represent the rainfall-runoff relationship, are less complex than the rainfall time series and more complex than the runoff time series. These results also show that the rainfall time series with high complexity is transformed into the runoff time series with low complexity through rainfall-runoff processes.
The rainfall-runoff process in a river basin can be conceptualized as a single input-output system. The average rainfall obtained from several rainfall stations is usually considered as the input of hydrological systems. Results show that the entropy measures of the three rainfall stations are similar at various scale factors. That is, the complexities of these three rainfall stations in the same watershed are almost the same. In addition, the entropy measures of average rainfall are similar to those of the rainfall time series at all three stations at various scale factors. These results confirm that average rainfall is a suitable input for the rainfall-runoff system.
To test the robustness of MSE results, the same analyses were performed by removing 25% of the time series. These results confirm that the MSE is not sensitive when removing part of the data. The influence of removing 25% of the time series on the MSE is small. The same analyses were also performed for different m and r values, with no change in results. These results confirm that MSE analysis is consistent for various values of m and r. That is, if one time series has higher entropy measures than another time series, the results will be the same for various values of m and r. In this study, the MSE values with m = 2 and r = 0.20 SD can be used to robustly perform the complexity analysis of the rainfall-runoff relationship.
The analysis of sample entropy with a single scale neglects the multiscale hidden characteristics of rainfall, runoff, and RC time series. This study indicates that the variation in the entropy measures of rainfall, runoff, and RC time series as the scale factor increases can be obtained using MSE analysis. This study also shows that MSE supports a more complete complexity analysis than sample entropy using only one scale factor. Analyzing rainfall, runoff, and RC time series using MSE analysis also helps extract some of their hidden characteristics because of its capability of expressing complex time series.
In this study, the MSE has been applied to RC time series to investigate the rainfall-runoff relationship. Because the MSE curve of RC time series shows that the entropy values increase with scale factors, RC time series contain information at various scales. That is, RC time series have great complexity and obvious self-similarity. Therefore, the MSE analysis of RC time series can be used to investigate changes in the complexity of rainfall-runoff processes due to human activities. In addition, the results provide a reference for selecting rainfall-runoff models that can deal with great complexity and take obvious self-similarity into account.