Article

Assessment of Short Term Rainfall and Stream Flows in South Australia

1
Centre for Water Management and Reuse (CWMR), School of Natural and Built Environments, University of South Australia, Mawson Lakes, SA 5095, Australia
2
CSIRO Computational Informatics (CCI), Hobart, TAS 7001, Australia
*
Author to whom correspondence should be addressed.
Water 2014, 6(11), 3528-3544; https://doi.org/10.3390/w6113528
Received: 12 June 2014 / Revised: 17 September 2014 / Accepted: 12 November 2014 / Published: 19 November 2014
(This article belongs to the Special Issue Water Resources in a Variable and Changing Climate)

Abstract

The aim of this study is to assess the relationship between rainfall and stream flow at Broughton River in Mooroola, Torrance River in Mount Pleasant, and Wakefield River near Rhynie, in South Australia, from 1990 to 2010. Initially, we present a short term relationship between rainfall and stream flow, in terms of correlations, lagged correlations, and estimated variability between wavelet coefficients at each level. A deterministic regression-based response model is used to detect linear, quadratic and polynomial trends, while allowing for seasonality effects. Antecedent rainfall data were considered to predict stream flow. The best fitting model was selected based on the maximum adjusted R² value ($R_{adj}^2$), the minimum sigma squared (σ²), and the minimum Akaike Information Criterion (AIC). The best performance in the response model is obtained with lag rainfall, covering at least one day and up to 7 days (past) difference in rainfall, including offset cross products of lag rainfall. With the inclusion of antecedent stream flow as an input with a one day time lag, the results show a significant improvement in the $R_{adj}^2$ values from 0.18, 0.26 and 0.14 to 0.35, 0.42 and 0.21 at Broughton River, Torrance River and Wakefield River, respectively. A benchmark comparison was made with an Artificial Neural Network analysis. The optimization strategy involved adopting a minimum mean absolute error (MAE).

1. Introduction

A review of rainfall-runoff modeling has been given by [1]. Rainfall and stream flow models can be applied to a diverse range of purposes including daily control of reservoirs, projecting future stream flows and flood management. Rainfall and stream flow models can be classified as physically based, conceptual and empirical. Physically based models include the Système Hydrologique Européen with sediment and solute transport [2] and Gridded Surface Subsurface Hydrologic Analysis [3], both of which require extensive spatial and temporal data and are typically used for small catchments. An example of a conceptual model is the Modèle du Génie Rural à 4 paramètres Journalier (GR4J), which has been developed for understanding catchment hydrological behavior [4]. Other examples of conceptual rainfall-runoff models are the Sacramento Soil Moisture Accounting Model [5] and the SIMulation and HYDrologic model (SIMHYD) [6], which can be applied in either lumped or gridded form. SIMHYD estimates daily stream flows from daily rainfall and areal potential evapotranspiration data. The class of empirical models includes time series models [7,8,9,10,11,12,13]. An advantage of an empirical model is that it can be fitted to situations where the hydrological data are restricted to rainfall and stream flow time series. A further advantage is that, in a parametric test, a distribution can be fitted for assessing the hydrological behavior for any time period in any region. In addition, empirical models can represent either linear or non-linear relationships. Time series models have been reported to perform as well as physically based alternatives [14]. A conceptual model has been combined with an artificial neural network (ANN) for forecasting inflow into the Daecheong Dam in Korea [15]. The wavelet decompositions of rainfall and runoff at four sites in the Tianshan Mountains have also been compared [16]. The authors aimed to distinguish between errors in timing and errors in magnitude of hydrograph peaks. They used a cross-wavelet technique to quantify timing errors and hence provided an empirical adjustment to model predictions of stream flow.
In this study, we propose a novel method for assessing short-term rainfall and stream flow models. The travel time between rainfall and stream flow gauges has previously been estimated using cross-correlation functions [10,17]; the travel time was reported to be less than one day for the Onkaparinga catchment in South Australia. In this paper, we presume that there is a higher order relationship between rainfall gauge and stream flow data. It is, therefore, important in this study to construct the correlation structure. Linear regression models are commonly used for time series analysis [18], particularly for assessing evidence of trends, higher order changes and variability, including allowing for seasonality. We developed deseasonalized and detrended time series rainfall and stream flow models from deterministic regression models including linear, quadratic and cubic terms. These models take account of both lag rainfall and the influence of stream flow. The results of this study will be useful for water managers and policy makers involved in sustainable water resource management and climate change adaptation for the catchments used in this study. The approach is capable of modeling the non-linear relationships between inputs and outputs using ANNs [19]. The first advantage of an ANN is that it only requires a small number of parameters and learns through a number of training iterations that adjust the parameters (weights) of the network [20]. A second advantage is that it is useful in situations where it is complex to build a physical or conceptual model, such as hydrological modeling of rainfall-stream flow processes [21,22,23,24,25]. ANN models have been found useful for identifying the relationships between rainfall and river flow data in a river basin in India [26]. We present a statistical approach that uses the deterministic features of a regression model to build many neural networks with a combination of different lagged input patterns.
A wavelet-based regression model for stream flow, using the discrete wavelet transform (DWT) of the entire time series, has been developed and compared with an ANN [27]. A chaotic stream flow model using an ensemble wavelet network has also been proposed [28]. Wavelet analyses of rainfall and runoff, and wavelet rainfall–runoff cross-analyses, have been used to investigate the temporal variability of the rainfall-runoff relationship [17,29]. These studies found that wavelet transforms provide a physical explanation of the temporal structure of the catchment response.

2. Data Collection and Preparation

The analysis is based on data from three rainfall and stream flow stations in South Australia, as presented in Figure 1. The Broughton River (BR) station is at Mooroola, which is located approximately 40 km north of Port Broughton and 20 km south west of Port Pirie. Torrance River (TR) station is located at Mount Pleasant, and its rivers and tributaries are highly variable in flow and together drain an area of 508 km². Wakefield River (WR) is an ephemeral river near Rhynie, with a catchment area of approximately 1913 km².
Figure 1. Location of Broughton River (BR), Torrance River (TR), and Wakefield River (WR).
The elevation of each station, which may indicate its hydrological features, is presented in Table 1, Column 4. These stations were selected because they had long records of rainfall and stream flow and the highest quality control in terms of the Australian Bureau of Meteorology [30] and the Department for Environment, Water and Natural Resources [31] quality designations for rainfall and stream flow records. Information on these stations and data quality is presented in Table 1.
Table 1. Weather stations information, data quality and observations.
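| Station name | ID | Location (latitude, longitude) | Elevation | Variable | Data period start | Data period end | % of missing |
|---|---|---|---|---|---|---|---|
| Broughton River at Mooroola | A5070503 | −33.53, 138.51 | 196 m | Rainfall | Jun. 1989 | Dec. 2011 | 0.1 |
| | | | | Stream flow | Jun. 1972 | Dec. 2011 | 0.7 |
| Torrance at Mount Pleasant | A5040512 | −34.78, 139.02 | 414.7 m | Rainfall | Jun. 1989 | Dec. 2011 | 0.6 |
| | | | | Stream flow | May 1973 | Dec. 2011 | 0.1 |
| Wakefield River near Rhynie | A5060500 | −34.13, 138.63 | 202 m | Rainfall | Sep. 1985 | Dec. 2011 | 0.9 |
| | | | | Stream flow | Jun. 1971 | Dec. 2011 | 0.2 |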
In this study, less than 1% of the data were missing, and these values were replaced by the mean of the rainfall or stream flow series to give an unbroken time series for analysis. Methods for replacing periods of missing values are discussed in [18,32]. We propose a dyadic signal time period (i.e., $2^n$, where n is an integer and n ≥ 0) for assessing the relationship between daily rainfall and stream flow during the period 1990–2012. We observe the discrete time series {$y_t$}, where t is an integer index running over the length of the record. We extract multi-level information from the observed rainfall and stream flow series in three catchments in South Australia using the Haar wavelet decomposition. We split {$y_t$} into 10 sub-series of dyadic length, i.e., $2^n$, where n is the level of the time series, starting from 0. We also investigate the correlation between rainfall and stream flow patterns for each sub-series from levels 0 to 8.
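The two preparation steps described above (mean replacement of missing values and a Haar decomposition of a dyadic-length series) can be illustrated with a minimal Python sketch. This is not the wavethresh routine used in the study: the function names, the orthonormal scaling by √2, the sign convention of the detail coefficients and the toy rainfall values are all assumptions made for illustration.

```python
import math

def fill_missing_with_mean(series):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]

def haar_dwt(series):
    """Full Haar decomposition of a series whose length is a power of two.

    Returns (approximation, details); details[0] holds the finest-scale
    coefficients and details[-1] the coarsest. Orthonormal scaling is an
    assumption of this sketch.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two (dyadic)")
    approx, details = list(series), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details

# Hypothetical daily rainfall with one missing value (length 2^3 = 8)
rain = [0.0, 2.0, None, 4.0, 1.0, 1.0, 0.0, 0.0]
filled = fill_missing_with_mean(rain)
approx, details = haar_dwt(filled)
print([len(d) for d in details])  # [4, 2, 1]
```

With this scaling the transform preserves the sum of squares of the series, which gives a quick check that no information is lost in the decomposition.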

3. Statistical Analysis

3.1. Assessing the Relationship between Rainfall and Stream Flow

The open source software R [33] was used for the analyses in this paper. We calculate 10 sub-series of rainfall and stream flow from 1990 to 2012 using the “wavethresh” R package [34,35] for assessing the relationship between rainfall and stream flow. The length of time taken into account in each of the 10 sub-series of rainfall and stream flow is 512 days.
The relationship between rainfall and stream flow within the 10 sub-series is presented in Figure 2. The maximum correlation coefficients are 0.08, 0.23 and 0.31 at Broughton River, Torrance River and Wakefield River, respectively. A correlation coefficient always lies between −1 and +1 and indicates the degree of linear dependence between rainfall and stream flow. For assessing short term spatial variability, a correlation coefficient of the sub-series of rainfall and stream flow less than 0.4 indicates a significant difference from 0 at each station. For example, in sub-series 2, the correlation coefficients were 0.04, 0.15 and 0.28 at Broughton River, Torrance River and Wakefield River, respectively, which indicates only weak linear dependence between rainfall and stream flow.
Figure 2. Correlation pattern subseries of rainfall and stream flow time series.
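As an illustration of the sub-series correlations plotted in Figure 2, the following sketch splits two aligned daily series into consecutive blocks and computes the Pearson correlation within each block. The block length of 512 days follows the text; the function names and the synthetic data (which use a block length of 4 for brevity) are assumptions, not the authors' R code.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant block
    return sxy / math.sqrt(sxx * syy)

def subseries_correlations(rain, flow, block=512):
    """Correlation between rainfall and stream flow in consecutive blocks."""
    n_blocks = min(len(rain), len(flow)) // block
    return [
        pearson(rain[i * block:(i + 1) * block], flow[i * block:(i + 1) * block])
        for i in range(n_blocks)
    ]

# Synthetic check: stream flow is an exact linear function of rainfall,
# so every block should have a correlation of 1 (up to rounding).
rain = [float((7 * i) % 5) for i in range(12)]
flow = [2.0 * r + 1.0 for r in rain]
correlations = subseries_correlations(rain, flow, block=4)
```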
In order to understand stream flow availability under the climatic conditions in South Australia, we investigated the characteristics of rainfall and stream flow patterns, as categorized by climatic phenomena. A statistical measure of the dispersion of rainfall and stream flow patterns around the mean is defined as follows:
$$CV = \frac{S_x}{\mu_x}$$
where CV is defined as the coefficient of variation and is represented by the ratio of the standard deviation ($S_x$) to the mean ($\mu_x$). Table 2 shows the degree of variation in rainfall and stream flow patterns.
Table 2. Rainfall and stream flow variability at Broughton River, Torrance River and Wakefield River in South Australia (SA) from 1990 to 2011.
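| Statistics | Broughton River rainfall | Broughton River stream flow | Torrance River rainfall | Torrance River stream flow | Wakefield River rainfall | Wakefield River stream flow |
|---|---|---|---|---|---|---|
| Mean | 1.653 | 9.817 | 1.530 | 5.396 | 1.282 | 25.333 |
| Estimated standard deviation | 0.385 | 4.075 | 0.296 | 4.201 | 0.223 | 21.300 |
| Coefficient of variation (CV) | 23.31% | 41.51% | 19.36% | 77.85% | 17.40% | 84.07% |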
In Table 2, the CV for stream flow patterns indicates higher variability than for the rainfall series.
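As a quick arithmetic check, the CV values can be recomputed from the means and standard deviations reported in Table 2. The sketch below does this in Python; the dictionary layout and function name are illustrative only, and small differences from the published percentages are expected because the tabulated means and standard deviations are rounded.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation as a percentage: 100 * S_x / mu_x."""
    return 100.0 * sd / mean

# (mean, standard deviation, reported CV %) taken from Table 2
table2 = {
    ("Broughton River", "rainfall"): (1.653, 0.385, 23.31),
    ("Broughton River", "stream flow"): (9.817, 4.075, 41.51),
    ("Torrance River", "rainfall"): (1.530, 0.296, 19.36),
    ("Torrance River", "stream flow"): (5.396, 4.201, 77.85),
    ("Wakefield River", "rainfall"): (1.282, 0.223, 17.40),
    ("Wakefield River", "stream flow"): (25.333, 21.300, 84.07),
}

recomputed = {
    key: coefficient_of_variation(mean, sd)
    for key, (mean, sd, _reported) in table2.items()
}

for key, value in recomputed.items():
    print(key, round(value, 2), "reported:", table2[key][2])
```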
Figure 3 shows the variability of the wavelet coefficients from levels 0 to 8. There is evidence of a strong association between the rainfall and stream flow coefficients at the 5% significance level (Table 3).
Figure 3. Standard deviations of wavelet coefficients of rainfall and stream flow from level 0 to 8. (a) Rainfall; (b) Stream flow.

3.2. Correlation Structures between Rainfall and Stream Flow

In the previous sections, we calculated wavelet coefficients for each subset of the rainfall and stream flow series. In order to filter each of those series, we applied Haar wavelets.
The constructed correlation pattern for each rainfall and stream flow sub-series for levels 0 to 8 is given by:
$$r_k = \sum_{k=0}^{8} \sum_{j=1}^{10} \frac{(X_{wd_{jk}} - \bar{X}_{wd,k})(Y_{wd_{jk}} - \bar{Y}_{wd,k})}{S_{x_{wd}} S_{y_{wd}}}$$
and
$$\bar{X}_{wd} = \sum_{j=1}^{10} X_{wd_j}/10, \quad \bar{Y}_{wd} = \sum_{j=1}^{10} Y_{wd_j}/10, \quad (j = 1, 2, 3, \ldots, 10 \text{ and } k = 0, 1, 2, \ldots, 8)$$
where $r_k$ is the constructed correlation at level k from 0 to 8, and $X_{wd_j}$, $Y_{wd_j}$ are the jth sub-series of the rainfall and stream flow wavelet decomposed with the Haar procedure. The results are presented in Table 3.
Evidence of significant correlation (r ≥ 0.50), at the 5% significance level or better, between the rainfall and stream flow wavelet coefficient series is shown in Table 3. Furthermore, to avoid co-linearity problems, the squared rainfall and stream flow wavelet coefficient series are also included. We found a correlation structure (r = 0.56 at Broughton River) in which stream flow is determined by rainfall over at least 4 days, significant at the 5% level, at both the Broughton River Basin and the Torrance River Basin, as shown in Table 3. The squared adjusted stream flow and adjusted rainfall show some evidence of correlation (at the 5% level) up to 64 days at Torrance River, and there is a marginal correlation (r = 0.51) up to 128 days between squared adjusted rainfall and adjusted stream flow at Wakefield River. The rainfall and stream flow relationship was used to develop a response model for predicting stream flow.
Table 3. Constructed correlation pattern for different levels between (a) adjusted rainfall and adjusted stream flow; (b) squared adjusted rainfall and adjusted stream flow; (c) adjusted rainfall and squared adjusted stream flow; (d) squared adjusted rainfall and squared adjusted stream flow.
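| Days | BR a | BR b | BR c | BR d | TR a | TR b | TR c | TR d | WR a | WR b | WR c | WR d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.71 ** | 0.89 *** | 0.70 ** | 0.86 *** | 0.76 ** | −0.184 | −0.513 | 0.447 | 0.76 ** | −0.53 * | −0.59 * | 0.86 *** |
| 2 | 0.65 * | 0.264 | 0.469 | 0.449 | 0.72 ** | 0.57 | 0.158 | 0.265 | 0.384 | 0.413 | 0.201 | 0.038 |
| 4 | 0.56 * | −0.189 | 0.257 | 0.146 | 0.63 * | 0.061 | −0.27 | 0.125 | 0.324 | −0.28 | −0.341 | 0.115 |
| 8 | 0.087 | 0.369 | −0.296 | 0.103 | 0.032 | 0.55 * | 0.173 | −0.51 * | −0.369 | 0.483 | 0.191 | −0.418 |
| 16 | 0.233 | 0.08 | −0.009 | −0.326 | 0.275 | −0.15 | 0.009 | −0.311 | 0.306 | −0.652 | 0.081 | −0.393 |
| 32 | 0.094 | 0.055 | −0.059 | −0.248 | 0.68 * | −0.84 ** | −0.81 *** | 0.97 *** | −0.116 | 0.002 | −0.382 | −0.121 |
| 64 | 0.411 | 0.036 | 0.292 | 0.238 | 0.488 | 0.67 * | 0.456 | 0.71 ** | −0.091 | 0.166 | −0.005 | −0.301 |
| 128 | 0.423 | −0.409 | 0.604 * | 0.68 * | 0.299 | −0.162 | 0.186 | −0.223 | 0.279 | −0.575 | 0.128 | −0.51 * |
| 512 | 0.456 | 0.354 | −0.292 | 0.343 | −0.218 | −0.405 | 0.007 | −0.405 | 0.094 | 0.117 | 0.098 | −0.163 |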
Notes: * Coefficients are statistically significant at 5%; ** Coefficients are statistically significant at 1%; *** Coefficients are statistically significant at 0.1%

3.3. Rainfall-Stream Flow Response Modeling

The constructed correlations described in the previous section may be partly due to common seasonal variations and trends, so a first step is to estimate these deterministic features with regression models for the entire period from 1990 to 2010. The residuals from these regressions form the deseasonalized and detrended (dsdt) time series. For all three stations, a cubic trend gave a statistically improved fit over a linear or quadratic trend over the study period. The seasonal variation was reasonably modelled by a sinusoidal curve. Therefore, the regression models are of the form:
$$T_i = \beta_0 + \beta_1 \times \mathit{time} + \beta_2 \times \mathit{time}^2 + \beta_3 \times \mathit{time}^3 + \beta_4 \times C + \beta_5 \times S + \varepsilon_t$$
where $T_i$ represents either rainfall or stream flow; time is the mean-adjusted time, that is $(t - \bar{t})$, where t is the number of days from the start of the record and $\bar{t}$ is the mean of t; $\mathit{time}^2$ and $\mathit{time}^3$ allow for possible quadratic and cubic trends; C is cos(2πt/365.25) and S is sin(2πt/365.25), which together allow for seasonal variation with a period of one year; $\beta_j$ are the unknown coefficients to be estimated; and $\varepsilon_t$ are random variations with mean 0 and constant standard deviation.
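A minimal sketch of this deseasonalizing and detrending step is given below in plain Python, fitting the cubic trend and annual sinusoid by ordinary least squares and returning the residuals as the dsdt series. It is an illustration rather than the authors' R code: the helper names are hypothetical, the normal equations are solved by simple Gaussian elimination, and the mean-adjusted time is additionally rescaled to roughly [−1, 1] (an assumption made here purely for numerical stability, which changes the scale of the trend coefficients but not the fitted values).

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)

def design_row(t, t_mean, t_scale):
    """Intercept, cubic trend in rescaled mean-adjusted time, annual cycle."""
    u = (t - t_mean) / t_scale
    angle = 2 * math.pi * t / 365.25
    return [1.0, u, u ** 2, u ** 3, math.cos(angle), math.sin(angle)]

def deseasonalize_detrend(y):
    """Return (coefficients, residuals); the residuals are the dsdt series."""
    n = len(y)
    t_mean = (n - 1) / 2
    t_scale = max(t_mean, 1.0)
    rows = [design_row(t, t_mean, t_scale) for t in range(n)]
    beta = ols(rows, y)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in rows]
    return beta, [obs - fit for obs, fit in zip(y, fitted)]

# Synthetic two-year daily series with a known linear trend and annual cycle
n_days = 731
mid = (n_days - 1) / 2
series = [
    2.0 + 0.5 * (t - mid) / mid + 3.0 * math.cos(2 * math.pi * t / 365.25)
    for t in range(n_days)
]
beta, dsdt = deseasonalize_detrend(series)
```

On this noise-free synthetic series the fitted coefficients recover the intercept, trend and cosine amplitude used to generate it, and the dsdt residuals are numerically zero.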
For the estimated coefficients, only a few values are significantly different from 0, even at the 5% significance level, as shown in Table 4. There is evidence of significant trends in rainfall at Wakefield River, which may correspond to increased stream flows if rainfall increases. We predict the stream flow ($Y_t$) on day t from rainfall ($X_t$) at the corresponding lags k. This is referred to as a Response Model (RM). The regression is defined as:
$$Y_t = \beta_0 + \beta_1 \times X_1 + \beta_2 \times X_2 + \cdots + \beta_{128} \times X_{128} + \varepsilon_t$$
We assess stream flow in response to rainfall at lags 0 to 128. The best fitted model is selected based on the maximum adjusted coefficient of determination ($R_{adj}^2$), the minimum sigma squared (σ²) and the minimum Akaike Information Criterion (AIC). The AIC is defined as:
$$\mathrm{AIC} = 2 \times (\text{number of parameters}) - 2\log(L)$$
where L is the maximized value of the likelihood function for the estimated model. A comparison of the AIC for the different models is shown in Table 5. The $R_{adj}^2$ value reduces significantly, and the estimated stream flow influence is close to zero, beyond the exogenous rainfall at lag 7. Therefore, we reduced the exogenous rainfall lags from 128 to 7 in the response model, referred to as RM0 in Table 5. This strategy is sub-optimal inasmuch as rejected terms might meet the retention criterion if added back individually. However, any small improvement in $R_{adj}^2$ would be balanced by increased complexity in the model, which is undesirable if interaction and squared terms are added. The regression model is defined as RM:
$$Y_t = \beta_0 + \beta_1 \times X_1 + \beta_2 \times X_2 + \beta_3 \times X_3 + \beta_4 \times X_4 + \beta_5 \times X_5 + \beta_6 \times X_6 + \beta_7 \times X_7$$
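The selection criteria can be made concrete with a short Python sketch that computes the adjusted R² and a Gaussian AIC from observed and fitted values. The function names and toy numbers are hypothetical. The log-likelihood used here is the standard one for a linear model with normally distributed errors, and counting the error variance as one of the parameters is a convention assumed in this sketch; the paper does not state which convention was used.

```python
import math

def adjusted_r2(observed, fitted, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    n = len(observed)
    mean = sum(observed) / n
    rss = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    tss = sum((o - mean) ** 2 for o in observed)
    r2 = 1.0 - rss / tss
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

def gaussian_aic(observed, fitted, n_parameters):
    """AIC = 2k - 2 log(L) with a Gaussian likelihood at its maximum."""
    n = len(observed)
    rss = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    log_likelihood = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1.0)
    return 2 * n_parameters - 2 * log_likelihood

# Toy comparison of two candidate fits with one predictor each;
# k = 3 counts the intercept, the slope and the error variance.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fit_a = [1.1, 1.9, 3.2, 3.8, 5.0]   # closer fit
fit_b = [1.5, 1.5, 3.5, 3.5, 5.5]   # poorer fit
r2_a = adjusted_r2(observed, fit_a, n_predictors=1)
aic_a = gaussian_aic(observed, fit_a, n_parameters=3)
aic_b = gaussian_aic(observed, fit_b, n_parameters=3)
```

For a fixed number of parameters the fit with the smaller residual sum of squares has the smaller AIC, while adding parameters raises the AIC by 2 each unless the likelihood improves enough to compensate.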
In the second model, we add deterministic features to the regression model including linear, quadratic and cubic terms of t, allowing for seasonality effects. This model is defined as RM_D:
$$Y_t = \beta_0 + L + \sum_{l=1}^{7} \beta_{5+l} \times X_l$$
where $L = \beta_1 \times \mathit{time} + \beta_2 \times \mathit{time}^2 + \beta_3 \times \mathit{time}^3 + \beta_4 \times C + \beta_5 \times S$.
The third model is defined as RMD_AR[1] and is an autoregressive model of order 1 (AR[1]) with RM_D. It can be written in the form:
$$Y_t = \beta_0 + L + \sum_{l=1}^{7} \beta_{5+l} \times X_l + \beta_{13} \times Y_{t-1}$$
The fourth model is defined as RMD_AR[2], and is an autoregressive model of order 2 (AR[2]) with RMD_AR[1]. It can be written in the form:
$$Y_t = \beta_0 + L + \sum_{l=1}^{7} \beta_{5+l} \times X_l + \beta_{13} \times Y_{t-1} + \beta_{14} \times Y_{t-2}$$
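To show how the predictors in RM_D, RMD_AR[1] and RMD_AR[2] line up with the coefficient indices in the equations above, the sketch below builds one row of the design matrix for a given day. It assumes that $X_l$ denotes rainfall l days before day t and that day indices start at 0; the function names and the toy series are hypothetical.

```python
import math

def deterministic_terms(t, t_mean):
    """The five terms multiplied by beta_1..beta_5 in L."""
    time = t - t_mean
    angle = 2 * math.pi * t / 365.25
    return [time, time ** 2, time ** 3, math.cos(angle), math.sin(angle)]

def design_row(t, rain, flow, t_mean, n_lags=7, ar_order=0):
    """Predictors for Y_t: intercept, L terms, rain lags 1..n_lags, flow lags."""
    if t < max(n_lags, ar_order):
        raise ValueError("not enough history before day t")
    row = [1.0] + deterministic_terms(t, t_mean)           # beta_0..beta_5
    row += [rain[t - lag] for lag in range(1, n_lags + 1)]  # beta_6..beta_12
    row += [flow[t - k] for k in range(1, ar_order + 1)]    # beta_13, beta_14
    return row

# Toy series (values chosen only so that positions are easy to check)
rain = [float(i) for i in range(20)]
flow = [10.0 * i for i in range(20)]
t_mean = (len(rain) - 1) / 2

rm_d = design_row(10, rain, flow, t_mean)                 # RM_D
rmd_ar1 = design_row(10, rain, flow, t_mean, ar_order=1)  # RMD_AR[1]
rmd_ar2 = design_row(10, rain, flow, t_mean, ar_order=2)  # RMD_AR[2]
print(len(rm_d), len(rmd_ar1), len(rmd_ar2))  # 13 14 15
```

The row lengths match the number of coefficients written in the three equations, with $\beta_{13}$ and $\beta_{14}$ attached to the one-day and two-day lagged stream flows.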
Table 4. Estimated coefficients of rainfall and stream flow variability from 1990 to 2012.
| Station | Statistical Summary | Intercept (β0) | Linear Term t | Quadratic Term t | Cubic Term t |
|---|---|---|---|---|---|
| Broughton River | Estimated rainfall | 1.58 | −0.000042 | −0.000000001 | −0.000000000003 |
| | Variability of rainfall | 0.106 | 0.00008 | 0.000000017 | 0.000000000008 |
| | Estimated stream flow | 52.08 | −0.01244 * | 0.0000031 * | −0.000000000258 |
| | Variability of stream flow | 6.18 | 0.004661 | 0.0000009 | 0.000000000485 |
| Torrance River | Estimated rainfall | 1.424 | −0.00007 | 0.000000019 | 0.000000000004 |
| | Variability of rainfall | 0.077 | 0.00006 | 0.000000012 | 0.000000000006 |
| | Estimated stream flow | 3.47 | −0.00174 * | 0.0000003 * | 0.000000000149 * |
| | Variability of stream flow | 0.607 | 0.0004573 | 0.000000092 | 0.000000000048 |
| Wakefield River | Estimated rainfall | 1.226 | −0.000123 * | 0.0000000037 | 0.00000000001 * |
| | Variability of rainfall | 0.067 | 0.000051 | 0.00000001 | 0.000000000005 |
| | Estimated stream flow | 15.95 | −0.01144 * | 0.0000007 | 0.0000000008 * |
| | Variability of stream flow | 3.576 | 0.002694 | 0.0000005 | 0.000000000280 |
Note: * statistical significance at 5%.
Table 5. Fitted regression model for Broughton River, Torrance River and Wakefield River.
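| Model | BR $R_{adj}^2$ | BR Std. Error | BR AIC | BR RMSE * | TR $R_{adj}^2$ | TR Std. Error | TR AIC | TR RMSE * | WR $R_{adj}^2$ | WR Std. Error | WR AIC | WR RMSE * |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RM0 | 0.16 | 333.6 | 1104.9 | 3.5107 | 0.24 | 31.19 | 742.77 | 5.074 | 0.13 | 195.8 | 1023.53 | 1.1777 |
| RM_D | 0.18 | 331.3 | 1103.9 | 3.1507 | 0.26 | 31.02 | 741.9 | 5.012 | 0.14 | 195.3 | 1023.1 | 1.1777 |
| RMD_AR[1] | 0.35 | 292.9 | 1085.1 | 0.0353 | 0.42 | 27.35 | 722.7 | 0.052 | 0.21 | 187.4 | 1016.9 | 0.11777 |
| RMD_AR[2] | 0.36 | 291.7 | 1084.5 | 0.0313 | 0.43 | 27.35 | 722.5 | 0.0452 | 0.22 | 187.4 | 1016.8 | 0.10777 |
| RMD_tau | 0.39 | 285.8 | 1081.4 | 0.0035 | 0.42 | 27.32 | 722.1 | 0.0411 | 0.23 | 187.3 | 1016.1 | 0.10178 |

Note: Asterisk (*) units are in m³ s⁻¹. BR, TR and WR denote Broughton River, Torrance River and Wakefield River.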
Finally, we develop a model for a benchmark comparison of stream flow on day t, based on the entire previous period of stream flow and its influence (τ), added to model RM_D. This model is defined as RMD_tau. Tau (τ) is 0 if there is no stream flow influence from the previous day’s rainfall. An example of counting the stream flow influence is given in Table 6.
Table 6. An example of count tau and stream flow influence rainfall over time.
Stream flowy1y2y3y4y5y6y7y8y9y10y11y12y13
89000292235886
Rainfallx1x2x3x4x5x6x7x8x9x10x11x12x13
3253.232.82.62.42.221.81.61.4
In Table 6, on day t = 6, $Y_6 = 2$; counting the preceding days with zero stream flow gives τ = 3, so that $Y_{6-3-1} = Y_2 = 9$ is the value applied in the model RMD_tau.
The model RMD_tau can be written in the form:
$$Y_t = \beta_0 + L + \sum_{l=1}^{7} \beta_{5+l} \times X_l + \beta_{13} \times \tau + \beta_{14} \times Y_{t-\tau-1}$$
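The counting rule for τ can be sketched in a few lines of Python. The sketch reads the Table 6 example as counting the consecutive zero-flow days immediately before day t, which is an interpretation of the worked example rather than a definition given by the authors; only the first six stream flow values of that example are used, and the function name is hypothetical.

```python
def count_tau(flow, t):
    """Number of consecutive zero-flow days immediately before day t.

    Days are numbered from 1, so Y_k is stored at flow[k - 1].
    """
    tau = 0
    day = t - 1
    while day >= 1 and flow[day - 1] == 0:
        tau += 1
        day -= 1
    return tau

flow = [8, 9, 0, 0, 0, 2]             # y1..y6 from the worked example
t = 6
tau = count_tau(flow, t)
reference_flow = flow[(t - tau - 1) - 1]  # Y_{t - tau - 1}
print(tau, reference_flow)  # 3 9
```

When the previous day has non-zero flow the function returns τ = 0, in which case $Y_{t-\tau-1}$ reduces to the one-day lag $Y_{t-1}$ used in RMD_AR[1].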
The fitted model for predicted stream flow in response to exogenous rainfall, deterministic features of the regression model, and previous stream flow influence, is presented in Table 5. The best fitting model selection was based on the minimum AIC and the minimum root mean square error (RMSE). The RMSE is defined as:
$$RMSE = \sqrt{E\left[(\hat{Y}_t - Y_t)^2\right]}$$
where $\hat{Y}_t$ is the estimated stream flow and $Y_t$ is the observed stream flow.
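For illustration, the RMSE can be computed from paired observed and estimated values as in the sketch below, which replaces the expectation by the sample mean. The mean absolute error (MAE), which is the criterion used for the ANN benchmark in this paper, is included for comparison. The numbers are made up and the function names are hypothetical.

```python
import math

def rmse(observed, estimated):
    """Root mean square error: sqrt(mean((estimated - observed)^2))."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)

def mae(observed, estimated):
    """Mean absolute error: mean(|estimated - observed|)."""
    n = len(observed)
    return sum(abs(e - o) for o, e in zip(observed, estimated)) / n

observed = [3.0, 5.0, 2.0, 7.0]    # hypothetical stream flows
estimated = [2.5, 5.0, 4.0, 8.0]   # hypothetical model output
print(round(rmse(observed, estimated), 4), mae(observed, estimated))
```

Because the errors are squared before averaging, the RMSE is more sensitive than the MAE to a few large misses, such as poorly predicted flow peaks.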
The response model RM0 initially has 128 predictor variables, namely the rainfall lags at 0 to 128. Therefore, there are 129 parameters to estimate including the intercept. The estimated rainfall effects are confined to lags of 0 up to 7 days; therefore, we reduced the rainfall lags from 128 to 7 days, and the optimized $R_{adj}^2$ values for this model are 0.16, 0.24 and 0.13 for Broughton River, Torrance River and Wakefield River, respectively, as presented in Table 5. We also offset the cross product term of lags to further reduce the complexity of this model. The second model included linear, quadratic and cubic terms, and this model is denoted as RM_D. The number of parameters to be estimated is therefore 8 + 3 = 11 and the $R_{adj}^2$ increased to 0.18, 0.26 and 0.14 for Broughton River, Torrance River and Wakefield River, respectively, which is a practical and statistically significant improvement. We then added a first order autoregressive term, referred to as the RMD_AR[1] model, and a second order autoregressive term, referred to as the RMD_AR[2] model. We also made a benchmark comparison by using the entire stream flow record, and this model is denoted RMD_tau, as presented in Table 5.
In Table 5, there is evidence of improvement in the $R_{adj}^2$ values and the RMSE (in m³ s⁻¹) from RM to RM_D. Adding an autoregressive term of order 1 (AR[1]) to RM_D results in substantially improved $R_{adj}^2$ values (from 0.18, 0.26 and 0.14 to 0.35, 0.42 and 0.21 for Broughton River, Torrance River and Wakefield River, respectively). Furthermore, when adding an autoregressive term of order 2 (AR[2]) to RM_D, there is evidence of improvement, but this may be offset by the increasing number of parameters that affect the complexity of the model. In addition, the RMD_tau model represents a small improvement for two of the three river basins. The best fitted models, RMD_tau for Broughton River, RMD_AR[2] for Torrance River and RMD_tau for Wakefield River, were selected based on the minimum Akaike Information Criterion (AIC) and minimum root mean square error (RMSE) in m³ s⁻¹. The residuals from the best fitted models were transformed to normalized form by factor multiplication. A factor was calculated, which allows for the fact that the mean of a non-linear function of a random variable is not equal to that function of the mean. The transformed series follow an identically normalized form with a mean (μ) of zero, a variance (σ²) of 1 and a random disturbance term ($\varepsilon_t$) which is uncorrelated. The transformed series were used to predict the stream flow on day t based on the predicted stream flow influence over the short term, as shown in Figure 4.
Figure 4. Predicted stream flow based on dsdt rainfall for (a) Broughton River; (b) Torrance River; and (c) Wakefield River from 1990 to 2010.
In Figure 4, we demonstrate the versatility of stream flow prediction. It can be seen that this is a non-linear relationship when expressed in terms of the physical interpretation of stream flow based on rainfall.

3.4. Modeling Stream Flow Using an Artificial Neural Network

Artificial neural network (ANN) techniques are motivated by the principles of biological nervous systems [36]. Although there are different types of ANN, the multilayer feed-forward network is the most commonly used technique. For example, a common approach is training by back-propagation in a multi-layer feed-forward network [23]. The network consists of input, hidden and output layers. Each layer is fully connected with the next layer, with a weight on each connection, as shown in Figure 5.
Figure 5. A schematic ANN including input, hidden and output layers.
In Figure 5, the number of nodes in the input layer is p, the number of nodes in the hidden layer is q and the number of nodes in the output layer is r. The initially assigned random weights are updated during the training process by comparing the predicted output with the known output. The errors are then back-propagated to adjust the weights. The dsdt daily rainfall and stream flow data from the regression model developed in the previous section are used to develop a prediction model for each of the three river basins for the years 1990 to 2010. Certain steps have been proposed for this purpose, such as input selection, model architecture selection, model calibration (training) and validation (testing) [37]. In addition, we emphasize that the ANN set-up has to be carefully carried out and described to obtain reliable results. This study describes the steps in building the prediction models for stream flow. We consider the prediction function as $S_{t+1} = f(S_t, S_{t-1}, S_{t-2}, \ldots, S_{t-m}, R_t, R_{t-1}, R_{t-2}, \ldots, R_{t-n})$, where S represents stream flow, R represents rainfall, t is the current day, m = {3,...,8}, n = {3,...,8} and f represents the ANN as a regression function.
We investigate the lagged inputs of rainfall and river flow needed for modeling the river flows at three locations in South Australia. ANN models are developed with all combinations of rainfall and river flow input ranges. In addition, a standard range of nodes in the hidden layer is considered. Among all models based on inputs and hidden nodes, the best model is selected based on the mean absolute error criterion. This entire process is applied to all three locations. ANN models capture the non-linear relationships of rainfall and river flow patterns in modeling river flows from large time series data.
For example, if we consider 3 days lag of stream flow and 5 days lag of rainfall, then the total number of input nodes in the ANN structure will be 8, and we consider the number of nodes in the hidden layer ranging from 1 to 10. To achieve the best ANN model for each location, we not only apply all inputs in combination but also vary a range of parameters, such as the number of nodes in the hidden layer, for each combination of inputs.
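The construction of lagged inputs for one-day-ahead prediction can be sketched as follows. Following the 8-input example above, the sketch assumes that an m-day lag of stream flow contributes m input values ($S_t$ back to $S_{t-m+1}$) and an n-day lag of rainfall contributes n values; this reading, the function name and the toy series are assumptions for illustration.

```python
def lagged_samples(flow, rain, flow_lags, rain_lags):
    """Build (inputs, target) pairs for predicting stream flow on day t + 1.

    inputs = [S_t, ..., S_{t-flow_lags+1}, R_t, ..., R_{t-rain_lags+1}]
    target = S_{t+1}
    """
    start = max(flow_lags, rain_lags) - 1
    samples = []
    for t in range(start, len(flow) - 1):
        inputs = [flow[t - i] for i in range(flow_lags)]
        inputs += [rain[t - j] for j in range(rain_lags)]
        samples.append((inputs, flow[t + 1]))
    return samples

# Toy series chosen so that positions are easy to verify
flow = [float(v) for v in range(10, 20)]
rain = [float(v) for v in range(0, 10)]
samples = lagged_samples(flow, rain, flow_lags=3, rain_lags=5)
first_inputs, first_target = samples[0]
print(len(first_inputs), first_target)  # 8 15.0
```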
In predicting stream flow one day ahead as output, we consider stream flow and rainfall with combinations of consecutive lags where the minimum lag is 3 days and the maximum lag is 8 days. Thus, for each location, the total number of models to be trained becomes 36. As the data set is large, one year of data is considered initially for testing. For training ANN models at each location, we consider stream flow and rainfall data for the period 1990 to 2009. The remaining data for the year 2010 is used for testing the best model found in the training phase.
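The size of this model search and the train/test split can be written out explicitly. The sketch below enumerates the 6 × 6 lag combinations, crosses them with 1 to 10 hidden nodes, and separates records by year; the record layout (year, value), the helper names and the placeholder MAE scores are hypothetical.

```python
from itertools import product

LAGS = range(3, 9)           # 3 to 8 days, inclusive
HIDDEN_NODES = range(1, 11)  # 1 to 10 nodes in the hidden layer

# (rain_lag, flow_lag) pairs, then every hidden-layer size for each pair
lag_combinations = list(product(LAGS, LAGS))
configurations = [(r, s, h) for r, s in lag_combinations for h in HIDDEN_NODES]

def split_by_year(records, test_year=2010):
    """Train on years before test_year and test on test_year itself."""
    train = [rec for rec in records if rec[0] < test_year]
    test = [rec for rec in records if rec[0] == test_year]
    return train, test

def best_configuration(mae_by_config):
    """Configuration with the smallest mean absolute error."""
    return min(mae_by_config, key=mae_by_config.get)

records = [(1990, 1.2), (2009, 0.4), (2010, 0.9), (2010, 0.0)]
train, test = split_by_year(records)
print(len(lag_combinations), len(configurations))  # 36 360
```

Given a dictionary of training MAE values keyed by configuration, `best_configuration` returns the entry that would be carried forward to the 2010 test year.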
The ANN stream flow prediction model was built with the Multilayer Perceptron (MLP) function, using the RWeka package in the R language [38]. One of the important parameters to specify is the number of nodes in the hidden layer, which may vary for time series modeling in different locations. Using trial and error, the number of nodes in the hidden layer is varied from 1 to 10. This range is widely used in hydrological time series modeling [21]. We set the learning rate (the amount by which the weights are updated) to 0.3, the momentum to 0.2 and the number of training epochs to 500.
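To give a sense of how the learning rate and momentum enter the weight updates, the sketch below applies the generic gradient-descent-with-momentum rule to a single weight using the values quoted above (0.3 and 0.2). This is the textbook form of the update and is offered only as an illustration; it is not taken from the RWeka implementation, whose internal details may differ, and the gradient values are made up.

```python
def momentum_step(weight, gradient, previous_delta, learning_rate=0.3, momentum=0.2):
    """One weight update: delta = -learning_rate * gradient + momentum * previous delta."""
    delta = -learning_rate * gradient + momentum * previous_delta
    return weight + delta, delta

# Two successive updates of one weight with a constant, hypothetical gradient
w0, d0 = 1.0, 0.0
w1, d1 = momentum_step(w0, gradient=0.5, previous_delta=d0)
w2, d2 = momentum_step(w1, gradient=0.5, previous_delta=d1)
print(round(w1, 4), round(w2, 4))  # 0.85 0.67
```

The second step is larger than the first because a fraction of the previous change is carried forward, which is the purpose of the momentum term.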
Back-propagation with a sigmoidal activation function was applied to the normalized data in the MLP function. Furthermore, the mean absolute error (MAE) in m³ s⁻¹ was minimized through an iterative process that varied the number of nodes in the hidden layer.
The best lag combination at each location is presented in Figure 6.
Figure 6. MAE for training data (1990–2009) using ANN with best lag combinations at each location, units in m³ s⁻¹.
We find that both the input lags and the number of nodes in the hidden layer are different for each location. The best model for each location, based on the adjusted coefficient of determination ($R_{adj}^2$) and the lowest root mean square error (RMSE) and mean absolute error (MAE), is presented in Table 7. For Broughton River, 3 days rainfall and 6 days stream flow as lagged inputs with 9 nodes in the hidden layer produces the lowest MAE. At Torrance River, 3 days rainfall and 8 days stream flow as lagged inputs with 2 nodes in the hidden layer produces the lowest MAE. For Wakefield River, 4 days rainfall and 5 days stream flow as lagged inputs with only one node in the hidden layer produces the lowest MAE. This indicates the variability in the ANN models for different locations.
Once the best model is identified based on the training data for each location, we use this model for prediction on the testing data. Figure 7 shows the predicted and observed stream flows using testing data for Broughton River, Torrance River and Wakefield River, respectively.
Table 7. Best prediction model at each location on the training data, based on $R_{adj}^{2}$ and the lowest RMSE and MAE (RMSE and MAE in m³ s⁻¹).

| Location | Input Lags | Nodes in Hidden Layer in ANN (H) | $R_{adj}^{2}$ | RMSE * | MAE * |
|---|---|---|---|---|---|
| Broughton River | 3 days rain, 6 days stream flow | 9 | 0.68 | 270.33 | 45.53 |
| Torrance River | 3 days rain, 8 days stream flow | 2 | 0.71 | 24.54 | 4.89 |
| Wakefield River | 4 days rain, 5 days stream flow | 1 | 0.45 | 179.42 | 19.28 |

Note: Asterisk (*) units are in m³ s⁻¹.
Figure 7. Observed and predicted stream flow for (a) Broughton River; (b) Torrance River; and (c) Wakefield River for the year 2010.
The MAE for the training and testing data is shown in Figure 8 for all three locations. We observed that the MAE values for the training and testing data at Broughton and Torrance Rivers do not vary significantly.
For Broughton, the best ANN model structure in training includes 3 days of lagged rainfall and 6 days of lagged stream flow as inputs, with 9 nodes in the hidden layer. This model has the lowest MAE, at 45.53 m³ s⁻¹. Applying this model to the testing data gives an MAE of 32.43 m³ s⁻¹. For Torrance, the best ANN model in training has 3 days of lagged rainfall and 8 days of lagged stream flow as inputs, with 2 nodes in the hidden layer, achieving an MAE of 4.89 m³ s⁻¹. For the testing data, this model gives an MAE of 9.27 m³ s⁻¹. In the case of Wakefield, the best ANN model has 4 days of lagged rainfall and 5 days of lagged stream flow as inputs, with 1 node in the hidden layer, achieving an MAE of 19.28 m³ s⁻¹. For the testing data, this model achieves an MAE of 42.88 m³ s⁻¹. The difference in MAE between the training and testing phases could be due to this river’s ephemeral nature and its substantial dependence on rainfall.
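The two error measures and the training versus testing comparison can be written out as a short Python sketch. The `mae` and `rmse` helpers use the standard definitions, which we assume match those used in the study; the `reported_mae` values are copied from the text above, and the ratio of testing to training MAE is an added illustration rather than a statistic reported by the authors.

```python
import math


def mae(observed, predicted):
    """Mean absolute error between two equal-length series."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error between two equal-length series."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


# MAE values reported in the text (m3 s-1), as (training, testing).
reported_mae = {
    "Broughton": (45.53, 32.43),
    "Torrance": (4.89, 9.27),
    "Wakefield": (19.28, 42.88),
}

# Ratio of testing MAE to training MAE for each location.
ratios = {site: round(test / train, 2) for site, (train, test) in reported_mae.items()}
```

A ratio above 1 means that the testing error exceeds the training error. With the reported values the ratios are 0.71 for Broughton, 1.90 for Torrance and 2.22 for Wakefield.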
Figure 8. Comparison of MAE for training and testing data, units are in m³ s⁻¹.

4. Conclusions

Initially, we split the whole series with a dyadic signal process using Haar wavelets to assess the short term relationship, including correlation, between rainfall and stream flow. This offers the hydrological community an approach for assessing stream flow that can be applied to any catchment. In particular, the end user can assess the variability of changes and construct higher order correlations from 2 days up to as long as required. In addition, this study should be helpful for predicting stream flows using deterministic regression techniques, particularly where there is evidence of changes in statistical distribution characteristics, which is important for Water Sensitive Urban Design, as demonstrated in [39]. Using a deterministic regression based response model, we found an increasing trend in stream flow when rainfall increased significantly. Predicted stream flow was influenced more by the previous few days’ stream flows than by the entire previous period of stream flow. We also developed artificial neural network models for the three locations. The results show that the influence of lagged rainfall and stream flow lies within a short temporal window. They also show that the ANN models capture the rainfall and stream flow relationships better at Broughton and Torrance Rivers than at Wakefield River.

Acknowledgments

This study was funded by the Goyder Institute for Water Research under their Climate Change program. The researchers are grateful to the developers of the R project for software code and to the Australian Bureau of Meteorology for providing meteorological data.

Author Contributions

Mohammad Kamruzzaman and Md Sumon Shahriar designed and drafted this manuscript; Simon Beecham completed the revision.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Moradkhani, H.; Sorooshian, S. General review of rainfall-runoff modeling: Model calibration, data assimilation, and uncertainty analysis. Water Sci. Technol. Libr. 2008, 63, 1–23.
  2. Birkinshaw, S.J. Introduction. Available online: http://research.ncl.ac.uk/shetran/Introduction.htm (accessed on 26 July 2013).
  3. Downer, C.W.; Ogden, F.L. GSSHA: A model for simulating diverse stream flow generating processes. J. Hydrol. Eng. 2004, 9, 161–174.
  4. Perrin, C.; Michel, C.; Andreassian, V. Improvement of a parsimonious model for stream flow simulation. J. Hydrol. 2003, 279, 275–289.
  5. Burnash, R.J.C.; Ferreal, R.L.; McGuire, R.A. A Generalized Stream Flow Simulation System: Conceptual Modeling for Digital Computers; U.S. Department of Commerce, National Weather Service and Department of Water Resources: Sacramento, CA, USA, 1973.
  6. Chiew, F.H.S.; Peel, M.C.; Western, A.W. Application and testing of the simple rainfall-runoff model SIMHYD. In Mathematical Models of Watershed Hydrology; Singh, V.P., Frevert, D., Meyer, S., Eds.; Water Resources Publications: Littleton, CO, USA, 2002.
  7. Beven, K.J. Rainfall-Runoff Modelling: The Primer; Wiley: Chichester, UK, 2011; pp. 1–360.
  8. Castellano-Méndez, M.; Gonzàlez-Manteigo, W.; Febrero-Bande, M.; Prada-Sànchz, J.M.; Lozano-Calderon, R. Modelling of the monthly and daily behaviour of the runoff of the Xallas rivers using Box-Jenkins and neural networks methods. J. Hydrol. 2004, 296, 38–58.
  9. Hipel, K.W. (Ed.) Stochastic and Statistical Methods in Hydrology and Environmental Engineering: Time Series Analysis in Hydrology and Environmental Engineering, Water Science and Technology; Springer: New York, NY, USA, 2010.
  10. Kamruzzaman, M.; Metcalfe, A.; Beecham, S. Wavelet based rainfall-stream flow models for the South-East Murray Darling Basin. ASCE J. Hydrol. Eng. 2014, 19, 1283–1293.
  11. Kamruzzaman, M.; Beecham, S.; Metcalfe, A. Climatic influence on rainfall and runoff variability in the South-East region of the Murray Darling Basin. Int. J. Climatol. 2013, 33, 291–311.
  12. Kamruzzaman, M.; Beecham, S.; Metcalfe, A. Evidence for Changes in Daily Rainfall Extremes in South Australia. In Proceedings of 9th International Workshop on Precipitation in Urban Areas, IWA/IAHR, St. Moritz, Switzerland, 6–9 December 2009.
  13. Solomatine, D.; See, L.; Abrahart, R. Data-driven modelling: Concepts, approaches and experiences. In Practical Hydroinformatics; Abrahart, R., See, L., Solomatine, D., Eds.; Springer Berlin: Heidelberg, Germany, 2008; Volume 68, pp. 17–30.
  14. McIntyre, N.; Al-Qurashi, A.; Wheater, H. Regression analysis of rainfall-runoff events from an arid catchment in Oman. Hydrol. Sci. J. 2007, 52, 1103–1118.
  15. Kim, Y.; Jeong, D.; Ko, I. Combining rainfall-runoff model outputs for improving ensemble stream flow prediction. J. Hydrol. Eng. 2006, 11, 578–588.
  16. Liu, H.L.; Bao, A.; Chen, X.; Wang, L.; Pan, X.L. Response analysis of rainfall-runoff processes using wavelet transform: A case study of the alpine meadow belt. Hydrol. Processes 2011, 25, 2179–2187.
  17. Kamruzzaman, M.; Beecham, S.; Metcalfe, A. Wavelet Based Assessment of Relationship between Hydrological Time Series in South East Australia. In Proceedings of 2nd Conference on Practical Responses to Climate Change, Engineers Australia, Canberra, Australia, 1–3 May 2012.
  18. Kamruzzaman, M.; Beecham, S.; Metcalfe, A. Non-stationarity in rainfall and temperature in the Murray Darling Basin. Hydrol. Processes 2011, 25, 1659–1675.
  19. Hsu, K.-L.; Gupta, H.V.; Sorooshian, S. Artificial neural network modeling of the rainfall-runoff process. Water Resour. Res. 1995, 31, 2517–2530.
  20. Sajikumar, N.; Thandaveswara, B.S. A non-linear rainfall-runoff model using an artificial neural network. J. Hydrol. 1999, 216, 32–55.
  21. Kisi, O.; Shiri, J.; Tombul, M. Modelling rainfall-runoff process using soft computing techniques. Comput. Geosci. 2013, 51, 108–117.
  22. Kisi, O. River flow forecasting and estimation using different artificial neural network techniques. Hydrol. Res. 2008, 39, 27–40.
  23. Kisi, O. Stream flow forecasting using different artificial neural network algorithms. J. Hydrol. Eng. 2007, 12, 532–539.
  24. Rezaeian Zadeh, M.; Amin, S.; Khalili, D.; Singh, V.P. Daily outflow prediction by Multi Layer perceptron with logistic sigmoid and tangent sigmoid activation functions. Water Resour. Manag. 2010, 24, 2673–2688.
  25. Rezaeianzadeh, M.; Stein, A.; Tabari, H.; Abghari, H.; Jalalkamali, N.; Hosseinipour, E.Z.; Singh, V.P. Assessment of a conceptual hydrological model and Artificial Neural Networks for daily outflows forecasting. Int. J. Environ. Sci. Technol. 2013, 10, 1181–1192.
  26. Sudheer, K.P.; Gosain, A.K.; Ramasastri, K.S. A data-driven algorithm for constructing artificial neural network rainfall-runoff models. Hydrol. Processes 2002, 16, 1325–1330.
  27. Kisi, O. Wavelet regression model as an alternative to neural networks for river stage forecasting. Water Resour. Manag. 2011, 25, 579–600.
  28. Dhanya, C.T.; Kumar, D.N. Predictive uncertainty of chaotic daily stream flow using ensemble wavelet networks approach. Water Resour. Res. 2011, 47.
  29. Labat, D.; Ababou, R.; Mangin, A. Rainfall-runoff relations for karstic springs. Part II: Continuous wavelet and discrete orthogonal multi resolution analyses. J. Hydrol. 2000, 238, 149–178.
  30. Australian Bureau of Meteorology (BoM). Climate Data Online. Available online: http://www.bom.gov.au/climate/data/index.shtml (accessed on 12 January 2013).
  31. Department for Environment, Water and Natural Resources (DEWNR) 2013. Available online: https://www.waterconnect.sa.gov.au/Systems (accessed on 12 March 2013).
  32. Weerasinghe, S. A missing values imputation method for time series data: An efficient method to investigate the health effects of sulphur dioxide levels. Environmetrics 2010, 21, 162–172.
  33. R Development Core Team. Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2011.
  34. Nason, G.P. Wavelet Methods in Statistics with R; Springer: New York, NY, USA, 2008.
  35. R routine in open access. Wavethresh package, 2013. Available online: http://cran.ms.unimelb.edu.au/web/packages/wavethresh/index.html (accessed on 10 November 2013).
  36. Haykin, S. Neural Networks—A Comprehensive Foundation, 2nd ed.; Prentice-Hall: Upper Saddle River, NJ, USA, 1998; pp. 26–32.
  37. Maier, R.H.; Jain, A.; Graeme, C.D.; Sudheer, K.P. Methods used for the development of neural networks for the prediction of water resource variables in river systems: Current status and future directions. Environ. Model. Softw. 2010, 25, 891–909.
  38. R routine in open access. RWeka package, 2014. Available online: http://cran.r-project.org/web/packages/RWeka/index.html (accessed on 28 April 2014).
  39. Beecham, S.; Chowdhury, R. Effects of changing rainfall patterns on WSUD in Australia. Proc. ICE Water Manag. 2012, 165, 285–298.

Share and Cite

MDPI and ACS Style

Kamruzzaman, M.; Shahriar, M.S.; Beecham, S. Assessment of Short Term Rainfall and Stream Flows in South Australia. Water 2014, 6, 3528-3544. https://doi.org/10.3390/w6113528

