Rainfall-Runoff Modelling Using Hydrological Connectivity Index and Artificial Neural Network Approach

The input selection process for data-driven rainfall-runoff models is critical because input vectors determine the structure of the model and, hence, can influence model results. Here, hydro-geomorphic and biophysical time series inputs, including the Normalized Difference Vegetation Index (NDVI) and the Index of Connectivity (IC; a type of hydrological connectivity index), were assessed in addition to climatic and hydrologic inputs. Selected inputs were used to develop Artificial Neural Networks (ANNs) in the Haughton River catchment and the Calliope River catchment, Queensland, Australia. Results show that incorporating IC as a hydro-geomorphic parameter and remotely sensed NDVI as a biophysical parameter, together with rainfall and runoff as hydro-climatic parameters, can improve ANN model performance compared to ANN models using only hydro-climatic parameters. Comparisons amongst different input patterns showed that IC inputs improve model performance more than NDVI inputs. Overall, ANN model simulations showed that using IC along with hydro-climatic inputs noticeably improved model performance in both catchments, especially in the Calliope catchment. This improvement is indicated by a slight increase (9.77% and 11.25%) in the Nash–Sutcliffe efficiency and a noticeable decrease (24.43% and 37.89%) in the root mean squared error of monthly runoff from Haughton River and Calliope River, respectively. Here, we demonstrate the significant effect of hydro-geomorphic and biophysical time series inputs for estimating monthly runoff using ANN data-driven models, which are valuable for water resources planning and management.


Introduction
Runoff models are useful in water resources planning and management, drought management, design of hydraulic structures (e.g., dams, bridges, culverts), and flood forecasting. During the past several decades, hydrologists and engineers have typically applied rainfall-runoff (R-R) models using different combinations of hydrologic and climatic data to estimate runoff [1]. Runoff estimation procedures range from completely black-box models to very detailed conceptual and physically-based models [2][3][4]. Generally, two types of models have been used: (i) deterministic/conceptual models; and (ii) theoretical systems-based or data-driven models [1]. Since deterministic or conceptual R-R models often require a large number of different parameters, this approach can be problematic and the focus may shift to theoretical systems-based or data-driven models. Thus, data-driven models are more popular than conceptual models for simulating the non-linear R-R process [5]. Data-driven models attempt to find a relationship between input and output parameters without considering the physical process [1]. Artificial Neural Networks (ANNs) are a type of data-driven model with a flexible mathematical structure that includes both linear and nonlinear concepts operating within a dynamic input-output system [6].
In the past several decades, there has been substantial growth in the application of ANNs for R-R modelling [7][8][9][10][11][12][13][14][15], in which ANNs have been compared to other methods, including traditional statistical methods, conceptual models and other artificial intelligence models. Results of these studies have shown that ANNs are more accurate than conventional and traditional statistical methods. For example, the ANN approach predicted runoff in the Mississippi River basin more accurately than the conventional autoregressive moving average with exogenous inputs (ARMAX) approach [16]. Birikundavyi et al. [17] demonstrated that streamflow forecasting using ANN models performs better than a conceptual model (PREVIS) and the conventional autoregressive model coupled with a Kalman filter (ARMAX-KF). Additionally, comparison of ANNs with other artificial intelligence models (e.g., support vector machine (SVM), genetic programming (GA), fuzzy logic (FL)) showed better performance of ANNs in flow prediction [18][19][20]. It should be noted that the ANFIS model, a hybrid of ANN and fuzzy systems, also performs well in R-R modelling [21][22][23][24]. Toth and Brath [25] note that ANN is an excellent tool for modelling continuous R-R series using extensive hydro-climatic datasets. However, the prediction performance of ANNs relies on an appropriate network structure and input data.
A review of previous studies shows that, in some cases, rainfall variables are considered as the only input of the ANN models [26][27][28], while in most studies flow (or water level) antecedents were used as inputs in addition to rainfall [29][30][31][32][33][34][35][36][37]. Some studies also used other input parameters, such as temperature, evaporation, evapotranspiration, or combinations of these parameters in addition to rainfall and/or flow inputs [13,14,36,38-41]. In some studies, to use physical and geomorphological characteristics of catchments for estimating surface flow, the GANN model (a three-layer feed-forward ANN) was applied. In this model, the number of neurons within the hidden layer was considered equal to the number of possible flow paths within a catchment [42]. In this regard, Zhang and Govindaraju [42] incorporated the geomorphologic instantaneous unit hydrograph (GIUH) within the architecture of ANN, and Hosseini and Mahjouri [27] incorporated the GIUH within the architecture of ANN and SVM to compute catchment runoff values. Similarly, R-R modelling was improved using a hybrid ANN model together with physical characteristics of the catchment by Young and Liu [43].
A review of past research shows that, in spite of the flexibility and performance of ANNs in simulating the complex R-R phenomenon, current models have drawbacks, including that they cannot reflect the effect of heterogeneous catchment physical and geomorphological properties on hydrological processes. On the other hand, Artificial Intelligence (AI) techniques have shown great performance in forecasting and modelling of hydrological time series [24]. R-R modelling by ANNs (a type of AI technique) requires time series data. Past studies indicate that only climatic and hydrological time series variables have been used as runoff estimators; because catchment geomorphologic characteristics are static, no studies have used geomorphological variables as inputs within ANNs to develop efficient tools for R-R modelling. Accordingly, the Index of Connectivity (IC) was considered as a hydro-geomorphic tool to investigate flow connection within catchments [44]. Application of hydro-geomorphic connectivity in catchments has grown widely over more than the past two decades [45][46][47]. In this study, the Normalized Difference Vegetation Index (NDVI) was also considered to investigate the effect of vegetation in R-R modelling. NDVI is an indicator of the greenness of vegetation cover that is widely used for ecosystem monitoring. Therefore, the first objective of this study is to estimate catchment runoff with an ANN model using IC and NDVI as hydro-geomorphic and biophysical inputs, respectively, in addition to the climatic and hydrologic inputs. The second objective is to examine the effect of such inputs on model performance.

Study Area and Database
Two catchments, the Haughton River at Powerline gauging station (119003A) and the Calliope River at Castlehope gauging station (132001A), in Queensland, Australia, were selected to test the objectives of this study, due to data availability and consistency in catchment conditions. For example, these catchments have not experienced any changes in hydrological response due to human-induced catchment development, construction of dams, or any other anthropogenic changes. The Calliope River catchment has an elevation range of 4 to 791 m (mean = 119 m) and an average slope gradient of 8.25%; the Haughton River catchment has an elevation range of 10 to 1225 m (mean = 208 m) and an average slope gradient of 13.44% (Figure 1; Table 1). Monthly discharge data at the catchment outlets were obtained from WMIP (Water Monitoring Information Portal) through the https://watermonitoring.information.qld.gov.au website. Discharge data were downloaded and then converted to runoff. Additionally, monthly gridded rainfall data were obtained from SILO (Scientific Information for Land Owners) through the https://silo.longpaddock.qld.gov.au/gridded-data website. SILO is a database of Australian climate data from 1889 to present, and provides gridded daily climate surfaces derived either from splining or kriging the observational data with a resolution of approximately 5 × 5 km. Gridded data were provided in NetCDF (Network Common Data Form) format and were arranged in annual blocks; each annual file contains all daily grids for a given year (or 12 monthly grids in the case of monthly rainfall). NDVI time series data for 2000 to 2018 were collected from the MODIS/MCD43A4_006_NDVI product available at https://explorer.earthengine.google.com/; this product is generated from the MODIS/006/MCD43A4 surface reflectance composites with a resolution of 500 × 500 m. The statistical parameters of the data used in this study are presented in Table 2.
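The text does not spell out how discharge was converted to runoff. A common approach is to divide the monthly flow volume by the catchment area to obtain a runoff depth. The sketch below illustrates that idea only; it assumes the discharge is a monthly mean in m³/s, and the function name and the example area are hypothetical rather than values from this study.

```python
def discharge_to_runoff_mm(mean_discharge_m3s, days_in_month, area_km2):
    """Convert a monthly mean discharge (m^3/s) to a runoff depth (mm/month).

    Assumption: runoff depth = monthly flow volume / catchment area.
    """
    seconds_in_month = days_in_month * 86400
    volume_m3 = mean_discharge_m3s * seconds_in_month
    area_m2 = area_km2 * 1.0e6
    return volume_m3 / area_m2 * 1000.0  # metres to millimetres


# Hypothetical example: 10 m^3/s over a 30-day month on a 1000 km^2 catchment.
example_mm = discharge_to_runoff_mm(10.0, 30, 1000.0)
```

If the portal delivers monthly volumes instead of mean flow rates, only the first two lines of the function body would change.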

Artificial Neural Networks (ANNs)
ANNs are a soft computing technique: parallel-distributed information processing systems whose performance properties resemble those of biological neural networks and that can recognize complex non-linear relationships between input and output vectors [48,49]. ANNs with a strong input-output structure have been widely used as a tool in hydrological models [24,50-52]. The ANN used in this study is a feed-forward neural network (FFNN), the most popular type for time series forecasting models [53]. The model comprises three layers: the input layer, hidden layer, and output layer (Figure 2). In feed-forward neural networks, connections run only from neurons of the input layer to neurons of the hidden layer and from neurons of the hidden layer to neurons of the output layer; there is no connection between neurons within one layer. The output value of a three-layered FFNN is expressed as [54]:

$$ y_k = f_o\left( \sum_{j=1}^{M} w_{jk}\, f_h\left( \sum_{i=1}^{N} w_{ij}\, x_i + b_j \right) + b_k \right) \tag{1} $$

where $i$, $j$ and $k$ denote neurons of the input, hidden, and output layers, respectively; $w_{ij}$ is the weight of the $i$th neuron of the input layer connected to the $j$th neuron of the hidden layer; $w_{jk}$ is the weight of the $j$th neuron in the hidden layer connected to the $k$th neuron of the output layer; $b_j$ and $b_k$ are the biases for the $j$th hidden neuron and the $k$th output neuron, respectively; and $f_h$ and $f_o$ are the activation functions for the hidden and output neurons, respectively. Activation functions used in this study were a hyperbolic tangent function (in the hidden layer) and a pure linear function (in the output layer), as these functions are effective when extrapolating beyond the range of the training data [55]. $x_i$ is the $i$th input variable and $y_k$ is the estimated output variable. $N$ is the number of neurons in the input layer, which depends on the modeled system, and $M$ is the number of neurons in the hidden layer. In this study, a trial and error method was used to determine the proper number of neurons in the hidden layer, and an optimal ANN structure was selected according to the lowest root mean squared error (RMSE) for the validation phase. The final optimal structure (number of neurons in the input, hidden, and output layers, respectively) is shown in Tables 4-6.

A back-propagation mechanism was used to train the network; this algorithm is based on a gradient scheme that adjusts the weights to reduce the error between predicted and observed data. Several variants of the back-propagation training scheme have been developed [56]; among these, the Levenberg-Marquardt algorithm is applied in this study.
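The forward pass of such a three-layer network can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it uses a hyperbolic tangent hidden layer and a linear output layer as described above, and all weights, biases, and the 2-3-1 layout are hypothetical values chosen only to make the example runnable.

```python
import math


def ffnn_forward(x, w_ih, b_h, w_ho, b_o):
    """Forward pass of a three-layer FFNN (tanh hidden layer, linear output).

    x    : list of N inputs
    w_ih : N x M weights, w_ih[i][j] links input i to hidden neuron j
    b_h  : M hidden biases
    w_ho : M x K weights, w_ho[j][k] links hidden neuron j to output k
    b_o  : K output biases
    """
    n_hidden = len(b_h)
    hidden = [
        math.tanh(sum(x[i] * w_ih[i][j] for i in range(len(x))) + b_h[j])
        for j in range(n_hidden)
    ]
    # Linear output activation: weighted sum of hidden outputs plus bias.
    return [
        sum(hidden[j] * w_ho[j][k] for j in range(n_hidden)) + b_o[k]
        for k in range(len(b_o))
    ]


# Hypothetical 2-3-1 network, e.g. two inputs already scaled to [0, 1].
w_ih = [[0.5, -0.3, 0.8], [0.1, 0.7, -0.2]]
b_h = [0.0, 0.1, -0.1]
w_ho = [[0.6], [-0.4], [0.9]]
b_o = [0.05]
y = ffnn_forward([0.4, 0.7], w_ih, b_h, w_ho, b_o)
```

With all input-to-hidden weights and hidden biases set to zero, every hidden neuron outputs tanh(0) = 0 and the network returns the output bias alone, which is a convenient sanity check.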
According to the Levenberg-Marquardt algorithm, the weights are adjusted as follows [57]:

$$ w_{new} = w_{old} - \left[ J^{T} J + \mu I \right]^{-1} J^{T} \varepsilon \tag{2} $$

where $J$ is the Jacobian matrix that contains the first derivatives of the network errors with respect to the weights and biases, $\varepsilon$ is a vector of network errors, $I$ is the identity matrix, and $\mu$ is the learning rate, a positive scalar that is adjusted at each iteration and governs the relative importance of the Gauss-Newton and gradient descent methods [58]. Details of ANNs and their theoretical foundation can be found in [48].
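To make the update rule concrete, the sketch below applies it to a deliberately tiny problem: fitting the two parameters of a straight line rather than a neural network. This is an illustrative assumption, not the training code used in the study; the damping term is held fixed here, whereas the algorithm described above adjusts it at each iteration. The function name and data are hypothetical.

```python
def lm_step(params, xs, ys, mu):
    """One Levenberg-Marquardt update for the toy model y = a*x + b.

    Errors are e_i = (a*x_i + b) - y_i, so the Jacobian row for sample i
    is [de_i/da, de_i/db] = [x_i, 1].
    """
    a, b = params
    errors = [a * x + b - y for x, y in zip(xs, ys)]
    # Entries of J^T J + mu*I (a symmetric 2 x 2 matrix).
    h11 = sum(x * x for x in xs) + mu
    h12 = sum(xs)
    h22 = len(xs) + mu
    # Entries of J^T e.
    g1 = sum(x * e for x, e in zip(xs, errors))
    g2 = sum(errors)
    # Solve (J^T J + mu*I) * delta = J^T e with Cramer's rule.
    det = h11 * h22 - h12 * h12
    delta_a = (g1 * h22 - h12 * g2) / det
    delta_b = (h11 * g2 - h12 * g1) / det
    # w_new = w_old - delta
    return (a - delta_a, b - delta_b)


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
params = (0.0, 0.0)
for _ in range(20):
    params = lm_step(params, xs, ys, mu=0.01)
```

A small damping value makes each step close to a Gauss-Newton step, while a large value shrinks the step toward gradient descent, which is the trade-off the damping term controls.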

Determination of Input Structure
In this study, eight scenarios of ANN models, based on the choice of inputs, were considered: (i) using rainfall data; (ii) using rainfall and runoff data; (iii) using rainfall and NDVI; (iv) using rainfall and IC; (v) using rainfall, NDVI, and IC; (vi) using rainfall, runoff, and NDVI; (vii) using rainfall, runoff, and IC; and (viii) using a combination of all parameters: rainfall, runoff, NDVI, and IC. To set up these scenarios and avoid trial and error in the modelling process, the cross-correlation and partial autocorrelation functions (CCF and PACF, respectively) were applied to identify significant lag values of the inputs [53]. CCF is widely used for input selection in forecasting models, including R-R modelling using data-driven models [59]. We also considered the average monthly rainfall of the two previous months as an additional input, since it had a high correlation with runoff data.
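Lag selection by cross-correlation can be illustrated with a short sketch. It assumes a simplified estimator, the Pearson correlation between runoff at month t and rainfall at month t minus the lag computed on the overlapping pairs, which approximates but is not identical to the standard CCF estimator. The function names and the synthetic series are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def cross_correlation(rainfall, runoff, max_lag):
    """Correlation between runoff at month t and rainfall at month t - lag."""
    result = {}
    for lag in range(max_lag + 1):
        lagged_rain = rainfall[: len(rainfall) - lag]
        aligned_runoff = runoff[lag:]
        result[lag] = pearson(lagged_rain, aligned_runoff)
    return result


# Synthetic series in which runoff is half of the previous month's rainfall.
rainfall = [10.0, 50.0, 20.0, 80.0, 5.0, 60.0, 30.0, 90.0]
runoff = [2.0] + [0.5 * p for p in rainfall[:-1]]
ccf = cross_correlation(rainfall, runoff, max_lag=3)
best_lag = max(ccf, key=ccf.get)
```

In this synthetic case the one-month lag stands out by construction; with observed data the lags retained would be those whose correlations are statistically significant.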

Data Pre-Processing
In previous studies, data for ANN models have been standardized to one of three ranges: [0-1], [−1 to +1], or [0.1-0.9]. Confining the data within specific limits minimizes bias and ensures that all variables receive the same attention within the network [55]. All input variables in our study were normalized to the range [0-1] as follows [60]:

$$ X_{norm} = \frac{X_{ori} - X_{min}}{X_{max} - X_{min}} \tag{3} $$

where $X_{min}$ and $X_{max}$ represent the minimum and maximum values among the original data and $X_{norm}$ and $X_{ori}$ represent the normalized and original data, respectively.
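The min-max scaling described above is simple enough to show directly. This is a minimal sketch with a hypothetical function name and made-up values; it scales one variable at a time and refuses constant series, for which the scaling is undefined.

```python
def min_max_normalize(values):
    """Scale a sequence linearly so its minimum maps to 0 and maximum to 1."""
    x_min = min(values)
    x_max = max(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("cannot normalize a constant series")
    return [(v - x_min) / span for v in values]


# Hypothetical monthly rainfall values (mm).
normalized = min_max_normalize([12.0, 0.0, 48.0, 24.0])
```

To map model outputs back to physical units, the same minimum and maximum would be reused in the inverse transform.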

Hydrological Connectivity
The concept of hydrological connectivity is often applied to describe the internal linkages between the sources of runoff and sediment within a catchment and the catchment outlet [61]. The Index of Connectivity (IC) is a type of hydrological connectivity index that can be computed in a GIS environment using topography and other catchment characteristics [44]. IC is based on recognized major elements of hydrological connectivity, such as land use as a dynamic element and topographic characteristics as static elements, and represents the potential connectivity between different parts of a catchment. The calculation of IC is expressed as follows [44]:

$$ IC = \log_{10}\left( \frac{D_{up}}{D_{dn}} \right) \tag{4} $$

where $D_{up}$ and $D_{dn}$ are the upslope and downslope components of connectivity, respectively. IC values range from −∞ to +∞, where larger IC values represent increasing connectivity. The upslope component ($D_{up}$) indicates the potential for downward routing of runoff and sediment and is defined as:

$$ D_{up} = \bar{W}\, \bar{S}\, \sqrt{A} \tag{5} $$

where $\bar{W}$, $\bar{S}$ and $A$ are the average weighting factor (dimensionless), mean slope gradient (m/m), and area (m²) of the upslope contributing area, respectively. The downslope component ($D_{dn}$) takes into account the flow path length that a particle travels to arrive at the designated target or sink. Therefore, $D_{dn}$ can be expressed as:

$$ D_{dn} = \sum_{i} \frac{d_i}{W_i\, S_i} \tag{6} $$

where $W_i$ is the weighting factor of the $i$th cell, $S_i$ is the slope gradient of the $i$th cell, and $d_i$ is the length of the $i$th cell along the downslope direction (m); $d_i$ can assume two values: the cell size ($l$) in the case of the main direction and $l\sqrt{2}$ in the case of the diagonal direction. The Hydrologically Enforced Digital Elevation Model (DEM-H) [62] and raster maps of weighting factors with resolutions of 30 × 30 m were applied as the main input data for calculating IC.

One of the main components of IC is the weighting factor ($W$) (Equations (5) and (6)). Borselli et al. [44] introduced the weighting factor to model the resistance of each cell against runoff and sediment flow based on characteristics of the local land use and soil surface that were derived from the C-factor of USLE-RUSLE models [63,64]. C-factor values range from 0 to 1, where 0 represents dense vegetation and protected soil and 1 represents bare soil at risk of erosion. However, the same authors note that $W$ should be based on surface features (e.g., roughness index) that influence runoff and sediment processes within a catchment or a hillslope [65]. Considering catchment condition when calculating $W$ provides reasonable results; e.g., differences in the impedance to transport of runoff and sediment in bare areas would not be represented by the C-factor, but are better portrayed by an index based on topographic roughness. In agricultural and forest environments, describing the impedance to runoff and sediment transport benefits from the application of the C-factor computed from vegetation cover and land use management data [65]. Therefore, in our two study catchments, using the C-factor as the weighting factor $W$ is suitable. To estimate the C-factor of USLE-RUSLE, we rescaled NDVI following the method of [66] and used the resulting C-factor as the weighting factor $W$.
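For a single cell, the index reduces to a short calculation once the upslope averages and the downslope flow path are known. The sketch below is a hedged illustration of the upslope and downslope components described above, not a GIS workflow: the function name, the input layout, and the numbers are hypothetical, and cells with zero slope or zero weight, which would cause a division by zero, are not handled.

```python
import math


def index_of_connectivity(w_up, s_up, area_up_m2, downslope_cells):
    """IC = log10(D_up / D_dn) for one cell.

    w_up, s_up      : mean weighting factor and mean slope (m/m) of the
                      upslope contributing area
    area_up_m2      : upslope contributing area (m^2)
    downslope_cells : list of (d_i, W_i, S_i) tuples along the flow path
                      from the cell to the target or sink
    """
    d_up = w_up * s_up * math.sqrt(area_up_m2)
    d_dn = sum(d / (w * s) for d, w, s in downslope_cells)
    return math.log10(d_up / d_dn)


# Hypothetical cell: D_up = 0.5 * 0.2 * sqrt(10000) = 10,
# two downslope cells each contributing 50 / (0.5 * 0.2) = 500.
ic = index_of_connectivity(0.5, 0.2, 10000.0, [(50.0, 0.5, 0.2), (50.0, 0.5, 0.2)])
```

Lengthening the downslope path or lowering the weights along it increases the downslope component and so lowers the index, which matches the interpretation that larger values mean stronger connectivity.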

Normalized Difference Vegetation Index
The Normalized Difference Vegetation Index (NDVI) is an indicator of the greenness of vegetation cover, which is widely used for ecosystem monitoring. NDVI is generated from the Near-IR and Red bands of MODIS images as:

$$ NDVI = \frac{NIR - RED}{NIR + RED} \tag{8} $$

where $NIR$ and $RED$ are the spectral reflectances measured in the near infrared and red wavebands, respectively. NDVI values range from −1 to 1, where −1 represents bare soil and 1 represents dense vegetation [67].
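As a quick illustration of the band ratio, the following sketch computes NDVI for a single pixel. The function name and reflectance values are hypothetical, and returning `None` when both reflectances are zero is an assumption about how to treat an undefined ratio.

```python
def ndvi(nir, red):
    """Normalized difference of near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        return None  # ratio undefined; treated here as missing data
    return (nir - red) / total


# Hypothetical reflectances: strongly vegetated pixel versus equal bands.
vegetated = ndvi(0.5, 0.1)
neutral = ndvi(0.3, 0.3)
```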

Evaluation of Results
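We considered three standard statistics to assess the performance of the various models during calibration and validation: the coefficient of determination ($R^2$), RMSE, and the Nash-Sutcliffe model performance coefficient (NS) [60]:

$$ R^2 = \left[ \frac{\sum_{i=1}^{n} (O_i - \bar{O})(S_i - \bar{S})}{\sqrt{\sum_{i=1}^{n} (O_i - \bar{O})^2}\, \sqrt{\sum_{i=1}^{n} (S_i - \bar{S})^2}} \right]^2 \tag{9} $$

$$ RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (O_i - S_i)^2} \tag{10} $$

$$ NS = 1 - \frac{\sum_{i=1}^{n} (O_i - S_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2} \tag{11} $$

where $n$ is the number of data, $O_i$ and $S_i$ are the $i$th observed and simulated runoff, respectively, and $\bar{O}$ and $\bar{S}$ are the averages of observed and simulated runoff, respectively. Higher $R^2$ and NS values and lower RMSE values indicate better model performance.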
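The three statistics can be computed with a few lines of standard-library Python. This is a minimal sketch using the usual definitions of these measures (squared Pearson correlation for the coefficient of determination); the function names and the sample series are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error between observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Coefficient of determination as the squared Pearson correlation."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.0, 2.5, 4.0]
scores = (rmse(observed, simulated),
          nash_sutcliffe(observed, simulated),
          r_squared(observed, simulated))
```

A perfect simulation gives an RMSE of 0 and values of 1 for both of the other statistics, which is a useful check when wiring these functions into a model comparison.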

Statistical Analysis of Data
Initially, monthly runoff and rainfall data were surveyed to ensure there were no gaps in the time series; then, for each catchment, 70% of the total data were selected for calibration and the remaining 30% for validation. The statistical parameters for the calibration, validation, and total datasets of rainfall, runoff, NDVI and IC were calculated and are shown in Table 2. For Calliope River (station 132001A), the mean ($\bar{x}$) and standard deviation ($\sigma_x$) values for the runoff validation dataset were higher than those for calibration, but for Haughton River (station 119003A), the mean and standard deviation values for the runoff validation dataset were lower than those for calibration (Table 2). Furthermore, the coefficient of variation (Cv) of the runoff variable for Calliope River is greater than that of Haughton River. Both sites had relatively high values of skewness ($G_1$) and kurtosis ($\beta_2$) for both rainfall and runoff variables. Frequency distribution plots in these catchments exhibited non-normal distributions (Figure 4a,b). Based on the runoff box plots (Figure 4c,d), both stations contained several extreme runoff values that, together with the non-normal distribution of data, can affect ANN data-driven models and complicate runoff modelling.

In this study, input selection using CCF was conducted on the training dataset to determine the antecedent rainfall values that had the best correlation with runoff in both catchments. Cross-correlation between monthly runoff and monthly rainfall (with 3-month time lags) revealed that runoff data are highly correlated with rainfall data of the same month, followed by rainfall of the previous month, for both catchments (Table 3). We also used the average monthly rainfall of the two previous months as an input to estimate runoff, since it had a high correlation with runoff data. It has also been suggested that using antecedent runoff can improve estimation of runoff values [59]. Partial auto-correlation analysis was also conducted to determine the correlation of runoff in each month with antecedent runoff in order to select the lag time with the highest correlation. Results showed that monthly runoff data have the highest correlation coefficient with runoff of the previous month (Table 3).
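The lagged inputs and the 70/30 split described above can be assembled as follows. This sketch rests on two assumptions that the text does not confirm: the split is chronological (earliest 70% for calibration), and the candidate inputs are the same-month rainfall, previous-month rainfall, mean rainfall of the two previous months, and previous-month runoff. Function names and data are hypothetical.

```python
def build_samples(rainfall, runoff):
    """Rows of [P_t, P_(t-1), mean(P_(t-1), P_(t-2)), R_(t-1)] with target R_t."""
    inputs, targets = [], []
    for t in range(2, len(rainfall)):
        mean_prev_two = (rainfall[t - 1] + rainfall[t - 2]) / 2.0
        inputs.append([rainfall[t], rainfall[t - 1], mean_prev_two, runoff[t - 1]])
        targets.append(runoff[t])
    return inputs, targets


def chronological_split(inputs, targets, calibration_fraction=0.7):
    """Split samples in time order into calibration and validation sets."""
    n_cal = round(len(inputs) * calibration_fraction)
    calibration = (inputs[:n_cal], targets[:n_cal])
    validation = (inputs[n_cal:], targets[n_cal:])
    return calibration, validation


# Hypothetical 12-month series (mm).
rainfall = [120.0, 90.0, 60.0, 20.0, 5.0, 0.0, 2.0, 10.0, 30.0, 70.0, 110.0, 150.0]
runoff = [40.0, 30.0, 15.0, 4.0, 1.0, 0.0, 0.0, 1.0, 5.0, 18.0, 35.0, 60.0]
inputs, targets = build_samples(rainfall, runoff)
calibration, validation = chronological_split(inputs, targets)
```

The first two months are dropped because their lagged values are unavailable, so a 12-month record yields 10 samples, 7 for calibration and 3 for validation.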

ANN Models
The ANN models were applied to estimate runoff values using different input sets. Based on CCF and PACF analysis and using rainfall, runoff, NDVI, and IC time series data, 24 input patterns were considered. In all cases, the ANN used runoff values of each month as the single output. For all 24 scenarios, simulated runoff was compared with observed runoff based on RMSE, NS, and $R^2$. Results of all models are presented in two main sections: in the first section, we present the results of ANN models that used only hydro-climatic data as inputs (Table 4), and in the second section, results of the combination of hydro-climatic inputs with IC, NDVI, and both NDVI and IC are presented for Haughton River and Calliope River (Tables 5 and 6).

Different Patterns of Hydro-Climate Data
Here we analyzed two different patterns of hydro-climate data as ANN inputs: (i) rainfall data and (ii) a combination of rainfall and runoff data. Based on RMSE, NS, and $R^2$, the best input pattern in the first group was $P_t$ and $P_{t-1,t-2}$ for Haughton River and $P_t$ and $P_{t-1}$ for Calliope River. In the second group, $P_t$, $P_{t-1,t-2}$ and $R_{t-1}$ was the best model for Haughton River and $P_t$, $P_{t-1}$ and $R_{t-1}$ for Calliope River (Table 4). Overall, results showed that for both catchments, including rainfall of the previous months increased model accuracy compared to using only rainfall of the same month. This shows the importance of antecedent soil moisture in the runoff generation process [68]. Additionally, including runoff and rainfall together as input data slightly improved model performance for both catchments (Table 4), indicating that rainfall alone is insufficient to precisely estimate runoff, while combining it with antecedent runoff improves predictions [59].

Combination of Hydro-Climate, Hydro-Geomorphic and Biophysical Data
To better evaluate all 24 ANN models with different inputs, averages of the performance indices (RMSE, NS, and $R^2$) for models using only hydro-climatic inputs (Table 4) and models using combinations of hydro-climatic data (P and R), hydro-geomorphic data (IC) and biophysical data (NDVI) (Tables 5 and 6) were computed and are presented separately for Haughton River and Calliope River (Figure 5a-c). Compared to models using only hydro-climatic inputs, for the Haughton River catchment, IC as a model input improved NS by 9.77% and $R^2$ by 11.76%, and decreased RMSE by 24.43%; NDVI as a model input improved NS by 6.24% and $R^2$ by 8.08%, and decreased RMSE by 13.22%; and NDVI together with IC as model inputs improved NS by 6.92% and $R^2$ by 6.86%, and decreased RMSE by 14.98%. For the Calliope River catchment, IC improved NS by 11.25% and $R^2$ by 10.29%, and decreased RMSE by 37.89%; NDVI improved NS by 5.52% and $R^2$ by 6.72%, and decreased RMSE by 11.89%; and NDVI along with IC improved NS by 7.05% and $R^2$ by 6.93%, and decreased RMSE by 22.48%. Comparison amongst different input patterns showed that IC inputs improve model performance more than NDVI inputs in both catchments, and the best input patterns during the validation phase based on RMSE, NS and $R^2$ were $P_t$ and $IC_t$ for Haughton River and $P_t$, $R_{t-1}$ and $IC_t$ for Calliope River. On the other hand, NDVI together with IC as model inputs improved the prediction results, but ANN performance was lower than when using only IC data. This may be explained by the fact that NDVI data were used to compute IC, so adding NDVI input nodes alongside IC increases network complexity without offering information that is essential for the modelling. Overall, our comparisons amongst ANN models showed that combining hydro-climatic data with IC and NDVI produced better results (higher NS and $R^2$ values and smaller RMSE) than models using only hydro-climatic data. These results demonstrate that runoff characteristics are affected by hydro-geomorphic features [47] and biophysical data [69] and that, in general, incorporating catchment geomorphological characteristics within ANN models elevates the performance of R-R modelling [27,42,43].

Comparing the calibration with the validation phase revealed that, in Calliope River, the performance indices of the ANN models during the calibration phase were better than those during the validation phase, while in Haughton River the reverse was true. This result is understandable given the statistical parameters of the datasets (Table 2). For both the calibration and validation phases, scatter plots are presented for the best input patterns of ANN models (bold type in Tables 5 and 6) for the Haughton River catchment (Figure 6) and Calliope River catchment (Figure 7). Comparison of scatter plots between observed and simulated runoff data for the two catchments showed that observed runoff during the validation period in Haughton River spanned a wider range than in Calliope River. Additionally, simulated runoff values for Calliope River were in closer agreement with the observed values (calibration: $R^2$ = 0.99, validation: $R^2$ = 0.96) than those for Haughton River (calibration: $R^2$ = 0.96, validation: $R^2$ = 0.82).

Moreover, plots of temporal variations of observed runoff versus simulated runoff of the best model using only hydro-climatic inputs ($P_t$, $P_{t-1,t-2}$ and $R_{t-1}$ for Haughton River and $P_t$, $P_{t-1}$ and $R_{t-1}$ for Calliope River) and simulated runoff of the best model using a combination of hydro-climatic inputs with new inputs ($P_t$ and $IC_t$ for Haughton River and $P_t$, $R_{t-1}$ and $IC_t$ for Calliope River) are shown in Figure 8a,b for Haughton and Calliope Rivers, respectively. These graphs confirm the agreement between observed and simulated runoff values during the validation period when IC is used along with hydro-climatic inputs, particularly for Calliope River.
Additionally, to evaluate the ability of model in addition to goodness of fit, accurate prediction of peak runoff values is also important.In this regard, relative error in simulating peak runoff (%RE P for the best model using only hydro-climatic inputs and the best model using hydro-climatic inputs along with new inputs was calculated according to the following equation [27] and the results are presented in Table 7. where, O P and S P are the observed and simulated peak runoff, respectively.The results indicated that, in general, peak runoff values were simulated more accurately when adding IC to hydroclimatic inputs, in the both catchments, especially in Calliope River catchment.Moreover, plots of temporal variations of observed runoff versus simulated runoff of the best model using only hydro-climatic inputs (P t , P t−1, t−2 and R t−1 for Haughton River and P t , P t−1 and R t−1 for Calliope River) and simulated runoff of the best model using a combination of hydro-climatic inputs with new inputs (P t and IC t for Haughton River and P t , R t−1 and IC t for Calliope River) are shown in Figure 8a,b for Haughton and Calliope Rivers, respectively.These graphs confirm the agreement between observed and simulated runoff values during the validation period when IC is used along with hydro-climatic inputs, particularly for Calliope River.
Additionally, to evaluate the ability of model in addition to goodness of fit, accurate prediction of peak runoff values is also important.In this regard, relative error in simulating peak runoff (%RE P ) for the best model using only hydro-climatic inputs and the best model using hydro-climatic inputs along with new inputs was calculated according to the following equation [27] and the results are presented in Table 7.
where O_P and S_P are the observed and simulated peak runoff, respectively. The results indicated that, in general, peak runoff values were simulated more accurately when IC was added to the hydro-climatic inputs in both catchments, especially in the Calliope River catchment.
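The relative peak error can be sketched in a few lines. The form used here, %RE_P = 100 × (O_P − S_P) / O_P, is an assumed, commonly used signed definition and may differ in sign or normalisation from the equation in [27]; the function name and the sample values are hypothetical.

```python
def relative_peak_error(observed_peak: float, simulated_peak: float) -> float:
    """Signed relative error in peak runoff, in percent (assumed form)."""
    if observed_peak == 0:
        raise ValueError("observed peak must be non-zero")
    return 100.0 * (observed_peak - simulated_peak) / observed_peak


# Hypothetical values, not taken from Table 7.
error_pct = relative_peak_error(observed_peak=200.0, simulated_peak=150.0)
print(error_pct)
```

Under this sign convention, a positive value means the simulated peak underestimates the observed peak, and a negative value means it overestimates it.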

Table 7. Assessment of models based on relative error in peak runoff for the best model using only hydro-climatic inputs and the best model using hydro-climatic inputs along with new inputs for Haughton River and Calliope River catchments.

Input Pattern Haughton River Calliope River

Conclusions
This study was designed to improve R-R modelling using different input patterns in ANN data-driven models for two catchments in Queensland, Australia. Our objective was to identify the most effective and accessible model input variables, based on hydrologic understanding, to achieve better estimates of monthly runoff. To accomplish this goal, we investigated the effects of NDVI as a biophysical input and IC as a hydro-geomorphic input, along with rainfall and runoff as hydro-climatic inputs, on modelling performance. The study demonstrated a significant effect of IC and NDVI time series inputs on the estimation of monthly runoff using ANN models. Combining catchment geomorphic and biophysical parameters into a single parameter (IC in this case) not only reduced the number of model parameters but also increased the accuracy of the model simulations, with a slight increase (9.77% and 11.25%) in the Nash-Sutcliffe efficiency and a noticeable decrease (24.43% and 37.89%) in the root mean squared error of monthly runoff for Haughton River and Calliope River, respectively. We also concluded that remote sensing can provide spatially explicit information on catchment conditions. Such dynamic catchment information, for example land cover, can be used in data-driven models to increase model performance. These improvements in runoff predictions are valuable for water resources planning and management.
Although our results showed that IC is a promising and reliable input within ANN models for better simulating R-R processes, more research is required to evaluate the efficiency of these inputs in other scenarios, such as simulation of event-based R-R, simulation of sediment transport, modelling using other artificial intelligence models (e.g., ANFIS, SVR), and application in other catchments. Given that the use of high-quality data in data-driven models leads to more reliable results, input data preparation is important for such studies. Availability of reliable hydro-climatic data is one of the limitations of this kind of model. Moreover, the availability of remotely-sensed land cover datasets (i.e., NDVI) and high-resolution topographic data to represent catchment geomorphic characteristics can also be a limiting factor. For example, the availability of cloud-free satellite data that can represent the spatial and temporal dynamics of vegetation cover in a catchment is critical in deriving NDVI and IC parameters.

Figure 1. Location of stations and drainage network maps of Haughton and Calliope Rivers.

Figure 3. Monthly NDVI time series plot compared with monthly rainfall and runoff data for the Haughton River catchment (a) and the Calliope River catchment (b) (2000 to 2018).

Figure 4. Frequency distribution histogram and box plot of runoff time series data for the catchments of Calliope River (a,c) and Haughton River (b,d). The dashed line in the frequency histogram shows the distribution curve.

4.2. Development and Application of ANN Model
4.2.1. Results of the Best Input Delay Values

Figure 5. Comparison of average performance indices (root mean squared error (RMSE), Nash-Sutcliffe model performance coefficient (NS), and coefficient of determination (R²)) for the ANN model with different input patterns in the validation period for Haughton River and Calliope River catchments (a-c).

Figure 6. Scatter plot of the best input pattern (bold type in Table 5) of ANN models during the calibration (a) and validation (b) phases in the Haughton River catchment.

Figure 7. Scatter plot of the best input pattern (bold type in Table 6) of ANN models during the calibration (a) and validation (b) phases in the Calliope River catchment.

Figure 8. Time-series curves of observed runoff versus simulated runoff of the best model using only hydro-climatic inputs (bold type in Table 4) and simulated runoff of the best model using hydro-climatic inputs along with new inputs (bold type in Tables 5 and 6) during the validation phase for the Haughton River catchment (a) and the Calliope River catchment (b).

Table 1. Geographical and hydrological characteristics of the studied stations.

Table 2. Statistical parameters of monthly rainfall, runoff, Normalized Difference Vegetation Index (NDVI) and Index of Connectivity (IC) for the total, calibration and validation datasets for the studied catchments.

Columns: Variable; Statistical Parameter; The Haughton River Catchment (Calibration (70%), Validation (30%), Total Data); The Calliope River Catchment (Calibration (70%), Validation (30%), Total Data).
Higher R² values represent better correlation between simulated and observed values. The RMSE ranges from 0 to +∞, with 0 indicating a perfect fit of the model to observed values. NS values range from −∞ to 1: a value of 1 indicates a perfect match of simulated and observed values, a value of 0 indicates that the simulated data are no better than taking the mean of the observed data, and values <0 indicate very weak model performance. Generally, NS > 0.6 is considered acceptable model performance [50].
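The three performance indices can be computed as in the following minimal sketch. It uses the standard definitions of RMSE and the Nash-Sutcliffe efficiency and assumes that R² is the squared Pearson correlation between observed and simulated values; the function names and sample values are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error; 0 indicates a perfect fit."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency; 1 is a perfect match, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, simulated):
    """Squared Pearson correlation (assumed definition of R^2).

    Undefined (division by zero) if either series is constant.
    """
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical series: the simulation is offset from the observations by a constant 0.5.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.5, 3.5, 4.5]
print(rmse(observed, simulated), nash_sutcliffe(observed, simulated), r_squared(observed, simulated))
```

For this constant offset, RMSE is 0.5 and NS is 0.8 while R² remains 1, which illustrates why a correlation-based index alone does not reveal systematic bias.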

Table 3. Cross correlation between monthly runoff and rainfall data with 5% significance limits and partial autocorrelation for monthly runoff data with 5% significance limits for Haughton River and Calliope River catchments.

Table 4. Results of monthly time series modelling by Artificial Neural Network (ANN) with hydro-climatic inputs during calibration and validation.
P_t: rainfall value of the (t)th month; P_{t−1}: rainfall value of the (t−1)th month; P_{t−1,t−2}: average monthly rainfall of the (t−1)th and (t−2)th months; R_{t−1}: runoff value of the (t−1)th month.
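The lagged inputs defined above can be assembled from monthly series as in the following minimal sketch. The function name, the row layout and the sample values are hypothetical; an individual input pattern would use only a subset of these columns.

```python
def build_input_patterns(rainfall, runoff):
    """Build (inputs, target) pairs from monthly rainfall and runoff series.

    Each input row is (P_t, P_{t-1}, mean(P_{t-1}, P_{t-2}), R_{t-1});
    the target is R_t. The first two months are skipped because they
    lack a full set of lagged values.
    """
    if len(rainfall) != len(runoff):
        raise ValueError("rainfall and runoff must have the same length")
    rows, targets = [], []
    for t in range(2, len(rainfall)):
        p_t = rainfall[t]
        p_lag1 = rainfall[t - 1]
        p_lag12 = (rainfall[t - 1] + rainfall[t - 2]) / 2.0
        r_lag1 = runoff[t - 1]
        rows.append((p_t, p_lag1, p_lag12, r_lag1))
        targets.append(runoff[t])
    return rows, targets


# Hypothetical monthly values for illustration only.
rainfall = [10.0, 20.0, 30.0, 40.0]
runoff = [1.0, 2.0, 3.0, 4.0]
rows, targets = build_input_patterns(rainfall, runoff)
print(rows, targets)
```

Skipping the first two months keeps every row complete, at the cost of two samples at the start of the record.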

Table 5. Results of monthly time series modelling by ANN with different inputs during calibration and validation in the Haughton River catchment.

Table 6. Results of monthly time series modelling by ANN with different inputs during calibration and validation in the Calliope River catchment.