Comparing Machine Learning and Decision Making Approaches to Forecast Long Lead Monthly Rainfall: The City of Vancouver, Canada

: Estimating maximum possible rainfall is of great value for ﬂood prediction and protection, particularly for regions, such as Canada, where urban and ﬂuvial ﬂoods from extreme rainfalls have been known to be a major concern. In this study, a methodology is proposed to forecast real-time rainfall (with one month lead time) using different number of spatial inputs with different orders of lags. For this purpose, two types of models are used. The ﬁrst one is a machine learning data driven-based model, which uses a set of hydrologic variables as inputs, and the second one is an empirical-statistical model that employs the multi-criteria decision analysis method for rainfall forecasting. The data driven model is built based on Artiﬁcial Neural Networks (ANNs), while the developed multi-criteria decision analysis model uses Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) approach. A comprehensive set of spatially varying climate variables, including geopotential height, sea surface temperature, sea level pressure, humidity, temperature and pressure with different orders of lags is collected to form input vectors for the forecast models. Then, a feature selection method is employed to identify the most appropriate predictors. Two sets of results from the developed models, i.e., maximum daily rainfall in each month (RMAX) and cumulative value of rainfall for each month (RCU), are considered as the target variables for forecast purpose. The results from both modeling approaches are compared using a number of evaluation criteria such as Nash-Sutcliffe Efﬁciency (NSE). The proposed models are applied for rainfall forecasting for a coastal area in Western Canada: Vancouver, British Columbia. Results indicate although data driven models such as ANNs work well for the simulation purpose, developed TOPSIS model considerably outperforms ANNs for the rainfall forecasting. ANNs show acceptable simulation performance during the calibration period (NSE up to 0.9) but they fail for the validation (NSE of 0.2) and forecasting (negative NSE). The TOPSIS method delivers better rainfall forecasting performance with the NSE of about 0.7. Moreover, the number of predictors that are used in the TOPSIS model are signiﬁcantly less than those required by the ANNs to show an acceptable performance (7 against 47 for forecasting RCU and 6 against 32 for forecasting RMAX). Reliable and precise rainfall forecasting, with adequate lead time, beneﬁts enhanced ﬂood warning and decision making to reduce potential ﬂood damages.


Introduction
All around the world, floods are known as a severe natural disaster with significant social, economic and environmental consequences. The consequences can be property losses and destruction Mekanik et al. [28] developed several machine learning methods such as ANNs to forecast spring rainfall for southeast Australia using large scale climate signals including ENSO, IOD and Inter-decadal Pacific Ocean (IPO). In some studies, rainfall has been used as the only input to the neural networks for rainfall forecasting. For example, Luk et al. [29] used the current and past rainfall values from a couple of rainfall gauges with different lags (15 min intervals) as inputs to three alternative types of ANNs for short term rainfall forecasting. Although ANNs have been the most popular data-based models for rainfall forecasting, a limited number of studies has also been observed in the literature for using the other types of machine learning methods. One example is Nasseri et al. [30] that investigated the application of ANNs coupled with coupled with genetic algorithm to train and optimize the networks for short term rainfall forecasting. Hong et al. [31], as another instance, used support vector machines for rainfall forecasting.
In this study, a methodology is proposed to forecast long term rainfall for Vancouver city, British Columbia (a Western Canadian province), using a set of spatially varying large scale climate signals with different lag times. Here, two different modeling approaches are developed and compared for rainfall forecasting. At the first step, MRMR (Maximum Relevance Minimum Redundancy) feature selection method picks the most effective signals as predictors [32]. Then, the two approaches are employed to build the forecast models. The first approach is an artificial neural network model (which has extensively been applied in the literature for hydrologic simulations) that uses the selected predictors to forecasts rainfall one month ahead of time. The second approach is based on TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution), a multi-criteria decision making method, which tests different combinations of predictors, among those identified by MRMR, and investigates all the possible sets of data for rainfall modeling. TOPSIS has been more employed in social studies and for decision making (e.g., [33][34][35]), however, is has recently been shown to be powerful in hydrologic and weather forecasting as well [16,36]. Finally, the ANN and TOPSIS models with the best forecasting performance and their corresponding sets of predictors are identified and compared to propose a promising approach for long lead monthly rainfall forecasting.
The paper starts with an introduction to the study area and then the description of the methodology. Thereafter, results are provided and discussed, and finally, a summary and conclusion is given.

Study Site
To verify the skill of the proposed methodology to forecast rainfall, it is applied to a real world case study: Vancouver, British Columbia, Canada. Vancouver is a coastal city located in the Lower Mainland region of British Columbia with an area of 115 km 2 . It is the most populous city in the province of British Columbia with more than 630,000 people recorded in 2016 census. Communities located on the southwest of British Columbia have been affected by a number of flood events [37]. Examples of severe weather events in this region are the storm on 15 December 2006, flooding on 24 and 27 November 2011 and the landfall of Typhoon Freda on 12 October 1962 [38]. Flood vulnerability in Vancouver is expected to be increased due to sea level rise and climate change impacts [39,40].
Here, to perform the analysis to develop models for extreme rainfall forecasting, daily rainfall data are obtained from the British Columbia River Forecast Centre (BCRFC) database (bcrfc.env.gov.bc.ca). Characteristics of the rainfall gauge station are presented in Table 1. Table 1. Characteristics of the rainfall gauge station in the study area.

Name of the Station Latitude ( • ) Longitude ( • ) Elevation (m) Data Type and Length
Vancouver Harbour CS 49.3 −123.12 2.5 daily precipitation (1925-present) The maximum value of daily rainfall is 203. 2 Table 2 shows long-term averages of maximum daily rainfall in each month (RMAX) and cumulative value of rainfall for each month (RCU) for VANCOUVER HARBOUR CS gauge station over the period of 1925 to 2016.

Methodology
Proposing a methodology for reliable real time rainfall forecasting is the main focus of this study. Two modeling approaches are used and compared for this purpose. The methodology uses a set of large scale climate signals as well as the historical rainfall events as input for the two models, i.e., artificial neural network and TOPSIS. Prior to use the signals for rainfall modeling, MRMR method is employed to choose the most effective prediction features among the climate signals. To measure the forecasting performance of the models, several evaluation criteria are used. A coastal city in Western Canada has been selected for the real application of the proposed approach.
The flowchart presented in Figure 1 shows the proposed scheme of this study to forecast rainfall.

Methodology
Proposing a methodology for reliable real time rainfall forecasting is the main focus of this study. Two modeling approaches are used and compared for this purpose. The methodology uses a set of large scale climate signals as well as the historical rainfall events as input for the two models, i.e., artificial neural network and TOPSIS. Prior to use the signals for rainfall modeling, MRMR method is employed to choose the most effective prediction features among the climate signals. To measure the forecasting performance of the models, several evaluation criteria are used. A coastal city in Western Canada has been selected for the real application of the proposed approach.
The flowchart presented in Figure 1 shows the proposed scheme of this study to forecast rainfall.

Data Gathering and Preparation
Two sets of rainfall timeseries i.e., maximum daily rainfall (RMAX) in a month, and monthly cumulative rainfall (RCU) in a month, are considered as the forecasting targets. These rainfall timeseries are formed using the daily data obtained from the gauge station presented in Table 1. In order to constitute the rainfall models' input set, monthly timeseries of large scale climate signals, including geopotential height eight (GH), wind speed (W), air temperature (AM), SST, relative

1-Data gathering and preparation
Timeseries of large scale climate variables: geopotential height (GH), wind (W), atmospheric pressure(AM), sea surface temperature (SST), relative humidity (RH), sea level pressure (SLP), and precipitation rate (PR) Daily rainfall Study area and low and high pressure points' coordination

Data Gathering and Preparation
Two sets of rainfall timeseries i.e., maximum daily rainfall (RMAX) in a month, and monthly cumulative rainfall (RCU) in a month, are considered as the forecasting targets. These rainfall timeseries are formed using the daily data obtained from the gauge station presented in Table 1. In order to constitute the rainfall models' input set, monthly timeseries of large scale climate signals, including geopotential height eight (GH), wind speed (W), air temperature (AM), SST, relative humidity (RH), SLP, and precipitation rate (PR) are attained from the National Ocean and Atmospheric Administration (NOAA) website (monthly/seasonal mean time series from the NCEP Reanalysis Dataset from http: //esrl.noaa.gov/psd/cgi-bin/data/timeseries/timeseries1.pl). Climate signals data are available from March 1948 to present.
There is a high correlation between the large scale climate signals over the North-Eastern Pacific and the extreme hydrologic events across the North America West Coast [15,41]. One example is the SLP over the North Pacific Ocean which is dominated by the "North Pacific High" (a well-developed high pressure system located in the northeastern part of the Pacific Ocean) [42]. This high pressure system has the strongest effects on the storms during the northern hemisphere summer and causes dry summers and falls, and wet winters and springs over the western parts of Unites States [43]. According to Bonsal et al. [44], there is a strong relation between large scale teleconnections (patterns of pressure and circulation anomalies that span long distant geographical areas) and Canadian climate. In this study, to investigate the relationship between the large scale climate signals over the North Pacific Ocean and extreme rainfall for the study area, climate data are downloaded for four different regions (here called characteristic locations). One of these regions covers the study area (i.e., 110 • -130 • W longitude and 48. There is a high correlation between the large scale climate signals over the North-Eastern Pacific and the extreme hydrologic events across the North America West Coast [15,41]. One example is the SLP over the North Pacific Ocean which is dominated by the "North Pacific High" (a well-developed high pressure system located in the northeastern part of the Pacific Ocean) [42]. This high pressure system has the strongest effects on the storms during the northern hemisphere summer and causes dry summers and falls, and wet winters and springs over the western parts of Unites States [43]. According to Bonsal et al. [44], there is a strong relation between large scale teleconnections (patterns of pressure and circulation anomalies that span long distant geographical areas) and Canadian climate. In this study, to investigate the relationship between the large scale climate signals over the North Pacific Ocean and extreme rainfall for the study area, climate data are downloaded for four different regions (here called characteristic locations). One of these regions covers the study area (i.e., 110°-130° W longitude and 48.6°-54.4° N Latitude, shown by L1), and the rest of them (indicated by L2, L3, and L4) cover low and high pressure points on the North Eastern Pacific Ocean, with coordinates 120°-150° W and 50°-70° N, 140°-180° W and 40°-60° N, and 100°-120° W and 40°-50° N, respectively. Figure 2 shows the locations of the selected regions for acquiring large scale climate signals. The list of climate variables for location L1 is shown in Table 3 (rows 1 to 9). SST and SLP monthly anomalies (rows 6 and 8 in Table 3) are calculated by subtracting long-term average of SST and SLP over each month from the observed SST and SLP for the corresponding month. Likewise, SST and SLP seasonal anomalies (rows 7 and 9 in Table 3) are calculated by subtracting seasonal SST and SLP for all seasons from the monthly values of observed SST and SLP corresponding to each season.
Four lag times (1 month, and 2, 3 and 12 months) are considered. Based on these lag times, a total of 47 predictors for location L1 are constructed (shown by L11 to L147 in Table 3). Therefore, for four locations, a total of 188 (47 × 4) timeseries of climate signals are developed. In addition to the large scale climate signals with various lag times, historical rainfalls (historical RMAX and RCU) are also incorporated in constructing the predictors' sets (the last four rows in Table 3). This means that for example, to forecast rainfall for April 2016, rainfall values for March, February and January 2016, as well as April 2015, may be used (i.e., RMAX1-RMAX4 and RCU1-RCU4). RMAXL and RCUL in Table 3 signify the long term average of monthly rainfalls for 12 months (i.e., January to December). Therefore, for RMAX and RCU, by averaging the data for the entire time period, 12 values are obtained, and then for each month, the corresponding long term average of rainfall is used in the predictors' set. Adding the timeseries built from the historical rainfall to the large scale climate signals, 198 (188 + 10) predictors will be considered for rainfall forecasting. The list of climate variables for location L 1 is shown in Table 3 (rows 1 to 9). SST and SLP monthly anomalies (rows 6 and 8 in Table 3) are calculated by subtracting long-term average of SST and SLP over each month from the observed SST and SLP for the corresponding month. Likewise, SST and SLP seasonal anomalies (rows 7 and 9 in Table 3) are calculated by subtracting seasonal SST and SLP for all seasons from the monthly values of observed SST and SLP corresponding to each season.
Four lag times (1 month, and 2, 3 and 12 months) are considered. Based on these lag times, a total of 47 predictors for location L 1 are constructed (shown by L 1 1 to L 1 47 in Table 3). Therefore, for four locations, a total of 188 (47 × 4) timeseries of climate signals are developed. In addition to the large scale climate signals with various lag times, historical rainfalls (historical RMAX and RCU) are also incorporated in constructing the predictors' sets (the last four rows in Table 3). This means that for example, to forecast rainfall for April 2016, rainfall values for March, February and January 2016, as well as April 2015, may be used (i.e., RMAX1-RMAX4 and RCU1-RCU4). RMAXL and RCUL in Table 3 signify the long term average of monthly rainfalls for 12 months (i.e., January to December). Therefore, for RMAX and RCU, by averaging the data for the entire time period, 12 values are obtained, and then for each month, the corresponding long term average of rainfall is used in the predictors' set. Adding the timeseries built from the historical rainfall to the large scale climate signals, 198 (188 + 10) predictors will be considered for rainfall forecasting. Table 3. List of the large scale climate and precipitation variables considered to develop the rainfall forecasting model.

ID
Variable Identified variables for location L 2 (as similarly named by L 1 1 to L 1 47 for location L 1 ) L 3 1-L 3 47 Identified variables for location L 3 (as similarly named by L 1 1 to L 1 47 for location L 1 ) L 4 1-L 4 47 Identified variables for location L 4 (as similarly named by L 1 1 to L 1 47 for location L 1 ) RMAX1-RMAX4 RMAX with 1 month, and 2, 3 and 12 months lag time RCU1-RCU4 RCU with 1 month, and 2, 3 and 12 months lag time RMAXL Long term monthly averages of RMAX (from January to December) RCUL Long term monthly averages of RCU (from January to December)

Rainfall Forecasting
Two rainfall values are considered to be forecasted: monthly maximum daily rainfall (RMAX) (the maximum value of daily rainfall in a month) and monthly cumulative value of rainfall (RCU) (cumulative values of daily rainfalls for a month).
One of the main issues in developing predictive tools is to select the most appropriate set of predictors. Taking into account four locations (L 1 -L 4 ), and monthly and seasonal climate signals with different lag times (1, 2, 3 and 12 months), a large set of possible combinations of predictors can be built. Different feature selection methods, such as Mutual Information [45], stepwise regression [46] and Max-Relevance and Min-Redundancy [45,47] can be used for selecting the optimal input variables. Considering the findings of the previous studies and the efficiency of these methods, in this study Max-Relevance and Min-Redundancy (MRMR) method is employed to select the most appropriate predictors' set with efficient number of climate signals. MRMR considers the correlation between input variables (predictors) and rainfall (predictant), as well as inter-correlation between the inputs. The forecasting models use the identified variables by MRMR as input (predictors) and, RMAX and RCU as target (predictant).

Predictors' Selection for Rainfall Forecasting: Application of MRMR Method
For data driven models, inputs have substantial effect on the model simulation performance. Mutual information (MI)-based methods are tools used to select the most suitable inputs. MRMR is an example of these methods. It selects a set of predictors among a large number of predictors (features) that are related to a predictant. This method can pick out a set of appropriate inputs based on the desired number of predictor variables. The selected predictors have the highest MI with predictant (target) and the lowest MI among themselves.
MI for two random variables of a and b is shown by I(a; b). I(a; b) is defined based on their probabilistic density functions represented by p(a), p(b), and p(a, b), respectively: The purpose of MI is to find a set of predictors (called A) with k features that jointly have the largest dependency on the target class c: MRMR is represented by the criteria of maximum relevance (Max-R) and minimum redundancy (Min-R). In Max-R, selected features (a i ) among the predictors are required to individually have the largest MI with predictant (c). This reflects a large dependency with the target. Max-R means to find predictors satisfying Equation (3). This equation approximates D(A, c) with the mean value of all MI's between a i (the ith feature) and c: In the feature selection, selecting combinations of individually good features do not necessarily lead to a good performance for the classification. This is due to the redundancy among features. Therefore, the following Min-R condition is added to select mutually exclusive features with minimum redundancy: MRMR merges the above two max and min constraints. Then, the Φ operator combines D and R to maximize D and minimize R simultaneously: Here, the maximum number of features (k) in the predictors' set is considered to be 50 (i.e., 4 ≤ k ≤ 50). In the MRMR method, predictors are selected based on their order of correlation with the predictant. In other words, by increasing the value of k, a new predictor is added to the previous set of inputs.

Models' Development
Two modeling approaches are employed for rainfall simulation and forecasting. The first approach is based on the application of artificial neural network (ANN) machine learning method. The second model uses a multi-criteria decision analysis method, called Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS).

•
Data driven models for rainfall forecasting: application of artificial neural network (ANN) ANNs are composed of a system of interconnected neurons that fed by predictors (inputs) and compute outputs by feeding information through layers of neurons in the network (Figure 3). In this study, Multi-Layer Perceptron (MLP) neural network models (feedforward networks that consists of the input layer, the hidden layers and the output layer, and process the information from the input layer to the hidden layer and then the output layer) with various structures are tested: = 2 1 + exp −2 − 1 tansig Log − sigmoid transfer function 1 1 + exp logsig Hyperbolic tangent sigmoid transfer function where is the output, is the input, and and are the weights between neurons of the input and hidden layer and between the hidden layer and output, respectively. and are the bias vectors for the input and hidden layers, and and are the activation functions for the output layer and the hidden layer, respectively. Also, I and J signify the number of nodes in the input and hidden layer [28].
Networks with 1 hidden layer with the number of neurons allowed to vary between 2 and 40 are built [48]. To test if the network simulation performance could be improved, models are checked with logsig (Log-sigmoid) and tansig (Hyperbolic tangent sigmoid) non-linear transfer functions for the hidden layer ( in Equation (7)). Moreover, different training functions including traingdm (Gradient descent with momentum backpropagation), trainlm (Levenberg-Marquardt backpropagation) and traingdx (Gradient descent with momentum and adaptive learning rate backpropagation) are also examined. For the MLP networks, output unit is selected to be the linear purelin (pure linear) function ( In Equation (8)). The number of inputs in each time step is equal to the number of the selected predictors by MRMR. The initial weights are randomly selected, and then to obtain the best simulation, weights are adjusted during the network training. More information about the application of artificial neural networks can be found in [49]. A vector composed of historical observed rainfall is used for supervised network training using the back propagation algorithms. 70% of data is used for calibration, 20% for validation, and 10% for forecasting.
Set of the input data for the networks is built by choosing predictors from the first 50 variables selected among the 198 predictors by MRMR. The minimum number of predictors in the input set is In this study, Multi-Layer Perceptron (MLP) neural network models (feedforward networks that consists of the input layer, the hidden layers and the output layer, and process the information from the input layer to the hidden layer and then the output layer) with various structures are tested: where y t is the output, x i is the input, and w i and w j are the weights between neurons of the input and hidden layer and between the hidden layer and output, respectively. b i and b j are the bias vectors for the input and hidden layers, and f 1 and f 2 are the activation functions for the output layer and the hidden layer, respectively. Also, I and J signify the number of nodes in the input and hidden layer [28]. Networks with 1 hidden layer with the number of neurons allowed to vary between 2 and 40 are built [48]. To test if the network simulation performance could be improved, models are checked with logsig (Log-sigmoid) and tansig (Hyperbolic tangent sigmoid) non-linear transfer functions for the hidden layer ( f 2 in Equation (7)). Moreover, different training functions including traingdm (Gradient descent with momentum backpropagation), trainlm (Levenberg-Marquardt backpropagation) and traingdx (Gradient descent with momentum and adaptive learning rate backpropagation) are also examined. For the MLP networks, output unit is selected to be the linear purelin (pure linear) function ( f 1 In Equation (8)). The number of inputs in each time step is equal to the number of the selected predictors by MRMR. The initial weights are randomly selected, and then to obtain the best simulation, weights are adjusted during the network training. More information about the application of artificial neural networks can be found in [49]. A vector composed of historical observed rainfall is used for supervised network training using the back propagation algorithms. 70% of data is used for calibration, 20% for validation, and 10% for forecasting.
Set of the input data for the networks is built by choosing predictors from the first 50 variables selected among the 198 predictors by MRMR. The minimum number of predictors in the input set is 4 and the maximum number of predictors is 50. To automate the process of constructing the models (with different structures and number of input predictors) and find the model with the highest simulation performance, the whole process is scripted in MATLAB. Several metrics (Equations (17)- (22)) are used to evaluate the goodness of fit of the models' performance, and finally choose the ANN model with an optimized structure.
Thereafter, the weighted normalized decision matrix (T) is built: where W j , which is the weight given to each predictor (criteria), is considered to be 1. Then, the worst and the best alternatives (negative ideal and positive ideal solutions), A w and A b respectively, are determined: where J + = {j = 1, 2, . . . , n|j } and J − = {j = 1, 2, . . . , n|j } are associated with the criteria having a positive and negative impact, respectively. In the following, distances between alternative i and A w and A b , (shown by d iw and d ib , respectively) are calculated: Finally, the relative closeness to A w and A b is determined: s iw is equal to 0 or 1, if and only if the identified solution has the worst or the best conditions, respectively. Among the observed events in the rainfall timeseries (i.e., m alternatives), the last 10% of the events are kept to check the performance of the forecasting models. The rest of observed data are used to build the models. In other words, for each of the last 10% of the events, the model will look into the rest of the events (i.e., m − 10% × m events) to find the worst and the best alternatives and then identify the alternative with the highest value of relative closeness based on Equation (15).
To improve the performance of the TOPSIS models for the rainfall forecasting, instead of identifying only one alternative as the solution, 10 rainfall events with the highest values of relative closeness are selected. For this purpose, a set of alternatives is ranked according to the descending order of s iw . Then, the ten alternatives with the highest values of s iw are selected and combined using their corresponding relative weights: where V P is the forecasted value of rainfall with one month lead time by TOPSIS, V i , i = 1, . . . , 10 are the rainfall events with the highest values of relative closeness, and w i , i = 1, . . . , 10 are the values of relative closeness corresponding to the identified rainfalls. This procedure is repeated for all of the rainfall events kept for checking the models' performance, while models have different decision matrices. To identify the model with the ideal number of criteria (n), the models' performance for rainfall forecasting should be compared.

Evaluation of the Forecasting Models' Performance
The following metrics are used to analyze and compare the power of the developed models for rainfall simulation and forecasting: Nash-Sutcliffe Efficiency : Mean Bias Error : Mean Absolute Error : Index of agreement : Common Mean Correlation : Mean Squared Error : where O i and S i are observed and forecasted monthly rainfall in month i, respectively. O and S are the long term mean values of observed and forecasted monthly rainfall for the entire time period, respectively. NSE represents a measure of the proportion of the initial variance accounted for the model, and ranges between −∞ to +1 for a perfect correlation. MBE is a measurement of accuracy. MBE shows the difference between the expected value of rainfalls (forecast) and its true value (observation). MBE could be negative or positive, while values closer to zero are more preferable. MAE measures residual errors and provides information about the difference between the observed and forecasted values, while smaller values are more preferable. d 2 compares the difference between the simulated and observed rainfall means and represents the degree of error in simulations [50][51][52]. d 2 varies between 0, for complete disagreement, and 1 for perfect agreement between the observed and predicted data. CMC provides an informative measure of prediction performance and varies between 0 for weak and 1 for perfect performance [53]. MSE, the second moment of the bias (where bias is defined as an average of all errors), measures the average squares of the errors or deviations, which is the difference between the observed and forecasted rainfall. The MSE is a non-negative measure and values closer to zero are preferable.

Results and Discussion
In this section, results, for selecting the set of predictors, models' development for rainfall simulation and then comparing the models' performances, are presented according to the order of the methodology steps introduced in Figure 2.

MRMR Method: Selecting the Most Effective Predictors for Rainfall Forecasting
Tables 4 and 5 list the selected 50 predictors identified using the MRMR method to forecast RMAX and RCU. In these Tables, rank represents the preference order of the variable relative to the rainfall forecast, i.e., the lower ranks are associated with more appropriate predictors.
Tables 4 and 5 confirm that among the large scale climate signals, geopotential height (GH) is repeated more than the other signals (for all characteristic locations L 1 to L 4 ). Then, SST and SLP signals and anomalies are reported more frequently. Moreover, these tables suggest that historical values of rainfall (i.e., RMAX and RCU with different lag times) could be of significant effect for rainfall forecasting. For RMAX variables, the most repeated lag times are 1 and 12, while for RCU variables, lag times of 3 and 12 months are repeated more than the others. Among the characteristic locations, for both RMAX and RCU, L 4 is repeated more frequent than the other locations (14 times for RMAX and 19 times for RCU).

Application of ANNs
Using the automated script in MATLAB, different possible structures of ANN models (various transfer functions, training algorithms and input variables) are tested and their performances to simulate the rainfall timeseries are reported and compared. Number of inputs to the networks is allowed to vary between 4 and 50. In other words, initially, the first 4 variables, listed in Table 4 (for RMAX) and Table 5 (for RCU), are used for simulation, and then the number of variables is increased until all the variables shown in these tables are used as inputs to the ANN models. Comparing the simulation results based on the performance metrics indicate that the models with "logsig" transfer function and "traingdx" back propagating algorithm perform better than other structures for simulation of RMAX. However, for the simulation of RCU, ANN models with "tansig" transfer function and "traingdx" back propagation algorithm revealed better simulation performance. Figure 4 illustrates the NSE values for different ANNs with varying input variables and the above mentioned structures to simulate the RCU and RMAX. This figure is intended to depict how changes in the number of predictors could affect the modeling performance. The NSE metric shows the overall simulation performance of the model in the calibration and validation periods. Based on Figure 4a for RCU, the model with the first 47 predictors listed in Table 5 has the maximum value of NSE. Similarly, as shown in Figure 4b, the best ANN model uses the first 32 predictors listed in Table 4 for the simulation of RMAX. Number of neurons in the hidden layer for both models is 10.

Application of ANNs
Using the automated script in MATLAB, different possible structures of ANN models (various transfer functions, training algorithms and input variables) are tested and their performances to simulate the rainfall timeseries are reported and compared. Number of inputs to the networks is allowed to vary between 4 and 50. In other words, initially, the first 4 variables, listed in Table 4 (for RMAX) and Table 5 (for RCU), are used for simulation, and then the number of variables is increased until all the variables shown in these tables are used as inputs to the ANN models. Comparing the simulation results based on the performance metrics indicate that the models with "logsig" transfer function and "traingdx" back propagating algorithm perform better than other structures for simulation of RMAX. However, for the simulation of RCU, ANN models with "tansig" transfer function and "traingdx" back propagation algorithm revealed better simulation performance. Figure 4 illustrates the NSE values for different ANNs with varying input variables and the above mentioned structures to simulate the RCU and RMAX. This figure is intended to depict how changes in the number of predictors could affect the modeling performance. The NSE metric shows the overall simulation performance of the model in the calibration and validation periods. Based on Figure 4a for RCU, the model with the first 47 predictors listed in Table 5 has the maximum value of NSE. Similarly, as shown in Figure 4b, the best ANN model uses the first 32 predictors listed in Table 4 for the simulation of RMAX. Number of neurons in the hidden layer for both models is 10.   Table 6 indicates the simulation performance of selected ANN models for the simulation of RCU and RMAX.   Table 6 indicates the simulation performance of selected ANN models for the simulation of RCU and RMAX. As Table 6 shows, although the simulation performance of the models in the calibration period is noticeably high, during the validation period the models do not perform well, particularly considering the NSE metric. Moreover, the number of selected predictors is relatively high (47 for RCU and 32 for RMAX). As mentioned in the methodology section, 10% of data are used for testing the models' performance for forecasting rainfall. The structure of the models with the highest simulation performance in the calibration and validation period is saved. These models are then used with the testing data, as input, for forecasting rainfall. The results indicate that the performance of ANN models for the test period is not acceptable since they forecast negative values for rainfall. The models' functionality in the validation and during the test periods confirms that machine learning ANN models fail in forecasting rainfall.

Application of TOPSIS
Data from April 1949 to December 2015 are used to build the decision matrix. From these data, the last 10% of the rainfall events are not used in the building of the matrix. Data in this period are considered as the forecasting targets (TOPSIS will look for the closest alternatives to these events among the other historical rainfalls, based on comparing the distance between the corresponding sets of predictors).
All possible combinations of input criteria (i.e., 50 n where 1 ≤ n ≤ 50), as predictors, are investigated to build the decision matrices for forecasting RCU and RMAX. Criteria are selected among those 50 predictors selected by MRMR shown in Tables 4 and 5. For each value of n, the evaluation metrics are estimated for each individual combination and the best combination with the highest evaluation metric is picked. Results show that increasing n improves the modeling performance. Variations of d2, CMC and NSE performance metrics against the number of predictors for RCU and RMAX are shown in Figure 5. It can be seen that for n > 5, increases in the values of metrics are not significant. Moreover, running the models to check the performance of different combinations takes a considerable time (e.g., simulation run time can take up to a month for 15,890,700 combinations of variables when n = 6 → 50 6 ).
Therefore, given the negligible increase in the simulation performance after n = 5, the maximum value of n is deemed to be 7.
NSE performance metrics against the number of predictors for RCU and RMAX are shown in Figure 5. It can be seen that for n > 5, increases in the values of metrics are not significant. Moreover, running the models to check the performance of different combinations takes a considerable time (e.g., simulation run time can take up to a month for 15,890,700 combinations of variables when n = 6  50 6 ). Therefore, given the negligible increase in the simulation performance after n = 5, the maximum value of n is deemed to be 7. Seven variables that are shown to result in the best performance of the TOPSIS model for forecasting RCU are shown in Table 7. Same wise, Table 8 indicates the six variables that resulted in the highest simulation performance of TOPSIS for RMAX.   Tables 7 and 8 show how effective are GH and SST climate signals (with different lag times and at different characteristic locations) on rainfall forecasting. As for RCU, historical rainfall (cumulative precipitation with 3 months lag time and maximum precipitation with 2 months lag time) are also among the selected set of predictors. GH, SST and historical rainfall were also observed among the most repeated variables for rainfall simulation with the ANN models. Figure 6 compares the observed monthly rainfalls (from November 2011 to November 2015) with those forecasted by TOPSIS (i.e., selected alternatives among the historical rainfall events from April 1949 to October 2011) for RCU and RMAX. To find these alternatives, TOPSIS finds the best Seven variables that are shown to result in the best performance of the TOPSIS model for forecasting RCU are shown in Table 7. Same wise, Table 8 indicates the six variables that resulted in the highest simulation performance of TOPSIS for RMAX.   Tables 7 and 8 show how effective are GH and SST climate signals (with different lag times and at different characteristic locations) on rainfall forecasting. As for RCU, historical rainfall (cumulative precipitation with 3 months lag time and maximum precipitation with 2 months lag time) are also among the selected set of predictors. GH, SST and historical rainfall were also observed among the most repeated variables for rainfall simulation with the ANN models. Figure 6 compares the observed monthly rainfalls (from November 2011 to November 2015) with those forecasted by TOPSIS (i.e., selected alternatives among the historical rainfall events from April 1949 to October 2011) for RCU and RMAX. To find these alternatives, TOPSIS finds the best and worst ideal solutions among the sets of predictors (i.e., timeseries of 7 criteria for RCU shown in Table 7, and timeseries of 6 criteria for RMAX shown in Table 8). These solutions have, respectively, the minimum and maximum distance from the values of the same criteria corresponding to each rainfall event subjected for forecasting. Forecasting rainfall by TOPSIS is based on comparing the current teleconnection conditions (the values of climate signals) with those already occurred in the past, then finding the most similar conditions and expecting the corresponding value of past rainfall to occur for the considered current conditions. This method works well when, on average, similar weather conditions happen through time. As shown in Figure 6, for both RCU and RMAX, some of the peak observed values are not caught by TOPSIS. This means that the recorded historical events do not incorporate weather conditions similar to those correspond to the extreme observed events.  Table 9 further illustrates the numerical performance of TOPSIS model for rainfall forecasting based on several evaluation metrics. Comparing the results with those obtained from ANN model shows that both modeling approaches outperform the simulation of monthly cumulative rainfall against maximum daily rainfall in a month. Since, as explained in the methodology and shown by Equation (13), for each observed rainfall event, TOPSIS identifies the 10 of the closest alternatives, it is possible to determine a range of   Table 9 further illustrates the numerical performance of TOPSIS model for rainfall forecasting based on several evaluation metrics. Comparing the results with those obtained from ANN model shows that both modeling approaches outperform the simulation of monthly cumulative rainfall against maximum daily rainfall in a month. Since, as explained in the methodology and shown by Equation (13), for each observed rainfall event, TOPSIS identifies the 10 of the closest alternatives, it is possible to determine a range of variation (a confidence interval) for future rainfall looking into these 10 alternatives. To determine the lower and upper values for this range, the minimum and maximum values of rainfall among the 10 alternatives are identified and used. Figure 7 shows the obtained range of variation as the forecasted window for values of RCU and RMAX, from November 2011 to November 2015. Then, it is investigated if the observed value of rainfall, at each time step, falls within the identified forecasted window. Results show that 48 out of 50 events for RCU and 46 out of 50 events for RMAX fall in the identified range at each time. Therefore, the developed TOPSIS model, not only performs relatively well in forecasting the individual values of rainfall, but also is capable of estimating a forecasting window for future RCU and RMAX with the accuracy of 96% and 92%, respectively.

Summary and Conclusions
In this study, a framework is suggested for forecasting rainfall for Vancouver area, BC, Canada. The monthly and seasonal large scale climate signals, at the identified low and high pressure characteristic locations in the North Pacific Ocean, and the extreme and cumulative monthly rainfall for the study area are investigated to develop long lead rainfall forecasting models. A feature selection method (MRMR) is used to select the most effective predictors among the set of climate variables identified for forecasting rainfall for western Canadian regions. Then, two approaches are examined for rainfall forecasting. The first approach is based on MLP data driven models (i.e., ANNs) and the second one is designed using a multi-criteria decision analysis method (i.e., TOPSIS). ANNs are known as powerful tools when used for the aim of modeling and simulation. However, based on the results, they fail for forecasting purposes. Although ANNs' performance in the calibration period is promising, they do not show acceptable performance in the validation and then testing period (which corresponds to the data that are not used for the networks training in the calibration and  Results show that 48 out of 50 events for RCU and 46 out of 50 events for RMAX fall in the identified range at each time. Therefore, the developed TOPSIS model, not only performs relatively well in forecasting the individual values of rainfall, but also is capable of estimating a forecasting window for future RCU and RMAX with the accuracy of 96% and 92%, respectively.

Summary and Conclusions
In this study, a framework is suggested for forecasting rainfall for Vancouver area, BC, Canada. The monthly and seasonal large scale climate signals, at the identified low and high pressure characteristic locations in the North Pacific Ocean, and the extreme and cumulative monthly rainfall for the study area are investigated to develop long lead rainfall forecasting models. A feature selection method (MRMR) is used to select the most effective predictors among the set of climate variables identified for forecasting rainfall for western Canadian regions. Then, two approaches are examined for rainfall forecasting. The first approach is based on MLP data driven models (i.e., ANNs) and the second one is designed using a multi-criteria decision analysis method (i.e., TOPSIS). ANNs are known as powerful tools when used for the aim of modeling and simulation. However, based on the results, they fail for forecasting purposes. Although ANNs' performance in the calibration period is promising, they do not show acceptable performance in the validation and then testing period (which corresponds to the data that are not used for the networks training in the calibration and validation periods). Moreover, a large number of predictors (47 for RCU and 32 for RMAX) have to be used to obtain a high simulation performance for ANNs. In contrast, the developed TOPSIS model, TOPSIS, performed well for rainfall forecasting with a few number of predictors (6 for RCU and 7 for RMAX). The TOPSIS model also shows high capability in forecasting the domain of rainfall occurrence (future confidence interval).
Occurrence of flooding is an un-stoppable reality, and reliable flood forecasting is a serious challenge that most of the Canadian provinces are dealing with. Heavy rainfall events are one of the main reasons for river overbanking, extreme freshets and surface runoff flooding in urban areas. Given the great uncertainty associated with hydro-meteorological predictions, the development of models for real time (e.g., one month lead time in this study) extreme rainfall forecasting provides an insight to the evaluation of possible weather conditions in a region. These forecasts, with an acceptable accuracy, are beneficial in short-term operation and management of water resources. This paper, as a pioneer study in forecasting long lead rainfall for western Canadian watersheds, shows how large scale climate signals can be effectively used to provide a reliable estimate for future rainfall. Forecasted maximum rainfall provides valuable information for the prediction of surface runoff and potential inland flood. The predicted range of variation for rainfall can also offer input for hydrologic modeling when uncertainties are considered to be incorporated in the analysis.
A small catchment has been selected in this study to develop the methodology. However, to expand this study, larger areas with multiple rainfall gauge stations can be selected to investigate the application of more rainfall data as input as well as if the proposed method could successfully forecast the range and value of an average value for rainfall for the whole study area. The data driven model which is used in this study, i.e., FFNN, showed low simulation performance in the validation period and not acceptable performance in the forecasting time. Other structures of artificial networks are suggested to be developed and checked for rainfall forecasting. Considering thousands of different combinations of predictors that result in a long run time for the TOPSIS model, desktop computers with higher speed configuration could be used to afford the computational efforts and investigate all possible combinations. Moreover, in constructing the decision matrix in the TOPSIS method, assigning unequal weights to the predictors could be analyzed. In addition to investigating the forecast of maximum daily rainfall, application of monthly large scale climate signals for forecasting average monthly rainfall could also be analyzed.
Author Contributions: Zahra Zahmatkesh designed the multi-criteria decision analysis model for rainfall forecasting. Erfan Goharian developed the neural networks. Zahra Zahmatkesh and Erfan Goharian analyzed the results. Zahra Zahmatkesh wrote the paper and Erfan Goharian reviewed and provided feedback on paper.

Conflicts of Interest:
The authors declare no conflict of interest.