Artiﬁcial Intelligence Based Ensemble Modeling for Multi-Station Prediction of Precipitation

: The aim of ensemble precipitation prediction in this paper was to achieve the best performance via artiﬁcial intelligence (AI) based modeling. In this way, ensemble AI based modeling was proposed for prediction of monthly precipitation with three different AI models (feed forward neural network-FFNN, adaptive neural fuzzy inference system-ANFIS and least square support vector machine-LSSVM) for the seven stations located in the Turkish Republic of Northern Cyprus (TRNC). Two scenarios were examined each having speciﬁc inputs set. The scenario 1 was developed for predicting each station’s precipitation through its own data at previous time steps while in scenario 2, the central station’s data were imposed into the models, in addition to each station’s data, as exogenous input. Afterwards, the ensemble modeling was generated to improve the performance of the precipitation predictions. To end this aim, two linear and one non-linear ensemble techniques were used and then the obtained outcomes were compared. In terms of efﬁciency measures, the averaging methods employing scenario 2 and non-linear ensemble method revealed higher prediction efﬁciency. Also, in terms of Skill score, non-linear neural ensemble method could enhance predicting efﬁciency up to 44% in the veriﬁcation step.


Introduction
Precipitation is the most important component of the hydrological cycle and accurate prediction of precipitation plays critical roles in the design, planning and management of water resources and hydraulic structures.However, due to the complex, non-linear and stochastic nature of precipitation time series, its prediction is a quite difficult task.
The models for prediction of hydro-climate parameters (e.g., precipitation) are usually classified to 2 classes: physically based and black box models.The physically based model uses physical rules for modeling all of the proper physical processes involved in the precipitation procedure.On the other hand, black box models use historically observed data to make further estimations.Such black box methods are particularly developed on the basis of statistical and computational intelligence approaches.Although conceptual approaches are dependable methods to analyze the physics of the phenomena, they may show restrictions such as complexity, being time-consuming, and there being a lack of enough data for modeling; the physically-based models in comparison to black box models in comparison to black box models, deliver somewhat inaccurate results when there are not sufficient distributed information and data within the system.[1,2].So, once the accurate estimations for the process are more crucial than the physical interpretations, utilizing data driven (black box) methods will be better alternative [2].Recently, artificial intelligence (AI) methods as such black box methods showed great efficiency in modeling the dynamic precipitation process in the presence of the non-linearity, uncertainty, and irregularity of data.Comparative researches have shown that the AI-based models may generate reliable results in precipitation predictions with regard to the physically-based models [3,4].It should be mentioned that the more accurate and sufficient data (long-term recorded data in terms of quantity and accurate recorded data in terms of quality) are available for calibrating the model, the obtained results will be more accurate and reliable in the long run.But due to the magnifying the prediction error over the next time steps of forecasting such data driven (AI-based) models are more accurate in short term forecasting issues.One of the most commonly used AI methods for the precipitation modeling is feed forward neural network (FFNN) which is a common type of artificial neural network (ANN).In the recent decades, FFNN has acquired increasing popularity due to its flexibility and robustness to detect involved patterns in the various range of data.For example, Guhathakurta [5] employed ANN for prediction of the monthly precipitation over 36 meteorological stations of India to estimate the monsoon precipitation of upcoming years.The model could catch non-linear interactions among input and output data and estimate the seasonal rainfall.Hung et al. [6] employed ANN for real time precipitation predicting and flood management in Bangkok, Thailand.It was found out that the most dominant input in modeling is rainfall at previous time steps (as a Markovian process).Likewise, Abbot and Marohasy [3] predicted monthly and seasonal precipitations up to 3 months in advance over Queensland, Australia, by using dynamic, recurrent and time-delay ANNs.More recently, Khalili et al. [7] employed the Hurst rescaled range statistical analysis to evaluate the predictability of the available data for monthly precipitation prediction for Mashhad City, Iran.Devi et al. [8] applied ANNs for forecasting the rainfall time series using the temporal and spatial rainfall intensity data and proved wavelet Elman models as the best model for rainfall forecasting.Mehdizadeh et al. [9] introduced two novel hybrid models of artificial neural networks-autoregressive conditional heteroscedasticity (ANN-ARCH) and gene expression programming-autoregressive conditional heteroscedasticity (GEP-ARCH) for forecasting monthly rainfall time series.They indicated that GEP-ARCH and ANN-ARCH methods could lead to reliable outcomes for the studied regions with different climatic conditions.They also revealed that ANN-ARCH method can present more reliable results with regard to GEP-ARCH method.
As another type of AI model, the least square support vector machine (LSSVM), first proposed by Ye and Xiong [10], is one of the most effective predicting methods as an alternative method of ANN.The LSSVM has been used for non-linear classification problems [11] and, function and density estimations [12].The LSSVM is a machine learning algorithm to claim a model dealing with complicated classification problems.The LSSVM is capable of predicting non-linear, non-stationary, and stochastic processes [13].The LSSVM has been successfully used for the prediction of precipitation in the last decade [14,15].Lu and Wang [16] forecasted the monthly precipitation over a state in China employing LSSVM method using several kernel functions.Using the available observed data of 2 different stations from Turkey, Kisi and Cimen [17] employed the LSSVM with and without a wavelet-based data pre-processing technique for prediction of precipitation time series.Du et al. [18] employed an SVM-based precipitation forecasting method and reported the promising and effective results in the field of precipitation prediction.More recently, Danandeh Mehr et al. [19] developed a hybrid regression method on the basis of the support vector regression (SVR) and firefly algorithm (FFA) for precipitation prediction of rain gauges in Iran with promising accuracy.The outcomes revealed that the proposed combined method can significantly outperform the single SVR and GEP methods.
In addition to the ANN and LSSVM methods, the adaptive neural fuzzy inference system (ANFIS) model, which incorporates both the ANN learning power and fuzzy logic knowledge representation, has been considered as a robust model for precipitation prediction because of fuzzy concept ability in handling the uncertainty involved in the study processes [20].ANFIS is a Takagi-Sugeno-Kang (TSK) fuzzy based mapping algorithm which provides less overshoot, oscillation and minimal training time [21].The ANFIS can analyze the relationship involved in the input and output data sets via a training scheme to optimize the parameters of a given fuzzy inference system (FIS) [22].The ANFIS training can use alternative algorithms to decrease the training error.Some previous investigations indicated that ANFIS can be used as an efficient tool for precipitation modeling (e.g., see, [23][24][25][26]).Yaseen et al. [27] employed a hybrid ANFIS-FFA model to forecast one month ahead precipitation value and compared the results with the classic ANFIS model.The results showed that, the proposed ANFIS-FFA method could perform more accurate than the classic ANFIS method, so that the efficiency of the ANFIS-FFA and ANFIS methods were strongly governed by size of the inputs set.
With the recent developments in the AI techniques, although ANN, ANFIS and LSSVM have been reliably employed to model time series of various hydro-climatic variables (including precipitation), it is obvious that for a particular problem, different outcomes may be obtained from different models over different spans of the time series.As such, Zhang [28] used a hybrid model of ANN and auto regressive integrated moving average (ARIMA) models for time series prediction, and suggested that the combination of the models is an effective way to increase prediction accuracy.Li and Sankarasubramanian [29] presented a new dynamic approach for combining multiple hydrological models evaluating the performance during prediction and also a weighted averaging method to reduce the modeling uncertainty in monthly streamflow prediction.Yamashkin et al. [30] confirmed that reliability, objectivity, and accuracy of the analysis are increased by the use of ensemble systems.Sharghi et al. [31] indicated that performance of the seepage modeling can be enhanced by the ensemble method up to 20%.
The ensemble precipitation prediction is a set of predictions that presents the range of rainfall prediction possibilities with a minimized error.The uncertainty associated with any prediction indicates that different scenarios are possible and the prediction must reflect them.By providing a range of possible outputs, the model shows how likely various scenarios come true in the months ahead, and which methods are useful and for how long they are useful for the future forecasts.
The main aim of this study is to utilize the models ensemble concept for precipitation prediction employing data from seven stations located in Turkish Republic of Northern Cyprus.The proposed ensemble techniques are formed using the outputs of the ANN, ANFIS, and LSSVM methods to improve the efficiency of single modeling.The three techniques of ensembling, which are simple, weighted and non-linear neural averaging, are applied in this way.Furthermore, two input combinations as scenarios 1 and 2, are considered for modeling with different input combinations.Although the ensemble approaches have been focused during the last decades at different engineering fields [28,32,33], to the best knowledge of the authors, this paper presents the first AI-based ensemble approach for precipitation prediction.

Used Data and Efficiency Criteria
Cyprus is located at approximately 35 • N and 33 • E, in the east end of the Mediterranean Sea, and is ~224 km WSW to ENE, and ~97 km NNW-SSE with a land area of approximately 9250 km 2 (Figure 1).The island has two mountain ranges-the Troodos Massif (maximum elevation 1951 m) in the southwest and the Pentadaktylos (Girne) range (maximum height 1000 m) along the northern coast, which give Cyprus high topographical variability [34].
The climate of North Cyprus is typical Mediterranean with hot dry summers where the average temperature can reach up to 40 • C. In cool winter months the lowest temperature tends to be around 10 • C.
Data from seven main stations (automatic sensors are usually used to measure the precipitation data in TRNC) were used in this study to predict the precipitation (see Figure 1).( 1 It should be mentioned that Automatic sensors are usually used to measure the precipitation data in TRNC which work with solar energy and battery system and precipitation is loaded into the data loggers and then data is collected with GPRS in every 15 minutes.Also the fine adjustment and calibration of the sensors are handled based on the international standards.The sensors' accuracy and sensitivity are ±2% and 0.2 mm, respectively.Figure 2 shows the situation of rain gauge with the installation equipment.Also specifications of the rain gauges are tabulated in Table 1.
For training and validation of the models, the monthly data were obtained from these 7 meteorological stations for ten years, from 1 January 2007, to 31 December 2016.The characteristics of the stations and also the statistics of the data from stations are tabulated in Table 2.It should be mentioned that Automatic sensors are usually used to measure the precipitation data in TRNC which work with solar energy and battery system and precipitation is loaded into the data loggers and then data is collected with GPRS in every 15 minutes.Also the fine adjustment and calibration of the sensors are handled based on the international standards.The sensors' accuracy and sensitivity are ±2% and 0.2 mm, respectively.Figure 2 shows the situation of rain gauge with the installation equipment.Also specifications of the rain gauges are tabulated in Table 1.
For training and validation of the models, the monthly data were obtained from these 7 meteorological stations for ten years, from 1 January 2007, to 31 December 2016.The characteristics of the stations and also the statistics of the data from stations are tabulated in Table 2.     Usually, as a conventional method, linear correlation coefficient (CC) is computed between potential inputs and output to select most dominant input variables for the AI methods such as FFNN [35].However, implementation of CC for dominant input selection has been already criticized (e.g., see, [36]) since for modeling a non-linear process by a non-linear approach like FFNN, it will be more feasible to employ a non-linear criterion (e.g., mutual information (MI)) since in spite of a weak linear relation, strong non-linear relationships may be existing among input and output parameters.The MI value between random variables of X and Y can be written in the form of [37]: where x and y are the probability distributions of variables X and Y; H(x) and H(y) show respectively the entropies of distributions x and y, and H(x,y) is their joint entropy as: where P XY (x,y) is the joint distribution.The MI values between the observed precipitation time series of all seven stations relative to each other were calculated and tabulated in Table 3.As it can be seen from Table 3, overall, Ercan's precipitation data are more non-linearly correlated with the precipitation time series of other stations, maybe due to its central position with regard to the others.For instance, the auto-correlation function (ACF) plots (Correlogram) of Ercan and Lefkoşa precipitation time series are presented in Figure 3.As it can be seen from Figure 3, the precipitation time series of some stations such as Ercan station are more auto-correlated with 1 and 12-month lags, whereas the precipitation time series of some other stations such as Lefkoşa station are more auto-correlated with 1, 2, and 12-month lags.As noticed previously, CC is unable to recognize the non-linear relation between time series.Therefore, in continue the MI was employed to determine the non-linear relation between precipitation time series and their lag times.So, it was recognized that the precipitation time series are mostly correlated non-linearly with 1 and 12 month lags in all stations which denotes to both auto-regressive (Markovian) and seasonality of the process.
Beside computing auto-correlation function, for testing the normality of data, Kolmogorov Smirnov test [38] was used and results indicated that the data of all 7 stations are non-normal; so non-parametric tests should be applied to these datasets.Next, the Run test [39,40] was employed for testing randomness of precipitation time series of each station.Results of Run test at 95% confidence level indicated that precipitation of all stations are not random so that the precipitation of all stations are predictable.Also to check data homogeneity, Pettitt's test [41], Standard normal homogeneity test (SNHT) [42], Buishand's test [43] and Chi-square test [44] were applied to data of all stations which probed that data of stations are homogenous.
whereas the precipitation time series of some other stations such as Lefkoşa station are more autocorrelated with 1, 2, and 12-month lags.As noticed previously, CC is unable to recognize the nonlinear relation between time series.Therefore, in continue the MI was employed to determine the non-linear relation between precipitation time series and their lag times.So, it was recognized that the precipitation time series are mostly correlated non-linearly with 1 and 12 month lags in all stations which denotes to both auto-regressive (Markovian) and seasonality of the process.Beside computing auto-correlation function, for testing the normality of data, Kolmogorov Smirnov test [38] was used and results indicated that the data of all 7 stations are non-normal; so nonparametric tests should be applied to these datasets.Next, the Run test [39,40] was employed for testing randomness of precipitation time series of each station.Results of Run test at 95% confidence level indicated that precipitation of all stations are not random so that the precipitation of all stations Prior to the modeling, the monthly average precipitation data were first normalized by [45]: where P norm is the normalized value of the P (t) ; P max(t) and P min(t) are the max and min values of the observed data, respectively.For training and verifying purposes, the data were divided to 2 sub-sets.About 70% of whole data were used for calibration and the rest 30% of data were used for verifying the trained methods.
The "root mean square error (RMSE)" and "determination coefficient (DC)" were used to evaluate the prediction efficiency of the models as [46]: where n is the data number, P obsi is the observed data, and P comi is the predicted (computed) data.DC ranges from −∞ to 1 with a perfect score of 1 and RMSE ranges from 0 to +∞ with the perfect value of 0. Legates and McCabe [47] showed that any hydro-environmental method may be adequately evaluated by DC and RMSE criteria.Also the "Skill" of the proposed methodology was calculated as [48]: where A is the measure of accuracy (such as RMSE or DC), A ref is the set of reference predictions and A perf is a perfect prediction (what actually happened).Skill scores have a range of (∞,1], where a score of 1 presents perfect model performance, a score of 0 means the model is as accurate as the reference model, and a negative score means the model is less accurate than the reference.

Proposed Methodology
In this study, firstly, the monthly precipitation data were normalized by Equation (3).Three different black box models, ANN (a commonly used AI method), ANFIS (an AI method which serves Fuzzy tools to handle the uncertainties involved in the process) and LSSVM (more recently developed AI model), were separately created on the basis of two different scenarios.Then, outputs of the single models were ensembled using 3 ensemble techniques as linear simple and weighted averaging and non-linear neural ensemble methods.The inputs of the ensemble unit were outputs of the single models.The modeling was done via two scenarios.In scenario 1, each station's own data at previous time steps were used for predicting the same station's precipitation at current time step, while in scenario 2, another station's data in addition to each station's data were used for modeling to enhance the prediction performance.

First Scenario
For modeling via the first scenario, the aim was to predict precipitation value using its values at previous time steps.So, the prediction of the precipitation could be patterned as: i denotes to the station name (as Ercan, Gazima gusa, Geçitkale, Girne, Guzelyurt, Lefkoşa and Yeni Erenkoy stations) and P i t−1 , P i t−12 are the precipitation values of ith station corresponding to time steps t−1 and t−12 (or 1 and 12 months ago).
The conceptual model of the ensemble system for scenario 1 involving ANN, ANFIS and LSSVM single models is shown by Figure 4.The conceptual model of the ensemble system for scenario 1 involving ANN, ANFIS and LSSVM single models is shown by Figure 4.

P(t−1) and P(t−12) are previous monthly precipitation values corresponding to 1 and 12 months ago; PFFNN(t), PANFIS(t) and PLSSVM(t) are results of predictions (in current month
) by different models.The argumentation of using P(t−1) and P(t−12) as inputs for prediction of P(t) is supported by the following: a) As shown by some previous studies [3,6,27] in modeling precipitation, as a Markovian (auto-regression) process, P(t) is more correlated with precipitation values at prior time steps as P(t−1) and so on.For this reason, it is feasible to select previous time steps values as inputs for the AI models.According to Figure 3a,b, and also employing MI, as a non-linear correlating identifier, the lag times of 1 was selected as the dominant input in scenario 1 for all stations.b) Selection of input P(t−12) is related to the seasonality of the precipitation phenomenon.It means that due to the seasonality of the process (i.e., periodicity), the precipitation value of the current month has a strong relation (similarity) with the precipitation level in the same month at previous year.As it can be seen in Figure 3a,b, the precipitation is much correlated with the precipitation values with the values obtained 12 months ago.It should be noted that the CC could determine the linear correlation between two time series and it is unable to recognize the non-linear P(t−1) and P(t−12) are previous monthly precipitation values corresponding to 1 and 12 months ago; P FFNN (t), P ANFIS (t) and P LSSVM (t) are results of predictions (in current month) by different models.The argumentation of using P(t−1) and P(t−12) as inputs for prediction of P(t) is supported by the following: (a) As shown by some previous studies [3,6,27] in modeling precipitation, as a Markovian (auto-regression) process, P(t) is more correlated with precipitation values at prior time steps as P(t−1) and so on.For this reason, it is feasible to select previous time steps values as inputs for the AI models.According to Figure 3a,b, and also employing MI, as a non-linear correlating identifier, the lag times of 1 was selected as the dominant input in scenario 1 for all stations.
(b) Selection of input P(t−12) is related to the seasonality of the precipitation phenomenon.It means that due to the seasonality of the process (i.e., periodicity), the precipitation value of the current month has a strong relation (similarity) with the precipitation level in the same month at previous year.As it can be seen in Figure 3a,b, the precipitation is much correlated with the precipitation values with the values obtained 12 months ago.It should be noted that the CC could determine the linear correlation between two time series and it is unable to recognize the non-linear relation.Hence, MI was used to confirm the selection of dominant inputs for the modeling.

Second Scenario
In scenario 2, the prediction formula (7) was modified by introducing precipitation value from Ercan station P Ercan t as exogenous input.So, the following equation could be considered to formulate this scenario: In scenario 2, it was tried to use the data from another station as exogenous inputs to enhance the modeling efficiency.In this way, the data from Ercan station were also considered as input data for modeling all other stations.This was due to the fact that the position of Ercan station is central in comparison to the other stations and therefore has more non-linear correlation with other stations (see Table 3).Also, the location of this station is of vital importance as it is the main airport of TRNC, so the data obtained from here may be more accurate and complete.Thus, the data obtained from the Ercan station were considered as exogenous input in the modeling via scenario 2.
Employing scenario 2 can be more helpful for predicting the precipitation of stations when they get out of service (due to technical problems) using their available past observations as well as data from Ercan station.

Feed Forward Neural Network (FFNN) Concept
The artificial neural network as an AI-based model is a mathematical model aiming to handle non-linear relationship of input-output data set [49].ANN has proved to be effective with regards to complex function in various fields, including prediction, pattern recognition, classification, forecasting, control system and simulation [50,51].Among the different ANN algorithms, FFNN with back propagation (BP) training is widely applied and is the most common class of ANNs.In FFNN-BP, the network is trained by processing the input data through the network and it is transferred to the output layer, and the generated error propagated back to the network until the desired output is archived.The primarily strategy of FFNN-BP is to reduce the error, so that the ANN is trained by the training data set and can predict the correct output [52].FFNN includes 3 layers of input, hidden and output.
In this study, the input layer consisted of combinations of P(t−1), P(t−12) and the target was P(t) as shown in Figure 5.Both the architecture (the number of neurons, number of layers, transfer function) and learning rate is usually determined using the trial-and error process.The sigmoid activation function was employed for input and hidden layers while in the output layer, a linear function was applied in the used FFNN models [53].The developed ANN structure is illustrated by Figure 5.
In this study, the input layer consisted of combinations of P(t−1), P(t−12) and the target was P(t) as shown in Figure 5.Both the architecture (the number of neurons, number of layers, transfer function) and learning rate is usually determined using the trial-and error process.The sigmoid activation function was employed for input and hidden layers while in the output layer, a linear function was applied in the used FFNN models [53].The developed ANN structure is illustrated by Figure 5.

Adaptive Neural Fuzzy Inference System (ANFIS) concept
The conjunction of ANN and fuzzy system presents a robust hybrid system which is capable of solving complex nature of the relationships [21,54].ANFIS is a multi-layer feed-forward (MLFF) neural network that is capable of integrating the knowledge of ANN and fuzzy logic algorithms which maps the set of inputs to the outputs [51].ANFIS as AI-based model employs the hybrid training algorithms which consist of a combination of BP and least squares method [55].In addition, in terms of learning duration, the ANFIS model is very short in comparison with the ANN model [53].The schematic of the ANFIS model is depicted by Figure 6.

Adaptive Neural Fuzzy Inference System (ANFIS) concept
The conjunction of ANN and fuzzy system presents a robust hybrid system which is capable of solving complex nature of the relationships [21,54].ANFIS is a multi-layer feed-forward (MLFF) neural network that is capable of integrating the knowledge of ANN and fuzzy logic algorithms which maps the set of inputs to the outputs [51].ANFIS as AI-based model employs the hybrid training algorithms which consist of a combination of BP and least squares method [55].In addition, in terms of learning duration, the ANFIS model is very short in comparison with the ANN model [53].The schematic of the ANFIS model is depicted by Figure 6.The developed ANFIS consists of two inputs of P(t−1), P(t−12) and one output of P(t) as shown in Figure 6.Among different FISs used as fuzzy operations, the TSK engine was used in this study.The operation of ANFIS to generate the output function with 2 input vectors of P(t−1), P(t−12) and the first order of TSK applied to 2 fuzzy rules can be expressed as [26]: Rule (1): if µ(P(t−1)) is B1 and µ(P(t−12)) is C1 then f1=p1(P(t−1)) + q1(P(t−12)) + r1 Rule (2): if µ(P(t−1)) is B2 and µ(P(t−12)) is C2 then f1=p2(P(t−1)) + q2(P(t−12)) + r2 B1, B2, and C1, C2 are membership functions parameters, for inputs P(t−1) and P(t−12) and p1, q1, r1 and p2, q2, r2 are outlet functions' variables, the structure and formulation of ANFIS follows a five-layer neural network structure.For more explanation of ANFIS, the reader is referred to the studies of [20,25].The developed ANFIS consists of two inputs of P(t−1), P(t−12) and one output of P(t) as shown in Figure 6.Among different FISs used as fuzzy operations, the TSK engine was used in this study.The operation of ANFIS to generate the output function with 2 input vectors of P(t−1), P(t−12) and the first order of TSK applied to 2 fuzzy rules can be expressed as [26]: Rule (1): if µ(P(t−1)) is B1 and µ(P(t−12)) is C1 then f1=p1(P(t−1)) + q1(P(t−12)) + r1 Rule (2): if µ(P(t−1)) is B2 and µ(P(t−12)) is C2 then f1=p2(P(t−1)) + q2(P(t−12)) + r2 B1, B2, and C1, C2 are membership functions parameters, for inputs P(t−1) and P(t−12) and p1, q1, r1 and p2, q2, r2 are outlet functions' variables, the structure and formulation of ANFIS follows a five-layer neural network structure.For more explanation of ANFIS, the reader is referred to the studies of [20,25].

Least Square Support Vector Machine (LSSVM) concept
Learning in the context of SVM was proposed and introduced by [56], which provides a satisfactory approach to the problems of prediction, classification, regression and pattern recognition.SVM is based on the concept of machine learning which consists of data-driven model [56].The structural risk minimization and statistical learning theory are two useful functions of SVM which make it different from ANN because of its ability to reduce the error, and complexity and increases the generalization performance of the network.Generally, SVM is categorized into linear support vector regression (L-SVM) and non-linear support vector regression (N-SVM) [57].Therefore, support vector regression (SVM) is a form of SVM based on two basic structural layers; the first layer is kernel function weighting on the input variable while the second function is the weighted sum of kernel outputs [56].In SVM, firstly a linear function should fit to data and thereafter, the outcomes are passed through a non-linear kernel function to map non-linear patterns involved in data.The least squares formulation of SVM is called LSSVM.Thus, the solution in this method is obtained through solving a linear equations system.Efficient algorithms can be used in LSSVM [58].In LSSVN modeling a non-linear function can be expressed in the form of [59]: in which f shows relation among the input and output data, w is an m-dimensional weight vector, φ denotes to kernel function mapping input vector x to an m-dimensional feature vector; b stands for the bias [14].The regression problem can be given as follows [10]: which has the following constraints: where γ is the margin parameter and e i is the slack variable for X i .To solve the optimization problem, the objective function may be achieved by altering the constraint problem to the unconstraint problem, according to the Lagrange multiplier α_i as [60]: Vector w in Equation ( 9) should be calculated after solution of the optimization problem in the form of [16]: Therefore, the ultimate formula for LSSVM could be written in the form of: where P(x,x i ) shows kernel function which performs non-linear mapping to the feature space.The Gaussian radial basis function (RBF) is the most commonly used kernel function in LSSVM based modeling in the form of [23]: where γ and σ are the parameters of the kernel function.Figure 7 shows the structure of the LSSVM.
Vector w in equation ( 9) should be calculated after solution of the optimization problem in the form of [16]: Therefore, the ultimate formula for LSSVM could be written in the form of: where P(x,xi) shows kernel function which performs non-linear mapping to the feature space.The Gaussian radial basis function (RBF) is the most commonly used kernel function in LSSVM based modeling in the form of [23]: where γ and σ are the parameters of the kernel function.Figure 7 shows the structure of the LSSVM.

Ensemble unit
In the case that various models have better results at different parts or intervals or in modeling of peak values, it is supposed that by combining (ensembling) the outputs from several prediction methods, the final accuracy of a predicted time series can be improved.In an ensembling process, the outcomes of various models are used and as so, the final outputs will not be sensitive to selection of the best methods.Therefore, predicts of ensemble method will be more safe and less risky than the

Ensemble Unit
In the case that various models have better results at different parts or intervals or in modeling of peak values, it is supposed that by combining (ensembling) the outputs from several prediction methods, the final accuracy of a predicted time series can be improved.In an ensembling process, the outcomes of various models are used and as so, the final outputs will not be sensitive to selection of the best methods.Therefore, predicts of ensemble method will be more safe and less risky than the results of the single best methods.Various studies at different fields of engineering suggested to ensemble outcomes of several methods as an effective approach to improve the performance of time series predictions [32,61].
An ensemble technique, as a learning algorithm, gathers a set of classifiers to classify new variables by applying weights on the single prediction values.The goal of such ensemble learning technique is to develop an ensemble of the individual methods that are diverse and yet accurate.
In current paper, 3 ensemble techniques were applied to combine of the outputs from the used AI-based models to enhance the overall efficiency of the predictions as: (a) the simple linear averaging method: In which P(t) shows the outcome of simple ensemble method.N shows the number of used models (in this study, N=3) and P i (t) stands for the outcome of the ith method (i.e., ANN, ANFIS and LSSVM) in time step t.
(b) the linear weighted averaging method: where i shows imposed weight on the output of ith method and it may be computed on the basis of the performance measure of ith method as: where DC i measures the model efficiency (such as coefficient of determination).
(c) the linear weighted averaging method: For the non-linear neural ensemble method another FFNN model is trained by feeding the outputs of single AI models as inputs to the neurons of the input layer (see Figure 8).Number of hidden layer neurons and maximum epoch numbers are defined through trial-error procedure.

N
In which ) (t P shows the outcome of simple ensemble method.N shows the number of used models (in this study, N=3) and ) (t P i stands for the outcome of the ith method (i.e., ANN, ANFIS and LSSVM) in time step t. b) the linear weighted averaging method: Where i shows imposed weight on the output of ith method and it may be computed on the basis of the performance measure of ith method as: Where DCi measures the model efficiency (such as coefficient of determination).c) the non-linear neural ensemble method: For the non-linear neural ensemble method another FFNN model is trained by feeding the outputs of single AI models as inputs to the neurons of the input layer (see Figure 8).Number of hidden layer neurons and maximum epoch numbers are defined through trial-error procedure.

Results and Discussion
In this section, firstly obtained results of sole models are presented and then the results of the ensemble methods are summarized.

Results and Discussion
In this section, firstly obtained results of sole models are presented and then the results of the ensemble methods are summarized.

Results of Single AI Models
At the first step, FFNN, ANFIS, and LSSVM models were separately created via the proposed scenarios 1 and 2. For precipitation prediction of the stations, monthly precipitation values were individually imposed into ANN, ANFIS, and LSSVM models in order to predict one-month-ahead precipitation.For this purpose, the ANN, ANFIS, and LSSVM models' architectures set depends on the priority of the precipitation process.The monthly precipitation data are described by both Markovian and seasonal properties [13].For this reason, the current precipitation P(t) is related to its previous time steps, P(t−1), as well as the its value at twelve months ago, P(t−12).Consequently, the input values as P(t−1) and P(t−12) were applied to the FFNN, ANFIS, and LSSVM models to predict precipitation at time step t (P(t)) for scenario 1 (including more lagged precipitation values, i.e., P(t−2) and P(t−3) did not show higher MI with output and could not improve the efficiency of the modeling).For scenario 2, one more input, Ercan's station precipitation value as exogenous input, was also considered (in addition to the input of scenario 1) as another input neuron to enhance the prediction performance (Ercan station's precipitation at time step t−1 and t−2 was also examined as exogenous inputs but it couldn't improve the modeling efficiency remarkably and so, only the value at time step t was used in modeling).
To prevent the FFNNs from overtraining issue, it is important to select optimum number of hidden neurons as well as training iteration (epoch) number.Levenberg Marquardt algorithm [62] as training algorithm and 10-300 training epoch numbers and 1-30 hidden neurons were examined to develop the FFNN models.The best results by FFNN models for precipitation modeling of all stations are shown in Table 4 for both scenarios 1 and 2. To train the ANFIS models, the Sugeno FIS engine was used in the modeling framework.Each ANFIS should include some rules and membership functions.In this research, Gaussian-shaped and 2 Gaussian combinations MFs, as well as the Triangular-shaped and pi-shaped MFs were found to be appropriate for monthly precipitation modeling.Furthermore, the constant MF was applied in the output layer of the ANFIS models.Not only the number of membership functions but also the number of training epochs were examined to reach to the optimum ANFIS models.The ranges of 5-300 and 2-5 were considered respectively for the numbers of training epoch and membership functions.The best results for the ANFIS models are shown in Table 5 for all stations obtained via both scenarios 1 and 2. Thereafter, the LSSVM models were created to predict the precipitation time series of the stations using RBF kernel.Several studies have already reported more reliable results of LSSVM model using RBF kernel with regard to using other kernels maybe due to its smoothness assumption [59].The best results obtained by LSSVM in modeling the precipitation of the stations are shown in Table 6 for both scenarios 1 and 2. As it is shown by Tables 4-6, the results of the methods in scenario 1 show a bit better performance for Ercan and Lefkoşa stations than other stations in the verification phase since these stations are located in central and higher parts of the island in contrast to the other stations which are located in shore lines and are impacted more significantly by the irregular variations of the sea condition such as stormy or calm sea conditions.This can also be confirmed by the standard variation values presented in Table 2 which show lower values for these two stations.
In scenario 2, the models of the Girne station in verification step showed better efficiency than others.This can be due to its proximity to Ercan station.In other words, not only the small distance between Girne and Ercan stations but also the predominant wind direction over the island (which is from northwest to southeast) make the precipitation pattern of both stations more similar with regard to the others.It should be noted that other factors also influence the precipitation pattern but, since Cyprus is a small island and most of the conditions do not vary part to part; so it supposed that the wind may be the main factor.This can also be clearly seen from Table 3 which shows higher MI value between these two stations.
Considering the outcomes of both scenarios, because of using Ercan station's data as exogenous input (in addition to each station's own data), the results of scenario 2 were better than scenario 1, showing improvement of modeling efficiency up to 61% via ANFIS model for Gazima gusa station in calibration step and up to 58% via LSSVM model for Güzelyurt station in verification step.
For instance, Figures 9-11 illustrate the results of single AI methods for the calibration and verification steps and scatter plots for verification step for Ercan and Girne stations based on scenarios 1 and 2, respectively.
wind may be the main factor.This can also be clearly seen from Table 3 which shows higher MI value between these two stations.
Considering the outcomes of both scenarios, because of using Ercan station's data as exogenous input (in addition to each station's own data), the results of scenario 2 were better than scenario 1, showing improvement of modeling efficiency up to 61% via ANFIS model for Gazimağusa station in calibration step and up to 58% via LSSVM model for Güzelyurt station in verification step.
For instance, Figures 9-11 illustrate the results of single AI methods for the calibration and verification steps and scatter plots for verification step for Ercan and Girne stations based on scenarios 1 and 2, respectively.As it can be seen from Tables 4-6 and Figures 9-11, generally in most cases, the performance of FFNN was better than other models, however in some cases, the ANFIS and in some other cases LSSVM's performance was better than others.Also, comparison of the outputs obtained by the single AI methods (Figures [9][10][11] shows that in different parts of the time series, some of the models led to over estimations and others down estimations of the observed time series.Figure 9(a) highlights 2 points (i) and (ii).According to the figure, it is obvious that for the point (i), LSSVM method provided better fitting to the observed value.However, for sample point (ii), the FFNN model led to the minimum error.In addition, in the interval of November-2009 to March-2010 and November-2012 to January-2013, all models were unable to provide good predictions.Therefore, such different performances of different methods at different sample points and time spans confirm a need to ensemble the results of different methods via the ensemble techniques.

Results of Ensemble Modeling
In the ensemble modeling, the outputs of three AI based single models were combined to improve the predicting performance.In this step, only the verification dataset was employed to compute the weights of the averaging methods.For the neural averaging method, like the single FFNN model, the Levenberg Marquardt algorithm was used as training algorithm.The ranges of 10-300 and 1-30 respectively for the numbers of training epochs and hidden neurons were examined to obtain the best results.Results of different ensemble methods are shown by Table 7 and Table 8 respectively for scenarios 1 and 2. Also the Skill scores of ensemble methods with regard to classic single AI methods are tabulated in Table 9.As it can be seen from Tables 4-6 and Figures 9-11, generally in most cases, the performance of FFNN was better than other models, however in some cases, the ANFIS and in some other cases LSSVM's performance was better than others.Also, comparison of the outputs obtained by the single AI methods (Figures [9][10][11] shows that in different parts of the time series, some of the models led to over estimations and others down estimations of the observed time series.Figure 9a highlights 2 points (i) and (ii).According to the figure, it is obvious that for the point (i), LSSVM method provided better fitting to the observed value.However, for sample point (ii), the FFNN model led to the minimum error.In addition, in the interval of November-2009 to March-2010 and November-2012 to January-2013, all models were unable to provide good predictions.Therefore, such different performances of different methods at different sample points and time spans confirm a need to ensemble the results of different methods via the ensemble techniques.

Results of Ensemble Modeling
In the ensemble modeling, the outputs of three AI based single models were combined to improve the predicting performance.In this step, only the verification dataset was employed to compute the weights of the averaging methods.For the neural averaging method, like the single FFNN model, the Levenberg Marquardt algorithm was used as training algorithm.The ranges of 10-300 and 1-30 respectively for the numbers of training epochs and hidden neurons were examined to obtain the best results.Results of different ensemble methods are shown by Tables 7 and 8 respectively for scenarios 1 and 2. Also the Skill scores of ensemble methods with regard to classic single AI methods are tabulated in Table 9.For example, Figure 12 shows the results of precipitation predictions using the ensemble models for both calibration and verification phases and the scatter plot for the verification step by using the neural ensemble method for Girne station based on the second scenario.As mentioned above, any method has its own benefits and drawbacks.Some models may provide over and some others may provide lower estimates.Also each model could estimate specific intervals more accurate than other models.Thus, by combining outputs of different methods the final estimations may be more accurate in comparison with the results of single models.It should be noticed that since the outputs of the single methods are close together (see Tables 4-6), and because the efficiency of simple and weighted averaging ensemble techniques are in the same directs with the single methods, outputs of simple and weighted ensemble techniques are quite same.As it can be seen in Tables 7-8 and Figure 12, the efficiency of neural ensemble is better than the linear ensembling methods in the most cases.
For instance, the scatter plot of FFNN and neural ensemble methods based on scenario 2 for Girne station at verification step is presented by Figure 13.As mentioned above, any method has its own benefits and drawbacks.Some models may provide over and some others may provide lower estimates.Also each model could estimate specific intervals more accurate than other models.Thus, by combining outputs of different methods the final estimations may be more accurate in comparison with the results of single models.It should be noticed that since the outputs of the single methods are close together (see Tables 4-6), and because the efficiency of simple and weighted averaging ensemble techniques are in the same directs with the single methods, outputs of simple and weighted ensemble techniques are quite same.As it can be seen in Tables 7 and 8 and Figure 12, the efficiency of neural ensemble is better than the linear ensembling methods in the most cases.
For instance, the scatter plot of FFNN and neural ensemble methods based on scenario 2 for Girne station at verification step is presented by Figure 13.In terms of computed DC and Skill measures (see Tables 4-9), simple linear, weighted linear and neural averaging methods enhanced the predicting performance up to 26 ) Ercan International Airport; at this station, the summers are hot, arid, and clear and the winters are cold, windy, and mostly clear.Over the course of the year, the temperature typically varies from 4 • C to 35 • C and is rarely below 0 • C or above 37 • C; (2) Gazima gusa's climate is classified as warm and temperate.In winter, there is much more rainfall in Gazima gusa than in summer.The average temperature in Gazima gusa is 19.3 • C and the average rainfall is 407 mm; (3) The prevailing climate in Geçitkale is known as a local steppe climate.During the year, there is little rainfall in Geçitkale and the average annual temperature is 19.1 • C; (4) Girne station's climate is warm and temperate and the average annual rainfall is 382 mm.The winters are rainier than the summers.In Girne, the average annual temperature is 19.6 • C. Precipitation has an average of 449 mm; (5) Guzelyurt has a local steppe climate.There is little rainfall throughout the year.In Guzelyurt, the average annual values of temperature and precipitation are respectively 18.5 • C and 363 mm; (6) Lefkoşa has a hot semi-arid climate due to its low annual precipitation and annual temperature range.The city experiences long, hot, dry summers, and cool to mild winters, with most of the rainfall occurring in winter.The winter precipitation is occasionally accompanied by sleet and rarely by snow.The accumulation of snow is particularly rare (last events occurred in 1950, 1974, 1997, and 2015).There is occasionally light frost during the winter nights.The temperature reached 44.7 • C on 2 July 2017 in Lefkoşa; (7) Yeni Erenkoy's climate is classified as warm and temperate.There is more rainfall in the winter than in the summer in Yeni Erenkoy.The average temperature in Yeni Erenkoy is 18.7 • C and about 520 mm of precipitation falls annually.

Figure 1 .Figure 2 .
Figure 1.(a) Situation map of study area; (b) Location of stations.Figure 1.(a) Situation map of study area; (b) Location of stations.Atmosphere 2019, 10, x FOR PEER REVIEW 6 of 28

Figure 2 .
Figure 2. (a) Lefkosa rain gauge station; (b) Rain gauge with the installation equipment.The following numbers refer to the installation equipment of rain gauge: 1 = Sensor base; 2 = Sensor cable; 3 = Outer tube; 4 = Stand; 5 = Mounting bolts for the stand; 6 = Wedge bolts; 7 = Nut and washers for mounting bolts.
Atmosphere 2019, 10, x FOR PEER REVIEW 10 of 28 i denotes to the station name (as Ercan, Gazimağusa, Geçitkale, Girne, Guzelyurt, Lefkoşa and Yeni Erenkoy stations) and are the precipitation values of ith station corresponding to time steps t−1 and t−12 (or 1 and 12 months ago).

Figure 4 .
Figure 4. Conceptual model of the ensemble system in scenario 1.

Figure 4 .
Figure 4. Conceptual model of the ensemble system in scenario 1.

Figure 5 .
Figure 5. Structure of a three-layer feed forward neural network (FFNN).

Figure 5 .
Figure 5. Structure of a three-layer feed forward neural network (FFNN).

Figure 8 .
Figure 8. Schematic of the proposed neural ensemble method.ANN: artificial neural networks.

Figure 8 .
Figure 8. Schematic of the proposed neural ensemble method.ANN: artificial neural networks.

Figure 12 .
Figure 12.(a) Results of precipitation prediction using simple, weighted and neural averaging methods and observed precipitation; (b) Scatter plots for verification step using neural ensemble method based on scenario 2 for Girne station.

Figure 12 .
Figure 12.(a) Results of precipitation prediction using simple, weighted and neural averaging methods and observed precipitation; (b) Scatter plots for verification step using neural ensemble method based on scenario 2 for Girne station.

Atmosphere 2019 , 28 Figure 13 .
Figure13.Scatter plot for verification step using FFNN and neural ensemble method based on scenario 2 for Girne station.
and is rarely below 0 °C or above 37 °C.2) Gazimağusa's climate is classified as warm and temperate.In winter, there is much more rainfall in Gazimağusa than in summer.The average temperature in Gazimağusa is 19.3 °C and the average rainfall is 407 mm. 3) The prevailing climate in Geçitkale is known as a local steppe climate.During the year, there is little rainfall in Geçitkale and the average annual temperature is 19.1 °C.4)Girnestation's climate is warm and temperate and the average annual rainfall is 382 mm.The winters are rainier than the summers.In Girne, the average annual temperature is 19.6 °C.Precipitation has an average of 449 mm. 5) Guzelyurt has a local steppe climate.There is little rainfall throughout the year.In Guzelyurt, the average annual values of temperature and precipitation are respectively 18.5 °C and 363 mm. 6) Lefkoşa has a hot semi-arid climate due to its low annual precipitation and annual temperature range.The city experiences long, hot, dry summers, and cool to mild winters, with most of the rainfall occurring in winter.The winter precipitation is occasionally accompanied by sleet and rarely by snow.The accumulation of snow is particularly rare (last events occurred in1950, 1974, 1997, and 2015).There is occasionally light frost during the winter nights.The temperature reached 44.7 °C on 2 July 2017 in Lefkoşa.7) Yeni Erenkoy's climate is classified as warm and temperate.There is more rainfall in the winter than in the summer in Yeni Erenkoy.The average temperature in Yeni Erenkoy is 18.7 °C and about 520 mm of precipitation falls annually. °C

Table 1 .
Specifications of rain gauges.

Table 3 .
The mutual information (MI) between the observed precipitation time series of statins.

Table 4 .
Results of monthly precipitation predictions by FFNN for both scenarios 1 and 2.
a Only the results of the optimum models have been tabulated.In network structure (a.b.c) a,b,c respectively show the numbers of input, hidden, and output neurons.

Table 5 .
Results of monthly precipitation predictions by ANFIS model both scenarios 1 and 2.

Table 6 .
Results of monthly rainfall predictions by LSSVM for both scenarios 1 and 2.

Table 7 .
Results of ensembles using linear weighted and non-linear averaging methods for scenario 1.

Table 7 .
Cont.Model structure in weighted averaging method shows applied weights respectively on FFNN, ANFIS and LSSVM outputs, whereas for neural averaging it shows numbers of input, hidden and output neurons, respectively.The outputs obtained by the ensemble techniques indicate that almost all three ensemble techniques could produce reliable results in comparison to the single AI methods. a

Table 8 .
Results of ensembles using linear, weighted and non-linear averaging methods for scenario 2. Model structure in weighted averaging method shows applied weights respectively on FFNN, ANFIS and LSSVM outputs, whereas for neural averaging it shows numbers of input, hidden and output neurons, respectively. a

Table 9 .
Skill scores of ensembles using linear, weighted and non-linear averaging methods for scenarios 1 and 2.
.33% in Güzelyurt station relative to ANFIS, 25.33% in Güzelyurt station relative to ANFIS, 47.68% in Ercan station relative to ANFIS and 54.74% in Lefkoşa station relative to FFNN, 50% in Lefkoşa station relative to FFNN, 79.74% in Lefkoşa station relative to FFNN in calibration step for scenarios 1 and 2 and up to 32.3% Scatter plot for verification step using FFNN and neural ensemble method based on scenario 2 for Girne station.