Mapping of Groundwater Spring Potential in Karst Aquifer System Using Novel Ensemble Bivariate and Multivariate Models

: Groundwater is an important natural resource in arid and semi-arid environments, where discharge from karst springs is utilized as the principal water supply for human use. The occurrence of karst springs over large areas is often poorly documented, and interpolation strategies are often utilized to map the distribution and discharge potential of springs. This study develops a novel method to delineate karst spring zones on the basis of various hydrogeological factors. A case study of the Bojnourd Region, Iran, where spring discharge measurements are available for 359 sites, is used to demonstrate application of the new approach. Spatial mapping is achieved using ensemble modelling, which is based on certainty factors ( CF ) and logistic regression (LR). Maps of the CF and LR components of groundwater potential were generated individually, and then, combined to prepare an ensemble map of the study area. The accuracy (A) of the ensemble map was then assessed using area under the receiver operating characteristic curve. Results of this analysis show that LR (A = 78%) outperformed CF (A = 67%) in terms of the comparison between model predictions and known occurrences of karst springs (i.e., calibration data). However, combining the CF and LR results through ensemble modelling produced superior accuracy (A = 85%) in terms of spring potential mapping. By combining CF and LR mapping projects. The methodology developed here o ﬀ ers an e ﬃ cient method for assessing spring discharge and karst spring potentials over regional scales.


Study Area
The study area, Bojnourd Region, is located in the north east of Khorasan Shomali province, Iran, between latitudes 37 • 15 to 37 • 35 N and longitudes 57 • 03 to 57 • 40 E (Figure 1), with minimum and maximum elevation of 875 and 2968 m, respectively. The mean precipitation is 272 mm/year and temperature ranges between 6.8 to 19.7 • C [105]. There are six main lithological formations in the study area including Aitamir (Cretaceous), Sarcheshmeh (Early Cretaceous), Shurijeh (Jurassic-Cretaceous), Upper-Red (Miocene), Niur (Silurian) and Shemshak (Triassic-Jurassic). Lithological characteristics of these formations are summarized in Table 1. From a geomorphological point of view, the study area is dominated by a mountainous landscape. In addition, there are numerous folds and faults in the study area which cause the occurrence of springs [106]. The major faults are oriented NE-SW and the others NW-SE and N-S. Inceptisols and entisols are dominant soil types of the study area. The depth of soil varies from one to three meters, although there are numerous surficial rocks (lithological outcrops) in steep areas of upstream zones. According to the reports of Iranian Department of Water Resources Management (IDWRM), hydraulic conductivities in this mountainous region range between 10 −4 and 10 −5 m/s. Nearly 88% of water demand in Khorasan Shomali province is supplied by groundwater resources [107]. Therefore, it has become necessary to predict groundwater potential zone in this region. The discharge of springs ranges between 0.5 and 15.2 SI. The size of karstic areas in this province is over 17,000 km 2 , which are used as karst water resources to provide drinking water for the Khorasan Shomali and Khorasan Razavi provinces. Furthermore, Khorasan Shomali province covers approximately an area of 248,000 km 2 , which is inhabited by more than 6 million people and where agriculture is the major activity.

Methodology
The adopted methodology ( Figure 2) included: preparation of data, construction of models and production of karst spring potential map (KSPM), validation of results and evaluation of their efficiency.
province is over 17,000 km 2 , which are used as karst water resources to provide drinking water for the Khorasan Shomali and Khorasan Razavi provinces. Furthermore, Khorasan Shomali province covers approximately an area of 248,000 km 2 , which is inhabited by more than 6 million people and where agriculture is the major activity.

Methodology
The adopted methodology ( Figure 2) included: preparation of data, construction of models and production of karst spring potential map (KSPM), validation of results and evaluation of their efficiency.

Data Used
In order to analyze karst spring potential, springs locations and 10 spring-affecting factors were considered as dependent and independent variables, respectively. All data were obtained from the Iranian Department of Water Resources Management (IDWRM).

Karst Spring Inventory
A total of 359 karst springs was identified and recorded by the Iranian Department of Water Resources Management (IDWRM) in the study area, of which 251 (70%) were randomly selected for calibration/training of models and the remaining 108 (30%) for validation purpose. The distribution of training and validation karst springs is shown in Figure 1.

Data Used
In order to analyze karst spring potential, springs locations and 10 spring-affecting factors were considered as dependent and independent variables, respectively. All data were obtained from the Iranian Department of Water Resources Management (IDWRM).

Karst Spring Inventory
A total of 359 karst springs was identified and recorded by the Iranian Department of Water Resources Management (IDWRM) in the study area, of which 251 (70%) were randomly selected for calibration/training of models and the remaining 108 (30%) for validation purpose. The distribution of training and validation karst springs is shown in Figure 1.

Karst Spring-Affecting Factors
As mentioned by Crozier [108], preparation of corresponding thematic data layers and the selection of factors are important components of any model. In this study, 10 karst spring-affecting factors Water 2020, 12, 985 6 of 25 including distance from fault, land use/land cover, geology, topographic elevation, slope angle, slope aspect, plan curvature, topographic wetness index (TWI), drainage density and distance from river were utilized for producing KSPMs.

Distance from fault
Distance from fault is among those factors that has been widely used as an important groundwater influential factor [109]. The faults were extracted from a geological map of the study area as line features in ArcGIS 10.2 and a map of 'distance from faults' was developed using the Euclidean distance tool and classified (Figure 3a).
Land use/ land cover Land use/land cover is a significant factor in controlling groundwater recharge processes [110]. The land use/land cover map of the study area ( Figure 3b) shows that rangeland covers most of study area, followed by agriculture, forest and urban classes (Figure 3b).

Geology
Since the hydraulic conductivity and porosity of soils and rocks are affected by lithological variations, groundwater occurrence is highly influenced by geology [104,111]. In this study, the geologic map of study area with seven lithological groups was extracted from 1:100,000 scale geologic map of Iran ( Figure 3c, Table 1).
Water 2020, 12, x FOR PEER REVIEW 6 of 25

Karst Spring-Affecting Factors
As mentioned by Crozier [108], preparation of corresponding thematic data layers and the selection of factors are important components of any model. In this study, 10 karst spring-affecting factors including distance from fault, land use/land cover, geology, topographic elevation, slope angle, slope aspect, plan curvature, topographic wetness index (TWI), drainage density and distance from river were utilized for producing KSPMs.

Distance from fault
Distance from fault is among those factors that has been widely used as an important groundwater influential factor [109]. The faults were extracted from a geological map of the study area as line features in ArcGIS 10.2 and a map of 'distance from faults' was developed using the Euclidean distance tool and classified (Figure 3a).

Land use/ land cover
Land use/land cover is a significant factor in controlling groundwater recharge processes [110]. The land use/land cover map of the study area ( Figure 3b) shows that rangeland covers most of study area, followed by agriculture, forest and urban classes (Figure 3b).

Topographic factors
Surface topography controls the runoff rate and direction [112,113]. Therefore, topographic elevation has been frequently used by researchers as a controlling parameter for groundwater occurrence [31,114]. The digital elevation model (DEM) map of the study area was obtained from the IDWRM and we classified it using an equal-interval classification scheme (Figure 3d). IDWRM originally produced the DEM from aerial photographs and topographical lines was obtained from the Iranian Department of Water Resources Management (IDWRM) at 20 m cell resolution. Slope angle affects the infiltration, subsurface flow and duration of overland flow [115,116]. Therefore, the groundwater recharge process is strongly controlled by slope angle [117]. According to Magesh et al. [115], due to low infiltration and high run-off in steeper slope, these areas have poor potential to groundwater accumulation. The slope angle map of study area was extracted from the DEM and classified into seven classes based on IDWRM classification (Figure 3e). Plan curvature represents the rate of slope aspect or slope gradient change. Negative and positive values of plan curvatures characterize concavity and convexity in the downslope direction, respectively. Zero curvature also represents flat areas [118]. As mentioned by Lee and Pradhan [118], concave slope probably contains more water. The plan curvature map of study area was produced from DEM in concave, flat and convex classes ( Figure 3f). The slope aspect layer was derived from DEM and divided into nine classes including eight main directions ( Figure 3g).

Water-related factors
There is a significant relationship between distance from river and groundwater spring occurrence [31]. The maps of drainage density and distance from rivers were created from the topographic data using Kernel density and Euclidean distance tools, respectively. The maps of drainage density and distance from river are shown in Figure 3h,i, respectively. The soil moisture conditions are reflected in values of TWI [119], which is defined as: where A s is the specific catchment area (m 2 /m) and β is slope gradient [120]. TWI map was shown in Figure 3j. Drainage density is defined as ratio of total length of a stream to the area of the drainage basin [121]. In general, low drainage density areas are more suitable for groundwater development in comparison with high drainage density [122].

Statistical Methods
A bivariate statistical analysis (certainty factor) and a multivariate statistical analysis (logistic regression) were applied individually and as an ensemble to delineate karst spring potential. Finally, the ensemble map was compared with each individual map using the receiver operating characteristic (ROC) analysis to find if there is any improvement in combining the models.

Certainty Factor Model
Shortliffe and Buchanan [123] proposed and defined certainty factor (CF) as a probability function that was later modified by Heckerman [124]. Different researchers have applied this model in their studies for landslide susceptibility mapping [125] and groundwater potential mapping [26]. The formula to calculate CF is [124]: where P a is conditional probability of having a number of karst spring occurrence in class a, P b is prior probability of having the total number of karst spring occurrence in the study area. The CF value for different attribute classes in each conditioning factor were calculated from their relationship to the spring occurrence. The CF values ranges from −1 to +1. A positive value of CF means that the groundwater potential is higher, while a negative value illustrates a decrease in the certainty of groundwater occurrence [126]. In order to delineate the combination of two different layers of information, which is denoted as Z, the CF values of the spring-affecting factors are pairwise combined using the integration rules as follow [127].
where CF X and CF Y are CF values of two different spring-affecting factors.
The pairwise combination is carried out in different iterations until all the spring-affecting layers are added to prepare the KSPM. The final map was categorized into four classes: low, medium, high and very high [29,128].

Logistic Regression Model
LR aims at finding the best-fit model to express the relationship among a binary dependent variable (groundwater occurrence) and some independent variables [129]. This model is helpful to predict the absence or presence of outcome based on predictive variables (Lee 2005). The LR model is widely applied in different fields of geosciences [24,130]. However, it has some limitations to determine the weight of classes of independent variables [131]. Hosmer and Lemeshow [132] and Kleinbaum et al. [133] also explained the detailed information of this technique. The general form of logistic regression is as follows: where Z is a linear combination function of the explanatory variables, x 1 , x 2 , . . . , x m are independent variables and b 1 , b 2 , . . . , b m are the coefficients of the logistic regression to be estimated [134]. The relationship between the occurrence and its dependency on several variables is defined as: where P and e are the probability of spring occurrence (varying from 0 to 1) and the Neperian number, respectively [135]. In addition, Z factor was defined above Equation (4). Value of 1 means the presence of a spring, and value 0 indicates the absence of a spring [24]. After creating the KSPM by LR model, we classified it using natural break classification method. As mentioned by Hosmer and Lemeshow [132], independent variables in LR model is sensitive to co-linearity for model fitting, while more than two variables in LR model is often called 'multi-collinearity'. Multi-collinearity refers to a linear relation among two or more variables [136]. LR model was used to do statistical analysis and determine whether there was a significant effect on spring occurrence. Tolerance (TOL) and the variance inflation factor (VIF) are significant indicators for determining multi-collinearity among the independent variables [137] that were applied ( Table 2). Based on this diagnosis method, variables with TOL < 0.1 and/or VIF > 10 shall be excluded from further analysis [138]. Moreover, the overall statistic of the LR model for all 10 independent variables was assessed using χ 2 value of the Hosmer-Lemeshow test, Cox and Snell R 2 , Nagelkerke R 2 and pseudo-R 2 mesaures.3.2.3. Ensemble model. Ensemble modeling is a technical process of synthesizing the results of single models into a single incorporated model for enhancing the prediction [139]. This type of modeling approach tends to overcome their individual disadvantages while keeping their individual advantages [102]. Weighted linear combination technique is proposed here as an ensemble method to combine the result of CF and LR: where DF is distance from fault, L is land use/ land cover, G is geology, A is topographic elevation, TWI is topographic wetness index, DD is drainage density, DR is distance from river, S is slope angle, AS is slope aspect, PC is plan curvature and the subscripts 'r' and 'w' indicate ratings and weights for spring-affecting factors, respectively. Ratings are obtained from the CF model (i.e., the weights of classes of a given spring-affecting factor), whereas w was calculated based on the spatial sensitivity analysis (SSA) of LR model. The SSA allows to determine the influence of predictive variables on the model output [16,140]. In this study, this was conducted by excluding each spring-affecting factor and determining w through the relative decrease of AUC values Equation (7) [29]: where AUC all and AUC i indicate the AUC values obtained from the LR model using all spring-affecting factors and the prediction when the ith spring-affecting factor was excluded, respectively.

Performance Assessment
Performance assessment is required in any predictive analysis [141]. Nampak et al. [128] also stated that there is no merit in modeling with no scientific significance; hence, validity and uncertainty of models need to be assessed. Furthermore, many researchers have validated their results using different models. Naghibi et al. [31], Pourghasemi and Beheshtirad [109] and Rahmati et al. [3] have utilized area under ROC curve (AUC) to assess the accuracy of groundwater potential maps. This method illustrates to what extent the occurrence or non-occurrence of a considered occurrence is correctly predicted [29]. In the ROC curve, false and true positive rates are plotted in X-and Y-axis, respectively. The resultant AUC value ranges from 0.5 (i.e., a random prediction) to 1 (i.e., an excellent prediction) [142].

CF Modelling
CF values of spring-affecting factors are presented in Figure 4. As depicted in the figure, negative and positive relationships between factors and spring occurrence are well illustrated. Obviously, negative relationship exists between spring potential and distance from faults parameter. Areas with more distance from fault have lower potential to spring occurrence. In the case of geology factor, some lithologic formations including Silurian (S), Jurassic-Cretaceous (JC) and Triassic-Jurassic (TJ) had positive influence in spring occurrence with CF values of 0.94, 0.42 and 0.21, respectively.
Water 2020, 12, x FOR PEER REVIEW 11 of 25

CF Modelling
CF values of spring-affecting factors are presented in Figure 4. As depicted in the figure, negative and positive relationships between factors and spring occurrence are well illustrated. Obviously, negative relationship exists between spring potential and distance from faults parameter. Areas with more distance from fault have lower potential to spring occurrence. In the case of geology factor, some lithologic formations including Silurian (S), Jurassic-Cretaceous (JC) and Triassic-Jurassic (TJ) had positive influence in spring occurrence with CF values of 0.94, 0.42 and 0.21, respectively.  In the case of land use/land cover, urban areas and agriculture have the least and most impact on groundwater, respectively. The impervious surface areas in urban areas tend to reduce recharge and hence groundwater availability. As the agriculture areas as a consequence factor need to high water requirement, these areas have a higher potential to groundwater recharge [47]. It was also found that there was no significant correlation between spring occurrence and topographic elevation classes. Areas with elevation ranges of 1676-1978 m and 1414-1676 m have positive influence on spring occurrence probability with CF values of 0.45 and 0.16 which indicate a good karst spring occurrence probability. In another study in this region, Rahmati et al. [44] indicated that spring occurrence probability will reduce as elevation decreases which is in agreement with the results of our study. It is noted that, groundwater is more probable to occur in areas with high TWI values because it is well apparent in Figure 4e that positive correlation exists between increasing spring occurrence and high TWI value.
Conversely, Razandi et al. [26] stated that there is not positive correlation between groundwater occurrence and TWI, which shows a higher groundwater potential over a decreasing TWI value. The reason of this difference between their results and our achievements refer to geomorphological characteristics of study areas (i.e., their study area were plain whereas our study area is a mountainous region). In areas where drainage density ranges from 1.3 to 2.8 km/km 2 (CF = 1.82), groundwater potential is higher indicating that springs are more probable in areas with denser drainage density. The assessment of distance from river demonstrated that areas with lower distance from river have more potential for spring occurrence and as this distance increase, the probability to groundwater occurrence decreases. Lee et al. [143] proved these results by indicating that drainage density and distance from the river have positive and negative influence on groundwater yield. Results indicated that the areas with 8-40° slope angles have positive influences with highest value in 14-20° and 20-28° with CF values of 0.28 and 0.25. Manap et al. [21] also mentioned that flat to moderate slopes have higher groundwater probabilities. In the case of land use/land cover, urban areas and agriculture have the least and most impact on groundwater, respectively. The impervious surface areas in urban areas tend to reduce recharge and hence groundwater availability. As the agriculture areas as a consequence factor need to high water requirement, these areas have a higher potential to groundwater recharge [47]. It was also found that there was no significant correlation between spring occurrence and topographic elevation classes. Areas with elevation ranges of 1676-1978 m and 1414-1676 m have positive influence on spring occurrence probability with CF values of 0.45 and 0.16 which indicate a good karst spring occurrence probability. In another study in this region, Rahmati et al. [44] indicated that spring occurrence probability will reduce as elevation decreases which is in agreement with the results of our study. It is noted that, groundwater is more probable to occur in areas with high TWI values because it is well apparent in Figure 4e that positive correlation exists between increasing spring occurrence and high TWI value.
Conversely, Razandi et al. [26] stated that there is not positive correlation between groundwater occurrence and TWI, which shows a higher groundwater potential over a decreasing TWI value. The reason of this difference between their results and our achievements refer to geomorphological characteristics of study areas (i.e., their study area were plain whereas our study area is a mountainous region). In areas where drainage density ranges from 1.3 to 2.8 km/km 2 (CF = 1.82), groundwater potential is higher indicating that springs are more probable in areas with denser drainage density. The assessment of distance from river demonstrated that areas with lower distance from river have more potential for spring occurrence and as this distance increase, the probability to groundwater occurrence decreases. Lee et al. [143] proved these results by indicating that drainage density and distance from the river have positive and negative influence on groundwater yield. Results indicated that the areas with 8-40 • slope angles have positive influences with highest value in 14-20 • and 20-28 • with CF values of 0.28 and 0.25. Manap et al. [21] also mentioned that flat to moderate slopes have higher groundwater probabilities.
In the case of slope aspect, only west (0.22), southwest (0.19) and northwest (0.14) have a positive influence on groundwater occurrence. It is difficult to explain this result, but it might be related to the differences of aspects in terms of vegetation, soil depth and recharge potential. Plan curvature behavior also illustrates that springs are more probable to occur in convex areas. Moghadam et al. [114] also stated that concave areas are more favorable for groundwater compared to the convex slope. Finally, according to the CF value for each influencing factor, karst spring probability map was produced and then categorized into four classes; namely low, medium, high and very high using the natural break classification method (Figure 5), as suggested by Naghibi et al. [31]. Areas with high and very high karst spring potential are seen to be near rivers and faults.
Water 2020, 12, x FOR PEER REVIEW 13 of 25 In the case of slope aspect, only west (0.22), southwest (0.19) and northwest (0.14) have a positive influence on groundwater occurrence. It is difficult to explain this result, but it might be related to the differences of aspects in terms of vegetation, soil depth and recharge potential. Plan curvature behavior also illustrates that springs are more probable to occur in convex areas. Moghadam et al. [114] also stated that concave areas are more favorable for groundwater compared to the convex slope. Finally, according to the CF value for each influencing factor, karst spring probability map was produced and then categorized into four classes; namely low, medium, high and very high using the natural break classification method (Figure 5), as suggested by Naghibi et al. [31]. Areas with high and very high karst spring potential are seen to be near rivers and faults.

LR Modelling
According to results of TOL and VIF analysis, no severe multi-collinearity was found between independent variables ( Table 2).
As shown in Table 3, results indicated that all applied factors have significant effects on spring occurrence (p < 0.05). Similar results can be found in many previous studies [109].

LR Modelling
According to results of TOL and VIF analysis, no severe multi-collinearity was found between independent variables ( Table 2).
As shown in Table 3, results indicated that all applied factors have significant effects on spring occurrence (p < 0.05). Similar results can be found in many previous studies [109].   Table 4 showed the overall statistic of the LR model for all 10 independent variables. The results indicated that the goodness-of-fit of the equation could be accepted because the significance of χ 2 is larger than 0.05 [138]. The values of Cox and Snell R 2 and Nagelkerke R 2 showed that the independent variables could explain the dependent variables [138,144]. Using the coefficient values obtained from the final output of the LR analysis, the LR equation was obtained, as shown in Equation (8): Z = C Geology + (-0.003 × Distance f rom river) + (-0.199 × Slope angle) + C Slope aspect +(-0.990 × TWI) + (-2.663 × Drainage density) +(0.374 × Topographic elevation) + (-0.319 × Distance f rom f ault) +(0.044 × Plan curvature) + C Land use -16.33 (8) where C Geology is the LR coefficient for geology factor, whereas C Land use is the LR coefficient for land use factor. All coefficients of Equation (8) were obtained from Table 3. The final generated map using LR model was then classified into low, medium, high and very high classes using the natural break classification method (Figure 6), as suggested by Naghibi et al. [31]. Areas with high and very high potential to karst springs are located in southwest to southeast of the study area.
Water 2020, 12, x FOR PEER REVIEW 15 of 25 Figure 6. Karst spring potential map produced by the LR modelling.

Ensemble Modelling
Although LR was capable to implement multivariate statistical analysis, it had some weakness in analyzing the classes of each occurrences conditioning factor. Conversely, the CF model could evaluate the influence of classes of each spring-affecting factor on an occurrence. However, the relationship between karst spring locations and spring-affecting factors was mostly neglected by LR model. As a result, it can be clearly seen that CF and LR models had some weakness when applied individually. The generated map of ensemble CF and LR (ECL) model was prepared and then classified into low, moderate, high and very high classes using the natural break classification method (Figure 7), as suggested by Naghibi et al. [31].

Ensemble Modelling
Although LR was capable to implement multivariate statistical analysis, it had some weakness in analyzing the classes of each occurrences conditioning factor. Conversely, the CF model could evaluate the influence of classes of each spring-affecting factor on an occurrence. However, the relationship between karst spring locations and spring-affecting factors was mostly neglected by LR model. As a result, it can be clearly seen that CF and LR models had some weakness when applied individually. The generated map of ensemble CF and LR (ECL) model was prepared and then classified into low, moderate, high and very high classes using the natural break classification method (Figure 7), as suggested by Naghibi et al. [31].

Validation of Generated KSPMs
The ensemble map was compared to each individual (i.e., CF and LR) map using ROC curve method to find if there was any improvements in combining the individual models. The generated SPMs were overlaid with 30% of spring locations for this phase, which has not been used for model

Validation of Generated KSPMs
The ensemble map was compared to each individual (i.e., CF and LR) map using ROC curve method to find if there was any improvements in combining the individual models. The generated SPMs were overlaid with 30% of spring locations for this phase, which has not been used for model application. According to Figure 8, validation outcomes indicated that LR model could satisfactorily generate groundwater potential map (AUC = 78.4%) and performed better than CF (AUC=66.7%). Moreover, the ECL model had the best performance with an AUC value of 84.7%. It was shown that the ensemble method allowed for a better prediction of groundwater potentiality.

Validation of Generated KSPMs
The ensemble map was compared to each individual (i.e., CF and LR) map using ROC curve method to find if there was any improvements in combining the individual models. The generated SPMs were overlaid with 30% of spring locations for this phase, which has not been used for model application. According to Figure 8, validation outcomes indicated that LR model could satisfactorily generate groundwater potential map (AUC = 78.4%) and performed better than CF (AUC=66.7%). Moreover, the ECL model had the best performance with an AUC value of 84.7%. It was shown that the ensemble method allowed for a better prediction of groundwater potentiality.

Spatial Modeling of Karst Spring Potential
Although several studies have been conducted on construction of karst spring potential maps using statistical models, there are different drawbacks about them, especially defining strict assumptions prior to study. As stated by Dreiseitl and Ohno-Machado [145] and Ozdemir [146], the LR model is usually flexible (especially when it includes only the original covariates) and may be generally desirable. On the other hand, the weak point of bivariate statistical methods such as CF is the neglect of relationship between the groundwater springs and independent variables as observed in this study (i.e., it cannot determine the weight of independent variables). In other word, the CF model just determines the weight of classes of independent variables. Therefore, to overcome these shortcomings, we developed a GIS-based ensemble model that can link the strengths of LR and CF together.
The statistical analysis of LR (Table 3) demonstrated a high contribution of slope aspect (southwest), geology (Cretaceous, Silurian and Jurassic-Cretaceous), land use/land cover (forest) to the modelling process. Prior studies have noted the importance of slope aspect effects on sunshine, air temperature, soil transformation and soil moisture content [147,148]. According to the literature, structural and lithological variations often lead to differences in the permeability and hydraulic properties [8,149,150]. In the case of land use/land cover, it usually influences infiltration and recharge rates as well as the relationship between surface water and groundwater [151]. This is also in agreement with earlier reports, which showed that there is a strong relationship between land use/land cover factor and groundwater recharge [152,153].

Performance of Ensemble and Individual Models
In Section 4.4, models' results were analyzed and validated using the validation dataset and the ROC curve method. The results of validation revealed differences between all three approaches, namely including CF (i.e., bivariate statistical model), LR (i.e., multivariate statistical model) and their ensemble.
It should be emphasized that the ensemble model (ECL) provides the best simulation with a good accuracy. This result showed that the accuracy of the prepared KSPM was improved (6.3-17.9%) with the use of the ensemble of bivariate certainty factor and multivariate logistic regression. Therefore, the ensembling process leads to successfully overcome the limitations of LR and CF models. In fact, results shown in this study highlight a remarkable decrease of error through the ensemble approach. This result is in agreement with Arabameri et al. [154] in which the ensemble of LR with evidential belief function (EBF) model can extend an efficient model to reinforce the accuracy of spatial predictions. According to results of this study, the aggregation of multiple prediction of the same variable may lead to better performance, more exact learning and generalization than using a single model prediction.
The findings of the current study are consistent with that of Naghibi et al. [30] who found that frequency ratio data mining ensemble model had the best performance among all the single models for groundwater potential mapping. Study in Razandi et al. [26] also compared the performance of different data-mining models and stated that frequency ration (FR) and weight-of-evidence (WOE) could perform better than CF model for groundwater potential mapping in Varamin plain, Iran. This is in line with our finding that the CF model is incapable of analyzing the relation between karst groundwater potential and geo-environmental factors.

Conclusions
Springs, as groundwater resources, are important for many sectors, such as domestic consumption and agriculture in arid and semi-arid areas of the world. Sometimes, several families or villages depend heavily on occurrence of a spring; therefore, their spatial modeling is necessary. We designed a new ensemble approach called ECL to deal with this issue. In conclusion, the novel ensemble model gives better predictive performance than could be obtained from any of the constituent models. Therefore, the strength of the proposed methodology is to improve model accuracy by using this novel ensembling technique and to efficiently produce a KSPM for sustainable water recourses management programs. Decision makers and water resources planners can use the proposed approach as an efficient tool to assess the karst spring potential, to establish relationship between karst springs and geo-environmental setting, and to plan and implement effective water resources management in drought condition. Future studies on the current topic are therefore recommended.

Conflicts of Interest:
The authors declare no conflict of interest.