Application of Advanced Machine Learning Algorithms to Assess Groundwater Potential Using Remote Sensing-Derived Data

: Groundwater (GW) is being uncontrollably exploited in various parts of the world resulting from huge needs for water supply as an outcome of population growth and industrialization. Bearing in mind the importance of GW potential assessment in reaching sustainability, this study seeks to use remote sensing (RS)-derived driving factors as an input of the advanced machine learning algorithms (MLAs), comprising deep boosting and logistic model trees to evaluate their e ﬃ ciency. To do so, their results are compared with three benchmark MLAs such as boosted regression trees, k-nearest neighbors, and random forest. For this purpose, we ﬁrstly assembled di ﬀ erent topographical, hydrological, RS-based, and lithological driving factors such as altitude, slope degree, aspect, slope length, plan curvature, proﬁle curvature, relative slope position, distance from rivers, river density, topographic wetness index, land use / land cover (LULC), normalized di ﬀ erence vegetation index (NDVI), distance from lineament, lineament density, and lithology. The GW spring indicator was divided into two classes for training (434 springs) and validation (186 springs) with a proportion of 70:30. The training dataset of the springs accompanied by the driving factors were incorporated into the MLAs and the outputs were validated by di ﬀ erent indices such as accuracy, kappa, receiver operating characteristics (ROC) curve, speciﬁcity, and sensitivity. Based upon the area under the ROC curve, the logistic model tree (87.813%) generated similar performance to deep boosting (87.807%), followed by boosted regression trees (87.397%), random forest (86.466%), and k-nearest neighbors (76.708%) MLAs. The ﬁndings conﬁrm the great performance of the logistic model tree and deep boosting algorithms in modelling GW potential. Thus, their application can be suggested for other areas to obtain an insight about GW-related barriers toward sustainability. Further, the outcome based on the logistic model tree algorithm depicts the high impact of the RS-based factor, such as NDVI with 100 relative inﬂuence, as well as high inﬂuence of the distance from river, altitude, and RSP variables with 46.07, 43.47, and 37.20 relative inﬂuence, respectively, on GW potential.


Introduction
The demand for water supply is continuously rising as an outcome of population growth and development [1]. A population greater than approximately 2 billion is exposed to extreme water stress [2]. In arid and semiarid areas, aquifers form the major freshwater reserves [3]. Long-term over-use of groundwater (GW) supplies has adverse consequences such as GW table decline and

Materials and Methods
The framework applied in this research, consisting of data provision, RS-based processes, application of the MLAs, and validation step, is elucidated in Figure 1.

Study Area
Hesare-No area is located in northern part of Khorasan-e Razavi province within E longitudes 58°21′05″ and 58°44′10″ and N latitudes 36°18′36″ and 36°39′40″ in north-eastern Iran ( Figure 2). The Hesare-No area covers about 712 km 2 with an average elevation of 1700 m. A semiarid climate dominates the Hesare-No area and severe water scarcity has been a problem during the past decades. Further, the minimum and maximum temperatures are 4 °C and 40 °C, respectively, and the average annual precipitation is about 249 mm. Although the annual rainfall has continuously decreased, most likely due to climate change, farming activities have increased, leading to an increased water shortage in the region. The main source of water is GW since surface water resources are not adequate to meet the water demands for agricultural, domestic, and industrial targets. The Hesare-No is a mountainous area with 620 springs. This study split the spring data into 70%, i.e., 434 cases, and 30%, i.e., 186 cases, for training and validating the MLAs, respectively. Moreover, to construct the MLAs, we randomly generated 620 non-spring or absence data.

Study Area
Hesare-No area is located in northern part of Khorasan-e Razavi province within E longitudes 58 • 21 05" and 58 • 44 10" and N latitudes 36 • 18 36" and 36 • 39 40" in north-eastern Iran ( Figure 2). The Hesare-No area covers about 712 km 2 with an average elevation of 1700 m. A semiarid climate dominates the Hesare-No area and severe water scarcity has been a problem during the past decades. Further, the minimum and maximum temperatures are 4 • C and 40 • C, respectively, and the average annual precipitation is about 249 mm. Although the annual rainfall has continuously decreased, most likely due to climate change, farming activities have increased, leading to an increased water shortage in the region. The main source of water is GW since surface water resources are not adequate to meet the water demands for agricultural, domestic, and industrial targets. The Hesare-No is a mountainous area with 620 springs. This study split the spring data into 70%, i.e., 434 cases, and 30%, i.e., 186 cases, for training and validating the MLAs, respectively. Moreover, to construct the MLAs, we randomly generated 620 non-spring or absence data.

GW Spring Driving Factors
Generally, the productivity of GW for a given aquifer depends on a variety of components, such as topographical and hydrological, which rely on the accessibility and quality of data. In general, RS data are extensive with high resolution spatial information. To assemble topographical and hydrological variables, Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM) data, with a spatial resolution of 30 m, were acquired from the National Aeronautics and Space Administration (NASA) website. Next, topographical, and hydrological driving factors were computed using the DEM explained in the below sections.

Topographical Driving Factors
Altitude Altitude has a vital implication on climate conditions and leads to a diverse categorization of vegetation and soil in any area [32,33]. Apart from this, it impacts GW potential indirectly; for instance, lower altitudes have lower slopes and the rate of infiltration enhances accordingly [34,35].

GW Spring Driving Factors
Generally, the productivity of GW for a given aquifer depends on a variety of components, such as topographical and hydrological, which rely on the accessibility and quality of data. In general, RS data are extensive with high resolution spatial information. To assemble topographical and hydrological variables, Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM) data, with a spatial resolution of 30 m, were acquired from the National Aeronautics and Space Administration (NASA) website. Next, topographical, and hydrological driving factors were computed using the DEM explained in the below sections.

Topographical Driving Factors
Altitude Altitude has a vital implication on climate conditions and leads to a diverse categorization of vegetation and soil in any area [32,33]. Apart from this, it impacts GW potential indirectly; for instance, lower altitudes have lower slopes and the rate of infiltration enhances accordingly [34,35]. The altitude map of the Hesare-No basin was extracted from the "ASTER-DEM." It varies between 1189 and 2922 m with a mean altitude of about 1691 m (Figure 3a).

Slope Degree
Slope degree was derived from the elevation variation and is looked upon as a primary component of the surface water flow regime due to the influence of gravity on water movement [36]. Runoff generation is directly proportional to the slope and GW recharge is greater in areas with lower slopes [37]. Water flow on gentle slopes is slow and the infiltration rate increases recharge to the aquifer, while steep slopes increase water flow velocity, hence the decreased recharge [38]. The slope map of the Hesare-No basin ranges between 0 and 62 degrees (Figure 3b).

Aspect
Aspect represents the dominant direction of the slope and indicates the direction of the drainage system [39]. Generally, aspect is used in hilly and mountainous regions, because sunshine or shadow duration plays a critical role in determination of the soil moisture [36]. The aspect also affects the amount of runoff generation through impacting vegetation growth and GW augmentation [40]. This factor is classified into nine categories as depicted in Figure 3c.

Slope Length (LS)
The LS factor is a combination of two components consisting of "slope length (L) and slope steepness (S)" that signifies the ratio of soil loss per unit basin area [7,41]. This factor can be calculated by [42]:  (1) where B s is the specific basin area (m 2 ) and α is the slope gradient in degrees. The LS of the Hesare-No area ranges between 0 and 53 (m) (Figure 3d).

Plan and Profile Curvatures
The curvature displays the change of topography and consists of two major parts, i.e., "profile and plan curvature," which affect the acceleration or deceleration and the convergence or divergence of the flow across the surface, respectively [43]. These driving factors were created using SAGA-GIS (SAGA User Group Association, Hamburg, Germany) ( Figure 3f).

Relative Slope Position (RSP)
RSP can be used to distinguish between topographical properties, e.g., "ridge peaks, valleys, mid-slopes, flat surfaces, foot-slopes, and upper slopes" [8]. The RSP is one of the contributing variables for GW potential [44,45]. In this research, the RSP map was produced implementing SAGA-GIS ( Figure 3g).

Hydrological Driving Factors
Distance from Rivers and River Density (Rd) Rivers are the principal origin of GW recharge in semiarid regions. Hence, distance from rivers is one of the major hydrological elements affecting GW potential. This layer was generated in accordance with the "Euclidean distance function" (Figure 4a).
River density (Rd) is another hydrological driving factor that is explained by the ratio of the total length of the river system to the total area of any given pixel [46]. Rd depends on river location, runoff generation, vegetation cover, climate, and lithology [47]. Therefore, Rd fulfils a critical role in the appraisal of GW potential for any given watershed. Rd was calculated employing the line density function in ArcGIS. The Rd map of the area ranges between 0 and 1.5 km/km 2 ( Figure 4b).
Topographic Wetness Index (TWI) TWI is used to demonstrate the spatial pattern of moisture and delineate the impacts of topographic conditions for locating the size of runoff saturated areas [39,48]. It plays a major role in the transport and accumulation of runoff at the soil surface. Greater TWI values demonstrate better GW retention capability of an area [7]. The TWI factor is calculated as: where β is the cumulative upslope area drained through a particular point (per unit contour length), and α is the gradient of the slope in that point. The TWI in the Hesare-No area varies between 2.5 and 23 (Figure 4c).

Hydrological Driving Factors
Distance from Rivers and River Density (Rd) Rivers are the principal origin of GW recharge in semiarid regions. Hence, distance from rivers is one of the major hydrological elements affecting GW potential. This layer was generated in accordance with the "Euclidean distance function" (Figure 4a).
River density (Rd) is another hydrological driving factor that is explained by the ratio of the total length of the river system to the total area of any given pixel [46]. Rd depends on river location, runoff generation, vegetation cover, climate, and lithology [47]. Therefore, Rd fulfils a critical role in the appraisal of GW potential for any given watershed. Rd was calculated employing the line density function in ArcGIS. The Rd map of the area ranges between 0 and 1.5 km/km 2 ( Figure 4b).
Topographic Wetness Index (TWI) TWI is used to demonstrate the spatial pattern of moisture and delineate the impacts of topographic conditions for locating the size of runoff saturated areas [39,48]. It plays a major role in the transport and accumulation of runoff at the soil surface. Greater TWI values demonstrate better GW retention capability of an area [7]. The TWI factor is calculated as: where β is the cumulative upslope area drained through a particular point (per unit contour length), and α is the gradient of the slope in that point. The TWI in the Hesare-No area varies between 2.5 and 23 ( Figure 4c).

RS-Derived Factors
Satellite Data and Pre-Processing To produce RS-derived factors such as LULC, NDVI, distance from lineament, and lineament density, Landsat-8 (OLI) data in 2019 (August), provided by the "United States Geological Survey (USGS)," were acquired. The image was clear and nearly "cloud-free" (with total cloud cover less than 1%). Pre-processing of the image comprised "radiometric calibration," "QUick atmospheric correction," and "geometric correction" algorithms using ENVI software. The pre-processing was conducted to convert the digital numbers of the multi-spectral bands (band 1-7) to "surface reflectance" (ranging from 0 to 1). To increase the resolution of the multi-spectral bands in the Landsat-8 image from 30 to 15 m, the "Gram-Schmidt method" was used for pan-sharpening in ENVI. This is a method to increase the lower spatial resolution of the multi-spectral Landsat images based on higher resolution of panchromatic imagery. Finally, the following factors were generated:

Satellite Data and Pre-Processing
To produce RS-derived factors such as LULC, NDVI, distance from lineament, and lineament density, Landsat-8 (OLI) data in 2019 (August), provided by the "United States Geological Survey (USGS)," were acquired. The image was clear and nearly "cloud-free" (with total cloud cover less than 1%). Pre-processing of the image comprised "radiometric calibration," "QUick atmospheric correction," and "geometric correction" algorithms using ENVI software. The pre-processing was conducted to convert the digital numbers of the multi-spectral bands (band 1-7) to "surface reflectance" (ranging from 0 to 1). To increase the resolution of the multi-spectral bands in the Landsat-8 image from 30 to 15 m, the "Gram-Schmidt method" was used for pan-sharpening in ENVI. This is a method to increase the lower spatial resolution of the multi-spectral Landsat images based on higher resolution of panchromatic imagery. Finally, the following factors were generated:

Generation of Land Use/Land-Cover Classification and Accuracy Assessment
Many studies have underlined the importance of LULC in the hydrological interpretation for the development of GW potential [44,49,50]. The presence and productivity of GW in a particular area as well relies upon the soil surface features, which act as a mediator in the process of "runoff-infiltration and runoff-evapotranspiration" [51]. Hence, to detect LULC variation for the Hesare-No Basin, multi-spectral bands of Landsat-8 (OLI) images and three supervised classification algorithms, i.e., maximum likelihood, neural network, and decision tree, were implemented. The Hesare-No Basin was categorized into five principal LULCs encompassing "orchard, agriculture, bare land, rangeland, and water surface." Validation of the LULC maps was conducted by electing 50 checkpoints (through random sampling) on the ground for each LULC class compared with the corresponding grid in the satellite images by calculating the overall accuracy and kappa coefficients. These indices are frequently used as quantitative validation indices [52,53]. The validation results are depicted in Table 1. The table shows that the overall accuracy for the maximum likelihood, neural network, and decision tree algorithms was 87%, 88%, and 91%, respectively, while the kappa coefficients were 76%, 78%, and 82%, respectively. According to previous research, kappa coefficients greater than 75% demonstrate that the classification and ground truth data are comparable [54]. Thus, the decision tree algorithm was implemented to create the LULC map for the Hesare-No Basin due to its greater efficiency ( Figure 5a). The LULC map demonstrated that 0.03%, 1%, 11.5%, 37.8%, and 49.7% of the Hesare-No Basin are covered by water surface, orchards, agriculture, bare land, and rangeland, respectively. Table 1. Accuracy assessment of land use/cover (LULC) classification for the Hesare-No Basin.

Maximum Likelihood Neural Network Decision Tree
Overall Accuracy (%) 87 88 91 Kappa Coefficient (%) 76 78 82 Retrieval of Normalized Difference Vegetation Index (NDVI) NDVI is one of the primary indicators to measure vegetation cover. NDVI values between 0.1 and 0.75, in general, denote vegetation cover, while values greater than 0.75 imply a dense canopy. The NDVI for bare land and soil is close to zero, and negative values indicate water surface such as reservoirs [55]. The NDVI was calculated from the red and near infrared bands using: where NIR refers to surface reflectance of band 5 (Landsat-8), and Red refers to surface reflectance of band 4 (Landsat-8). The distribution map of NDVI is shown in Figure 5b.

Distance from Lineament and Lineament Density
The distance from lineaments is beneficial for GW potential assessment as the target hydrogeological zones that are located adjacent to linear structures control the movement and storage of GW [13,56]. It also describes surface morphologies such as "faults, fractures, cracks," as well as provides information on the infiltration of water depending on rock properties [51]. The distance from lineaments in the Hesare-No Basin was provisioned from the Landsat-8 OLI image as shown in Figure 5c. Another factor is lineament density, which is utilized to distinguish areas with great lineament concentrations that are correlated with permeability and GW potential [57]. Lineament density illustrates the total length of lineaments in a unit area, which is calculated by a line density function. The distribution map of the lineament density factor is shown in Figure 5d.

Lithology
Geological formations are important for GW potential, influencing the porosity and permeability of the aquifer material [12,58]. The geology controls the quantity and quality of GW [13]. The geological map was obtained from the Iranian Department of Geological Survey [59]. Accordingly, the Hesare-No Basin was grouped into 15 lithological units with differences in both lithology and geological age (Table 2, Figure 6).

Lithology
Geological formations are important for GW potential, influencing the porosity and permeability of the aquifer material [12,58]. The geology controls the quantity and quality of GW [13]. The geological map was obtained from the Iranian Department of Geological Survey [59]. Accordingly, the Hesare-No Basin was grouped into 15 lithological units with differences in both lithology and geological age (Table 2, Figure 6).   Table 2).

Logistic Model Tree
Logistic model tree is a classification MLA that integrates the "C4.5 and LR algorithms" [60]. These include a tree structure with a number of inner nodes and leaves [61]. The "information gain" is utilized to divide, and the LogitBoost is employed to build a LR at each node [62]. The logistic model tree implements the CART to prune the trees [63], and implements cross-validation for exploring the number of iterations of logitBoost to tackle overfitting [26]. Logistic model tree employs the C4.5 at nodes and LR to explore the probability within a leaf node as:  Table 2).

Logistic Model Tree
Logistic model tree is a classification MLA that integrates the "C4.5 and LR algorithms" [60]. These include a tree structure with a number of inner nodes and leaves [61]. The "information gain" is utilized to divide, and the LogitBoost is employed to build a LR at each node [62]. The logistic model tree implements the CART to prune the trees [63], and implements cross-validation for exploring the number of iterations of logitBoost to tackle overfitting [26]. Logistic model tree employs the C4.5 at nodes and LR to explore the probability within a leaf node as: where P(N|x) presents "posterior probability" in a leaf node of N categories within the input vector x, and L i (x) denotes the least square fits obtained as: where n is the number of driving factors, and α 0 and α i are the coefficients of the component of vector x = x i , representing the driving factors. To apply the logistic model tree, R statistical software (R Foundation for Statistical Computing, Vienna, Austria) and two packages including "caret" [64] and "RWeka" [65] were used through a 10-fold cross-validation.

Deep Boosting
Deep boosting was introduced by Cortes et al. [28] and implements a number of deep decision trees to obtain highly accurate outputs. The major feature of this method is a "capacity-conscious element" for hypothesis choice [28]. The method is a boosting-based MLAs and an ensemble of many weak learners in order to obtain a high accuracy output. A full description of the deep boosting can be found in Cortes et al. [28]. To run the deep boosting algorithm, "caret" [64] and "deepboost" [66] packages were run in R statistical software considering a 10-fold cross-validation.

Boosted Regression Trees
Boosted regression trees are one of many ensemble MLAs that enhance the efficacy as well as prediction capability of single methods by making use of many trees [7,41,67,68]. Two influential components, namely "gradient boosting" and "classification and regression tree," are integrated in boosted regression trees [69]. Boosting is applied to enhance the predictive accuracy of regression trees, and is related to the CART class of algorithms [41]. Decision trees display information from great data amounts in an efficient manner that is straightforward, comprehensible, and quick. It is less susceptible to over-fitting [67,70]. In boosted regression trees, three factors are required to be optimized, i.e., "number of trees," "interaction depth," and "shrinkage" [69]. The boosted regression trees can be calculated based on previous literature [41,68,71]. To run the boosted regression trees algorithm, caret [64] and gbm [72] packages were conducted in the R statistical software through a 10-fold cross-validation scheme.

K-Nearest Neighbors
The k-nearest neighbors technique is a common classification MLA that is nonparametric and eliminates estimation of the class density function [73]. This algorithm has been implemented in processes such as GW, landslide, gully erosion, and flood mapping [17,74,75]. The main assumption of the k-nearest neighbors is that unknown cells are classified based on their similarity to cells in the training set [76], which is explained through the Euclidean distance application [77]. To exemplify the classification procedure in the k-nearest neighbors, an illustrative case is presented in Figure 7. The blue triangle in the figure is an unknown cell, and the target is to distinguish whether it is a GW spring or not. If the value of K (number of nearest neighbors) is equal to 3 (the internal dashed circle), it will be assumed to be a spring (green squares) since there are two springs and one non-spring (red circle) in the inward dashed circle. However, if the value of k is equal to 9 (the external dashed circle), it would be interpreted as a non-GW spring, as there are five non-spring cells and four springs in the outer dashed circle. To run k-nearest neighbors in this study, the caret package [64] was applied in the R statistical software through a 10-fold cross-validation scheme.
Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 25 Figure 7. Schematic of classification procedure in the k-nearest neighbors algorithm.

Random Forest
Random forest is another vigorous and efficient MLA that was developed by Breiman [78] as an extension of the "classification and regression trees" to promote its prediction competence [39]. Numerous decision trees are fabricated through randomly bootstrapped calibration sets [78]. Afterwards, the model integrates the average outcomes of all the trees [8]. The user shall select two parameters, i.e., "the number of variables at each split" and "the number of tree" [39]. This model does not use all available data to grow the tree; it utilizes 66.66% of the Bootstrap information. Then, 33.33% of the remaining data are used to evaluate the fitted tree. In this research, this model was applied through the 'randomForest' package [79] in the R statistical software.

Validation of the Algorithms
The performance of the logistic model tree, deep boosting, boosted regression trees, and knearest neighbors in the investigation of GW potential was assessed by utilizing the receiver operating characteristic (ROC) curve, accuracy, kappa, sensitivity, and specificity [8,12,[80][81][82]. To quantify the prediction accuracy, the area under the ROC curve (AUC) was computed. The AUC values from 0.7 to 0.8 indicate that the prediction accuracy is acceptable, AUC ranging from 0.8 to 0.9 indicates excellent, and greater than 0.9 shows outstanding results [44].
Sensitivity and specificity indicators are the percentage of correctly classified pixels in regions with high and low GW potential, respectively. Sensitivity and specificity are calculated as: where TP is "true positives," FN is "false negatives," TN is "true negatives," and FP is "false positives." TP and TN are the number of pixels that are correctly classified, whereas FP and FN are the number of pixels erroneously classified. Accuracy and kappa indicators are calculated as:

Random Forest
Random forest is another vigorous and efficient MLA that was developed by Breiman [78] as an extension of the "classification and regression trees" to promote its prediction competence [39]. Numerous decision trees are fabricated through randomly bootstrapped calibration sets [78]. Afterwards, the model integrates the average outcomes of all the trees [8]. The user shall select two parameters, i.e., "the number of variables at each split" and "the number of tree" [39]. This model does not use all available data to grow the tree; it utilizes 66.66% of the Bootstrap information. Then, 33.33% of the remaining data are used to evaluate the fitted tree. In this research, this model was applied through the 'randomForest' package [79] in the R statistical software.

Validation of the Algorithms
The performance of the logistic model tree, deep boosting, boosted regression trees, and k-nearest neighbors in the investigation of GW potential was assessed by utilizing the receiver operating characteristic (ROC) curve, accuracy, kappa, sensitivity, and specificity [8,12,[80][81][82]. To quantify the prediction accuracy, the area under the ROC curve (AUC) was computed. The AUC values from 0.7 to 0.8 indicate that the prediction accuracy is acceptable, AUC ranging from 0.8 to 0.9 indicates excellent, and greater than 0.9 shows outstanding results [44].
Sensitivity and specificity indicators are the percentage of correctly classified pixels in regions with high and low GW potential, respectively. Sensitivity and specificity are calculated as: where TP is "true positives," FN is "false negatives," TN is "true negatives," and FP is "false positives." TP and TN are the number of pixels that are correctly classified, whereas FP and FN are the number of pixels erroneously classified. Accuracy and kappa indicators are calculated as: where n is the proportion of pixels that are properly categorized as spring or non-spring, and N is the total number of training pixels. Kappa ranges between 0 and 1, and the closer to 1 indicates a better accuracy. At last, for determining the significant difference of the algorithms' outputs, we implemented the Friedman test. The Friedman test is a "non-parametric" test equivalent to the repeated measures analysis of variance (ANOVA). Under the null-hypothesis, it depicts that all the algorithms are equivalent, so a rejection of this hypothesis shows the existence of differences among the efficiency of all the algorithms investigated. To do so, all the computed accuracy indices for the algorithms were considered. It is noteworthy to mention that area under the ROC curve was considered between 0 and 1 like other indices.

Machine Learning Algorithm Parameter Optimization Results
All MLAs were optimized based upon a 10-fold cross-validation. The logistic model tree was constructed by 21 "iterations," leading to accuracy and kappa values of 0.83, and 0.66, respectively. As per the deep boosting algorithm, the final parameter optimization is presented in Figure 8. As observed, deep boosting was constructed with 100 iterations, "tree depth" equal to 3, beta equal to 0.0039, lambda equal to 0.0625, and "loss type" equal to l with accuracy and kappa values of 0.840, and 0.681, respectively. In the k-nearest neighbors algorithm, a k value equal to 1 among values from 1 to 45 was opted as the best parameter ( Figure 8). Further, a significantly diminishing trend was observed between the accuracy of the k-nearest neighbors and the k. The calculated accuracy and kappa indices for the k-nearest neighbors were 0.807 and 0.614, respectively. In the case of the boosted regression trees algorithm, it was optimized with 200 trees, "interaction depth" of 3, shrinkage of 0.01, and "minimum terminal node" size of 20, leading to accuracy and kappa values of 0.840 and 0.680, respectively. The results of the random forest algorithm showed that it was tuned with "Minimum size of terminal nodes" of five, three variables at each node, and 700 trees.
where n is the proportion of pixels that are properly categorized as spring or non-spring, and N is the total number of training pixels. Kappa ranges between 0 and 1, and the closer to 1 indicates a better accuracy. At last, for determining the significant difference of the algorithms' outputs, we implemented the Friedman test. The Friedman test is a "non-parametric" test equivalent to the repeated measures analysis of variance (ANOVA). Under the null-hypothesis, it depicts that all the algorithms are equivalent, so a rejection of this hypothesis shows the existence of differences among the efficiency of all the algorithms investigated. To do so, all the computed accuracy indices for the algorithms were considered. It is noteworthy to mention that area under the ROC curve was considered between 0 and 1 like other indices.

Machine Learning Algorithm Parameter Optimization Results
All MLAs were optimized based upon a 10-fold cross-validation. The logistic model tree was constructed by 21 "iterations," leading to accuracy and kappa values of 0.83, and 0.66, respectively. As per the deep boosting algorithm, the final parameter optimization is presented in Figure 8. As observed, deep boosting was constructed with 100 iterations, "tree depth" equal to 3, beta equal to 0.0039, lambda equal to 0.0625, and "loss type" equal to l with accuracy and kappa values of 0.840, and 0.681, respectively. In the k-nearest neighbors algorithm, a k value equal to 1 among values from 1 to 45 was opted as the best parameter ( Figure 8). Further, a significantly diminishing trend was observed between the accuracy of the k-nearest neighbors and the k. The calculated accuracy and kappa indices for the k-nearest neighbors were 0.807 and 0.614, respectively. In the case of the boosted regression trees algorithm, it was optimized with 200 trees, "interaction depth" of 3, shrinkage of 0.01, and "minimum terminal node" size of 20, leading to accuracy and kappa values of 0.840 and 0.680, respectively. The results of the random forest algorithm showed that it was tuned with "Minimum size of terminal nodes" of five, three variables at each node, and 700 trees.

Validation of Maps and Performance Analysis of the Algorithms
The validation outcomes for the MLAs in this research are illustrated in Table 3. Based on the table, deep boosting and logistic model tree algorithms had slightly better performances than the boosted regression trees and random forest algorithm, and much higher performances than the k-nearest neighbors algorithm regarding accuracy, kappa, and ROC. In a more detailed view, regarding specificity, it is seen that the logistic model tree and deep boosting algorithms had the highest ability to predict the field data, while the boosted regression trees algorithm showed a slightly higher performance in determining absence points. The non-parametric Friedman test with a p-value threshold of 0.05 was performed to compare the performances of MLAs. The findings of the five groundwater potential models from the Friedman test are shown in Table 4. The findings indicate that the p-values were lower than 0.05 (0.007); thus, the null hypothesis is rejected, which indicates the existence of statistically significant differences between the performances of the models.

GW Potential Maps
The results of the GW potential maps produced by the logistic model tree, deep boosting, boosted regression trees, k-nearest neighbors, and random forest are depicted in Figure 9. In line with previous research [9,13,39,83], the natural break (Jenks) method was utilized to classify the GW potential maps into four classes of "low, moderate, high, and very high." The MLAs illustrated a similar pattern of GW potential in the Hesare-No Basin. However, for GW exploitation, the GW potential map generated by the logistic model tree is suggested regarding its superior performance. This map implies that the closest areas, as well as areas along the rivers, are assigned as very high GW potential (Figure 9). The north-eastern part of the study region has higher GW potential and is recommended for GW exploitation. On the other hand, the map reveals that the eastern part of the Hesare-No Basin is categorized as low GW potential, which can be associated with greater distances to rivers. This information can be used for water resources and also land use management. The calculated range and total area of each class are presented in Table 5. The results reveal that 110. 77, 92.90, 78.44, 73.93, and 62.83 km 2 of the Hesare-No Basin are classified as very high GW potential by the k-nearest neighbors, logistic model tree, boosted regression trees, random forest, and deep boosting, respectively. On the other hand, low GW potential was predicted to cover 371.02, 342.25, 315.04, 230.23, and 220.90 km 2 of the total basin area by the logistic model tree, boosted regression trees, random forest, deep boosting, and k-nearest neighbors, respectively. The percentage of each class by the logistic model tree, deep boosting, boosted regression trees, k-nearest neighbors, and random forest is shown in Figure 10. It can be observed that 29%, 28%, 27%, 27%, and 24% of the Hesare-No Basin are classified as high and very high GW potential by the k-nearest neighbors, deep boosting, random forest, boosted regression trees, and logistic model tree, respectively. Additionally, it was demonstrated that the highest percentage of "low, moderate, high, and very high" classes belonged to the logistic model tree, k-nearest neighbors, deep boosting, and k-nearest neighbors MLAs, respectively.
boosting, random forest, boosted regression trees, and logistic model tree, respectively. Additionally, it was demonstrated that the highest percentage of "low, moderate, high, and very high" classes belonged to the logistic model tree, k-nearest neighbors, deep boosting, and k-nearest neighbors MLAs, respectively.           Table 6 presents the importance of the driving factors in the modelling procedure for the Hesare-No Basin obtained by different MLAs. The importance of the driving factors is sorted based on logistic model tree results. It should be noted that the logistic model tree obtained the highest accuracy. As can be seen, the NDVI was found to be the most important factor in the modelling with all assessed MLAs. The secondary important factor was distance from the rivers in the logistic model tree, while altitude was the second important factor in boosted regression trees and random forest with 20.73 and 19.46 values, respectively. Moreover, the results indicate that altitude generally is the third important factor. Overall, RSP and LULC, with significant differences, are specified as the other important factors. The results also illustrate that distance from lineament and lineament density factors could be contributed moderately to the groundwater potential mapping. However, aspect, slope degree, and slope length factors had the lowest important value in all MLAs, generally.

Discussion
Pinpointing areas with high GW potential based on spring data is regarded as a significant method for the proper conservation and management of freshwater supplies, in particular in semi-arid districts. Hence, MLAs and RS-derived driving factors have been gaining popularity in this field. To qualitatively assess the outcomes, logistic model tree, boosted regression trees, deep boosting, and random forest algorithms led to GW potential maps with AUC values between 0.8 and 0.9, categorizing them as "very good" predictors, while k-nearest neighbors had an AUC value between 0.7 and 0.8, categorizing it as a "good" predictor based on Yesilnacar and Topal [84]. More precisely, the outputs reveal that the deep boosting and logistics model tree produced competitive and slightly better performances than the boosted regression tree algorithm. The deep boosting and logistic model tree also had better performances than the random forest with area under the ROC curve differences of greater than 1.4%. This difference proves their acceptable performance based on this fact that random forest has been reported as a strong algorithm in groundwater potential mapping [8,16]. Based on Friedman non-parametric test results, the difference in the performance of the algorithms is statistically significant. The logistic model tree and deep boosting algorithms have the potential to be strengthened by using parameter optimization algorithms, such as the genetic algorithm, to improve their accuracy [85]. As an example of the impact of the genetic algorithm on groundwater potential algorithms, we can refer to Naghibi et al. [85], which used the genetic algorithm to optimize random forest and confirmed its considerable impact on the model's performance.
The greater accuracy of the logistic model tree could be associated with the fact that it benefits from the CART algorithm, which assists pruning of the trees and hinders them from "overfitting." The stated feature of the logistic model tree accompanied by the application of LogitBoost can be the reason for its superior performance [86]. Similarly, high accuracy of the logistic model tree is declared by researchers in spatial investigations such as flash flood susceptibility, landslides, and gully erosion [29,31,87]. Deep boosting takes advantage of the data analyses and the favorable learning ability [28], which are the reasons for its superior performance in generating a GW potential map. On the other hand, boosted regression trees also produced high accuracy maps, which could be due to the fact that they keep important GW conditioning factors, detect the interactions, model distinct kinds of factors, and finally, manage missing data [17,88]. Their acceptable efficiency is in line with the previous studies [7,17,69]. Further, the great accuracy of the random forest model could be due to its low aptitude to overfitting, and the capability to support high-dimensional datasets [17,89]. For weaker performance of the k-nearest neighbors algorithm, this can be referred to its sensitivity to unrelated and unnecessary data as all characteristics are involved in the similarity and have a role in the final prediction.
Among the seven highly important driving factors, three, i.e., NDVI, distance from lineament, and lineament density, are direct RS-derived products. Except the RS-based factors, distance from rivers, altitude, RSP, and profile curvature are other important factors in the assessment of GW potential in the study region. This fact highlights the great impact of the RS-derived data on GW studies and necessitates the demand for high-accuracy RS products in future research on GW potential. The NDVI was ranked as the most influencing factor since it represents the vegetation cover and impacts the velocity of "water flow" and, therefore, "soil infiltration rate." The results are in agreement with Davoodi-Moghaddam et al. [8] and Naghibi et al. [17], which indicated that the NDVI factor is the most significant factor for groundwater potential mapping. Furthermore, the great influence of distance from rivers can be associated with the impact of rivers on GW natural recharge. As stated, rivers are the primary sources of natural GW augmentation particularly in arid and semiarid regions [90]. Moreover, the large contribution of the altitude, RSP, and profile curvature can be connected to the influence of topography on drainage system development, water movement, and subsequently, the soil infiltration rate. With respect to lineament-based factors, their greater influence could be linked to their controlling role in soil infiltration, and the GW movement and storage in an aquifer. Several studies have shown that NDVI [20], distance from rivers [9] altitude [6,41], and RSP [7,44] are highly contributing factors to assess GW potential, which is in line with our outcomes.

Conclusions
Accurate appraisal of GW supplies is fundamental in reaching sustainable development. However, there is often a general shortage of detailed hydro-geological data of aquifers and their productivity in developing countries. RS-based data products can offer a wealth of information for the data-scarce regions. This study applied the two new algorithms, deep boosting and logistic model tree, and compared them with the three benchmarks, i.e., boosted regression trees, k-nearest neighbors, and random forest MLAs. The outcome was satisfying in accordance with the AUC and accuracy scores. The deep boosting and logistic model tree models produced accurate and similar maps compared to the boosted regression trees, and outperformed the random forest and k-nearest neighbors algorithm. This emphasizes that the new MLAs, i.e., deep boosting and logistic model tree, can be used in GW studies to detect areas with high potential for GW exploitation. The other target of the current research was to scrutinize the importance of the factors in GW potential assessment, which demonstrated that the NDVI with high difference was the most important factor, followed by the distance from rivers, altitude, RSP, profile curvature, distance from lineaments, and lineaments density. This highlights the outstanding contribution of the NDVI factor and the considerable impact of lineament-based factors among the impact of DEM-derived factors. This approves the high influence of RS-derived factors on the modelling process. Using RS-based data is indeed a potential alternative for areas confronted with lack of data, and in particular, hydrogeological information, which is difficult to access. Overall, a combination of RS and DEM-derived data can provide sufficient amount of relationships for GW studies. As per transferability, in the case of the algorithms, all of them are freely available in the R statistical software and can be used by the water resources managers. Moreover, this study used the ASTER digital elevation model, which is freely available. Further, RS-derived factors were obtained by Landsat-8 (OLI) images that are also freely available worldwide. However, two other factors, including spring locations and lithology, might not be available in some areas, which can limit the application of the current methodology. In the case of springs as an indicator of GW potential, it can be replaced by the well discharge data. In the case of inaccessibility to both spring and well locations, researchers can utilize expert judgement-based approaches, such as the analytical hierarchy process, and define the weights of the factors and obtain GW potential maps. In the respect of lithology, it can be replaced by some other factors such as DEM-derived and other RS-based factors to feed the models with as much information and patterns as possible. Due to these limitations, we can recommend the application of the current methodology with some changes regarding the factors in other regions to tackle data insufficiency.