
Novel Ensemble Machine Learning Modeling Approach for Groundwater Potential Mapping in Parbhani District of Maharashtra, India

Department of Geography, Faculty of Natural Sciences, Jamia Millia Islamia, New Delhi 110025, India
Institute for Global Environmental Strategies, Hayama 240-0115, Kanagawa, Japan
Department of Geography, University of Mumbai, Mumbai 400098, India
Faculty of Agronomy, Université Catholique de Bukavu, Bukavu 285, Democratic Republic of the Congo
Centre Régional d’Etudes Interdisciplinaires Appliquées au Développement Durable (CEREIAD), Université Catholique de Bukavu, Bukavu 285, Democratic Republic of the Congo
Department of Geography, University of Gour Banga, Malda 732101, India
Author to whom correspondence should be addressed.
Water 2023, 15(3), 419
Submission received: 17 December 2022 / Revised: 14 January 2023 / Accepted: 17 January 2023 / Published: 19 January 2023


Groundwater is an essential source of water, especially in arid and semi-arid regions of the world. Water demand driven by exponential population growth has placed stress on available groundwater resources. Further, climate change has affected the quantity of water globally. Many parts of Indian cities are experiencing water scarcity. Thus, assessment of groundwater potential (GWP) is necessary for sustainable utilization and management of water resources. We utilized a novel ensemble approach combining artificial neural network multi-layer perceptron (ANN-MLP), random forest (RF), M5 prime (M5P) and support vector machine for regression (SMOReg) models to assess groundwater potential in the Parbhani district of Maharashtra, India. Ten site-specific influencing factors (elevation, slope, aspect, drainage density, rainfall, water table depth, lineament density, land use land cover, geomorphology and soil types) were integrated to prepare groundwater potential zones (GWPZs). The results revealed that the largest area of the district fell under the moderate GWP category, followed by poor, good, very good and very poor. Poor GWPZs are spread over the northern, central and southern parts of the district, while very poor GWPZs are mostly found in the north-western and southern parts. The study calls for policy measures to conserve and manage groundwater in these parts. The ensemble model proved effective for assessing GWP zones. The outcome of the study may help stakeholders utilize groundwater efficiently and devise suitable strategies for its management. Other geographical regions may find the methodology adopted in this study effective for groundwater potential assessment.

1. Introduction

Water is one of the most significant and dynamic natural resources on the planet. Various forms of life and developmental activities rely upon it. Exponential population growth has affected the availability and quality of water [1,2]. Surface water tends to run dry during the summer season because of the limited water in lakes and rivers, making it scarce in a number of cases [3]. Consequently, groundwater has become the major alternative source of water for satisfying human needs [4]. The changing pattern of precipitation and temperature poses a constant threat to groundwater hydrology. The scientific community agrees that climate change is expected to affect groundwater sources significantly during the upcoming decades [5,6,7]. The significance of groundwater for irrigation has increased in the era of climate change. Many communities, including rural dwellers in low-income countries, depend on groundwater in this situation of water insecurity, as it is the only perennial source of fresh water in these countries. Groundwater-fed irrigation has become essential for achieving global food security because it provides a shield against extreme climate events. Furthermore, it helps to eradicate poverty by increasing yields and reducing crop failures. Groundwater is expected to become more valuable in the coming years as temporal variations in precipitation, temperature, surface water and soil moisture are expected to increase as a result of more frequent and intense climate extremes linked to climate change [8].
South Asia, the Middle East and Africa are facing severe water scarcity [5,9]. Sustainable groundwater management is especially important in arid regions where land has become sensitive due to frequent drought events in recent decades [10]. The Parbhani district has faced several droughts in the past due to its hard rock structure and undulating topography. A basaltic Deccan trap lava layer limits the infiltration rate in the district, and the poorly permeable soil layer restricts groundwater recharge [5]. High dependence on groundwater further accentuates the pressure. Thus, mapping of groundwater potential zones (GWPZs) is essential for a water resource management plan in the district. Groundwater potential (GWP) is considered in terms of the availability and possibility of a certain quantity of groundwater storage within a region [11]. Traditionally, hydrogeological drilling, field surveying and geophysical sampling methods have been used to delineate GWP zones [12]. Though these methods are effective, they are expensive, time-consuming and labor-intensive [5]. With advancements in remote sensing, geographical information systems (GIS) and photogrammetry, new and more sophisticated methods of water assessment have been developed recently [10]. Various numerical models and methods have been utilized to assess GWPZs in the literature, including the analytical hierarchy process (AHP) [5,13], frequency ratio [14], fuzzy logic [14,15], Shannon entropy [16], evidential belief function [17], maximum entropy [18], weights of evidence [19], logistic regression [20], artificial neural networks [21] and the tritium injection method [22]. The analytical hierarchy process is an effective method for assessing groundwater potential. However, the long-term groundwater data needed to apply it fully when analyzing groundwater in geo-ecological environments are unavailable [23,24].
Statistical methods are widely utilized for GWP mapping, mainly because spring and well data are available [25]. Furthermore, several studies [26,27,28] have examined the application of ensemble methods such as evidential belief function and boosted regression tree, weights of evidence, linear regression, tree-based models and artificial neural network (ANN) for preparing GWP maps. Subsequently, the non-linear nature of the relationships involved has led to the development of artificial intelligence-based models. The models utilized for mapping GWP zones include decision stumps [29], random forest (RF) [5,30], classification and regression tree (CART) [31], integrated decision-making model [32], ANN [33], support vector machine (SVM) [34], multivariate adaptive regression splines (MARS) [35] and alternating decision trees [36]. Each model is constructed to obtain the most cost-effective and accurate result. Additionally, the accuracy of the models has improved due to the integration of field data [37,38]. Machine learning models are advanced algorithms for precise mapping and analysis of GWP [39]. Though a large number of scholars have utilized machine learning algorithms for assessing groundwater potential, there has always been uncertainty in the results produced. In the current study, artificial neural network multi-layer perceptron (ANN-MLP), random forest (RF), M5 prime (M5P) and support vector machine for regression (SMOReg) machine learning models were ensembled to produce an optimal predictive model for groundwater potential zones. We argue that the ensemble approach can be utilized in other geographical regions interested in managing groundwater resources effectively.

2. Site Description

Parbhani District is situated on the bank of the Godavari River in the state of Maharashtra, India. It is located between 18°45′ and 20°01′ north latitude and between 76°13′ and 77°26′ east longitude and covers an area of 6155 km2 (Figure 1). The district is experiencing rapid population and economic growth [40] and is home to 1.8 million people. Geologically, the study area comprises hard rock with a Deccan trap basalt lava layer, which limits the infiltration of water. Poor permeability and porosity in the aquifers restrict groundwater recharge. Due to the rolling and undulating topography, soil of varying thickness has developed [41]. The soil varies from clayey to clayey loamy in valleys to sandy loamy on hills and slopes. The district has a semi-arid climate [42]. Generally, it receives an average rainfall of 938 mm from June to September during the south-west monsoon [43]. It lies in a moderate to moderately high rainfall zone. The temperature varies from 21.8 °C to 44.6 °C. Most of the population is engaged in agricultural activities. Cotton, soyabean, pigeon pea, black gram and green gram are the main crops in the study area [44]. Canals are the primary source of irrigation in the study area. Dudhana dam and Majalgaon dam are the major sources of drinking water. The demand for groundwater for agricultural and domestic needs has increased with population growth.

3. Database and Methodology

3.1. Groundwater Potential Conditioning Factors

A large number of conditioning factors determine groundwater occurrence and replenishment. Ten site-specific influencing factors, namely elevation, slope, aspect, drainage density, land use land cover, rainfall, water table depth, lineament density, geomorphology and soil types, were used to assess groundwater potential zones based on previous literature surveys [14,45]. The thematic layer maps were prepared in a geographical information systems (GIS) environment. The details of the datasets utilized for preparing GWP zones are presented in Table 1.

3.1.1. Elevation

Elevation is the most crucial conditioning factor for GWP mapping. It was used to examine changes in vegetation and soil characteristics that occur in response to climate conditions [46]. Groundwater potential is inversely related to elevation: high elevation corresponds to a low probability of groundwater occurrence and vice versa [39]. Elevation in the district ranges from 329 m to 586 m, showing wide variation across the district (Figure 2a). Highly elevated areas are mostly located in the northern and southern parts of the district. The flood plain areas of the Godavari River and the eastern part of the district have low elevation.

3.1.2. Aspect

Aspect is the direction of slope and characterizes the physiographic trend. The direction of surface runoff controls the accumulation of water and affects groundwater potential. Aspect also affects the duration of sunlight exposure and has an impact on the physical characteristics of a slope [23]. Thus, it determines soil moisture content and flow paths to a significant extent. Flat, north, north-east, east, south-east, south, south-west, west and north-west aspects were observed in the study area (Figure 2b).

3.1.3. Slope

Slope determines surface runoff and the rate of infiltration [13]. It helps in interpreting geological and geomorphological structure. A steep slope restricts groundwater recharge and replenishment during a rainfall event. Slope varies from 0° to 18° in the district. Most parts of the district have a gentle slope, while the northern and southern flanks are characterized by steep slopes (Figure 2c).

3.1.4. Drainage Density

Drainage density is a measure of the closeness of streams and similar water bodies. It is inversely related to the groundwater prospect [14,38]: areas with higher drainage density are characterized by less infiltration and, hence, lower groundwater potential [13]. The north-east and south-west regions show higher drainage density (Figure 2d). Moderate drainage density was found in the south-western and north-central parts of the district, while the central-eastern parts experienced low drainage density.
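For reference, drainage density is conventionally computed as total stream length per unit area. The expression below is that standard definition, given as a sketch only; the article does not state the exact formula, units or grid cell size used to produce Figure 2d, so the symbols are assumptions.

$$D_d = \frac{\sum_{i=1}^{n} L_i}{A}$$

Here $L_i$ is the length of stream segment $i$ inside the mapping unit (for example, in km) and $A$ is the area of that unit (for example, in km2), so $D_d$ is expressed in km per km2.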

3.1.5. Rainfall

Rainfall is a crucial component of the hydrological cycle and a major source of groundwater recharge. The amount of rainfall and its spatial–temporal distribution directly impact hydrology and hydrogeological conditions [47]. The distribution of rainfall controls the volume of water and moisture contained in the soil. A low rainfall zone (below 875 mm) was observed in the north-western part of the district, while a high rainfall zone (above 900 mm) was found in the eastern part of the district (Figure 3a).

3.1.6. Land Use Land Cover (LULC)

LULC highly influences groundwater occurrence through various hydrological processes, such as infiltration, evapotranspiration and surface runoff. Alteration in the LULC by anthropogenic activities affects groundwater quantity and quality [13]. Most of the area of the district is under agriculture and vegetation. However, barren land is situated in the northern and southern parts of the district. Built-up areas are concentrated in the central part of the Parbhani district. The Godavari River is the main water body in the study area. However, some water bodies are also found in the north-eastern part (Figure 3b).

3.1.7. Lineament Density

Lineaments are linear or curvilinear surface features with relatively straight alignments. They reflect underlying structural features. Permeability and porosity are controlled by faults and fractures in the rocks [14]. Runoff and the occurrence of groundwater are determined by the rate of infiltration through these lineament features. Thus, a positive relationship exists between lineament density and groundwater recharge: the higher the lineament density, the higher the probability of groundwater occurrence [48]. The north-west and south-west parts of the district are characterized by a larger number of lineaments and thus have higher lineament density (Figure 3c).

3.1.8. Lithology

Lithology is one of the key geological factors for assessing groundwater potential. It directly influences the porosity and permeability of aquifer rocks. Thus, the status of the groundwater can easily be understood by assessing lithology [48]. Most of the area of the district contains lithology of denudational origin with a few spots of fluvial origin (Figure 3d). Lithology of fluvial origin is favorable for groundwater occurrence.

3.1.9. Soil Type

Soil is controlled by topographic conditions, geomorphology, climate, vegetative cover and the parent rock. Infiltration and percolation depend on soil conditions [49]. Three main soil types were clearly distinguished in the district. Cambisol vertisols cover the largest area of the district. Chromic vertisols are found in small pockets in the south, while chromic luvisols are found in the northern part of the district (Figure 4a).

3.1.10. Water Table

The water table layer is one of the most essential groundwater potential conditioning factors. Water table depth controls the time required for surface water to percolate and reach the water table. Shallower water levels with shorter travel times yield higher groundwater potential [50,51,52]. Water table depth varies throughout the year and its variations may be wide in some regions, especially where winter precipitation is often higher than summer precipitation. The north-eastern, north-western and north-central parts of the district have a shallow water table, while the southern part is characterized by a deep water table (Figure 4b).

3.2. Optimization of Machine Learning Algorithms

The groundwater potential conditioning factors dataset was divided into training and testing data to optimize ANN-MLP, RF, M5P, SMOReg and ensemble models. Various researchers have recommended 70% training and 30% testing of data [23,26,28,29]. Thus, the data was randomly divided into training and testing datasets in the ratio of 70 per cent to 30 per cent, respectively, to avoid overestimation of accuracy.
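The random 70/30 partition described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_train_test`, the fixed seed and the 100-point inventory are assumptions introduced here, and the article does not state which software performed the split.

```python
import random

def split_train_test(samples, train_fraction=0.7, seed=42):
    """Randomly split samples into training and testing subsets."""
    shuffled = list(samples)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical inventory of 100 sample points, identified by index only.
points = list(range(100))
train, test = split_train_test(points)
print(len(train), len(test))
```

Fixing the seed keeps the partition reproducible, so that all four models and the ensemble can be compared on the same testing subset.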

3.2.1. ANN-MLP

The ANN algorithm is widely used for groundwater potential mapping worldwide [28,53,54,55]. The multi-layer perceptron (MLP) neural network algorithm has an advantage over traditional ANN for prediction [23]: it does not require any prior assumptions about the training dataset to establish the relationship among the variables. The MLP has three components: input, hidden and output layers. The input layer is created using ground point control functions. The output layer is obtained by processing the input layer through a hidden layer. Groundwater potential zones are identified on the output layer. The following equations were used to implement the ANN-MLP model.
Assume that $X = \{X_i\}$, where $i = 1, \ldots, 10$, represents the GWP conditioning variables, and that $y = 1$ represents GWP points while $y = 0$ represents non-GWP points. Hence, the basic equation of MLP is:
$$y = f(X)$$
where the unknown function $f(X)$ is optimized using adjustable weights through the training of the neural network model. The error is equal to the difference between the target output ($d_k$) and the network output ($o_k$) during training and is calculated as:
$$e_k = d_k - o_k$$
The output error is minimized using weight adjustments in the neural network and the following equation was utilized [53]:
$$\Delta w_{ij}(n+1) = \eta\,\delta\,o_i + \alpha\,\Delta w_{ij}(n)$$
where $\Delta w_{ij}(n+1)$ and $\Delta w_{ij}(n)$ are the changes of weight in epochs $(n+1)$ and $(n)$, respectively, $\delta$ represents the error change rate, $o_i$ is the output of neuron $i$, $\eta$ represents the learning rate parameter and $\alpha$ represents the momentum coefficient [27].
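As a rough illustration of the weight adjustment with momentum, the sketch below applies the rule "new weight change = learning rate × error term × input activation + momentum × previous weight change" for three epochs. The numeric values and the constant error term are assumptions chosen for readability; they are not parameters reported in the study.

```python
def momentum_update(prev_delta, learning_rate, delta, output_i, momentum):
    """Weight change for epoch n+1: learning_rate*delta*output_i + momentum*prev_delta."""
    return learning_rate * delta * output_i + momentum * prev_delta

weight = 0.5        # hypothetical starting weight
prev_delta = 0.0    # no previous change before the first epoch
history = []
for epoch in range(3):
    # delta is held constant here purely for illustration; in real training it
    # is recomputed from the output error at every epoch.
    change = momentum_update(prev_delta, learning_rate=0.1, delta=0.2,
                             output_i=1.0, momentum=0.9)
    weight += change
    prev_delta = change
    history.append(change)
print(history, weight)
```

The momentum term carries part of the previous step forward, so successive changes grow while the error term keeps the same sign, which is what speeds up convergence on flat error surfaces.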

3.2.2. Random Forest (RF)

The random forest algorithm is a non-parametric statistical machine learning algorithm based on decision trees [56]. The RF algorithm can solve both classification and regression problems [28]. A bagging approach was used to train the model: it fits multiple decision trees to bootstrap samples of the data in order to generate an accurate result. The outputs of the individual decision trees determine the final output [18]. The following equation was utilized to determine the generalization error (GE).
$$GE = P_{x,y}\left(mg(x, y) < 0\right)$$
where the subscripts indicate that the probability $P$ is taken over the $(x, y)$ space, represented here by the groundwater potential factors, and the margin function $mg$ is expressed as:
$$mg(x, y) = av_k\, I\left(h_k(x) = y\right) - \max_{j \neq y} av_k\, I\left(h_k(x) = j\right)$$
where $I(\cdot)$ is the indicator function, $h_k$ is the kth tree classifier, whose decision regions are unions of hyper-rectangles, $av_k$ denotes the average over the trees and $j$ is any class other than $y$.
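To make the margin function concrete, the following sketch computes it from the votes of individual trees and estimates the generalization error as the share of samples with a negative margin. The vote lists are invented for illustration and the helper names are assumptions, not part of the study's workflow.

```python
from collections import Counter

def margin(tree_votes, true_label):
    """Share of trees voting for the true class minus the largest share for any other class."""
    counts = Counter(tree_votes)
    correct = counts.get(true_label, 0)
    wrong = max((c for label, c in counts.items() if label != true_label), default=0)
    return (correct - wrong) / len(tree_votes)

def generalization_error(all_votes, labels):
    """Empirical estimate of the probability that the margin is negative."""
    negative = sum(1 for votes, y in zip(all_votes, labels) if margin(votes, y) < 0)
    return negative / len(labels)

# Hypothetical votes (1 = GWP point, 0 = non-GWP point) from small forests.
votes_a = [1, 1, 0, 1, 0]   # three of five trees agree with the true label 1
votes_b = [0, 0, 1]         # only one of three trees agrees with the true label 1
print(margin(votes_a, 1), margin(votes_b, 1))
print(generalization_error([votes_a, votes_b], [1, 1]))
```

A positive margin means the forest classifies the sample correctly by majority vote; the larger the margin, the more confident the vote.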

3.2.3. SMOreg

Support vector machine (SVM) is a non-parametric supervised learning algorithm and has an advantage over conventional methods in terms of training datasets and accuracy [56,57,58]. It has been widely used and modified by various researchers [38,59]. SMOreg is an advanced version of SVM for regression analysis [57]. It divides large quadratic programming (QP) problems into a collection of the smallest possible QP problems in SVM regression [60]. Initially, an n-dimensional feature space is created from the training dataset by means of a kernel-mapping procedure. Following this, a hyperplane, whose dimension is one lower than that of its surrounding space, is drawn based on the defined data. The best-fit hyperplane is identified by increasing the margin to its maximum value [61]. The $f(x, w)$ hyperplane was determined as:
$$f(x, w) = \sum_{j=1}^{m} w_j\, g_j(x) + b$$
where $g_j(x)$ is the set of nonlinear transformations, $b$ is the bias and $w_j$ are the weights. The quality of the estimate is measured by the $\varepsilon$-insensitive loss function described by [58], in which $\varepsilon$ sets the size of the insensitive zone. Therefore, $|\xi|_\varepsilon$ is calculated using the following equation:
$$|\xi|_\varepsilon = \begin{cases} 0, & \text{if } |\xi| \le \varepsilon \\ |\xi| - \varepsilon, & \text{otherwise} \end{cases}$$
The optimization problem is solved by means of slack variables $\xi_i$ and $\xi_i^*$, as formulated in the following equations:
$$\text{Minimize } \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^*\right)$$
$$y_i - f(x_i, w) \le \varepsilon + \xi_i^*$$
$$f(x_i, w) - y_i \le \varepsilon + \xi_i$$
$$\xi_i,\ \xi_i^* \ge 0,\quad i = 1, \ldots, n$$
SVM regression thus performs linear regression in the high-dimensional feature space using the $\varepsilon$-insensitive loss, while keeping the model simple by minimizing $\|w\|^2$.
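The role of the insensitive zone can be illustrated with a short sketch: residuals smaller than epsilon cost nothing, larger ones are penalized linearly, and the penalty is traded off against the squared weight norm through the constant C. The weights, residuals, epsilon and C below are hypothetical values, and the objective is evaluated with each pair of slack variables set to its smallest feasible value.

```python
def eps_insensitive(residual, epsilon):
    """Epsilon-insensitive loss: zero inside the tube, linear outside it."""
    return max(0.0, abs(residual) - epsilon)

def svr_objective(weights, residuals, epsilon, C):
    """0.5*||w||^2 + C * sum of slacks, with slacks at their minimum feasible values."""
    regularizer = 0.5 * sum(w * w for w in weights)
    slack_total = sum(eps_insensitive(r, epsilon) for r in residuals)
    return regularizer + C * slack_total

# Hypothetical weights and residuals (observed minus predicted).
weights = [3.0, 4.0]
residuals = [0.05, -0.3]
print(eps_insensitive(0.05, 0.1), eps_insensitive(-0.3, 0.1))
print(svr_objective(weights, residuals, epsilon=0.1, C=2.0))
```

A wider tube (larger epsilon) ignores more of the small errors and typically yields fewer support vectors, at the cost of a coarser fit.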

3.2.4. M5P

Recently, the M5P technique has been widely used for predicting various natural hazards and related processes, namely landslide susceptibility mapping [53], flood hazard mapping [62], river-bed transport prediction [11] and spring discharge forecasting [63]. M5P is an adaptable algorithm that constructs decision trees with multivariable linear models at the leaves [11]. Construction, pruning and smoothing of the trees are the main steps of this algorithm. The best model is obtained by maximizing the standard deviation reduction (SDR) during M5P modelling. The SDR is calculated as:
$$SDR = SD(E) - \sum_{i} \frac{|E_i|}{|E|} \times SD(E_i)$$
where $E$ represents the set of cases, $SD(E)$ indicates the standard deviation of $E$, $E_i$ is the ith subset of cases from splitting the tree and $SD(E_i)$ is the standard deviation of $E_i$.
After the construction of trees, the pruning process starts to eliminate undesired sub-trees. The attributes are removed one by one to reduce the estimated error. The sharp discontinuities between adjacent linear models are compensated for by smoothing. Smoothing is accomplished by computing the predicted value with the leaf model. The predicted value is then filtered along the path back to the root node [11].
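The split criterion can be illustrated with a small sketch that compares the standard deviation reduction of two candidate splits. The target values are invented, and the use of the population standard deviation (`statistics.pstdev`) is an assumption made here; implementations may differ in that detail.

```python
from statistics import pstdev

def sdr(parent, subsets):
    """Standard deviation of the parent minus the size-weighted deviations of its subsets."""
    n = len(parent)
    return pstdev(parent) - sum(len(s) / n * pstdev(s) for s in subsets)

# Hypothetical groundwater potential values reaching one tree node.
values = [0.1, 0.2, 0.15, 0.8, 0.9, 0.85]
good_split = [values[:3], values[3:]]      # separates low values from high values
poor_split = [values[0::2], values[1::2]]  # mixes low and high values
print(sdr(values, good_split), sdr(values, poor_split))
```

The tree-growing step would prefer the first split, because it leaves each branch far more homogeneous than the parent node.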
Various studies have employed ensemble modeling methods to achieve high accuracy by combining the outputs of multiple models [4,14,23,26,28]. These outputs are treated as inputs to a further model in order to improve the final results. There are various methods to combine models. This study employed a stacking method to ensemble the ANN-MLP, RF, SMOreg and M5P models and generate the final groundwater potential zones map.
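The idea behind stacking can be sketched as follows: the predictions of the base models become the input features of a second-level (meta) learner. The article does not specify the meta-learner, so the linear least-squares combiner below is an assumption, as are the prediction values, which merely stand in for the outputs of four base models on eight held-out points.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def fit_meta_weights(base_preds, target):
    """Least-squares weights (no intercept) for combining base-model predictions."""
    k = len(base_preds[0])
    gram = [[sum(row[i] * row[j] for row in base_preds) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(base_preds, target)) for i in range(k)]
    return solve_linear(gram, rhs)

def stack_predict(weights, row):
    """Meta-learner output: weighted sum of the base-model predictions."""
    return sum(w * p for w, p in zip(weights, row))

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Hypothetical held-out predictions; columns stand for four base models.
base_preds = [
    [0.90, 0.80, 0.70, 0.60], [0.10, 0.20, 0.30, 0.50],
    [0.80, 0.90, 0.60, 0.70], [0.95, 0.70, 0.80, 0.40],
    [0.20, 0.10, 0.40, 0.30], [0.05, 0.30, 0.20, 0.60],
    [0.85, 0.75, 0.90, 0.80], [0.15, 0.25, 0.10, 0.20],
]
target = [1, 0, 1, 1, 0, 0, 1, 0]

weights = fit_meta_weights(base_preds, target)
stacked = [stack_predict(weights, row) for row in base_preds]
# In-sample comparison only; a real workflow would score on separate test data.
print(weights, mse(stacked, target))
```

In practice the meta-learner should be fitted on predictions for points the base models did not see during training, otherwise the combiner tends to favour whichever base model overfits most.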

3.3. Validation Method

The receiver operating characteristic (ROC) curve is widely utilized to evaluate model accuracy [55,64,65]. A high area under the ROC curve indicates better capability of the model. The ROC curve was constructed by plotting 1 − specificity (the false positive rate) on the X axis and sensitivity on the Y axis, using the following equations.
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
where TP is the true positive, TN is the true negative, FP is the false positive and FN is the false negative. True positives (TP) and true negatives (TN) are the positive and negative pixels correctly classified into the positive and negative classes, respectively, while false positives (FP) and false negatives (FN) are pixels that were incorrectly classified.
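The construction of the curve and its area can be sketched as below. Each distinct score is used as a threshold, sensitivity is paired with the false positive rate (1 − specificity, the conventional horizontal axis), and the area is obtained with the trapezoidal rule. The scores and labels are invented for illustration, and the helper names are assumptions.

```python
def roc_points(scores, labels):
    """Return (false positive rate, sensitivity) pairs for every score threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical predicted potential and reference labels (1 = GWP point).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(curve)
print(auc(curve))
```

The sketch assumes that both classes are present in the labels; with only one class, the rates are undefined.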
Groundwater table data obtained from the Central Groundwater Board (CGWB) was utilized to establish the relationship between groundwater potential map and dug well depth. The pixel values from the same location were obtained from the ensemble groundwater potential zones map. The values were then utilized in RStudio to analyze correlation between the pixels of groundwater potential and dug well depth locations.
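The correlation step amounts to computing Pearson's r between the map values and the well depths sampled at the same locations. The study performed this in RStudio; the Python sketch below is only an equivalent illustration, and the paired values are invented.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical pixel values of groundwater potential and dug well depths (m).
potential = [0.9, 0.7, 0.6, 0.4, 0.2]
well_depth = [3.0, 5.0, 6.0, 9.0, 12.0]
r = pearson_r(potential, well_depth)
print(r)
```

A value of r close to −1 indicates that deeper wells coincide with lower mapped potential.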

4. Results

Four machine learning algorithms were utilized for assessing GWP zones in the district. Since all the models yielded high accuracy, an ensemble model was constructed from them to delineate GWP zones more reliably. Figure 5a represents the groundwater potential map produced using the artificial neural network multi-layer perceptron. The mean absolute error (MAE) is the average absolute difference between the original and predicted values. The MAE value for the multi-layer perceptron was 0.0962, showing the high suitability of the ANN-MLP model (Table 2). Its root mean squared error (RMSE) between the original and predicted values (0.1104) was also the lowest among all algorithms (Table 2). The RF and SMOReg models had MAE values of 0.1309 and 0.223, respectively, reflecting the suitability of these models. However, the M5P model had a high MAE (0.347) and RMSE (0.3915), indicating lower suitability.
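For readers unfamiliar with the two error measures, the following sketch computes them for a handful of invented observed and predicted values; it does not reproduce the figures in Table 2.

```python
from math import sqrt

def mae(observed, predicted):
    """Mean absolute error: average of |observed - predicted|."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error: square root of the average squared difference."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

# Hypothetical observed labels and model outputs.
observed = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.7, 0.1]
print(mae(observed, predicted), rmse(observed, predicted))
```

RMSE is never smaller than MAE for the same data because squaring gives more weight to large errors, which is why both are usually reported together.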
In the ANN-MLP model, the largest area fell under the very good GWP category (24.81%), followed by good (23.73%), moderate (20.59%), poor (18.09%) and very poor (12.78%) (Table 3). Very good groundwater potential was found in the eastern and central-eastern parts of the district, while very poor groundwater potential was found in the northern and southern parts (Figure 5a). Figure 5b shows the groundwater potential map produced using random forest. The MAE value for random forest (0.1309) was higher than that of the ANN-MLP (Table 2). The RMSE between the original and predicted values was 0.1594. In the random forest map, the largest area was found in the moderate category (26.11%), followed by poor (24.59%), good (20.21%), very good (19.20%) and very poor (9.90%) (Table 3). Very good groundwater potential was found in the eastern part of the district, while very poor groundwater potential was found in the north-western and southern parts (Figure 5b). Figure 5c represents the groundwater potential map produced using the SMOReg model. The MAE value for SMOReg (0.223) was higher than those of the ANN-MLP and random forest (Table 2). The RMSE between the original and predicted values was 0.2831. The largest area for the SMOReg GWPZ was observed in the moderate category (25.68%), followed by good (23.70%), poor (19.93%), very good (19.58%) and very poor (11.12%) (Table 3). Figure 5d shows the groundwater potential map produced using the M5P model. The MAE value for M5P (0.347) was higher than those of the ANN-MLP, random forest and SMOReg (Table 2). The RMSE between the original and predicted values was 0.3915. The largest area for the M5P GWPZ was observed in the moderate category (28.52%), followed by good (25.11%), very good (20.69%), poor (16.55%) and very poor (9.13%) (Table 3). This map followed the spatial pattern of random forest, with very good groundwater potential in the eastern part and very poor groundwater potential in the north-western and southern parts of the district (Figure 5d). The groundwater potential maps were validated using the ROC curve (Figure 6). All the models except M5P achieved more than 90 per cent accuracy on the testing dataset (Table 4).
In the ensemble model, the largest area of the district was found under the moderate GWP category (24.5%), followed by poor (22.1%), good (21.7%), very good (18%) and very poor (13%) (Table 5). Moderate GWP zones are mainly found in the central part of the district (Figure 7a). These areas are characterized by high elevation, medium drainage density and water table depth, and medium to low lineament density. Poor GWPZs are spread over the northern, central and southern parts of the district. Structural origin lithology, chromic luvisols and barren land account for the poor groundwater potential in these zones. Good GWPZs were found in the central and central-eastern parts of the district (Figure 7a). Flat slopes ranging from 0° to 5° and low to medium elevation are the characteristics of these zones. Water table depth varies from very low to medium in this zone. High percolation and infiltration rates are determined by denudation and fluvial origin geomorphology. This zone is characterized by low and very low lineament density (Figure 3c). High rainfall and cambisol vertisols are the other characteristics of this zone. High rainfall, low water table depth and the presence of floodplain made for very good groundwater potential in this zone. Very low lineament density, denudation origin geomorphology and cambisol vertisols are largely associated with very good GWPZs (Figure 3). GWPZ has a direct relationship with water bodies and vegetation. However, it has an inverse relationship with barren land and agricultural land. The district’s southern and north-western parts experienced very poor groundwater potential (Figure 7a). High elevation with moderately steep slopes, low rainfall, very low drainage density, extensive barren land and water depth of up to 12 m are the main characteristics of very poor groundwater potential. High elevation and extensive barren land are the main causes of poor groundwater potential in these zones. 
However, chromic vertisols largely determined the groundwater potential in the south and low rainfall in the north-western parts of the study area.

Validation of Groundwater Potential Maps Using ROC Curve and Field Data

The ROC curve is a threshold-based method for evaluating classified outcomes [66,67]. The final groundwater potential map produced using the ensemble model was categorized into five classes based on equal intervals: very good, good, moderate, poor and very poor [68]. A high ROC value reflects the high effectiveness of the model. Accuracy is displayed in the ROC plot with a range between 0 and 1. Maximum accuracy is represented by 1, while 0 reflects no difference between groundwater potential and reference data. The artificial neural network multi-layer perceptron had the highest accuracy (0.94) based on the ROC curve (Figure 6). However, random forest (0.925), SMOReg (0.908) and M5P (0.893) also yielded good accuracy (Table 4). The ROC curve revealed a strong association (AUC = 0.938) between the GWPZs map and reference data in the ensemble model (Figure 7b). A field survey was also conducted to verify the level of groundwater in the district (Figure 8). A strong negative linear correlation (r = −0.777) was found between the GWPZs map and dug well depth (Figure 9): the higher the dug well depth, the lower the groundwater potential. Details of water table depth are provided in Table 6.
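Equal-interval classification of the final map can be sketched as follows. The sketch assumes, for illustration only, that the ensemble output is a continuous index between a known minimum and maximum and that low values correspond to poor potential; the article does not list the actual break values.

```python
LABELS = ["very poor", "poor", "moderate", "good", "very good"]

def equal_interval_class(value, vmin, vmax, labels=LABELS):
    """Assign a value to one of len(labels) classes of equal width between vmin and vmax."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    width = (vmax - vmin) / len(labels)
    index = int((value - vmin) / width)
    index = min(max(index, 0), len(labels) - 1)  # keep the maximum in the top class
    return labels[index]

# Hypothetical pixel values on a 0 to 1 index.
for pixel in [0.0, 0.1, 0.5, 0.95, 1.0]:
    print(pixel, equal_interval_class(pixel, 0.0, 1.0))
```

With five classes on a 0 to 1 index each class spans 0.2 units, so class areas depend entirely on how the pixel values are distributed rather than on equal pixel counts.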

5. Discussion

Groundwater is being depleted by anthropogenic activities, skewed development and the low availability of water for recharge in a changing climate. Hence, assessment of groundwater potential is imperative for effective planning and sustainable development. Information on groundwater is important for designing and constructing structures that improve groundwater recharge. Advances in geospatial techniques and soft computing methods have been widely used in groundwater prediction [69]. A single model can nonetheless predict various environmental issues and parameters at various scales with high accuracy. Our finding is in line with [70], who also found the multi-layer perceptron to be the most accurate model, with a correlation coefficient (CC) of 0.933, scatter index (SI) of 0.576, Willmott’s index (WI) of 0.961 and Nash Sutcliffe efficiency (NSE) of 0.867. The other models, RF, SMOReg and M5P, have also shown high accuracy. However, these models have some deficiencies. RF requires more trees to achieve high accuracy. Constructing a large number of trees makes the algorithm slow and ineffective for a large dataset. Selecting a suitable kernel in the SMOReg model is a challenging task and sometimes leads to false results. In the M5P model, enumerated attributes are transformed into binary variables before the construction of a tree, which generalizes the actual values. Therefore, a novel ensemble model was utilized to predict the final groundwater potential in the district. The hydrological settings of the Parbhani district show that groundwater occurs in unconfined aquifers with cambisol vertisols and fractured rocks (Figure 3d and Figure 4a). Generally, cambisol vertisols form in medium- and fine-textured materials derived from a variety of rocks. They are among the more productive soils and contribute to fertile agricultural land [71]. Groundwater availability is not spatially uniform. 
The floodplains of Godavari River have very good groundwater potential as depicted in Figure 7a. Low elevation topography promoted a high infiltration rate of water. Therefore, flat topography has very good GWPZs. Various studies on GWPZs have shown that groundwater potentiality increases with low topographic elevation and gentle slope owing to the longer residence time for water percolation [5,54,72].
High GWPZs are concentrated in areas where the distribution of rainfall is high (above 900). The misuse of existing groundwater for sugarcane cultivation is a serious problem in the Parbhani district. Land use cover has s major impact on groundwater potential. The major favorable factors for very good and good GWPZs were vegetation and surface water. Barren land topography created a barrier for infiltration of water leading to poor groundwater potential. Structural lithology also restricted the percolation of surface water. The district is characterized by hard rock aquifers. Hard rock yields an extensive and complex low-storage aquifer system. These aquifer systems are characterized by poor permeability which hampers the groundwater recharge through rainfall. In such system, water level tends to drop rapidly. Many wells and boreholes have already been deepened due to pumping out of water for irrigation (Figure 8a). As a result, a large number of wells have dried up (Figure 8d). Increasing population, unsustainable agricultural practices and inadequate water management strategies have depleted groundwater resources in the district.
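As a rough illustration of the ensemble idea discussed in this section, the sketch below combines per-pixel outputs from four models and bins the result into five equal-interval classes. It rests on stated assumptions: the text here does not specify how the four model outputs were combined, so an unweighted mean is assumed, and the names `ensemble_mean` and `equal_interval_class`, the score range of 0 to 1 and all sample scores are hypothetical.

```python
CLASSES = ["very poor", "poor", "moderate", "good", "very good"]


def ensemble_mean(*model_scores):
    """Average per-pixel scores across models (assumed combination rule)."""
    if not model_scores:
        raise ValueError("at least one model output is required")
    n = len(model_scores[0])
    if any(len(m) != n for m in model_scores):
        raise ValueError("all model outputs must cover the same pixels")
    return [sum(vals) / len(vals) for vals in zip(*model_scores)]


def equal_interval_class(value, lo, hi, classes=CLASSES):
    """Assign a score to one of len(classes) equal-width bins on [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    width = (hi - lo) / len(classes)
    idx = int((value - lo) / width)
    # Clamp so that value == hi falls in the top class.
    idx = min(max(idx, 0), len(classes) - 1)
    return classes[idx]


# Hypothetical scores for three pixels from the four models.
ann_mlp = [0.9, 0.5, 0.1]
rf = [0.9, 0.4, 0.2]
smoreg = [0.8, 0.6, 0.1]
m5p = [0.8, 0.5, 0.2]

ensemble = ensemble_mean(ann_mlp, rf, smoreg, m5p)
zones = [equal_interval_class(v, 0.0, 1.0) for v in ensemble]
```

Averaging is only one possible design choice; a weighted mean or a stacked meta-learner would fit the same interface by replacing `ensemble_mean`.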

6. Conclusions

The study assessed groundwater potential in the Parbhani district using an ensemble model. Ten site-specific factors, elevation, slope, aspect, drainage density, rainfall, land use land cover, water table depth, lineament density, geomorphology and soil types, were selected based on an extensive literature review. Four models, ANN-MLP, random forest, SMOReg and M5P, were used as an ensemble to generate the final groundwater potential map. The ensemble model proved highly suitable for assessing groundwater potential: its ROC curve revealed a strong association (AUC = 0.938) between the GWPZ map and reference data. Wide variations were observed in groundwater potential. The district’s eastern and central-eastern parts had very good GWP, while the northern and southern parts had very poor GWP. Thus, rainwater harvesting, artificial recharge of aquifers and integrated water resource management need to be implemented in the Parbhani district. A change in cropping pattern is also required to utilize groundwater judiciously. This study’s findings may help policy makers, local managers and hydrologists to better manage water resources in the study area. Researchers in other geographical regions who wish to identify groundwater potential zones with a high level of accuracy may find the methodology adopted in this study useful. However, the employed ensemble model can only be applied in regions with similar geo-environmental characteristics, and more care is needed in selecting site-specific parameters. The availability of input data at various spatio-temporal scales is a major limitation for groundwater potential mapping. Thus, future studies should carefully consider and adapt the input data to the specific conditions of the study area.

Author Contributions

Conceptualization, M.M.; Methodology, T.K.S. and M.M.; Software, T.K.S. and M.H.R.; Validation, P.C.; Formal analysis, M.M. and H.S.; Investigation, M.M.; Resources, P.K.; Writing – original draft, M.M. and H.S.; Writing – review and editing, H.S., O.S. and S.P.; Visualization, L.C.K.; Supervision, H.S. and P.K.; Funding acquisition, P.K. All authors have read and agreed to the published version of the manuscript.


Funding

No funding information.

Data Availability Statement

Data available in a publicly accessible repository.


Acknowledgments

The authors are thankful to the editor and anonymous reviewers for their constructive comments and insightful suggestions, which helped us to improve the overall quality of our work. Pankaj Kumar would also like to acknowledge the support from the Japan Science and Technology Agency (JST) as a part of the Abandonment and rebound: Societal views on the landscape- and land-use change and their impacts on water and soils (ABRESO) project under the Belmont Forum.

Conflicts of Interest

The authors declare no conflict of interest.


References

  1. Choudhari, P.P.; Nigam, G.K.; Singh, S.K.; Thakur, S. Morphometric based prioritization of watershed for groundwater potential of Mula river basin, Maharashtra, India. Geol. Ecol. Landsc. 2018, 2, 256–267.
  2. He, L.; Shao, F.; Ren, L. Sustainability appraisal of desired contaminated groundwater remediation strategies: An information-entropy-based stochastic multi-criteria preference model. Environ. Dev. Sustain. 2020, 23, 1759–1779.
  3. Macklin, M.G.; Lewin, J. The rivers of civilization. Quat. Sci. Rev. 2015, 114, 228–244.
  4. Sachdeva, S.; Kumar, B. A novel ensemble model of automatic multilayer perceptron, random forest, and ZeroR for groundwater potential mapping. Environ. Monit. Assess. 2021, 193, 722.
  5. Masroor, M.; Rehman, S.; Sajjad, H.; Rahaman, H.; Sahana, M.; Ahmed, R.; Singh, R. Assessing the impact of drought conditions on groundwater potential in Godavari Middle Sub-Basin, India using analytical hierarchy process and random forest machine learning algorithm. Groundw. Sustain. Dev. 2021, 13, 100554.
  6. Mourot, F.M.; Westerhoff, R.S.; White, P.A.; Cameron, S.G. Climate change and New Zealand’s groundwater resources: A methodology to support adaptation. J. Hydrol. Reg. Stud. 2022, 40, 101053.
  7. Velasco, E.M.; Gurdak, J.J.; Dickinson, J.E.; Ferré, T.; Corona, C.R. Interannual to multidecadal climate forcings on groundwater resources of the U.S. West Coast. J. Hydrol. Reg. Stud. 2017, 11, 250–265.
  8. Taylor, R.G.; Scanlon, B.; Döll, P.; Rodell, M.; Van Beek, R.; Wada, Y.; Longuevergne, L.; Leblanc, M.; Famiglietti, J.S.; Edmunds, M.; et al. Ground water and climate change. Nat. Clim. Chang. 2013, 3, 322–329.
  9. Bozorg-Haddad, O.; Zolghadr-Asli, B.; Sarzaeim, P.; Aboutalebi, M.; Chu, X.; Loáiciga, H.A. Evaluation of water shortage crisis in the Middle East and possible remedies. J. Water Supply Res. Technol. 2019, 69, 85–98.
  10. Feng, P.; Wang, B.; Liu, D.L.; Ji, F.; Niu, X.; Ruan, H.; Shi, L.; Yu, Q. Machine learning-based integration of large-scale climate drivers can improve the forecast of seasonal rainfall probability in Australia. Environ. Res. Lett. 2020, 15, 084051.
  11. Khosravi, K.; Khozani, Z.S.; Cooper, J.R. Predicting stable gravel-bed river hydraulic geometry: A test of novel, advanced, hybrid data mining algorithms. Environ. Model. Softw. 2021, 144, 105165.
  12. Ganapuram, S.; Kumar, G.V.; Krishna, I.M.; Kahya, E.; Demirel, M.C. Mapping of groundwater potential zones in the Musi basin using remote sensing data and GIS. Adv. Eng. Softw. 2009, 40, 506–518.
  13. Arulbalaji, P.; Padmalal, D.; Sreelash, K. GIS and AHP Techniques Based Delineation of Groundwater Potential Zones: A case study from Southern Western Ghats, India. Sci. Rep. 2019, 9, 2082.
  14. Rahaman, H.; Sajjad, H.; Roshani; Masroor, M.; Bhuyan, N.; Rehman, S. Delineating groundwater potential zones using geospatial techniques and fuzzy analytical hierarchy process (FAHP) ensemble in the data-scarce region: Evidence from the lower Thoubal river watershed of Manipur, India. Arab. J. Geosci. 2022, 15, 677.
  15. Rajasekhar, M.; Raju, G.S.; Sreenivasulu, Y.; Raju, R.S. Delineation of groundwater potential zones in semi-arid region of Jilledubanderu river basin, Anantapur District, Andhra Pradesh, India using fuzzy logic, AHP and integrated fuzzy-AHP approaches. Hydroresearch 2019, 2, 97–108.
  16. Elvis, B.W.W.; Arsène, M.; Théophile, N.M.; Bruno, K.M.E.; Olivier, O.A. Integration of shannon entropy (SE), frequency ratio (FR) and analytical hierarchy process (AHP) in GIS for suitable groundwater potential zones targeting in the Yoyo river basin, Méiganga area, Adamawa Cameroon. J. Hydrol. Reg. Stud. 2022, 39, 100997.
  17. Pourghasemi, H.R.; Beheshtirad, M. Assessment of a data-driven evidential belief function model and GIS for groundwater potential mapping in the Koohrang Watershed, Iran. Geocarto Int. 2014, 30, 662–685.
  18. Golkarian, A.; Rahmati, O. Use of a maximum entropy model to identify the key factors that influence groundwater availability on the Gonabad Plain, Iran. Environ. Earth Sci. 2018, 77, 369.
  19. Tahmassebipoor, N.; Rahmati, O.; Noormohamadi, F.; Lee, S. Spatial analysis of groundwater potential using weights-of-evidence and evidential belief function models and remote sensing. Arab. J. Geosci. 2015, 9, 79.
  20. Park, S.; Hamm, S.-Y.; Jeon, H.-T.; Kim, J. Evaluation of Logistic Regression and Multivariate Adaptive Regression Spline Models for Groundwater Potential Mapping Using R and GIS. Sustainability 2017, 9, 1157.
  21. Bloomfield, J.; Lewis, M.; Newell, A.; Loveless, S.; Stuart, M. Characterising variations in the salinity of deep groundwater systems: A case study from Great Britain (GB). J. Hydrol. Reg. Stud. 2020, 28, 100684.
  22. Hakim, W.L.; Nur, A.S.; Rezaie, F.; Panahi, M.; Lee, C.-W.; Lee, S. Convolutional neural network and long short-term memory algorithms for groundwater potential mapping in Anseong, South Korea. J. Hydrol. Reg. Stud. 2022, 39, 100990.
  23. Arabameri, A.; Pal, S.C.; Rezaie, F.; Nalivan, O.A.; Chowdhuri, I.; Saha, A.; Lee, S.; Moayedi, H. Modeling groundwater potential using novel GIS-based machine-learning ensemble techniques. J. Hydrol. Reg. Stud. 2021, 36, 100848.
  24. Pal, S.; Sarda, R. Modelling water richness in riparian flood plain wetland using bivariate statistics and machine learning algorithms and figuring out the role of damming. Geocarto Int. 2022, 37, 5585–5608.
  25. Dai, F.; Lee, C.; Zhang, X. GIS-based geo-environmental evaluation for urban land-use planning: A case study. Eng. Geol. 2001, 61, 257–271.
  26. Chen, W.; Li, H.; Hou, E.; Wang, S.; Wang, G.; Panahi, M.; Li, T.; Peng, T.; Guo, C.; Niu, C.; et al. GIS-based groundwater potential analysis using novel ensemble weights-of-evidence with logistic regression and functional tree models. Sci. Total Environ. 2018, 634, 853–867.
  27. Kordestani, M.D.; Naghibi, S.A.; Hashemi, H.; Ahmadi, K.; Kalantar, B.; Pradhan, B. Groundwater potential mapping using a novel data-mining ensemble model. Hydrogeol. J. 2018, 27, 211–224.
  28. Nguyen, P.T.; Ha, D.H.; Jaafari, A.; Nguyen, H.D.; Van Phong, T.; Al-Ansari, N.; Prakash, I.; Van Le, H.; Pham, B.T. Groundwater Potential Mapping Combining Artificial Neural Network and Real AdaBoost Ensemble Technique: The DakNong Province Case-study, Vietnam. Int. J. Environ. Res. Public Health 2020, 17, 2473.
  29. Pham, B.T.; Jaafari, A.; Prakash, I.; Singh, S.K.; Quoc, N.K.; Bui, D.T. Hybrid computational intelligence models for groundwater potential mapping. Catena 2019, 182, 104101.
  30. Naghibi, S.A.; Ahmadi, K.; Daneshi, A. Application of Support Vector Machine, Random Forest, and Genetic Algorithm Optimized Random Forest Models in Groundwater Potential Mapping. Water Resour. Manag. 2017, 31, 2761–2775.
  31. Choubin, B.; Rahmati, O.; Soleimani, F.; Alilou, H.; Moradi, E.; Alamdari, N. Regional Groundwater Potential Analysis Using Classification and Regression Trees. In Spatial Modeling in GIS and R for Earth and Environmental Sciences; Elsevier: Amsterdam, The Netherlands, 2019; pp. 485–498.
  32. Sun, X.; Zhou, Y.; Yuan, L.; Li, X.; Shao, H.; Lu, X. Integrated decision-making model for groundwater potential evaluation in mining areas using the cusp catastrophe model and principal component analysis. J. Hydrol. Reg. Stud. 2021, 37, 100891.
  33. Panahi, M.; Sadhasivam, N.; Pourghasemi, H.R.; Rezaie, F.; Lee, S. Spatial prediction of groundwater potential mapping based on convolutional neural network (CNN) and support vector regression (SVR). J. Hydrol. 2020, 588, 125033.
  34. Al-Fugara, A.; Ahmadlou, M.; Shatnawi, R.; AlAyyash, S.; Al-Adamat, R.; Al-Shabeeb, A.A.-R.; Soni, S. Novel hybrid models combining meta-heuristic algorithms with support vector regression (SVR) for groundwater potential mapping. Geocarto Int. 2020, 37, 2627–2646.
  35. Zabihi, M.; Pourghasemi, H.R.; Pourtaghi, Z.S.; Behzadfar, M. GIS-based multivariate adaptive regression spline and random forest models for groundwater potential mapping in Iran. Environ. Earth Sci. 2016, 75, 1097.
  36. Moghaddam, D.D.; Rahmati, O.; Panahi, M.; Tiefenbacher, J.; Darabi, H.; Haghizadeh, A.; Haghighi, A.T.; Nalivan, O.A.; Tien Bui, D. The effect of sample size on different machine learning models for groundwater potential mapping in mountain bedrock aquifers. Catena 2020, 187, 104421.
  37. Adeyeye, O.; Ikpokonte, E.; Arabi, S. GIS-based groundwater potential mapping within Dengi area, North Central Nigeria. Egypt. J. Remote. Sens. Space Sci. 2018, 22, 175–181.
  38. Arabameri, A.; Arora, A.; Pal, S.C.; Mitra, S.; Saha, A.; Nalivan, O.A.; Panahi, S.; Moayedi, H. K-Fold and State-of-the-Art Metaheuristic Machine Learning Approaches for Groundwater Potential Modelling. Water Resour. Manag. 2021, 35, 1837–1869.
  39. Chen, W.; Zhao, X.; Tsangaratos, P.; Shahabi, H.; Ilia, I.; Xue, W.; Wang, X.; Bin Ahmad, B. Evaluating the usage of tree-based ensemble methods in groundwater spring potential mapping. J. Hydrol. 2020, 583, 124602.
  40. Chandramouli, C. Census of India: Primary Census Abstract; Office of the Registrar General & Census Commissioner, India: New Delhi, India, 2013.
  41. Masroor, M.; Rehman, S.; Avtar, R.; Sahana, M.; Ahmed, R.; Sajjad, H. Exploring climate variability and its impact on drought occurrence: Evidence from Godavari Middle sub-basin, India. Weather Clim. Extrem. 2020, 30, 100277.
  42. Tarate, S.B.; Kumar, P. Characterization and trend detection of meteorological drought for a semi-arid area of Parbhani district of Indian state of Maharashtra. Mausam 2021, 72, 583–596.
  43. Dakhore, K.K.; Rathod, A.S.; Kadam, D.R.; Shinde, G.U.; Kadam, Y.E.; Ghosh, K. Prediction of Kharif cotton yield over Parbhani, Maharashtra: Combination of extended range forecast and DSSAT-CROPGRO-Cotton model. Mausam 2021, 72, 635–644.
  44. Dakhore, K.K.; Kadam, Y.E.; Vijaya, K.P. Study the rainfall variability and impact of El Nino episode on rainfall and crop productivity at Parbhani. Mausam 2020, 71, 285–290.
  45. Mukherjee, I.; Singh, U.K. Delineation of groundwater potential zones in a drought-prone semi-arid region of east India using GIS and analytical hierarchical process techniques. Catena 2020, 194, 104681.
  46. Ahmed, R.; Sajjad, H.; Husain, I. Morphometric Parameters-Based Prioritization of Sub-watersheds Using Fuzzy Analytical Hierarchy Process: A Case Study of Lower Barpani Watershed, India. Nat. Resour. Res. 2017, 27, 67–75.
  47. Shao, Z.; Huq, E.; Cai, B.; Altan, O.; Li, Y. Integrated remote sensing and GIS approach using Fuzzy-AHP to delineate and identify groundwater potential zones in semi-arid Shanxi Province, China. Environ. Model. Softw. 2020, 134, 104868.
  48. Doke, A.B.; Zolekar, R.B.; Patel, H.; Das, S. Geospatial mapping of groundwater potential zones using multi-criteria decision-making AHP approach in a hardrock basaltic terrain in India. Ecol. Indic. 2021, 127, 107685.
  49. Akhtar, N.; Syakir, M.I.; Anees, M.T.; Qadir, A.; Yusuff, M.S. Characteristics and Assessment of Groundwater. In Groundwater Management and Resources; IntechOpen: London, UK, 2021.
  50. Chitsazan, M.; Akhtari, Y. A GIS-based DRASTIC Model for Assessing Aquifer Vulnerability in Kherran Plain, Khuzestan, Iran. Water Resour. Manag. 2008, 23, 1137–1155.
  51. Madrucci, V.; Taioli, F.; de Araújo, C.C. Groundwater favorability map using GIS multicriteria data analysis on crystalline terrain, São Paulo State, Brazil. J. Hydrol. 2008, 357, 153–173.
  52. Pradhan, R.M.; Guru, B.; Pradhan, B.; Biswal, T.K. Integrated multi-criteria analysis for groundwater potential mapping in Precambrian hard rock terranes (North Gujarat), India. Hydrol. Sci. J. 2021, 66, 961–978.
  53. Alqadhi, S.; Mallick, J.; Talukdar, S.; Bindajam, A.A.; Saha, T.K.; Ahmed, M.; Khan, R.A. Combining logistic regression-based hybrid optimized machine learning algorithms with sensitivity analysis to achieve robust landslide susceptibility mapping. Geocarto Int. 2022, 1–26.
  54. Lee, J.; Kim, C.-G.; Lee, J.E.; Kim, N.W.; Kim, H. Application of Artificial Neural Networks to Rainfall Forecasting in the Geum River Basin, Korea. Water 2018, 10, 1448.
  55. Razavi-Termeh, S.V.; Sadeghi-Niaraki, A.; Choi, S.-M. Groundwater Potential Mapping Using an Integrated Ensemble of Three Bivariate Statistical Models with Random Forest and Logistic Model Tree Models. Water 2019, 11, 1596.
  56. Masroor, M.; Sajjad, H.; Rehman, S.; Singh, R.; Rahaman, H.; Sahana, M.; Ahmed, R.; Avtar, R. Analysing the relationship between drought and soil erosion using vegetation health index and RUSLE models in Godavari middle sub-basin, India. Geosci. Front. 2021, 13, 101312.
  57. Alhatali, A.; Soosaimanickam, A. A Comparative Study of the Efficient Data Mining Algorithm to Find the Most Influenced Factor on Price Variation in Oman Fish Markets. Sak. Univ. J. Comput. Inf. Sci. 2018, 1, 1–16.
  58. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
  59. Zhang, L.; Traore, S.; Ge, J.; Li, Y.; Wang, S.; Zhu, G.; Cui, Y.; Fipps, G. Using boosted tree regression and artificial neural networks to forecast upland rice yield under climate change in Sahel. Comput. Electron. Agric. 2019, 166, 105031.
  60. Platt, J. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines (Issue MSR-TR-98-14). 1998. Available online: (accessed on 16 January 2023).
  61. Myers, S.A.; Leskovec, J. The bursty dynamics of the Twitter information network. In Proceedings of the 23rd International Conference on World Wide Web (WWW ’14), Seoul, Republic of Korea, 7–11 April 2014; pp. 913–924.
  62. Abdelkarim, A.; Al-Alola, S.S.; Alogayell, H.M.; Mohamed, S.A.; Alkadi, I.I.; Youssef, I.Y. Mapping of GIS-Flood Hazard Using the Geomorphometric-Hazard Model: Case Study of the Al-Shamal Train Pathway in the City of Qurayyat, Kingdom of Saudi Arabia. Geosciences 2020, 10, 333.
  63. Granata, F.; Saroli, M.; De Marinis, G.; Gargano, R. Machine Learning Models for Spring Discharge Forecasting. Geofluids 2018, 2018, 1–13.
  64. Masroor, M.; Razavi-Termeh, S.V.; Rahaman, H.; Choudhari, P.; Kulimushi, L.C.; Sajjad, H. Adaptive neuro fuzzy inference system (ANFIS) machine learning algorithm for assessing environmental and socio-economic vulnerability to drought: A study in Godavari middle sub-basin, India. Stoch. Environ. Res. Risk Assess. 2022, 37, 233–259.
  65. Kulimushi, L.C.; Bashagaluke, J.B.; Prasad, P.; Heri-Kazi, A.B.; Kushwaha, N.L.; Masroor, M.; Mohammed, S. Soil erosion susceptibility mapping using ensemble machine learning models: A case study of upper Congo river sub-basin. Catena 2023, 222, 106858.
  66. Das, J.; Mandal, T.; Rahman, A.T.M.S.; Saha, P. Spatio-temporal characterization of rainfall in Bangladesh: An innovative trend and discrete wavelet transformation approaches. Theor. Appl. Clim. 2021, 143, 1557–1579.
  67. Saha, T.K.; Pal, S. Exploring physical wetland vulnerability of Atreyee river basin in India and Bangladesh using logistic regression and fuzzy logic approaches. Ecol. Indic. 2018, 98, 251–265.
  68. Djurovic, N.; Domazet, M.; Stričević, R.; Pocuca, V.; Spalevic, V.; Pivic, R.; Gregoric, E.; Domazet, U. Comparison of Groundwater Level Models Based on Artificial Neural Networks and ANFIS. Sci. World J. 2015, 2015, 1–13.
  69. Cai, H.; Shi, H.; Liu, S.; Babovic, V. Impacts of regional characteristics on improving the accuracy of groundwater level prediction using machine learning: The case of central eastern continental United States. J. Hydrol. Reg. Stud. 2021, 37, 100930.
  70. Shadkani, S.; Abbaspour, A.; Samadianfard, S.; Hashemi, S.; Mosavi, A.; Band, S.S. Comparative study of multilayer perceptron-stochastic gradient descent and gradient boosted trees for predicting daily suspended sediment load: The case study of the Mississippi River, U.S. Int. J. Sediment Res. 2020, 36, 512–523.
  71. Allam, A.; Moussa, R.; Najem, W.; Bocquillon, C. Hydrological cycle, Mediterranean basins hydrology. In Water Resources in the Mediterranean Region; Elsevier: Amsterdam, The Netherlands, 2020; pp. 1–21.
  72. Al-Djazouli, M.O.; Elmorabiti, K.; Rahimi, A.; Amellah, O.; Fadil, O.A.M. Delineating of groundwater potential zones based on remote sensing, GIS and analytical hierarchical process: A case of Waddai, eastern Chad. GeoJournal 2020, 86, 1881–1894.
Figure 1. The study area: (a) its location in India, (b) Parbhani district.
Figure 2. Factors affecting groundwater potential: (a) elevation, (b) aspect, (c) slope, (d) drainage density.
Figure 3. Factors affecting groundwater potential: (a) rainfall, (b) land use land cover, (c) lineament density, (d) lithology.
Figure 4. Factors affecting groundwater potential: (a) soil types, (b) water table.
Figure 5. Groundwater potential zones: (a) ANN-MLP (b) RF (c) SMOReg (d) M5P.
Figure 6. ROC for GWPM models.
Figure 7. (a) groundwater potential zones using ensemble model, (b) ROC curve of ensemble model.
Figure 8. Visuals from the field survey: (a,b) Pumping of groundwater (c) Perennial surface water body (d) Dried well.
Figure 9. Scatter plot showing the relationship between dug well depth and groundwater potential.
Table 1. Description of dataset used for groundwater potential zonation mapping.
S. No | Factors | Data Type | Source of Data | Data Details | Time Period
1 | Aspect | Raster layer | SRTM DEM, version 3.0 (USGS) | 30 m spatial resolution | 2018
2 | Elevation | Raster layer | SRTM DEM, version 3.0 (USGS) | 30 m spatial resolution | 2018
3 | Slope | Raster layer | SRTM DEM, version 3.0 (USGS) | 30 m spatial resolution | 2018
4 | Lineament density | Vector data | Bhuvan thematic services of National Remote Sensing Centre | 1:50,000 | 2011
5 | Drainage density | Raster layer | SRTM DEM, version 3.0 (USGS) | 30 m spatial resolution | 2018
6 | Lithology | Vector data | Bhuvan thematic services of National Remote Sensing Centre | 1:50,000 | 2005–2006
7 | Land use land cover | Raster layer | LANDSAT 8 OLI | 30 m spatial resolution | December 2018
8 | Rainfall | Vector data | India Meteorological Department (IMD) | Annual average rainfall in mm | 2018
9 | Water table | Vector data | Central Groundwater Board (CGWB) | Water table depth in meters | 2018
10 | Soil type | Vector data | Food and Agriculture Organization | 1:5,000,000 | 2017
Table 2. Calculated statistical errors for various applied models.
Performance Indicators | Applied Models
Mean absolute error (MAE) | 0.0962 | 0.1309 | 0.223 | 0.347
Root mean squared error (RMSE) | 0.1104 | 0.1594 | 0.2831 | 0.3915
Relative absolute error (RAE) | 19.83% | 26.18% | 29.33% | 30.08%
Root relative squared error (RRSE) | 28.07% | 31.07% | 33.43% | 37.68%
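For readers unfamiliar with the four indicators in Table 2, the sketch below computes them under the commonly used definitions, in which RAE and RRSE normalize the model error by the error of always predicting the mean of the observed values. Whether the authors' software used exactly these definitions is an assumption; the function name `error_metrics` and the sample values are hypothetical.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE (%) and RRSE (%) for paired observations."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Baseline errors of a predictor that always outputs the observed mean.
    base_abs = sum(abs(a - mean_a) for a in actual)
    base_sq = sum((a - mean_a) ** 2 for a in actual)
    if base_abs == 0:
        raise ValueError("actual values must not be constant")
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": 100.0 * abs_err / base_abs,
        "RRSE": 100.0 * math.sqrt(sq_err / base_sq),
    }


# Hypothetical observed labels and model outputs.
metrics = error_metrics([0, 1, 0, 1], [0.1, 0.9, 0.2, 0.8])
```

Under these definitions, RAE and RRSE values below 100% indicate that a model performs better than the mean-only baseline.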
Table 3. Area under different groundwater potential zones derived by different methods.
Potential Zone | Pixels | % of Area | Pixels | % of Area | Pixels | % of Area | Pixels | % of Area
Very good | 1,813,819 | 24.81 | 1,403,665 | 19.20 | 1,431,676 | 19.58 | 1,513,213 | 20.69
Very poor | 934,827 | 12.78 | 723,748 | 9.90 | 812,790 | 11.12 | 667,689 | 9.13
Table 4. AUC values under ROC curves for the training and testing datasets.
ML Models | Area (AUC), Training Datasets | Area (AUC), Testing Datasets
Table 5. Percentage area estimated in various categories of GWP zones in ensemble model.
Groundwater Potential Zones | Pixels | % of Area
Very good | 1,323,194 | 18.10
Very poor | 989,978 | 13.54
Table 6. Location and depth of dug well in the study area.
S.No. | Block Name | Lat | Lon | Site Type | Depth of Dug Well (m)
1 | Manwat | 19.3438 | 76.5616 | Dug well | 15
2 | Parbhani | 19.4263 | 76.7658 | Dug well | 14.8
3 | Palam | 18.9833 | 76.8666 | Dug well | 13.2
4 | Jintur | 19.4855 | 76.7225 | Dug well | 12.1
5 | Pathri | 19.30 | 76.30 | Dug well | 12
6 | Palam | 19.0194 | 76.8347 | Dug well | 10.32
7 | Jintur | 19.5447 | 76.9588 | Dug well | 9.1
8 | Gangakhed | 18.9655 | 76.7311 | Dug well | 9
9 | Parbhani | 19.2661 | 76.8225 | Dug well | 8.32
10 | Parbhani | 19.3875 | 76.6875 | Dug well | 8
11 | Jintur | 19.6222 | 76.5391 | Dug well | 0
12 | Jintur | 19.718 | 76.6572 | Dug well | 0
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Masroor, M.; Sajjad, H.; Kumar, P.; Saha, T.K.; Rahaman, M.H.; Choudhari, P.; Kulimushi, L.C.; Pal, S.; Saito, O. Novel Ensemble Machine Learning Modeling Approach for Groundwater Potential Mapping in Parbhani District of Maharashtra, India. Water 2023, 15, 419.
