Machine Learning Models for Groundwater Level Prediction and Uncertainty Analysis in Ruataniwha Basin, New Zealand

Kanito, Dawit; Benaafi, Mohammed; Baalousha, Husam Musa

doi:10.3390/hydrology12110282


by Dawit Kanito ¹, Mohammed Benaafi ¹,² and Husam Musa Baalousha ¹,*

¹ Department of Geosciences, College of Petroleum Engineering and Geosciences, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran 31261, Saudi Arabia

² Interdisciplinary Research Center for Membranes and Water Security, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia

* Author to whom correspondence should be addressed.

Hydrology 2025, 12(11), 282; https://doi.org/10.3390/hydrology12110282

Submission received: 3 October 2025 / Revised: 27 October 2025 / Accepted: 28 October 2025 / Published: 29 October 2025

(This article belongs to the Special Issue Water Resources Management Under Uncertainty and Climate Change (Second Edition))


Abstract

Groundwater level predictive monitoring is necessary to address accelerated aquifer depletion and ensure sustainable management under increasing climatic and anthropogenic pressures. This study uses machine learning approaches to model groundwater level (GWL) dynamics in six observation wells in the Ruataniwha Basin, New Zealand. These models are enhanced with seasonal decomposition techniques. This study uses both static properties and dynamic variables to capture hydrogeological heterogeneity. Random Forest (RF) and Support Vector Machine (SVM), with seasonal decomposition preprocessing, were developed for GWL modelling. The models were trained on 80% of the dataset and tested using the remaining 20% of the data. Model accuracy was assessed using five statistical metrics: mean absolute error (MAE), root mean square error (RMSE), the coefficient of determination (R²), mean absolute percent error (MAPE), and percent bias (PBIAS). Model uncertainty was analyzed using Bayesian Model Averaging combined with the p-factor and d-factor at the 95% confidence level. The results demonstrate that both models delivered strong predictive performance across training, testing, and full period evaluations. However, the RF model demonstrated a marginally superior predictive accuracy by achieving lower errors (MAE: 0.013–0.174; RMSE: 0.04–0.283), better bias control (PBIAS ≈ 0%), and slightly tighter error bounds in most wells. Uncertainty quantification revealed that models provided a minimum p-factor of 0.878, capturing more than 87% of the observed GWL data within the uncertainty bounds. Comparing the results of both models, the RF model has higher p-factor values ranging from 0.878 to 0.976 with precise interval widths (d-factor: 0.436–0.769), indicating its reliability for adaptive groundwater management.

Keywords: support vector machine (SVM); random forest (RF); groundwater level; Ruataniwha basin; seasonal decomposition; uncertainty analysis

1. Introduction

Groundwater is a lifeline resource that significantly affects various aspects of human life such as domestic, agricultural, and industrial activities [1,2,3]. Its importance extends beyond human needs, sustaining ecological balance and buffering against climate change extremes [4]. Global groundwater resources are the largest reservoir, estimated to exceed 50 million km³. Of this, only about 4 million km³ is classified as accessible freshwater and provides almost 43 percent of irrigation water and 50 percent of drinking water [1,5]. This excludes confined deep aquifers with negligible recharge rates, saline aquifers, and the water retained in unsaturated zones [1]. In recent years, groundwater has been facing unprecedented stresses from both anthropogenic and natural factors such as a rapidly increasing population and the associated water demand, urbanization, and climate change, all of which have led to rapid groundwater depletion [6,7,8,9]. Studies on GWL have identified several cases of significant decline in groundwater levels (>50 cm year⁻¹) throughout the 21st century, especially in arid regions characterized by extensive agricultural practices. Over the past four decades, groundwater levels have declined at an increasing rate in 30 percent of the world’s regional aquifers [10]. Furthermore, the over-exploitation of groundwater can also lead to land subsidence. Groundwater depletion-caused subsidence is a gradual and persistent geomechanical process that permanently diminishes aquifer system storage. It also increases flood risk and susceptibility by endangering around 75 percent of major coastal cities worldwide [11]. Recent groundwater abstraction practices, as evidenced by accelerating subsidence, correlate with the elevated arsenic concentrations in aquifer systems across the United States and South Asia [12,13]. Salinization, water shortages, and groundwater quality deterioration are all consequences of declining GWLs.

Groundwater levels indicate the availability and accessibility of groundwater resources and provide important information about long-term aquifer dynamics [14]. However, forecasting the GWL is a major challenge, primarily because of the complex and non-linear relationships among influencing factors, including intrinsic aquifer characteristics, terrain attributes, vegetation, climate, and anthropogenic activities [9,15,16]. Additionally, uncertainty in model structure and input data adds difficulty to GWL prediction [17,18]. Globally, various methods have been utilized to analyze GWL dynamics. They are traditionally grouped into three broad categories: experimental, physics-based, and numerical models [19]. Despite their reliable procedures and promising outcomes, conventional methods have notable limitations. They struggle with complex data, heterogeneous environments, and dynamic aquifer systems such as the Ruataniwha Basin [20,21], and they require large amounts of data and high computational power. Recently, advances in statistics and artificial intelligence (AI) have substantially improved groundwater level predictions [22]. To overcome the aforementioned limitations, it is essential to develop techniques, such as machine learning (ML), that achieve high accuracy with lower data and computational time requirements. ML models have improved the planning and management of groundwater resources [23].

Machine learning (ML) algorithms have gained popularity in hydrology and have delivered satisfactory results. As a branch of AI, ML learns intricate relationships between inputs and outputs for forecasting spatiotemporal data. It improves the handling of complex data and requires less time and computational cost. In groundwater studies, ML algorithms have been widely employed in GWL prediction, groundwater salinity distribution, groundwater potential mapping, uncertainty quantification, and the determination of aquifer properties [24,25,26,27]. Recent studies have used different ML models to predict groundwater level, such as Random Forest (RF), Support Vector Machines (SVMs), artificial neural networks (ANNs), and Adaptive Neuro-Fuzzy Inference Systems (ANFISs) [16,28,29,30,31,32]. Liu et al. [31] developed an SVM model to assess groundwater level fluctuation in the Northeast United States. Their results indicated a reasonable performance of SVM in handling complex patterns. ANNs have also been employed in GWL prediction [23,33]. However, ANN models are prone to overfitting and local minima problems, weaknesses that the SVM model can handle. Comparative research has been conducted using ANN and SVM for predicting groundwater level. For instance, BinMakhashen and Benaafi [34] explored these models in predicting seawater intrusion in coastal aquifers. In addition, Bubakran et al. [23] assessed the performance of an ANN and SVM in predicting groundwater level. They found that, because of its effective structure, the SVM model outperformed the ANN in terms of robustness and accuracy. SVM models have many advantages: efficiency in handling high-dimensional datasets, excellent generalization performance, and resistance to overfitting. While SVM outperforms ANN in explaining GWL dynamics, SVM is time-intensive due to the trial-and-error nature of its tuning [35]. SVM is also sensitive to outliers in the data. In contrast, the bagging (bootstrap aggregating) used in RF makes it resistant to outliers. Comparative studies on RF and SVM have been carried out in GWL forecasting. For instance, research conducted in Indonesia reported RF’s effectiveness in predicting GWL dynamics [30]. Rodriguez-Galiano et al. [36] also conducted a comparative study and concluded that the RF model showed stronger predictive capabilities and higher robustness with a shorter training time than the SVM.

Uncertainty quantification (UQ) techniques play a crucial role in reducing the risk of adverse outcomes in decision-making and resource optimization [37]. UQ has been used to address several practical issues in engineering and research. Many domains, such as weather forecasting and hydrology, routinely deal with uncertainty and aim to make decisions based on gathered observations and knowledge of the uncertain domain. ML models have become increasingly popular and efficient in addressing uncertainty. Because models developed using ML are broadly employed in policy making, evaluating their effectiveness and reliability is important [38]. Quantified uncertainty reduces the risks of decision-making in sustainable water resource management [39,40]. Monte Carlo simulation, Bayesian Model Averaging (BMA), numerical estimation, and probability distributions are widely implemented UQ methods. Though no single methodology is universally applicable for combining uncertainty, BMA offers a unified framework. It combines competing models into one predictive distribution by weighting them according to empirical support. This systematic pooling typically yields more reliable uncertainty estimates than individual models and has been especially useful in groundwater applications [41].

The Ruataniwha Basin is one of the main sources of groundwater in the Hawke’s Bay Region, New Zealand. It is characterized by heterogeneous and complex hydrogeological conditions, seasonal rainfall extremes, and competing water demands [4]. Predominantly used for agriculture, the catchment supports extensive irrigation schemes. These combined factors make this catchment a representative natural laboratory for investigating groundwater level dynamics and the uncertainties inherent in water resource management. Moreover, the basin has significant potential for further irrigation, so the demand for water is expected to rise in the coming years [42]. The catchment has been extensively monitored by the Hawke’s Bay Regional Council, with a network of piezometers providing decadal groundwater level, quality, and abstraction data. Previous studies in the area have focused on conventional hydrogeological modelling, but gaps remain in integrating machine learning for GWL prediction and uncertainty quantification, a key motivation for this work. Thus, this study aims to develop and evaluate effective machine learning models for high-accuracy GWL prediction, addressing critical gaps in forecasting reliability under heterogeneous hydrogeological conditions. To the best of the authors’ knowledge, this is the first study to combine seasonal decomposition with RF and SVM for GWL forecasting in this specific hydrogeological setting, integrating both static and dynamic inputs. In addition, this study aims to build an uncertainty-aware framework that uses probabilistic metrics to quantify prediction confidence and provide actionable insights for adaptive water management in the study basin.

2. Materials and Methods

2.1. The Study Area

The Ruataniwha Basin is located within the Tukituki River catchment in southern Hawke’s Bay, New Zealand. It lies between latitudes 39°37′53.18″ and 40°7′49.23″ S and longitudes 176°2′33.29″ and 176°38′19.7″ E and encompasses an area of approximately 800 km² (Figure 1). The catchment is bounded by the foothills of the Ruahine Range in the west, the Argyll Range in the north, the Manawatu Region in the south, and the Turiri Range and Raukawa Range in the east. The basin plays a vital role in supporting the agricultural, ecological, industrial, and community water needs in the Hawke’s Bay Region [42]. Agricultural land use is dominant throughout the catchment, with observed trends indicating an increasing intensification of farming practices. Consequently, this has led to significant groundwater abstraction, compounded by over-extraction and climatic variability. Addressing these pressures necessitates the development of robust predictive models for effective and sustainable water resource management.

2.2. Geology and Hydrogeology

The study area’s geology is dominated by Quaternary alluvial deposits interbedded with clay layers, which are the result of the prolonged erosion of the uplifted ranges that have contributed to a multi-layered groundwater system [4,43]. The sedimentary fill of the Ruataniwha Basin is relatively young, with deposits primarily from the Pleistocene and Holocene epochs ranging from 2 to 1 million years old [44]. These deposits are predominantly alluvial, comprising river gravels, sands, graywacke, and finer sediments that have accumulated in the basin’s shallow depressions (Figure 2). The alluvial deposits, with varying hydraulic properties, form aquifers that store and transmit groundwater, making the basin a vital water resource for the region. Recent SkyTEM-derived 3D models indicate that the cumulative aquifer thickness varies from 2 m to 400 m [45]. The basin is heterogeneous, with two main gravel layers: the Salisbury gravel, a more consolidated gravel deposit from the Pleistocene, and the young gravel formation from the late Pleistocene, which is unconsolidated [46].

The basin is classified as a closed hydrogeological system, with no lateral groundwater flow, relying entirely on surface water inflows from the two main rivers in the basin, the Tukituki and the Waipawa, in addition to rainfall recharge [47]. There is a strong hydraulic connection between surface water and groundwater within the basin [4]. As such, the flow pattern in the basin’s rivers and streams is controlled by the dynamic exchange processes between aquifers and surface water. Based on the local hydrological conditions, the water either enters or leaves the stream network [42]. Two main rivers traverse the basin from west to east, namely the Tukituki and the Waipawa, in addition to many other smaller streams. All rivers and streams merge to form the two main rivers as they exit the basin, which is the only water outflow mechanism out of the basin. However, the river–aquifer interaction within the basin is complex as there is a strong connection between the two water bodies. Groundwater abstraction within the proximity of rivers or streams within the basin may result in a drop in surface water level. This drop in surface water is because of the induced flow from river to aquifer.

2.3. Data Description

This research integrated both aquifer properties and hydrometeorological parameters to achieve the objective, focusing on understanding the groundwater level dynamics in a closed hydrogeological system. The dataset comprised both dynamic and static parameters, collected over 17 years from 2006 to 2022. All the datasets were obtained from the Hawke’s Bay Regional Council’s (HBRC) published and unpublished records (https://www.hbrc.govt.nz/environment/environmental-data/ accessed 15 July 2025). The daily datasets obtained from the four meteorological stations were averaged to obtain a representative measurement. Using the average value enhances the accuracy and homogeneity of the dataset for modelling. The GWL and groundwater abstraction datasets were collected on a monthly basis from six monitoring wells. The well depths are as follows: 1430 (46 m), 1458 (25 m), 4695 (6 m), 4700 (98 m), 4701 (111 m), and 4702 (105 m). They provide crucial information on the response of the aquifer to anthropogenic and climatic impacts. This study used 17 years of data to facilitate the evaluation of long-term trends and seasonal fluctuations in groundwater level. The static properties of the aquifer were also integrated to understand the hydrological dynamics of the aquifer. Storativity and transmissivity values were obtained from the report by Jasper [48]. Table 1 shows a summary of the dataset used in this study.

2.4. Data Preprocessing and Feature Engineering

For model development and estimation, the data was partitioned into a training set comprising 80% (163 months) of the total data and a testing set containing the remaining 20% (41 months). Prior to model training, feature scaling was performed using MinMaxScaler to normalize values between 0 and 1 (Equation (1)). This procedure ensures that all features contribute equally to the model and enhances convergence performance.

x_{i, j}^{'} = \frac{x_{i, j} - \min (x_{j})}{\max (x_{j}) - \min (x_{j})}

(1)

where $x_{i, j}^{'}$ is the scaled value, $x_{i, j}$ is the original value of the ith data point for the jth feature (target), and $\min (x_{j})$ and $\max (x_{j})$ stand for the minimum and maximum values of the variables in the training dataset, respectively. The overall analysis was performed using Python 3.13, as illustrated in Figure 3.
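As an illustration of the partitioning and scaling step, the following minimal sketch splits a monthly record chronologically and applies min-max scaling fitted on the training portion only. The feature matrix is synthetic and hypothetical (three made-up input columns over 204 months); it is not the study's dataset, and the exact preprocessing code used by the authors is not given in the text.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical monthly feature matrix: 204 months (2006-2022) x 3 features.
rng = np.random.default_rng(0)
X = rng.normal(loc=[50.0, 5.0, 100.0], scale=[10.0, 1.0, 30.0], size=(204, 3))

# Chronological 80/20 split: the first 163 months train, the last 41 months test.
n_train = int(round(0.8 * len(X)))
X_train, X_test = X[:n_train], X[n_train:]

# Fit the scaler on the training data only, then reuse its min/max for the test data.
scaler = MinMaxScaler(feature_range=(0, 1))
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Equation (1) written out by hand for comparison with MinMaxScaler.
col_min = X_train.min(axis=0)
col_max = X_train.max(axis=0)
manual = (X_train - col_min) / (col_max - col_min)
```

Because the minimum and maximum come from the training period, scaled test values can fall slightly outside the 0 to 1 range when the test period contains values not seen during training.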

2.5. Model Development

2.5.1. Random Forest (RF)

Random Forest, developed by Breiman [49], is an ensemble machine learning algorithm widely used for both classification and regression tasks to overcome the drawbacks of a single decision tree, such as instability and overfitting. It works by growing several decision trees, each on a random subset of the training data, and outputting the mean prediction (for regression) or the mode of the classes (for classification) of the individual trees. The basic principles behind RF are bootstrap aggregating (bagging) and random feature selection at each node split. These mechanisms contribute to the model’s robustness against overfitting, its ability to handle high-dimensional data, and its capacity to capture complex non-linear relationships. It has been widely employed in surface and subsurface hydrology, including surface water and groundwater potential mapping and nitrate pollution vulnerability [50,51]. However, to the best of the authors’ knowledge, few studies to date have used RF for the prediction of groundwater levels with the combined use of dynamic and static input parameters.

The Random Forest regression was implemented in the following manner [49]:

Draw a bootstrap sample from the training data for each tree b (b = 1 to B, where B is the total number of trees in the forest).
Grow a regression tree T_b using the bootstrap sample.
At each node of a growing tree, select a random subset of the input features as split candidates; this injects randomness at each node split. Repeating these steps yields B trees $T_{b}$ .
The final prediction for an input $x_{i}$ is the average of the individual predictions from all the trees in the forest, as shown in Equation (2):

\hat{y} (x_{i}) = \frac{1}{B} \sum_{b = 1}^{B} T_{b} (x_{i})

(2)
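The averaging step in Equation (2) can be checked directly with scikit-learn. The sketch below trains a small forest on synthetic data (hypothetical inputs, not the study's wells) and compares the forest output with the mean of the individual tree predictions. The settings n_estimators=50 and max_features=0.5 are illustrative choices, not the tuned values of the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data standing in for scaled predictors and a GWL target.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(200, 4))
y = np.sin(2 * np.pi * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, size=200)

# bootstrap=True draws a bootstrap sample per tree; max_features=0.5 restricts
# each node split to a random half of the input features.
rf = RandomForestRegressor(
    n_estimators=50, max_features=0.5, bootstrap=True, random_state=0
)
rf.fit(X, y)

# Equation (2): average the B individual tree predictions T_b(x_i).
tree_preds = np.stack([tree.predict(X) for tree in rf.estimators_])
y_hat_manual = tree_preds.mean(axis=0)
y_hat_forest = rf.predict(X)
```

The array tree_preds holds one row per tree, so its spread across rows also gives a rough picture of how much the trees disagree at each point.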

2.5.2. Support Vector Machine (SVM)

Support Vector Machine is a powerful machine learning method relying on statistical learning theory and the principle of structural risk minimization [52]. Statistical learning theory explains the characteristics of learning machines that allow SVM to generalize effectively to unseen data. The governing idea behind SVM can be explained by four fundamental notions: (i) kernel function, (ii) separating hyperplane, (iii) hard margin, (iv) soft margin [53]. It constructs an optimal hyperplane that maximally separates distinct classes. The support vector regressor (SVR) is an extension of SVM that predicts numerical values, and it has an inherent capability to reduce overfitting in regression. Due to their flexibility for addressing high dimensionality, SVMs are broadly employed in hydrogeological studies including GWL forecasting [32]. Different kernel functions such as linear, polynomial, sigmoidal, and Gaussian radial basis function (RBF) strengthen its effectiveness in addressing intricate relationships in hydrological systems. This study implemented the RBF kernel function to capture the complex GWL dynamics. The approach specifically addresses a heterogeneous aquifer response through high-dimensional feature mapping, enabling the precise forecasting of GWL fluctuation under variable hydroclimatic conditions.
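For reference, the RBF kernel measures similarity as k(x, z) = exp(−γ‖x − z‖²). The sketch below evaluates this expression by hand, compares it with scikit-learn's rbf_kernel, and fits an RBF-kernel SVR on synthetic data. The values of C, gamma, and epsilon are placeholders chosen for illustration and are not the hyperparameters selected in this study.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVR


def rbf(x, z, gamma):
    """RBF kernel value exp(-gamma * ||x - z||^2) for two feature vectors."""
    return np.exp(-gamma * np.sum((x - z) ** 2))


# Synthetic, already-scaled inputs and a smooth non-linear target.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(60, 3))
y = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.02, size=60)

gamma = 0.5  # illustrative kernel coefficient
K_manual = np.array([[rbf(a, b, gamma) for b in X] for a in X])
K_sklearn = rbf_kernel(X, X, gamma=gamma)

# Support vector regression with the same kernel; C and epsilon are placeholders.
svr = SVR(kernel="rbf", C=10.0, gamma=gamma, epsilon=0.01).fit(X, y)
y_pred = svr.predict(X)
```

A larger gamma makes the kernel more local, so each support vector influences a smaller neighbourhood of the input space.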

2.5.3. Model Optimization

Model optimization is the process of selecting the best solution among the feasible ones. The performance of the SVR model is sensitive to the choice of its hyperparameters: the regularization parameter (C), the kernel coefficient gamma (γ), and the epsilon-tube width (ϵ). This study used GridSearchCV from the scikit-learn library in Python to optimize the parameters of both SVM and RF. GridSearchCV evaluates all possible combinations of the supplied hyperparameter values to determine the combination that maximizes predictive accuracy [54]. The datasets were split into time periods using TimeSeriesSplit. This approach ensures that each training and testing fold preserves the chronological sequence of the observations, which upholds the temporal structure of the original data [55]. For RF, TimeSeriesSplit cross-validation was employed with the R² score as the scoring metric. Additionally, this study used RandomizedSearchCV to efficiently search a wider hyperparameter space (number of trees, maximum tree depth, minimum samples for splitting, minimum samples per leaf, and maximum features).
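A minimal sketch of this tuning setup is shown below, assuming a scikit-learn pipeline that scales the inputs and fits an RBF-kernel SVR. The data are synthetic, and the grid values, the number of splits (5), and the pipeline step names are illustrative assumptions rather than the configuration used in the study.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Synthetic monthly series with a 12-month cycle (163 training months).
rng = np.random.default_rng(7)
n = 163
t = np.arange(n)
X = np.column_stack(
    [np.sin(2 * np.pi * t / 12), np.cos(2 * np.pi * t / 12), rng.normal(size=n)]
)
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.1, size=n)

# Each fold trains on earlier months and validates on the months that follow.
tscv = TimeSeriesSplit(n_splits=5)

pipe = Pipeline([("scale", MinMaxScaler()), ("svr", SVR(kernel="rbf"))])
param_grid = {
    "svr__C": [1.0, 10.0],
    "svr__gamma": [0.1, 1.0],
    "svr__epsilon": [0.01, 0.1],
}

# Exhaustive search over the 2 x 2 x 2 grid, scored by R² on each fold.
search = GridSearchCV(pipe, param_grid, cv=tscv, scoring="r2")
search.fit(X, y)

# Check that no validation month precedes the training months in any fold.
folds_in_order = all(train.max() < test.min() for train, test in tscv.split(X))
```

Placing the scaler inside the pipeline means it is refitted on the training part of every fold, so no information from validation months leaks into the scaling. Replacing GridSearchCV with RandomizedSearchCV (with param_distributions and n_iter) samples the space instead of enumerating it.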

2.6. Model Performance Evaluation

In the present study, five statistical metrics were employed for model performance evaluation: the coefficient of determination (R²), mean absolute error (MAE), mean absolute percent error (MAPE), root mean square error (RMSE), and percentage bias (PBIAS). The metrics were calculated to quantify the fit between observed data (L) and predicted data (M). The parameters were calculated using Equations (3)–(7).

MAE = \frac{1}{n} \sum_{i = 1}^{n} | L_{i} - M_{i} |

(3)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(L_{i} - M_{i})}^{2}}

(4)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(L_{i} - M_{i})}^{2}}{\sum_{i = 1}^{n} {(L_{i} - \bar{L})}^{2}}

(5)

MAPE = \frac{100}{n} \sum_{i = 1}^{n} \left| \frac{L_{i} - M_{i}}{L_{i}} \right|

(6)

PBIAS = \frac{\sum_{i = 1}^{n} (L_{i} - M_{i})}{\sum_{i = 1}^{n} L_{i}} \times 100

(7)

where L_i and M_i represent the ith observed and predicted values, respectively, for i = 1, …, n, where n is the total number of data points. $\bar{L}$ represents the mean of the observed data.

MAE measures the average magnitude of errors between predicted and observed data regardless of their direction. RMSE estimates the standard deviation of the model prediction errors, so a lower RMSE indicates better model performance. The R² value measures the proportion of the variance in observed data explained by the model. MAPE expresses the average absolute error as a percentage of the observed values, and PBIAS indicates the average tendency of the predictions to over- or under-estimate the observations. Together, the statistical metrics used in this study provide a comprehensive evaluation of the model’s accuracy, precision, and bias.
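The five metrics can be computed in a few lines of NumPy. The sketch below uses the standard definitions (R² as one minus the ratio of the residual to the total sum of squares, and MAPE as the mean absolute error relative to the observed values) and follows the sign convention of observed minus predicted for PBIAS. The function name evaluate and the four sample values are hypothetical and serve only as a worked example.

```python
import numpy as np


def evaluate(obs, pred):
    """Return MAE, RMSE, R2, MAPE (%) and PBIAS (%) for observed vs predicted."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = obs - pred  # L_i - M_i
    return {
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "R2": 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2),
        "MAPE": 100.0 * np.mean(np.abs(err / obs)),
        "PBIAS": 100.0 * np.sum(err) / np.sum(obs),
    }


# Worked example with four made-up groundwater levels (m).
metrics = evaluate([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

For this example the errors are ±0.1 and ±0.2 m, giving MAE = 0.15 m, RMSE ≈ 0.158 m, R² = 0.98, MAPE ≈ 6.67%, and PBIAS = 0% because the over- and under-predictions cancel.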

2.7. Uncertainty Analysis and Quantification

This research used the Bayesian Model Averaging (BMA) approach to capture model uncertainty. BMA is a statistical method used to study the uncertainty inherent in competing models. It follows the law of total probability and Bayes’ theorem to combine a number of credible models by averaging their forecasts, weighted by the posterior probabilities of the models. It was calculated using Equation (8) [32]:

P_{y} = \sum_{k = 1}^{K} β_{k} F_{y k} + e_{y}

(8)

where F_yk represents the point prediction of model k for observation y, e_y is the noise term, β_k denotes the weight of model k, P_y denotes the yth observed value, K is the number of models, and k and y index the models and observations, respectively. The weights and standard deviations (SD) of the normal probability distribution functions were computed using the log-likelihood function (Equation (9)):

L (β_{BMA}, σ_{BMA} | F, P) = \sum_{y = 1}^{n} \log \left\{ \sum_{k = 1}^{K} β_{k} \frac{1}{\sqrt{2 \pi σ_{k}^{2}}} \exp \left[ - \frac{1}{2} σ_{k}^{- 2} {(P_{y} - F_{y k})}^{2} \right] \right\}

(9)

where $(β_{BMA}, σ_{BMA} | F, P)$ represents the maximum likelihood Bayesian weights and standard deviations, and n is the number of observations. Markov Chain Monte Carlo (MCMC) simulations were employed to compute the log-likelihood function.
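To make the role of the weights and standard deviations concrete, the sketch below evaluates a BMA Gaussian-mixture log-likelihood and the weighted point forecast for fixed parameter values. It only evaluates the likelihood; the MCMC sampling used in the study is not reproduced. The function names, the two hypothetical model forecasts, and the chosen weights and standard deviations are all illustrative assumptions.

```python
import numpy as np


def bma_log_likelihood(weights, sigmas, forecasts, observed):
    """Log-likelihood of a Gaussian mixture of K model forecasts.

    weights, sigmas: shape (K,); forecasts: shape (n, K); observed: shape (n,).
    """
    weights = np.asarray(weights, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    observed = np.asarray(observed, dtype=float)
    resid = observed[:, None] - forecasts  # P_y - F_yk for every model
    dens = np.exp(-0.5 * (resid / sigmas) ** 2) / np.sqrt(2 * np.pi * sigmas ** 2)
    return float(np.sum(np.log(dens @ weights)))


def bma_mean(weights, forecasts):
    """Weighted point forecast: sum over k of beta_k * F_yk."""
    return np.asarray(forecasts, dtype=float) @ np.asarray(weights, dtype=float)


# Hypothetical example: model A is close to the observations, model B is biased.
observed = np.array([10.0, 10.5, 11.0])
forecasts = np.column_stack([observed + 0.05, observed + 0.5])
sigmas = [0.2, 0.2]

ll_favor_a = bma_log_likelihood([0.9, 0.1], sigmas, forecasts, observed)
ll_favor_b = bma_log_likelihood([0.1, 0.9], sigmas, forecasts, observed)
combined = bma_mean([0.9, 0.1], forecasts)
```

Weights that favour the more accurate model give the higher log-likelihood here, which is the behaviour a maximum likelihood or MCMC fit exploits when estimating the weights.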

This study computed two objective metrics to quantify model uncertainty: the p-factor and the d-factor. The d-factor is the ratio of the average distance between the upper and lower boundaries of the uncertainty band to the standard deviation of the observation data, whereas the p-factor denotes the proportion of observations that fall within the 95% confidence interval (Equation (12)). This approach has been effectively used in hydrological studies [56]. The following equations were used for determining the d-factor (Equations (10) and (11)):

\bar{d}_{x} = \frac{1}{K} \sum_{i = 1}^{K} (X_{U, i} - X_{L, i})

(10)

d\text{-}factor = \frac{\bar{d}_{x}}{σ_{x}}

(11)

p\text{-}factor = \frac{N_{obs, 95PPU}}{N_{total}}

(12)

where $\bar{d}_{x}$ is the average distance between the lower limit and upper limit (X_L, X_U) of the confidence interval over the K data points, and σ_x is the standard deviation of the observed data. N_obs,95PPU and N_total are the number of observed data points within the 95% uncertainty interval and the total number of observations, respectively.
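Both factors are straightforward to compute once the lower and upper bounds of the 95% band are available. The sketch below is a minimal illustration with made-up values; the function names are hypothetical, and the use of the population standard deviation (NumPy's default, ddof=0) is an assumption because the text does not state which form was used.

```python
import numpy as np


def p_factor(obs, lower, upper):
    """Fraction of observations that fall inside the uncertainty band."""
    obs, lower, upper = (np.asarray(a, dtype=float) for a in (obs, lower, upper))
    inside = (obs >= lower) & (obs <= upper)
    return float(inside.mean())


def d_factor(obs, lower, upper):
    """Mean band width divided by the standard deviation of the observations."""
    obs, lower, upper = (np.asarray(a, dtype=float) for a in (obs, lower, upper))
    mean_width = np.mean(upper - lower)
    return float(mean_width / np.std(obs))  # population SD (ddof=0) assumed


# Made-up example: five observations, a band of width 1.0, one point outside.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
lower = [0.5, 1.5, 2.5, 3.5, 5.2]
upper = [1.5, 2.5, 3.5, 4.5, 6.2]

p = p_factor(obs, lower, upper)
d = d_factor(obs, lower, upper)
```

In this example four of the five observations lie inside the band, so p = 0.8, and the band width of 1.0 divided by the observed standard deviation of about 1.414 gives d ≈ 0.707. A p-factor close to 1 combined with a small d-factor indicates a band that is both reliable and narrow.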

3. Results and Discussion

3.1. Results of Model Performance Evaluation

In this study, the performance of the models was assessed and presented using five statistical metrics: MAE, RMSE, R², MAPE, and PBIAS. The performance was calculated separately for the training period, the independent test period, and the combined dataset. These metrics collectively quantify the accuracy of the models, their explanatory power, and bias estimation. Overall, they provide a comprehensive evaluation of the model’s predictive capabilities. This multi-period evaluation highlights the model’s learning capacity and generalization ability for unseen data. Table 2 and Table 3 present the detailed performance metrics for each model at each well location. The evaluation revealed variability in the performance of both SVM and Random Forest models across the six monitoring wells.

During model training, both models achieved excellent agreement with observed GWL, with R² values exceeding 0.9 for all wells (SVM R² ranging from 0.985 to 0.998 and RF R² ranging from 0.973 to 0.996) (Table 2). This indicates that both models were highly effective in learning the patterns within the training data. MAE and RMSE were recorded as small (RF MAE = 0.012 to 0.175 m; SVM MAE = 0.016 to 0.269 m), indicating the models’ ability to capture groundwater fluctuations. The near-zero PBIAS across wells confirms that neither model displayed systematic over- or under-prediction during training.

In the test period, RF retained a very strong predictive accuracy with a minimum R² of 0.951, MAE = 0.011–0.242 m, and RMSE = 0.021–0.457 m, while SVM’s performance slightly dipped with a minimum R² of 0.918, MAE = 0.023–0.628 m, and RMSE = 0.031–0.829 m. In particular, SVM underperformed at well 1430 (test MAE = 0.628 m, RMSE = 0.829 m, R² = 0.918), suggesting that RF’s ensemble averaging better handles non-linearities and noise in that particular record. Crucially, the performance on the independent testing dataset reflects the models’ ability to generalize to unseen data. The RF model maintained a high performance during testing, with R² values ranging from 0.951 to 0.985 across the wells. The SVM model also showed good generalization, with testing R² values ranging from 0.918 to 0.995. While there is a slight decrease in performance from training to testing for both models, the magnitude of this decrease appears relatively small.

The overall model performance indicates that the RF model generally exhibited a slightly better performance compared with the SVM model across most metrics (Table 3):

➢ MAE/RMSE: RF overall MAE = 0.013 to 0.174 m and RMSE = 0.04 to 0.53 m, versus SVM overall MAE = 0.016 to 0.339 m and RMSE = 0.02 to 0.59 m.
➢ R²: Across wells, RF explains 97.6 to 99.2% of variance, marginally lower than SVM with 97.8 to 99.7%.
➢ MAPE/PBIAS: RF maintains a lower percent error (MAPE = 0.01 to 0.08%) and bias near zero, whereas SVM exhibits a slightly larger error (MAPE = 0.23 to 1.56%) and modest bias at well 4701 (PBIAS = 0.85%).

RF’s ensemble structure, which inherently averages predictions across multiple decision trees, likely enhanced its robustness to spatial heterogeneity. This is reflected in its lower overall MAE (0.013 to 0.174 compared with SVM’s 0.016 to 0.339) and MAPE (a maximum of 0.08 compared with 1.56 for SVM), which is critical for operational applications where proportional error minimization is essential. This result aligns with previous hydrogeologic applications, where RF’s bagging-based structure often yields tighter error distributions and greater robustness to outliers than kernel-based methods [57]. The high R² values, low error metrics, and minimal biases observed for both models suggest that they are capable of accurately predicting GWL across the different wells, making them suitable for practical applications in groundwater management and decision-making.

While the RF and SVM models excel in pattern recognition and can handle non-linear relationships, they lack physical transparency [58]. Recent comparative studies have shown that ML models can match or exceed the predictive accuracy of numerical models under certain conditions. However, physically based models such as MODFLOW offer process-based insights and allow for the explicit representation of flow dynamics [59].

3.2. Comparison of the Time Series Prediction

The time series predicted by each model over the study period show good overall visual agreement with the observations, offering insight into each model’s ability to capture the complex trends inherent in the groundwater system (Figure 4 and Figure 5). Both models demonstrated a strong capability in reproducing the temporal patterns of GWL across most of the wells. For instance, in SVM, the predicted GWL time series closely follows the actual observed data (Figure 4). Likewise, for RF, the predicted GWL time series corresponds closely to the actual measurements. This illustrates that these models successfully capture the sub-seasonal signal variations and predict trends with plausible shapes. This visual fidelity aligns well with the favorable performance metrics (Table 3).

3.3. Scatter Plot Analysis of RF and SVM

The scatter plots of observed versus predicted values, with the corresponding regression equation and the coefficient of determination (R²), are shown in Figure 6 and Figure 7. The result provides a direct visual assessment of model accuracy. The figures show the agreement between the observed and predicted GWL, with the overall performance of SVM marginally superior to RF. However, for SVM the fitting curve slightly deviated from the y = x line, as seen in Figure 7. This deviation can be explained by the slightly higher MAE and RMSE values (0.339 and 0.59, respectively) recorded in Figure 7 for well 1430. Conversely, the plots for wells 1458, 4700, and 4701 are less dispersed and closer to the fitted lines compared with the others. The results generally illustrate a strong predictive performance in most wells, with data points clustering relatively close to the 1:1 line and few outliers. This tight clustering visually reinforces the high R² values achieved by the models. The minimal spread of data points around the line supports the low MAE and RMSE values reported in Table 3, indicating small prediction errors. In addition, the near-zero PBIAS values suggest that these predictions are largely unbiased.

3.4. Uncertainty Analysis

In this study, the uncertainty of GWL modelling is quantified by two widely used metrics (p-factor and d-factor) and visualized by 95% confidence intervals [56]. These metrics give insight into the reliability and precision of the models. The results show that the p-factor values ranged between 0.927 and 0.951 for SVM and between 0.878 and 0.976 for RF (Table 4). These values exceed the commonly accepted threshold of 0.7–0.8, indicating that both models provide good coverage of the observed data [60]. The SVM model exhibited a consistent p-factor of 0.951 for wells 1430, 1458, 4695, and 4700, indicating that the model captures about 95% of the observed data within the uncertainty bounds. For wells 4701 and 4702, the p-factor decreased slightly to 0.927 while still maintaining a high level of reliability. However, the d-factor for the SVM model varied across the wells, with values ranging from 0.43 to 1.07. The lower d-factor values, such as 0.47 and 0.43 for wells 4700 and 4701, signify that the model has a relatively low uncertainty (narrower prediction intervals) and can make precise GWL predictions for those wells. The higher d-factor values, 1.07 and 1.04 (wells 1430 and 4695), show a higher level of uncertainty, with the widest prediction intervals relative to the observed variability (Figure 9).
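As an illustration of how these two metrics behave, the sketch below computes them for a small hypothetical series. It assumes the usual definitions: the p-factor is the fraction of observations falling inside the 95% band, and the d-factor is the mean band width divided by the standard deviation of the observations (the sample standard deviation is used here). The function names and data are illustrative and are not taken from the study.

```python
import statistics

def p_factor(observed, lower, upper):
    """Fraction of observations lying inside [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)

def d_factor(observed, lower, upper):
    """Mean interval width divided by the standard deviation of the observations."""
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(observed)
    return mean_width / statistics.stdev(observed)

# Hypothetical observed levels and a symmetric band around hypothetical predictions.
obs = [10.0, 10.4, 10.8, 10.4, 10.0, 9.6, 9.2, 9.6]
pred = [10.1, 10.3, 10.9, 10.2, 10.0, 9.7, 9.6, 9.5]
half_width = 0.3
lower = [p - half_width for p in pred]
upper = [p + half_width for p in pred]

p = p_factor(obs, lower, upper)
d = d_factor(obs, lower, upper)
```

The two metrics pull in opposite directions: widening the band raises the p-factor but also raises the d-factor, so a p-factor close to 1 is only informative when the d-factor stays small.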

The RF model showed narrower intervals, with the d-factor ranging between 0.436 and 0.769; the lower values indicate more precise GWL predictions with narrower uncertainty bounds (Figure 8 and Table 4). For example, the d-factors of 0.436 for well 1430 and 0.45 for well 4702 suggest that the RF model has relatively low uncertainty in these wells. In addition, the model demonstrated a p-factor ranging from 0.878 to 0.976. This signifies that the model can reliably capture the observed data within the 95% prediction uncertainty bounds. The model maintained a high p-factor of 0.976 for well 1430, showing the highest performance together with a low d-factor (0.436). This observation is consistent with studies such as Pham et al. [61] and Rahaman et al. [62], which reported RF’s robustness in handling non-linear groundwater data and delivering tighter confidence bands. Moreover, wells such as 4695 demonstrated a strong performance under both models, which suggests that site-specific factors influence predictability [63]. This might help prioritize the allocation of resources and decision-making, focusing on the wells with lower uncertainty to ensure more reliable groundwater management strategies. Overall, the RF model achieved a lower maximum d-factor (0.769 versus 1.07 for SVM) and lower values in wells 1430, 4695, and 4702, whereas the SVM d-factors were lower in wells 1458, 4700, and 4701 (Table 4). The spatial variability in uncertainty metrics across the wells is consistent with the variations in the models’ prediction performance (Table 3).

The confidence interval analysis revealed that the 95% confidence interval calculated by the RF model was narrower than the one estimated by the SVM model in wells 1430, 4695, and 4702, with the largest difference in well 1430 (Table 5, Figure 8 and Figure 9). This might be attributed to the fact that the Random Forest model combines many decision trees, each trained on a bootstrap sample drawn randomly with replacement from the training data; averaging over these trees reduces the variance of the prediction.
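The interval widths implied by Table 5 can be tabulated directly. The short sketch below copies the lower and upper bounds from Table 5 and subtracts them; the variable and function names are illustrative only.

```python
# Lower and upper 95% bounds on the test set, copied from Table 5.
bounds = {
    "SVM": {
        1430: (216.988, 220.070), 1458: (158.426, 158.682),
        4695: (143.165, 143.342), 4700: (208.541, 209.874),
        4701: (170.691, 171.457), 4702: (168.425, 169.688),
    },
    "RF": {
        1430: (218.046, 219.306), 1458: (158.412, 158.670),
        4695: (143.205, 143.303), 4700: (208.123, 210.048),
        4701: (170.468, 171.835), 4702: (168.758, 169.401),
    },
}

def width(model, well):
    """Interval width (upper minus lower), rounded to the table's precision."""
    lower, upper = bounds[model][well]
    return round(upper - lower, 3)

widths = {m: {w: width(m, w) for w in wells} for m, wells in bounds.items()}

# Wells in which the RF interval is narrower than the SVM interval.
rf_narrower = [w for w in bounds["RF"] if widths["RF"][w] < widths["SVM"][w]]
```

With the Table 5 values, the RF interval is narrower in wells 1430, 4695, and 4702 (for example, 1.260 against 3.082 in well 1430) and wider in wells 1458, 4700, and 4701.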

The results also show that the Random Forest model may be slightly more robust in capturing the non-linear response of GWL to climate change and anthropogenic factors. This could be attributed to the model’s ensemble learning structure, which efficiently handles complex variable interactions and reduces overfitting. This is particularly important in a complex environment like the Ruataniwha Basin, where aquifer heterogeneity, recharge variability, and pumping rates play a crucial role. In practice, however, both the SVM and RF models require large input datasets (climatic variables, pumping tests, groundwater abstraction, GWLs, and others), which is a limitation in data-scarce areas. In addition, their limited interpretability is often a barrier to real-time operation. They also face challenges in capturing the long-term dependencies and seasonal trends inherent in groundwater level fluctuations [33]. To mitigate these shortcomings, this research employed seasonal decomposition, lagged variables, grid search, and time series cross-validation. Furthermore, future research could integrate additional physical explanatory variables, such as soil moisture and geophysical data, to improve the models and their policy relevance for water resource management in the region.
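Two of the strategies named above, lagged variables and time series cross-validation, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the configuration used in the study: the number of lags, the number of splits, and the synthetic monthly series are hypothetical, and the helper names `make_lagged` and `expanding_window_splits` are introduced here only for the example. scikit-learn's `TimeSeriesSplit` provides a comparable splitter.

```python
def make_lagged(series, n_lags):
    """Return (X, y) where each row of X holds the n_lags values preceding y."""
    X = [series[i - n_lags:i] for i in range(n_lags, len(series))]
    y = series[n_lags:]
    return X, y

def expanding_window_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs in which training always precedes testing."""
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("too few samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any remaining samples.
        test_end = n_samples if k == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, test_end))

# Hypothetical monthly levels over two years, for illustration only.
levels = [float(10 + (m % 12) * 0.1) for m in range(24)]
X, y = make_lagged(levels, n_lags=3)
splits = list(expanding_window_splits(len(X), n_splits=3))
```

Keeping every training index earlier than every test index avoids leaking future groundwater levels into the model during hyperparameter tuning, which a shuffled k-fold split would not guarantee.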

4. Conclusions

This study developed machine learning models for groundwater level prediction in the heterogeneous aquifer of the Ruataniwha Basin, New Zealand. The aim was to achieve reliable forecasting of groundwater level while accounting for climate variables and aquifer heterogeneity. Considering both static aquifer properties and dynamic variables made it possible to analyze the complex dynamics of the groundwater system effectively.

Random Forest (RF) and Support Vector Machine (SVM) models were trained and validated using data from six monitoring wells from 2006 to 2022. The results revealed that both algorithms achieved an excellent predictive performance. The training phase results showed R² values exceeding 0.97 for both models, while independent testing maintained a strong performance with R² values above 0.91. However, the RF model was found to have a better overall performance, as indicated by lower prediction errors (MAE: 0.013–0.174 m; RMSE: 0.04–0.53 m) and enhanced bias control (PBIAS ≈ 0%) compared with SVM. This performance can be attributed to the ensemble structure of RF, which provides better robustness to spatial heterogeneity and outliers through its bagging-based averaging mechanism.

The uncertainty analysis using Bayesian Model Averaging showed that both models provide reliable predictions with high confidence. The p-factor values are higher than 0.87 for all wells, which shows that more than 87% of the observed groundwater level data fell within the 95% prediction intervals. The RF model has a narrower range of uncertainty bounds (d-factor: 0.436–0.769) while maintaining excellent coverage (p-factor: 0.878–0.976), which supports its reliability for operational groundwater management applications.

Coupling seasonal decomposition with machine learning approaches enables the analysis of the complex temporal changes in groundwater levels. Predictions of this accuracy can support pumping schedule optimization, drought contingency planning, and water resources allocation. Future research should focus on integrating climate change projections to enhance the models’ utility for long-term planning.

Author Contributions

Conceptualization, D.K. and H.M.B.; methodology, D.K.; software, D.K. and M.B.; validation, all authors; formal analysis, D.K. and H.M.B.; investigation, M.B.; resources, H.M.B.; data curation, D.K. and M.B.; writing—original draft preparation, D.K.; writing—review and editing, M.B. and H.M.B.; visualization, all authors; supervision, H.M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The dataset used for this study was obtained from Hawke’s Bay Regional Council (HBRC), New Zealand.

Acknowledgments

Thanks to Hawke’s Bay Regional Council (HBRC), New Zealand, for providing the data needed for this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ghosh, A.; Adhikary, P.P.; Bera, B.; Bhunia, G.S.; Shit, P.K. Assessment of groundwater potential zone using MCDA and AHP techniques: Case study from a tropical river basin of India. Appl. Water Sci. 2022, 12, 37.
2. Ravindiran, G.; Rajamanickam, S.; Sivarethinamohan, S.; Karupaiya Sathaiah, B.; Ravindran, G.; Muniasamy, S.K.; Hayder, G. A Review of the Status, Effects, Prevention, and Remediation of Groundwater Contamination for Sustainable Environment. Water 2023, 15, 3662.
3. Saito, L.; Christian, B.; Diffley, J.; Richter, H.; Rohde, M.M.; Morrison, S.A. Managing Groundwater to Ensure Ecosystem Function. Groundwater 2021, 59, 322–333.
4. Baalousha, H.M. Characterisation of groundwater-surface water interaction using field measurements and numerical modelling: A case study from the Ruataniwha Basin, Hawke’s Bay, New Zealand. Appl. Water Sci. 2012, 2, 109–118.
5. Rajeevan, U.; Mishra, B.K. Sustainable management of the groundwater resource of Jaffna, Sri Lanka with the participation of households: Insights from a study on household water consumption and management. Groundw. Sustain. Dev. 2020, 10, 100280.
6. Dao, P.U.; Heuzard, A.G.; Le, T.X.H.; Zhao, J.; Yin, R.; Shang, C.; Fan, C. The impacts of climate change on groundwater quality: A review. Sci. Total Environ. 2024, 912, 169241.
7. Ghazi, B.; Jeihouni, E.; Kalantari, Z. Predicting groundwater level fluctuations under climate change scenarios for Tasuj plain, Iran. Arab. J. Geosci. 2021, 14, 115.
8. Khorrami, M.; Malekmohammadi, B. Effects of excessive water extraction on groundwater ecosystem services: Vulnerability assessments using biophysical approaches. Sci. Total Environ. 2021, 799, 149304.
9. Wunsch, A.; Liesch, T.; Broda, S. Deep learning shows declining groundwater levels in Germany until 2100 due to climate change. Nat. Commun. 2022, 13, 1221.
10. Jasechko, S.; Seybold, H.; Perrone, D.; Fan, Y.; Shamsudduha, M.; Taylor, R.G.; Fallatah, O.; Kirchner, J.W. Rapid groundwater decline and some cases of recovery in aquifers globally. Nature 2024, 625, 715–721.
11. Herrera-García, G.; Ezquerro, P.; Tomas, R.; Béjar-Pizarro, M.; López-Vinielles, J.; Rossi, M.; Mateos, R.M.; Carreón-Freyre, D.; Lambert, J.; Teatini, P.; et al. Mapping the global threat of land subsidence. Science 2021, 371, 34–36.
12. Erban, L.E.; Gorelick, S.M.; Zebker, H.A.; Fendorf, S. Release of arsenic to deep groundwater in the Mekong Delta, Vietnam, linked to pumping-induced land subsidence. Proc. Natl. Acad. Sci. USA 2013, 110, 13751–13756.
13. Smith, R.; Knight, R.; Fendorf, S. Overpumping leads to California groundwater arsenic threat. Nat. Commun. 2018, 9, 2089.
14. Evans, S.W.; Jones, N.L.; Williams, G.P.; Ames, D.P.; Nelson, E.J. Groundwater Level Mapping Tool: An open source web application for assessing groundwater sustainability. Environ. Model. Softw. 2020, 131, 104782.
15. Khan, J.; Lee, E.; Balobaid, A.S.; Kim, K. A Comprehensive Review of Conventional, Machine Leaning, and Deep Learning Models for Groundwater Level (GWL) Forecasting. Appl. Sci. 2023, 13, 2743.
16. Igwebuike, N.; Ajayi, M.; Okolie, C.; Kanyerere, T.; Halihan, T. Application of machine learning and deep learning for predicting groundwater levels in the West Coast Aquifer System, South Africa. Earth Sci. Inform. 2025, 18, 6.
17. Jafarzadeh, A.; Khashei-Siuki, A.; Pourreza-Bilondi, M. Performance Assessment of Model Averaging Techniques to Reduce Structural Uncertainty of Groundwater Modeling. Water Resour. Manag. 2022, 36, 353–377.
18. Mustafa, S.M.T.; Moudud Hasan, M.; Saha, A.K.; Rannu, R.P.; Van Uytven, E.; Willems, P.; Huysmans, M. Multi-model approach to quantify groundwater-level prediction uncertainty using an ensemble of global climate models and multiple abstraction scenarios. Hydrol. Earth Syst. Sci. 2019, 23, 2279–2303.
19. Ntona, M.M.; Busico, G.; Mastrocicco, M.; Kazakis, N. Modeling groundwater and surface water interaction: An overview of current status and future challenges. Sci. Total Environ. 2022, 846, 157355.
20. Sriram, R. Groundwater Level Prediction: A novel study on Machine Learning based approach with regression models for Sustainable Resource Management. In Proceedings of the 2023 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM), Mysuru, India, 2–4 November 2023; pp. 137–142.
21. Taccari, M.L.; Wang, H.; Nuttall, J.; Chen, X.; Jimack, P.K. Spatial-temporal graph neural networks for groundwater data. Sci. Rep. 2024, 14, 24564.
22. Huang, R.; Ma, C.; Ma, J.; Huangfu, X.; He, Q. Machine learning in natural and engineered water systems. Water Res. 2021, 205, 117666.
23. Bubakran, K.S.; Novinpour, E.A.; Aghdam, F.S. A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in the Ziveh Aquifer–West Azerbaijan, NW Iran. Arab. J. Geosci. 2023, 16, 287.
24. Hou, Z.; Lao, W.; Wang, Y.; Lu, W. Homotopy-based hyper-heuristic searching approach for reciprocal feedback inversion of groundwater contamination source and aquifer parameters. Appl. Soft Comput. 2021, 104, 107191.
25. Cai, H.; Shi, H.; Liu, S.; Babovic, V. Impacts of regional characteristics on improving the accuracy of groundwater level prediction using machine learning: The case of central eastern continental United States. J. Hydrol. Reg. Stud. 2021, 37, 100930.
26. Kumar, R.; Dwivedi, S.B.; Gaur, S. A comparative study of machine learning and Fuzzy-AHP technique to groundwater potential mapping in the data-scarce region. Comput. Geosci. 2021, 155, 104855.
27. Sahour, H.; Gholami, V.; Vazifedan, M. A comparative analysis of statistical and machine learning techniques for mapping the spatial distribution of groundwater salinity in a coastal aquifer. J. Hydrol. 2020, 591, 125321.
28. Baalousha, H.M. Machine Learning-Driven Calibration of MODFLOW Models: Comparing Random Forest and XGBoost Approaches. Geosciences 2025, 15, 303.
29. Baalousha, H.M. Machine Learning Approaches for Groundwater Vulnerability Assessment in Arid Environments: Enhancing DRASTIC with ANN and Random Forest. Groundw. Sustain. Dev. 2025, 30, 101496.
30. Hikouei, I.S.; Eshleman, K.N.; Saharjo, B.H.; Graham, L.L.B.; Applegate, G.; Cochrane, M.A. Using machine learning algorithms to predict groundwater levels in Indonesian tropical peatlands. Sci. Total Environ. 2023, 857, 159701.
31. Liu, Q.; Gui, D.; Zhang, L.; Niu, J.; Dai, H.; Wei, G.; Hu, B.X. Simulation of regional groundwater levels in arid regions using interpretable machine learning models. Sci. Total Environ. 2022, 831, 154902.
32. Seifi, A.; Ehteram, M.; Singh, V.P.; Mosavi, A. Modeling and uncertainty analysis of groundwater level using six evolutionary optimization algorithms hybridized with ANFIS, SVM, and ANN. Sustainability 2020, 12, 4023.
33. Tao, H.; Hameed, M.M.; Marhoon, H.A.; Zounemat-Kermani, M.; Heddam, S.; Sungwon, K.; Sulaiman, S.O.; Tan, M.L.; Sa’adi, Z.; Mehr, A.D.; et al. Groundwater level prediction using machine learning models: A comprehensive review. Neurocomputing 2022, 489, 271–308.
34. BinMakhashen, G.M.; Benaafi, M. Predicting seawater intrusion in coastal areas using machine learning: A case study of arid coastal aquifers, Saudi Arabia. Groundw. Sustain. Dev. 2024, 26, 101300.
35. Raghavendra, S.; Deka, P.C. Support vector machine applications in the field of hydrology: A review. Appl. Soft Comput. J. 2014, 19, 372–386.
36. Rodriguez-Galiano, V.; Sanchez-Castillo, M.; Chica-Olmo, M.; Chica-Rivas, M. Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines. Ore Geol. Rev. 2015, 71, 804–818.
37. Abdar, M.; Pourpanah, F.; Hussain, S.; Rezazadegan, D.; Liu, L.; Ghavamzadeh, M.; Fieguth, P.; Cao, X.; Khosravi, A.; Acharya, U.R.; et al. A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Inf. Fusion 2021, 76, 243–297.
38. Abbaszadeh Shahri, A.; Shan, C.; Larsson, S. A Novel Approach to Uncertainty Quantification in Groundwater Table Modeling by Automated Predictive Deep Learning. Nat. Resour. Res. 2022, 31, 1351–1373.
39. Herrera, P.A.; Marazuela, M.A.; Hofmann, T. Parameter estimation and uncertainty analysis in hydrological modeling. Wiley Interdiscip. Rev. Water 2022, 9, e1569.
40. Varouchakis, E.A.; Yetilmezsoy, K.; Karatzas, G.P. A decision-making framework for sustainable management of groundwater resources under uncertainty: Combination of Bayesian risk approach and statistical tools. Water Policy 2019, 21, 602–622.
41. Fragoso, T.M.; Bertoli, W.; Louzada, F. Bayesian Model Averaging: A Systematic Review and Conceptual Classification. Int. Stat. Rev. 2018, 86, 1–28.
42. Baalousha, H.M. Modelling surface-groundwater interaction in the Ruataniwha basin, Hawke’s Bay, New Zealand. Environ. Earth Sci. 2012, 66, 285–294.
43. Griffiths, J.; Yang, J.; Woods, R.; Zammit, C.; Porhemmat, R.; Shankar, U.; Rajanayaka, C.; Ren, J.; Howden, N. Parameterization of a National Groundwater Model for New Zealand. Sustainability 2023, 15, 3280.
44. Ballance, P.F. New Zealand Geology: An Illustrated Guide; Geoscience Society of New Zealand: Wellington, New Zealand, 2017.
45. Rawlinson, Z. Hawke’s Bay 3D Aquifer Mapping Project: 3D Hydrogeological Models from SkyTEM Data in the Ruataniwha Plains; Consultancy Report 2023/117; GNS Science: Lower Hutt, New Zealand, 2024; 60p.
46. Francis, D. Subsurface Geology of the Ruataniwha Plains and Relation to Hydrology; Technical report; Geological Research Ltd.: Lower Hutt, New Zealand, 2001.
47. Baalousha, H. Stochastic water balance model for rainfall recharge quantification in Ruataniwha Basin, New Zealand. Environ. Geol. 2009, 58, 85–93.
48. Jasper, C. Ruataniwha Aquifer Properties Analysis and Mapping (HBRC Publication No. 5640); Hawke’s Bay Regional Council: Hawke’s Bay, New Zealand, 2018.
49. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
50. He, S.; Wu, J.; Wang, D.; He, X. Predictive modeling of groundwater nitrate pollution and evaluating its main impact factors using random forest. Chemosphere 2022, 290, 133388.
51. Sachdeva, S.; Kumar, B. Comparison of gradient boosted decision trees and random forest for groundwater potential mapping in Dholpur (Rajasthan), India. Stoch. Environ. Res. Risk Assess. 2021, 35, 287–306.
52. Vapnik, V. The Nature of Statistical Learning Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1999.
53. Noble, W.S. What is a support vector machine? Nat. Biotechnol. 2006, 24, 1565–1567.
54. Pohan, A.B.; Kurniasih, A. Optimization of Classification Algorithm with GridSearchCV and Hyperparameter Tuning for Sentiment Analysis of the Nusantara Capital City. J. Artif. Intell. Eng. Appl. 2024, 3, 808–814.
55. Kumar, V.; Kedam, N.; Sharma, K.V.; Mehta, D.J.; Caloiero, T. Advanced Machine Learning Techniques to Improve Hydrological Prediction: A Comparative Analysis of Streamflow Prediction Models. Water 2023, 15, 2572.
56. Abbaspour, K.C.; Johnson, C.A.; van Genuchten, M.T. Estimating Uncertain Flow and Transport Parameters Using a Sequential Uncertainty Fitting Procedure. Vadose Zone J. 2004, 3, 1340–1352.
57. Demir, S.; Şahin, E.K. Liquefaction prediction with robust machine learning algorithms (SVM, RF, and XGBoost) supported by genetic algorithm-based feature selection and parameter optimization from the perspective of data processing. Environ. Earth Sci. 2022, 81, 459.
58. Shen, C.; Appling, A.P.; Gentine, P.; Bandai, T.; Gupta, H.; Tartakovsky, A.; Baity-Jesi, M.; Fenicia, F.; Kifer, D.; Li, L.; et al. Differentiable modelling to unify machine learning and physical models for geosciences. Nat. Rev. Earth Environ. 2023, 4, 552–567.
59. Goodarzi, M.R.; Bafrouei, H.B.; Vazirian, M. Insight into groundwater level prediction with feature effectiveness: Comparison of machine learning and numerical methods. Hydrol. Res. 2024, 56, 74–92.
60. Simões, K.; de Condé, R.C.C.; Roig, H.L.; Cicerelli, R.E. Application of the SWAT hydrological model in flow and solid discharge simulation as a management tool of the Indaia River Basin, Alto São Francisco, Minas Gerais. Rev. Ambient. Agua 2021, 16, e2694.
61. Pham, Q.B.; Kumar, M.; Di Nunno, F.; Elbeltagi, A.; Granata, F.; Islam, A.R.M.T.; Talukdar, S.; Nguyen, X.C.; Ahmed, A.N.; Anh, D.T. Groundwater level prediction using machine learning algorithms in a drought-prone area. Neural Comput. Appl. 2022, 34, 10751–10773.
62. Rahaman, M.M.; Thakur, B.; Kalra, A.; Li, R.; Maheshwari, P. Estimating high-resolution groundwater storage from GRACE: A random forest approach. Environments 2019, 6, 63.
63. Haaf, E.; Giese, M.; Reimann, T.; Barthel, R. Data-Driven Estimation of Groundwater Level Time-Series at Unmonitored Sites Using Comparative Regional Analysis. Water Resour. Res. 2023, 59, e2022WR033470.

Figure 1. Location map of Ruataniwha Basin.

Figure 2. Geological map of the Ruataniwha Basin.

Figure 3. Flowchart of the overall methodology.

Figure 4. Observed and predicted GWL for RF model in six wells.

Figure 5. Observed and predicted GWL for SVM in six wells.

Figure 6. Scatter plot of observed and predicted GWL data obtained using RF model.

Figure 7. Scatter plot of observed and predicted GWL data obtained using SVM model.

Figure 8. Graphs showing 95% confidence interval for the RF model.

Figure 9. Graphs showing 95% confidence interval for the SVM model.

Table 1. Summary of dataset components.

Variables	Temporal Resolution	Source	Remark
Rainfall, temperature, and PET	Daily	Gauge stations	Averaged across stations
Storativity, transmissivity	Static	Piezometric records	Location-specific values for each of the 6 wells
Groundwater level and abstraction	Monthly	6 wells	Direct measurements from HBRC monitoring wells

Table 2. Model performance evaluations.

Model	Metrics	Training						Test
Model	Metrics	1430	1458	4695	4700	4701	4702	1430	1458	4695	4700	4701	4702
SVM	MAE	0.269	0.016	0.014	0.104	0.049	0.056	0.628	0.03	0.023	0.222	0.114	0.17
	RMSE	0.512	0.02	0.016	0.121	0.057	0.071	0.829	0.037	0.031	0.28	0.142	0.254
	R²	0.985	0.997	0.997	0.997	0.998	0.998	0.918	0.993	0.966	0.990	0.995	0.968
	PBIAS	−0.02	0.00	0.00	0.00	0.00	0.00	0.12	0.00	0.00	0.000	0.02	0.04
	MAPE	0.13	0.01	0.01	0.05	0.03	0.03	0.29	0.02	0.02	0.11	0.07	0.1
RF	MAE	0.175	0.012	0.014	0.08	0.036	0.038	0.169	0.04	0.011	0.242	0.129	0.121
	RMSE	0.564	0.024	0.050	0.218	0.097	0.115	0.366	0.075	0.021	0.457	0.393	0.187
	R²	0.982	0.996	0.973	0.992	0.994	0.994	0.984	0.972	0.985	0.974	0.951	0.983
	PBIAS	−0.02	0.00	0.00	0.00	0.00	0.00	0.04	−0.01	0.00	−0.02	−0.05	0.03
	MAPE	0.08	0.01	0.01	0.04	0.02	0.02	0.08	0.03	0.01	0.12	0.08	0.07

Table 3. Overall model performance evaluations.

Metrics	Support Vector Machine						Random Forest
Metrics	1430	1458	4695	4700	4701	4702	1430	1458	4695	4700	4701	4702
MAE	0.339	0.019	0.016	0.127	0.063	0.077	0.174	0.018	0.013	0.112	0.055	0.055
RMSE	0.59	0.024	0.02	0.166	0.082	0.130	0.53	0.04	0.045	0.283	0.196	0.133
R²	0.978	0.997	0.996	0.996	0.997	0.993	0.982	0.991	0.976	0.988	0.98	0.992
PBIAS	0.76	−0.2	−0.22	−0.48	0.85	0.82	0.00	0.00	0.00	0.00	−0.01	0.00
MAPE	1.56	0.29	0.23	1.39	0.87	0.86	0.08	0.01	0.01	0.05	0.03	0.03

Table 4. Uncertainty quantification of predictive models using p-factor and d-factor.

Metric	Support Vector Machine						Random Forest
Metric	1430	1458	4695	4700	4701	4702	1430	1458	4695	4700	4701	4702
p-factor	0.951	0.951	0.951	0.951	0.927	0.927	0.976	0.927	0.902	0.878	0.951	0.902
d-factor	1.07	0.57	1.04	0.47	0.43	0.89	0.436	0.576	0.581	0.685	0.769	0.45

Table 5. Lower and upper bounds of the 95% confidence interval on the test set.

Model	Well No.	Lower Boundary	Upper Boundary
SVM	1430	216.988	220.070
	1458	158.426	158.682
	4695	143.165	143.342
	4700	208.541	209.874
	4701	170.691	171.457
	4702	168.425	169.688
RF	1430	218.046	219.306
	1458	158.412	158.670
	4695	143.205	143.303
	4700	208.123	210.048
	4701	170.468	171.835
	4702	168.758	169.401

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
