Machine-Learning Models and Global Sensitivity Analyses to Explicitly Estimate Groundwater Presence Validated by Observed Dataset at K-NET in Japan

Thabet, Mostafa

doi:10.3390/geosciences15040126


by Mostafa Thabet

Geology Department, Faculty of Science, Assiut University, Assiut 71516, Egypt

Geosciences 2025, 15(4), 126; https://doi.org/10.3390/geosciences15040126

Submission received: 17 February 2025 / Revised: 21 March 2025 / Accepted: 25 March 2025 / Published: 1 April 2025

(This article belongs to the Section Geophysics)


Abstract


This study incorporates comprehensively observed in situ geotechnical, geophysical, petrophysical, and lithological proxies to estimate groundwater presence. Two machine-learning approaches, random forest regression (RFR) and deep neural network (DNN), are applied. The constructed RFR and DNN models are validated using observed depths of groundwater levels at 772 K-NET sites in Japan. The RFR model exhibited effective and robust performance compared to the poor-fitting performance of the DNN model and previous physics-based groundwater detection approaches. The RFR and DNN models yielded a 1:1 agreement between the observed and predicted groundwater levels at 733 and 470 K-NET sites, respectively. During the RFR training process, all datasets at the 772 K-NET sites were split into training, validating, and unseen testing datasets with the ratio set at 1:1:11. This k-fold cross-validation strategy demonstrates better-fitting performance for the RFR model. The variance-based global sensitivity analyses clarify the contributions of, and interactions among, the in situ observed proxies. The P-wave velocity and the standard penetration test values exhibit prominent contributions among the proxies at groundwater depths. To apply the RFR model at any given site, reliable and detailed P- and S-wave velocity structures are crucial to building the needed source datasets.

Keywords:

machine learning; sensitivity analyses; P- and S-wave velocities; groundwater; K-NET Japan

1. Introduction

Groundwater prospecting for potential sources is important, particularly due to the unprecedented global hydrological drought under extreme climate change [1,2,3]. The remarkable availability of combined geological, geophysical, petrophysical, and hydrological datasets can provide the needed information for accurate and reliable approaches to determine groundwater presence. Marine seismic reflection datasets have been used to study offshore groundwater exploration in different locations around the world [4]. That review highlighted the importance of the comprehensive interpretation of high-resolution 2D seismic stratigraphy datasets to yield proper information on the properties and architectures of various deep groundwater aquifers. An index to determine the groundwater presence (i.e., seismic reflectivity parameter, SRP) was proposed by [5] using shallow subsurface velocity structures (i.e., P-wave and S-wave velocities, $V_{P}$ and $V_{S}$, respectively) at depths ≤ 20 m. The study by [5] was validated theoretically based on pioneering studies by [6,7], which studied the $V_{P}$ and $V_{S}$ variabilities under different groundwater saturation conditions, and based on comparison with the water seismic index (WSI), which was introduced by [8].

$V_{P}$ and $V_{S}$ are highly decoupled in the presence of fluids [6,7]. Recently, the influence of saturation and pore fluids on $V_{P}$ and $V_{S}$ has been studied theoretically by [9] and experimentally by [10]. In addition, significant variations of $V_{P} / V_{S}$ and Poisson’s ratio were highlighted when correlated with the groundwater level [11]. The saturation degree significantly impacts shallow, high-resolution seismic reflection and refraction [12]. The groundwater level was indicated to correspond to a $V_{P}$ of 1100–1200 m/s [13], while a $V_{P}$ < 1500 m/s was reported for water-saturated soils at depths > 4 m [14]. However, these case-by-case thresholds could give a misleading indication of the actual groundwater level. Most of these previous studies were theoretical and tended to assign thresholds for $V_{P}$, $V_{P} / V_{S}$, and/or Poisson’s ratio to indicate the groundwater level. The impact of saturation and pore fluids on $V_{P}$ and $V_{S}$ in consolidated lithologies was studied theoretically by [15]. Nevertheless, detection of the non-saturated/saturated interface is characterized by complex seismic responses [16].

Therefore, building on these previous studies, the current study introduces nonparametric models through two machine-learning approaches that have been widely recognized in recent years. This study aims to experimentally understand the complexity at the groundwater level (i.e., non-saturated/saturated interface) by utilizing observed in situ geotechnical, geophysical, petrophysical, and lithological proxies combined with observed groundwater levels.

This study aims to construct a deep neural network (DNN) model and a random forest regression (RFR) model to explicitly estimate the groundwater presence, validated by observed groundwater levels at 772 shallow borehole sites of K-NET in Japan [17]. K-NET is widely recognized for providing comprehensive observed in situ geotechnical, geophysical, petrophysical, and lithological proxies for the upper 20 m depth. Initially, the impact of these proxies and their interactions with each other is quantified through variance-based global sensitivity analyses with first- and total-order indices. These sensitivity analyses are combined with machine learning to assess the contributions of these input proxies to the output model under dry and wet conditions (i.e., unsaturated and saturated conditions, respectively). Subsequently, the needed source datasets at those 772 K-NET sites can be confirmed as inputs to the DNN and RFR approaches to effectively predict the groundwater presence at each K-NET site. These source datasets are the N-SPT values (number of blows for the standard penetration test), the P- and S-wave velocity structures from PS-logging measurements ($V_{P}$ and $V_{S}$), the bulk density ($ρ$), and the lithology. Eventually, two new machine-learning models (i.e., DNN and RFR) can be constructed and compared to estimate the groundwater presence. The prediction performance of both the DNN and RFR models is evaluated using two validation approaches. The first is applying a k-fold cross-validation approach. The second is dividing the source datasets into two portions for training and validation. Since $V_{P}$ and $V_{S}$ are the key factors needed to build the source datasets and input them into the DNN and RFR approaches, groundwater depths can be detected effectively and robustly. This paper presents the first use of the DNN and RFR to estimate groundwater presence validated by the observed depths at 772 sites. Thus, the main objective of the present study is to apply the constructed RFR and DNN models to any given region for various environmental, engineering, and/or geotechnical practices.

2. Methodology

To explicitly understand and estimate the groundwater presence validated by observed groundwater levels at K-NET sites in Japan, the current study introduces variance-based global sensitivity analyses to understand the interactions and contributions between the observed in situ geotechnical, geophysical, petrophysical, and lithological proxies combined with observed groundwater levels. Then, DNN and RFR models are constructed and compared to predict the depth of the groundwater presence efficiently. Moreover, these predicted groundwater levels are compared with those derived using SRP [5] and WSI [8]. A flowchart diagram of the adopted methodology is shown in Figure 1.

2.1. Global Sensitivity Analyses

The global sensitivity approach is used in the current study to identify the proxies that have the highest effect on predicting groundwater presence. Python (Ver. 3.12) implementations of a range of global sensitivity analysis techniques [18] were recently introduced through the SALib library, which builds on previous studies [19,20]. SALib, a widely used Python library for global sensitivity analysis that includes Sobol methods [21], is used in the current study (Figure 2).

In this implementation, the variance-based Sobol sensitivity analysis technique is used. A simple non-linear model function is used, which is defined as a combination of summation and interaction terms of the input proxies. Using a more complex model function can lead to less interpretable results, particularly in this study, where the dataset is limited to the shallow 20 m depths. The parameter space (i.e., minimum and maximum values for each proxy of N-SPT, $V_{P}$, $V_{S}$, $ρ$, and lithology) is initially determined to define the bounds for Sobol sampling. After determining these bounds, a Sobol sequence is generated to sample the input proxies. In this Sobol sensitivity analysis, multiple random samples are generated based on the probability distributions of the model proxies. The number of these samples is set to $2^{n}$, where $n$ ranges from 10 to 15.

Here, RFR, a machine-learning technique, is introduced. The RFR allows for the internal evaluation of various input parameters and exhibits robustness to noise [22]. After generating a set of samples using the Sobol sampling method, the sampled datasets are split into training and testing subsets using a 50:50% split. In the RFR, hyperparameter tuning is performed using grid search and k-fold cross-validation. Whenever the best hyperparameters of RFR are identified, the optimal model is trained accordingly and is evaluated using k-fold cross-validation. The final efficient trained model function is used iteratively to generate predictions for the Sobol-sampled datasets.

The effect or contribution of each proxy on the outputs of the model is quantified through the Sobol sensitivity indices, which include the first-order ($S_{x}$) and the total-order ($ST_{x}$) indices. Here, $x$ refers to the input proxies, such as N-SPT, $V_{P}$, $V_{S}$, $ρ$, and lithology. These Sobol sensitivity indices (i.e., $S_{x}$ and $ST_{x}$) quantify the contribution of each input proxy to the model output variance. $S_{x}$ refers to the main effect indices that quantify the proportion of the output variance that can be attributed to each input proxy independently. In addition, $ST_{x}$ refers to the total effect indices, including the first-order effect and interaction effects of the input proxy. A larger Sobol sensitivity index indicates that the corresponding proxy has a greater influence on the model output, signifying higher sensitivity. Thus, the most influential proxies can be determined.
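To make the first- and total-order indices concrete, the following is a minimal sketch that estimates both with the standard pick-and-freeze estimators (Saltelli-type for first order, Jansen-type for total order). It is not the SALib workflow used in the study: it draws plain pseudo-random samples with the standard library instead of a Sobol sequence, and the function `toy_model`, its unit bounds, and the sample size are hypothetical choices made only for illustration.

```python
import random


def sobol_indices(model, bounds, n_samples=2**14, seed=0):
    """Estimate first-order (S) and total-order (ST) Sobol indices.

    model  : callable taking a list of proxy values and returning a float
    bounds : list of (min, max) pairs, one per input proxy
    """
    rng = random.Random(seed)
    dim = len(bounds)

    def draw():
        # One sample drawn uniformly inside the parameter space.
        return [rng.uniform(low, high) for low, high in bounds]

    a = [draw() for _ in range(n_samples)]
    b = [draw() for _ in range(n_samples)]
    f_a = [model(x) for x in a]
    f_b = [model(x) for x in b]

    # Centre the outputs; this leaves the estimators unbiased but less noisy.
    mean = (sum(f_a) + sum(f_b)) / (2 * n_samples)
    f_a = [y - mean for y in f_a]
    f_b = [y - mean for y in f_b]
    variance = (sum(y * y for y in f_a) + sum(y * y for y in f_b)) / (2 * n_samples)

    first, total = [], []
    for i in range(dim):
        # Evaluate the model on matrix A with column i taken from matrix B.
        f_ab = [model(x[:i] + [z[i]] + x[i + 1:]) - mean for x, z in zip(a, b)]
        v_first = sum(yb * (yab - ya) for ya, yb, yab in zip(f_a, f_b, f_ab)) / n_samples
        v_total = sum((ya - yab) ** 2 for ya, yab in zip(f_a, f_ab)) / (2 * n_samples)
        first.append(v_first / variance)
        total.append(v_total / variance)
    return first, total


def toy_model(x):
    # Hypothetical response: two summation terms plus one interaction term.
    return x[0] + 2.0 * x[1] + x[0] * x[2]


S, ST = sobol_indices(toy_model, bounds=[(0.0, 1.0)] * 3)
```

For a purely additive model the two indices of each input coincide; any gap between `ST[i]` and `S[i]` reflects interaction effects, which is the distinction the paragraph above draws.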

The current approach of variance-based global sensitivity analyses is run from two perspectives. The first analyzes the contributions ($S_{x}$ and $ST_{x}$) of these input proxies (i.e., N-SPT, $V_{P}$, $V_{S}$, $ρ$, and lithology) under dry and wet conditions (i.e., above and below the unsaturated/saturated interfaces). The second analyzes $S_{x}$ and $ST_{x}$ at each depth point using these input proxies.

Eventually, this global sensitivity analysis (Figure 2) provides the needed information and insights through the relative importance of various input proxies and guides further modeling for better understanding and decision-making to estimate the groundwater presence. Therefore, this proposed approach combines sensitivity analysis with machine learning to provide a data-driven approach for assessing the importance of proxies (i.e., observed in situ geotechnical, geophysical, petrophysical, and lithological proxies) in predictive modeling. A recent application of sensitivity analysis identified electrical conductivity as having the highest impact on groundwater quality [23]. In that study, various machine-learning techniques were incorporated with the sensitivity analysis to enhance the accuracy and robustness of the groundwater quality prediction [23].

2.2. DNN

DNN is a subset of machine-learning approaches. Artificial DNNs are employed in many studies to combine empirical models and physics-based models. A DNN was used to estimate the S-wave velocity structure through training input datasets of horizontal-to-vertical spectral ratios, bedrock depths, and geomorphological categories [24]. Moreover, the horizontal site amplification factors based on horizontal-to-vertical ratios of microtremors were estimated using DNN [25]. A data-knowledge-driven DNN approach was used for earthquake early warning by extracting arrivals of P-waves [26]. The current study uses the widely used TensorFlow library under the Python framework. TensorFlow is an open-source deep learning library [27].

A fully connected artificial DNN composed of nine hidden layers is used in the adopted deep learning approach. In each layer, the number of neurons is set to ≤ $2^{9}$. The epochs range between 700 and 1300 and are accompanied by a 1% reduction in the learning rate every 50 epochs. The higher the number of neurons in each hidden layer and the number of epochs, the longer the time consumed for analyses, and vice versa.

Mathematically, the fundamental terms of DNN can be defined in Equations (1) and (2). In the current study, this definition is called the affine linear transformation applied to each hidden layer.

$$T w_{i} + b_{i} = h_{i}, \tag{1}$$

$$h_{i} w_{i} + b_{i} = m, \tag{2}$$

where $i$ is the iteration based on the hidden layers to update the matrix of weights ($w$) considering a bias vector ($b$). The training dataset and the predicted model are $T$ and $m$, respectively.

While training, Kernel regularization or weight decay is applied to prevent overfitting due to large weights. In addition, the Rectified Linear Unit (ReLU) is applied as a non-linear activation function to each hidden layer.
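As an illustration of the affine linear transformation followed by the ReLU activation, the sketch below passes one sample through a tiny fully connected network in plain Python. The layer sizes and weight values are hypothetical and far smaller than the nine-layer network described here; regularization, dropout, and batch normalization are deliberately left out.

```python
def affine(inputs, weights, bias):
    """Affine linear transformation: inputs @ weights + bias.

    weights is a list of rows, one row per input feature.
    """
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + bias[j]
        for j in range(len(bias))
    ]


def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, v) for v in values]


def forward(sample, hidden_layers, output_layer):
    """Pass one sample through ReLU hidden layers and a linear output layer."""
    h = sample
    for weights, bias in hidden_layers:
        h = relu(affine(h, weights, bias))   # hidden-layer transformation + ReLU
    weights, bias = output_layer
    return affine(h, weights, bias)          # final transformation to the output


# Hypothetical weights: 2 inputs, one hidden layer of 2 neurons, 1 output.
hidden = [([[1.0, -1.0], [0.0, 1.0]], [0.0, 0.0])]
output = ([[2.0], [3.0]], [0.5])
prediction = forward([1.0, 2.0], hidden, output)
```

Training then amounts to adjusting the entries of each weight matrix and bias vector so that the output approaches the observed groundwater level.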

The datasets are divided into training, testing, and unseen subsets adopting a thirteen-fold cross-validation approach. The batch size is 300, corresponding to the input and output datasets; then, batch normalization is applied to enhance the training process. Dropout is set after each hidden layer to optimize the interaction between affine linear transformation layers and prevent overfitting. The loss function is the mean squared error (MSE) to measure the differences between the observed and predicted groundwater levels.

Here, it is worth noting that there is a trade-off relation between the best performance of the DNN model and the number of hidden layers, neurons, epochs, Kernel regularization, ReLU, and batch normalization. Therefore, a trial-and-error approach is applied to set these parameters. Figure 3 illustrates the architecture of the artificial DNN used in this study. The first step in this architecture is preparing the input datasets (i.e., Depth, N-SPT, $V_{P}$, $V_{S}$, $ρ$, $V_{P} / V_{S}$, Poisson’s ratio, P- and S-wave impedance contrasts, $V_{S30}$, and lithology) and the output datasets (i.e., observed groundwater levels). Those input and output datasets compose the source datasets and are explained in detail in the next section.

2.3. RFR

In the current study, the Scikit-learn library is used under the Python framework [28]. Scikit-learn is a widely used open-source machine-learning library. The RFR algorithm is one of the highest-performing algorithms in the Scikit-learn library, and RFR has been reported to outperform other machine-learning algorithms [29]. Figure 4 shows the architecture of the RFR used in this study. The following sequential procedure describes the workflow of the current RFR algorithm accompanied by the selected hyperparameters.

The training datasets comprise a set of samples (i.e., K-NET sites), and each sample contains a set of features (i.e., Depth, N-SPT, $V_{P}$, $V_{S}$, $ρ$, $V_{P} / V_{S}$, Poisson’s ratio, P- and S-wave impedance contrasts, $V_{S30}$, and lithology). At each decision tree in the RFR, the training datasets are sampled as bootstrap and out-of-bag ($OOB$) datasets with an approximate ratio of two-to-one, respectively. The bootstrap samples are used to train the model, whereas the $OOB$ samples are used to evaluate the model by averaging the predictions over all the decision trees. The evaluation using $OOB$ samples is an important step for ensuring the generalization of the resulting RFR model.
The k-fold cross-validation procedure, as implemented in the Scikit-learn library, is applied. A thirteen-fold cross-validation approach is adopted.
The hyperparameters are tuned using the grid search capability in the Scikit-learn library. These hyperparameters are the number of decision trees, the maximum depth of these trees, the minimum number of samples required to split a node, and the minimum number of samples per leaf. This step requires a long analysis time to yield the highest-performing RFR model.

The final step is the average regression for all the predictions of all decision trees. The differences between the observed and predicted groundwater levels are measured using the MSE.
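The bootstrap and out-of-bag sampling in the workflow above can be illustrated with the standard library alone. This is a minimal sketch of how one tree's samples might be drawn, not the Scikit-learn internals; the seed is a hypothetical value chosen only for reproducibility.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample (with replacement) and return it with its OOB set."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(42)      # hypothetical seed, for reproducibility only
n_sites = 772                # number of K-NET sites used in the study
in_bag, out_of_bag = bootstrap_split(n_sites, rng)

unique_fraction = len(set(in_bag)) / n_sites   # share of sites seen by this tree
oob_fraction = len(out_of_bag) / n_sites       # share of sites left out
```

For large sample counts the out-of-bag share tends to about 1/e (roughly 37%), so the unique in-bag to out-of-bag ratio is close to the two-to-one ratio quoted above.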

3. Data

K-NET sites are spatially distributed all over Japan with 25 km spacing [17]. The current study uses 772 K-NET sites (Figure 5) because of the availability of observed groundwater levels. It is worth noting that observed groundwater levels are considered the backbone to validate the constructed DNN model, RFR model, and the global sensitivity analyses. Figure 6 shows a map of the groundwater levels in Japan utilizing the observed groundwater levels at those 772 K-NET sites.

At each borehole K-NET site, the N-SPT, $V_{P}$, $V_{S}$, $ρ$, and lithology datasets are obtained down to ≤20 m depth [17]. The subsurface structure datasets are sampled at 1 m vertical intervals. Figure 7 shows the subsurface structures of N-SPT, $V_{P}$, and $V_{S}$, which exhibit a wide variety at those 772 K-NET sites. These subsurface structures are initially analyzed by calculating the averages and the standard deviation at each depth point (Figure 8). A gradual downward increase is observed in these subsurface structures. After obtaining these invaluable datasets, they are prepared as inputs for the global sensitivity analysis and the DNN and RFR training process.

For the global sensitivity analysis, the contributions of N-SPT, $V_{P}$, $V_{S}$, $ρ$, lithology index, and $V_{P} / V_{S}$ are analyzed. Table 1 summarizes the lithology index and its corresponding definition. The importance of the lithology index comes from the variability of permeability and porosity of these various lithologies. The permeability and porosity of different lithologies determine the characteristics of groundwater aquifers. It is important to note that permeability and/or porosity are not provided by [17].

Three common extrapolation techniques were proposed to determine the time-averaged $V_{S}$ down to 30 m depth ($V_{S30}$) [30]. Those techniques are as follows: (1) constant velocity, (2) correlating $V_{S30}$ with $V_{S}$ at arbitrary depth, and (3) velocity statistics to assign seismic site class. In the current study, the first extrapolation technique is adopted by assuming that the constant $V_{S}$ of the bottom layer extends down to 30 m depth at each borehole K-NET site (Equation (3)).

$$V_{S30} = \frac{30}{t(d) + \dfrac{30 - d}{V_{Sd}}}, \tag{3}$$

where $V_{Sd}$ is the $V_{S}$ of the bottom layer, $d$ is the depth of the borehole bottom layer (so that $30 - d$ is the depth difference between the borehole bottom layer and 30 m depth), and $t(d)$ is the total travel time from the surface to the borehole bottom layer.
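A short worked example may help to follow the constant-velocity extrapolation. The sketch below computes $V_{S30}$ from a layered profile under the assumption that the bottom-layer velocity extends unchanged down to 30 m; the function name and the two-layer profile are hypothetical.

```python
def vs30_constant_velocity(thicknesses, velocities):
    """Time-averaged Vs down to 30 m, extending the bottom-layer velocity.

    thicknesses : layer thicknesses in metres, from the surface downward
    velocities  : S-wave velocity of each layer in m/s
    """
    depth = sum(thicknesses)                       # d, borehole bottom depth
    if depth > 30.0:
        raise ValueError("profile is deeper than 30 m; truncate it first")
    # t(d): vertical travel time from the surface to the borehole bottom.
    travel_time = sum(h / v for h, v in zip(thicknesses, velocities))
    bottom_velocity = velocities[-1]               # V_Sd
    return 30.0 / (travel_time + (30.0 - depth) / bottom_velocity)


# Hypothetical 20 m profile: 5 m at 100 m/s over 15 m at 300 m/s.
vs30 = vs30_constant_velocity([5.0, 15.0], [100.0, 300.0])
```

For this profile the travel time to 20 m is 0.1 s and the extrapolated 10 m adds about 0.033 s, which gives a $V_{S30}$ of 225 m/s.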

The constant velocity technique is adopted for its simplicity [30]. Subsequently, the seismic site classifications at those 772 K-NET sites are categorized according to the guidelines by [31]. Thus, the determined site classes (i.e., $V_{S30}$) are input into the DNN and RFR training process.

For the DNN and RFR training, the source datasets are composed of input and output datasets. The input dataset is composed of depth (i.e., ranging between 1.0 m and 20.0 m), N-SPT, $V_{P}$, $V_{S}$, $V_{S30}$, $ρ$, lithology index, $V_{P} / V_{S}$, Poisson’s ratio, and P- and S-wave impedance contrasts (Table 2). The output dataset to be trained is the groundwater levels at each borehole K-NET site. Because some borehole K-NET sites are restricted to depths <20 m, constant subsurface structures of the bottom borehole layers are assumed to extend down to 20 m depth. During DNN and RFR training, we constructed a single pulse shape to identify the observed groundwater level with an index of “1” at each K-NET site (Table 2).
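Several of the inputs listed above are derived from $V_{P}$, $V_{S}$, and $ρ$. The sketch below shows one way to compute them and to build the single-pulse target. The Poisson's ratio expression is the standard isotropic elastic relation; the normalized form of the impedance contrast, the zero values of the pulse away from the groundwater depth, and all numeric values are assumptions, since Table 2 is not reproduced here.

```python
def poisson_ratio(vp, vs):
    """Poisson's ratio from P- and S-wave velocities (isotropic elastic medium)."""
    r2 = (vp / vs) ** 2
    return (r2 - 2.0) / (2.0 * (r2 - 1.0))


def impedance_contrast(rho_upper, v_upper, rho_lower, v_lower):
    """Normalised impedance contrast across an interface (assumed definition)."""
    z_upper, z_lower = rho_upper * v_upper, rho_lower * v_lower
    return (z_lower - z_upper) / (z_lower + z_upper)


def groundwater_pulse(groundwater_depth_m, max_depth_m=20):
    """Target on a 1 m grid: 1 at the observed groundwater depth, else 0 (assumed)."""
    return [1 if depth == groundwater_depth_m else 0
            for depth in range(1, max_depth_m + 1)]


# Hypothetical values at one depth point of one site.
vp, vs = 1500.0, 500.0
features = {"vp_vs": vp / vs, "poisson": poisson_ratio(vp, vs)}
target = groundwater_pulse(4)
```

With a velocity ratio of 3, Poisson's ratio evaluates to 0.4375, a value typical of soft saturated sediments.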

4. Results and Discussion

4.1. Performance of the RFR Model

The availability of the observed groundwater levels at 772 K-NET sites in Japan (Figure 6) is used to validate and recognize the performance of the RFR model in predicting the depths of these groundwater levels. Figure 9 shows examples of a remarkable superimposing of observed and predicted groundwater levels.

The residuals (i.e., the ratio between the observed and predicted groundwater levels) are calculated to demonstrate the overall performance of the RFR model. In addition, the residuals are calculated for the groundwater levels predicted using SRP [5] and WSI [8]. Predicting groundwater levels using SRP is based on the difference between P- and S-wave reflection coefficients. The SRP was theoretically constructed using 1045 K-NET sites without validation against observed groundwater levels [5]. On the other hand, the theoretical framework of WSI is based on the variability of P- and S-wave velocities across wet/dry interfaces adopting two- and three-dimensional propagation models. However, only three sites in Italy were used to validate the theoretical framework of WSI. In contrast, the current study has the merit of validating the predicted depths using observed ones at 772 K-NET sites. Figure 10 shows the distribution of these residuals.
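The residual definition can be sketched as follows. The ratio itself follows the text; the tolerance band used to call a residual "matched" is a hypothetical choice, because the exact matching criterion is not stated here, and the depth pairs are invented for illustration.

```python
def residual(observed_depth, predicted_depth):
    """Residual defined as the observed-to-predicted depth ratio."""
    return observed_depth / predicted_depth


def classify(ratio, tolerance=0.05):
    """Label a residual; `tolerance` is a hypothetical matching band around 1."""
    if abs(ratio - 1.0) <= tolerance:
        return "matched"
    # A ratio below 1 means the predicted depth exceeds the observed depth.
    return "overestimated" if ratio < 1.0 else "underestimated"


# Hypothetical (observed, predicted) groundwater depths in metres.
pairs = [(4.0, 4.0), (3.0, 6.0), (8.0, 5.0)]
labels = [classify(residual(obs, pred)) for obs, pred in pairs]
```

Counting the labels over all sites gives the kind of matched, overestimated, and underestimated totals that a residual histogram summarizes.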

Using the RFR model, remarkable agreement is achieved between observed and predicted groundwater levels at 733 out of 772 K-NET sites. The groundwater levels predicted by SRP and WSI do not reproduce the observed ones. Most of the predicted groundwater levels derived from SRP and WSI tend to overestimate the observed groundwater depths.

Figure 11 shows a remarkable agreement between the observed and the RFR-predicted groundwater levels at K-NET sites. This agreement indicates that most observed groundwater levels are concentrated at shallow depths from 1 m to 8 m. Figure 12 shows the RFR learning process adopting MSE calculations from all the decision trees. The fluctuations present in the RFR learning process may have two reasons. First, a high number of correlated features (i.e., Depth, N-SPT, $V_{P}$, $V_{S}$, $ρ$, $V_{P} / V_{S}$, Poisson’s ratio, P- and S-wave impedance contrasts, $V_{S30}$, and lithology) are used at 772 K-NET sites. Second, different bootstrap and bagging samples are adopted at each tree. Hence, fluctuations are introduced in the resulting MSE, particularly whenever the decision trees are highly comparable.

4.2. Validating the RFR Model

Comparing the performance of the RFR and DNN models, it is clear that RFR performs better when estimating the groundwater presence. Two approaches are employed to validate the results of the RFR model: (1) k-fold cross-validation and (2) splitting the source datasets into training and validating datasets. The validating datasets are processed as external or unseen datasets. Therefore, the capability of the RFR model to be applied to any given site can be evaluated, ensuring the generalization capabilities of the resulting RFR model.

In the first validation approach, Figure 13 shows the application process of the k-fold cross-validation in the current study. Thirteen folds are used, and a grid parameter search is adopted for tuning the hyperparameters in the k-fold cross-validation process. The k-fold cross-validation approach is often used when training datasets are limited [32,33]. This study has a huge variability among the source datasets of the 772 K-NET sites, and this variability is considered equivalent to limited training datasets. The ratio between the training fold, the validating fold, and the unseen testing datasets of the other folds is 1:1:11 (Figure 13). The k-fold cross-validation yields a remarkable performance, with 733 out of 772 K-NET sites having residuals of ≈1 (Figure 10 and Table 3).
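One way to read the 1:1:11 ratio is sketched below: the sites are partitioned into thirteen folds, and in each rotation one fold trains, one validates, and the remaining eleven act as unseen testing data. Which fold serves as the validating fold in each rotation, the shuffling, and the seed are assumptions made for illustration; Figure 13 is not reproduced here.

```python
import random


def thirteen_fold_rotation(n_sites, n_folds=13, seed=0):
    """Yield (train, validate, test) site indices with a 1:1:11 fold ratio."""
    indices = list(range(n_sites))
    random.Random(seed).shuffle(indices)           # hypothetical shuffling step
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        next_k = (k + 1) % n_folds                 # assumed validating fold
        train = folds[k]
        validate = folds[next_k]
        test = [i for j, fold in enumerate(folds)
                if j not in (k, next_k) for i in fold]
        yield train, validate, test


splits = list(thirteen_fold_rotation(772))
```

With 772 sites, each fold holds 59 or 60 sites, so every rotation trains on roughly 8% of the sites and tests on about 85%.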

In the second validation approach, the source datasets are divided into training and validating datasets. Thus, the validating dataset is completely external or unseen. The impact of using various training-validating ratios is examined. It is worth noting that we selected various training datasets distributed equally all over Japan. Therefore, the generalization capability of the RFR model can be evaluated. Figure 14 shows examples of different training dataset ratios. A decrease in the training dataset ratio reduces the number of K-NET sites with a residual value of ≈1. Additionally, there is an increase in K-NET sites that show over- and underestimation. Table 3 summarizes the performance results of both validation approaches. This drastic increase in over- and underestimated residuals, accompanied by a decrease in the matched residuals, is caused by the huge variabilities of the correlated features (i.e., Depth, N-SPT, $V_{P}$, $V_{S}$, $ρ$, $V_{P} / V_{S}$, Poisson’s ratio, P- and S-wave impedance contrasts, $V_{S30}$, and lithology) among the 772 K-NET sites. Overall, the training-validating splitting approach does not outperform the k-fold cross-validation approach.

4.3. Performance of the DNN Model

On the other hand, the DNN model does not perform as well as the RFR model. Figure 15 shows the distribution of the observed/predicted residuals using the DNN model. The first validation approach, the k-fold cross-validation (Figure 13) with thirteen folds, is used to validate the DNN model. Thus, the 1:1:11 ratio is set as the ratio of training: validating: unseen testing datasets. Compared with the performance of the RFR model (Figure 10), there is a drastic decrease in the matched residuals, accompanied by an increase in over- and underestimated residuals. However, the matched residuals of the DNN model remarkably outperform those derived from SRP [5] and WSI [8].

Figure 16 shows examples of superimposed observed and predicted groundwater levels. Some K-NET sites show excellent agreement between the depths of observed and predicted groundwater levels, whereas other K-NET sites show poor matching. Additionally, the comparison between the observed and the DNN-predicted groundwater levels (Figure 17) confirms the results of Figure 15. Over- and underestimated predictions are dominant in DNN training. The difference is obvious when compared to the RFR performance in Figure 11.

Figure 18 shows the DNN learning process adopting MSE calculations (i.e., loss) from all the epochs. The minimum MSE reaches an approximate value of 0.5 during the DNN model’s training. This MSE is lower than that achieved during the RFR model’s training (i.e., ≈0.82). Although we used k-fold cross-validation adopting thirteen folds, the validation loss is worse than the training loss.

The source datasets (Table 1 and Table 2), which are restricted to 20 m structures, are considered limited in size. The size of datasets significantly impacts the performance during the RFR and DNN training. RFR can generalize well using limited datasets, whereas the DNN needs a large dataset size for effective training [34]. Therefore, the DNN model suffers from overfitting or fails to learn meaningful predictions for the groundwater presence.

The huge variability in the source datasets acts as outliers or noise. The RFR model can handle noise or outliers by averaging predictions across multiple decision trees, whereas the DNN model is quite sensitive to noise or outliers [35]. Furthermore, the RFR model exhibits minimal sensitivity to the scaling of input parameters (Table 1 and Table 2).

For an additional fair comparison between RFR and DNN performances, Table 4 summarizes the performance results of both validation approaches using DNN training. The training-validating splitting approach does not outperform the k-fold cross-validation approach. This conclusion is similar to that for the RFR model. However, the DNN training process yields poor-fitting performance regardless of the validation approach. Interestingly, there is a drastic increase in overestimated residuals and a decrease in underestimated residuals. This DNN behavior contrasts with the RFR performance.

4.4. Global Sensitivity Analyses

Using Sobol methods [21], the variance-based global sensitivity analyses under dry and wet conditions (i.e., unsaturated and saturated conditions) are shown. The contributions and interactions are studied between the different proxies of N-SPT, $V_{P}$, $V_{S}$, $ρ$, $V_{P} / V_{S}$, and lithology. Standard scaling is applied to these proxies to prevent proxies with high numerical values (e.g., $V_{P}$ and $V_{S}$) from overshadowing proxies with smaller numerical values. For dry conditions, only the proxies above the groundwater level are considered. For wet conditions, only the proxies at the groundwater level are considered.
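Standard scaling can be sketched with the standard library as follows. The sketch assumes the common zero-mean, unit-variance definition with the population standard deviation; the proxy values are hypothetical.

```python
import statistics


def standard_scale(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical proxy values at a few depth points.
vp = [400.0, 900.0, 1500.0, 1700.0]      # m/s
n_spt = [3.0, 8.0, 20.0, 35.0]           # blow counts
scaled = {"vp": standard_scale(vp), "n_spt": standard_scale(n_spt)}
```

After scaling, both proxies vary over a comparable numerical range, so a velocity in the hundreds of m/s no longer dominates a blow count in the tens.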

Figure 19 and Figure 20 show the results of the global sensitivity analysis of different proxies above and at the groundwater levels, respectively. In dry conditions, the sensitivity indices (i.e., first- and total-order indices) exhibit comparable contributions or interactions among the various proxies. Interestingly, the N-SPT and $V_{P}$ proxies exhibit prominent and significant contributions in wet conditions. These results explain why the $V_{P}$ proxy alone is often used to correlate with the groundwater presence; for example, a $V_{P}$ of 1000 m/s was suggested by [36]. In addition, the groundwater level was deduced to correspond to a $V_{P}$ < 1500 m/s for water-saturated soils at shallow depths of K-NET sites [14]. A $V_{P}$ ≈ 800 m/s was attributed to the groundwater presence as deduced from seismic refraction tomography in Kyrgyzstan, Central Asia [37]. Despite the prominent contributions of the N-SPT and $V_{P}$ proxies, other proxies make significant interactions in wet conditions. However, the density proxy makes minimal contributions in both dry and wet conditions. Therefore, all the available proxies are involved during the RFR and DNN training process.

The sensitivity indices are also calculated at each depth point. Figure 21 shows the resulting global sensitivity analysis of different proxies versus depth. In contrast to the results in Figure 19 and Figure 20, the proxies N-SPT, $V_{P}$, $V_{S}$, $V_{P} / V_{S}$, and lithology show comparable sensitivity indices. Only the density proxy has a minimal contribution, similar to the results in Figure 19 and Figure 20. Thus, the sensitivity indices of the N-SPT and $V_{P}$ proxies at each depth point do not exhibit the same prominent contributions seen previously. This can be attributed to the contribution of the depth proxy. The sensitivity indices in Figure 19 and Figure 20 do not include the contribution of the depth proxy, whereas those in Figure 21 include its indirect contribution.

4.5. Prospecting Applicability

One of the most significant limitations to implementing the RFR model at any given site is obtaining the proxies needed for the source datasets. $V_{P}$ and $V_{S}$ are the key factors in determining all the different proxies explained earlier.
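As a minimal sketch of how several proxies follow from $V_{P}$, $V_{S}$, and $ρ$, the functions below compute $V_{P} / V_{S}$, Poisson's ratio (the standard isotropic elastic relation), and a normalized impedance contrast between successive depth points. The function names are hypothetical, and the form of the impedance contrast is an assumption inferred from the values listed in Table 2, not a definition quoted from the methodology.

```python
def vp_vs_ratio(vp, vs):
    """Ratio of P- to S-wave velocity (dimensionless)."""
    return vp / vs


def poisson_ratio(vp, vs):
    """Poisson's ratio of an isotropic elastic medium from V_P and V_S."""
    return (vp ** 2 - 2.0 * vs ** 2) / (2.0 * (vp ** 2 - vs ** 2))


def impedance_contrast(v_upper, rho_upper, v_lower, rho_lower):
    """Normalized impedance contrast between two successive depth points.

    Assumed form: (Z_lower - Z_upper) / (Z_lower + Z_upper), Z = rho * V.
    """
    z_upper = rho_upper * v_upper
    z_lower = rho_lower * v_lower
    return (z_lower - z_upper) / (z_lower + z_upper)


# KYT013 values at 2 m and 3 m depth, taken from Table 2.
ratio_2m = vp_vs_ratio(240.0, 110.0)
nu_2m = poisson_ratio(240.0, 110.0)
ic_p_2m = impedance_contrast(240.0, 1.57, 1550.0, 1.52)
ic_s_2m = impedance_contrast(110.0, 1.57, 280.0, 1.52)
```

Rounded to the precision of Table 2, these calls give 2.18, 0.367, 0.724, and 0.423, which match the tabulated $V_{P} / V_{S}$, Poisson's ratio, IC_P, and IC_S at 2 m depth. The large IC_P there coincides with the jump in $V_{P}$ from 240 to 1550 m/s just above the depth flagged by the groundwater level index.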

Recently, computational approaches adopting diffuse field theory were applied to horizontal-to-vertical spectral ratios by [38,39]. As a result, subsurface structures were derived at 1743 sites in Japan and 75 sites in Egypt. These derived subsurface structures comprise $V_{P}$, $V_{S}$, and $ρ$ versus depth. The horizontal-to-vertical spectral ratio technique is non-destructive and cost-effective. Therefore, it is considered one of the best approaches for deriving the essential geotechnical proxies for any given region.

Furthermore, $V_{S}$ and $V_{P}$ can be derived from geophysical field surveys using multi-channel analysis of surface waves and P-wave seismic refraction, respectively. The density $ρ$ can be determined using empirical relationships, such as [40].
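A velocity-density relation of the kind proposed in [40] is a power law in $V_{P}$. The sketch below uses the commonly quoted coefficients for $V_{P}$ in m/s and density in g/cm³; whether these exact coefficients suit a given site is an assumption, and the function name is hypothetical. The relation was calibrated mainly on sedimentary rocks, so shallow unsaturated soils may need site-specific coefficients.

```python
def gardner_density(vp_m_per_s, a=0.31, b=0.25):
    """Bulk density (g/cm^3) from P-wave velocity (m/s) via rho = a * V_P**b.

    The defaults a = 0.31 and b = 0.25 are the commonly quoted coefficients
    for V_P in m/s; treat them as assumptions to be recalibrated per site.
    """
    if vp_m_per_s <= 0:
        raise ValueError("V_P must be positive")
    return a * vp_m_per_s ** b


# Example: a water-saturated layer with V_P = 1500 m/s.
rho_saturated = gardner_density(1500.0)
```

Keeping `a` and `b` as parameters allows local borehole densities to be used for recalibration before the relation is applied to derived velocity structures.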

Lithology, in turn, can be identified from the regional geology of the given region. The N-SPT can be determined empirically from $V_{S}$ using site-specific empirical relations, such as [41].

The impact of using only two proxies on the resulting RFR and DNN performance is examined, as shown in Figure 22. Only $V_{S}$ and $V_{P}$ are used as input into the training process. The performance in predicting groundwater depths decreases drastically. Although $V_{S}$ and $V_{P}$ are the key factors, the other proxies are crucial to constructing the needed regressions inside each training process. Consequently, the RFR and DNN models predict groundwater depths much more efficiently when more than these two input proxies ($V_{S}$ and $V_{P}$) are used. The performance differences between RFR and DNN may occur for three reasons. The first is the tuning needed to obtain the optimal hyperparameters for each training process. The second is the strong dependency on the case study and the nature of the source datasets. The third is the difference between the RFR and DNN algorithms. Recently, a study by [42] used random forest, support vector machine, back propagation neural network, and extreme gradient boosting to predict the characterized development of natural fractures, and found accuracy values ranging between 83% and 96%. However, the detailed reasons behind the different performances of the RFR and DNN are beyond the scope of this study.

Eventually, the source datasets needed at any region can be retrieved, input into the RFR model, and used to effectively predict the groundwater level. It is important to be aware that the reliability of the derived subsurface structures strongly affects the generalization capabilities of the resulting RFR model. This potential can be examined using the key factors of the $V_{P}$ and $V_{S}$ structures. As a result, the generalization capabilities of the RFR model can be enhanced and reproduced.

5. Conclusions

This study introduces the RFR model to explicitly estimate the depths of groundwater levels utilizing comprehensive observed in situ geotechnical, geophysical, petrophysical, and lithological datasets. The introduced RFR model demonstrates prediction effectiveness when compared to previous physical-based approaches. The proxies of depth, N-SPT, $V_{P}$, $V_{S}$, $ρ$, $V_{P} / V_{S}$, Poisson’s ratio, P- and S-wave impedance contrasts, $V_{S 30}$, and lithology are used as input into the RFR and DNN training. The observed depths of groundwater levels at 772 K-NET sites in Japan are used as output proxies. The performances of the RFR model and the DNN model are compared. The DNN model exhibited poor fitting performance compared to the effective and robust performance of the RFR model. This raises questions about the suitability of the machine-learning technique and the training datasets that are beyond the scope of this study. Furthermore, validating the prediction results using k-fold cross-validation demonstrated that the generalization capabilities of the resulting RFR model can be enriched. The global sensitivity analyses indicate the essential need for all available proxies to estimate groundwater presence. Despite the prominent contributions of the N-SPT and $V_{P}$ proxies to the groundwater presence, other proxies exhibit significant interactions in wet conditions. This explains the less optimal fit between observed and predicted depths of groundwater levels obtained with previous one- or two-proxy approaches (e.g., [5,8,14,36,37]). The most important challenge here is the RFR model’s capability to effectively predict the depths of the groundwater levels in different localities worldwide, particularly utilizing derived subsurface structures (e.g., [38,39]). The large variability of the datasets used in this study supports the generalization capabilities of the resulting RFR model. Implementing the RFR model on new datasets at different localities is essential, particularly to enhance the present RFR model through a transfer learning approach for other regions. Furthermore, extending the current RFR model’s applicability to greater depths (i.e., >20 m) is another important challenge due to the limited sources of observed datasets.

Funding

This research received no external funding.

Data Availability Statement

The processed and analyzed data are available upon request from the author.

Acknowledgments

The author is highly appreciative and grateful to the Japanese National Research Institute for Earth Science and Disaster Resilience for providing invaluable in situ geotechnical, geophysical, petrophysical, and lithological datasets at K-NET sites. The author is highly appreciative and grateful to Fumiaki Nagashima (DPRI—Kyoto University, Japan) for providing observed groundwater levels at the 772 K-NET sites. The author appreciates and acknowledges the generous, valuable, comprehensive, and constructive comments and suggestions from the editor and the reviewers. Some figures were plotted using PyGMT (Ver.0.14.2, https://doi.org/10.5281/zenodo.14868324).

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Satoh, Y.; Yoshimura, K.; Pokhrel, Y.; Kim, H.; Shiogama, H.; Yokohata, T.; Hanasaki, N.; Wada, Y.; Burek, P.; Byers, E.; et al. The timing of unprecedented hydrological drought under climate change. Nat. Commun. 2022, 13, 3287.
2. Wanders, N.; Wada, Y.; Van Lanen, H.A.J. Global hydrological droughts in the 21st century under a changing hydrological regime. Earth Syst. Dyn. 2015, 6, 61–81.
3. Hari, V.; Rakovec, O.; Markonis, Y.; Hanel, M.; Kumar, R. Increased future occurrences of the exceptional 2018–2019 Central European drought under global warming. Sci. Rep. 2020, 10, 12207.
4. Bertoni, C.; Lofi, J.; Micallef, A.; Moe, H. Reflection Methods in Offshore Groundwater Research. Geosciences 2020, 10, 299.
5. Thabet, M. Applicability of a proposed groundwater level determination approach for the K-NET in Japan. Near Surf. Geophys. 2021, 19, 447–463.
6. Biot, M.A. Theory of propagation of elastic waves in fluid-saturated porous solid. Part I: Low frequency range. J. Acoust. Soc. Am. 1956, 28, 168–178.
7. Biot, M.A. Theory of propagation of elastic waves in fluid saturated porous solid. Part II: Higher frequency range. J. Acoust. Soc. Am. 1956, 28, 179–191.
8. Grelle, G.; Guadagno, F.M. Seismic refraction methodology for groundwater level determination: “Water seismic index”. J. Appl. Geophys. 2009, 68, 301–320.
9. Dvorkin, J. Yet another Vs equation. Geophysics 2008, 73, E35–E39.
10. Uyanık, O. The porosity of saturated shallow sediments from seismic compressional and shear wave velocities. J. Appl. Geophys. 2011, 73, 16–24.
11. Konstantaki, L.A.; Carpentier, S.F.A.; Garofalo, F.; Bergamo, P.; Socco, L.V. Determining hydrological and soil mechanical parameters from multichannel surface-wave analysis across the Alpine Fault at Inchbonnie, New Zealand. Near Surf. Geophys. 2013, 11, 435–448.
12. Bachrach, R.; Nur, A. High-resolution shallow-seismic experiments in sand, part I: Water table, fluid flow, and saturation. Geophysics 1998, 63, 1225–1233.
13. Zelt, A.C.; Azaria, A.; Levander, A. 3D seismic refraction traveltime tomography at a groundwater contamination site. Geophysics 2006, 58, 1314–1323.
14. Nagashima, F.; Kawase, H. The relationship between Vs, Vp, density and depth based on PS-logging data at K-NET and KiK-net sites. Geophys. J. Int. 2021, 225, 1467–1491.
15. Adam, L.; Batzle, M.; Brevik, I. Gassmann’s fluid substitution and shear modulus variability in carbonates at laboratory seismic and ultrasonic frequencies. Geophysics 2006, 71, F173–F183.
16. Ghasemzadeh, H.; Abounouri, A.A. Effect of subsurface hydrological properties on velocity and attenuation of compressional and shear wave in fluid-saturated viscoelastic porous media. J. Hydrol. 2012, 460–461, 110–116.
17. NIED. K-NET, KiK-net, National Research Institute for Earth Science and Disaster Resilience. 2019. Available online: https://www.kyoshin.bosai.go.jp (accessed on 17 December 2024).
18. Herman, J.D.; Usher, W. SALib: Sensitivity Analysis Library in Python. J. Open Source Softw. 2017, 2, 97.
19. Saltelli, A.; Tarantola, S.; Chan, K. A Quantitative Model-Independent Method for Global Sensitivity Analysis of Model Output. Technometrics 1999, 41, 39–56.
20. Saltelli, A. Making best use of model evaluations to compute sensitivity indices. Comput. Phys. Commun. 2002, 145, 280–297.
21. Sobol, I.M. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comput. Simul. 2001, 55, 271–280.
22. Lovatti, B.P.O.; Nascimento, M.H.C.; Neto, Á.C.; Castro, E.V.R.; Filgueiras, P.R. Use of Random forest in the identification of important variables. Microchem. J. 2019, 145, 1129–1134.
23. Khan, I.; Ayaz, M. Sensitivity analysis-driven machine learning approach for groundwater quality prediction: Insights from integrating ENTROPY and CRITIC methods. Groundw. Sustain. Dev. 2024, 26, 101309.
24. Hayashi, K.; Suzuki, T.; Inazaki, T.; Konishi, C.; Suzuki, H.; Matsuyama, H. Estimating S-wave velocity profiles from horizontal-to-vertical spectral ratios based on deep learning. Soils Found. 2024, 64, 101525.
25. Pan, D.; Miura, H.; Kwan, C. Transfer learning model for estimating site amplification factors from limited microtremor H/V spectral ratios. Geophys. J. Int. 2024, 237, 622–635.
26. Zhu, J.; Li, S.; Song, J. Data-knowledge driven hybrid deep learning for earthquake early warning. Earth Space Sci. 2024, 11, e2023EA003363.
27. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, Savannah, GA, USA, 2–4 November 2016; (OSDI ’16). USENIX Association: Berkeley, CA, USA, 2016; pp. 265–283. Available online: https://www.tensorflow.org (accessed on 1 January 2025).
28. Arthur, D.; Vassilvitskii, S. K-means++: The advantages of careful seeding. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA, 7–9 January 2007; pp. 1027–1035.
29. Chen, R.; Zhang, P.; Wu, H.; Wang, Z.T.; Zhong, Z.Q. Prediction of shield tunneling-induced ground settlement using machine learning techniques. Front. Struct. Civ. Eng. 2019, 13, 1363–1378.
30. Boore, D.M. Estimating VS (30) (or NEHRP site classes) from shallow velocity models (depths < 30 m). Bull. Seismol. Soc. Am. 2004, 94, 591–597.
31. FEMA P-2082-1; (National Earthquake Hazards Reduction Program) Recommended Seismic Provisions for New Buildings and Other Structures. National Institute of Building Sciences: Washington, DC, USA, 2020; Volume I, Part 1 Provisions and Part 2 Commentary. Available online: https://www.fema.gov/sites/default/files/2020-10/fema_2020-nehrp-provisions_part-1-and-part-2.pdf (accessed on 1 January 2025).
32. Jia, R.; Lv, Y.; Wang, G.; Carranza, E.; Chen, Y.; Wei, C.; Zhang, Z. A stacking methodology of machine learning for 3D geological modeling with geological-geophysical datasets, Laochang Sn camp, Gejiu (China). Comput. Geosci. 2021, 151, 104754.
33. Thibaut, R.; Laloy, E.; Hermans, T. A new framework for experimental design using Bayesian Evidential Learning: The case of wellhead protection area. J. Hydrol. 2021, 603, 126903.
34. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
35. Hastie, T.; Tibshirani, R.; Friedman, J. Random Forests. In The Elements of Statistical Learning; Springer: New York, NY, USA, 2009.
36. Watson, D.B.; Doll, W.E.; Jeffrey Gamey, T.; Sheehan, J.R.; Jardine, P.M. Plume and lithologic profiling with surface resistivity and seismic tomography. Ground Water 2005, 43, 169–177.
37. Danneels, G.; Bourdeau, C.; Torgoev, I.; Havenith, H.B. Geophysical investigation and dynamic modelling of unstable slopes: Case-study of Kainama (Kyrgyzstan). Geophys. J. Int. 2008, 175, 17–34.
38. Thabet, M.; Nagashima, F.; Kawase, H. A computational approach for bedrock regressions with diffuse field concept beneath the Japan Islands. Soil Dyn. Earthquake Eng. 2024, 177, 108429.
39. Thabet, M.; Omar, K. Subsurface velocity structures at the Egyptian seismological network stations retrieved by diffuse field assumption for Earthquakes. Eng. Geol. 2024, 338, 107626.
40. Gardner, G.H.F.; Gardner, L.W.; Gregory, A.R. Formation velocity and density—The diagnostic basics for stratigraphic traps. Geophysics 1974, 39, 770–780.
41. Munirwansyah, M.; Fulazzaky, M.A.; Yunita, H.; Munirwan, R.P.; Jonbi, J.; Sumeru, K. A new empirical equation of shear wave velocity to predict the different peak surface accelerations for Jakarta city. Geod. Geodyn. 2020, 11, 455–467.
42. Wang, Z.; Cai, Y.; Liu, D.; Lu, J.; Qiu, F.; Sun, F.; Hu, J.; Li, Z. Characterization of natural fracture development in coal reservoirs using logging machine learning inversion, well test data and simulated geostress analyses. Eng. Geol. 2024, 341, 107696.

Figure 1. Workflow diagram of the present adopted methodology.

Figure 2. Workflow diagram of the variance-based global sensitivity analyses.

Figure 3. The schematic architecture of the DNN training approach. Boxes are dataset processing. Circles are neurons.

Figure 4. The schematic architecture of the RFR training approach. Circles are regressions in each tree.

Figure 5. Location map of the 772 K-NET sites (red circles) in Japan. Blue circles correspond to the four K-NET sites used as examples later.

Figure 6. The observed depths of the groundwater levels at the 772 K-NET sites.

Figure 7. P- and S-wave velocities and N-value structures of the used 772 K-NET sites (Figure 5).

Figure 8. Mean and ± standard deviation of the P- and S-wave velocities and N-value structures shown in Figure 7.

Figure 9. Examples of agreement between observed and RFR-predicted depths of groundwater levels. These four K-NET sites are shown in Figure 5 as blue circles.

Figure 10. Comparison distribution of residuals using the RFR model (blue), SRP [5] (red), and WSI [8] (green).

Figure 11. Observed versus RFR-predicted depths of groundwater levels.

Figure 12. MSEs versus decision trees during the RFR training process.

Figure 13. The concept of the k-fold cross-validation approach. Arrows indicate the cross-validation process.

Figure 14. Example distributions of 10% (a) and 50% (b) of the 772 K-NET sites used in the RFR and DNN training process. Red circles are K-NET sites.

Figure 15. Comparison distribution of residuals using the DNN model (blue), SRP [5] (red), and WSI [8] (green).

Figure 16. Examples of agreement between observed and DNN-predicted depths of groundwater levels. These four K-NET sites are shown in Figure 5 as blue circles.

Figure 17. Observed versus DNN-predicted depths of groundwater levels.

Figure 18. Loss (i.e., MSE) versus epoch during the DNN training process.

Figure 19. Global sensitivity indices above groundwater levels (i.e., dry conditions).

Figure 20. Global sensitivity indices at groundwater levels (i.e., wet conditions).

Figure 21. Global sensitivity indices at each depth point. $S_{x}$ and $ST_{x}$ correspond to first- and total-order indices.

Figure 22. Performance of RFR (upper blue) and DNN (lower blue) using only two input proxies ($V_{S}$ and $V_{P}$), compared with SRP [5] (red) and WSI [8] (green).

Table 1. The assigned lithology index and its definition.

Index	Definition
1	surface soil
2	fill soil
3	gravel
4	gravelly soil
5	sand
6	sandy soil
7	silt
8	clay
9	organic soil
10	volcanic ash clay
11	peat
12	rock

Table 2. Example source dataset at the KYT013 site.

Depth (m)	N-SPT	$V_{P}$ (m/s)	$V_{S}$ (m/s)	$V_{S 30}$ (m/s)	$ρ$ (g/cm³)	LI ²	$V_{P} / V_{S}$	Poisson’s Ratio	IC_P ¹	IC_S ¹	GLI ²
1	10	240	110	318	1.61	1	2.18	0.367	−0.013	−0.013	0
2	24	240	110	318	1.57	1	2.18	0.367	0.724	0.423	0
3	20	1550	280	318	1.52	1	5.54	0.483	0.041	0.041	1
4	34	1550	280	318	1.65	1	5.54	0.483	0.015	0.015	0
5	34	1550	280	318	1.70	1	5.54	0.483	0.020	0.020	0
6	34	1550	280	318	1.77	1	5.54	0.483	0.046	0.046	0
7	25	1550	280	318	1.94	1	5.54	0.483	−0.037	−0.037	0
8	24	1550	280	318	1.80	8	5.54	0.483	0.014	0.014	0
9	23	1550	280	318	1.85	7	5.54	0.483	0.026	0.026	0
10	28	1550	280	318	1.95	7	5.54	0.483	0.191	0.205	0
11	99	2260	420	318	1.97	5	5.38	0.482	0.010	0.010	0
12	99	2260	420	318	2.01	3	5.38	0.482	0.015	0.015	0
13	99	2260	420	318	2.07	3	5.38	0.482	0.012	0.012	0
14	99	2260	420	318	2.12	4	5.38	0.482	−0.014	−0.014	0
15	99	2260	420	318	2.06	4	5.38	0.482	−0.007	−0.007	0
16	99	2260	420	318	2.03	4	5.38	0.482	−0.012	−0.012	0
17	99	2260	420	318	1.98	3	5.38	0.482	0.003	0.003	0
18	99	2260	420	318	1.99	3	5.38	0.482	0.017	0.017	0
19	99	2260	420	318	2.06	3	5.38	0.482	−0.010	−0.010	0
20	99	2260	420	318	2.02	3	5.38	0.482	0.000	0.000	0

¹ IC_P and IC_S correspond to P- and S-wave impedance contrasts, respectively. ² LI and GLI correspond to the lithology index and the groundwater level index, respectively.

Table 3. Impact of validation approaches on the performance of the RFR model.

Validation Approach	Training %–Validating % (772 K-NET Sites)	No. of Sites with 0.1 < r < 1.0 (Overestimate)	No. of Sites with r ≈ 1.0 (Match)	No. of Sites with 1.0 < r < 10.0 (Underestimate)	No. of Sites with Failed Groundwater Detection
k-fold cross-validation		1	733	33	5
Training–validating split	90–10%	16	686	63	7
Training–validating split	70–30%	50	572	143	7
Training–validating split	50–50%	121	462	176	13
Training–validating split	30–70%	161	387	216	8
Training–validating split	20–80%	171	310	277	29
Training–validating split	10–90%	164	237	357	14
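The difference between the k-fold row and the fixed-split rows of Table 3 lies in how sites are assigned to training and validation. The following is a minimal, generic sketch of k-fold index rotation; the number of folds, the absence of shuffling, and the function name are assumptions for illustration and do not reproduce the exact partitioning used in this study.

```python
def k_fold_indices(n_sites, k):
    """Yield (train, validate) index lists so each site is validated once."""
    # Spread the remainder so fold sizes differ by at most one site.
    base, extra = divmod(n_sites, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    # Rotate: fold f validates, all remaining folds train.
    for f in range(k):
        validate = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, validate


# Hypothetical example: 772 site indices rotated through 10 folds.
splits = list(k_fold_indices(772, 10))
```

In contrast to a single fixed split, every site contributes to validation exactly once across the rotations, so the reported agreement does not depend on which sites happened to fall in one particular validation subset.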

Table 4. Impact of validation approaches on the performance of the DNN model.

Validation Approach	Training %–Validating % (772 K-NET Sites)	No. of Sites with 0.1 < r < 1.0 (Overestimate)	No. of Sites with r ≈ 1.0 (Match)	No. of Sites with 1.0 < r < 10.0 (Underestimate)	No. of Sites with Failed Groundwater Detection
k-fold cross-validation		135	470	155	12
Training–validating split	90–10%	198	317	245	12
Training–validating split	70–30%	171	287	304	10
Training–validating split	50–50%	302	281	178	11
Training–validating split	30–70%	311	256	194	11
Training–validating split	20–80%	306	243	217	6
Training–validating split	10–90%	353	218	197	4
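To compare the RFR and DNN rows of Tables 3 and 4 on a common scale, the matched-site counts can be converted to percentages of the 772 K-NET sites. The short sketch below does only this arithmetic; the counts are copied from the two tables, and the dictionary layout and function name are illustrative choices.

```python
TOTAL_SITES = 772

# Number of sites with r ≈ 1.0 (match), copied from Tables 3 and 4.
MATCHED = {
    "RFR": {"k-fold": 733, "90-10": 686, "70-30": 572, "50-50": 462,
            "30-70": 387, "20-80": 310, "10-90": 237},
    "DNN": {"k-fold": 470, "90-10": 317, "70-30": 287, "50-50": 281,
            "30-70": 256, "20-80": 243, "10-90": 218},
}


def match_rate(count, total=TOTAL_SITES):
    """Percentage of sites whose predicted depth matches the observation."""
    return 100.0 * count / total


# Percentage of matched sites for every model and validation approach.
rates = {
    model: {name: round(match_rate(count), 1) for name, count in rows.items()}
    for model, rows in MATCHED.items()
}
```

With k-fold cross-validation, this gives about 94.9% matched sites for the RFR model and about 60.9% for the DNN model, and both models degrade as the training share shrinks.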

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
