Article

Application of a Hybrid CNN-LSTM Model for Groundwater Level Forecasting in Arid Regions: A Case Study from the Tailan River Basin

1
College of Hydraulic and Civil Engineering, Xinjiang Agricultural University, Urumqi 830052, China
2
Xinjiang Key Laboratory of Hydraulic Engineering Security and Water Disasters Prevention, Xinjiang Agricultural University, Urumqi 830052, China
3
Wensu Future Irrigation District Field Station, Wensu, Aksu 843100, China
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2026, 15(1), 6; https://doi.org/10.3390/ijgi15010006
Submission received: 18 September 2025 / Revised: 8 December 2025 / Accepted: 18 December 2025 / Published: 21 December 2025

Abstract

Accurate forecasting of groundwater level dynamics poses a critical challenge for sustainable water management in arid regions. However, the strong spatiotemporal heterogeneity inherent in groundwater systems and their complex interactions between natural processes and human activities often limit the effectiveness of conventional prediction methods. To address this, a hybrid CNN-LSTM deep learning model is constructed. This model is designed to extract multivariate coupled features and capture temporal dependencies from multi-variable time series data, while simultaneously simulating the nonlinear and delayed responses of aquifers to groundwater abstraction. Specifically, the convolutional neural network (CNN) component extracts the multivariate coupled features of hydro-meteorological driving factors, and the long short-term memory (LSTM) network component models the temporal dependencies in groundwater level fluctuations. This integrated architecture comprehensively represents the combined effects of natural recharge–discharge processes and anthropogenic pumping on the groundwater system. Utilizing monitoring data from 2021 to 2024, the model was trained and tested using a rolling time-series validation strategy. Its performance was benchmarked against traditional models, including the autoregressive integrated moving average (ARIMA) model, recurrent neural network (RNN), and standalone LSTM. The results show that the CNN-LSTM model delivers superior performance across diverse hydrogeological conditions: at the upstream well AJC-7, which is dominated by natural recharge and discharge, the Nash–Sutcliffe efficiency (NSE) coefficient reached 0.922; at the downstream well AJC-21, which is subject to intensive pumping, the model maintained a robust NSE of 0.787, significantly outperforming the benchmark models. 
Further sensitivity analysis reveals an asymmetric response of the model’s predictions to uncertainties in pumping data, highlighting the role of key hydrogeological processes such as delayed drainage from the vadose zone. This study not only confirms the strong applicability of the hybrid deep learning model for groundwater level prediction in data-scarce arid regions but also provides a novel analytical pathway and mechanistic insight into the nonlinear behavior of aquifer systems under significant human influence.

1. Introduction

Water is a vital natural resource, essential for both socioeconomic development and ecological balance. In China, growing water shortages have intensified groundwater overexploitation in recent years, causing serious problems such as land subsidence and ecological degradation, especially in arid and semi-arid regions [1,2,3,4]. Groundwater is key to the socioeconomic and ecological stability of the Tailan River Basin, so accurate prediction of groundwater levels is crucial for sustainable management. Such prediction is difficult, however, because of aquifer heterogeneity, nonlinear relationships among state variables, and the spatially and temporally varying interactions between surface water and groundwater [5,6].
Physics-based numerical simulation has long been the main approach to groundwater level prediction. However, its accuracy depends on large amounts of data, it is computationally demanding, and it often cannot fully capture the complex relationships among hydrological variables, which limits prediction accuracy [7]. Advances in computing have led to new methods. Over the last decade, many studies have examined the strengths and weaknesses of traditional physics-based models and compared their predictive performance with that of newer, data-driven machine learning methods [8]. Most machine learning methods encounter challenges in handling missing data and highly nonlinear relationships. Deep learning has emerged as an effective approach for addressing these issues, owing to its strong capability in feature extraction and modeling complex systems [9,10]. As an important branch of machine learning, deep learning employs deep neural networks, including Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Transformers, to automatically extract data features, enabling the prediction of complex hydrological time series [11,12]. These models can effectively learn the spatiotemporal variation patterns of groundwater levels, significantly improving prediction accuracy [13]. However, traditional RNNs suffer from vanishing/exploding gradient problems, making it difficult to capture long-term dependencies and leading to the accumulation of forecast errors over extended sequences. To address this, improved models such as LSTM and Gated Recurrent Units (GRU) have been introduced to enhance long-sequence modeling capability and have been successfully applied in groundwater level prediction research [14].
In recent years, data-driven models have shown promise as alternatives to traditional physical simulations in groundwater studies. Early research largely focused on linear regression and artificial neural networks (ANN) [15,16,17,18]. For instance, Di Salvo et al. suggested that while machine learning models may not fully replace numerical models for physical characterization, they can significantly enhance the efficiency of single-well water level forecasting; the synergy of both approaches aids in optimizing groundwater management decisions. In specific application scenarios, machine learning models exhibit notable advantages: for example, Kerebih et al. utilized an ANN to integrate initial water level, pumping rates, and hydro-meteorological data, achieving high-precision groundwater level predictions in arid and semi-arid regions. Their prediction results outperformed those of traditional MODFLOW numerical simulations, highlighting the reliability of data-driven methods. However, such models exhibit significant limitations in capturing nonlinearities and long-term dependencies within spatiotemporal groundwater level series. Subsequently, recurrent neural networks (RNNs) and long short-term memory (LSTM) networks, which are more capable of modeling temporal characteristics, were introduced into this field. Bowes et al., in a groundwater prediction study for a coastal city, compared LSTM with RNN and confirmed that the LSTM model possesses stronger predictive capability when trained on observational data [19]. Although that study effectively improved the accuracy of single-point water level predictions, its structure typically focuses on temporal modeling of individual sites, making it difficult to effectively integrate spatial correlations among multiple wells within a basin. To simultaneously capture multivariate coupled features and temporal dependencies, some studies began exploring hybrid architectures that couple CNN with GRU/LSTM. 
For example, the LSTM time-series model constructed by Zhang et al. using water diversion, evaporation, precipitation, and air temperature as inputs, successfully simulated and predicted groundwater dynamics in multiple subzones of the Hetao Irrigation District. It demonstrated particularly excellent fitting of peak water levels, providing an effective supplementary method for data-scarce regions [6]. Pan et al. developed a model integrating convolutional neural network (CNN) and gated recurrent unit (GRU), where CNN effectively captured spatial correlation features of water levels in different regions of the Yangtze River Basin, while GRU efficiently extracted temporal dependencies, achieving better river water level prediction accuracy with a simplified structure (compared to LSTM) [20]. Furthermore, the application of various deep learning techniques has been validated. For instance, Afzaal et al. compared the performance of multilayer perceptron (MLP), LSTM, and CNN in groundwater level estimation, confirming the convenience and accuracy of such deep learning models [21]. Lähivaara et al. innovatively applied deep learning techniques, constructing and training a deep neural network model based on seismic data, significantly enhancing the accuracy and reliability of groundwater storage estimation [22]. Recent research has further advanced the development of hybrid architectures and multi-source information fusion. Sun et al. proposed a hybrid framework combining LSTM with a physical model, which significantly improved daily-scale groundwater level prediction accuracy by calibrating physical model errors or fusing their output features, proving effective for over 77% of monitoring wells [23]. Kow et al. introduced a ConvAE-LSTM model that integrates a convolutional autoencoder with LSTM. 
This model can jointly extract deep features from groundwater point data and spatiotemporal images, and its three-month-ahead water level predictions significantly outperformed traditional hydrological and LSTM models, with the R² in key areas increasing by over 49% [24]. Chu et al. combined LSTM with predictor selection based on partial mutual information (PMI) and the Bootstrap method. After incorporating remote sensing information, their approach improved daily prediction accuracy across different climatic zones and quantified the associated uncertainty [25]. However, existing research predominantly focuses on model structure optimization or applications in humid regions with abundant hydro-meteorological data. While these models have achieved success, their performance in arid systems like the Tailan River Basin, which is dominated by high-intensity, highly uncertain anthropogenic pumping, remains a significant research gap. Specifically, the challenge of utilizing limited observational data to not only achieve high forecasting accuracy but also to reveal the response mechanisms of hydrogeological processes under pumping activities (e.g., drainage hysteresis effects) has been insufficiently addressed.
This study, conducted in the Tailan River Basin of Xinjiang, aims to develop a hybrid CNN-LSTM model capable of simultaneously extracting multivariate coupled features and modeling the temporal dependencies of groundwater levels. The innovations of this research are threefold: (1) A specialized model architecture was designed to address the characteristics of arid groundwater systems, which involve multi-source inputs and complex multivariate coupling and hysteresis effects. This architecture is tailored to synergistically extract the coupled features and temporal dependencies from hydro-meteorological variables. (2) It focuses on validating the model’s applicability and robustness across different hydrogeological units (represented by three monitoring wells) under the dual drivers of natural recharge/discharge and intensive anthropogenic pumping, particularly in the data-sparse downstream area. (3) Through rigorous comparative experiments and a multi-metric evaluation framework, it quantifies the performance improvement of the CNN-LSTM model over conventional ARIMA, standalone RNN, and LSTM models, thereby providing a reliable application case and paradigm for this model’s use in complex arid groundwater systems.

2. Study Area and Monitoring Data

2.1. Study Area Description

The Tailan River Basin in Xinjiang was selected as the study area. This basin exemplifies a typical arid piedmont hydrogeological system, in which groundwater dynamics are shaped by a complex coupling of natural recharge/discharge processes and intensive human intervention. These attributes make it a suitable setting for evaluating the performance of deep learning models under hydrogeologically challenging conditions. The plain portion of the basin (1950.8 km²) is dominated by a piedmont alluvial aquifer with pronounced heterogeneity. Major recharge originates from infiltration of the Tailan River runoff as it exits the mountain pass, with an annual volume of 754 million m³. The climate is warm-temperate and extremely arid, featuring limited natural recharge (mean annual precipitation: 85.14 mm) and high potential evaporation (3392 mm). In contrast, agricultural irrigation depends heavily on large-scale groundwater extraction, which totals 225 million m³ per year. This accounts for the majority of the basin’s total groundwater resources (288 million m³/a) and has substantially altered natural flow patterns. Consequently, the groundwater system exhibits high nonlinearity and is significantly pumping-dominated, posing substantial challenges for conventional simulation methods and presenting a dual challenge for data-driven models: to accurately capture multivariate coupled signals and the nonlinear responses under intensive anthropogenic disturbance. The geographic location and hydrogeological zoning of the study area are shown in Figure 1 and Figure 2, respectively.

2.2. Data and Statistical Analysis

Daily observational data from 2021 to 2024 within the Tailan River Basin were used in this study. Model inputs consisted of precipitation, air temperature, evaporation, river runoff, and groundwater pumping volume, with groundwater level serving as the output variable. The study area is representative of data-scarce regions, where monitoring is sparse and data quality presents ongoing difficulties. Pumping volumes were estimated from agricultural electricity consumption records. Although this approach reflects temporal variations in extraction intensity, it may incorporate systematic biases due to pump efficiency fluctuations and conveyance losses. The influence of this uncertainty on forecasting outcomes will be explicitly evaluated in the sensitivity analysis (Section 5.4). Groundwater level data were collected from seven monitoring wells installed across the basin. Statistical analysis of the level time-series from these wells showed marked spatial heterogeneity in dynamic behavior (Table 1), offering a principled basis for selecting representative wells during model development.
These variables exhibit distinct temporal variability and seasonal patterns that drive groundwater fluctuations.
Statistical analysis results (Table 1, Figure 3 and Figure 4) indicate that the water level dynamics in the monitoring wells exhibit a spatially patterned variation along the groundwater flow path (from upstream to downstream), characterized by increasing fluctuation (standard deviation) and intensifying negative skewness. This pattern intuitively reflects the spatial gradient of anthropogenic disturbance intensity. The dynamics at the upstream well AJC-7 are highly stable (standard deviation: 0.63 m; skewness: 5.01), indicating an intact aquifer structure in this area where groundwater dynamics are predominantly controlled by natural recharge and discharge processes, with minimal anthropogenic interference. In contrast, the downstream wells AJC-12, AJC-14, AJC-17, and AJC-21 all exhibit high fluctuation and strong negative skewness. Particularly, wells AJC-14 (standard deviation: 0.97 m; skewness: −15.84) and AJC-21 (standard deviation: 2.21 m; skewness: −11.50) show statistical signatures suggestive of intense yet distinct modes of human intervention. The combination of “low fluctuation-extreme negative skewness” at AJC-14 reveals a dynamic pattern dominated by long-term stability, interspersed with several short-lived but sharp water level decline events. Time-series analysis shows these decline events systematically lag behind pumping peaks by approximately 1–2 months (e.g., the pumping peak in 2024 occurred in July–August, while the water level trough at this well appeared in September–October). This hysteresis phenomenon is highly consistent with the “non-equilibrium drainage” effect in aquifers: intensive pumping creates a large-scale cone of depression, and even after pumping diminishes or ceases, gravity drainage in the vadose zone continues, causing water levels to decline further to their trough after the pumping peak. 
Conversely, the observed “high fluctuation-strong negative skewness” signature at well AJC-21 attests to intense and frequent direct disturbances in its vicinity. Furthermore, well AJC-12 was excluded from subsequent analysis to maintain data reliability, as observed anomalies indicated potential periodic dry-out or sensor malfunction.
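The two statistics used above can be illustrated with a short sketch. It assumes the population standard deviation and the moment-based (Fisher-Pearson) skewness; the estimator actually used for Table 1 is not stated in the text, and the toy series below is hypothetical rather than monitoring data.

```python
import math

def std_and_skewness(levels):
    """Return (population standard deviation, moment-based skewness)."""
    n = len(levels)
    mean = sum(levels) / n
    m2 = sum((v - mean) ** 2 for v in levels) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in levels) / n  # third central moment
    std = math.sqrt(m2)
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return std, skew

# Hypothetical series: a long stable period with one sharp decline,
# mimicking the "stable with short-lived drops" pattern described for AJC-14.
stable_with_drop = [10.0] * 9 + [5.0]
std, skew = std_and_skewness(stable_with_drop)
print(f"std = {std:.2f} m, skewness = {skew:.2f}")
```

A rare, sharp decline in an otherwise flat record drives the skewness negative while the standard deviation stays moderate, which is the signature the text attributes to delayed drainage events.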

2.3. Monitoring Well Selection and Data Preprocessing

Based on the principles of data integrity, representativeness of hydrogeological processes, and avoidance of redundancy, this study selected three monitoring wells—AJC-7, AJC-14, and AJC-21—for model construction and validation. These wells are distributed along the groundwater flow path, forming a complete gradient from natural recharge dominance to intense human disturbance, thereby systematically representing the dynamic characteristics of different zones within the study area. The upstream well AJC-7 reflects relatively stable dynamics controlled by natural recharge and discharge processes; the midstream well AJC-14 exhibits transitional behavior with delayed responses under increasing extraction, making it suitable for revealing key hydrogeological processes such as vadose zone drainage; while the downstream well AJC-21 represents highly fluctuating conditions under strong anthropogenic influence. This well combination, covering a full spectrum of disturbance intensities, supports high-precision predictions and enhances the model’s ability to identify system behaviors under different driving mechanisms, thereby establishing a robust modeling foundation for understanding complex aquifer processes in arid regions under multiple stressors.
During the data preprocessing phase, emphasis was placed on ensuring data quality and model stability. Short-term gaps in meteorological and pumping records were filled using temporal interpolation, while missing groundwater level records of up to seven days were reconstructed via linear interpolation, and longer gaps were addressed using a 7-day rolling average. Obvious outliers caused by sensor failures were removed. To effectively leverage spatiotemporal information, a multivariate feature set was constructed, incorporating lagged terms of key variables and temperature-precipitation interaction terms to represent potential nonlinear relationships. Finally, all features were normalized to eliminate scale differences and improve model convergence efficiency [26].
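The preprocessing steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: missing values are represented as `None`, normalization is assumed to be min-max scaling (the text only says "normalized"), the interaction term is assumed to be a simple product, and all function names are hypothetical.

```python
def fill_short_gaps(series, max_gap=7):
    """Linearly interpolate runs of None no longer than max_gap days.

    Longer runs and gaps at the series edges are left as None.
    """
    filled = list(series)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is None:
            start = i
            while i < n and filled[i] is None:
                i += 1
            end = i  # index of first valid value after the gap (or n)
            gap = end - start
            if start > 0 and end < n and gap <= max_gap:
                left, right = filled[start - 1], filled[end]
                for k in range(start, end):
                    frac = (k - start + 1) / (gap + 1)
                    filled[k] = left + frac * (right - left)
        else:
            i += 1
    return filled

def add_lag(values, lag):
    """Shift a series back by `lag` days; the first `lag` entries are None."""
    return [None] * lag + values[:-lag] if lag > 0 else list(values)

def interaction(temperature, precipitation):
    """Assumed form of the temperature-precipitation interaction: a product."""
    return [t * p for t, p in zip(temperature, precipitation)]

def min_max_normalize(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]
```

In practice the scaling bounds would normally be computed on the training period only and reused for the validation year, so that no information from 2024 enters training.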

3. Methods

3.1. CNN-LSTM Model Architecture

Groundwater level dynamics represent a complex spatiotemporal process influenced by the combined effects of meteorological, hydrological, and anthropogenic pumping factors. Traditional standalone models often struggle to simultaneously handle their multivariate coupled features and temporal hysteresis effects. To address this, the present study developed a hybrid CNN–LSTM model, whose core architecture consists of two main components. First, a convolutional neural network (CNN) analyzes multivariate input sequences—such as precipitation and pumping rates—by treating multiple correlated time series as integrated entities for feature extraction. Its detailed architecture is shown in Figure 5. This allows the identification of short-term (e.g., within days) synergistic variations and local patterns among different variables, such as the correlation between increased extraction and rising temperature during an irrigation event. Second, a long short-term memory network (LSTM) is employed to simulate the memory mechanism of the aquifer system, characterizing the delayed response and long-term trends of groundwater levels to recharge and extraction events—for instance, capturing the hysteretic effects of vadose zone drainage and seasonal fluctuations. The structure of the LSTM component is illustrated in Figure 6. This coupled architecture enables the model to jointly account for the multivariate associations of hydro-meteorological factors and the hysteresis behavior of water level changes, thereby revealing the system response mechanisms under the combined influence of natural and anthropogenic drivers. The model architecture is illustrated in Figure 7. In this study, the time step (sliding window length) was set to 60 days, with five input features: precipitation, temperature, evaporation, runoff, and pumping volume. The input tensor was first fed into the CNN module for feature extraction, configured as follows:
(1) CNN Module: For extracting multivariate coupled features. A two-layer 1D convolutional structure was employed in this study. The 1D convolution effectively scans the multivariate input sequences (e.g., precipitation, pumping volume, air temperature) to identify synergistic variation patterns among different variables over short-term periods (for instance, an irrigation event may correspond to a simultaneous surge in pumping volume and rise in temperature). The kernel size of the first convolutional layer was set to 5, aiming to capture multi-variable interactive features within approximately a one-week cycle; the kernel size of the second layer was set to 3, to further integrate and abstract more refined temporal features [27]. Each convolutional layer was followed by normalization, a GELU activation function, and a max-pooling layer to accelerate training, introduce non-linearity, and enhance feature robustness. This module ultimately outputs features with a Dropout rate of 0.3 to improve the model’s generalization ability [26,28,29,30].
(2) LSTM Module: For Temporal Dependency Modeling. The features extracted by the CNN are reshaped into a time series and fed into the LSTM layer. The gating mechanism (input gate, forget gate, output gate) of the LSTM enables it to effectively learn long-term dependencies within sequences, making it particularly suitable for simulating the hysteretic response of groundwater to recharge and pumping events (e.g., vadose zone drainage hysteresis) and the nonlinear evolutionary trends of water levels [31,32,33,34]. This model employs a two-layer LSTM architecture (with 64 neurons per layer) [35,36,37,38] to fully learn the complex temporal dynamics of hydrological processes, also incorporating Dropout (rate of 0.3) to mitigate overfitting [39,40]. The specific computational formulas are as follows:
Forget Gate:
$$f_t = \sigma\left(W_f \left[h_{t-1}, x_t\right] + b_f\right)$$
Input Gate:
$$i_t = \sigma\left(W_i \left[h_{t-1}, x_t\right] + b_i\right), \quad \tilde{C}_t = \tanh\left(W_C \left[h_{t-1}, x_t\right] + b_C\right)$$
Cell State Update:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
Output Gate:
$$o_t = \sigma\left(W_o \left[h_{t-1}, x_t\right] + b_o\right), \quad h_t = o_t \odot \tanh\left(C_t\right)$$
where $\sigma$ and $\tanh$ denote activation functions; $\odot$ denotes element-wise multiplication; $i_t$ represents the output of the input gate; $x_t$ is the input at the current timestep; $h_{t-1}$ denotes the hidden state from the previous timestep; $C_t$ indicates the cell state at the current timestep; $W_f$, $W_i$, $W_C$, $W_o$ are learnable weight matrices; and $b_f$, $b_i$, $b_C$, $b_o$ are the corresponding bias terms.
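The gate equations can be traced numerically with a scalar sketch, in which the input and hidden state are single numbers so that each weight matrix reduces to a pair of weights. The parameter layout (`params` mapping a gate name to `(w_h, w_x, b)`) is a hypothetical convenience for illustration; the model in this study uses 64 hidden units per layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step for scalar input and scalar hidden state.

    params maps each gate name ("f", "i", "C", "o") to (w_h, w_x, b),
    i.e. the weights applied to h_{t-1} and x_t plus the bias.
    """
    def pre_activation(gate):
        w_h, w_x, b = params[gate]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre_activation("f"))        # forget gate
    i_t = sigmoid(pre_activation("i"))        # input gate
    c_tilde = math.tanh(pre_activation("C"))  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde        # cell state update
    o_t = sigmoid(pre_activation("o"))        # output gate
    h_t = o_t * math.tanh(c_t)                # new hidden state
    return h_t, c_t

# With all-zero parameters every gate equals 0.5 and the candidate is 0,
# so the cell simply retains half of its previous state.
zero_params = {g: (0.0, 0.0, 0.0) for g in ("f", "i", "C", "o")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

A forget gate close to 1 lets the cell state carry information across many days, which is the mechanism the text relies on to represent delayed aquifer responses.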
To achieve accurate modeling of groundwater dynamics under diverse hydrogeological conditions and to clearly evaluate the method’s applicability in heterogeneous settings, this study adopted a “one independent model per monitoring well” strategy. Specifically, separate CNN-LSTM models with identical architecture but independently trained parameters were developed for the three wells: AJC-7, AJC-14, and AJC-21. This strategy is based on the following considerations: (1) The significant spatial heterogeneity in hydrogeological conditions and the intensity of human-induced disturbances within the study area; preliminary tests indicated that a single shared model struggled to simultaneously capture the distinct dynamic patterns observed at the three wells. (2) Independent modeling avoids the need to introduce complex spatial features such as well location encoding, allowing the model to focus on learning the unique driver-response relationship specific to each well. (3) Under this strategy, the model achieved excellent performance at all three representative wells, directly demonstrating the general applicability and robustness of the proposed CNN-LSTM architecture across various hydrogeological regimes—from those dominated by natural recharge–discharge to those subject to intense anthropogenic perturbation. This provides a basis for extending the method to wells under different conditions. The model architecture, training, and validation procedures described in the following sections apply to each independent single-well model.
The network models long-term temporal dependencies and delayed aquifer responses to pumping and recharge.
The CNN extracts short-term multivariate features, while the LSTM captures long-term dependencies, enabling improved simulation of groundwater dynamics.
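The following sketch shows how a 60-day, five-feature window passes through the two convolution stages in terms of sequence length. Several details are assumptions not given in the text: a one-day-ahead target, no padding, a pooling size of 2, and a single filter per layer (a real layer has many filters and learned weights).

```python
def make_windows(features, targets, window=60):
    """Pair each `window`-day block of daily features with the next day's level."""
    X, y = [], []
    for end in range(window, len(features)):
        X.append(features[end - window:end])
        y.append(targets[end])
    return X, y

def conv1d(seq, kernel):
    """Unpadded 1D convolution over time that mixes all channels.

    seq is T x C (days x variables), kernel is K x C; returns T - K + 1 values.
    """
    k = len(kernel)
    out = []
    for t in range(len(seq) - k + 1):
        total = 0.0
        for dt in range(k):
            for c, w in enumerate(kernel[dt]):
                total += w * seq[t + dt][c]
        out.append(total)
    return out

def max_pool(values, size=2):
    """Non-overlapping max pooling; a trailing partial block is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

# 65 hypothetical days of 5 features give 5 windows of 60 days each.
days = [[1.0] * 5 for _ in range(65)]
X, y = make_windows(days, list(range(65)))

stage1 = max_pool(conv1d(X[0], [[0.04] * 5 for _ in range(5)]))      # kernel size 5
stage2 = max_pool(conv1d([[v] for v in stage1], [[1.0]] * 3))        # kernel size 3
print(len(X[0]), len(stage1), len(stage2))
```

Under these assumptions the sequence shortens from 60 to 28 to 13 steps, and it is this shortened feature sequence that would be passed to the LSTM layers.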

3.2. Model Training, Validation, and Hyperparameter Setting

To ensure the reliability and generalizability of the model predictions, a strict backward validation method was applied to the time-series data. The dataset was partitioned using 1 January 2024 as the cutoff: training data spanned September 2021 to December 2023, and the validation set comprised the full calendar year 2024. This partitioning guards against information leakage from the future into the training phase, thereby mimicking real forecasting conditions.
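A chronological split of this kind can be sketched as follows. The record layout (a list of `(date, value)` tuples) and the function name are assumptions for illustration.

```python
from datetime import date, timedelta

def chronological_split(records, cutoff=date(2024, 1, 1)):
    """Split (date, value) records so that training strictly precedes validation."""
    train = [r for r in records if r[0] < cutoff]
    valid = [r for r in records if r[0] >= cutoff]
    return train, valid

# Hypothetical daily records straddling the cutoff.
start = date(2023, 12, 30)
records = [(start + timedelta(days=i), float(i)) for i in range(4)]
train, valid = chronological_split(records)
```

Unlike a random shuffle, this split ensures that no sample dated on or after the cutoff is used for fitting.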
During training, emphasis was placed on balancing predictive accuracy with model generalizability. The AdamW optimizer was chosen for its improved generalization performance, attributable to decoupled weight decay regularization [41,42]. The Huber loss function was selected for its reduced sensitivity to anomalous values, making it suitable for real-world monitoring datasets that often contain noise [43]. Key hyperparameters used during training are listed in Table 2. Multiple strategies were implemented to enhance training stability: a dynamic learning rate scheduler reduces the rate when the validation loss stabilizes to refine convergence; gradient clipping caps the norm at 1.0 to curb instability; and Dropout is applied to randomly omit neurons, countering overfitting. To alleviate the LSTM’s sensitivity to initial states, a sliding time window of 60 days was used during training. Processing sequences in consecutive batches acts as an implicit “warm-up,” promoting the stabilization of the model’s internal state by the end of each window and diminishing the effect of initial transient behaviors [44].
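Two of the stabilizing components named above, the Huber loss and gradient-norm clipping, are simple enough to write out directly. The Huber threshold `delta=1.0` is an assumption (Table 2 is not reproduced here); the clipping norm of 1.0 is taken from the text.

```python
import math

def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for yt, yp in zip(y_true, y_pred):
        err = abs(yt - yp)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)

def clip_by_norm(grads, max_norm=1.0):
    """Rescale a gradient vector so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]

loss = huber_loss([0.0, 0.0], [0.5, 3.0])  # one small error, one outlier
clipped = clip_by_norm([3.0, 4.0])         # original norm is 5
```

Because the penalty grows only linearly beyond `delta`, a single anomalous reading contributes far less to the loss than it would under a squared-error criterion.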
Despite the limited time window of the training data, the aforementioned regularization measures (Dropout and weight decay) combined with strict temporal validation effectively mitigated the risk of overfitting. To visually illustrate the training process, Figure 8 presents the convergence curves of training and validation loss for a representative monitoring well (AJC-7). Both curves decreased steadily and converged without significant divergence, indicating no severe overfitting occurred. This study also experimented with deeper network architectures, but these exhibited early performance deterioration on the validation set. Consequently, the current streamlined yet efficient architecture was ultimately selected to achieve an optimal balance between model capacity and generalization ability.
The similar convergence trends indicate stable training and effective prevention of overfitting.

3.3. Model Performance Evaluation Metrics

A multidimensional set of metrics (Equations (1)–(4)) was employed to comprehensively quantify the predictive performance of the model, including the Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Nash–Sutcliffe efficiency (NSE) coefficient, and Pearson correlation coefficient (R). The NSE serves as a core metric for evaluating hydrological model performance, measuring the improvement of predictions over simply using the mean of observed values [45]. Each metric assesses performance from a distinct perspective: MAE reflects the average physical magnitude of prediction errors; RMSE emphasizes the influence of larger errors; NSE evaluates predictive capability relative to the observation mean; and R quantifies the degree of linear correlation between predicted and observed values.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \tag{1}$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{n}} \tag{2}$$
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(y_i^{obs} - y_i^{pred}\right)^2}{\sum_{i=1}^{n}\left(y_i^{obs} - \bar{y}^{obs}\right)^2} \tag{3}$$
$$R = \frac{\sum_{i=1}^{n}\left(y_i^{obs} - \bar{y}^{obs}\right)\left(y_i^{pred} - \bar{y}^{pred}\right)}{\sqrt{\sum_{i=1}^{n}\left(y_i^{obs} - \bar{y}^{obs}\right)^2 \sum_{i=1}^{n}\left(y_i^{pred} - \bar{y}^{pred}\right)^2}} \tag{4}$$
where $n$ is the total number of samples; $y_i$ (equivalently $y_i^{obs}$) is the measured value at the $i$-th time step; $\hat{y}_i$ (equivalently $y_i^{pred}$) is the corresponding model prediction; and $\bar{y}^{obs}$ and $\bar{y}^{pred}$ are the arithmetic means of the measured and predicted values, respectively.
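The four metrics can be computed directly from paired observed and predicted series. The sketch below follows the definitions in Equations (1)–(4); the function name and the toy values are illustrative only.

```python
import math

def evaluate(obs, pred):
    """Return MAE, RMSE, NSE and Pearson R for paired observed/predicted values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    return {
        "MAE": sum(abs(p - o) for o, p in zip(obs, pred)) / n,
        "RMSE": math.sqrt(sq_err / n),
        "NSE": 1.0 - sq_err / var_obs,
        "R": cov / math.sqrt(var_obs * var_pred),
    }

# A prediction with a constant 0.5 m bias: perfectly correlated, yet NSE < 1.
metrics = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

The biased example shows why the metrics are reported together: R is insensitive to a constant offset, whereas NSE, MAE and RMSE all penalize it.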

3.4. Uncertainty and Sensitivity Analysis Methods

Sensitivity analysis in this study was focused on pumping inputs, primarily due to reliability variations in data sources. As described in Section 2.2, the pumping data were estimated from agricultural electricity consumption. While this method effectively reflects the dynamic variations in pumping intensity, unquantified systematic errors may be introduced due to pump efficiency differences and water conveyance losses. In contrast, meteorological data (precipitation, air temperature, and evaporation) were sourced from high-precision observations with relatively lower measurement uncertainty. Therefore, quantifying the impact of this dominant source of uncertainty—pumping volume—on model predictions is of primary importance for assessing the reliability of the forecast results. Future work could expand this analysis to a comprehensive assessment of uncertainties from multiple sources. By intentionally introducing a systematic bias of ±20% (a magnitude referenced from practical investigations of local irrigation systems), three distinct pumping input scenarios were constructed: the baseline scenario (original estimated values), an overestimation scenario (original values × 1.2), and an underestimation scenario (original values × 0.8). Under the condition that the model structure, hyperparameters, and other input variables remained unchanged, the pre-trained model was driven, respectively, by these three sets of pumping data to re-run predictions and calculate the performance metrics (e.g., NSE, RMSE). By comparing the simulation outcomes across these perturbed scenarios, the sensitivity of the model output to input data uncertainty and the robustness of the primary conclusions were quantitatively assessed.
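The scenario procedure can be sketched as a loop over scaling factors applied to the pumping column only, with the trained model held fixed. Everything concrete here is hypothetical: `toy_predict` stands in for the pre-trained CNN-LSTM, pumping is assumed to be the fifth feature (index 4), and the scaling is assumed to be applied to raw values before any normalization.

```python
def scale_pumping(window, factor, pumping_index=4):
    """Return a copy of a (days x features) window with pumping scaled by factor."""
    return [[v * factor if j == pumping_index else v for j, v in enumerate(day)]
            for day in window]

def nse(obs, pred):
    """Nash-Sutcliffe efficiency."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def pumping_sensitivity(predict, windows, observed, factors=(0.8, 1.0, 1.2)):
    """Re-run a fixed predictor under each pumping scenario and report NSE."""
    return {f: nse(observed, [predict(scale_pumping(w, f)) for w in windows])
            for f in factors}

def toy_predict(window):
    """Hypothetical stand-in model: level falls linearly with mean pumping."""
    mean_pumping = sum(day[4] for day in window) / len(window)
    return 10.0 - 0.1 * mean_pumping

windows = [[[0.0, 0.0, 0.0, 0.0, p]] * 3 for p in (10.0, 20.0, 30.0)]
observed = [toy_predict(w) for w in windows]
results = pumping_sensitivity(toy_predict, windows, observed)
```

Because the stand-in model is linear, its NSE degrades equally for the overestimation and underestimation scenarios; an asymmetric response can only arise from a nonlinear trained model.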

4. Results

4.1. Model Prediction Accuracy

To identify the optimal predictive model for the Tailan River Basin, the performance of four model types—the autoregressive integrated moving average (ARIMA) model, RNN, LSTM, and CNN-LSTM—was systematically compared. To ensure a fair comparison, the input and output variables were kept identical across all models (ARIMA, RNN, LSTM, and CNN-LSTM). All data preprocessing, model training, and forecasting were conducted at a daily timestep. The models learned from continuous daily sequences and generated forecasts of future daily groundwater levels. The traditional ARIMA model performed the worst at all monitoring wells (e.g., NSE = −0.9676 at well AJC-7) due to the inherent conflict between its linear assumptions and the nonlinear nature of the hydrogeological system. The RNN model demonstrated some predictive capability at the upstream well AJC-7, which is dominated by natural dynamics (NSE = 0.5219). However, its performance declined significantly at the downstream wells AJC-14 and AJC-21 (NSE = −0.0103 and 0.0287, respectively), indicating that a simple recurrent structure is inadequate for handling the complex nonlinear signals and long-term dependencies introduced by human activities. The LSTM model showed markedly improved performance (NSE at AJC-21 increased to 0.3778). Nevertheless, its single-point temporal modeling architecture cannot leverage multivariate coupled features. This limitation leads to systematic lags and an underestimation of amplitude in its response to regional pumping events. In contrast, the CNN-LSTM model demonstrated comprehensive and superior predictive advantages. The Nash–Sutcliffe efficiency coefficient (NSE) exceeded 0.58 at all wells, reaching 0.9224 at well AJC-7. Furthermore, its predictions showed the strongest correlation with observed values (R > 0.91 for all wells). 
Crucially, the magnitude of its performance improvement was most pronounced in areas with more intense anthropogenic disturbance: at well AJC-21, its NSE increased by about 108% compared to the second-best model, the standalone LSTM (from 0.378 to 0.787), while the RMSE decreased by approximately 37%. This demonstrates the value of the hybrid CNN-LSTM architecture in handling complex multivariate driver-response processes under the dual driving mechanisms of natural and anthropogenic forces (see Figure A1 in Appendix A for the full comparison of model performance across wells).
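As a worked check on the figures quoted above, the sketch below gives the standard definition of the Nash–Sutcliffe efficiency and the relative NSE gain at well AJC-21. The short `obs` series is invented for illustration only; the two NSE values are the ones reported in this section.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean.

    NSE = 1 is a perfect fit, NSE = 0 matches a constant prediction equal to the
    observed mean, and NSE < 0 is worse than that mean predictor.
    """
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    var = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / var

def relative_change_pct(old, new):
    """Percentage change from old to new, relative to the magnitude of old."""
    return (new - old) / abs(old) * 100.0

# Invented water levels (m), used only to exercise the definition.
obs = [9.5, 9.2, 8.8, 9.0, 9.4]
mean_only = [sum(obs) / len(obs)] * len(obs)
print(nse(obs, obs))        # perfect fit
print(nse(obs, mean_only))  # mean predictor

# NSE values reported for well AJC-21 (LSTM vs. CNN-LSTM).
gain_pct = relative_change_pct(0.3778, 0.787)
print(f"Relative NSE gain at AJC-21: {gain_pct:.1f}%")
```

A relative gain of this kind is only meaningful when the reference NSE is clearly positive. For baselines near zero or negative, such as the RNN at AJC-14 (NSE = −0.0103), the absolute difference in NSE is the safer comparison.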

4.2. Visual Analysis of Model Prediction Performance

Figure 9 visually compares the predicted sequences from the four models against the observed values, intuitively demonstrating their performance differences. The prediction sequence of the ARIMA model (Figure 9a) exhibits significant deviations from the observed values, manifesting as apparent phase lags and an over-smoothed amplitude, failing to capture any rapid water level changes. This is consistent with its exceedingly low quantitative evaluation metrics mentioned earlier. The prediction sequences of the RNN and LSTM models (Figure 9b, c) partially replicate the trends in water level variation but still show notable flaws. The prediction trajectory of the RNN is contaminated by high-frequency noise during periods of high fluctuation. The predictions from the LSTM are smoother; however, during peak pumping periods at the downstream wells, its prediction curve displays a systematic lag and an underestimation of the magnitude of sharp water level declines. As an illustration, at well AJC-21 during August 2024—a period of intensive pumping—the standalone LSTM model showed a noticeable delay in its predictions, estimating the water level decline 5–7 days later than measured, and underpredicted the total drawdown by close to 20%. These outcomes underscore the challenges faced by LSTM-based models in reacting promptly to strong external influences such as pumping. By contrast, the CNN–LSTM model yielded predictions that closely match observed values throughout the monitoring network (Figure 9d). It accurately simulated both the steady variations typical of the upstream well AJC-7 and the rapid, irrigation-triggered changes seen at downstream wells. Notably, the model correctly predicted the delayed water level low observed at well AJC-14 in mid-September, as well as the repeated sharp variations during irrigation events at AJC-21 (indicated with arrows in Figure 9d). 
Its predicted curves show a high degree of agreement with the measured data in terms of both the timing and magnitude of fluctuations, indicating that the model effectively integrates multi-source driving information with the temporal dependencies of water levels, thereby providing a more realistic simulation of the actual hydrogeological processes.
The hybrid CNN–LSTM provides the closest match to observations, capturing both variability and timing more accurately than the other models.

4.3. Sensitivity Analysis Results

To quantify the impact of uncertainty in pumping volume data on the prediction results, a sensitivity analysis was conducted. Three pumping input scenarios were constructed by introducing a systematic bias of ±20%. The results demonstrate that the CNN-LSTM model’s response to variations in pumping input exhibits a distinct asymmetry (Figure A2 and Table A1 in Appendix A). Under the baseline scenario, the model achieved its lowest errors (RMSE = 0.300 m, R = 0.786). When the pumping volume was overestimated by 20%, only a slight performance degradation was observed (RMSE = 0.308 m, R = 0.788). However, a clearly larger performance decline occurred when the pumping volume was underestimated by 20%, with the RMSE increasing to 0.326 m, an increase of 8.6%. It is noteworthy that the correlation coefficient (R) between predicted and observed values remained relatively stable across all scenarios. This indicates that the model’s ability to capture the trend of water level dynamics is highly robust, yet systematic bias in the input data can directly lead to a systematic offset in the prediction results. This asymmetric response pattern is not a model flaw; rather, it points to an underlying hydrogeological mechanism, namely a potential intrinsic nonlinearity in the response of groundwater levels to pumping in the study area. The specific mechanism is discussed in Section 5.2.
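The asymmetry can be read directly from Table A1. The sketch below recomputes the relative error changes from the tabulated values; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Values copied from Table A1 (Appendix A).
table_a1 = {
    "baseline":       {"MAE": 0.2350, "RMSE": 0.3002, "R": 0.7856},
    "overestimated":  {"MAE": 0.2390, "RMSE": 0.3078, "R": 0.7882},
    "underestimated": {"MAE": 0.2520, "RMSE": 0.3255, "R": 0.7781},
}

def pct_change(base, value):
    """Percentage change of value relative to base."""
    return (value - base) / base * 100.0

base = table_a1["baseline"]
changes = {
    scenario: {m: pct_change(base[m], vals[m]) for m in ("MAE", "RMSE")}
    for scenario, vals in table_a1.items()
    if scenario != "baseline"
}

for scenario, c in changes.items():
    print(f"{scenario:15s} MAE {c['MAE']:+.2f}%  RMSE {c['RMSE']:+.2f}%")

# Ratio of RMSE degradation: underestimation vs. overestimation.
asymmetry = changes["underestimated"]["RMSE"] / changes["overestimated"]["RMSE"]
print(f"RMSE asymmetry ratio: {asymmetry:.1f}")
```

With the four-decimal values in Table A1, the RMSE increase under underestimation is about 8.4%; with the rounded values quoted in the text (0.300 m and 0.326 m) it is about 8.7%. Either way, the degradation is roughly three times larger than under overestimation (about 2.5%).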

5. Discussion

5.1. The Performance Superiority of the CNN-LSTM Model and Its Spatiotemporal Synergistic Mechanism

Conventional time series models—such as ARIMA, RNN, and LSTM—exhibit notable limitations in simulating groundwater level dynamics in the Tailan River Basin, primarily stemming from a mismatch between their model architectures and the multivariate driving forces and hysteresis response mechanisms inherent to the groundwater system itself. More specifically, the ARIMA model, grounded in linear assumptions, fails to accurately represent the nonlinear dynamic responses of aquifers to extraction and recharge events, resulting in poor predictive performance. While RNN models possess certain capabilities in learning temporal dependencies, they struggle with the complex signals in downstream areas subject to intense human intervention due to issues such as gradient vanishing and an inability to extract multivariate coupled features. The LSTM model partially mitigates long-term dependency problems through gating mechanisms, outperforming RNNs in this regard. However, as its architecture is designed solely for single-point time series modeling, it cannot incorporate multivariate synergistic information. Consequently, LSTM can only infer based on local historical water level data and fails to respond promptly to drastic changes caused by regional high-intensity extraction events. This leads to a lagged response to abrupt water level changes induced by pumping and a frequent underestimation of the change magnitude.
In contrast, the hybrid CNN-LSTM model demonstrates comprehensive performance improvement. Its success can be attributed to its ability to handle both the multivariate coupled features and the temporal dependencies within the input data. The CNN component, through its convolutional operations, identifies synergistic patterns over short periods, such as the concurrent surge in pumping, increase in temperature, and variation in runoff during an irrigation event. This essentially provides the model with an early-warning signal of “intensifying regional pumping activity.” Building upon this, the LSTM component learns the long-term dependencies in the water level time series, capturing the aquifer’s hysteretic response to pumping events while filtering out high-frequency noise. This synergistic mechanism between feature extraction and temporal modeling enables the model to predict key hydrological events, such as sharp water level declines and subsequent recoveries, earlier and more accurately. Consequently, it maintains excellent predictive capability (NSE > 0.78) even in the intensively disturbed downstream area (well AJC-21).

5.2. Asymmetric Response to Pumping Input Uncertainty and Its Hydrogeological Mechanism

The key phenomenon revealed by the sensitivity analysis—that the model is far more sensitive to an underestimation of pumping volume than to an overestimation—is not a model defect to be circumvented. On the contrary, it is a valuable data signal hinting at the underlying hydrogeological mechanisms. We hypothesize that this asymmetry primarily stems from the hysteresis effect of vadose zone drainage and the nonlinear characteristics of the specific yield.
In the downstream area of the piedmont alluvial–proluvial fan, where fine-grained materials dominate, the significant thickness and lower permeability of the vadose zone lead to considerable time lag in drainage. This mechanism aligns with the significant phase delay of water levels in response to periodic disturbances observed in low-permeability aquifers [46]. When the model input pumping volume is underestimated (the −20% scenario), the model essentially underestimates the “stress” imposed on the aquifer. This results in an insufficient simulation of the water level decline magnitude and an overestimation of the aquifer’s short-term recovery capacity. Since the actual, stronger pumping is continuously draining pore water—a process unknown to the model—the predictions develop a systematic high bias (overestimation). It is noteworthy that recent studies indicate such drainage and deformation processes triggered by intense pumping are often non-equilibrium and hysteretic. For example, laboratory experiments reveal that the deformation of aquifer sands lags noticeably behind head changes, and compression may persist even during water level recovery under net pumping effects; numerical simulations further confirm that long-term, high-intensity pumping can lead to permanent loss of aquifer storage capacity [47,48]. On the other hand, the specific yield often exhibits a nonlinear increase as the water level declines. When pumping is underestimated, the model uses inappropriately small Sy values, which further leads to an underestimation of the water level change amplitude. Conversely, when the pumping volume is overestimated (+20% scenario), although the simulated “stress” is excessive, the simulated impact can be partially compensated for by natural recharge mechanisms—provided the simulated water level does not drop below a critical boundary (e.g., an aquitard). Consequently, the model performance degradation is relatively moderate. 
Furthermore, the heterogeneity of the piedmont alluvial–proluvial fan in the study area and the potential presence of local aquitards can exacerbate the asynchrony of water level responses in different zones, making the overall dynamics more complex. This finding underscores the critical importance of accurately quantifying pumping volumes in high-precision groundwater modeling. Particular attention must be paid to avoiding underestimation scenarios, as they introduce significant systematic prediction biases and management misjudgments. Notably, this phenomenon of asymmetric and hysteretic water level responses to pumping aligns with conclusions from several recent studies focusing on intensely stressed aquifer systems [49,50]. These studies, from perspectives such as hysteretic aquifer deformation, permanent storage loss, and the influence of heterogeneous structures, collectively demonstrate that the behavior of groundwater systems under strong pumping is fundamentally non-equilibrium. These mechanisms support the inference drawn from the sensitivity analysis in this study: that vadose zone hysteresis and the nonlinearity of specific yield are the core hydrogeological processes underlying the observed phenomena.

5.3. Dominant Role of Anthropogenic Pumping over Climatic Factors

The superior performance of the CNN-LSTM model provides a reliable means for identifying the primary driving factors controlling groundwater level dynamics in the Tailan River Basin. The prediction results indicate that intensive pumping during irrigation periods is the predominant factor controlling the sharp water level fluctuations (amplitude > 29.98 m) in the downstream area. The model accurately captured the high sensitivity of water levels to pumping events. A representative case occurred during the peak irrigation period in 2024, where sustained high-intensity pumping (>100 m³/d) directly triggered a rapid water level decline of 1.5 m at well AJC-21. In stark contrast, our analysis demonstrates that daily-scale water level fluctuations induced by all natural climatic factors (precipitation, evaporation, runoff, and air temperature) during the same period were less than 0.1 m. This difference of more than an order of magnitude (15-fold) definitively confirms, from a physical mechanism perspective, the overwhelming influence of anthropogenic intensity.
In contrast, the direct influence of climatic factors is comparatively weaker. No significant statistical correlation was observed between precipitation and water level dynamics, which is likely related to the great thickness and low vertical permeability of the vadose zone in the study area, resulting in low precipitation infiltration efficiency. Temperature primarily affects groundwater recharge indirectly by modulating snow and ice melt runoff in high mountain areas; however, its effect is considerably smaller—both temporally and in magnitude—than that of localized, intensive agricultural pumping. This “pumping-driven dominance with weak climatic response” pattern clearly indicates that in such arid river basins subject to high-intensity anthropogenic disturbance, the classical hydrological analysis paradigm—centered on natural climatic fluctuations—has become inadequate. Consequently, the core of groundwater resource management should prioritize the precise monitoring and spatial optimization of pumping activities, while climatic variables can be treated as a background term or utilized for long-term trend analysis.

5.4. Sources of Error

This study has certain limitations, the foremost of which stems from the scarcity of observational data. Although its aim was to provide a novel methodology for data-scarce arid regions, the objective reality of having only seven monitoring wells (2021–2024) in the Tailan River Basin, coupled with data gaps or anomalies in some wells, constrains the model’s performance in two key aspects: firstly, it limits the model’s capacity to depict the basin’s spatial heterogeneity in finer detail; secondly, it poses a challenge for predicting water level responses to long-term climatic fluctuations (e.g., multi-year droughts). Furthermore, the pumping volume, as a primary driving factor, was estimated from electricity consumption data. The insufficient spatiotemporal resolution of this dataset may have introduced unquantified systematic errors. Finally, as a purely data-driven model, its predictive capability relies on the representativeness of historical data, introducing uncertainty when projecting future extreme scenarios.

5.5. Management Implications and Sustainability Considerations

The CNN-LSTM model developed in this study has successfully achieved high-precision prediction of groundwater levels in the Tailan River Basin. Its value extends far beyond the methodology itself, providing profound scientific insights for understanding the core dynamics of the regional groundwater system and its sustainable management.
The prediction results and sensitivity analysis collectively reveal a definitive conclusion: anthropogenic pumping is the overwhelmingly dominant factor controlling regional groundwater level dynamics. For instance, during a representative event in 2024, sustained high-intensity pumping (>100 m³/d) directly triggered a rapid water level decline of 1.5 m at well AJC-21. In stark contrast, daily-scale water level fluctuations induced by all natural climatic factors (precipitation, evaporation, runoff, and air temperature) during the same period were less than 0.1 m. This difference of more than an order of magnitude (15-fold) definitively proves, from a physical mechanism perspective, that attempting to maintain water resource balance by “adapting to climatic fluctuations” is ineffective in this region. The management focus must, therefore, be resolutely fixed on the precise regulation and standardized management of human pumping activities.
The water level dynamic pattern revealed by this study, with high fluctuation (annual amplitude of 29.98 m) and strong negative skewness (sharp declines followed by slow recovery), is a classic signature of an aquifer under intense non-equilibrium conditions, commonly observed in arid regions worldwide. This dynamic signal clearly indicates that the current pumping rate vastly exceeds the aquifer’s long-term sustainable recharge capacity, rendering the present extraction regime unsustainable. If the current intensity is maintained, the model-predicted decline in water level will continue, triggering a cascade of ecological and economic risks, including vadose zone thickening, vegetation degradation, soaring irrigation costs, and, ultimately, water resource depletion.
The high-precision CNN-LSTM prediction model developed in this study can be employed to simulate and evaluate the effects of different management scenarios, such as pumping prohibition, restriction, or seasonal extraction. It thereby provides indispensable intelligent decision support for exploring and formulating a restoration pathway toward the sustainable use of groundwater resources. To further enhance the practical value of this predictive framework in spatial management and planning, its deep integration with Geographic Information Systems (GIS) and remote sensing technology represents a critical direction for future development. The CNN-LSTM model has demonstrated its capability in processing complex temporal signals. By integrating its core architecture with spatial data layers, a leap from “point” to “area” prediction can be achieved. For instance, incorporating raster data such as satellite-derived evapotranspiration, vegetation indices, and land use can directly quantify the spatial variability in ecosystem water consumption. Utilizing the spatial interpolation and analysis functions of GIS can upscale limited point observations into continuous driving fields, thereby supporting basin-scale simulation of groundwater level distribution. The fusion of such data-driven models with geospatial information would not only significantly enhance the model’s capacity to characterize regional heterogeneity but also provide dynamic, visualized decision support for refined management tasks—such as spatial water allocation, over-exploitation zone delineation, and restoration assessment—ultimately propelling water resources management toward a new, truly “spatially intelligent” paradigm.

6. Conclusions and Outlook

This study addressed the challenges posed by intensive anthropogenic disturbances and multi-driver natural processes affecting groundwater systems in arid regions by innovatively developing a hybrid model coupling Convolutional Neural Networks and Long Short-Term Memory networks (CNN-LSTM). Applied to the case study of the Tailan River Basin in Xinjiang, the model achieved high-precision groundwater level prediction through multi-source data fusion. The main conclusions are as follows:
(1) The CNN-LSTM model demonstrates clear superiority over both traditional time series models (e.g., ARIMA) and standalone neural network models (including RNN and LSTM) in terms of predictive accuracy and stability. This performance improvement is primarily attributed to the adopted synergistic mechanism between feature extraction and temporal modeling: the CNN module effectively extracts multivariate coupled features, while the LSTM module accurately describes the nonlinear hysteretic response behaviors of the aquifer to recharge and pumping events. The model exhibits consistently sound predictive performance across sites with different hydrogeological backgrounds and dynamic characteristics—from upstream areas dominated by natural recharge–discharge to downstream regions strongly affected by intensive pumping. This indicates its strong effectiveness and adaptability to diverse conditions in handling the complex multivariate driver-response processes in arid regions.
(2) Sensitivity analysis also revealed that model performance responds asymmetrically to errors in pumping data, with underestimations of extraction rates producing more severe distortion than overestimations. This asymmetry is not an artifact of the model, but rather corresponds to physical processes such as slow drainage in the vadose zone and nonlinear aquifer storage release. These findings emphasize that accurately quantifying pumping rates is essential in arid-region groundwater modeling, as underestimation can lead to significant predictive bias and ultimately misinform water management planning.
(3) The high-precision predictions provided a reliable basis for identifying the primary driving factors controlling groundwater level dynamics. The study conclusively demonstrated that human exploitation is the absolute dominant factor controlling groundwater level fluctuations in the Tailan River Basin, with exploitation-induced water level changes (>1.5 m) exceeding the effects of climatic factors (<0.1 m) by an order of magnitude. This finding indicates that in such intensively human-disturbed arid basins, the traditional climate-centric hydrologic analysis paradigm has become inadequate. The core of groundwater resource management must consequently shift from “adapting to climate fluctuations” to “regulating human activities.”
Based on these conclusions, the successful application of this study not only provides a novel predictive tool for data-scarce arid regions but, more importantly, offers a new paradigm for analyzing the mechanisms of complex systems through data-driven modeling. The developed high-precision model can serve as a “digital sandbox” for evaluating the effectiveness of various management scenarios (e.g., exploitation prohibition, restriction measures), thereby providing quantifiable decision-making support for the efficient management and sustainable utilization of groundwater resources.
Looking ahead, this work can be extended in four directions. First, measured extraction data with higher precision and spatiotemporal resolution should be integrated, for example by developing pumping inversion methods based on the assimilation of remote sensing and smart meter data, to reduce input uncertainty. Second, model interpretability should be enhanced and physical mechanisms incorporated, for instance by coupling the model with groundwater numerical models, to improve the physical consistency of predictions. Third, the model should be integrated more closely with Geographic Information Systems (GIS) and remote sensing products, leveraging spatially distributed data to enhance regional-scale simulation capabilities and thereby support the optimization of monitoring networks and spatial planning. Fourth, the model should be applied in decision-support scenarios such as ecological water level management and the assessment of restoration effectiveness in over-exploited areas, to provide an intelligent tool for achieving sustainable water resource utilization.

Author Contributions

Conceptualization, Shuting Hu and Mingliang Du; Methodology, Shuting Hu; Software, Shuting Hu; Validation, Shuting Hu, Mingliang Du and Jiayun Yang; Investigation, Jiayun Yang, Yankun Liu, Ziyun Tuo and Xiaofei Ma; Resources, Jiayun Yang, Yankun Liu, Ziyun Tuo and Xiaofei Ma; Data Curation, Shuting Hu and Mingliang Du; Writing—original draft, Shuting Hu; Writing—review & editing, Shuting Hu, Mingliang Du and Yankun Liu; Supervision, Mingliang Du; Project administration, Mingliang Du. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Xinjiang Uygur Autonomous Region Major Science and Technology Special Project, grant number 2024A03007-3. The APC was funded by the same funder.

Data Availability Statement

The groundwater data analyzed in this study were obtained from a collaborative research project and are not publicly available due to privacy and confidentiality regulations concerning water resource management in Xinjiang. Reasonable requests for access to the data for verification or collaborative research purposes can be directed to the first author via email at: czswjhst@163.com.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Figure A1. Comparison of model performance for three wells using (a) MAE, (b) RMSE, and (c) NSE metrics. The hybrid CNN–LSTM model achieves the best overall performance, especially in strongly pumped downstream areas.
Table A1. Performance Evaluation of Water Level Predictions under Different Pumping Scenarios.

| Pumping Rate | MAE (m) | RMSE (m) | R |
|---|---|---|---|
| Normal | 0.2350 | 0.3002 | 0.7856 |
| Overestimated | 0.2390 | 0.3078 | 0.7882 |
| Underestimated | 0.2520 | 0.3255 | 0.7781 |

Figure A2. Groundwater predictions under baseline, +20% overestimation, and −20% underestimation pumping scenarios. The model shows asymmetric sensitivity, with underestimation of pumping leading to notably larger prediction errors.

References

1. AbuKaraki, A.; Alrawashdeh, T.; Abusaleh, S.; Alksasbeh, M.Z.; Alqudah, B.; Alemerien, K.; Alshamaseen, H. Pulmonary Edema and Pleural Effusion Detection Using EfficientNet-V1-B4 Architecture and AdamW Optimizer from Chest X-Rays Images. CMC-Comput. Mater. Contin. 2024, 80, 1055–1073.
2. Afzaal, H.; Farooque, A.A.; Abbas, F.; Acharya, B.; Esau, T. Groundwater estimation from major physical hydrology components using artificial neural networks and deep learning. Water 2019, 12, 5.
3. Ali, A.S.A.; Ebrahimi, S.; Ashiq, M.M.; Alasta, M.S.; Azari, B. CNN-Bi LSTM neural network for simulating groundwater level. Environ. Eng. 2022, 8, 1–7.
4. Bandara, K.; Bergmeir, C.; Smyl, S. Forecasting across time series databases using recurrent neural networks on groups of similar series: A clustering approach. Expert Syst. Appl. 2020, 140, 112896–112929.
5. Barua, S.; Cartwright, I.; Dresel, P.E.; Daly, E. Using multiple methods to investigate the effects of land-use changes on groundwater recharge in a semi-arid area. Hydrol. Earth Syst. Sci. 2021, 25, 89–104.
6. Bowes, B.D.; Sadler, J.M.; Morsy, M.M.; Behl, M.; Goodall, J.L. Forecasting groundwater table in a flood prone coastal city with long short-term memory and recurrent neural networks. Water 2019, 11, 1098.
7. Brookfield, A.E.; Zipper, S.; Kendall, A.D.; Ajami, H.; Deines, J.M. Estimating Groundwater Pumping for Irrigation: A Method Comparison. Groundwater 2024, 62, 15–33.
8. Chang, Y.; Lv, L.; Wang, X.; Xie, R.; Wang, Y. The influence of withdrawal-recharging pattern on the deformation characteristics of sand in confined aquifer. Heliyon 2024, 10, e35773.
9. Chicco, D.; Warrens, M.J.; Jurman, G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput. Sci. 2021, 7, e623.
10. Chu, H.; Bian, J.; Lang, Q.; Sun, X.; Wang, Z. Daily Groundwater Level Prediction and Uncertainty Using LSTM Coupled with PMI and Bootstrap Incorporating Teleconnection Patterns Information. Sustainability 2022, 14, 11598.
11. de Graaf, I.E.; Gleeson, T.; Van Beek, L.; Sutanudjaja, E.H.; Bierkens, M.F. Environmental flow limits to global groundwater pumping. Nature 2019, 574, 90–94.
12. Di Salvo, C. Improving Results of Existing Groundwater Numerical Models Using Machine Learning Techniques: A Review. Water 2022, 14, 2307.
13. Fang, W.; Zhang, F.; Ding, Y.; Sheng, J. A New Sequential Image Prediction Method Based on LSTM and DCGAN. Comput. Mater. Contin. 2020, 64, 217–231.
14. Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to forget: Continual prediction with LSTM. Neural Comput. 2000, 12, 2451–2471.
15. Graves, A. Long short-term memory. In Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012; pp. 37–45.
16. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
17. Huang, J.; Ge, Y.; Li, S. Mixed-Unit-Model-Based and Quantitative Studies on Groundwater Recharging and Discharging between Aquifers of Aksu River. Sustainability 2022, 14, 6936.
18. Kemeth, F.P.; Bertalan, T.; Evangelou, N.; Cui, T.; Malani, S.; Kevrekidis, I.G. Initializing LSTM internal states via manifold learning. Chaos 2021, 31, 093111.
19. Kerebih, M.S.; Keshari, A.K. Prediction of groundwater level using artificial neural network as an alternative approach: A comparison assessment with numerical groundwater flow model. Hydrol. Sci. J. 2024, 69, 1691–1701.
20. Khan, J.; Lee, E.; Balobaid, A.S.; Kim, K. A comprehensive review of conventional, machine leaning, and deep learning models for groundwater level (GWL) forecasting. Appl. Sci. 2023, 13, 2743.
21. Kow, P.-Y.; Liou, J.-Y.; Sun, W.; Chang, L.-C.; Chang, F.-J. Watershed groundwater level multistep ahead forecasts by fusing convolutional-based autoencoder and LSTM models. J. Environ. Manag. 2024, 351, 119789.
22. Längkvist, M.; Karlsson, L.; Loutfi, A. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognit. Lett. 2014, 42, 11–24.
23. Le, Q.V.; Jaitly, N.; Hinton, G.E. A simple way to initialize recurrent networks of rectified linear units. arXiv 2015, arXiv:1504.00941.
24. Liu, M.T.; Chen, X.K.; Wang, G.H.; Zhang, H.; Zhang, M.X.; Yan, T.Z. Short-Term Prediction of Groundwater Level Based on Spatiotemporal Correlation. Water Resour. 2024, 51, 207–220.
25. Marçais, J.; de Dreuzy, J.-R. Prospective interest of deep learning for hydrological inference. Groundwater 2017, 55, 688–692.
26. Pradhan, A.; Adams, K.H.; Chandrasekaran, V.; Liu, Z.; Reager, J.T.; Stuart, A.M.; Turmon, M.J. Modeling groundwater levels in California’s Central Valley by hierarchical Gaussian process and neural network regression. J. Geophys. Res. Mach. Learn. Comput. 2024, 1, e2024JH000322.
27. Mohammed, A.; Kora, R. A comprehensive review on ensemble deep learning: Opportunities and challenges. J. King Saud Univ.-Comput. Inf. Sci. 2023, 35, 757–774.
28. Nourani, V.; Khodkar, K.; Gebremichael, M. Uncertainty assessment of LSTM based groundwater level predictions. Hydrol. Sci. J. 2022, 67, 773–790.
29. Pan, M.; Zhou, H.; Cao, J.; Liu, Y.; Hao, J.; Li, S.; Chen, C.-H. Water level prediction model based on GRU and CNN. IEEE Access 2020, 8, 60090–60100.
30. Panahi, M.; Sadhasivam, N.; Pourghasemi, H.R.; Rezaie, F.; Lee, S. Spatial prediction of groundwater potential mapping based on convolutional neural network (CNN) and support vector regression (SVR). J. Hydrol. 2020, 588, 125033.
31. Retike, I.; Bikse, J.; Kalvans, A.; Delina, A.; Avotniece, Z.; Zaadnoordijk, W.J.; Jemeljanova, M.; Popovs, K.; Babre, A.; Zelenkevics, A.; et al. Rescue of groundwater level time series: How to visually identify and treat errors. J. Hydrol. 2022, 605, 127294–127308.
32. Rezaie-balf, M.; Naganna, S.R.; Ghaemi, A.; Deka, P.C. Wavelet coupled MARS and M5 Model Tree approaches for groundwater level forecasting. J. Hydrol. 2017, 553, 356–373.
33. Roy, D.K.; Biswas, S.K.; Mattar, M.A.; El-Shafei, A.A.; Murad, K.F.I.; Saha, K.K.; Datta, B.; Dewidar, A.Z. Groundwater level prediction using a multiple objective genetic algorithm-grey relational analysis based weighted ensemble of ANFIS models. Water 2021, 13, 3130.
34. Rygus, M.; Bianchi, M.; Novellino, A.; Hussain, E.; Taufiq, A.; Rusli, S.R.; Sarah, D.; Meisina, C. Permanent aquifer storage loss from long-term groundwater withdrawal: A case study of subsidence in Bandung (Indonesia). J. Hydrol.-Reg. Stud. 2025, 57, 102129.
35. Saad, B.; El-Sehiemy, R.A.; Hasanien, H.M.; El-Dabah, M.A. Robust parameter estimation of proton exchange membrane fuel cell using Huber loss statistical function. Energy Convers. Manag. 2025, 323, 119231.
36. Salehi, A.W.; Khan, S.; Gupta, G.; Alabduallah, B.I.; Almjally, A.; Alsolai, H.; Siddiqui, T.; Mellit, A. A study of CNN and transfer learning in medical imaging: Advantages, challenges, future scope. Sustainability 2023, 15, 5930.
37. Sun, K.; Hu, L.; Sun, J.; Cao, X. Enhancing groundwater level prediction accuracy at a daily scale through combined machine learning and physics-based modeling. J. Hydrol.-Reg. Stud. 2023, 50, 101577.
38. Tatsunami, Y.; Taki, M. Sequencer: Deep lstm for image classification. Adv. Neural Inf. Process. Syst. 2022, 35, 38204–38217.
39. Teimoori, S.; Olya, M.H.; Miller, C.J. Groundwater level monitoring network design with machine learning methods. J. Hydrol. 2023, 625, 130145–130177.
40. Valadkhan, D.; Moghaddasi, R.; Mohammadinejad, A. Groundwater quality prediction based on LSTM RNN: An Iranian experience. Int. J. Environ. Sci. Technol. 2022, 19, 11397–11408.
  41. Vu, M.; Jardani, A.; Massei, N.; Fournier, M. Reconstruction of missing groundwater level data by using Long Short-Term Memory (LSTM) deep neural network. J. Hydrol. 2021, 597, 125776–125808. [Google Scholar] [CrossRef]
  42. Wunsch, A.; Liesch, T.; Broda, S. Groundwater level forecasting with artificial neural networks: A comparison of long short-term memory (LSTM), convolutional neural networks (CNNs), and non-linear autoregressive networks with exogenous input (NARX). Hydrol. Earth Syst. Sci. 2021, 25, 1671–1687. [Google Scholar] [CrossRef]
  43. Xing, Y.; Liu, Q.; Hu, R.; Gu, H.; Taherdangkoo, R.; Ptak, T. Global sensitivity analysis of water level response to harmonic aquifer disturbances through a Monte-Carlo based surrogate model with random forest algorithm. J. Hydrol. 2024, 641, 131775. [Google Scholar] [CrossRef]
  44. Xing, Y.; Liu, Q.; Hu, R.; Gu, H.; Taherdangkoo, R.; Yang, H.; Ptak, T. A general numerical model for water level response to harmonic disturbances in aquifers considering wellbore effects. J. Hydrol. 2022, 609, 127678. [Google Scholar] [CrossRef]
  45. Yang, C.-F.; Chi, W.-C.; Ke, C.-C.; Lin, C.-J. Application of seismically derived tilt signals to characterize groundwater flow regimes: An example from a constant-rate pumping test in Taiwan. J. Hydrol. 2024, 645, 132188. [Google Scholar] [CrossRef]
  46. Yang, J.; Rajanayaka, C.; Daughney, C.J.; Booker, D.; Morris, R.; Thompson, M. Metamodelling of Naturalised Groundwater Levels at a Regional Level in New Zealand. Sustainability 2023, 15, 13393. [Google Scholar] [CrossRef]
  47. Yang, X.; Zhang, Z. A CNN-LSTM model based on a meta-learning algorithm to predict groundwater level in the middle and lower reaches of the Heihe River, China. Water 2022, 14, 2377. [Google Scholar] [CrossRef]
  48. Zhang, J.; Zhu, Y.; Zhang, X.; Ye, M.; Yang, J. Developing a Long Short-Term Memory (LSTM) based model for predicting water table depth in agricultural areas. J. Hydrol. 2018, 561, 918–929. [Google Scholar] [CrossRef]
  49. Zhao, Y.; Yang, L.; Pan, H.; Li, Y.; Shao, Y.; Li, J.; Xie, X. Spatio-temporal prediction of groundwater vulnerability based on CNN-LSTM model with self-attention mechanism: A case study in Hetao Plain, northern China. J. Environ. Sci. 2025, 153, 128–142. [Google Scholar] [CrossRef]
  50. Zhou, P.; Xie, X.; Lin, Z.; Yan, S. Towards Understanding Convergence and Generalization of AdamW. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 6486–6493. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Geographic location of the Tailan River Basin in Xinjiang, China, showing basin boundaries, rivers, and monitoring wells. This map provides the spatial context for interpreting groundwater variations in the study area. The red area in the inset map indicates the location of the study basin within China.
Figure 2. Hydrogeological zoning map of the Tailan River Basin. The basin is divided into upstream recharge areas, midstream transition zones, and downstream pumping regions based on sediment facies and aquifer conditions, supporting the identification of representative wells.
Figure 3. Distribution of input variables used for groundwater prediction, including (a) pumping, (b) evaporation, (c) runoff, (d) precipitation, and (e) temperature.
Figure 4. Statistical distribution of normalized groundwater levels across monitoring wells. (a) Boxplots where the color intensity of each box is proportional to its interquartile range (IQR), with darker colors indicating more concentrated data (smaller IQR) and lighter colors indicating greater dispersion (larger IQR). (b) Histograms of representative wells. The results reveal spatial differences in groundwater variability, with downstream wells generally showing larger IQRs (lighter-colored boxes), greater fluctuations, and stronger negative skewness.
Figure 5. Structure of the convolutional neural network (CNN) module for extracting short-term coupled features from multivariate inputs. Convolution and pooling layers progressively capture localized temporal patterns relevant to groundwater dynamics.
Figure 6. Structure of the long short-term memory (LSTM) module. In the schematic, the plus sign (+) denotes vector addition (e.g., for updating the cell state), and the cross sign (×) denotes element-wise multiplication (e.g., for gating operations).
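As a rough illustration of the two operations named in the Figure 6 caption, the sketch below runs a single LSTM step for a scalar input and a scalar state. It assumes the standard LSTM gate equations; the function name `lstm_step`, the gate labels, and the weight layout are hypothetical and are not taken from the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One LSTM step for scalar input x and scalar states h_prev, c_prev.

    `weights` (hypothetical layout) maps each gate name to a tuple
    (w_x, w_h, b) of input weight, recurrent weight and bias.
    """
    def pre_activation(name):
        w_x, w_h, b = weights[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))        # forget gate
    i = sigmoid(pre_activation("input"))         # input gate
    o = sigmoid(pre_activation("output"))        # output gate
    g = math.tanh(pre_activation("candidate"))   # candidate cell value

    # Element-wise multiplication (×) applies the gates;
    # addition (+) updates the cell state.
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# With all weights and biases at zero, every gate equals 0.5,
# so the previous cell state is simply halved.
zero = {name: (0.0, 0.0, 0.0)
        for name in ("forget", "input", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, weights=zero)
```

In a real network the states are vectors and the weights are matrices, but the gating pattern is the same.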
Figure 7. Overall architecture of the hybrid CNN–LSTM model.
Figure 8. Training and validation loss curves for the representative well AJC-7. The shaded green area highlights the final 50 training epochs, indicating the phase where the model loss exhibited a stable, descending trend, suggesting convergence.
Figure 9. Comparison between predicted and observed groundwater levels for four models: The ARIMA model (a) exhibits phase lag, while RNN (b) shows noise. LSTM (c) predictions are smoother but lag behind sharp declines, whereas the CNN–LSTM model (d) demonstrates the closest agreement with observations.
Table 1. Statistical Description Table of Groundwater Levels in Different Zones (unit: meters).
| Subzone | Mean Value | Maximum | Minimum | Standard Deviation | Skewness Coefficient |
| --- | --- | --- | --- | --- | --- |
| AJC-7 | 1177.802 | 1181.92 | 1177.1 | 0.6327 | 5.010 |
| AJC-12 | 1156.981 | 1160.6 | 1136.28 | 2.5596 | −3.400 |
| AJC-14 | 1145.525 | 1146.47 | 1119.92 | 0.9655 | −15.838 |
| AJC-17 | 1097.853 | 1107.44 | 1077.46 | 2.0268 | −5.824 |
| AJC-18 | 1103.966 | 1107.86 | 1103.17 | 1.0810 | 3.156 |
| AJC-21 | 1064.299 | 1067.93 | 1037.95 | 2.2066 | −11.499 |
(Note: Well AJC-24 is excluded from the table because its data were completely identical to those of Well AJC-7, which was attributed to a recording error.)
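The statistics reported in Table 1 can be computed along the following lines. This is a minimal sketch, not the authors' procedure: the source does not say which estimators were used, so the sample standard deviation (n − 1 denominator) and the adjusted Fisher-Pearson skewness coefficient are assumptions, and the function name `describe` and the example values are hypothetical.

```python
import math


def describe(levels):
    """Return mean, max, min, standard deviation and skewness of a series."""
    n = len(levels)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(levels) / n
    # Assumption: sample standard deviation with n - 1 in the denominator.
    variance = sum((x - mean) ** 2 for x in levels) / (n - 1)
    std = math.sqrt(variance)
    if std == 0.0:
        skew = float("nan")  # skewness is undefined for a constant series
    else:
        # Assumption: adjusted Fisher-Pearson skewness coefficient.
        third = sum(((x - mean) / std) ** 3 for x in levels)
        skew = n / ((n - 1) * (n - 2)) * third
    return {"mean": mean, "max": max(levels), "min": min(levels),
            "std": std, "skew": skew}


# Hypothetical series: mostly stable levels with one deep drawdown,
# which pulls the skewness below zero.
stats = describe([10.0, 10.0, 10.0, 10.0, 2.0])
```

A single deep excursion below an otherwise stable level produces strongly negative skewness, which matches the pattern in Table 1 for wells such as AJC-14, where the minimum (1119.92 m) lies far below the mean (1145.525 m).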
Table 2. Configuration of Key Model Hyperparameters.
| Hyperparameter | Set Value | Hyperparameter | Set Value |
| --- | --- | --- | --- |
| Input window length | 60 | Number of CNN filters | 64, 32 |
| Kernel size | 5, 3 | Number of LSTM units | 64 |
| Network depth | 2 | Optimizer | AdamW |
| Initial learning rate | 1 × 10⁻⁴ | Weight decay | 1 × 10⁻⁵ |
| Batch size | 32 | Loss function | Huber Loss |
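Two of the settings in Table 2, the input window length of 60 and the Huber loss, can be illustrated with a short sketch. It is not the authors' code: the one-step-ahead target, the univariate series, and the threshold `delta = 1.0` are assumptions (the source does not report the Huber threshold), and the function names `make_windows` and `huber_loss` are hypothetical.

```python
def make_windows(series, window=60):
    """Split a series into (input window, next value) training pairs.

    Assumption: each window of `window` consecutive values is used to
    predict the single value that follows it.
    """
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]


def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small residuals, linear for large ones.

    Assumption: delta = 1.0; the threshold used in the study is not stated.
    """
    total = 0.0
    for target, prediction in zip(y_true, y_pred):
        residual = abs(target - prediction)
        if residual <= delta:
            total += 0.5 * residual ** 2
        else:
            total += delta * (residual - 0.5 * delta)
    return total / len(y_true)


# Hypothetical series of 65 values yields 5 windows of length 60.
pairs = make_windows(list(range(65)), window=60)

# A residual of 0.5 is penalised quadratically (0.125),
# a residual of 3.0 linearly (2.5); the mean is 1.3125.
loss = huber_loss([0.0, 0.0], [0.5, 3.0])
```

Because large residuals are penalised linearly rather than quadratically, the Huber loss is less sensitive than the mean squared error to abrupt drawdowns.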
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hu, S.; Du, M.; Yang, J.; Liu, Y.; Tuo, Z.; Ma, X. Application of a Hybrid CNN-LSTM Model for Groundwater Level Forecasting in Arid Regions: A Case Study from the Tailan River Basin. ISPRS Int. J. Geo-Inf. 2026, 15, 6. https://doi.org/10.3390/ijgi15010006

AMA Style

Hu S, Du M, Yang J, Liu Y, Tuo Z, Ma X. Application of a Hybrid CNN-LSTM Model for Groundwater Level Forecasting in Arid Regions: A Case Study from the Tailan River Basin. ISPRS International Journal of Geo-Information. 2026; 15(1):6. https://doi.org/10.3390/ijgi15010006

Chicago/Turabian Style

Hu, Shuting, Mingliang Du, Jiayun Yang, Yankun Liu, Ziyun Tuo, and Xiaofei Ma. 2026. "Application of a Hybrid CNN-LSTM Model for Groundwater Level Forecasting in Arid Regions: A Case Study from the Tailan River Basin" ISPRS International Journal of Geo-Information 15, no. 1: 6. https://doi.org/10.3390/ijgi15010006

APA Style

Hu, S., Du, M., Yang, J., Liu, Y., Tuo, Z., & Ma, X. (2026). Application of a Hybrid CNN-LSTM Model for Groundwater Level Forecasting in Arid Regions: A Case Study from the Tailan River Basin. ISPRS International Journal of Geo-Information, 15(1), 6. https://doi.org/10.3390/ijgi15010006

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
