1. Introduction
Water is an indispensable resource for human beings, and groundwater is one of the most important freshwater resources stored in geological structures [1,2,3]. With rapid urban population and economic growth in the eastern coastal areas of China, the demand for water resources is increasing, and the intensity of groundwater extraction is gradually rising. However, excessive exploitation of groundwater not only lowers the groundwater level, but also leads to water pollution [4], land subsidence [5,6], seawater intrusion [7,8,9], and other environmental hazards. In the eastern coastal areas of China, groundwater extraction reached its peak in the early 1990s, resulting in a continuous decline in groundwater level in these areas and the continuous development of geological disasters such as land subsidence and ground fissures. According to official groundwater monitoring reports, by the end of 2014, the cumulative land subsidence in the coastal areas of Jiangsu Province was more than 200 mm, and the area of funnel formation was nearly 14,000 km². The largest settlement center is located at Dafeng Haifeng Farm in Yancheng City, where the cumulative land subsidence is more than 700 mm.
Compared with surface water systems, the internal mechanisms of groundwater systems are more complex. Changes in water quantity and the migration of groundwater cannot be observed directly, and the geological damage caused by groundwater over-extraction develops slowly; once it has accumulated to a certain extent, it becomes irreversible. Therefore, using groundwater monitoring data to predict the dynamic change in groundwater level in a timely and accurate way, and to analyze its dynamic characteristics, is of great significance for groundwater exploitation and for the effective and sustainable management of water resources [10,11,12,13].
According to a review of the literature, dynamic prediction of groundwater levels is usually performed using either deterministic or stochastic models [14,15,16,17,18,19]. Deterministic models solve numerical model equations for known data, but they have high data requirements and high costs, and in practice inaccurate data and limited hydrogeological parameters make classical numerical models more uncertain [20]. At the same time, changes in groundwater level are nonlinear and affected by multiple factors, such as precipitation, geological conditions, surface recharge, and human activities, which makes the prediction of groundwater very complicated. Therefore, it is necessary to establish a model reflecting the dynamic change law of groundwater levels by combining statistical theories [21,22]. As a natural phenomenon, the variations in groundwater tables are spatially continuous [23,24,25,26]. Therefore, the prediction of groundwater level changes should take into account both spatial and temporal factors [27].
With the rapid development of artificial intelligence computing, data-driven methods such as the artificial neural network (ANN) [28], support vector regression (SVR) [29], wavelet transform model [30,31], and extreme learning machine (ELM) [32,33] have been widely used in groundwater level prediction, and also provide a new method for modeling spatiotemporal series to solve complex nonlinear problems. Daliakopoulos investigated the performance of different neural networks in groundwater prediction and determined the optimal neural network structure to simulate the decreasing trend of groundwater level and predict the groundwater level over the next 18 months [34]. Lallahem proposed an artificial-neural-network-based approach using minimum lag and the number of hidden nodes to simulate the effective parameters and data in the groundwater level to generate the best performing simulation model [35]. Taormina and Mohammadi trained ANN models on limited groundwater data to simulate and predict groundwater levels [36,37]. On the basis of principal component analysis, Sun combined the phase space reconstructed by chaos theory with a Back Propagation (BP) neural network, established the BP neural network model based on chaos theory, and predicted the groundwater level of Heihu Spring in Jinan [38]. Raj used artificial neural networks to predict rainfall and groundwater table depth [39]. Crespo [40] used the spatiotemporal autoregressive and moving average (STARMA) model for the short-term prediction of compressed sequence images, and the results showed that the STARMA model has good short-term prediction capability. Stroud [41] proposed a state-space framework for non-stationary spatiotemporal data and used tropical rainfall to demonstrate that the state-space model can handle non-stationary spatial processes and spatiotemporal correlations. Building on these premises, the current study set out to investigate space–time prediction using BP and STARMA models. In various studies, the STARMA model has been applied to rainfall forecasting [42,43], ecological management [44], temperature forecasting [45,46], and groundwater forecasting [47,48].
Groundwater levels are affected by a variety of factors, such as precipitation, hydrological conditions, surface recharge, and human activities, so it is difficult to predict dynamic groundwater levels [49,50]. Groundwater level monitoring data are typical spatiotemporal series that are discrete in time and continuous in space [51]. The spatial distribution of aquifers is often continuous, and the spatial structure of the time series changes slowly as the environment evolves, so the data form a spatially non-stationary series to which the difference method is not applicable. Methods that have been used to simulate and model such non-stationary spatiotemporal data include Bayesian models [52], state-space models [53], and Kalman filtering methods [54], but most of these models are only applicable to specific areas. Therefore, a more general spatiotemporal modeling method suited to the nature of these series is needed. The STARMA model is well suited to stationary spatiotemporal series that are discrete in both time and space, but in practice most spatiotemporal series are non-stationary. Martin proposed using the difference method to transform non-stationary series into stationary ones and then applying the STARMA model, but the difference method can only handle temporal non-stationarity, not spatial non-stationarity. The BP neural network model provides a way to implement and train multi-layer neural networks and is good at solving nonlinear problems. Because the groundwater level is affected by many factors such as climate and human activities and changes nonlinearly, the BP neural network model can effectively predict it. However, the prediction of the groundwater table should consider not only the temporal distribution but also the spatial heterogeneity. The STARMA model is a spatiotemporal model that accounts for the distribution of groundwater level data in both time and space and therefore better captures the spatiotemporal variation trend of the groundwater level. This study focuses on the spatiotemporal series analysis of groundwater monitoring data and the interaction between the three models. The BP neural network model and the STARMA model are combined to simulate and predict the dynamic groundwater level, with the aim of improving the prediction accuracy of the groundwater level.
3. Results and Discussion
3.1. Data Processing and Analysis
3.1.1. Monitoring Data Processing
The water level of confined aquifer III in the study area is higher in the east and lower in the west. In the eastern coastal area, the burial depth is about 8.0 m. As the aquifer extends to the west, the water level of confined aquifer III gradually decreases. The buried depth of the groundwater level in the western region is mostly greater than 20.0 m and reaches about 32.0 m in Fuyang County and the Yancheng urban area, which indicates that the groundwater in the study area has strong spatial heterogeneity. To analyze the dynamic variation characteristics of the groundwater level, the 2005 to 2014 monitoring data of the confined aquifer III monitoring wells were plotted as 'water level–time' curves at two scales, the whole study area and single monitoring wells, as shown in Figure 4. In Figure 4, the horizontal axis counts quarters cumulatively from the first quarter of 2005, and each point is a quarterly average water level. Figure 4a represents the change in the average groundwater level of the whole study area over time, and Figure 4b represents the change in the water level of single monitoring wells over time. Figure 4a shows that the groundwater level in the whole study area presents a downward trend; the blue line in Figure 4a is the trend line of water level change. Figure 4b shows that the water level curves of different monitoring wells exhibit a certain randomness in different periods; that is, the water level change within the range of a single monitoring well is partly random. Taken together, the water level variation in the study area is characterized by a definite declining trend at the global scale and random variation at the local scale.
Groundwater level monitoring data of confined aquifer III in the study area were selected from the original database, and the spatial distribution of the monitoring wells is shown in Figure 1c. Most monitoring wells are located in the western part of the study area, while fewer are located in the eastern coastal zone. For individual monitoring wells with missing water level values in certain years, the missing values were filled in using the ArcGIS spatial interpolation module or statistics of the existing data.
3.1.2. Data Analysis
Normal distribution test: This study used skewness and kurtosis coefficients and non-parametric methods to test the spatiotemporal data. The data from all monitoring wells were examined, and it was found that the monitoring wells numbered 51073006#, 51073509#, 51073512#, and 51074507# have a skewness or kurtosis greater than 1, as shown in Table 1, and their skewness statistic $u$ is greater than $U_{0.05} = 1.96$; these series are therefore tentatively considered not to conform to a normal distribution.
These four groups of data were re-tested using non-parametric tests in SPSS, and the results are shown in Table 2. Their p-values (Sig. (2-tailed)) are all less than 0.05; therefore, they do not conform to a normal distribution. In order not to affect the modeling accuracy, these 4 groups of monitoring data were removed from the dataset, and the monitoring data of the remaining 23 monitoring wells were selected as the experimental data.
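The skewness screening step can be illustrated with a short sketch. The paper does not give the formula behind the statistic $u$, so this example assumes the common definition: the bias-adjusted sample skewness divided by its standard error, compared with 1.96. The function names are hypothetical and not taken from the study.

```python
import math


def skewness_u_statistic(values):
    """Return (skewness, u) for one well's series.

    Assumption: u = |adjusted skewness| / standard error of skewness.
    The study's exact formula is not stated in the text.
    """
    n = len(values)
    if n < 3:
        raise ValueError("at least 3 observations are required")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("constant series has undefined skewness")
    g1 = m3 / m2 ** 1.5
    # Bias-adjusted (Fisher-Pearson) skewness and its standard error.
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    se = math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew, abs(skew) / se


def passes_skewness_check(values, critical=1.96):
    """True if the series is not flagged as non-normal by the u test."""
    _, u = skewness_u_statistic(values)
    return u <= critical
```

A series flagged by this check would then go on to the non-parametric test described above before being excluded.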
Test of temporal stationarity: Temporal stationarity was tested mainly through time series analysis. The average groundwater level of confined aquifer III in the study area from 2005 to 2014 was calculated, as shown in Table 3. A trend analysis of the time series in Table 3 was then made, as shown in Figure 4a, from which it can be seen that the average water level in the study area over the 40 periods shows a decreasing trend. Figure 5 shows that the temporal autocorrelation function only converges toward zero (into the area between the two grey bars in the graph) after a lag of 9 periods, indicating that the series is temporally correlated and non-stationary in time.
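A sample autocorrelation function of the kind discussed here can be computed as in the sketch below. The text does not define the grey bars in Figure 5, so the bounds of ±1.96/√n used here are an assumption (the usual approximate 95% limits for white noise), and the function names are illustrative only.

```python
import numpy as np


def sample_acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


def first_lag_inside_bounds(acf, n_obs):
    """First lag k >= 1 whose |r_k| lies within the assumed +/-1.96/sqrt(n) bounds."""
    bound = 1.96 / np.sqrt(n_obs)
    for k in range(1, len(acf)):
        if abs(acf[k]) <= bound:
            return k
    return None
```

For a series with a strong trend, the autocorrelations decay slowly and the first lag inside the bounds is large, which is the pattern the text interprets as temporal non-stationarity.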
Spatial variability analysis: The Kriging spatial interpolation method was used to obtain the elevation maps of the groundwater levels in the study area for the 10th, 20th, and 30th periods of the sample data and the 35th and 40th periods of the verification data. In Figure 6b (elevation maps of the groundwater levels for the 10th, 20th, and 30th periods), it can be seen that: first, there is a clear trend of decreasing water levels over the years; secondly, the minimum water level decreases year by year, and the maximum water level also decreases year by year; thirdly, the area of water levels in each class in the study area is also gradually increasing; fourthly, the water levels in the western part of the study area are clearly lower than those in the eastern part. Taken together, these observations indicate that the mean water levels in the study area are spatially heterogeneous in each period.
3.2. BP-STARMA Model Building
3.2.1. BP Neural Network to Extract Nonlinear Spatiotemporal Trends
The spatiotemporal stationarity tests described above showed that the spatiotemporal monitoring series of pore groundwater in the study area is non-stationary: it shows a decreasing trend across the whole study area and some randomness within the range of individual monitoring wells. Therefore, the BP neural network was used to extract the deterministic spatiotemporal trend values present in the groundwater of the study area.
In order to analyze the trend of groundwater level with seasonality, each quarterly data sample of water level monitoring was taken as one period and the data were the average of water level in each quarter; therefore, there were 40 periods (quarters) of data for each monitoring well in the study area for the 10 years from 2005 to 2014. In the BP neural network, the learning training data accounted for 70–80% of the total sample data and the validation data accounted for 20–30% of the total sample data, so the first 32 periods of all monitoring data of groundwater level in the study area were used as network learning training data and the last 8 periods were used as network validation data.
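The period bookkeeping described above (40 quarterly periods, the first 32 for training and the last 8 for validation) can be sketched as follows. The array layout, wells by periods, and the function names are assumptions made for illustration.

```python
import numpy as np


def split_periods(levels, n_train=32):
    """Split a (n_wells, n_periods) array chronologically into train/validation."""
    levels = np.asarray(levels, dtype=float)
    if not 0 < n_train < levels.shape[1]:
        raise ValueError("n_train must be between 1 and n_periods - 1")
    # No shuffling: validation periods must come after the training periods.
    return levels[:, :n_train], levels[:, n_train:]


def period_to_quarter(period, start_year=2005):
    """Map a 1-based period index to (year, quarter), period 1 = 2005 Q1."""
    if period < 1:
        raise ValueError("period index is 1-based")
    return start_year + (period - 1) // 4, (period - 1) % 4 + 1
```

With 40 periods, a 32/8 split gives 80% training and 20% validation data, which sits at the edge of the 70–80% and 20–30% ranges quoted in the text.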
The trained BP neural network was used to fit the nonlinear trend value of confined aquifer III in the study area, and the fitted average groundwater levels of the 10th, 20th, and 30th periods are shown in Figure 6. In the figure, green represents areas where the water level is deeply buried, while white represents areas where the water level is shallower.
3.2.2. STARMA Modeling
Given the hydrogeological conditions in the study area, the semi-variance function (Equation (2)) was used to analyze the spatial autocorrelation of the sample residuals, i.e., to determine whether a spatial correlation distance exists between the sample residuals after removing the temporal trend values. The spatial variation in the study area was isotropic; that is, from west to east, and the semi-variance function analysis was carried out on groundwater level values from an appropriate number of periods selected from the sample residuals, giving the results shown in Table 4. From the table, it can be seen that there is a spatial correlation distance in the residual series, which indicates that the sample residuals after removing the trend values are spatially correlated. Therefore, the STARMA model was used to model the residual series. The analysis results show that the residual series has relatively large partial sill values and relatively small nugget values, indicating that the residual series has a strong spatial correlation.
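An empirical semi-variance calculation of the kind used for this analysis can be sketched as below. Equation (2) is not reproduced in this section, so the sketch assumes the classical estimator, half the mean squared difference between all pairs of residuals whose separation falls in a distance bin. The function name and bin layout are hypothetical.

```python
import numpy as np


def empirical_semivariogram(coords, values, bin_edges):
    """Semi-variance per distance bin (NaN where a bin has no pairs).

    Assumes the classical estimator:
    gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)) over pairs in the bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # each pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        in_bin = (dist > bin_edges[b]) & (dist <= bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = 0.5 * sq_diff[in_bin].mean()
    return gamma
```

Fitting a variogram model to these binned values is what yields the nugget, partial sill, and range (correlation distance) reported in a table such as Table 4; that fitting step is not shown here.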
The least-squares method was used to estimate the parameters of Equation (3), and the resulting parameters and test values are shown in Table 5.
The "prob" column in the table gives the significance level of each t-statistic; all values are less than 0.05, indicating that the coefficients are statistically significant.
After the STARMA modeling of the residuals is completed, the model residuals need to be tested. If the mean of the spatiotemporal autocorrelation coefficients of the residuals is close to 0 and their variance is close to $[N(T - s)]^{-1}$ (where $N = 23$ is the number of spatial cells, $T = 32$ is the number of groundwater level monitoring periods, and $s = 2$ is the time delay), this indicates that the spatiotemporal autocorrelation function values are close to random errors. The spatiotemporal autocorrelation coefficients calculated for the residuals of the model are shown in Table 6.
As can be seen from the table, the autocorrelation coefficients at spatial lags of order 0 and order 1 are around 0, with mean values of 0.002 and −0.015, respectively, and variance values of 0.00112 and 0.00138, respectively, which are less than 1/[23 × (32 − 2)] ≈ 0.00145. This indicates that the model residuals are not significantly autocorrelated in time and space and that the residual series is close to random error, so the STARMA model adequately describes the spatiotemporal groundwater monitoring data after the spatiotemporal trend values have been removed.
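The variance threshold can be reproduced with a few lines of arithmetic. The sketch below only evaluates $[N(T - s)]^{-1}$ for the values given in the text and compares the two reported variances against it; the function name is illustrative.

```python
def residual_variance_bound(n_sites, n_periods, time_lag):
    """Reference variance 1 / [N (T - s)] for residual autocorrelations."""
    if n_periods <= time_lag:
        raise ValueError("n_periods must exceed time_lag")
    return 1.0 / (n_sites * (n_periods - time_lag))


# Values quoted in the text: N = 23 wells, T = 32 periods, s = 2.
bound = residual_variance_bound(23, 32, 2)  # 1 / 690

# Variances reported for spatial lags 0 and 1.
reported_variances = [0.00112, 0.00138]
all_below = all(v < bound for v in reported_variances)
```

Both reported variances fall below 1/690, which is consistent with the conclusion that the residuals behave like random error.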
3.3. Comparison of Model Prediction Accuracy
According to Equation (5), the fitting result of the trend value extracted by the BP neural network was added to the fitting value of the STARMA model to obtain the average fitting results of groundwater in the 10th, 20th, and 30th periods. The non-stationary spatiotemporal series was also transformed into a stationary series by the difference method (DM), and the groundwater level was modeled separately by the STARMA model and the BP neural network model. Comparing the three fitting results with the actual monitoring values shows that the BP-STARMA model fits the spatiotemporal evolution of groundwater better. To further test the BP-STARMA, STARMA, and BP neural network models, the root-mean-squared error (RMSE) was used as an evaluation index for the fit of the three models in different periods, and the evaluation results are shown in Table 7. The table shows that the RMSE of the BP-STARMA fit is smaller than those of the STARMA model and the BP neural network model. Compared with the STARMA model and BP neural network model, the fitting accuracy of BP-STARMA is improved by 3.1% and 25.8%, and 19.5% and 7.5%, respectively.
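The additive combination and the RMSE comparison can be sketched as follows. The text states that the BP trend fit is added to the STARMA fit; the percentage improvement formula below, the relative reduction in RMSE, is an assumption, since the paper does not state how the improvement figures were computed. All function names are illustrative.

```python
import numpy as np


def combine_fit(bp_trend, starma_residual):
    """BP-STARMA fit as the sum of the BP trend and the STARMA residual fit."""
    return np.asarray(bp_trend, dtype=float) + np.asarray(starma_residual, dtype=float)


def rmse(observed, predicted):
    """Root-mean-squared error between observed and predicted levels."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def relative_improvement(rmse_reference, rmse_new):
    """Percentage reduction in RMSE relative to a reference model (assumed definition)."""
    return 100.0 * (rmse_reference - rmse_new) / rmse_reference
```

For example, if a reference model has an RMSE of 0.5 m and the combined model 0.25 m, the relative improvement under this definition is 50%.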
The analysis of the BP-STARMA fit shows that the model reproduces the spatial and temporal variation pattern of groundwater in confined aquifer III in the study area well and that its fit is better than those of the BP model and the STARMA model. However, the prediction results need to be verified to confirm the performance of the BP-STARMA model. Therefore, the three models were validated against the dynamic groundwater level monitoring data of the 33rd to 40th periods in the study area.
Figure 7 shows the prediction results of the BP-STARMA model, STARMA model, and BP neural network model for the 34th, 37th, and 40th periods. At the same time, RMSE was also used to evaluate the predicted values of the three models at different periods, and the evaluation results are shown in the table.
In the 34th period, the BP-STARMA model, the STARMA model, and the BP neural network can all predict groundwater level values well, but as time increases, the STARMA model shows deviations between the predicted and actual monitored values, and the BP neural network model fits less well in local areas. The BP-STARMA model predicts better than the STARMA model and the BP neural network model.
3.4. Evaluation and Comparison of Comprehensive Model Performance
In order to comprehensively compare the modeling effects of BP-STARMA, STARMA, and BP neural network models, we used four evaluation indexes to evaluate and compare the three models. They were the residual standard error (RSE), normalized mean squared error (NMSE), root-mean-squared error (RMSE), and mean absolute error (MAE). The values of the three models after evaluation are shown in
Table 8.
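The four evaluation indexes can be computed as in the sketch below. The paper does not give their formulas, so the definitions here are assumptions: RSE as the square root of the residual sum of squares divided by the degrees of freedom, and NMSE as the mean squared error divided by the variance of the observations. RMSE and MAE follow their standard definitions. The `n_params` argument is hypothetical.

```python
import numpy as np


def evaluation_indexes(observed, predicted, n_params=1):
    """Return RSE, NMSE, RMSE and MAE under the assumed definitions above."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = obs - pred
    n = len(obs)
    if n <= n_params:
        raise ValueError("need more observations than fitted parameters")
    mse = float(np.mean(err ** 2))
    return {
        # Residual standard error with n - n_params degrees of freedom (assumed).
        "RSE": float(np.sqrt(np.sum(err ** 2) / (n - n_params))),
        # MSE normalized by the population variance of the observations (assumed).
        "NMSE": mse / float(np.var(obs)),
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(err))),
    }
```

Under all four definitions, lower values indicate a closer match between modeled and monitored water levels, which is how the comparison in the text reads the indexes.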
It can be seen from Table 9 that, for the six monitoring wells listed, the evaluation indexes of BP-STARMA are lower than those of STARMA and BP, except at monitoring well 51073010#, where the BP-STARMA indexes are higher than those of the STARMA model. This shows that groundwater level prediction based on the BP-STARMA model is generally better than that of the STARMA model and the BP neural network model. That is, the model performs well for spatiotemporal data that are discrete in time and continuous in space.
In order to more intuitively display the difference between the actual monitored values and the values of the three simulation models, we used a graph to describe the water level change curves of the above six monitoring wells, as shown in
Figure 8.
In order to compare the fitting and prediction effects of the models separately, the fitting accuracy and prediction accuracy of the BP-STARMA, STARMA, and BP neural network models were calculated separately using the RMSE evaluation index. Except for monitoring well 51073010#, the fitted root-mean-squared error of the BP-STARMA model is reduced by 39.92%, 38.35%, 30.25%, 31.55%, and 13.57% compared to the STARMA model, and by 22.2%, 8.7%, 15.9%, 28.5%, and 4.42%, respectively, compared to the BP neural network model; the smaller reductions relative to BP indicate that the BP neural network fits better than STARMA. As for the prediction accuracy, the prediction accuracy of BP-STARMA is improved by 69.34%, 63.61%, 32.81%, and 47.28%, respectively, compared to the STARMA model, and by 84.4%, 85.8%, 14.3%, 44.9%, 58.9%, and 55.4% compared to the BP neural network, which indicates that the STARMA model predicts more accurately than the BP neural network model. Overall, the spatial and temporal fitting and prediction of groundwater level in the study area based on the BP-STARMA model are better than those of the STARMA model and the BP neural network model.
4. Conclusions
In this paper, three models, BP, STARMA, and BP-STARMA, were used to simulate and predict the spatiotemporal variation of groundwater level. In the STARMA modeling, the spatial variables were assumed to be isotropic in order to construct the spatial weight matrix, and areas where the aquifers are continuous were selected; thus, the constructed models are suitable for areas with spatial continuity, but for settings where aquifers pinch out sharply, a suitable model is not yet known. Based on the spatiotemporal variation characteristics of the groundwater level monitoring sequence, i.e., a global deterministic nonlinear trend value and a local stochastic spatial variation value, we constructed a BP neural network with three input parameters to extract the global deterministic spatiotemporal trend value; a STARMA model was constructed for the sample residuals after removing the trend value; and finally, the BP model and the STARMA model were combined to obtain the results. The BP-STARMA model predicted the groundwater level in the study area more accurately than either model alone. The results show that the BP-STARMA model is well suited to spatiotemporal series of spatially continuous groundwater levels.