Spatiotemporal Prediction and Pattern Analysis of Complex Ground Deformation Fields from Multi-Temporal InSAR

Fu, Yuanzhao; Wang, Jili; Zhang, Yi; Zhang, Heng; Wu, Yulun; Kang, Litao

doi:10.3390/rs18060925

by Yuanzhao Fu ¹,², Jili Wang ¹,*, Yi Zhang ¹, Heng Zhang ¹, Yulun Wu ¹ and Litao Kang ¹,²

¹ Department of Space Microwave Remote Sensing System, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China

² School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(6), 925; https://doi.org/10.3390/rs18060925

Submission received: 14 February 2026 / Revised: 9 March 2026 / Accepted: 13 March 2026 / Published: 18 March 2026

(This article belongs to the Special Issue Advances in Synthetic Aperture Radar (SAR) System, Signal Processing and Applications)

Highlights

What are the main findings?

A spatiotemporal synchronous prediction framework is proposed for large-scale complex InSAR ground deformation fields.
A combined ICA and K-means approach is proposed to identify dominant evolution patterns of the deformation field and their spatial distributions.

What are the implications of the main findings?

The proposed framework improves the prediction capability for complex multimodal ground deformation processes.
The identified interaction patterns between ground deformation and groundwater provide insights for urban groundwater management and geohazard assessment.

Abstract

Ground deformation is a major geohazard in many urban areas, requiring reliable monitoring and forecasting for hazard mitigation. Although Multi-Temporal InSAR enables high-resolution deformation monitoring, most prediction approaches rely on single-point modeling and fail to exploit spatial dependencies within deformation fields. This study proposes a spatiotemporally synchronous prediction framework for large-scale InSAR deformation fields, integrating sequence preprocessing, spatiotemporal modeling, and deformation pattern analysis. First-order differencing reduces sequence non-stationarity, while a patch-based encoder-decoder structure preserves spatial topology during dimensionality reduction. The core prediction model, built on PredRNNv2, captures the long-term spatiotemporal evolution of InSAR deformation sequences. In addition, independent component analysis (ICA) combined with K-means clustering identifies dominant deformation patterns and their geological associations. The framework is evaluated using synthetic datasets simulating multiple deformation mechanisms and Sentinel-1 InSAR time-series data over the Beijing Plain from 2015 to 2025. Results show that the model accurately captures deformation evolution and identifies transitions associated with groundwater regulation. These findings demonstrate the potential of deep spatiotemporal learning for large-scale InSAR deformation prediction and geohazard mechanism interpretation.

Keywords: Multi-Temporal InSAR; ground deformation; spatiotemporal prediction; PredRNNv2; Beijing Plain; deformation dynamics simulation

1. Introduction

Ground deformation refers to the vertical subsidence or uplift of the land surface driven by natural geological processes or anthropogenic engineering activities, and it is often closely associated with geological hazards such as land subsidence, earthquakes, landslides, and ground fissures [1]. Because it is a long-standing geo-environmental problem in many regions worldwide, high-precision monitoring and reliable trend prediction of ground deformation are important for ensuring the safe operation of urban infrastructure, assessing the risks of underground space development, and safeguarding human lives and property [2].

Traditional ground deformation monitoring methods, such as leveling [3] and Global Navigation Satellite System (GNSS) static measurements [4], offer high accuracy at individual monitoring points. However, because of high deployment costs and limited observation density, these methods cannot provide continuous deformation information with large-scale coverage and high spatial resolution. Although Unmanned Aerial Vehicle (UAV) photogrammetry [5] is highly flexible, it is sensitive to meteorological conditions and survey scheduling, which limits its applicability to regional-scale, long-term time-series monitoring. In recent years, Interferometric Synthetic Aperture Radar (InSAR) has become a core technique for acquiring high-density ground deformation time series owing to its all-weather, day-and-night, and large-scale observational capabilities. This has made it a pivotal tool for monitoring a wide range of geohazards and phenomena, including volcanic activity [6], landslides [7], urban subsidence [8], mining activities [9], and transportation infrastructure deformation [10,11].

InSAR technology is producing ever-growing volumes of time series deformation data. Yet, turning this data into stable, reliable predictions of future trends remains a significant challenge. Existing studies predominantly follow a single-point strategy for time series modeling, building separate prediction models for each monitoring point. These methods can be broadly classified into three categories. The first category is based on physical mechanisms, which simulate formation and deformation processes by constructing geotechnical or consolidation models [12,13]. However, such approaches exhibit strong dependence on model parameters and are difficult to extend to regional-scale applications. The second category is based on mathematical statistics, employing methods such as polynomial functions [14], logistic functions [15], Weibull functions [16], ARIMA models [17], and Holt-Winters models [18] for fitting and extrapolation. These methods are typically applicable only to specific evolution patterns and therefore exhibit limited generalization capability. The third category relies on machine learning methods, including Support Vector Regression (SVR) [19], Random Forest Regression (RFR) [20], Long Short-Term Memory (LSTM) networks [21], Gated Recurrent Units (GRUs) [22], and Transformers [23], to enhance the feature representation and fitting capacity for complex time series data. A critical limitation shared by the above methods is their neglect of spatial correlations during modeling. When applied to large-scale deformation fields with significant spatial heterogeneity, this limitation often leads to spatial discontinuities in the prediction results, making it difficult to accurately characterize the overall evolution, expansion, and migration processes of subsidence funnels.

To address the limitations of single-point modeling in the spatial dimension, spatiotemporal prediction methods have gradually emerged in recent years, which jointly consider both spatial structural characteristics and temporal evolution. Yao et al. employed a Convolutional Long Short-Term Memory (ConvLSTM) model to simulate and predict ground subsidence in the Jinchuan Mining Area, Gansu Province [24]. Hu et al. incorporated an attention mechanism into the ConvLSTM model and applied it to the analysis of deformation processes associated with the Jishishan earthquake [25]. Jiang et al. utilized the SAM-ConvLSTM model to monitor and predict ground deformation in the context of the Mountain Excavation and City Construction project (MECC) in Yan’an New District (YND) [26]. Although the above research has achieved some success in the joint spatiotemporal modeling of deformation fields, critical limitations still remain. Firstly, most existing models adopt a “multi-step to single-step” prediction paradigm, which is prone to error accumulation and accuracy degradation during long-term time series extrapolation. Secondly, most existing studies have focused on scenarios with limited spatial extents and simple evolutionary patterns, leaving the robustness and generalizability insufficiently validated for large-scale, complex ground deformation systems in which accelerated, linear, and periodic dynamics coexist.

To address the aforementioned issues, this paper proposes a spatiotemporal synchronous prediction and comprehensive analysis framework for large-scale, multi-mode ground deformation. To balance the demands of massive data processing with computational resource constraints, a patch-based encoding-decoding structure is introduced to effectively reduce the dimensionality of high-dimensional spatiotemporal data while preserving the original spatial topology. In addition, a first-order differencing and window-based renormalization strategy is adopted to mitigate the influence of non-stationary trend components in the original deformation sequences on model training and convergence. On this basis, a spatiotemporal collaborative prediction model based on the Predictive Recurrent Neural Network v2 (PredRNNv2) [27] is constructed, in which the dual-channel memory flow mechanism is leveraged to enhance the modeling capability for long-term dependencies and complex spatiotemporal evolution characteristics. Furthermore, deformation patterns exhibiting significant heterogeneity within the study area are identified and summarized by combining Independent Component Analysis (ICA) with K-means clustering methods. Meanwhile, a simulation dataset integrating multiple nonlinear evolution processes and spatially correlated noise characteristics is constructed to systematically evaluate the stability of the spatiotemporal prediction model under complex deformation scenarios.

The remainder of this paper is organized as follows: Section 2 briefly introduces the hydrogeological background and multi-source observational data of the study area. Section 3 describes in detail the structure of the proposed spatiotemporal prediction model and the associated data preprocessing methods. Section 4 presents the application results of the model using measured data from the Beijing Plain. Section 5 evaluates the applicability of the model under different deformation modes through simulation experiments and analyzes the physical driving mechanisms of ground deformation in conjunction with hydrogeological conditions. Finally, Section 6 provides a systematic summary of the entire study.

2. Study Area and Datasets

2.1. Beijing Plain Area

As shown in Figure 1b, Beijing is located in the northwestern part of the North China Plain (NCP), with geographic coordinates ranging from approximately 39.4° to 41.6°N and 115.7° to 117.4°E, covering a total area of about 16,422 km². The topography of Beijing is higher in the northwest and lower in the southeast. The northwestern region is dominated by the Yanshan and Taihang Mountains and exhibits pronounced topographic relief, whereas the southeastern region consists mainly of the alluvial-proluvial plain of the NCP, with relatively flat terrain.

The Beijing Plain is predominantly covered by Quaternary unconsolidated sediments, with lithologies mainly consisting of highly compressible sand, silt, and clay layers, and a typical multilayer aquifer system is well developed [28]. Under the combined effects of long-term groundwater overexploitation and high-intensity human activities, the aquifer system has continuously undergone compressive deformation, making this region one of the typical areas of land subsidence induced by groundwater extraction in China [29].

In terms of climate and hydrological characteristics, Beijing experiences a temperate monsoon climate, with precipitation primarily concentrated in summer (June to August) and exhibiting a highly uneven distribution throughout the year [30]. As shown in Figure 1a, Beijing contains five major river systems from west to east: the Daqing River, the Yongding River, the Beiyun River, the Chaobai River, and the Jiyun River. Groundwater has been an important component of urban water supply systems for a long time, and the dynamic imbalance between its extraction and natural recharge has further exacerbated the compression of the aquifer. In recent years, with the implementation of groundwater extraction restriction policies and the South-to-North Water Diversion (SNWD) project, groundwater levels in certain regions have risen, accompanied by localized ground uplift. However, the overall deformation process still exhibits significant temporal and spatial differences [31].

Considering the interplay of its geological setting and intense human activity, this study focuses on the Beijing Plain, specifically the area below 120 m in elevation and 5° in slope (Figure 1a). This region is not only the core of concentrated urban development but also exhibits the most significant and mechanistically complex historical surface deformation, making it a site of high scientific and practical relevance.

2.2. Dataset

2.2.1. SAR Datasets

The C-band Sentinel-1A satellite, launched by the European Space Agency (ESA) on 3 April 2014, has been operating stably since its launch and is capable of acquiring long-term, continuous SAR images. As shown in Table 1, this study utilized Sentinel-1A ascending orbit images covering the Beijing area over the period from 30 July 2015 to 5 October 2025. A total of 325 scenes were processed, corresponding to 283 acquisition dates, with certain dates having two scenes.

To validate the accuracy of the deformation monitoring results, Single-Look Complex (SLC) images from the descending orbit during the same period were processed simultaneously. The spatiotemporal baselines of the two datasets are presented in Figure 2. For the ascending orbit data, a single image can generally cover the entire study area, whereas for the descending orbit data, full coverage can only be achieved by mosaicking two images, as shown in Figure 1c. The above image data were obtained from the data distribution system of the Alaska Satellite Facility (ASF) (https://search.asf.alaska.edu/ (accessed on 12 March 2026)).

2.2.2. Terrain Data

The Copernicus Digital Elevation Model (COP-DEM) is a high-precision digital surface model (DSM) jointly released by the ESA and the European Commission (EC). This dataset uses WGS84 as the horizontal datum and EGM2008 as the vertical datum, with an absolute horizontal accuracy better than 6 m and an absolute vertical accuracy better than 4 m. In this study, the COP-DEM GLO-30 dataset with a spatial resolution of 30 m was used as the reference digital elevation model (DEM) in the InSAR processing workflow, mainly for geocoding and topographic phase removal. The original data can be downloaded from the Copernicus Data Space Ecosystem (CDSE) platform (https://dataspace.copernicus.eu/ (accessed on 12 March 2026)).

2.2.3. Hydrological Data

This study obtained groundwater depth data from 99 automatic groundwater monitoring stations in Beijing from 21 June 2019 to 28 December 2025, with a monitoring frequency of once per month. These data can effectively reflect the dynamic characteristics of groundwater in the study area. In addition, statistical data on the water supply structure in Beijing from 2015 to 2024 were obtained from the official website of the Beijing Water Authority (https://swj.beijing.gov.cn/ (accessed on 12 March 2026)) to assist in analyzing the relationship between groundwater changes and human activities.

3. Methodology

In this study, a spatiotemporal prediction method for analyzing ground deformation is proposed. Firstly, ground deformation monitoring data in the study area are acquired using Multi-Temporal InSAR (MT-InSAR) technology. Subsequently, the obtained spatiotemporal deformation data are aligned and completed to mitigate the impact of data loss during the observation process. On this basis, a spatiotemporal prediction model is constructed and applied to predict the future ground deformation in the study area. Finally, by combining ICA decomposition with the K-means clustering method, the different deformation patterns in the study area are systematically identified and analyzed, and the prediction performance of spatiotemporal models is compared and evaluated. The overall research process is illustrated in Figure 3.

3.1. Multi-Temporal InSAR

Due to the high level of urbanization in the study area, numerous man-made structures (e.g., buildings and roads) provide abundant temporally stable radar scatterers. Therefore, the Persistent Scatterer InSAR (PS-InSAR) technique was adopted to process the original SLC data and retrieve high-precision ground deformation time series [32]. The detailed PS-InSAR processing workflow is provided in Text S1 of the Supplementary Material.

For the deformation reference, Tiananmen Square, located in the central part of the study area, has been shown by previous monitoring to be stable without significant subsidence [33]. Accordingly, a strong scatterer within this area was selected as the zero-deformation reference point, and all retrieved deformation values are expressed relative to this reference point.

3.2. Spatiotemporal Deformation Data Construction

The deformation monitoring results derived from the PS-InSAR method represent displacement components along the Line of Sight (LOS) direction. Previous studies have demonstrated that ground deformation in Beijing is predominantly vertical, while horizontal displacement is relatively minor [34]. Under the assumption that the horizontal displacement component can be neglected, the LOS deformation can be converted into vertical deformation:

$D_{U} = \frac{D_{LOS}}{\cos θ}$

(1)

where $θ$ is the satellite incidence angle.
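As a minimal sketch of Equation (1), the conversion can be written as a one-line NumPy function. The function name `los_to_vertical` and the 39° incidence angle are illustrative assumptions, not values taken from this study; in practice the incidence angle would come from the SAR acquisition geometry.

```python
import numpy as np

def los_to_vertical(d_los, incidence_deg):
    """Eq. (1): divide LOS displacement by cos(theta), assuming no horizontal motion."""
    theta = np.deg2rad(incidence_deg)
    return np.asarray(d_los, dtype=float) / np.cos(theta)

# 39 degrees is a placeholder incidence angle, not a value taken from the paper.
d_up = los_to_vertical([-10.0, 5.0], incidence_deg=39.0)
```

Because $\cos θ < 1$ for any non-zero incidence angle, the vertical estimate is always larger in magnitude than the LOS measurement.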

The theoretical revisit period of the Sentinel-1A satellite is 12 days, but constraints related to orbital scheduling and data acquisition conditions result in a certain degree of data loss in the original observation time series. To ensure the continuity and equidistant spacing of deformation monitoring sequences in the temporal dimension, this study employs interpolation methods to fill in missing observations, strictly unifying the temporal sampling interval to 12 days, thereby meeting the requirements of spatiotemporal prediction models for equidistant sampling of input features. After interpolation, the original 283 valid observations were expanded to 311 regularly sampled time series observations.
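The temporal regularization step can be sketched as follows. The text does not name the interpolation method, so linear interpolation via `numpy.interp` is an assumption here, and `regularize_series` is a hypothetical helper name. The last two lines only check that a 12-day grid between the first and last acquisition dates given in Section 2.2.1 contains 311 epochs.

```python
from datetime import date
import numpy as np

def regularize_series(days, values, step=12):
    """Resample an irregular series onto a fixed-step grid (linear interpolation is an assumption)."""
    days = np.asarray(days, dtype=float)
    grid = np.arange(days[0], days[-1] + 1, step)
    return grid, np.interp(grid, days, np.asarray(values, dtype=float))

# Toy series in which the acquisition at day 24 is missing.
grid, filled = regularize_series([0, 12, 36, 48], [0.0, -1.0, -3.0, -4.0])

# Number of 12-day epochs between the first and last acquisition dates in Section 2.2.1.
span_days = (date(2025, 10, 5) - date(2015, 7, 30)).days
n_epochs = span_days // 12 + 1
```

For a full deformation field, the same interpolation would be applied independently to the time series of every pixel or PS point.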

To compensate for spatial gaps in the PS-InSAR results, reduce computational complexity in subsequent analyses, and satisfy the requirement of spatiotemporal prediction models for grid-based inputs at each time step, the Inverse Distance-Weighted (IDW) interpolation method was employed to spatially resample the deformation measurements derived from Persistent Scatterer (PS) points [35]:

$\hat{Z}(x_{0}) = \frac{\sum_{i = 1}^{n} Z(x_{i}) d_{i}^{-p}}{\sum_{i = 1}^{n} d_{i}^{-p}}$

(2)

where $\hat{Z}(x_{0})$ denotes the estimated value at the target point $x_{0}$ to be interpolated; $Z(x_{i})$ represents the observed value at the sample point $x_{i}$; $d_{i}$ is the distance between the target point $x_{0}$ and the sample point $x_{i}$; $p$ is the distance decay exponent, which is typically set to 2; and $n$ denotes the number of sample points involved in the interpolation.
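Equation (2) translates almost directly into NumPy. The sketch below is a brute-force version that uses every sample point for every target; the function name `idw` and the small `eps` guard against zero distances are assumptions added for illustration. A production workflow would more likely restrict the sum to the nearest neighbors of each grid cell.

```python
import numpy as np

def idw(sample_xy, sample_z, target_xy, p=2.0, eps=1e-12):
    """Inverse distance-weighted estimate (Eq. 2) at each target point."""
    sample_xy = np.asarray(sample_xy, dtype=float)
    sample_z = np.asarray(sample_z, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    # Pairwise distances with shape (n_targets, n_samples).
    d = np.linalg.norm(target_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    # Clip tiny distances so a target that coincides with a sample does not divide by zero.
    w = 1.0 / np.maximum(d, eps) ** p
    return (w @ sample_z) / w.sum(axis=1)

# Two PS points with values 0 and 2; estimate at the midpoint and at the first point.
z_hat = idw([[0.0, 0.0], [2.0, 0.0]], [0.0, 2.0], [[1.0, 0.0], [0.0, 0.0]])
```

At the midpoint both samples receive equal weight, so the estimate is their average; at a location that coincides with a sample, the estimate collapses to that sample's value.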

Given the limited number of samples along the temporal dimension of the ground deformation monitoring data, this study adopts a sliding window strategy to construct spatiotemporal supervised learning samples. As illustrated in Figure 4, let the temporal lengths of the input and output sequences be $T_{in}$ and $T_{out}$, respectively. The input feature sequence of the i-th sample consists of deformation field images at $T_{in}$ consecutive time steps, $S_{i} = \{X_{i}, X_{i + 1}, \dots, X_{i + T_{in} - 1}\}$, while the corresponding prediction target comprises deformation field images at the subsequent $T_{out}$ time steps, $Y_{i} = \{X_{i + T_{in}}, X_{i + T_{in} + 1}, \dots, X_{i + T_{in} + T_{out} - 1}\}$. The time window then slides forward along the temporal axis with a unit step size, gradually constructing a complete spatiotemporal sample dataset. For dataset partitioning, this study strictly follows the characteristics of time series data by using the first 80% of the samples for model training and the remaining 20% for performance testing, thereby avoiding data leakage and ensuring an objective and reliable evaluation of the model’s generalization performance.
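The sliding-window construction and the chronological 80/20 split can be sketched as below. The helper names and the toy values `t_in=5` and `t_out=3` are placeholders; the actual window lengths are not stated in this section.

```python
import numpy as np

def make_samples(frames, t_in, t_out):
    """frames: (T, H, W). Returns inputs (N, t_in, H, W) and targets (N, t_out, H, W)."""
    n = frames.shape[0] - t_in - t_out + 1
    x = np.stack([frames[i:i + t_in] for i in range(n)])
    y = np.stack([frames[i + t_in:i + t_in + t_out] for i in range(n)])
    return x, y

def chronological_split(x, y, train_frac=0.8):
    """Keep temporal order: the earliest samples train the model, the latest test it."""
    cut = int(len(x) * train_frac)
    return (x[:cut], y[:cut]), (x[cut:], y[cut:])

frames = np.arange(20, dtype=float).reshape(20, 1, 1)  # toy sequence of 1x1 "images"
x, y = make_samples(frames, t_in=5, t_out=3)
(train_x, train_y), (test_x, test_y) = chronological_split(x, y)
```

With a unit stride, neighboring windows share most of their frames, so the last training windows and the first test windows overlap in time unless a gap is left at the cut. Whether such a gap is used is not specified here.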

Considering that ground deformation time series exhibit pronounced trend components and substantial variations in statistical characteristics across different time periods, first-order differencing and window-based re-normalization [36] are further introduced to preprocess the sample data. The corresponding processing workflow and parameter settings are described in detail in Section 4.2.1.

3.3. Spatiotemporal Prediction Model

To model the spatiotemporal evolution of ground deformation, PredRNNv2 was adopted as the core prediction model. By introducing an auxiliary spatiotemporal memory unit, the model enables explicit propagation of memory states across network layers, which provides clear advantages in capturing long-term temporal dependencies while preserving spatial topology [27]. A detailed comparison between PredRNNv2 and related models, including LSTM [37] and ConvLSTM [38], is provided in Text S2 of the Supplementary Material. In this section, the architecture of the prediction model is further described, together with the definition of the loss function used for optimization and the performance evaluation metrics.

3.3.1. Structure of the Model

As illustrated in Figure 5, the overall network architecture consists of three main components: an encoder, a predictor, and a decoder. The encoder performs a block-based spatial reorganization of the input images to reduce the size of the feature maps. The predictor is composed of four stacked Spatiotemporal LSTM (ST-LSTM) memory units, which progressively extract spatial features at different scales and generate frame-by-frame predictions of future deformation sequences. Finally, the decoder reconstructs the predicted block-wise feature maps into deformation images with the original spatial resolution through an inverse block operation.

(1): Encoder module: The encoder module mainly performs spatial reorganization of the input deformation images using a patch-based blocking operation. Specifically, the original image is divided into fixed-size patches along the spatial dimensions H and W, and the resulting sub-patches are concatenated along the channel dimension C. This operation effectively reduces the spatial resolution of the feature maps while preserving local spatial details. Considering both computational efficiency and feature representation capability, the patch size is set to 4 in this study, whereby the original input images with dimensions of $512 \times 512 \times 1$ are reorganized into feature representations of $128 \times 128 \times 16$, since each $4 \times 4$ patch contributes 16 channel values. This operation substantially reduces the computational cost in the subsequent prediction stage without introducing additional learnable parameters, while simultaneously enlarging the receptive field of each feature unit. As a result, it facilitates the learning of spatial correlation characteristics of ground deformation over a large spatial extent.
(2): Predictor module: The predictor consists of multiple stacked ST-LSTM units, which are designed to model the complex spatiotemporal dependencies in ground deformation sequences. The ST-LSTM unit employs a dual-stream memory conversion mechanism and introduces an auxiliary spatiotemporal memory unit $M_{t}$ alongside the traditional memory cell $C_{t}$. By explicitly transferring memory states across network layers, it improves the modeling of long-term temporal dependencies and multi-level spatial semantic information. The structure of a single ST-LSTM unit is illustrated in Figure 6, and its input-output relationship at time t can be expressed as follows:

$\begin{aligned}
g_{t} &= \tanh (W_{x g} * X_{t} + W_{h g} * H_{t - 1}^{l}) \\
i_{t} &= σ (W_{x i} * X_{t} + W_{h i} * H_{t - 1}^{l}) \\
f_{t} &= σ (W_{x f} * X_{t} + W_{h f} * H_{t - 1}^{l}) \\
C_{t}^{l} &= f_{t} ⊙ C_{t - 1}^{l} + i_{t} ⊙ g_{t} \\
g_{t}^{'} &= \tanh (W_{x g}^{'} * X_{t} + W_{m g} * M_{t}^{l - 1}) \\
i_{t}^{'} &= σ (W_{x i}^{'} * X_{t} + W_{m i} * M_{t}^{l - 1}) \\
f_{t}^{'} &= σ (W_{x f}^{'} * X_{t} + W_{m f} * M_{t}^{l - 1}) \\
M_{t}^{l} &= f_{t}^{'} ⊙ M_{t}^{l - 1} + i_{t}^{'} ⊙ g_{t}^{'} \\
o_{t} &= σ (W_{x o} * X_{t} + W_{h o} * H_{t - 1}^{l} + W_{c o} * C_{t}^{l} + W_{m o} * M_{t}^{l}) \\
H_{t}^{l} &= o_{t} ⊙ \tanh (W_{1 \times 1} * [C_{t}^{l}, M_{t}^{l}])
\end{aligned}$

(3)

where $i_{t}, f_{t}, g_{t}$ represent the input gate, forget gate, and candidate memory content of the original memory unit $C_{t}$; $i_{t}^{'}, f_{t}^{'}, g_{t}^{'}$ represent the input gate, forget gate, and candidate memory content of the auxiliary memory unit $M_{t}$; $o_{t}$ is the output gate; $W$ denotes the weights of the corresponding convolution kernels; $σ (\cdot)$ and $\tanh (\cdot)$ are the sigmoid and hyperbolic tangent activation functions, respectively; $*$ is the convolution operation; and $⊙$ is the Hadamard product. In this study, the predictor is composed of four stacked ST-LSTM layers, with each layer containing 128 hidden state channels. By progressively extracting spatiotemporal features at different spatial scales, the model generates frame-by-frame predictions of the ground deformation field.
(3): Decoder module: The decoder module is responsible for reconstructing the block-wise feature maps output by the predictor into ground deformation images with the original spatial resolution. Specifically, the decoder remaps the features from the channel dimension back to the spatial dimensions through an inverse patching operation, reconstructing a feature representation of size $128 \times 128 \times 16$ into a $512 \times 512 \times 1$ deformation image. Like the encoder, this step introduces no additional learnable parameters. It maintains the spatial continuity and consistency of the prediction results and outputs ground deformation predictions at the same scale as the original input.
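The parameter-free encoder and decoder described above amount to a reshape and a transpose. The following sketch, with hypothetical function names `patchify` and `unpatchify`, shows one common way to implement them; the exact channel ordering used by the authors is not stated and is an assumption here. With a patch size of 4, a $512 \times 512 \times 1$ image folds into $128 \times 128 \times 16$, and the inverse operation recovers the input exactly.

```python
import numpy as np

def patchify(img, p):
    """(H, W, C) -> (H/p, W/p, p*p*C): fold each p x p block into the channel axis."""
    h, w, c = img.shape
    x = img.reshape(h // p, p, w // p, p, c)      # (block_row, in_row, block_col, in_col, C)
    x = x.transpose(0, 2, 1, 3, 4)                # (block_row, block_col, in_row, in_col, C)
    return x.reshape(h // p, w // p, p * p * c)

def unpatchify(feat, p):
    """Inverse of patchify: (H/p, W/p, p*p*C) -> (H, W, C)."""
    hp, wp, cc = feat.shape
    c = cc // (p * p)
    x = feat.reshape(hp, wp, p, p, c)             # (block_row, block_col, in_row, in_col, C)
    x = x.transpose(0, 2, 1, 3, 4)                # (block_row, in_row, block_col, in_col, C)
    return x.reshape(hp * p, wp * p, c)

img = np.random.default_rng(0).normal(size=(512, 512, 1))
feat = patchify(img, 4)
restored = unpatchify(feat, 4)
```

Each feature vector `feat[i, j]` holds the 16 pixels of one $4 \times 4$ block, which is why neighboring feature cells remain spatial neighbors and the topology is preserved.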

3.3.2. Loss Function

To effectively guide the optimization of model parameters, this study adopts the log-cosh loss function as the training objective. The model is driven to converge by minimizing the discrepancy between predicted and true values. The mathematical formulation is as follows:

$L_{\text{log-cosh}} = \frac{1}{n} \sum_{i = 1}^{n} \log (\cosh (\hat{y}_{i} - y_{i}))$

(4)

where $y_{i}$ and $\hat{y}_{i}$ are the true value and predicted value of the i-th sample, respectively, and $n$ is the total number of samples.

The log-cosh loss behaves differently in two regimes. When the prediction residual $r = \hat{y}_{i} - y_{i}$ is small, its Taylor expansion gives $\log (\cosh r) \approx r^{2}/2$, which matches the Mean Squared Error (MSE) up to a constant factor and tightly constrains small prediction deviations. When the residual is large, $\log (\cosh r) \approx |r| - \log 2$, which approaches the Mean Absolute Error (MAE) and reduces the influence of outliers on the gradient updates. This combination of smoothness and robustness allows the loss to accommodate both the stable evolution trends and the local nonlinear variations in ground deformation sequences, which improves the stability of model training and the reliability of the prediction results.
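A direct evaluation of $\cosh$ overflows for large residuals, so the loss is usually computed through an algebraically equivalent form. The sketch below uses the identity $\log(\cosh r) = |r| + \log(1 + e^{-2|r|}) - \log 2$; the function name `log_cosh_loss` is an assumption, and the snippet illustrates Equation (4) rather than reproducing the authors' training code.

```python
import numpy as np

def log_cosh_loss(pred, target):
    """Mean log-cosh loss (Eq. 4), evaluated in an overflow-safe form."""
    r = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    # Stable identity: log(cosh(r)) = r + log(1 + exp(-2r)) - log(2) for r >= 0.
    return float(np.mean(r + np.log1p(np.exp(-2.0 * r)) - np.log(2.0)))

small = log_cosh_loss([0.01], [0.0])    # close to 0.01**2 / 2 (MSE-like regime)
large = log_cosh_loss([50.0], [0.0])    # close to 50 - log(2) (MAE-like regime)
huge = log_cosh_loss([1000.0], [0.0])   # stays finite, whereas np.cosh(1000.0) overflows
```

The two toy residuals fall in the quadratic and linear regimes, respectively.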

To further improve the representation efficiency of the internal memory structure and the stability of long-term modeling, a decoupling loss is additionally introduced as an auxiliary constraint when employing the PredRNNv2 model. Its mathematical formulation is given as

$L_{\text{decouple}} = \sum_{t} \sum_{l} \sum_{c} \frac{|\langle Δ C_{t}^{l}, Δ M_{t}^{l} \rangle_{c}|}{\| Δ C_{t}^{l} \|_{c} \cdot \| Δ M_{t}^{l} \|_{c}}$

(5)

where $Δ C_{t}^{l}$ and $Δ M_{t}^{l}$ denote the state increments of the temporal memory unit and the auxiliary spatial memory unit, respectively; $\langle \cdot, \cdot \rangle_{c}$ and $\| \cdot \|_{c}$ represent the dot product and $l_{2}$ norm operations, respectively.

The final training objective of the model is defined as

$L = L_{\text{log-cosh}} + λ L_{\text{decouple}}$

(6)

where $λ$ is the decoupling coefficient used to balance prediction accuracy and the degree of memory separation. By introducing this structural regularization constraint, the temporal memory can focus more on modeling long-term dynamic trends, while the spatial memory primarily facilitates semantic information propagation across layers, thereby improving the stability and representation capability of long-term prediction.
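Equations (5) and (6) can be illustrated with a small NumPy sketch. The array layout `(T, L, C, H, W)` for the memory increments, the function names, and the `eps` term in the denominator are assumptions made for this example. The value of $λ$ passed to `total_loss` is a placeholder, since the coefficient is not given in this section.

```python
import numpy as np

def decouple_loss(delta_c, delta_m, eps=1e-8):
    """Eq. (5): absolute cosine similarity between memory increments, summed over t, l, c.

    delta_c, delta_m: arrays of shape (T, L, C, H, W); this layout is an assumption.
    """
    dc = delta_c.reshape(*delta_c.shape[:3], -1)   # flatten each channel's feature map
    dm = delta_m.reshape(*delta_m.shape[:3], -1)
    dot = np.abs((dc * dm).sum(axis=-1))
    norm = np.linalg.norm(dc, axis=-1) * np.linalg.norm(dm, axis=-1)
    return float((dot / (norm + eps)).sum())

def total_loss(pred_loss, dec_loss, lam):
    """Eq. (6): prediction loss plus the weighted decoupling penalty."""
    return pred_loss + lam * dec_loss

ones = np.ones((2, 1, 3, 2, 2))
aligned = decouple_loss(ones, ones)        # identical increments: one unit per (t, l, c)

a = np.zeros((1, 1, 1, 2, 2)); a[..., 0, 0] = 1.0
b = np.zeros((1, 1, 1, 2, 2)); b[..., 0, 1] = 1.0
orthogonal = decouple_loss(a, b)           # orthogonal increments incur no penalty
```

The penalty is largest when the two memory increments point in the same direction within a channel and vanishes when they are orthogonal, which is what pushes the two memories toward complementary roles.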

3.3.3. Metrics for Prediction Results

To quantitatively evaluate the accuracy of ground deformation predictions and systematically compare the performance of different models, this study employs a variety of complementary evaluation metrics. These include MSE, MAE, Symmetric Mean Absolute Percentage Error (SMAPE), Willmott’s Index of Agreement (WIA) [39], Explained Variance Score (EVS), Structural Similarity (SSIM) [40], and Peak Signal-to-Noise Ratio (PSNR) [41], among others. The detailed formulations of these metrics are provided in Text S3 of the Supplementary Material.

Among these metrics, MSE, MAE, and SMAPE primarily characterize the numerical error between the predicted results and the observed values; WIA and EVS reflect the consistency between predictions and observations in terms of overall trend from a statistical perspective; SSIM and PSNR evaluate the spatial similarity between the predicted deformation field and the real deformation field from two aspects of spatial structure and signal-to-noise ratio.
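For orientation, the sketch below computes several of these metrics using common textbook definitions. The authors' exact formulations are in Text S3 and may differ, in particular for SMAPE and for the data range used in PSNR, so every formula here should be read as an assumption. SSIM is omitted because it requires a windowed computation.

```python
import numpy as np

def metrics(obs, pred, eps=1e-12):
    """Common definitions of MSE, MAE, SMAPE, WIA, EVS and PSNR (assumed, not from Text S3)."""
    obs = np.asarray(obs, dtype=float).ravel()
    pred = np.asarray(pred, dtype=float).ravel()
    err = pred - obs
    mse = np.mean(err ** 2)
    mae = np.mean(np.abs(err))
    smape = np.mean(2.0 * np.abs(err) / (np.abs(obs) + np.abs(pred) + eps))
    obar = obs.mean()
    # Willmott's index: 1 - sum of squared errors / potential error.
    wia = 1.0 - np.sum(err ** 2) / (np.sum((np.abs(pred - obar) + np.abs(obs - obar)) ** 2) + eps)
    evs = 1.0 - np.var(err) / (np.var(obs) + eps)
    data_range = obs.max() - obs.min()            # PSNR peak taken as the observed range
    psnr = 10.0 * np.log10(data_range ** 2 / (mse + eps))
    return dict(mse=mse, mae=mae, smape=smape, wia=wia, evs=evs, psnr=psnr)

# A prediction with a constant bias of +1.
m = metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this toy case the constant bias leaves EVS at 1 while MSE and MAE both equal 1, which shows why variance-based and error-based metrics are reported together.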

3.4. Independent Component Analysis

ICA is a blind source separation method based on the assumption of statistical independence. Its core idea is to decompose the observed multi-channel mixed signals into several latent source signals that are as statistically independent as possible [42]. ICA assumes that the observed signals are linear mixtures of multiple mutually statistically independent source signals. Without relying on prior knowledge of the mixing process, ICA can effectively recover the source signals by maximizing their statistical independence or non-Gaussianity. Its typical mathematical formulation is as follows:

$X = A \cdot S$

(7)

where $X \in \mathbb{R}^{m \times n}$ is the observed mixed signal matrix, $S \in \mathbb{R}^{k \times n}$ is the independent source signal matrix, and $A$ is an unknown mixing matrix. The objective of ICA is to estimate the unmixing matrix $W$ (i.e., the inverse of the mixing matrix $A$) through iterative optimization using only the observed signals $X$ to recover the source signals:

$\hat{S} = W \cdot X$

(8)

In this study, the observation matrix $X \in \mathbb{R}^{N_{point} \times T}$ is constructed as a ground deformation time series matrix, where each row corresponds to the deformation monitoring sequence of a PS point, and each column represents the deformation observations of different PS points at the same time [43]. By applying the FastICA algorithm to decompose the observation matrix, the corresponding source signal matrix $\hat{S} \in \mathbb{R}^{N_{component} \times T}$ and mixing matrix $A$ can be obtained. Each independent component of the source signal corresponds to a typical temporal evolution pattern of deformation, while the mixing matrix reflects the spatial response strength of different PS points to each deformation mode. Since ICA decomposition is sensitive to noise, this study first reduces the dimensionality of the observation matrix using Principal Component Analysis (PCA) before applying ICA. By selecting the first six principal components, the influence of noise on the ICA decomposition is reduced while retaining 99% of the data variance, thereby improving the stability and reliability of deformation mode extraction.
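The PCA-then-ICA pipeline can be sketched in plain NumPy as follows. This is a compact symmetric FastICA with a tanh nonlinearity, written for illustration only: the function name `pca_fastica`, the iteration settings, and the toy signals are assumptions, and a maintained library implementation would normally be preferred. Rows of `X` are PS points and columns are epochs, matching the layout described above.

```python
import numpy as np

def pca_fastica(X, k, n_iter=200, tol=1e-6, seed=0):
    """X: (n_points, T). Returns sources S_hat (k, T) and mixing matrix A (n_points, k)."""
    T = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)           # remove each point's temporal mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = np.sqrt(T) * Vt[:k]                           # PCA-whitened signals: Z @ Z.T / T = I
    W = np.linalg.qr(np.random.default_rng(seed).normal(size=(k, k)))[0]
    for _ in range(n_iter):
        Y = np.tanh(W @ Z)
        # Fixed-point update: E[g(y) z] - E[g'(y)] w, with g = tanh.
        W_new = (Y @ Z.T) / T - np.diag((1.0 - Y ** 2).mean(axis=1)) @ W
        Uw, _, Vw = np.linalg.svd(W_new)              # symmetric decorrelation
        W_new = Uw @ Vw
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    S_hat = W @ Z                                     # temporal evolution patterns
    A = (U[:, :k] * s[:k] / np.sqrt(T)) @ W.T         # spatial response of each point
    return S_hat, A

# Toy data: 30 points that mix a seasonal signal and a sawtooth signal.
t = np.arange(400)
S_true = np.vstack([np.sin(2 * np.pi * t / 60), (t % 37) / 37.0 - 0.5])
X = np.random.default_rng(1).normal(size=(30, 2)) @ S_true
S_hat, A = pca_fastica(X, k=2)
```

By construction `A @ S_hat` reproduces the rank-`k` PCA approximation of the centered data, so the rows of `A` can be read as the per-point scores on each temporal pattern. The recovered components are defined only up to sign and ordering.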

3.5. K-Means Clustering

K-means clustering is an unsupervised learning algorithm based on iterative optimization. Its primary objective is to partition n observation samples into k disjoint clusters such that samples within the same cluster are as similar as possible, while samples in different clusters are as dissimilar as possible [44]. Specifically, K-means partitions the sample space by minimizing the Sum of Squared Errors (SSE) between each sample and the centroid of its assigned cluster. When the Euclidean distance is used as the similarity metric, the optimization objective can be expressed as

J = \sum_{j = 1}^{k} \sum_{i = 1}^{n} w_{i j} \left\| x_{i} - \mu_{j} \right\|^{2}

(9)

where $J$ denotes the inertia (the within-cluster sum of squared errors), reflecting the overall dispersion of the clustering results; $x_{i}$ is the i-th sample vector; $\mu_{j}$ is the centroid of the j-th cluster; and $w_{i j}$ is an indicator variable, which equals 1 if the i-th sample belongs to the j-th cluster and 0 otherwise. The K-means algorithm alternately updates the cluster centroids and the sample assignments, iterating step by step until the objective function converges.

In this study, K-means clustering is performed on the ICA-derived independent component scores of each monitoring point, rather than directly on the original deformation time series [45]. This approach mitigates the interference of strong temporal correlations on distance measurements, thereby yielding more discriminative and robust clustering results. During the clustering process, the initial cluster centers are selected using the K-means++ strategy to improve the stability of the clustering results and accelerate the convergence of the algorithm.
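The alternating update and the K-means++ seeding described above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the implementation used in the study: the function names, the fixed seed, and the toy two-dimensional "scores" are all hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [list(rng.choice(points))]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        centres.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centres


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd iterations; returns labels, centroids and the objective J."""
    rng = random.Random(seed)
    centres = kmeans_pp_init(points, k, rng)
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every sample.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so J has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
    return labels, centres, inertia


# Hypothetical IC scores for six monitoring points (two obvious groups).
scores = [
    [0.0, 0.0], [0.001, 0.0], [0.0, 0.001],
    [10.0, 10.0], [10.001, 10.0], [10.0, 10.001],
]
labels, centres, inertia = kmeans(scores, k=2)
```

In practice each row of `scores` would hold the ICA-derived component scores of one monitoring point, so the distance is measured in the low-dimensional score space rather than on the raw time series.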

4. Results and Analysis

4.1. PS-InSAR Monitoring Result

By processing the original SLC imagery with the PS-InSAR method and converting the LOS deformation measurements to the vertical direction through geometric projection, the spatial distribution of ground deformation rates in the Beijing Plain from 2015 to 2025 was obtained, as shown in Figure 7a. The results indicate that ground deformation in the study area exhibits significant spatial heterogeneity. Ground subsidence, occurring in a patchy pattern, is primarily distributed in the northern part of Haidian District, southern Changping District, eastern Chaoyang District, western Tongzhou District, and southern Daxing District. In contrast, ground uplift is concentrated in the northern part of Shunyi District, western Pinggu District, and eastern Haidian District, while the remaining areas remain relatively stable.

According to the contour lines of different cumulative deformation values, this study delineates 11 primary ground Deformation Funnels (DFs) within the study area, with detailed information provided in Table 2. Among these, the area experiencing the most severe subsidence is DF4, located at the junction of Chaoyang District and Tongzhou District. The maximum cumulative settlement reaches 709.58 mm, and the area affected by cumulative settlement exceeding 140 mm spans 368.36 km². In addition, a significant ground uplift area (DF9) is located in the northern part of Shunyi District, covering 175.10 km², with cumulative uplift exceeding 110 mm. This uplift phenomenon is closely associated with the ecological water replenishment projects implemented in Beijing in recent years, and its underlying mechanism will be further discussed in Section 5.3.

To further analyze the temporal dynamics of deformation evolution, this study selects 15 representative feature points within the aforementioned deformation areas and plots their deformation time series, as shown in Figure 7b. The results indicate that the deformation rates of the feature points located in the subsidence areas generally exhibit a nonlinear deceleration trend. In contrast, several feature points in the uplift areas display a continuous and relatively stable rebound process. In addition to the long-term trend, the time series of each feature point generally exhibits varying degrees of periodic deformation, reflecting the response of ground deformation to seasonal variations in groundwater dynamics.

To verify the accuracy of the ground deformation monitoring results, this study further processed the descending orbit data from the Sentinel-1A satellite and extracted ground deformation measurements for the common observation period (1 January 2016–31 December 2021). Common monitoring points in the ascending and descending orbit datasets were then matched, and the correlation of their deformation rates was analyzed, as shown in Figure 8. The statistical results show that the correlation coefficient of the deformation rates for 162,179 pairs of common points reaches 0.96, with a Root Mean Square Error (RMSE) of 5.64 mm/year. The minor differences between the two sets of monitoring results are primarily attributed to the combined effects of differing observation geometries of the ascending and descending satellites, variations in LOS sensitivity, residual atmospheric delay errors, and other factors. Overall, the high consistency between the inversion results of ascending and descending orbits demonstrates that the time series deformation monitoring data obtained in this study are highly reliable and can accurately reflect the spatiotemporal evolution of ground deformation in the study area.
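The two agreement statistics used in this cross-validation, the correlation coefficient and the RMSE of matched deformation rates, can be computed as in the following sketch. The function names are illustrative and the rate values are made-up placeholders, not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Placeholder vertical rates (mm/year) at matched ascending/descending points.
asc_rates = [-40.0, -20.0, 0.0, 10.0, 25.0]
desc_rates = [-38.0, -23.0, 1.0, 12.0, 22.0]

r = pearson_r(asc_rates, desc_rates)
err = rmse(asc_rates, desc_rates)
```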

4.2. Real Data Experiments

4.2.1. Model Training

The hardware configurations used in the experiments are summarized in Table 3 and the corresponding software versions are listed in Table 4. To ensure a fair and objective comparison, all models are trained with the same hyperparameters: a batch size of 1, 128 epochs, and an initial learning rate of 0.0001, with the Adam optimizer used for parameter updates.

At the temporal scale, both the input and output sequence lengths of the model are set to 10. Given the 12-day temporal sampling interval of the dataset, this configuration corresponds to approximately 120 days of deformation evolution. This time span can cover a complete seasonal variation cycle and effectively evaluate the model’s ability to capture long-term temporal dependencies. Meanwhile, this setting is consistent with the multi-step prediction configurations widely adopted in PredRNN-based models, improving the comparability of the experiments [27,38,46].
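To make the temporal configuration concrete, the sketch below cuts a sequence into input/output pairs of 10 steps each and converts a step count into days. The helper names and the stride of one step between consecutive samples are assumptions; the original does not state how training windows are sampled.

```python
def make_windows(series, n_in=10, n_out=10, stride=1):
    """Split a sequence into (input, target) pairs of n_in and n_out steps."""
    samples = []
    last_start = len(series) - (n_in + n_out)
    for start in range(0, last_start + 1, stride):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        samples.append((x, y))
    return samples


def window_span_days(n_steps, interval_days=12):
    """Time covered by n_steps acquisitions at a fixed revisit interval."""
    return n_steps * interval_days


# 30 epochs stand in for a stack of deformation maps (one value per epoch).
epochs = list(range(30))
samples = make_windows(epochs)
span = window_span_days(10)  # 10 steps x 12 days
```

With 30 epochs, the first sample uses epochs 0 to 9 as input and epochs 10 to 19 as target; each later sample shifts by one epoch.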

In terms of spatial processing, the encoder adopts a 4 × 4 spatial partition strategy. Through spatial rearrangement, this operation compresses spatial resolution while preserving local structural information. The chosen partition size balances spatial representation capability and computational cost: smaller partitions would significantly increase computational overhead, whereas larger partitions may weaken spatial continuity and local correlation. Therefore, the 4 × 4 partition is adopted as a structural design that balances feature extraction capability and computational efficiency.
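As a rough illustration of what a 4 × 4 spatial partition does to array shapes, the sketch below rearranges an H × W grid into 16 channels of size (H/4) × (W/4) and then inverts the operation exactly. The names `patchify` and `unpatchify` and the channel ordering are assumptions made for this example; the original does not specify the memory layout.

```python
def patchify(field, p):
    """Rearrange an H x W grid into p*p channels of size (H/p) x (W/p).

    Channel di*p + dj at position (i, j) holds field[i*p + di][j*p + dj],
    so the p*p channels at one position together store one p x p block.
    """
    h, w = len(field), len(field[0])
    if h % p or w % p:
        raise ValueError("grid size must be divisible by the patch size")
    return [
        [[field[i * p + di][j * p + dj] for j in range(w // p)]
         for i in range(h // p)]
        for di in range(p)
        for dj in range(p)
    ]


def unpatchify(channels, p):
    """Inverse of patchify: restore the original H x W grid."""
    hp, wp = len(channels[0]), len(channels[0][0])
    field = [[0.0] * (wp * p) for _ in range(hp * p)]
    for c, chan in enumerate(channels):
        di, dj = divmod(c, p)
        for i in range(hp):
            for j in range(wp):
                field[i * p + di][j * p + dj] = chan[i][j]
    return field


# An 8 x 8 toy grid whose values encode their own position.
grid = [[r * 8 + c for c in range(8)] for r in range(8)]
channels = patchify(grid, 4)      # 16 channels, each 2 x 2
restored = unpatchify(channels, 4)
```

Because the rearrangement only moves values and never averages them, the round trip is lossless, which is the property that distinguishes it from pooling-based downsampling.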

Because ground deformation time series typically contain significant trend components, their non-stationary nature leads to notable differences in the statistical distributions between the training and test sets, resulting in distribution drift. If the model is trained directly on the original data, it may face convergence difficulties and exhibit reduced generalization ability during unseen periods. To address the above issues, this study adopts a preprocessing strategy that combines first-order differencing with windowed re-normalization. The effect of this strategy on improving the data distribution is illustrated in Figure 9.

From a statistical perspective, the original deformation sequence exhibits obvious mismatches between the training and test sets in terms of statistical characteristics, such as numerical range, degree of dispersion, and probability density distribution (Figure 9a,b). After first-order differencing, the frequency distributions of the training and test sets (Figure 9c) tend to align. Meanwhile, the box plot results (Figure 9d) show that differencing effectively reduces the statistical offset between the datasets. Further comparison of the deformation time series of typical feature points (Figure 9e) shows that first-order differencing significantly attenuates the long-term trend and compresses the numerical range from [−709.58, 198.16] to [−6.51, 3.84], resulting in higher consistency of dynamic characteristics between the training and test sets.

On this basis, a windowed re-normalization method is further applied to independently normalize the data within each sliding window. This effectively constrains scale differences between different time windows and prevents inconsistent numerical ranges from interfering with model parameter updates. The overall preprocessing strategy ensures numerical stability and scale consistency of the input data, providing a standardized representation for subsequent high-precision ground deformation prediction.
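The two preprocessing steps, together with their inverses, can be sketched as follows. Min-max scaling inside each window is an assumption made for this example, since the original does not state which normalization is applied, and all function names are hypothetical.

```python
def first_difference(series):
    """Increment between consecutive epochs (length shrinks by one)."""
    return [b - a for a, b in zip(series, series[1:])]


def integrate(increments, start):
    """Inverse of first_difference: cumulative sum from a known start value."""
    out = [start]
    for inc in increments:
        out.append(out[-1] + inc)
    return out


def normalize_window(window, eps=1e-8):
    """Min-max scale one window to [0, 1]; return the values and the stats."""
    lo, hi = min(window), max(window)
    scale = max(hi - lo, eps)  # guard against a constant window
    return [(v - lo) / scale for v in window], (lo, scale)


def denormalize_window(values, stats):
    """Undo normalize_window using the stored (offset, scale) pair."""
    lo, scale = stats
    return [v * scale + lo for v in values]


# Toy cumulative deformation (mm) with a clear subsidence trend.
cumulative = [0.0, -1.5, -3.5, -4.0, -6.0]
increments = first_difference(cumulative)
normalized, stats = normalize_window(increments)
recovered = integrate(denormalize_window(normalized, stats), cumulative[0])
```

At prediction time, the same chain would be run in reverse: predicted increments are rescaled with the statistics of their window and then accumulated from the last observed value to obtain cumulative deformation.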

To clarify the potential influence of first-order differencing and window re-normalization on long-term trend recovery and error accumulation, we provide a systematic theoretical derivation and error propagation analysis in Text S4 of the Supplementary Materials. The results indicate that within the temporal scale and error control level of this study, the proposed preprocessing strategy does not introduce significant trend bias, and the overall error propagation remains within a controllable range.

4.2.2. Prediction Results

After fully training the model on the training set, this study first qualitatively evaluates the performance of ground deformation prediction on the test set. Figure 10 presents the prediction results of the model at several key time points in two typical deformation zones: DF4 (subsidence zone) and DF9 (uplift zone). The complete spatiotemporal evolution results for the entire study area are provided in Figure S4 of the Supplementary Material. By comparing the magnitude at the deformation centers and the spatial distribution of the deformation field, it can be seen that the model accurately captures and reconstructs the overall trends of subsidence and uplift in the deformation areas. To further evaluate the ability of the model to capture the characteristics of periodic deformation, the deformation contours of −140 mm and 110 mm are extracted in Figure 10a and Figure 10b, respectively. By analyzing the temporal evolution of the contour envelopes, it is observed that the model successfully predicts the seasonal periodic characteristics of ground deformation. From February to June each year, the subsidence area of DF4 gradually expands, while the uplift area of DF9 correspondingly shrinks. Conversely, from July to January of the following year, these trends reverse, with the DF4 subsidence area shrinking and the DF9 uplift area expanding. The 3-D dynamic visualizations provided in Videos S1 and S2 of the Supplementary Material further illustrate this periodic variation process more intuitively. The above results are highly consistent with actual observations in terms of both spatial scale and temporal rhythm, preliminarily demonstrating the effectiveness of the model in capturing the nonlinear evolution characteristics of complex spatiotemporal processes.

Building on the qualitative analysis, this study further evaluates the prediction accuracy of the model through quantitative comparison. Traditional time series methods, including Polynomial Fitting (PF), Log-Logistic curve Fitting (LLF), and Support Vector Regression (SVR), as well as deep learning models such as LSTM, ConvLSTM, Convolutional Gated Recurrent Unit (ConvGRU), and PredRNN, are employed as baseline models for comparison. Among these, PF, LLF, and SVR adopt an independent modeling and prediction strategy for each pixel. LSTM serves as a global time-series model with shared weights across all pixels, whereas the training procedures for spatiotemporal prediction models such as ConvLSTM, ConvGRU, and PredRNN are consistent with the approach proposed in this study.

Based on the seven evaluation metrics defined in Section 3.3.3 (Table 5), the models exhibit distinct variations in prediction accuracy and structural fidelity. While LLF outperforms PF due to its superior alignment with the observed “acceleration-deceleration” (inverted S-shaped) evolution pattern, such empirical functional models suffer from inherent generalization constraints. In contrast, SVR and LSTM effectively capture nonlinear temporal dependencies. LSTM, in particular, leverages gated memory units to better represent stage-wise deformation characteristics. However, these point-wise approaches neglect the spatial coupling between adjacent pixels, often resulting in spatial discontinuities when applied to large-scale deformation fields. Comparatively, spatiotemporal models demonstrate superior overall performance, with PredRNNv2 yielding the best results across all metrics: numerical error (MSE, MAE, SMAPE), trend consistency (WIA, EVS), and spatial structure similarity (SSIM, PSNR).

To further investigate the underlying mechanisms behind the performance differences among the models, this study randomly selected a typical sample and analyzed the spatial distribution of the absolute prediction errors for the four spatiotemporal prediction models, as shown in Figure 11. The results show that as the prediction time step increases, the errors of all models exhibit a clear cumulative effect, with error magnitudes gradually increasing and the high error regions expanding spatially. However, the rate of error growth differs significantly among the models. ConvGRU, with its relatively simplified gating structure, has limited capacity to capture long-term spatiotemporal dependencies. Consequently, its high error regions expand more rapidly, and its overall accuracy is lower than that of ConvLSTM, which employs independent memory units. In contrast, PredRNN constructs dual information transmission paths by incorporating the ST-LSTM structure and a vertical memory flow mechanism, effectively enhancing the long-term memory capacity of the model for complex spatiotemporal evolution, thereby significantly slowing the error accumulation process. PredRNNv2, which achieves the best performance, exhibits the slowest error growth over time and the smallest spatial extent of high error regions. This indicates that the decoupling loss operator effectively decouples spatiotemporal features and further enhances the stability of the model in multi-step prediction by suppressing information redundancy [27].

4.3. Deformation Pattern in the Study Area

Ground deformation in the Beijing Plain exhibits significant spatial heterogeneity, presenting a variety of distinct evolution patterns [43]. To further verify the capability of the proposed spatiotemporal prediction model in capturing different dynamic characteristics, this study first separates the dominant drivers in the time-series deformation signals using ICA. The resulting independent component scores and mixing matrix are shown in Figure 12. According to the elbow method, three dominant independent components (ICs) were ultimately extracted, revealing three typical deformation patterns in the study area: (1) IC1 exhibits approximately linear changes, and its spatial scores characterize the basic deformation pattern in the study area. In this component, the blue regions correspond to linear subsidence caused by long-term groundwater overexploitation, while the red regions reflect relatively stable ground uplift trends. (2) IC2 exhibits a gradually decreasing trend over time, with its high-score regions primarily located in the subsidence center at the junction of Chaoyang District and Tongzhou District. This indicates that the deformation rate in this area has entered a stage of nonlinear deceleration. (3) The time series of IC3 exhibits seasonal fluctuations along with some residual noise, and its spatial scores are relatively evenly distributed across the plain. This component mainly reflects the periodic elastic deformation response driven by seasonal variations in precipitation and temperature [47].

Although ICA extracted only three dominant ICs, the differences in their combinations and weight distributions give rise to more complex spatiotemporal deformation patterns. Therefore, this study further applies the K-means algorithm to perform spatial clustering on the scores of each independent component, subdividing the ground deformation patterns in the study area into six categories. Their spatial distribution is shown in Figure 13a. By extracting representative feature points from each category and plotting their deformation time series (Figure 13b–g), the dynamic characteristics of the different modes can be clearly observed: The first type exhibits continuous, weak linear uplift. The second type also shows an upward trend but is more strongly influenced by seasonal periodic deformation. The third and fourth types correspond to relatively stable deformation regions with slight fluctuations and high stability, respectively. The fifth type represents a nonlinear subsidence zone with a gradually decreasing deformation rate. The sixth type shows a significant subsidence area with approximately linear behavior. By comparing the original deformation monitoring data of representative feature points with the prediction curves of the model, it can be observed that the model effectively reproduces the time-series characteristics in both the linear evolution regions with relatively stable trends and the nonlinear evolution regions with more complex dynamics, demonstrating high consistency in spatiotemporal prediction.

Further considering the geographical locations of the major deformation areas, it can be observed that DF1, DF2, and DF3 mainly belong to the fifth category, with their subsidence rates in recent years exhibiting a clear deceleration trend. The funnel areas of DF5, DF6, DF7, and DF8 correspond to the sixth type, maintaining approximately linear subsidence characteristics. Although DF4 also falls into the sixth category, the IC decomposition results indicate that its deformation rate has slowed, reflecting a transition from linear to nonlinear behavior. Additionally, DF9 and DF10 (second type) display a relatively pronounced ground uplift trend, while DF11 (first type) exhibits a relatively mild linear recovery.

The results of ICA decomposition and spatial clustering analysis indicate that the model proposed in this study can not only effectively capture regional-scale trend deformation characteristics but also accurately identify nonlinear and periodic fine-scale responses induced by geological and environmental changes. This fully demonstrates the robustness and reliability of the prediction framework in handling complex, heterogeneous ground deformation data.

5. Discussion

5.1. The Applicability of the Model

To further verify the applicability of the model under different geological environments and deformation dynamics, this study constructs a set of simulation datasets integrating a variety of typical deformation modes. Considering that actual ground deformation fields often exhibit clustered or funnel-shaped spatial distributions, this study employs a distorted 2-D Gaussian surface to simulate the basic structure of the spatial deformation field [48]. In the temporal dimension, the Bézier curve is used to model the deformation time series, capturing the nonlinear and multi-stage evolution processes that ground deformation may undergo [49]. In addition, to realistically reproduce the spatially correlated noise commonly present in MT-InSAR inversion results, multi-scale fractal Perlin noise is introduced to simulate the effects of residual terrain errors and turbulent atmospheric phase delays [50]. By superimposing the above components, nine representative deformation modes are constructed within a 384 × 384 pixel grid (each deformation occupying a 128 × 128 pixel area): accelerated deformation, linear deformation, decelerated deformation, acceleration followed by deceleration, stable (no significant deformation), deceleration followed by acceleration, uplift followed by subsidence, periodic deformation, and subsidence followed by uplift. Their spatial distribution is shown in Figure 14a. The cumulative deformation time series extracted from the center of each deformation region (Figure 14b) demonstrates that the constructed simulation samples exhibit high complexity in both spatial structure and temporal evolution. The 3-D dynamic visualization presented in Video S3 of the Supplementary Material further illustrates the spatiotemporal evolution characteristics of the simulated deformation process more intuitively.
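A single simulated deformation mode of this kind can be sketched by multiplying a spatial Gaussian weight by a Bézier-shaped temporal history. The sketch makes several assumptions that are not taken from the original: the "distortion" is modelled as a rotated, anisotropic Gaussian, the Bézier curve is a cubic defined by four control values over normalized time, and the Perlin noise term is omitted. All names and numbers are hypothetical.

```python
import math


def gaussian_bowl(size, sigma_x, sigma_y, theta=0.0):
    """Unit-amplitude rotated anisotropic 2-D Gaussian centred on the grid."""
    c = (size - 1) / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    grid = []
    for r in range(size):
        row = []
        for col in range(size):
            dx, dy = col - c, r - c
            u = cos_t * dx + sin_t * dy    # coordinates in the rotated frame
            v = -sin_t * dx + cos_t * dy
            row.append(math.exp(-0.5 * ((u / sigma_x) ** 2 + (v / sigma_y) ** 2)))
        grid.append(row)
    return grid


def bezier_history(controls, n_epochs):
    """Cubic Bezier curve (Bernstein form) sampled at n_epochs uniform steps."""
    p0, p1, p2, p3 = controls
    history = []
    for k in range(n_epochs):
        t = k / (n_epochs - 1)
        s = 1.0 - t
        history.append(
            s ** 3 * p0 + 3 * s ** 2 * t * p1 + 3 * s * t ** 2 * p2 + t ** 3 * p3
        )
    return history


def simulate(size, controls, n_epochs, sigma_x, sigma_y, theta=0.0):
    """Stack of deformation maps: central amplitude history x spatial weight."""
    shape = gaussian_bowl(size, sigma_x, sigma_y, theta)
    history = bezier_history(controls, n_epochs)
    return [[[amp * w for w in row] for row in shape] for amp in history]


# Toy subsidence funnel: 9 x 9 pixels, 5 epochs, ending at -80 mm in the centre.
field = simulate(9, (0.0, -10.0, -60.0, -80.0), 5, sigma_x=2.0, sigma_y=3.0, theta=0.3)
```

Changing the four control values changes the temporal mode; for example, control values that first rise and then fall would give an "uplift followed by subsidence" history.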

In the simulation data experiment, this study compares and analyzes traditional time series algorithms, including PF, LLF, and SVR, as well as deep learning models such as ConvLSTM and ConvGRU. The dataset partitioning method and training hyperparameter settings are consistent with those used in the measured data experiments. Quantitative evaluation results are presented in Table 6. Overall, the spatiotemporal prediction model outperforms traditional methods based on single-point modeling across all evaluation metrics, highlighting the important role of spatial context information in deformation prediction tasks. It is worth noting that because the temporal evolution of the simulation data is relatively smooth and the noise level is controllable, ConvGRU, with its relatively simplified gating structure, exhibits stronger generalization ability and achieves slightly better prediction performance than ConvLSTM. In contrast, PredRNN and PredRNNv2 effectively mitigate the problem of redundant accumulation of long-term memory information in traditional recurrent structures by introducing a dual-channel modeling mechanism that combines temporal memory and spatiotemporal memory. They maintain optimal prediction accuracy even when handling multi-stage, nonlinear evolution in the simulated deformation data.

Further analysis of the prediction results at each deformation center point (Figure 15) shows that the prediction curve of the model closely fits the true simulated values and remains consistent in both temporal rhythm and amplitude of change. This holds for both monotonic trend deformations and nonlinear deformations with inflection points and periodic fluctuations. In addition, the RMSE statistics (Figure S5) indicate that the error distributions across different regions exhibit similar patterns and remain at relatively low levels. This provides quantitative support for the above comparisons of typical time series and establishes consistency between the global statistical analysis and the local temporal fitting results. These results fully demonstrate that the model possesses strong robustness and adaptability across a variety of deformation dynamics and noise conditions, providing further theoretical support for its practical application in complex surface environments.

5.2. The Structural Design of the Model

To address constraints in model parameters and hardware resources, this study employs block operations in the encoder to achieve spatial downsampling and inverse block operations in the decoder to restore the original spatial resolution. Comparative experimental results (Table 7) show that the prediction accuracy of this approach is superior to that of the convolution and pooling downsampling strategy commonly used in U-Net-like structures [24,25]. Although the U-Net-like structures perform well in complex multi-channel image processing, for the ground deformation field with a single channel and highly sensitive spatial topology considered in this study, repeated convolution and pooling operations can easily introduce feature redundancy and cause subtle gradient information in the deformation field to be smoothed or lost during downsampling. In contrast, the ST-LSTM unit in PredRNNv2 possesses strong spatiotemporal feature modeling capabilities. The direct blocking strategy is used to effectively compress the input dimensions while preserving the spatial topological relationships of the original pixels to the greatest extent. This reduces information distortion caused by spatial resolution degradation and enables the model to use the original representation with higher fidelity for spatiotemporal evolution modeling.

In the data preprocessing stage, this study adopts a strategy combining first-order differencing and windowed re-normalization. The comparison results in Table 8 show that the model trained on the differenced sequences outperforms the model trained directly on the original sequences across various evaluation metrics, with the SMAPE index notably reduced from 64.44% to 13.18%. This improvement is mainly attributed to the fact that the first-order differencing effectively weakens the non-stationary trend component in the original deformation sequence, converting it into an approximately stationary deformation increment sequence and thereby reducing the difficulty of model optimization. This operation allows the model to focus more on the dynamic evolution of deformation, effectively alleviating issues such as “prediction lag” or “amplitude underestimation” that commonly arise when modeling non-stationary series directly, and further enhances the model’s stability and extrapolation ability in multi-step prediction.

The comparative experimental results for different loss functions (Table 9) show that the log-cosh loss function outperforms MAE and MSE in overall prediction performance. This phenomenon reflects the differing adaptability of various loss functions to InSAR observation errors. MSE (L2 norm) is highly sensitive to outliers; in the presence of residual atmospheric noise or local unwrapping errors, large squared errors can cause the model to overfit the noise. Although MAE (L1 norm) is robust to outliers, it is not differentiable at zero and its gradient magnitude does not shrink as the error decreases, which may introduce small oscillations in the later stages of training. In contrast, the log-cosh loss function behaves like MSE for small errors, ensuring rapid model convergence, while gradually approaching MAE for large errors, thereby reducing the influence of outliers on the gradient. Its combination of convergence efficiency and noise robustness enables it to achieve optimal prediction accuracy in surface deformation prediction tasks.
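The limiting behaviour of the log-cosh loss follows from two standard approximations: log cosh(e) ≈ e²/2 for small |e| and log cosh(e) ≈ |e| − log 2 for large |e|. The sketch below evaluates the loss in a numerically stable way and exposes its gradient, tanh(e), which is bounded by 1. The function names are illustrative and are not taken from the study's code.

```python
import math


def log_cosh(e):
    """Stable log(cosh(e)) using |e| + log(1 + exp(-2|e|)) - log(2)."""
    a = abs(e)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def log_cosh_grad(e):
    """Derivative of log(cosh(e)): tanh(e), always within [-1, 1]."""
    return math.tanh(e)


def log_cosh_loss(pred, target):
    """Mean log-cosh error over paired predictions and targets."""
    return sum(log_cosh(p - t) for p, t in zip(pred, target)) / len(pred)


small = log_cosh(0.01)    # close to 0.01**2 / 2 (MSE-like regime)
large = log_cosh(50.0)    # close to 50 - log(2) (MAE-like regime)
```

The rewritten form avoids the overflow that a direct `math.cosh(e)` call would hit for very large residuals, such as those produced by unwrapping errors.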

5.3. Ground Deformation Coupling Mechanism

Through a comprehensive analysis of the hydrogeological environment in the Beijing Plain (Figure 16a), it can be seen that the spatial pattern of ground deformation is strongly influenced by the hydrogeological conditions and tectonic settings. Several significant ground subsidence areas (DF1–DF8) are located in regions with two to three sandy-gravel aquifers, while the ground uplift areas (DF9–DF11) mainly occur in multi-layer aquifer regions. This difference indicates that multi-layer aquifer systems are more likely to produce considerable elastic rebound effects during water level recovery, whereas areas with a limited number of aquifers tend to exhibit settlement responses dominated by compression. In addition, the spatial distribution of the main deformation zones is clearly constrained by fault structures. These zones not only extend along the fault lines but are also segmented into discontinuous, independent units [51]. This phenomenon reflects the barrier effect of faults on groundwater flow and pore pressure propagation, which limits the continuous spatial diffusion of deformation.

Combined with the evolution of Beijing’s water supply structure (Figure 17a), it can be observed that since the initiation of the middle route of the SNWD project in 2014, external water transfer and reclaimed water have gradually replaced groundwater as the main water supply source in Beijing, fundamentally alleviating regional land subsidence caused by long-term overexploitation. At the same time, to restore the water ecological environment, Beijing has successively launched ecological water replenishment projects in the Yongding River, Chaobai River, and Beiyun River basins since 2019, continuously supplying water downstream through upstream reservoirs. To quantitatively assess the impact of ecological water replenishment on the groundwater system and ground deformation, this study extracts groundwater levels and ground deformation fields from June 2019 to October 2025, as shown in Figure 16b,c. Near the ecological water replenishment channels, three distinct groundwater level rise areas have formed: in the central part of Shunyi District, the western part of Pinggu District, and the western part of Haidian District. Their spatial distribution closely corresponds to the three ground uplift areas, DF9, DF10, and DF11. This spatial correlation indicates that ecological water replenishment significantly alters the stress-strain state of local aquifers by raising the groundwater head, serving as an important external driving factor for ground uplift [52].

By further analyzing the evolution curves of groundwater levels and ground deformation at typical monitoring wells (Figure 17b–d), this study identifies three typical dynamic response patterns of ground deformation:

(1) Synergetic uplift mode: Represented by monitoring well W87, the groundwater level in this area closely matches the ground deformation in both long-term trends and periodic fluctuations. Notably, after the implementation of ecological water replenishment in 2021, the rapid rise in groundwater level drove elastic rebound in the aquifer, resulting in significant ground uplift.
(2) Deceleration convergence mode: Represented by monitoring well W52, although the area is still experiencing ground settlement, the settlement rate has significantly slowed since 2021. This indicates that the rise in groundwater level effectively suppresses the continuous increase in effective stress, gradually slows the consolidation process, and causes surface deformation to exhibit a converging trend, transitioning from rapid settlement to a stable state.
(3) Lag linear mode: Represented by monitoring well W21, this type of area exhibits a lag phenomenon where the groundwater level gradually rises while the ground continues to experience linear settlement. This is mainly because the regional groundwater head has not yet recovered to the new pre-consolidation head height, and the aquifer system is still undergoing primary consolidation compression in an unsteady state. As a result, the accumulated compressive deformation cannot be immediately reversed by short-term increases in water level [53].

On this basis, this study further selects groundwater monitoring wells near each significant deformation area for verification (monitoring wells are not available near the DF7 and DF8 areas, so these are excluded from the analysis), as shown in Figures S6 and S7 of the Supplementary Material. The DF1–DF4 areas exhibit the same deceleration convergence mode observed at monitoring well W52. The DF5 and DF6 regions display a lag linear mode similar to that of W21, while the DF9–DF11 areas correspond to the synergetic uplift mode seen at W87.

Although these three dynamic patterns differ significantly in their long-term evolutionary trends, they are all modulated by seasonal rainfall variations. In the study area, precipitation is mainly concentrated between June and August each year, forming a pronounced seasonal recharge pulse. Through infiltration and internal transmission within the aquifer system, rainfall is converted into groundwater recharge, leading to periodic fluctuations in groundwater levels. Due to recharge transmission processes and the lagged response of groundwater dynamics, groundwater level peaks usually occur after the precipitation peaks [54]. Seasonal groundwater fluctuations regulate pore water pressure and effective stress conditions, triggering elastic or viscoelastic responses of the strata, which are manifested as episodic uplift or changes in the subsidence rates. Driven by the region’s geological complexity and heterogeneous permeability, ground deformation typically exhibits a further phase lag of approximately 1–2 months relative to groundwater dynamics [55].
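One common way to quantify a phase lag like the one described above is to shift one series against the other and keep the shift with the highest correlation. The sketch below does this for two synthetic seasonal signals sampled every 12 days. It is an illustration only: the original does not state how its lag was estimated, and the function names and signals are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lag(driver, response, max_lag):
    """Lag (in samples) at which response best follows driver.

    For each candidate lag, driver[t] is compared with response[t + lag];
    both inputs are assumed to have the same length and sampling.
    """
    best = (0, -2.0)
    for lag in range(max_lag + 1):
        n = len(driver) - lag
        r = pearson_r(driver[:n], response[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic seasonal signals: 30 samples per cycle (about 360 days at 12 days).
groundwater = [math.sin(2 * math.pi * t / 30) for t in range(120)]
deformation = [math.sin(2 * math.pi * (t - 4) / 30) for t in range(120)]

lag_samples, r_max = best_lag(groundwater, deformation, max_lag=10)
lag_days = lag_samples * 12
```

In this synthetic case the deformation signal was built with a delay of 4 samples, so the recovered lag corresponds to 48 days. Real series would first need detrending so that the long-term trend does not dominate the correlation.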

The above mechanism-based classification reveals the multi-source evolution of ground deformation in the Beijing Plain. It also supports the reliability and applicability of the proposed model for complex nonlinear prediction problems driven by policy measures and geological conditions.

5.4. Model Extension and Future Directions

This study verifies the modeling capability of the PredRNNv2 model for large-scale complex deformation time series prediction. However, there remains room for further extension and optimization.

First, collaborative modeling of multi-source driving factors. Previous studies have demonstrated that incorporating exogenous variables such as precipitation into time series prediction models can more comprehensively capture the physical response processes of ground deformation, and may enhance the detection of episodic changes and lagged effects [44,56,57]. Due to the heterogeneity in the spatiotemporal resolution of multi-source observational data, this study primarily focuses on validating the model under single-sequence deformation conditions. Future work could explore integrating hydrometeorological factors into the model input and jointly modeling them via feature fusion mechanisms, thereby further improving the physical consistency of the predictions.

Second, cross-scenario generalization and transferability. Although PredRNNv2 performs well on observed data from the Beijing Plain and in multi-pattern simulation experiments, its applicability in highly heterogeneous scenarios, such as severe subsidence in mining areas or discontinuous displacement in landslide regions, still requires further investigation. Future work will focus on constructing large-scale joint training datasets that cover typical geological settings such as mining areas, landslides, and permafrost regions. This would enrich sample diversity and reduce the limitations of single-scenario data, with the longer-term aim of establishing a general ground deformation prediction model with cross-scenario adaptive capability.

Finally, adaptation and fusion of multi-temporal-scale data. The data sampling frequency directly affects the model’s ability to capture dynamic features: high-frequency data can help identify sudden deformation events but may introduce more observational noise, whereas low-frequency data emphasize trend representation but may reduce sensitivity to nonlinear turning points and short-period fluctuations. In this study, a 12-day sampling interval was adopted, which suits the predominantly medium- to low-frequency deformation evolution of the study area. Future work could construct multi-scale deformation datasets via multi-source satellite data fusion methods (e.g., MQQA [58], Kalman filtering [57,59]) to systematically evaluate the impact of temporal resolution on prediction accuracy. Exploring multi-scale time series modeling strategies could further enhance the model’s adaptability under diverse observation conditions.

6. Conclusions

This study addresses the spatiotemporal prediction of large-scale, complex ground deformation by developing a spatiotemporally synchronous prediction framework that integrates data preprocessing, spatiotemporal modeling, pattern recognition, and physical mechanism analysis. The framework was validated through simulation experiments and an empirical study of the Beijing Plain. The main conclusions are summarized as follows:

(1): An efficient spatiotemporally synchronous prediction framework for ground deformation was established. To address the non-stationarity common in deformation time series, a first-order differencing and windowed re-normalization strategy was introduced. This strategy reduces the interference of long-term trend components with model training and improves the model’s ability to capture incremental changes in deformation. A patch-based encoder-decoder structure reduces the dimensionality of the high-dimensional spatiotemporal data, preserving pixel-level spatial topology while lowering the computational cost of large-scale prediction. Furthermore, combining ICA decomposition with K-means clustering extracts dominant deformation modes, such as linear, nonlinear, and periodic patterns, from the multidimensional spatiotemporal series, enabling both qualitative interpretation and quantitative evaluation of the prediction results.
(2): A deformation data simulation strategy that accounts for geological characteristics was proposed and validated. By integrating 2-D Gaussian surfaces, Bézier curves, and fractal Perlin noise, this study constructed a simulated deformation dataset that characterizes complex spatial morphology, multi-stage dynamic evolution, and spatially correlated noise. The simulation results show that, in deformation fields with significant spatial heterogeneity and nonlinear evolution, the spatiotemporal prediction models outperform traditional point-by-point prediction methods in both accuracy and stability. Among these models, PredRNNv2, leveraging its dual-channel memory flow mechanism, exhibits stronger generalization capability in capturing nonlinear transition processes and long-term dependencies.
(3): Stable prediction of the evolutionary trend of ground deformation in the Beijing Plain was achieved. Using ground deformation time series from 2015 to 2025 obtained via MT-InSAR, the applicability and reliability of the proposed strategy in complex urban environments were verified. The prediction results indicate that the method captures the spatiotemporal evolution of large-scale subsidence areas and also identifies subtle local deformation features, including transitions from subsidence to uplift under the influence of ecological water replenishment and human interventions, demonstrating its potential for urban geological safety monitoring.
(4): The coupling between ground deformation, groundwater dynamics, and hydrogeological conditions was revealed. Groundwater monitoring data and hydrogeological analysis show that the spatial variability of ground deformation in the Beijing Plain is jointly controlled by the barrier effects of fault structures and the heterogeneity of aquifer systems. Three typical dynamic response modes were summarized: synergetic uplift, deceleration convergence, and lag linear. Together they clarify the physical process through which changes in groundwater head, driven by ecological water replenishment, influence the elastic rebound and settlement of formations, providing a quantitative basis for understanding the nonlinear response of ground deformation to groundwater regulation measures.
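Conclusion (1) refers to first-order differencing combined with windowed re-normalization. The sketch below shows one plausible reading of that preprocessing step and its inverse; it is not the implementation used in this study. The per-window mean and standard deviation scaling, the function names (`difference`, `normalize_window`, `restore`), and the synthetic 4 × 4 deformation frames are all assumptions made for illustration.

```python
import numpy as np

def difference(frames):
    """First-order difference along the time axis (axis 0)."""
    return np.diff(frames, axis=0)

def normalize_window(window, eps=1e-8):
    """Standardize one input window with its own statistics.

    Returns the normalized window and the (mean, std) needed to invert it.
    """
    mean = float(window.mean())
    std = float(window.std()) + eps
    return (window - mean) / std, (mean, std)

def restore(pred_norm, stats, last_frame):
    """Undo the normalization, then integrate the increments starting
    from the last observed cumulative-deformation frame."""
    mean, std = stats
    increments = pred_norm * std + mean
    return last_frame + np.cumsum(increments, axis=0)

# Synthetic cumulative deformation (mm): 16 epochs on a 4 x 4 grid.
rng = np.random.default_rng(0)
rates = rng.uniform(-3.0, -1.0, size=(4, 4))           # mm per epoch
epochs = np.arange(16).reshape(-1, 1, 1)
frames = rates * epochs + 2.0 * np.sin(2 * np.pi * epochs / 8.0)

increments = difference(frames)                         # shape (15, 4, 4)
inputs, stats = normalize_window(increments[:10])       # model input window

# Stand-in for a perfect model: true future increments in normalized units.
perfect_pred = (increments[10:] - stats[0]) / stats[1]
restored = restore(perfect_pred, stats, last_frame=frames[10])

print(np.allclose(restored, frames[11:]))
```

The round trip recovers the original cumulative frames, which is the property that matters here: predictions made in the differenced, normalized space can be mapped back to deformation in millimetres without loss.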

In summary, the spatiotemporally synchronous prediction strategy proposed in this study improves the accuracy of ground deformation predictions, enhances the physical interpretability of the results, and provides a feasible technical approach for investigating large-scale ground deformation evolution and its driving mechanisms. The results can serve as a reference for urban land subsidence risk assessment, groundwater resources management, and underground space safety.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/rs18060925/s1: Text S1: Multi-Temporal InSAR processing workflow; Text S2: Comparison of LSTM, ConvLSTM, and PredRNN models; Text S3: Metrics for prediction results; Text S4: Errors introduced by data preprocessing; Figure S1: Structure of the LSTM memory cell; Figure S2: Structure of the ConvLSTM memory cell; Figure S3: Structure of the ST-LSTM memory cell; Figure S4: All prediction results of the PredRNNv2 model on the test set; Figure S5: Statistical distribution of prediction errors across different regions; Figure S6: Coupling mechanism between ground deformation at representative subsidence centers and variations in groundwater level and precipitation; Figure S7: Coupling mechanism between ground deformation at representative uplift centers and variations in groundwater level and precipitation; Video S1: 3D dynamic visualization of PredRNNv2-predicted deformation in the representative subsidence area DF4; Video S2: 3D dynamic visualization of PredRNNv2-predicted deformation in the representative uplift area DF9; Video S3: 3D dynamic visualization of the simulated ground deformation field.

Author Contributions

Conceptualization, Y.F. and J.W.; methodology, Y.F.; software, Y.F.; validation, Y.F. and L.K.; formal analysis, Y.F.; investigation, Y.F.; resources, Y.F. and J.W.; data curation, Y.F.; writing—original draft preparation, Y.F. and J.W.; writing—review and editing, Y.F., J.W. and Y.W.; visualization, Y.F.; supervision, Y.Z.; project administration, H.Z. and Y.Z.; funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors acknowledge the European Space Agency (ESA) for providing the Sentinel-1A satellite data and the Copernicus DEM (COP-DEM) dataset. We also express our gratitude to the Beijing Hydrological Station for supplying the groundwater depth data. Special thanks are given to Liqing Wu for assistance with English language polishing and limited data processing.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Herrera-García, G.; Ezquerro, P.; Tomás, R.; Béjar-Pizarro, M.; López-Vinielles, J.; Rossi, M.; Mateos, R.M.; Carreón-Freyre, D.; Lambert, J.; Teatini, P.; et al. Mapping the global threat of land subsidence. Science 2021, 371, 34–36.
2. Ao, Z.; Hu, X.; Tao, S.; Hu, X.; Wang, G.; Li, M.; Wang, F.; Hu, L.; Liang, X.; Xiao, J.; et al. A national-scale assessment of land subsidence in China’s major cities. Science 2024, 384, 301–306.
3. Zhang, L.; Chen, J.; Chen, L.; Luo, Y.; Liu, W. Deformation grading and prediction of heterogeneous layered soft rock tunnels under high ground-stress: A case study. Tunn. Undergr. Space Technol. 2026, 167, 107071.
4. Wang, J.; Hu, S.; Wang, T.; Liang, H.; Yang, Z. GNSS horizontal motion field in the Beijing Plain in view of the variation characteristics of the 3D deformation field. Remote Sens. 2023, 15, 787.
5. Lee, J.S.; Jeong, S.H.; Park, G.; Kim, Y.; Tutumluer, E.; Kim, S.Y. Geotechnical application of unmanned aerial vehicle (UAV) for estimation of ground settlement after filling and compaction. Transp. Geotech. 2025, 51, 101517.
6. Mohammadnia, M.; Yip, M.W.; Webb, A.A.G.; González, P.J. Spontaneous transient summit uplift at Taftan volcano (Makran subduction arc) imaged using an InSAR common-mode filtering method. Geophys. Res. Lett. 2025, 52, e2025GL114853.
7. Ma, P.; Chen, L.; Yu, C.; Zhu, Q.; Ding, Y.; Wu, Z.; Li, H.; Tian, C.; Fan, X. Dynamic landslide susceptibility mapping over last three decades to uncover variations in landslide causation in subtropical urban mountainous areas. Remote Sens. Environ. 2025, 326, 114800.
8. Zhao, Q.; Zhang, Y.; Pepe, A.; Mastro, P.; Zheng, T.; Yang, T. Coupled ground subsidence and rapid urbanization of the Red River Delta region and the city of Hanoi, Vietnam, revealed through a multi-track InSAR analysis. Int. J. Appl. Earth Obs. Geoinf. 2025, 144, 104886.
9. Ge, Z.; Wu, W.; Hu, J.; Muhetaer, N.; Zhu, P.; Guo, J.; Li, Z.; Zhang, G.; Bai, Y.; Ren, W. Evaluating the interferometric performance of China’s dual-star SAR satellite constellation in large deformation scenarios: A case study in the Jinchuan mining area, Gansu. Remote Sens. 2025, 17, 2451.
10. Wang, L.; Zhao, L.; Liu, S.; Zhou, H.; Hu, G.; Zou, D.; Du, E.; Liu, G.; Xiao, Y.; Chen, Y.; et al. Evaluation of stability and cooling engineering effectiveness of the Qinghai-Tibet transportation routes: A first comprehensive assessment using space geodetic observations. Eng. Geol. 2026, 361, 108502.
11. Ma, P.; Wu, Z.; Zhang, Z.; Au, F.T. SAR-transformer-based decomposition and geophysical interpretation of InSAR time-series deformations for the Hong Kong-Zhuhai-Macao Bridge. Remote Sens. Environ. 2024, 302, 113962.
12. Li, G.; Zhao, C.; Wang, B.; Liu, X.; Chen, H. Land Subsidence Monitoring and Dynamic Prediction of Reclaimed Islands with Multi-Temporal InSAR Techniques in Xiamen and Zhangzhou Cities, China. Remote Sens. 2022, 14, 2930.
13. Bajni, G.; Apuani, T.; Beretta, G.P. Hydro-geotechnical modelling of subsidence in the Como urban area. Eng. Geol. 2019, 257, 105144.
14. Kim, S.; Wdowinski, S.; Dixon, T.H.; Amelung, F.; Kim, J.W.; Won, J. Measurements and predictions of subsidence induced by soil consolidation using persistent scatterer InSAR and a hyperbolic model. Geophys. Res. Lett. 2010, 37, 2009GL041644.
15. Dai, S.; Zhang, Z.; Li, Z.; Liu, X.; Chen, Q. Prediction of Mining-Induced 3-D Deformation by Integrating Single-Orbit SBAS-InSAR, GNSS, and Log-Logistic Model (LL-SIG). IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–13.
16. Yang, Z.; Xu, B.; Li, Z.; Wu, L.; Zhu, J. Prediction of Mining-Induced Kinematic 3-D Displacements From InSAR Using a Weibull Model and a Kalman Filter. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–12.
17. Lv, J.; Zhang, R.; Bao, X.; Wu, R.; Hong, R.; He, X.; Liu, G. Time-series InSAR landslide three-dimensional deformation prediction method considering meteorological time-delay effects. Eng. Geol. 2025, 350, 107986.
18. Wang, X.; Yu, Q.; Ma, J.; Yang, L.; Liu, W.; Li, J. Study and Prediction of Surface Deformation Characteristics of Different Vegetation Types in the Permafrost Zone of Linzhi, Tibet. Remote Sens. 2022, 14, 4684.
19. Ma, J.; Xia, D.; Guo, H.; Wang, Y.; Niu, X.; Liu, Z.; Jiang, S. Metaheuristic-based support vector regression for landslide displacement prediction: A comparative study. Landslides 2022, 19, 2489–2511.
20. Liu, Z.; Ng, A.H.M.; Wang, H.; Chen, J.; Du, Z.; Ge, L. Land subsidence modeling and assessment in the west Pearl River Delta from combined InSAR time series, land use and geological data. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103228.
21. Chen, Y.; Chen, X.; Guo, S.; Li, H.; Du, P. A novel surface deformation prediction method based on AWC-LSTM model. Int. J. Appl. Earth Obs. Geoinf. 2024, 135, 104292.
22. Zhou, C.; Ye, M.; Xia, Z.; Wang, W.; Luo, C.; Muller, J.P. An interpretable attention-based deep learning method for landslide prediction based on multi-temporal InSAR time series: A case study of Xinpu landslide in the TGRA. Remote Sens. Environ. 2025, 318, 114580.
23. Wang, J.; Li, C.; Li, L.; Huang, Z.; Wang, C.; Zhang, H.; Zhang, Z. InSAR time-series deformation forecasting surrounding Salt Lake using deep transformer models. Sci. Total Environ. 2023, 858, 159744.
24. Yao, S.; He, Y.; Zhang, L.; Yang, W.; Chen, Y.; Sun, Q.; Zhao, Z.; Cao, S. A ConvLSTM neural network model for spatiotemporal prediction of mining area surface deformation based on SBAS-InSAR monitoring data. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–22.
25. Hu, J.; Zhang, Z.; Zhu, X.; Zhang, X.; Yang, S.; Huang, C.; Wang, W.; Li, X.; Hou, L.; Zhao, L. Geological hazard susceptibility assessment and forecasting analysis based on InSAR and C-L-A model. Int. J. Appl. Earth Obs. Geoinf. 2025, 143, 104840.
26. Jiang, Y.; Xu, Q.; Meng, R.; Zhang, C.; Zheng, L.; Lu, Z. Remote sensing characterizing and deformation predicting of Yan’an New District’s mountain excavation and city construction with dual-polarization MT-InSAR method. Int. J. Appl. Earth Obs. Geoinf. 2025, 136, 104364.
27. Wang, Y.; Wu, H.; Zhang, J.; Gao, Z.; Wang, J.; Yu, P.S.; Long, M. PredRNN: A recurrent neural network for spatiotemporal predictive learning. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 2208–2225.
28. Zhao, D.; Chen, B.; Gong, H.; Lei, K.; Zhou, C.; Hu, J. Unraveling the deformation and water storage characteristics of different aquifer groups by integrating PS-InSAR technology and a spatial correlation model. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 2501–2515.
29. Chen, B.; Gong, H.; Chen, Y.; Li, X.; Zhou, C.; Lei, K.; Zhu, L.; Duan, L.; Zhao, X. Land subsidence and its relation with groundwater aquifers in Beijing Plain of China. Sci. Total Environ. 2020, 735, 139111.
30. Long, D.; Yang, W.; Scanlon, B.R.; Zhao, J.; Liu, D.; Burek, P.; Pan, Y.; You, L.; Wada, Y. South-to-North Water Diversion stabilizing Beijing’s groundwater levels. Nat. Commun. 2020, 11, 3665.
31. Zhou, C.; Tang, Q.; Zhao, Y.; Warner, T.A.; Liu, H.; Clague, J.J. Reduction of subsidence and large-scale rebound in the Beijing Plain after anthropogenic water transfer and ecological recharge of groundwater: Evidence from long time-series satellites InSAR. Remote Sens. 2024, 16, 1528.
32. Ferretti, A.; Prati, C.; Rocca, F. Permanent scatterers in SAR interferometry. IEEE Trans. Geosci. Remote Sens. 2001, 39, 8–12.
33. Lei, K.; Ma, F.; Chen, B.; Luo, Y.; Cui, W.; Zhou, Y.; Liu, H.; Sha, T. Three-dimensional surface deformation characteristics based on time series InSAR and GPS technologies in Beijing, China. Remote Sens. 2021, 13, 3964.
34. Dong, J.; Guo, S.; Wang, N.; Zhang, L.; Ge, D.; Liao, M.; Gong, J. Tri-decadal evolution of land subsidence in the Beijing Plain revealed by multi-epoch satellite InSAR observations. Remote Sens. Environ. 2023, 286, 113446.
35. Lu, G.Y.; Wong, D.W. An Adaptive Inverse-Distance Weighting Spatial Interpolation Technique. Comput. Geosci. 2008, 34, 1044–1055.
36. Kim, T.; Kim, J.; Tae, Y.; Park, C.; Choi, J.H.; Choo, J. Reversible instance normalization for accurate time-series forecasting against distribution shift. In Proceedings of the International Conference on Learning Representations, Virtual, 25–29 April 2022; pp. 1–25.
37. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
38. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.k.; Woo, W.c. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In Proceedings of the 29th International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; Volume 1, pp. 802–810.
39. Willmott, C.J.; Robeson, S.M.; Matsuura, K. A refined index of model performance. Int. J. Climatol. 2012, 32, 2088–2094.
40. Wang, Z.; Bovik, A.; Sheikh, H.; Simoncelli, E. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
41. Huynh-Thu, Q.; Ghanbari, M. Scope of validity of PSNR in image/video quality assessment. Electron. Lett. 2008, 44, 800–801.
42. Hyvarinen, A.; Oja, E. A fast fixed-point algorithm for independent component analysis. Neural Comput. 1997, 9, 1483–1492.
43. Lai, S.; Lin, J.; Dong, J.; Wu, J.; Huang, X.; Liao, M. Investigating overlapping deformation patterns of the Beijing Plain by independent component analysis of InSAR observations. Int. J. Appl. Earth Obs. Geoinf. 2024, 135, 104279.
44. Yang, Y.; Dou, J.; Merghadi, A.; Liang, W.; Dong, A.; Xiong, D.; Zhang, L. Advanced prediction of landslide deformation through temporal fusion transformer and multivariate time-series clustering of InSAR: Insights from the Badui region, eastern Tibet. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–19.
45. Festa, D.; Novellino, A.; Hussain, E.; Bateson, L.; Casagli, N.; Confuorto, P.; Del Soldato, M.; Raspini, F. Unsupervised detection of InSAR time series patterns based on PCA and K-means clustering. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103276.
46. Shi, X.; Gao, Z.; Lausen, L.; Wang, H.; Yeung, D.Y.; Wong, W.k.; Woo, W.c. Deep learning for precipitation nowcasting: A benchmark and a new model. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30, pp. 5622–5632.
47. Meng, D.; Chen, B.; Gong, H.; Zhang, S.; Ma, R.; Zhou, C.; Lei, K.; Xu, L.; Wang, X. Land subsidence and rebound response to groundwater recovery in the Beijing Plain: A new hydrological perspective. J. Hydrol. Reg. Stud. 2025, 57, 102127.
48. Wu, Z.; Wang, T.; Wang, Y.; Wang, R.; Ge, D. Deep Learning for the Detection and Phase Unwrapping of Mining-Induced Deformation in Large-Scale Interferograms. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–18.
49. Baydas, S.; Karakas, B. Defining a curve as a Bezier curve. J. Taibah Univ. Sci. 2019, 13, 522–528.
50. Valade, S.; Ley, A.; Massimetti, F.; D’Hondt, O.; Laiolo, M.; Coppola, D.; Loibl, D.; Hellwich, O.; Walter, T.R. Towards Global Volcano Monitoring Using Multisensor Sentinel Missions and Artificial Intelligence: The MOUNTS Monitoring System. Remote Sens. 2019, 11, 1528.
51. Zhou, C.; Gong, H.; Zhang, Y.; Warner, T.; Wang, C. Spatiotemporal evolution of land subsidence in the Beijing Plain 2003–2015 using persistent scatterer interferometry (PSI) with multi-source SAR data. Remote Sens. 2018, 10, 552.
52. Fu, Y.; Wang, J.; Zhang, Y.; Yang, H.; Li, L.; Ren, Z. Spatiotemporal evolution characteristics of ground deformation in the Beijing Plain from 1992 to 2023 derived from a novel multi-sensor InSAR fusion method. Remote Sens. Environ. 2025, 319, 114635.
53. Yu, X.; Wang, G.; Hu, X.; Liu, Y.; Bao, Y. Land Subsidence in Tianjin, China: Before and after the South-to-North Water Diversion. Remote Sens. 2023, 15, 1647.
54. Zhong, X.; Gong, H.; Chen, B.; Zhou, C.; Xu, M. Study on the evolution of shallow groundwater levels and its spatiotemporal response to precipitation in the Beijing Plain of China based on variation points. Ecol. Indic. 2024, 166, 112466.
55. Chen, J.; Zheng, G.; Zeng, C.F.; Xue, X.L. Delayed land subsidence during dewatering in multi-aquifer systems: Mechanisms, patterns and assessment. J. Hydrol. 2026, 664, 134462.
56. Yuan, Y.; Zhang, D.; Cui, J.; Zeng, T.; Zhang, G.; Zhou, W.; Wang, J.; Chen, F.; Guo, J.; Chen, Z.; et al. Land subsidence prediction in Zhengzhou’s main urban area using the GTWR and LSTM models combined with the attention mechanism. Sci. Total Environ. 2024, 907, 167482.
57. Cai, J.; Ming, D.; Liu, F.; Zhao, W.; Zhang, M.; Ling, X.; Zhu, M.; Xu, L.; Lu, T.; Liu, N.; et al. An enhanced spatiotemporal prediction method on landslide displacement with LDP-ConvFormer and MT-InSAR observations. ISPRS J. Photogramm. Remote Sens. 2026, 232, 594–612.
58. Zhao, Q.; Ma, G.; Wang, Q.; Yang, T.; Liu, M.; Gao, W.; Falabella, F.; Mastro, P.; Pepe, A. Generation of Long-Term InSAR Ground Displacement Time-Series Through a Novel Multi-Sensor Data Merging Technique: The Case Study of the Shanghai Coastal Area. ISPRS J. Photogramm. Remote Sens. 2019, 154, 10–27.
59. Cai, J.; Liu, G.; Jia, H.; Zhang, B.; Wu, R.; Fu, Y.; Xiang, W.; Mao, W.; Wang, X.; Zhang, R. A New Algorithm for Landslide Dynamic Monitoring with High Temporal Resolution by Kalman Filter Integration of Multiplatform Time-Series InSAR Processing. Int. J. Appl. Earth Obs. Geoinf. 2022, 110, 102812.

Figure 1. Geographical location of the study area and coverage of satellite imagery. (a) The blue lines indicate the main river systems in Beijing and the brown lines represent the major fault zones. (b) Geographical location of Beijing within China. (c) Coverage of different SAR acquisition tracks.

Figure 2. Spatiotemporal baseline maps of different satellite datasets (for clarity, the spatial baseline of the descending orbit images was offset upward by 300 m).

Figure 3. Flow chart of the spatiotemporal prediction framework for ground deformation.

Figure 4. Schematic diagram of dataset division (to better illustrate the deformation process, the images shown in the diagram are not consecutive frames).

Figure 5. Structure of the spatiotemporal prediction model.

Figure 6. Structure of the ST-LSTM memory cell.

Figure 7. Ground deformation in the Beijing Plain from 2015 to 2025. (a) Spatial distribution of ground deformation rates. (b) Deformation time series of selected feature points (the geographic locations of the feature points are shown in Figure 7a).

Figure 8. Correlation between ascending and descending orbit deformation measurements.

Figure 9. Comparison of ground deformation data before and after differencing. (a) Frequency distribution histogram of the original dataset. (b) Boxplot of the original dataset. (c) Frequency distribution histogram after differencing. (d) Boxplot after differencing. (e) Deformation time series of typical feature points before and after differencing (the geographical locations of the feature points are shown in Figure 7a).

Figure 10. Spatial distribution of ground deformation predicted by the model (owing to space limitations, only several key time points are shown). (a) Predicted deformation field of DF4 in the significant subsidence area; the white solid line represents the −140 mm contour. (b) Predicted deformation field of DF9 in the significant uplift area; the white solid line represents the 110 mm contour.

Figure 11. Spatial distribution of prediction errors for different spatiotemporal prediction models (owing to space limitations, only even time steps are shown).

Figure 12. ICA decomposition results. (a–c) Scores of the different independent components. (d) Time-varying patterns of the mixing matrix.

Figure 13. K-means clustering results. (a) Spatial distribution maps of different categories. (b–g) Comparison between the original deformation monitoring time series (black solid line) of representative feature points and the model prediction results (red dotted line).

Figure 14. Simulated ground deformation results. (a) 2-D deformation field. (b) Deformation time series of feature points in different regions (the locations of feature points are shown in Figure 14a).

Figure 15. Prediction results of the PredRNNv2 model at representative feature points (the black solid line is the historical observation sequence, the blue dotted line is the reference future deformation values, and the red dotted line is the model prediction value).

Figure 16. Factors affecting ground deformation. (a) Hydrogeological environment of the Beijing Plain (data from [31]). (b) Groundwater head change rate from June 2019 to October 2025. (c) Ground deformation rate from June 2019 to October 2025.

Figure 17. Groundwater and ground deformation sequence (the geographical locations of feature points are shown in Figure 16c). (a) Water supply structure in Beijing from 2015 to 2024. (b–d) Variation sequences of ground deformation, groundwater level, and precipitation at several characteristic points.

Table 1. Basic parameters of SAR satellite data.

SAR Satellite	Sentinel-1A	Sentinel-1A
Incident angle (°)	29–46	29–46
Orbit direction	Ascending	Descending
Resolution (Rg × Az) (m)	5 × 20	5 × 20
No. of images	325	306
Timespan	2015.07.30–2025.10.05	2014.10.08–2021.11.30
Acquisition dates	283	153
Polarization	VV	VV

Table 2. Key parameters of the primary ground deformation areas.

Name	Contour (mm)	Area (km²)	Max. Deformation (mm)	Type
DF1	−20	138.46	−452.03	Subsidence
DF2	−20	86.03	−233.09	Subsidence
DF3	−40	20.21	−154.01	Subsidence
DF4	−140	368.36	−709.58	Subsidence
DF5	−130	41.94	−300.85	Subsidence
DF6	−130	239.11	−481.32	Subsidence
DF7	−130	19.23	−303.19	Subsidence
DF8	−130	17.88	−277.81	Subsidence
DF9	110	175.10	178.73	Uplift
DF10	80	77.70	116.25	Uplift
DF11	50	179.30	84.10	Uplift

Table 3. Basic information of the experimental platform.

Project	Operating System	CPU	GPU
Content	deepin 25	Intel(R) Xeon(R) Gold 5128 CPU @2.30 GHz	NVIDIA A100-PCIE-40GB

Table 4. Main software configurations.

Project	Graphics Driver	CUDA	CUDNN	Python	PyTorch
Content	570.124.04	V12.8	V9.8.0	3.10.8	2.9.1

Table 5. Evaluation of prediction performance for different models.

Model	MSE↓	MAE↓	SMAPE↓	WIA↑	EVS↑	SSIM↑	PSNR↑
PF	1727.8143	32.0608	110.1501	0.9296	0.9118	0.3184	27.7269
LLF	404.8398	13.5290	48.1150	0.9799	0.9300	0.8797	38.6271
SVR	131.2081	9.2582	39.5567	0.9934	0.9744	0.8864	36.3493
LSTM	50.4429	3.7942	14.8464	0.9972	0.9911	0.9412	41.1167
ConvGRU	34.0857	4.0672	24.9155	0.9983	0.9947	0.9466	46.4410
ConvLSTM	21.2619	3.2697	20.1696	0.9989	0.9963	0.9646	48.6967
PredRNN	13.0308	2.3582	13.8162	0.9994	0.9984	0.9809	51.7289
PredRNNv2	8.9647	1.9902	13.1846	0.9996	0.9984	0.9846	53.0632
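For readers who want to reproduce the point-wise error metrics in Table 5, the sketch below implements commonly used definitions of MSE, MAE, SMAPE, and the refined Willmott index of agreement [39]. The definitions applied in this article are given in Text S3 of the Supplementary Materials and may differ in detail. The percentage form of SMAPE (range 0 to 200), the function names, and the four-sample arrays are assumptions made for illustration.

```python
import numpy as np

def mse(obs, pred):
    """Mean squared error."""
    return float(np.mean((pred - obs) ** 2))

def mae(obs, pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(pred - obs)))

def smape(obs, pred):
    """Symmetric mean absolute percentage error, in percent (0 to 200).

    Samples where both values are zero contribute zero error.
    """
    num = 2.0 * np.abs(pred - obs)
    den = np.abs(obs) + np.abs(pred)
    ratio = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return float(100.0 * np.mean(ratio))

def refined_willmott(obs, pred):
    """Refined index of agreement (Willmott et al., 2012), range -1 to 1.

    Assumes the observations are not all identical.
    """
    err = np.sum(np.abs(pred - obs))
    spread = 2.0 * np.sum(np.abs(obs - obs.mean()))
    if err <= spread:
        return float(1.0 - err / spread)
    return float(spread / err - 1.0)

# Hypothetical values in mm, for illustration only.
obs = np.array([1.0, 2.0, 3.0, 4.0])
pred = np.array([1.0, 2.0, 3.0, 5.0])

scores = {
    "MSE": mse(obs, pred),
    "MAE": mae(obs, pred),
    "SMAPE": smape(obs, pred),
    "WIA": refined_willmott(obs, pred),
}
print(scores)
```

The image-based metrics in the table (SSIM and PSNR) depend on window size and data range settings, so they are left out of this sketch.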