CED-LSTM: A Coherence-Conditioned Encoder–Decoder Network for Robust InSAR Time-Series Deformation Extraction in Open-Pit Mines

Wang, Yanping; Kong, Xiangbo; Bai, Zechao; Li, Yang; Lu, Yao; Tang, Weikai; Lin, Yun; Shen, Wenjie; Cai, Guanjun

doi:10.3390/rs18070984


by Yanping Wang ¹, Xiangbo Kong ¹, Zechao Bai ¹,*, Yang Li ¹, Yao Lu ², Weikai Tang ³, Yun Lin ¹, Wenjie Shen ¹ and Guanjun Cai ⁴

¹ Radar Monitoring Technology Laboratory, School of Artificial Intelligence and Computer Science, North China University of Technology, Beijing 100144, China
² BGRIMM Technology Group, Beijing 100160, China
³ Heilongjiang Duobaoshan Copper Industry Co., Ltd., Heihe 161416, China
⁴ Beijing Jingneng Geological Engineering Co., Ltd., Beijing 102300, China
* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(7), 984; https://doi.org/10.3390/rs18070984

Submission received: 9 February 2026 / Revised: 18 March 2026 / Accepted: 20 March 2026 / Published: 25 March 2026

(This article belongs to the Special Issue InSAR Innovations: Advances in Remote Sensing for Geohazard Monitoring)


Highlights

What are the main findings?

A coherence-conditioned CED-LSTM denoiser is developed for InSAR deformation time series, reducing noise while preserving deformation signals.
The method performs well on synthetic data and, on the open-pit mine, supports deformation-level classification mapping and highlights localized level IV zones.

What are the implications of the main findings?

It improves efficiency and objectivity for regional hazard assessment, reducing reliance on manual interpretation.
The score and level framework is easy to transfer and update across different periods and areas with varying deformation patterns.

Abstract

Systematically characterizing the time series deformation evolution of open-pit mine slopes is key to revealing their potential instability development and supporting subsequent deformation-level classification. By enabling measurement of ground deformation at a global scale approximately every ten days, Interferometric Synthetic Aperture Radar (InSAR) may hold the key to tracking this evolution. However, atmospheric propagation delays still have a significant impact on deformation calculations, and open-pit mine slopes monitored by InSAR often suffer from low coherence. This noise can obscure nonlinear and transient precursory signatures in deformation time series, reducing the identifiability of key temporal patterns required for automated interpretation. Here, we present a Coherence-conditioned Encoder–Decoder Long Short-Term Memory (CED-LSTM) denoising network for deformation time series. We generate a physics-aware synthetic dataset by modeling coherence-dependent measurement noise and temporally correlated atmospheric delays. The network jointly models deformation time series and coherence, using residual learning and an adaptive gated composite loss to preserve deformation trends. It is designed to autonomously extract ground deformation signals from noise in InSAR time series without prior knowledge of where deformation occurs or how it evolves. On the synthetic validation set, the network achieved a root mean square error (RMSE) of 2.2 mm across the validation sequences. Applied to three InSAR datasets over an open-pit mine from March 2019 to March 2022, denoising suppresses noise and stabilizes deformation boundaries, enabling extraction of trend and transient indicators and a data-driven deformation-level score. Using quantile-based thresholds, these scores are then used to produce multi-year deformation-level classification maps.

Keywords:

open-pit mine slope; long short-term memory (LSTM); encoder–decoder; time series deformation; InSAR

1. Introduction

Open-pit mining, as a major mining method, has assumed an important position in the global exploitation of mineral resources. Owing to its capacity for large-scale production and high operational efficiency, this industry plays a crucial part in socioeconomic development. In recent years, with the continuous intensification of mining activities and the combined influence of climatic and other environmental factors, open-pit mine slopes are prone to continuous deformation, affecting mining safety. Consequently, acquiring and analyzing deformation time series over mining areas have become essential for characterizing the evolving state of open-pit mines. During open-pit operations, it is necessary to continuously collect information on the mining area. Such information supports the formulation and revision of production plans [1] and, importantly, enables deformation-level assessment within the mine to mitigate casualties and economic losses [2]. Deformation monitoring and subsequent deformation-level classification are particularly critical, as potential risk zones must be identified from deformation amplitude, deformation rate, and abrupt change characteristics, thereby providing quantitative evidence to support zoned management and early-warning decision making.

Interferometric Synthetic Aperture Radar (InSAR), as an advanced remote-sensing technique, has become a powerful tool for large-area deformation monitoring owing to its millimeter-level measurement accuracy and wide spatial coverage [3,4,5]. With the continuous advancement of InSAR, it has been extensively applied to the monitoring of open-pit mines [6,7,8]. To enhance the monitoring capability for long-term deformations, researchers have developed multi-temporal InSAR (MT-InSAR) processing technology. Representative approaches include Persistent Scatterers InSAR (PS-InSAR) and Small Baseline Subset InSAR (SBAS-InSAR), which together constitute strategies tailored to different scattering characteristics [9,10,11,12]. In mining environments, MT-InSAR provides a critical data source for revealing spatiotemporal deformation induced by excavation activities. However, these high-density time series observations must be reliably interpreted and further translated into quantitative deformation indicators to support the identification of potentially unstable zones, risk analysis, and dynamic management decisions.

Researchers have explored the identification and risk analysis of unstable zones based on acquired InSAR time series deformation data. Poggi et al. integrated surface deformation intensity retrieved from Sentinel-1 InSAR with field-based building damage surveys, established empirical fragility/vulnerability curves, and conducted quantitative landslide risk assessment at the regional scale, thereby forming an interpretable evaluation chain from deformation intensity to damage grades and risk quantification [13]. Notably, under conditions where progressive slope deformation coexists with episodic acceleration, the stability and interpretability of the inferred results directly affect the effectiveness of early warnings. Nevertheless, conventional interpretation pipelines typically require manual field investigations and computational processing [14,15], resulting in low efficiency and strong subjectivity. Moreover, they are susceptible to phase noise, atmospheric phase screens (APS), and low coherence, which may overly smooth or misclassify key precursory signatures, such as abrupt changes and accelerating deformation [16], thereby limiting their applicability to hazard assessment [17]. Although existing workflows based on empirical thresholds and spatiotemporal filtering can partially mitigate these issues, achieving a robust balance between noise suppression and fine-detail preservation remains challenging in the presence of strong noise and concurrent nonlinear abrupt deformation [18].

In recent years, deep-learning-based deformation extraction techniques have been widely applied to the processing of InSAR time series data. These methods primarily include CNN frameworks for spatial feature learning, such as the U-Net network with encoder–decoder architecture [19] and the ResNet architecture with deep residual connections [20], as well as temporal models for long-term sequence representation, including RNNs and their long short-term memory (LSTM) variants [21], and self-attentive Transformers [22]. These approaches have been successfully applied to tasks such as APS correction, phase estimation, and even interferometric phase reconstruction combined with coherence estimation. Under complex conditions, they typically demonstrate superior performance compared to traditional processing methods [18,23,24,25,26]. Nevertheless, time series InSAR observations over mining areas are often characterized by pronounced coherence fluctuations intertwined with multi-source noise. If a model fails to explicitly characterize data reliability indicators such as coherence, it may still lead to missed detection of deformation [18]. High-quality supervisory signals and ground truth deformation are often difficult to obtain. Many studies therefore rely on simulated data or weakly supervised strategies to construct training samples. Even when supervised frameworks with controllable ground truth can be built using synthetic interferograms [26], the highly non-stationary and noise-contaminated background typical of open-pit mines may still induce distribution discrepancies between simulated and real measurements, thereby degrading model generalization across sites and sensors [27].

Many existing networks learn end-to-end mapping from the observed phase to deformation, with greater emphasis on improving the accuracy of intermediate steps, such as filtering or phase unwrapping [28]. They often lack mechanisms for physics-consistent constraints and uncertainty representation. As a result, the interpretability of model outputs and the reliability of the associated confidence assessment remain limited [29]. In addition, CNN-based approaches predominantly rely on local receptive fields to model spatial textures, making it difficult to explicitly represent long-range temporal dependencies and the temporal uncertainty introduced by irregular sampling and missing observations. In scenarios involving rapidly varying coherence or abrupt deformation, they are more prone to boundary over-smoothing and loss of fine details, which in turn compromises the stability and interpretability of subsequent deformation-level classification [30]. Under strongly non-stationary conditions such as open-pit mines, factors including rapid temporal coherence variation, unequal sampling intervals, and the coupling between abrupt deformation and noise often compound, amplifying the tendency of deep models to over-smooth and increasing the risk of misclassification or missed detection [31]. Meanwhile, recent studies on time series modeling based on LSTM and Transformer architectures have shown promise for trend prediction [32]. However, there is still a lack of a unified and robust solution that can reliably retain critical deformation features, such as abrupt changes, in noisy environments and effectively support subsequent work. Therefore, for open-pit mine deformation-level classification, it is imperative to incorporate an explicit characterization of observation reliability within deep time series modeling and to integrate relevant physical priors to enhance model robustness and transferability.

In this study, we develop a Coherence-conditioned Encoder–Decoder Long Short-Term Memory (CED-LSTM)-based deformation extraction approach for open-pit mine InSAR time series. The proposed method is centered on LSTM networks and adopts an encoder–decoder temporal modeling framework, which is specifically designed to cope with strong noise contamination and the tendency of abrupt change signatures to be smoothed out in open-pit InSAR time series. At the input stage, observation masks and coherence information are explicitly incorporated, enabling adaptive modeling under incomplete measurements and heterogeneous data quality and reducing the interference of low-quality observations in time series reconstruction. At the output stage, a residual learning strategy is employed to facilitate the separation of measurement noise and anomalous disturbances, thereby suppressing noise while preserving, as much as possible, the boundaries of abrupt changes and nonlinear evolution patterns in the deformation signal. The resulting time series is therefore more stable and physically consistent, enabling reliable derivation of deformation descriptors such as cumulative deformation, deformation velocity, acceleration, deformation spatial gradient, and the magnitude and rate of abrupt changes. These descriptors provide quantitative characterization of the spatiotemporal deformation process across the mine area and offer robust data support for subsequent analyses.

2. Study Area and Datasets

2.1. Study Area

The study site is an open-pit mining area in northeast China, as shown in Figure 1. The deposit is dominated by a super-large porphyry copper (molybdenum) system, accompanied by multiple metallic resources, such as gold and silver. Mining operations are currently dominated by open-pit extraction, supplemented by underground mining and mineral processing, with sustained and high-intensity activity. Superimposed tectonic deformation and magmatic activity have produced a typical setting characterized by multi-stage mineralization and a complex geological environment [33]. Long-term high-intensity excavation and waste dumping have formed prominent stepped pit slopes and large-scale waste dump units. Together with environmental forcing, such as seasonal freeze–thaw in the cold temperate climate, surface deformation in the area exhibits pronounced spatiotemporal non-stationarity and localized nonlinear abrupt variations. Owing to the combined effects of natural conditions and engineering disturbance, the open-pit mine provides a representative test area for evaluating the robustness and engineering applicability of satellite InSAR time series deformation interpretation and deformation-level classification methods.

2.2. Datasets

This study used 92 Sentinel-1 ascending orbit SAR scenes acquired between March 2019 and March 2022 covering the entire open-pit mine, with an incidence angle of 43.8 degrees [34]. To characterize the spatiotemporal evolution of mining area deformation across different annual stages and to facilitate period-wise comparative analysis, the three-year observation record was divided into three consecutive monitoring intervals, with 34, 30, and 28 valid acquisition epochs, respectively. Table 1 summarizes the main parameters of the Sentinel-1 imagery used in this study. To ensure geometric consistency between the subsequent time series deformation inversion and spatial mapping, an external DEM was incorporated during processing to remove the topographic phase contribution and to complete geocoding [35].

The original InSAR cumulative deformation was generated using the following workflow. (i) Deformation time series were retrieved from the original InSAR observations over the open-pit mine using the SBAS-InSAR approach and then organized into per-pixel displacement sequences. (ii) Missing observations were linearly interpolated only to provide numerical placeholders, while a validity mask preserved information on which epochs contain original measurements. (iii) The cumulative deformation over the three-year observation period was finally mapped to visualize its spatial distribution, as shown in Figure 2, providing the baseline deformation reference for subsequent validation and comparative analyses.

3. Methodology

This study develops a CED-LSTM denoising network for open-pit mine InSAR time series to cope with low coherence in time series observations and reduce the attenuation of abrupt deformation signatures. The proposed method consists of three key components. First, because independent ground truth is unavailable for real InSAR time series, synthetic InSAR time series with statistically calibrated deformation and noise characteristics are generated to provide controllable supervisory targets for model training and quantitative evaluation. Second, the trained model is transferred to real InSAR time series data from the mining area for inference, producing cumulative deformation sequences that are more stable and physically consistent. Finally, deformation-related indicators are derived to build deformation-level scores and corresponding deformation levels, thereby enabling a spatial representation of deformation levels across the mining area. The overall workflow is illustrated in Figure 3. In addition, a dedicated training and optimization strategy is developed to ensure both denoising capability and the ability to preserve and extract abrupt deformation features.

3.1. Synthetic InSAR Time Series Data Generation

Although InSAR has enabled the accumulation of extensive observational archives, accurately labeled ground truth deformation remains highly scarce, particularly in scenarios involving abrupt geohazard events. To train and evaluate the proposed deep learning model, we developed a synthetic dataset generator driven by physical mechanisms and the statistical characteristics of real measurements. The generator can reproduce a range of representative surface deformation patterns and dynamically inject mixed noise that conforms to InSAR physical properties according to the coherence coefficient.

3.1.1. Kinematic Deformation Model

We define the clean deformation signal $Y_{clean}(t)$ observed by InSAR as the superposition of a trend term, a periodic term, and an instantaneous abrupt change term. To represent the potentially complex deformation behaviors in the open-pit mine, six fundamental deformation modes are specified, with their mathematical formulations given as follows:

$$Y_{clean}(t) = \underbrace{v \cdot t + \frac{1}{2} a \cdot t^{2}}_{\text{Trend}} + \underbrace{A_{osc} \sin(2 \pi t + \phi)}_{\text{Seasonality}} + \underbrace{\sum_{k} S_{k} \cdot I(t \geq t_{k})}_{\text{Transient}} \tag{1}$$

where t denotes a nonuniform time vector defined by the actual satellite revisit times. For steady or uniform deformation (Class 0 to Class 2), the acceleration is set to a = 0 and v is sampled from the velocity distribution $P(v)$ derived from real data. For accelerated deformation (Class 3), a nonzero acceleration a is introduced to emulate the creep behavior prior to slope instability. For periodic deformation driven by thermal expansion or groundwater fluctuations (Class 5), the oscillation amplitude $A_{osc}$ is set from the spectral analysis of real time series. For the abrupt deformation events of interest in this study (Class 4), a step function $I(\cdot)$ is incorporated. The event time $t_{k}$ is randomly sampled within the sequence, and the step magnitude $S_{k}$ is sampled from the distribution of extrema of the second-order differences in the real data, with a fixed share of long-tail extremes to better represent rare events. We further assess sensitivity to deformation-mode proportions by varying each mode’s sampling rate in the synthetic training set, with emphasis on rare transient abrupt events (Class 4). The validation set remains mode balanced to isolate the effect of mode proportion from coherence and noise statistics.
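As an illustration of Equation (1), the following Python sketch evaluates the trend, seasonal, and transient terms on an unevenly spaced time vector. It is a minimal example, not the generator used in this study. The function name `clean_deformation`, the use of years as the time unit (so that the seasonal term has an annual period), the hypothetical 12-day revisit with two skipped acquisitions, and all parameter values are assumptions made for illustration.

```python
import math


def clean_deformation(t_years, v=0.0, a=0.0, a_osc=0.0, phi=0.0, steps=()):
    """Evaluate Eq. (1) at each acquisition time.

    t_years: acquisition times in years (assumed unit), possibly uneven.
    v, a: velocity and acceleration of the trend term.
    a_osc, phi: amplitude and phase of the seasonal term.
    steps: (t_k, S_k) pairs giving the time and size of each abrupt step.
    """
    series = []
    for t in t_years:
        trend = v * t + 0.5 * a * t * t
        seasonal = a_osc * math.sin(2.0 * math.pi * t + phi)
        transient = sum(s_k for t_k, s_k in steps if t >= t_k)
        series.append(trend + seasonal + transient)
    return series


# Hypothetical 12-day revisit with two skipped acquisitions (uneven sampling).
days = [12 * i for i in range(30) if i not in (7, 19)]
t = [d / 365.25 for d in days]

# Illustrative values only: a steady mode and an abrupt mode with one step.
steady = clean_deformation(t, v=-20.0)
abrupt = clean_deformation(t, v=-5.0, steps=[(t[15], -12.0)])
```

In a full generator, `v`, `a_osc`, and the step magnitudes would be drawn from the empirical distributions described above rather than fixed by hand.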

3.1.2. Physics-Aware Noise Simulation Strategy

Real InSAR time series contain not only surface deformation signals but also the combined effects of APS, spatiotemporally decorrelated noise, and phase unwrapping errors. To train the proposed CED-LSTM network with strong noise robustness, simply adding Gaussian white noise is insufficient to reproduce the complex characteristics of InSAR time series. Therefore, we develop a physics-aware mixed noise simulation strategy. The observed sequence $Y_{obs}(t)$ is expressed as

$$Y_{obs}(t) = Y_{clean}(t) + \delta_{meas}(t, \gamma) + \delta_{atm}(t) + \delta_{unwrap}(t, \gamma) \tag{2}$$

where $\gamma$ denotes the coherence coefficient at the corresponding pixel.

The measurement noise $\delta_{meas}$ mainly arises from decorrelation effects in the radar returns and is manifested as high-frequency random fluctuations. Due to the underlying InSAR physics, the noise magnitude is negatively correlated with coherence. Interferometric phase dispersion is controlled by the magnitude of the complex correlation coefficient, and phase distribution becomes markedly broader as the correlation decreases [36]. To reproduce this heteroscedastic behavior, we model the measurement noise as a non-stationary, high-frequency Gaussian process:

$$\delta_{meas}(t) \sim N\left(0, \left(\frac{\sigma_{base}}{\gamma_{t}}\right)^{2}\right) \tag{3}$$

where $\sigma_{base}$ is the baseline noise standard deviation. This mechanism leads to a pronounced signal-to-noise ratio reduction in low-coherence areas, encouraging the network to dynamically adjust the reliability assigned to observations according to the coherence-weighted contribution.
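The heteroscedastic behavior in Equation (3) can be sketched in a few lines of Python. The function name `measurement_noise`, the coherence floor `gamma_floor` (a safeguard against division by near-zero coherence that is not part of Equation (3)), and the numerical values are assumptions for illustration only.

```python
import random
import statistics


def measurement_noise(coherence, sigma_base, rng, gamma_floor=0.05):
    """Draw one zero-mean Gaussian sample per epoch with std sigma_base / gamma_t.

    gamma_floor is an assumed safeguard against division by (near-)zero
    coherence; it is not part of Eq. (3).
    """
    samples = []
    for gamma in coherence:
        sigma_t = sigma_base / max(gamma, gamma_floor)
        samples.append(rng.gauss(0.0, sigma_t))
    return samples


rng = random.Random(0)
low = measurement_noise([0.2] * 2000, sigma_base=1.0, rng=rng)
high = measurement_noise([0.9] * 2000, sigma_base=1.0, rng=rng)
spread_low = statistics.pstdev(low)    # expected near 1.0 / 0.2 = 5.0
spread_high = statistics.pstdev(high)  # expected near 1.0 / 0.9, about 1.11
```

With the same `sigma_base`, the low-coherence series is several times noisier than the high-coherence one, which is the contrast the network is meant to learn from.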

For atmospheric turbulence $\delta_{atm}$, APS typically appears as low-frequency red noise. As a phase manifestation of neutral atmosphere delay, APS can be synthesized using temporally stochastic processes. In particular, a random walk model has been used to describe the time evolution of neutral atmosphere delay and its gradients in a statistical formulation [37]. We generate temporally red noise using a one-dimensional random walk process:

$$\delta_{atm}(t) = \delta_{atm}(t - 1) + \epsilon_{t}, \quad \epsilon_{t} \sim N(0, \sigma_{rw}^{2}) \tag{4}$$

where $\sigma_{rw}$ controls the intensity of atmospheric drift. This pixel-wise APS term injects low-frequency, temporally correlated drift into each time series to stress test separability from slow nonlinear deformation. We remove the linear trend of the random walk so that APS appears as low-frequency fluctuations rather than a global bias, reducing confusion with the true long-term deformation trend. Such APS-like drift can mimic slow creep, making it essential for evaluating the model’s ability to distinguish deformation trends from APS. Note that the APS term in Equation (4) is generated pixel-wise as a one-dimensional random walk. This design mainly introduces a temporal low-frequency nuisance component at each pixel in the simulator but does not impose spatial covariance across neighboring pixels. This simplification matches our pixel-wise CED-LSTM training and inference, where each sample is an individual time series with coherence-conditioned noise injection. We therefore assess spatial continuity using real scene results, focusing on the boundary stability and connectivity of anomalous zones.
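A minimal Python sketch of the random walk in Equation (4), followed by linear detrending, is given below. The text does not specify how the linear trend is removed, so the least-squares line fitted against epoch index (which also removes the mean) is an assumption, as are the function name `detrended_random_walk` and the parameter values.

```python
import random


def detrended_random_walk(n_epochs, sigma_rw, rng):
    """Simulate Eq. (4), then subtract a least-squares line fitted over epoch index."""
    if n_epochs == 0:
        return []
    walk, level = [], 0.0
    for _ in range(n_epochs):
        level += rng.gauss(0.0, sigma_rw)  # epsilon_t ~ N(0, sigma_rw^2)
        walk.append(level)

    # Ordinary least-squares fit of walk against epoch index (assumed detrending).
    mean_i = (n_epochs - 1) / 2.0
    mean_w = sum(walk) / n_epochs
    sxx = sum((i - mean_i) ** 2 for i in range(n_epochs))
    sxy = sum((i - mean_i) * (w - mean_w) for i, w in enumerate(walk))
    slope = sxy / sxx if sxx > 0 else 0.0
    return [w - (mean_w + slope * (i - mean_i)) for i, w in enumerate(walk)]


aps = detrended_random_walk(34, sigma_rw=1.5, rng=random.Random(42))
```

After detrending, the series has zero mean and zero fitted slope, so what remains is low-frequency wander rather than a bias that could be mistaken for a long-term deformation rate.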

For phase unwrapping errors $\delta_{unwrap}$, unwrapping algorithms may produce jump errors in regions with low coherence or excessively large deformation gradients. Such errors are sparse but highly disruptive outliers. Phase unwrapping artifacts are manifested as integer cycle jumps, and unwrapping becomes particularly difficult in poorly correlated areas with very closely spaced fringes, where errors are expected [38]. We model them as a Poisson process, where the jump occurrence probability P is inversely related to coherence:

$$P(\delta_{unwrap}(t, \gamma) \neq 0) \propto \frac{1}{\gamma_{t}} \tag{5}$$

When a jump occurs, its magnitude is sampled from bounds calibrated using real jump-like outliers rather than unconstrained random draws. Specifically, we compute the maximum absolute first-order difference per pixel, $|\Delta y|$, and select the high-percentile tail as candidate unwrapping error samples. The tail ratio calibrates the jump rate, and robust quantiles of these candidates define the sampling bounds. This calibration anchors simulated jumps to realistic scales and enforces robustness to outliers that are more frequent in low-coherence regions.
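The coherence-dependent jump model of Equation (5) can be approximated per epoch with a Bernoulli draw, as in the Python sketch below. Several details are assumptions rather than choices stated in the text: the proportionality constant `base_rate`, the coherence floor, the uniform draw between the calibrated bounds, the random sign, and the treatment of each jump as an isolated outlier at one epoch. The bounds of 14 and 28 are placeholder values.

```python
import random


def unwrap_jumps(coherence, base_rate, bounds, rng, gamma_floor=0.05):
    """Return a sparse outlier series: zero where no jump occurs.

    base_rate: assumed proportionality constant, p_t = min(1, base_rate / gamma_t).
    bounds: (low, high) absolute jump magnitudes, e.g. robust quantiles of
            jump-like candidates extracted from real data.
    """
    low, high = bounds
    jumps = []
    for gamma in coherence:
        p_jump = min(1.0, base_rate / max(gamma, gamma_floor))
        if rng.random() < p_jump:
            sign = rng.choice((-1.0, 1.0))
            jumps.append(sign * rng.uniform(low, high))
        else:
            jumps.append(0.0)
    return jumps


rng = random.Random(7)
low_coh = unwrap_jumps([0.2] * 5000, base_rate=0.02, bounds=(14.0, 28.0), rng=rng)
high_coh = unwrap_jumps([0.9] * 5000, base_rate=0.02, bounds=(14.0, 28.0), rng=rng)
n_low = sum(1 for j in low_coh if j != 0.0)
n_high = sum(1 for j in high_coh if j != 0.0)
```

Counting the nonzero entries shows jumps occurring more often in the low-coherence series, while every jump stays within the calibrated magnitude bounds.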

3.2. CED-LSTM Denoising Network

3.2.1. Network Architecture

We develop a coherence-conditioned encoder–decoder LSTM denoising network, termed CED-LSTM, to recover more reliable deformation information from InSAR cumulative deformation time series contaminated by noise, anomalous drift, and jump artifacts. The encoder extracts global temporal contextual information, whereas the decoder reconstructs the deformation sequence at each epoch under the guidance of coherence conditions. In addition, residual learning and a Teacher Forcing strategy are incorporated to improve training stability and preserve abrupt deformation events. The network architecture is illustrated in Figure 4.

For each monitoring pixel, we construct a three-channel input sequence of length T. The observation sequence is denoted as $X = \{x_{1}, x_{2}, \dots, x_{T}\}$, with the corresponding coherence sequence $C = \{c_{1}, c_{2}, \dots, c_{T}\}$ and the validity mask sequence $M = \{m_{1}, m_{2}, \dots, m_{T}\}$, where T is the number of time steps. Here, $x_{t}$ denotes the observed cumulative deformation at epoch t in millimeters, $m_{t}$ is a binary indicator with 1 for valid observations and 0 for missing or unreliable samples, and $c_{t}$ is the interferometric coherence in the range of 0 to 1 that reflects phase stability and measurement reliability. The network is trained to perform end-to-end prediction of the denoised deformation sequence $\hat{Y}$. Specifically, the encoder is responsible for extracting temporal features from the noise-contaminated inputs. We design the input layer in a multi-channel form such that at each time step t, the input vector is composed of the observation, validity mask, and coherence coefficient, namely, $v_{t} = [x_{t}, m_{t}, c_{t}]$. Before training and inference, missing observations are linearly interpolated only to avoid NaNs while keeping $m_{t} = 0$; if $m_{t} = 1$ for all series, the input remains unchanged. All deformation values are standardized using the training set mean and standard deviation, so the model can rely on temporal context rather than the interpolated placeholder.
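The input construction described above (placeholder interpolation, validity mask, standardization, and channel stacking) can be sketched as follows. The function name `build_input`, the use of `None` to mark missing epochs, interpolation over epoch index rather than acquisition time, and the copying of the nearest valid value at the sequence ends are all assumptions made for this example.

```python
def build_input(obs, coherence, train_mean, train_std):
    """Build v_t = [x_t, m_t, c_t] for one pixel; obs uses None for missing epochs."""
    mask = [0 if x is None else 1 for x in obs]
    valid = [i for i, m in enumerate(mask) if m == 1]
    if not valid:
        raise ValueError("at least one valid observation is required")

    filled = []
    for i, x in enumerate(obs):
        if x is not None:
            filled.append(x)
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:        # gap at the start: copy the next valid value
            filled.append(obs[right])
        elif right is None:     # gap at the end: copy the last valid value
            filled.append(obs[left])
        else:                   # interior gap: linear interpolation over epoch index
            w = (i - left) / (right - left)
            filled.append(obs[left] + w * (obs[right] - obs[left]))

    # Standardize with training-set statistics and stack the three channels.
    return [[(x - train_mean) / train_std, m, c]
            for x, m, c in zip(filled, mask, coherence)]


features = build_input([0.0, None, 4.0, 5.0], [0.9, 0.3, 0.8, 0.7],
                       train_mean=0.0, train_std=2.0)
```

The interpolated epoch carries a plausible value but keeps its mask at 0, so a downstream model can tell a placeholder from a measurement.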

Using a single-layer LSTM to process the full sequence, the encoder compresses the temporal context into the hidden state $h_{enc}$ and the cell state $c_{enc}$, thereby capturing the global deformation pattern. The decoder is then initialized using the final encoder states. To mitigate information loss during decoding and to enhance sensitivity to the current observation, an observation-guided mechanism is introduced at each decoding step. The decoder input at time t, denoted as $d_{t}$, includes not only the previous predicted output $\hat{y}_{t-1}$ but also concatenates the current observation $x_{t}$ and coherence coefficient $c_{t}$:

$$d_{t} = \mathrm{Concat}(x_{t}, c_{t}, \hat{y}_{t-1}) \tag{6}$$

This design allows the decoder to access the current raw observation at every step, thereby dynamically adjusting the relative reliance on the observation $x_{t}$ and the historical prediction $\hat{y}_{t-1}$ according to the coherence level $c_{t}$. In addition, we adopt a residual learning strategy. Rather than directly predicting the final deformation value, the network outputs an additive correction term $r_{t}$ through a fully connected layer. The final denoised output $\hat{y}_{t}$ is given by

$$\hat{y}_{t} = x_{t} + r_{t} = x_{t} + F_{out}(h_{dec,t}) \tag{7}$$

where $F_{out}$ denotes the fully connected layer and $h_{dec,t}$ is the decoder hidden state at time step t. This strategy enables the network to more readily learn identity mapping in low-noise, high-coherence regions (i.e., $r_{t} \approx 0$), thereby concentrating its learning capacity on separating high-frequency noise from abrupt deformation signals. It should be noted that high coherence does not necessarily imply perfect identity mapping when systematic low-frequency biases remain in the observations. In such cases, the residual branch may learn a small smooth correction rather than a strictly zero residual, which is compatible with the residual learning formulation.
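The control flow of Equations (6) and (7) can be illustrated without a trained network. In the Python sketch below, `step_fn` is a hypothetical stand-in for the decoder LSTM cell followed by $F_{out}$, `toy_step` is a hand-written rule rather than a learned model, and the initial previous prediction `y_init` is an assumed choice that the text does not specify.

```python
def decode(obs, coherence, step_fn, state, y_init=0.0):
    """Run the observation-guided residual decoder over one sequence.

    step_fn(d_t, state) -> (r_t, new_state) stands in for the decoder LSTM
    cell followed by the fully connected layer F_out.
    y_init is an assumed value for the prediction preceding the first epoch.
    """
    outputs, y_prev = [], y_init
    for x_t, c_t in zip(obs, coherence):
        d_t = (x_t, c_t, y_prev)           # Eq. (6): decoder input
        r_t, state = step_fn(d_t, state)   # additive correction term
        y_t = x_t + r_t                    # Eq. (7): residual output
        outputs.append(y_t)
        y_prev = y_t
    return outputs


def toy_step(d_t, state):
    """Hand-written stand-in, not a trained cell: pull low-coherence epochs toward y_prev."""
    x_t, c_t, y_prev = d_t
    return (1.0 - c_t) * (y_prev - x_t), state


identity = decode([1.0, 2.0, 3.0], [0.9, 0.9, 0.9], lambda d, s: (0.0, s), None)
smoothed = decode([1.0, 9.0, 3.0], [1.0, 0.0, 1.0], toy_step, None, y_init=1.0)
```

A zero residual reproduces the observations exactly, which is the identity mapping the residual formulation makes easy to learn. The toy rule shows how a coherence-dependent correction can replace an unreliable epoch with the previous prediction.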

The encoder and decoder each use a single LSTM layer with size 96. The input has three channels, and the decoder input dimension is also three. A fully connected layer projects the decoder’s hidden state to one output channel. This model has 77,665 trainable parameters. Although deeper or bidirectional LSTM encoders are possible, we use a single-layer unidirectional encoder to balance capacity and robustness. First, the Sentinel-1 time series used in this study has 28–34 epochs per interval, for which a single layer is sufficient to capture long-term nonlinear trends at the pixel level while reducing overfitting risk. Second, deeper or bidirectional recurrence would increase parameters and computation and may over-smooth short-lived abrupt changes that are critical for instability interpretation. Third, the observation-guided decoder and residual learning already help preserve nonlinear trends and mutation-like changes via conditioning on the current observation and coherence. The framework can be extended to deeper (>1 layer) or bidirectional encoders when longer time spans or more complex dynamics are required.
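The reported parameter count can be checked by hand. The short Python sketch below counts parameters for two single-layer LSTMs with three inputs and 96 hidden units, plus one linear output layer. It assumes that each LSTM gate carries two bias vectors (input and recurrent), which is how PyTorch's `torch.nn.LSTM` is parametrized. The framework is not stated in this section, so that choice is an assumption. Under it, the total comes to 77,665, matching the figure above.

```python
def lstm_param_count(input_size, hidden_size, n_bias_vectors=2):
    """Parameters of one LSTM layer: four gates, each with input weights,
    recurrent weights, and n_bias_vectors bias vectors."""
    per_gate = (hidden_size * input_size
                + hidden_size * hidden_size
                + n_bias_vectors * hidden_size)
    return 4 * per_gate


def linear_param_count(in_features, out_features):
    """Weights plus bias of a fully connected layer."""
    return in_features * out_features + out_features


encoder = lstm_param_count(input_size=3, hidden_size=96)   # 38,784
decoder = lstm_param_count(input_size=3, hidden_size=96)   # 38,784
head = linear_param_count(in_features=96, out_features=1)  # 97
total = encoder + decoder + head                           # 77,665
```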

3.2.2. Physics-Aware Adaptive Loss Function

For InSAR time series, irregular sampling and abrupt deformation events are common, and a conventional mean squared error (MSE) loss can easily lead to overly smoothed results. Therefore, we design a composite loss function $L_{total}$ that incorporates an adaptive gating mechanism and consists of three components:

$$L_{total} = L_{main} + \lambda_{vel} L_{vel} + \lambda_{smooth} L_{smooth} \tag{8}$$

where $\lambda_{vel}$ and $\lambda_{smooth}$ are the weighting coefficients for the velocity consistency term and the smoothness term, respectively. In our implementation, the default setting is $\lambda_{vel} = 0.1$ and $\lambda_{smooth} = 1 \times 10^{-4}$. The smoothing term only serves as weak auxiliary regularization to suppress high-frequency jitters in stable periods. If the weight is set to a relatively large value, it can still easily cause excessive smoothing for medium-amplitude nonlinear changes or step deformations, even when the penalty has been gated off in the vicinity of abrupt changes.

Neural networks often over-smooth high-frequency abrupt deformation signals, which can attenuate step-like behaviors in InSAR time series. To mitigate this issue, we introduce a gradient-based adaptive gate and incorporate it into the primary regression loss. We first compute the ground truth first differences $|\Delta y_{t}| = |y_{t} - y_{t-1}|$ and keep only valid adjacent pairs with $m_{t-1} m_{t} = 1$. For each mini batch, we set the abrupt change threshold $\tau = \mathrm{Quantile}_{q}(|\Delta y_{t}|)$ over the valid pairs, where q = 0.55 in all experiments. If no valid pair exists, we use the mini batch mean of $|\Delta y_{t}|$ as a safe fallback. The abrupt change gating factor $\alpha_{t}$ is defined as

$$\alpha_{t} = \sigma(k \cdot (\tau - |\Delta y_{t}|)) \tag{9}$$

where $\sigma$ denotes the Sigmoid function and k controls the sharpness. In this study, k is set to 50 to obtain a sufficiently sharp yet numerically stable gate, and it is kept fixed across all experiments. Importantly, this ground truth difference is used only to construct training time loss reweighting and is not required during inference. In deployment, CED-LSTM decodes using only the current observation and coherence as inputs. The gate influences inference indirectly by encouraging the model to preserve true abrupt changes while suppressing noise fitting.
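A minimal Python sketch of the gate in Equation (9) follows, indexing $\alpha_{t}$ by the interval (t − 1, t), the convention used for the valid-pair mask. It operates on a single series rather than a mini batch, and the quantile interpolation scheme is not specified in the text; both are simplifying assumptions, as are the function names. The sigmoid is written in a numerically stable form because k = 50 produces large arguments.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def quantile(values, q):
    """q-quantile with linear interpolation between order statistics (assumed scheme)."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def abrupt_gate(y_true, mask, q=0.55, k=50.0):
    """Return {t: alpha_t} for every interval (t-1, t) whose two epochs are valid."""
    diffs = {t: abs(y_true[t] - y_true[t - 1])
             for t in range(1, len(y_true))
             if mask[t - 1] * mask[t] == 1}
    if not diffs:
        return {}  # no valid pair: nothing to gate in this single-series sketch
    tau = quantile(list(diffs.values()), q)
    return {t: sigmoid(k * (tau - d)) for t, d in diffs.items()}


y = [0.0, 1.0, 3.0, 53.0, 54.5]          # one large step between epochs 2 and 3
gate = abrupt_gate(y, [1, 1, 1, 1, 1])
partial = abrupt_gate(y, [1, 1, 0, 1, 1])  # epoch 2 missing: two intervals dropped
```

Because the threshold is a quantile of the differences themselves, the gate is relative: with q = 0.55, somewhat under half of the valid intervals in a batch fall on the abrupt side, whatever their absolute size.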

The gating threshold is determined by the q-quantile of

| Δ y |

over valid epochs within each mini batch, where q = 0.55, and the mutation-aware amplification coefficient is set to

β = 16.0

A smaller q would classify more intervals as abrupt changes and thus relax the velocity and smoothing constraints too broadly, making the model more sensitive to noise, whereas a larger q would emphasize only the strongest jumps and tend to over-smooth moderate but meaningful step-like variations. Therefore, q = 0.55 was adopted as a balanced default setting for preserving abrupt deformation while maintaining denoising stability. When an abrupt change is detected (

| Δ y_{t} | ≫ τ

),

α_{t}

approaches 0; otherwise, it approaches 1. The primary regression loss is formulated using the SmoothL1 function. To enhance the capture of abrupt change boundaries, we apply dynamic weighting to the loss according to

(1 - α_{t})

:

L_{main} = \frac{1}{T} \sum_{t = 1}^{T} (1 + β (1 - α_{t})) \cdot SmoothL1 ({\hat{y}}_{t}, y_{t})

(10)

where

β

is an amplification coefficient. This design encourages the network to place greater emphasis on epochs associated with abrupt changes, thereby effectively mitigating the attenuation of step-like deformation signals. In implementation,

α_{t}

is computed on the interval (t − 1, t) and used in three places. First, the mutation-aware weighting of the primary SmoothL1 term up-weights the two adjacent epochs

w_{t} = 1 + β (1 - α_{t})

and

w_{t - 1} = 1 + β (1 - α_{t})

. Second, the velocity consistency loss is scaled by

α_{t}

so the continuity constraint is active mainly in stable intervals and relaxed near abrupt changes. Third, for the second-order smoothing term at epoch t, we use

\min (α_{t}, α_{t + 1})

to deactivate smoothing whenever either adjacent interval indicates an abrupt change, avoiding over-penalization of genuine step-like behavior.
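As a rough sketch of the mutation-aware weighting in Equation (10), the snippet below up-weights both epochs adjacent to each gated interval. The names `smooth_l1` and `main_loss` are hypothetical, the SmoothL1 transition point `delta = 1.0` is an assumed default, and keeping the larger weight when an epoch borders two intervals is an assumption, since the text does not say how overlapping weights are combined.

```python
def smooth_l1(pred, target, delta=1.0):
    """SmoothL1 (Huber-style) penalty: quadratic below delta, linear above."""
    d = abs(pred - target)
    return 0.5 * d * d / delta if d < delta else d - 0.5 * delta

def main_loss(y_hat, y, alpha, beta=16.0):
    """Weighted SmoothL1 loss; alpha[i] gates the interval between epochs i and i+1."""
    n = len(y)
    w = [1.0] * n
    for i, a in enumerate(alpha):
        boost = 1.0 + beta * (1.0 - a)
        # Up-weight both epochs adjacent to the interval (larger weight wins).
        w[i] = max(w[i], boost)
        w[i + 1] = max(w[i + 1], boost)
    return sum(wi * smooth_l1(p, t) for wi, p, t in zip(w, y_hat, y)) / n
```

For a step between the second and third epochs (gate near 0), an error at either of those epochs is weighted by 1 + β = 17 instead of 1.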

Given the nonuniformity of satellite revisit intervals, we adopt a coherence-weighted velocity loss for irregular sampling that constrains physical velocity rather than raw displacement differences, since raw differences over unequal intervals would provide inaccurate supervision. We compute the physical velocity using the real time interval vector

Δ t

and impose a velocity consistency constraint. In addition, to address heteroscedastic noise, we introduce a coherence-based weight

w (c_{t}) = 1 + (1 - c_{t})

. In low-coherence regions where

c_{t}

is small,

w (c_{t})

increases, encouraging the model to adhere more strictly to the physical kinematic behavior and reducing overfitting to low-quality observations. It should be noted that at epochs with abrupt changes, the velocity term may exhibit instantaneous discontinuities, which can conflict with the continuity constraint. To avoid such conflicts between physical constraints and abrupt change reconstruction, we use the gating factor

α_{t}

to relax the constraint around abrupt changes:

L_{vel} = \sum_{t = 1}^{T - 1} α_{t} \cdot w (c_{t}) \cdot SmoothL1 (\frac{{\hat{y}}_{t + 1} - {\hat{y}}_{t}}{Δ t_{t}}, \frac{y_{t + 1} - y_{t}}{Δ t_{t}})

(11)

This term enforces velocity continuity during stable deformation periods (

α_{t} \approx 1

) while allowing pronounced velocity jumps during abrupt change periods (

α_{t} \approx 0

).
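Equation (11) can be illustrated with a short plain-Python sketch. The function names are hypothetical, indices are zero-based, and the SmoothL1 transition point of 1.0 is an assumed default; masking of invalid epochs is omitted for brevity.

```python
def smooth_l1(pred, target, delta=1.0):
    """SmoothL1 penalty: quadratic below delta, linear above."""
    d = abs(pred - target)
    return 0.5 * d * d / delta if d < delta else d - 0.5 * delta

def velocity_loss(y_hat, y, dt, coh, alpha):
    """Gated, coherence-weighted velocity consistency loss.

    dt[t] is the time between epochs t and t+1, coh[t] the coherence at
    epoch t, and alpha[t] the abrupt change gate of that interval.
    """
    total = 0.0
    for t in range(len(y) - 1):
        v_hat = (y_hat[t + 1] - y_hat[t]) / dt[t]   # predicted velocity
        v_true = (y[t + 1] - y[t]) / dt[t]          # reference velocity
        weight = 1.0 + (1.0 - coh[t])               # larger at low coherence
        total += alpha[t] * weight * smooth_l1(v_hat, v_true)
    return total
```

Dividing by the actual interval means that a 12-day and a 24-day pair with the same displacement difference are supervised at different velocities, which is the point of the irregular sampling formulation.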

The adaptive second-order smoothing loss introduces a second-order-difference-based smoothness constraint to further suppress high-frequency atmospheric turbulence noise. Meanwhile, to smooth noise while avoiding excessive smoothing of the deformation signal, this constraint is also adaptively modulated using the same gating mechanism:

L_{smooth} = \sum_{t = 1}^{T - 2} \min (α_{t}, α_{t + 1}) \cdot w (c_{t}) \cdot | {\hat{y}}_{t + 2} - 2 {\hat{y}}_{t + 1} + {\hat{y}}_{t} |

(12)

When an abrupt change is detected in either of the two adjacent intervals, the smoothness penalty is automatically deactivated, ensuring that the denoising process does not distort genuine nonlinear deformation behavior.
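A minimal sketch of the gated second-order penalty in Equation (12) follows. The function name `smoothness_loss` is hypothetical, indices are zero-based, and masking of invalid epochs is left out.

```python
def smoothness_loss(y_hat, coh, alpha):
    """Gated second-order smoothness penalty on the prediction.

    alpha[t] gates the interval between epochs t and t+1; the penalty on the
    triple (t, t+1, t+2) uses the smaller of the two adjacent gates.
    """
    total = 0.0
    for t in range(len(y_hat) - 2):
        second_diff = y_hat[t + 2] - 2.0 * y_hat[t + 1] + y_hat[t]
        gate = min(alpha[t], alpha[t + 1])
        weight = 1.0 + (1.0 - coh[t])
        total += gate * weight * abs(second_diff)
    return total
```

Taking the minimum of the two gates means that a single abrupt interval is enough to switch the penalty off for every triple that contains it.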

Each term in the composite loss corresponds to a specific training objective. The masked reconstruction term is the primary denoising driver, fitting the network output to valid observations while ignoring missing or unreliable samples. All loss terms and change-point gating statistics are computed only on valid observations, and missing observations neither contribute to optimization nor affect gate threshold estimation, avoiding interpolation-induced bias. The mutation-aware weighting up-weights abrupt change epochs to preserve step-like or accelerated deformation and reduce over-smoothing. The velocity consistency term constrains the inter-epoch deformation rate to follow physically plausible evolution, improving autoregressive stability and reducing error accumulation. The smoothness term suppresses residual high-frequency jitter in non-mutation intervals. Coherence conditioning further down-weights low-reliability measurements so that noisy segments do not dominate training. During inference, the loss terms are absent, and stability is governed by the learned encoder–decoder dynamics with coherence-conditioned decoding.

3.2.3. Training Strategy

All experiments were conducted on a 64-bit Windows 11 workstation equipped with an Intel Xeon Gold 5218 CPU (2.40 GHz) and an NVIDIA GeForce RTX 3090 GPU with 24 GB memory, providing sufficient computational support for deep learning model training. The proposed network was implemented in PyTorch 2.0.0, with CUDA 11.8 and cuDNN 8.7.0 enabled for GPU acceleration.

Because abrupt or accelerated deformation samples are relatively rare in natural settings, we use a weighted random sampler based on class labels, with each class weight set inversely to its frequency, so that rare extreme labels are oversampled and each mini batch contains sufficient extreme cases. This class-balanced sampling reduces dependence on the global mode proportions and mitigates performance degradation when rare transient events are underrepresented. For reproducibility, we fix the random seed to 42 for all stochastic components, including NumPy sampling and PyTorch random number generators. The dataset is randomly split into 85% for training and 15% for validation. During training, we adopt a fully autoregressive setting consistent with the inference stage. At each decoding step, the feedback input is the model prediction from the previous time step

{\hat{y}}_{t - 1}

, rather than the ground truth label. Although this strategy increases the difficulty of early convergence, it encourages the model to learn to correct accumulated errors during training, thereby improving robustness and stability for long sequence inference. The network is trained for 100 epochs, and the validation loss is evaluated after each epoch. We do not apply early stopping; instead, the checkpoint with the lowest validation loss is selected for all subsequent evaluations, including inference on real data. AdamW is used for optimization with an initial learning rate of 0.002 and a batch size of 256. Weight decay is applied to mitigate overfitting, with the coefficient set to

1 \times 10^{- 5}

.
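The class-balanced sampling idea can be sketched with the Python standard library. This is an illustration only: the paper's pipeline is PyTorch-based, while the helper names `inverse_frequency_weights` and `draw_batch` and the label strings below are hypothetical.

```python
import random
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-sample weight equal to 1 / (number of samples in the same class)."""
    counts = Counter(labels)
    return [1.0 / counts[lab] for lab in labels]

def draw_batch(labels, batch_size, rng):
    """Draw sample indices with replacement, proportionally to the weights."""
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=batch_size)

rng = random.Random(42)                       # fixed seed, as in the text
labels = ["stable"] * 8 + ["abrupt"] * 2      # toy, imbalanced label set
batch = draw_batch(labels, batch_size=6, rng=rng)
```

Because each class receives the same total weight, the rare class is drawn about as often as the common one regardless of its share of the dataset.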

3.3. Deformation-Level Evaluation Based on Denoised InSAR Time Series Data

After training, the CED-LSTM network is transferred to real InSAR observations for inference. Because no independent deformation measurements are available for the real mine time series in this study, the real-scene experiments are intended to assess transferability, spatiotemporal interpretability, and scene-level consistency rather than absolute reconstruction accuracy. To account for potential statistical discrepancies between the simulated and real domains and to ensure stable inference, we standardize the real data using the global mean

μ_{sim}

and standard deviation

σ_{sim}

computed from the simulated training set. The real network input keeps the same composition as in training, i.e., the observation, validity mask, and coherence coefficient. This design prevents the network’s sensitivity to abrupt anomalous signals from being degraded by distribution shifts in real measurements. For missing observations (NaN) that may occur in the real time series, linear interpolation is used for gap filling; meanwhile, the corresponding mask entries are set to 0 (invalid), encouraging the network to rely on temporal context rather than the interpolated values for prediction. Finally, the denoised cumulative deformation sequences are used for subsequent feature extraction and deformation-level classification.
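The preprocessing of a real series described above (gap filling, masking, and standardization with simulated statistics) might look like the following sketch. The function name and argument names are hypothetical, interpolation is done over acquisition times, and holding the nearest valid value at the series ends is an assumption that the text does not address.

```python
import math

def fill_and_standardize(obs, times, mu_sim, sigma_sim):
    """Return (standardized series, validity mask) for one real time series."""
    mask = [0 if math.isnan(v) else 1 for v in obs]
    valid = [i for i, m in enumerate(mask) if m]
    if not valid:
        raise ValueError("series has no valid observation")
    filled = list(obs)
    for i, m in enumerate(mask):
        if m:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:            # leading gap: hold the first valid value
            filled[i] = obs[right]
        elif right is None:         # trailing gap: hold the last valid value
            filled[i] = obs[left]
        else:                       # interior gap: linear interpolation in time
            frac = (times[i] - times[left]) / (times[right] - times[left])
            filled[i] = obs[left] + frac * (obs[right] - obs[left])
    standardized = [(v - mu_sim) / sigma_sim for v in filled]
    return standardized, mask
```

The mask keeps interpolated epochs marked as invalid, so the filled values only serve to give the network a numerically well-defined input.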

3.3.1. Spatiotemporal Deformation Feature Extraction

We develop a multidimensional feature extraction scheme based on the denoised time series to comprehensively quantify potential geohazard risk by incorporating kinematic trends, instantaneous anomalies, and spatial gradients. For the denoised sequence

{\hat{Y}}_{clean} (t)

, a second-order polynomial is fitted to extract long-term kinematic parameters. This quadratic fit yields a compact global descriptor and is robust for short sequences. In this study, the global trend descriptors are complemented by the transient metrics described below to retain local turning behaviors. We emphasize that the quadratic fit is used as a compact descriptor rather than a physical model of stage-wise mining operations. Step-like behaviors are captured by the transient metrics and are explicitly considered in the final scoring, so the hazard inference is not driven by interpreting the quadratic acceleration as an exact operational parameter. For each pixel, the long-term deformation trend is modeled as

{\hat{y}}_{i} (t) \approx v_{i} \cdot t + \frac{1}{2} a_{i} \cdot t^{2} + c

(13)

where

v_{i}

denotes the mean deformation rate, indicating persistent uplift or subsidence, and

a_{i}

represents the acceleration used to identify whether deformation is entering an accelerated instability stage. This fitting procedure effectively extracts long-term kinematic parameters and provides a basis for distinguishing linear and nonlinear trends.
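The quadratic trend fit of Equation (13) amounts to an ordinary least-squares problem with three unknowns. The sketch below solves the normal equations directly in plain Python; the function name `fit_quadratic_trend` is hypothetical, and the use of unweighted least squares is an assumption, since the text does not state the fitting criterion.

```python
def fit_quadratic_trend(t, y):
    """Least-squares fit of y ~ v*t + 0.5*a*t**2 + c; returns (v, a, c)."""
    cols = [list(t), [0.5 * ti * ti for ti in t], [1.0] * len(t)]
    # Normal equations A x = b for the parameter vector x = [v, a, c].
    A = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(3)]
         for i in range(3)]
    b = [sum(p * yi for p, yi in zip(cols[i], y)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        b[i], b[pivot] = b[pivot], b[i]
        for r in range(i + 1, 3):
            factor = A[r][i] / A[i][i]
            for col in range(i, 3):
                A[r][col] -= factor * A[i][col]
            b[r] -= factor * b[i]
    # Back substitution.
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (b[i] - sum(A[i][col] * x[col] for col in range(i + 1, 3))) / A[i][i]
    return x[0], x[1], x[2]
```

If t is expressed in years and the deformation in millimetres, v comes out in mm/yr and a in mm/yr². At least three distinct epochs are needed for the system to be solvable.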

For common local collapses or sudden slumping in open-pit mines, conventional linear trend analysis can overlook short-term abrupt variations. Therefore, we introduce a sliding window algorithm to capture instantaneous anomalies. The window length is denoted as W and set to 3 in this study, which is the shortest window capable of robustly capturing step-like changes across adjacent acquisitions while reducing sensitivity to single-epoch noise. At each time step t, the displacement increment between the two window endpoints is computed while sliding along the time axis. To jointly characterize subsidence and uplift, we record the maximum absolute abrupt change magnitude

| Δ y |_{\max}

during window traversal and the corresponding transient rate

R_{trans}

:

| Δ y |_{\max} = \max_{t} | {\hat{Y}}_{clean} (t + W) - {\hat{Y}}_{clean} (t) |

(14)

This indicator enables effective identification of step-like deformation signals that deviate from long-term trends.
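Equation (14) translates into a single pass over the series. The sketch below follows the equation literally, comparing epochs that are W steps apart; the function name is hypothetical, and the transient rate is left out because its formula is not given in this section.

```python
def max_transient_change(y, window=3):
    """Return (largest |y[t + window] - y[t]|, index t where it first occurs)."""
    best, best_t = 0.0, None
    for t in range(len(y) - window):
        change = abs(y[t + window] - y[t])
        if best_t is None or change > best:
            best, best_t = change, t
    return best, best_t
```

Because the absolute value is taken, a 10 mm drop and a 10 mm uplift yield the same magnitude. A series shorter than W + 1 epochs returns `(0.0, None)`.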

The deformation magnitude at an individual pixel is often insufficient to characterize destructiveness, whereas differential subsidence across the surface is a primary driver of slope failures in mining areas. Therefore, we construct a spatial gradient field based on the k-nearest neighbors (kNN). For an arbitrary target pixel p, we select its k closest spatial neighbors and set k = 8, which approximates a local 8-neighborhood while keeping the estimate robust to sparse outliers. To improve stability near scene boundaries or spatial data gaps, the kNN search is restricted to pixels with valid values, and k is capped by the number of available valid pixels. In addition, to prevent spuriously large gradients caused by enlarged neighbor distances at boundaries or sparse sampling, the resulting gradient field is clipped at a high quantile (P99) as a robust safeguard. To suppress salt-and-pepper artifacts induced by noise, we first apply a median filter to the deformation values within the local neighborhood to obtain a robust estimate. We then compute the unit distance displacement differences between the target pixel and its neighbors and take the median as the local spatial gradient

G_{spatial}

:

G_{spatial} (p) = {median}_{q \in N_{k} (p)} (\frac{| {\hat{y}}_{cum} (p) - {\hat{y}}_{cum} (q) |}{d (p, q)})

(15)

where

d (p, q)

is the Euclidean distance between two pixels and

{\hat{y}}_{cum}

denotes the cumulative deformation at the final epoch. This median-based statistic is highly robust and can effectively suppress artificially large gradients caused by occasional phase unwrapping errors.

3.3.2. Adaptive Deformation-Level Classification Framework

Based on the above multidimensional features, we propose an adaptive integrated deformation-level classification model. The model determines risk thresholds in a data-driven manner, thereby reducing the subjectivity introduced by manually specified fixed thresholds. Considering that geophysical variables often follow long-tailed distributions, direct linear normalization is sensitive to extreme outliers. Accordingly, we adopt a robust quantile-based normalization strategy. For an arbitrary feature vector

f

, linear scaling is performed using the 5th and 95th percentiles with clipping:

\tilde{f_{i}} = Clip (\frac{f_{i} - Q_{05} (f)}{Q_{95} (f) - Q_{05} (f)}, 0, 1)

(16)

This procedure maps all feature values to the unit interval [0, 1] while preserving the relative contrast of the main portion of the distribution. The composite anomaly score

H_{score}

is formed by a weighted combination of a transient component and a trend-related component:

H_{trans} = 0.6 \cdot {\tilde{F}}_{| Δ |_{\max}} + 0.4 \cdot {\tilde{F}}_{R_{trans}}

(17)

H_{trend} = 0.4 \cdot {\tilde{F}}_{| v |} + 0.2 \cdot {\tilde{F}}_{| a |} + 0.2 \cdot {\tilde{F}}_{| Cum |} + 0.2 \cdot {\tilde{F}}_{G_{spatial}}

(18)

where

\tilde{F}

denotes the normalized features. The transient component emphasizes abrupt events and is characterized by the maximum abrupt change magnitude

| Δ |_{\max}

and the transient rate

R_{trans}

. The trend-related component focuses on long-term cumulative damage and integrates the deformation rate

|v|

, acceleration

|a|

, cumulative deformation

|Cum|

, and the spatial gradient

G_{spatial}

. The final deformation-level score is computed as a weighted sum:

H_{score} = 0.7 \cdot H_{trans} + 0.3 \cdot H_{trend}

(19)

where the weighting factor

α

on the transient component is set to 0.7 to markedly enhance the model’s sensitivity to short-term abrupt deformation signals, and the trend-related component assigns weights of 0.4, 0.2, 0.2, and 0.2 to the deformation rate, acceleration, cumulative deformation, and spatial gradient, respectively, consistent with Equation (18). We tested

α

values in the range of 0.5–0.7 and found that the current configuration provides the best overall balance between transient sensitivity, trend stability, and score interpretability. Therefore, it is retained as the default setting in this study.
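Equations (16) to (19) can be combined into a short scoring sketch. The helper names are hypothetical, the quantile estimator uses linear interpolation (an assumption), a constant feature is mapped to zero (also an assumption), and the trend weights follow Equation (18).

```python
import math

def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def robust_normalize(f, lo_q=0.05, hi_q=0.95):
    """Scale a feature list to [0, 1] between its 5th and 95th percentiles."""
    q_lo, q_hi = quantile(f, lo_q), quantile(f, hi_q)
    span = q_hi - q_lo
    if span <= 0:
        return [0.0 for _ in f]  # assumption: constant feature carries no contrast
    return [min(1.0, max(0.0, (x - q_lo) / span)) for x in f]

def deformation_score(d_max, r_trans, v, a, cum, grad):
    """Composite score from already normalized per-pixel features."""
    h_trans = 0.6 * d_max + 0.4 * r_trans
    h_trend = 0.4 * v + 0.2 * a + 0.2 * cum + 0.2 * grad
    return 0.7 * h_trans + 0.3 * h_trend
```

A pixel with saturated transient features and no trend signal scores 0.7, which illustrates how strongly the default weighting favors abrupt behavior.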

Unlike conventional approaches that directly weight the deformation-level score based on coherence, which can overly suppress the deformation response in low-coherence areas and lead to missed identification of strongly anomalous pixels, we treat coherence as an independent confidence indicator. It is used only to apply a mild attenuation correction to the final deformation-level score, ensuring that pronounced deformation signals can still be detected even under low-coherence conditions. To accommodate spatial heterogeneity in deformation intensity across different subareas, we avoid fixed thresholds and instead determine deformation levels from the statistical distribution of deformation-level scores over the entire scene. Specifically, the quantile thresholds

Q_{70}

,

Q_{93}

, and

Q_{99}

are used as the boundaries for Level I (stable), Level II (slightly anomalous), Level III (moderately anomalous), and Level IV (strongly anomalous), so that the top one percent of pixels are highlighted as Level IV.
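The quantile-based level assignment can be sketched as follows. The function names are hypothetical, the quantiles use linear interpolation, and treating scores exactly equal to a threshold as belonging to the lower level is an assumption. The mild coherence attenuation is omitted because its functional form is not given in this section.

```python
import math

def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def assign_levels(scores):
    """Map scene-wide scores to levels 1 to 4 using Q70, Q93, and Q99."""
    q70, q93, q99 = (quantile(scores, q) for q in (0.70, 0.93, 0.99))
    levels = []
    for s in scores:
        if s > q99:
            levels.append(4)
        elif s > q93:
            levels.append(3)
        elif s > q70:
            levels.append(2)
        else:
            levels.append(1)
    return levels
```

Since the thresholds are recomputed per scene, the share of pixels in each level is fixed by construction (about 70%, 23%, 6%, and 1%), so the maps rank pixels within a scene rather than measure absolute hazard.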

4. Results and Analysis

4.1. Construction of Coherence-Based Synthetic Datasets

4.1.1. Statistical Calibration Using Real Data

All synthesis parameters (e.g., velocity range and jump magnitude) are automatically calibrated using statistics estimated from the open-pit mine InSAR time series observations to mitigate the domain gap between synthetic and real data. Specifically, we compute robust scale statistics of the second-order differences on the real time series to characterize the intensity of observational fluctuations and combine them with quantile-based regression to determine the sampling bounds of kinematic parameters, such as velocity and acceleration. This calibration procedure ensures statistical consistency between the synthetic dataset and the study area in terms of both kinematic characteristics and noise properties. Meanwhile, the generation stage relies only on random sampling from these statistical distributions and does not require access to ground truth deformation, thereby reducing the risk of overfitting to specific samples. The resulting synthetic dataset contains 30,000 time series, preserving the completeness of the mathematical model while reproducing the statistical characteristics of real open-pit mine observations.
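One way to obtain a robust scale statistic from second-order differences is the median absolute deviation (MAD). The text does not name the statistic, so the scaled MAD used here is an assumption, and both function names are hypothetical.

```python
from statistics import median

def second_differences(y):
    """Second-order differences y[t+2] - 2*y[t+1] + y[t]."""
    return [y[t + 2] - 2.0 * y[t + 1] + y[t] for t in range(len(y) - 2)]

def mad_scale(values):
    """Median absolute deviation, scaled to match sigma for Gaussian data."""
    med = median(values)
    return 1.4826 * median(abs(v - med) for v in values)
```

Second differences cancel any linear trend, so the statistic reflects epoch-to-epoch fluctuation rather than steady deformation, and the median-based scale is insensitive to a few large jumps.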

We compare the statistical distributions of the generated dataset with those of the real open-pit mine time series to verify the realism of the synthetic data, as shown in Figure 5. To transparently quantify the remaining sim-to-real offsets, we further report matched-quantile alignment statistics (Q5, Q50, Q95, and Q99) for the absolute deformation velocity and the noise-level proxy estimated from second-order differences (Table 2). The results show that the extreme upper tail of absolute velocity is nearly aligned between real (108.7 mm/yr) and synthetic (107.5 mm/yr) data at Q99, indicating that rare kinematic extremes in the real domain are largely covered by the synthetic training envelope, while the median velocity differs due to broader synthetic sampling for robustness. For the noise proxy, the synthetic distribution exhibits a moderately heavier upper tail, with the Q99 value reaching 7.2 mm compared with 5.9 mm in the real data, which supports conservative training under high-noise and low-coherence conditions.

Beyond the velocity and noise level distributions, we validate the magnitude scale of simulated phase unwrapping jumps using real jump-like outliers extracted from the observations. For each pixel, we compute the maximum absolute first-order difference and treat its high-percentile tail as a proxy set of real unwrapping error samples. The sampling bounds for the simulated jump magnitudes are calibrated to these candidates. As a result, the simulated and real jump magnitude histograms align in their main mass, with a slightly heavier synthetic tail to retain rare large jumps. The synthetic data span the full dynamic range of observed deformation rates. Compared with the real data, synthetic noise shows a more concentrated main mass and a moderately heavier upper tail and is slightly shifted toward higher

σ

, supporting robust training under high-noise and low-coherence conditions. Figure 6 presents representative synthetic time series examples under different deformation modes. The synthetic observations (dashed lines) capture the underlying deformation trend (solid lines) under the validity mask while reasonably incorporating coherence-dependent noise and potential artifacts.

4.1.2. Denoising Performance Evaluation on Synthetic Data

We first conducted a comprehensive evaluation of the constructed synthetic dataset to verify the effectiveness of the proposed CED-LSTM network in handling complex nonlinear deformation behaviors and non-stationary noise. An APS sensitivity analysis was conducted by fixing the trained model and varying the APS generation settings, including the drift intensity and correlation form. The results show only limited performance variation across the tested settings, with no clear improvement from replacing the random walk APS with an AR(1) process. Therefore, the random walk formulation is retained as the default APS model in this study, as it provides a simple and sufficiently robust non-stationary perturbation setting for synthetic data generation.

Figure 7 presents the distribution of denoising performance under different coherence conditions. The upper panel shows a two-dimensional statistical map between coherence and denoising gain, defined as the root mean square error (RMSE) reduction ratio, where color intensity indicates sample density. The thick black curve represents the median within each coherence bin, and the gray curves denote the interquartile range (25th–75th percentiles). The results indicate that within the mid to low coherence interval covered by most samples (approximately 0.0–0.7), the median RMSE reduction ratio remains at 0.55–0.70. Figure 7 also includes an extremely low-coherence example, where the RMSE remains at 0.9 mm, indicating that the observation-guided decoder does not collapse to noise fitting even when coherence < 0.1 and that denoising gains remain substantial under poor observation quality. The third example shows a slow, nonlinear, creep-like trajectory, where the ground truth deformation exhibits a gradual curvature change rather than a step-like jump. Despite the presence of APS-like oscillations in the noisy observation, the denoised prediction tracks the clean nonlinear trend without being straightened, indicating that the second-order smoothing regularizer suppresses temporally inconsistent jitter while not overly penalizing genuine acceleration under the current weighting. The separation between the two gray quartile curves suggests a certain degree of performance variability among samples under the same coherence level, implying that residual errors are related to the specific deformation mode and noise structure. In the high-coherence regime, the reduction ratio decreases primarily because the raw observations already exhibit small errors, leaving limited room for relative improvement. Therefore, this trend does not indicate degraded absolute accuracy after denoising; rather, it reflects diminishing marginal returns of a relative metric under high-quality observation conditions.
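The denoising gain can be made concrete with a small sketch. The exact formula is not spelled out in the text, so the definition used here, one minus the ratio of the denoised RMSE to the raw RMSE, is an assumption; the function names are hypothetical.

```python
import math

def rmse(pred, truth):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def rmse_reduction_ratio(noisy, denoised, truth):
    """Assumed gain definition: 1 - RMSE(denoised) / RMSE(noisy)."""
    raw = rmse(noisy, truth)
    if raw == 0.0:
        return 0.0  # nothing to improve when the raw series is already exact
    return 1.0 - rmse(denoised, truth) / raw
```

Under this definition, halving a 2 mm raw error gives a ratio of 0.5, while halving a 0.2 mm raw error gives the same ratio for a much smaller absolute improvement, which matches the diminishing-returns argument for the high-coherence regime.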

The lower panel provides three representative time series examples to illustrate the correspondence between the statistical results and typical temporal profiles. Despite evident random fluctuations and occasional local discontinuities in the raw sequences under mid to low coherence, the model output effectively suppresses high-frequency noise while preserving deformation trends and key change features consistent with the ground truth. After denoising, the RMSE values relative to the ground truth for the three examples are 0.8 mm, 0.3 mm, and 0.1 mm, respectively. These results further confirm the stable error suppression capability and good generalization performance of the proposed method across different coherence levels, providing reliable support for deformation time series reconstruction under complex observation conditions.

We randomly selected validation samples corresponding to different deformation modes and visualized the results in Figure 8 to provide an intuitive assessment of the signal reconstruction capability. The figure compares the noisy observations, the denoised predictions, and the ground truth. The results show that for steady trend–type deformation, the proposed model effectively suppresses noise components such as high-frequency atmospheric turbulence and reliably recovers the long-term linear or nonlinear subsidence or uplift evolution. For abrupt deformation signals, benefiting from the physics-aware adaptive loss with an adaptive gating mechanism, the model helps reduce transitional over-smoothing in the synthetic examples and retains the onset timing and the main magnitude of step-like changes while suppressing part of the high-frequency noise. Under strong noise conditions, even when the observation trend is difficult to discern visually due to severe contamination, the residual learning formulation still enables the network to leverage contextual information to reconstruct the underlying deformation trajectory.

In addition, we computed the RMSE for each time series over the entire validation set, and the corresponding histogram is shown in Figure 9 (left). The RMSE reached 2.2 mm; the RMSE values of most samples are concentrated in the low-error interval of 0–5 mm, with a mode of approximately 1.5 mm, indicating high reconstruction accuracy for the majority of InSAR time series. A small long-tail portion (RMSE > 20 mm) is also observed, which mainly corresponds to samples with extremely low coherence or severe phase unwrapping errors. We further report an error versus time analysis in Figure 9 (right), where the step-wise absolute error and the prefix RMSE are aggregated across the validation set. The results show a mild increase in error with the time index, whereas the growth slope in the later stage decreases markedly and tends to stabilize; specifically, the median error slope over t = 22–33 is approximately 0.012 mm/step. No long-horizon divergence is observed, indicating that error accumulation under fully autoregressive decoding remains controlled overall. In the real time series data, observations with coherence < 0.1 are considered unreliable and have been excluded. In contrast, 9.84% of the samples in the synthetic validation set have coherence < 0.1 because we intentionally inject low-coherence cases for stress testing and robustness training. Therefore, the pronounced long-tail behavior in the synthetic RMSE distribution should be interpreted as a worst-case robustness assessment, and in our real scenario residual errors are more likely dominated by challenging noise structures, such as severe unwrapping jumps and APS-like disturbances, rather than by coherence < 0.1 itself. This behavior is consistent with the expected physical characteristics and error sources of InSAR observations.

Table 3 reports the ablation results for the synthetic validation set. The single channel baseline LSTM that only uses the deformation time series achieves an RMSE of 2.3 mm. When the CED-LSTM architecture is kept unchanged but the adaptive loss function is not used, the RMSE increases to 2.7 mm. This indicates that the joint optimization objective plays an important role in robust time series reconstruction, especially under low coherence and noisy conditions. Overall, the proposed model achieves high-accuracy deformation recovery on the synthetic dataset, providing a reliable basis for subsequent application to real-world data.

We further perform a dedicated ablation on the velocity consistency weight

λ_{vel}

to clarify the trade-off between noise suppression and abrupt change preservation. Table 4 summarizes the results: when

λ_{vel}

increases from 0 to 0.1, RMSE and MAE consistently decrease while the change point metric F1 improves substantially, indicating stronger kinematic regularization without losing key change structures. However, an overly large weight leads to a rebound in RMSE and MAE with only marginal F1 gain, suggesting over-regularization that starts to penalize rapid transitions. Therefore,

λ_{vel}

= 0.1 provides the best balance and is adopted in the following experiments.

4.2. Application to Real InSAR Observations in the Open-Pit Mine

The trained CED-LSTM network was transferred to the InSAR observations over the open-pit mine. To comprehensively characterize deformation patterns across the mining area, we performed spatial mapping and comparative analyses from two complementary perspectives: cumulative deformation and transient abrupt changes. The spatial distribution of cumulative deformation in the open-pit mine over the three-year period from 2019 to 2022 is shown in Figure 2. Across all three years, the results consistently indicate subsidence concentrated in the central sector, suggesting persistent ground activity or mining-related disturbance rather than sporadic subsidence induced by random noise. Nevertheless, prior to denoising, the background region still exhibits pronounced spatially irregular textures and undulations, making it difficult for deformation gradients to form continuous and stable boundaries. From a temporal evolution perspective, the subsidence anomaly had already emerged in 2019 and continued to accumulate and intensify over the subsequent two years. Interannual comparisons further reveal a progressive strengthening of the anomaly and a moderate outward expansion trend.

The denoised cumulative deformation map is presented in Figure 10. Compared with Figure 2, denoising yields substantial improvements in both spatial structure recovery and interannual interpretability. First, high-frequency background undulations are markedly reduced, and texture-like artifacts, such as banded and ring-shaped patterns, are effectively suppressed, resulting in a background field that more plausibly reflects low deformation and weak variability. Second, the transition between the central subsidence zone and its surrounding areas becomes smoother, which facilitates subsequent analyses, including delineation of anomalous extents, spatial gradient estimation, and deformation-level classification. In the temporal dimension, the denoised results largely preserve the spatial location and the overall morphology of the anomalous zone in the original data. Meanwhile, the enhanced boundary stability makes the year-by-year cumulative intensification over the three years more continuous and more amenable to interannual comparison. This behavior is consistent with the expected deformation accumulation pattern in the mine scene and reduces the interference of noise-induced spatial discontinuities on interannual interpretation. However, without independent ground truth measurements, it should be interpreted as improved scene-level interpretability rather than direct validation of absolute reconstruction accuracy. Importantly, the smoother transition and enhanced boundary stability in Figure 10 suggest that even without explicitly enforcing a joint spatial–temporal correlation structure in the synthetic training data, the proposed model can still reconstruct spatially continuous deformation zones in real mine scenes by suppressing temporally incoherent noise that manifests as spatial texture artifacts.

However, cumulative deformation primarily describes progressive accumulation at interannual scales and is relatively insensitive to short-term disturbances. To capture potential abrupt instabilities, we further mapped the spatial distribution of transient change magnitude, as shown in Figure 11. Notably, the high-transient-change areas do not fully coincide with the regions of maximum cumulative subsidence. Figure 11 indicates that several high-transient points (

| Δ |_{\max}

> 25 mm) are scattered along specific pit-slope locations rather than at the pit bottom. This suggests that although the pit bottom exhibits large cumulative subsidence, it may mainly reflect relatively uniform compaction-related settlement whose risk can be managed through long-term monitoring and engineering control. In contrast, strong transient signals on pit slopes are more indicative of localized precursors to slope failure or collapse, implying greater abruptness and potential destructiveness. These results demonstrate that the proposed bidirectional transient detection algorithm can effectively identify short-term intense disturbances that may be obscured by cumulative deformation signals.

4.3. Deformation-Level Classification Results

A sensitivity analysis was conducted for the sliding window length W and for transferability across different monitoring frequencies simulated by temporal subsampling. The results show that the deformation-level score ranking remains highly stable across the tested settings, while the proportion of pixels assigned to higher-risk levels changes only moderately. Therefore, W = 3 is retained as the default setting in this study as a balanced choice. When the revisit interval changes, a time-span-matched window is recommended to maintain comparable temporal context. We further evaluated the stability of the kNN gradient estimator near boundaries and spatial data gaps by repeating the gradient and deformation-level calculations with different neighborhood sizes. The results show that although the local gradient values vary with neighborhood size in these edge or sparse-data regions, the final deformation-level score and the extent of higher-risk zones remain largely stable. Therefore, k = 8 is retained as the default setting in this study as a balanced choice between local sensitivity and overall robustness.

Based on the proposed adaptive deformation-level classification framework, we integrated long-term trend indicators, transient change metrics, and spatial gradient features to generate deformation-level maps for the 2019–2022 period, as shown in Figure 12. Deformation levels were adaptively determined using scene-wide quantile thresholds (P70, P93, P99), thereby maintaining comparability and stability of the classification results under variations in spatial coverage and noise conditions. The results indicate that Level III and Level IV areas are primarily concentrated within the central deformation anomalous belt, exhibiting a core-to-periphery spatial gradient. The Level IV core consistently emerges in the east central sector and is surrounded by a Level III transition band. Beyond the central belt, Level III pixels are also scattered along pit slopes and surrounding areas, suggesting localized disturbances with discontinuous spatial expression. Comparing the three years, higher-level areas generally evolve from relatively patchy clusters to a more connected, belt-like pattern, and the Level IV region becomes increasingly continuous along the anomalous belt, implying that deformation anomalies not only intensify but also become more spatially coherent. Accordingly, Level IV zones are interpreted here as candidate hotspots for follow-up inspection and monitoring under limited resources, while Level III transition zones indicate areas that may warrant closer observation. In this study, the hotspot denotes the upper-tail ranking of the deformation-level score for screening purposes. Without incident or risk ground truth, it should be treated as a cue for follow-up field verification.
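As a minimal sketch of the quantile-based level assignment, the following assumes the scores are held in a NumPy array with NaN marking invalid pixels; the function name `assign_levels` and the beta-distributed demonstration scores are illustrative assumptions, while the percentiles (P70, P93, P99) are those stated above.

```python
import numpy as np

def assign_levels(scores, percentiles=(70.0, 93.0, 99.0)):
    """Map scores to Levels 1..4 using scene-wide percentile thresholds."""
    scores = np.asarray(scores, dtype=float)
    valid = np.isfinite(scores)
    thresholds = np.percentile(scores[valid], percentiles)
    levels = np.zeros(scores.shape, dtype=int)  # 0 marks invalid pixels
    # np.digitize yields 0..3 for three ascending thresholds; shift to 1..4.
    levels[valid] = np.digitize(scores[valid], thresholds) + 1
    return levels, thresholds

rng = np.random.default_rng(0)
demo_scores = rng.beta(2.0, 5.0, size=10_000)  # synthetic stand-in for real scores
demo_levels, demo_thresholds = assign_levels(demo_scores)
level_fractions = np.bincount(demo_levels, minlength=5)[1:] / demo_scores.size
```

By construction the four levels hold about 70%, 23%, 6%, and 1% of valid pixels in every scene, which is why the class proportions stay fixed from year to year.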

We quantified the impact of extremely low-coherence pixels on risk identification. In the real scene, no valid time series have coherence below 0.1, including within the Level III–IV regions. This indicates that the overall risk map in our application is not influenced by extremely low-coherence outliers and that high-risk delineation is driven by consistent temporal deformation behaviors rather than noise-dominated, low-coherence observations. This evolution from patchy clusters to a more connected belt-like pattern, especially the increasingly continuous Level IV core, indicates that the proposed denoising and scoring framework does not fragment spatially coherent deformation zones; instead, it enhances spatial continuity, which is beneficial for reliable delineation in engineering interpretation. Different attenuation settings were tested to examine how coherence-based attenuation affects the final level assignment. The results indicate that this term mainly provides conservative screening in low-coherence areas while keeping the overall delineation in moderate- to high-coherence regions broadly stable. Therefore, $\gamma = 0.3$ is retained in this study as a mild confidence-aware adjustment rather than a dominant factor in hazard mapping.

Because quantile thresholds stabilize class proportions by design, we further report scene-level deformation-level indicators to capture the absolute prevalence of high scores in each year. Specifically, Table 5 summarizes the mean score, a high-percentile score (P95), and the exceedance ratio above a fixed threshold $\tau = 0.6$ on the [0, 1] scale. As shown in Table 5, the intensity indicators vary across years even though the quantile-based levels keep fixed proportions. In our results, the March 2020 to March 2021 interval exhibits the highest mean score and exceedance ratio, indicating a higher absolute prevalence of elevated-risk pixels. Meanwhile, the P95 score remains high between March 2021 and March 2022, suggesting that the risk tail’s intensity is persistently strong. These distribution-based indicators therefore provide a complementary view to avoid underestimating overall risk in years with scene-wide score shifts.
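The three scene-level indicators reported in Table 5 can be computed along the following lines. This is a small sketch under the assumption that scores lie on the [0, 1] scale with NaN for invalid pixels; the function name `scene_indicators` and the toy values are hypothetical.

```python
import numpy as np

def scene_indicators(scores, tau=0.6):
    """Mean score, P95 score, and fraction of valid pixels with score >= tau."""
    scores = np.asarray(scores, dtype=float)
    scores = scores[np.isfinite(scores)]  # drop invalid pixels
    return {
        "mean": float(scores.mean()),
        "p95": float(np.percentile(scores, 95)),
        "exceedance_ratio": float(np.mean(scores >= tau)),
    }

toy_scores = np.array([0.1, 0.2, 0.3, 0.65, 0.9, np.nan])  # illustrative values only
toy_stats = scene_indicators(toy_scores)
```

Unlike the quantile levels, these statistics use a fixed threshold, so they can shift between years when the whole score distribution moves.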

We further applied the Jenks natural breaks method with k = 4 to the same deformation-level scores and compared the results with the quantile-based classification to examine the sensitivity of the classification strategy to spatial pattern interpretation. The Jenks-based deformation-level map is shown in Figure 13. Compared with the deformation-level maps in Figure 12, the natural break results produce a substantially different delineation of higher-level areas, with a marked increase in the proportion of Level III–IV pixels, leading to a more aggressive extraction of strong deformation anomaly zones. Although a pronounced Level IV core can still be identified within the central anomalous belt, Level III and Level IV pixels become much more widespread across peripheral sectors and pit slope margins and appear in a more scattered manner. Their spatial patterns also vary more noticeably from year to year, showing stronger fragmentation and larger extent fluctuations. This indicates that natural break thresholds are more sensitive to year-specific distributional changes and extreme values, so the same class can represent different deformation intensity and spatial coverage across years, which reduces interannual comparability. In contrast, the quantile-based strategy anchors the thresholds to fixed percentiles and stabilizes class proportions, so year by year changes primarily reflect expansion, contraction, and morphological evolution of the core anomalous belt, making it more suitable for consistent annual monitoring and comparison. Our preference for quantile thresholds is motivated by interannual comparability and operational stability, and the resulting levels should be interpreted as relative deformation anomaly ranks that require follow-up verification.
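To make the contrast with fixed percentiles concrete, the sketch below implements one-dimensional natural breaks by exact dynamic programming, minimizing the within-class sum of squared deviations. It is an illustrative implementation, not necessarily the routine used to produce Figure 13; the function name `jenks_breaks` and the toy values are assumptions, and the cubic-free but quadratic cost makes it suitable only for small samples or subsampled scores.

```python
import numpy as np

def jenks_breaks(values, n_classes):
    """Lower bounds of classes 2..n_classes that minimise within-class squared deviation."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    # Prefix sums give the squared deviation of any slice x[i:j] in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    cost = np.full((n_classes + 1, n + 1), np.inf)
    split = np.zeros((n_classes + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for c in range(1, n_classes + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):  # class c covers x[i:j]
                cand = cost[c - 1, i] + ssd(i, j)
                if cand < cost[c, j]:
                    cost[c, j] = cand
                    split[c, j] = i
    # Backtrack the start index of each class from the last to the second.
    breaks, j = [], n
    for c in range(n_classes, 1, -1):
        i = int(split[c, j])
        breaks.append(x[i])
        j = i
    return np.array(breaks[::-1])

toy_scores = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 50.0, 51.0, 100.0])
toy_breaks = jenks_breaks(toy_scores, 4)
toy_classes = np.digitize(toy_scores, toy_breaks) + 1  # classes 1..4
```

Because the breaks follow the gaps in each year's score distribution, the share of samples in the top classes is not fixed, which is the behavior discussed above.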

Three representative points from different deformation levels were selected for time series comparison to validate the reliability of the level classification and the model’s capability to characterize subtle deformation signatures, as shown in Figure 14. The stable point (Level I, ID: 2557) is located in a background stable area. Its raw observation series (orange dashed line) exhibits pronounced random fluctuations due to noise perturbations with no coherent deformation trend. After CED-LSTM denoising, the sequence remains stable with only minor fluctuations occurring within the range of approximately −3 to 0 mm and without spurious drift, indicating that the model effectively suppresses high-frequency disturbances such as atmospheric noise while maintaining a low false alarm rate and avoiding misinterpretation of noise as genuine deformation. The linear trend point (Level III, ID: 10794) shows persistent subsidence. The denoised curve reveals a clearer near-linear downward trajectory and yields an estimated long-term rate of approximately −65.0 mm/yr. Although the cumulative subsidence at this point is large, no abrupt jumps or pronounced nonlinear acceleration are observed; it is therefore reasonably classified as a moderate deformation level, demonstrating the framework’s ability to distinguish progressive trend-related risk from abrupt instability risk.
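A long-term rate such as the one quoted for the linear trend point can be estimated with an ordinary least-squares line fit. The sketch below assumes this estimator, a hypothetical 12-day sampling interval, and synthetic data with a known rate; none of these details are taken from the processing chain used here.

```python
import numpy as np

def linear_rate_mm_per_yr(days, disp_mm):
    """Least-squares slope of displacement against time, in mm/yr."""
    t_years = np.asarray(days, dtype=float) / 365.25
    slope, _intercept = np.polyfit(t_years, np.asarray(disp_mm, dtype=float), 1)
    return float(slope)

days = np.arange(0, 366, 12)            # hypothetical acquisition days
truth_mm = -65.0 * days / 365.25        # synthetic series with a known rate
rng = np.random.default_rng(1)
noisy_mm = truth_mm + rng.normal(0.0, 2.0, size=days.size)  # assumed 2 mm noise
rate_clean = linear_rate_mm_per_yr(days, truth_mm)
rate_noisy = linear_rate_mm_per_yr(days, noisy_mm)
```

A single slope summarizes progressive subsidence well but carries no information about abrupt steps, which is why the transient indicator is evaluated separately.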

In contrast, the transient anomaly point (Level IV, ID: 12571) highlights the proposed method’s advantage in identifying nonlinear deformation. This point exhibits intense short-term variations in the monitoring sequence. The denoised result not only preserves the continuity of the overall subsidence process but also more prominently retains the inflections caused by stage-wise changes, resulting in a transient change magnitude exceeding 13.0 mm and increasing the deformation-level score to 0.78. This observation suggests that relying solely on cumulative deformation or the linear rate may be insufficient to fully capture short-term disturbances associated with localized instability. By incorporating transient features, the proposed framework can effectively capture potential abrupt anomalies, allowing higher deformation levels to emphasize nonlinear and abrupt change points and thereby improving the engineering relevance and interpretive consistency of the level classification results.

5. Discussion

5.1. Denoising Performance

In this study, we quantified the distribution of the absolute differences in cumulative deformation before and after denoising for the three-year time series to assess whether denoising introduces amplitude bias, and the histogram is shown in Figure 15. For the three annual datasets, the P50 percentiles are 2.1 mm, 2.5 mm, and 3.8 mm, respectively, whereas the P90 percentiles are 5.2 mm, 6.0 mm, and 10.1 mm, respectively. These statistics indicate that the denoising-induced changes in cumulative deformation are predominantly at the millimeter level over the entire scene. This suggests that the model mainly suppresses random fluctuations while preserving the overall magnitude and spatial pattern of cumulative deformation, rather than uniformly compressing deformation amplitudes. Meanwhile, the distribution exhibits a sparse long tail, and both the P90 percentile and tail extent increase in later years, implying that a small number of pixels undergo relatively large adjustments after denoising. These pixels are mainly located in areas dominated by low coherence, missing observations, or phase unwrapping errors, indicating that the model’s strong corrections are selective and primarily applied to locations with weak observational reliability. Overall, the proposed denoising method improves the interpretability of anomalous signals while maintaining deformation amplitude stability, and it provides more robust input for subsequent deformation feature extraction.
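The amplitude-bias check above reduces to percentiles of per-pixel absolute differences. A minimal sketch follows, assuming cumulative deformation is stored per pixel in millimeters with NaN for invalid pixels; the function name `denoising_shift_stats` and the toy values are hypothetical.

```python
import numpy as np

def denoising_shift_stats(cum_raw_mm, cum_denoised_mm):
    """P50 and P90 of |denoised - raw| cumulative deformation over valid pixels."""
    raw = np.asarray(cum_raw_mm, dtype=float)
    den = np.asarray(cum_denoised_mm, dtype=float)
    diff = np.abs(den - raw)
    diff = diff[np.isfinite(diff)]  # pixels invalid in either input are dropped
    p50, p90 = np.percentile(diff, [50, 90])
    return float(p50), float(p90)

raw_mm = np.array([0.0, -10.0, -20.0, -30.0, np.nan])      # illustrative values
denoised_mm = np.array([1.0, -12.0, -17.0, -34.0, -5.0])
p50_mm, p90_mm = denoising_shift_stats(raw_mm, denoised_mm)
```

Reporting both the median and an upper percentile separates the typical adjustment from the long tail of strongly corrected pixels.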

5.2. Engineering Applicability Analysis

Relying solely on cumulative deformation or the linear deformation rate is often insufficient to characterize the nonlinear evolution of instability in mining areas because high-risk signals may manifest as short-term abrupt changes, stage-wise turning points, or locally differential subsidence. In this study, the denoised time series were further decomposed into trend-related risk and transient risk components, which were then integrated into a composite deformation-level indicator through robust normalization and weighted fusion. This design aims to preserve the distinguishability of risk contributions associated with different failure mechanisms, thereby avoiding the simplistic assumption that larger cumulative deformation necessarily implies a higher deformation level. In practical engineering scenarios, differential subsidence and local deformation gradients are often more indicative of damage modes than the magnitude at a single pixel. In the case of an open-pit mine, the primary benefit of denoising is reflected in structural improvements, including suppression of background texture artifacts and enhanced continuity of anomaly boundaries. These improvements increase the reliability of interannual comparisons and anomalous area delineation and directly enhance the interpretability of subsequent gradient estimation and deformation-level classification, rather than merely reducing numerical errors. The representative point analysis further demonstrates that the CED-LSTM network can distinguish among different deformation patterns. In particular, the transient anomaly point highlights stage-wise turning behavior while preserving the continuity of the overall subsidence process, resulting in a high transient change magnitude and a markedly increased deformation-level score. This indicates that incorporating transient components enables high-level warnings to better focus on nonlinear and abrupt risks.
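The idea of robust normalization followed by weighted fusion can be sketched as follows. This is only an illustration of the general pattern: the percentile-clipping normalization, the 5th and 95th percentile bounds, the weights (0.4, 0.4, 0.2), and the function names are all assumptions and do not reproduce the exact formulation or the coherence attenuation used in this study.

```python
import numpy as np

def robust_unit_scale(x, lo_pct=5.0, hi_pct=95.0):
    """Scale to [0, 1] between two percentiles so that outliers saturate."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanpercentile(x, [lo_pct, hi_pct])
    if hi <= lo:
        return np.zeros_like(x)
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

def fuse_score(trend, transient, gradient, weights=(0.4, 0.4, 0.2)):
    """Weighted sum of robustly normalised indicator magnitudes (hypothetical weights)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    parts = [robust_unit_scale(np.abs(v)) for v in (trend, transient, gradient)]
    return w[0] * parts[0] + w[1] * parts[1] + w[2] * parts[2]

# Four illustrative pixels: mild, strong trend, strong transient, stable.
trend_mm_yr = np.array([-5.0, -60.0, -10.0, -2.0])
transient_mm = np.array([1.0, 3.0, 20.0, 1.0])
gradient_mm_m = np.array([0.1, 0.5, 0.4, 0.1])
fused = fuse_score(trend_mm_yr, transient_mm, gradient_mm_m)
```

In this toy case the pixel with a modest trend but a large transient change scores close to the pixel with the largest trend, which illustrates why a larger cumulative deformation does not automatically imply a higher level.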

Despite the strong performance in point-wise time series denoising and feature extraction, several limitations remain. First, systematic biases that are weakly correlated with coherence may not be fully distinguishable from true long-term trends in a purely pixel-wise residual denoising setting; additional preprocessing or spatially coupled bias modeling could further improve robustness. Second, while we adopt a shallow unidirectional LSTM encoder–decoder for efficiency and to avoid over-smoothing abrupt changes, the potential benefits of deeper or bidirectional recurrent architectures for representing very long-term nonlinear trends have not yet been systematically quantified and will be explored in future work. Third, the current pixel-wise pipeline does not exploit local spatial correlation. In extremely low-coherence regions or under large contiguous unwrapping error patches, single-pixel temporal cues may be insufficient for reliable recovery.

Future work will therefore focus on developing spatiotemporal deep learning models with spatial context constraints and upgrading the synthetic generator from pixel-wise noise injection to a spatiotemporally coupled simulation. One example is to model the atmospheric phase screen (APS) as a spatially correlated random field with random-walk temporal evolution, so that the joint spatial–temporal correlation structure can be explicitly controlled during training. Beyond line-of-sight deformation from InSAR, future studies may integrate geological structural maps, rainfall records, and optical imagery to establish a more comprehensive multi-source landslide susceptibility assessment framework. Furthermore, we will explore physics-consistent probabilistic strategies. These include coupling physically based probabilistic stability models with deep learning to explicitly characterize parameter uncertainty and coupling physically based initiation models with runout process models (e.g., TRIGRS and RAMMS) to support more robust and physically interpretable hazard chain assessment [39,40].
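A spatially correlated random field with random-walk temporal evolution could be prototyped as in the sketch below. It is an assumption-laden illustration of the idea rather than a planned implementation: the Gaussian spectral filter, the periodic boundaries, the correlation length in pixels, the step size, and the function names are all chosen for brevity.

```python
import numpy as np

def correlated_field(shape, corr_len_px, rng):
    """Unit-variance field: white noise smoothed by a Gaussian filter via FFT."""
    ny, nx = shape
    fy = np.fft.fftfreq(ny)[:, None]
    fx = np.fft.fftfreq(nx)[None, :]
    # Fourier transform of a Gaussian kernel with standard deviation corr_len_px.
    transfer = np.exp(-2.0 * (np.pi * corr_len_px) ** 2 * (fx ** 2 + fy ** 2))
    white = rng.standard_normal(shape)
    field = np.fft.ifft2(np.fft.fft2(white) * transfer).real
    return field / field.std()

def simulate_aps(n_epochs, shape, corr_len_px=8.0, step_mm=1.0, seed=0):
    """Random walk in time whose increments are spatially correlated fields."""
    rng = np.random.default_rng(seed)
    increments = np.zeros((n_epochs,) + tuple(shape))  # epoch 0 is the zero reference
    for t in range(1, n_epochs):
        increments[t] = step_mm * correlated_field(shape, corr_len_px, rng)
    return np.cumsum(increments, axis=0)

aps = simulate_aps(n_epochs=30, shape=(64, 64))
first_step = aps[1]
neighbour_corr = np.corrcoef(first_step[:, :-1].ravel(), first_step[:, 1:].ravel())[0, 1]
```

Here `corr_len_px` controls the spatial correlation and `step_mm` controls how quickly the delay drifts in time, so the two correlation structures can be tuned independently.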

6. Conclusions

This study proposes a coherence-conditioned encoder–decoder LSTM denoising network, termed CED-LSTM, for separating nonlinear change point signals from noise backgrounds. By embedding the statistical characteristics of real mining area deformation and a coherence-dependent noise model into the synthetic data generation process, we constructed a high-fidelity training domain, effectively mitigating the simulation-to-real domain gap that commonly limits deep-learning-based InSAR applications. On the synthetic validation set, the proposed model demonstrates strong signal reconstruction ability, achieving an RMSE as low as 2.2 mm. Furthermore, aided by the proposed adaptive gating loss, the network effectively suppresses atmospheric turbulence and decorrelation noise while retaining key transient abrupt change signatures. In the application to the open-pit mine, the method reveals complex spatiotemporal deformation patterns and highlights a small fraction of localized areas with pronounced deformation anomalies. Time series analyses show that these anomalous points are characterized not only by notable cumulative deformation but also by step-like acceleration that can be overlooked by traditional linear models. The resulting deformation sequences and derived descriptors provide robust inputs for zone delineation and decision support in mine monitoring. However, because no independent in situ deformation measurements are available in this study, the real scene application results should be interpreted as improved scene-level interpretability rather than direct validation of absolute reconstruction accuracy. Within this limitation, the proposed method improves deformation extraction accuracy and computational efficiency, offering broad practical potential for routine InSAR time series interpretation and deformation screening in open-pit mining environments.

Author Contributions

Conceptualization, Y.W. and Z.B.; methodology, Y.W., X.K. and Z.B.; software, Z.B.; validation, X.K.; formal analysis, X.K.; investigation, Y.W. and Z.B.; resources, Y.W. and Z.B.; writing—original draft preparation, Y.W., X.K. and Z.B.; writing—review and editing, Y.W., Z.B., Y.L. (Yao Lu), W.T., Y.L. (Yang Li), Y.L. (Yun Lin), W.S. and G.C.; visualization, Z.B.; supervision, Z.B.; project administration, Y.W.; funding acquisition, Z.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the National Natural Science Foundation of China under Grant 42501571 and the R&D Program of the Beijing Municipal Education Commission under Grant KM202410009001.

Data Availability Statement

The code for this study has been publicly released on GitHub and is available at https://github.com/kongxiangbo7/CED-LSTM.git (accessed on 18 March 2026).

Acknowledgments

We thank the European Space Agency (ESA) for providing us with the Sentinel-1 dataset for research purposes in this project. We also thank the Alaska Satellite Facility (ASF) of the University of Alaska for providing us with the platform for downloading Sentinel-1 data.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Du, S.; Li, W.; Li, J.; Du, S.; Zhang, C.; Sun, Y. Open-Pit Mine Change Detection from High Resolution Remote Sensing Images Using DA-UNet++ and Object-Based Approach. Int. J. Min. Reclam. Environ. 2022, 36, 512–535.
2. Guo, J.; Li, Q.; Xie, H.; Li, J.; Qiao, L.; Zhang, C.; Yang, G.; Wang, F. Monitoring of Vegetation Disturbance and Restoration at the Dumping Sites of the Baorixile Open-Pit Mine Based on the LandTrendr Algorithm. Int. J. Environ. Res. Public Health 2022, 19, 9066.
3. Massonnet, D.; Rossi, M.; Carmona, C.; Adragna, F.; Peltzer, G.; Feigl, K.; Rabaute, T. The displacement field of the Landers earthquake mapped by radar interferometry. Nature 1993, 364, 138–142.
4. Bhattacharya, A.; Mukherjee, K. Review on InSAR based displacement monitoring of Indian Himalayas: Issues, challenges and possible advanced alternatives. Geocarto Int. 2017, 32, 298–321.
5. Pedretti, L.; Bordoni, M.; Vivaldi, V.; Figini, S.; Parnigoni, M.; Grossi, A.; Lanteri, L.; Tararbra, M.; Negro, N.; Meisina, C. InterpolatiON of InSAR Time series for the dEtection of ground deforMatiOn eVEnts (ONtheMOVE): Application to slow-moving landslides. Landslides 2023, 20, 1797–1813.
6. Du, S.; Du, S.; Liu, B.; Zhang, X. Incorporating DeepLabv3+ and Object-Based Image Analysis for Semantic Segmentation of Very High Resolution Remote Sensing Images. Int. J. Digit. Earth 2021, 14, 357–378.
7. Wang, C.; Chang, L.; Zhao, L.; Niu, R. Automatic Identification and Dynamic Monitoring of Open-Pit Mines Based on Improved Mask R-CNN and Transfer Learning. Remote Sens. 2020, 12, 3474.
8. Bai, Z.; Zhao, F.; Wang, J.; Li, J.; Wang, Y.; Li, Y.; Lin, Y.; Shen, W. Revealing Long-Term Displacement and Evolution of Open-Pit Coal Mines Using SBAS-InSAR and DS-InSAR. Remote Sens. 2025, 17, 1821.
9. Ferretti, A.; Prati, C.; Rocca, F. Permanent scatterers in SAR interferometry. IEEE Trans. Geosci. Remote Sens. 2001, 39, 8–20.
10. Berardino, P.; Fornaro, G.; Lanari, R.; Sansosti, E. A new algorithm for surface deformation monitoring based on small baseline differential SAR interferograms. IEEE Trans. Geosci. Remote Sens. 2002, 40, 2375–2383.
11. Hooper, A.; Zebker, H.; Segall, P.; Kampes, B. A new method for measuring deformation on volcanoes and other natural terrains using InSAR persistent scatterers. Geophys. Res. Lett. 2004, 31, L23611.
12. Crosetto, M.; Monserrat, O.; Cuevas-González, M.; Devanthéry, N.; Crippa, B. Persistent Scatterer Interferometry: A review. ISPRS J. Photogramm. Remote Sens. 2016, 115, 78–89.
13. Poggi, F.; Caleca, F.; Nardini, O.; Barbadori, F.; Del Soldato, M.; De Luca, C.; Casu, F.; Bonano, M.; Lanari, R.; Tofani, V.; et al. Sentinel-1 imagery for wide-scale quantitative landslide vulnerability assessment of buildings. Remote Sens. Environ. 2026, 115199, 0034–4257.
14. Liu, X.; Zhao, C.; Zhang, Q.; Lu, Z.; Li, Z.; Yang, C.; Zhu, W.; Liu-Zeng, J.; Chen, L.; Liu, C. Integration of Sentinel-1 and ALOS/PALSAR-2 SAR datasets for mapping active landslides along the Jinsha River corridor, China. Eng. Geol. 2021, 284, 106033.
15. Di Martire, D.; Paci, M.; Confuorto, P.; Costabile, S.; Guastaferro, F.; Verta, A.; Calcaterra, D. A nation-wide system for landslide mapping and risk management in Italy: The second Not-ordinary Plan of Environmental Remote Sensing. Int. J. Appl. Earth Obs. Geoinf. 2017, 63, 143–157.
16. Wang, L.; Yang, L.; Wang, W.; Chen, B.; Sun, X. Monitoring Mining Activities Using Sentinel-1A InSAR Coherence in Open-Pit Coal Mines. Remote Sens. 2021, 13, 4485.
17. Li, X.; Zhang, X.; Shen, W.; Zeng, Q.; Chen, P.; Qin, Q.; Li, Z. Research on the Mechanism and Control Technology of Coal Wall Sloughing in the Ultra-Large Mining Height Working Face. Int. J. Environ. Res. Public Health 2023, 20, 868.
18. Zhao, Z.; Wu, Z.; Zheng, Y.; Ma, P. Recurrent neural networks for atmospheric noise removal from InSAR time series with missing values. ISPRS J. Photogramm. Remote Sens. 2021, 180, 227–237.
19. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the MICCAI 2015, Munich, Germany, 5–9 October 2015; pp. 234–241.
20. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
21. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
22. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Advances in Neural Information Processing Systems 30 (NeurIPS 2017); Curran Associates, Inc.: Red Hook, NY, USA, 2017.
23. Ghorbanzadeh, O.; Blaschke, T.; Gholamnia, K.; Meena, S.R.; Tiede, D.; Aryal, J. Evaluation of Different Machine Learning Methods and Deep-Learning Convolutional Neural Networks for Landslide Detection. Remote Sens. 2019, 11, 196.
24. Chen, C.; Dai, K.; Tang, X.; Cheng, J.; Pirasteh, S.; Wu, M.; Shi, X.; Zhou, H.; Li, Z. Removing InSAR Topography-Dependent Atmospheric Effect Based on Deep Learning. Remote Sens. 2022, 14, 4171.
25. Zhou, H.; Dai, K.; Tang, X.; Xiang, J.; Li, R.; Wu, M.; Peng, Y.; Li, Z. Time-Series InSAR with Deep-Learning-Based Topography-Dependent Atmospheric Delay Correction for Potential Landslide Detection. Remote Sens. 2023, 15, 5287.
26. Sun, X.; Zimmer, A.; Mukherjee, S.; Kottayil, N.K.; Ghuman, P.; Cheng, I. DeepInSAR—A Deep Learning Framework for SAR Interferometric Phase Restoration and Coherence Estimation. Remote Sens. 2020, 12, 2340.
27. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36.
28. Murdaca, G.; Rucci, A.; Prati, C. Deep Learning for InSAR Phase Filtering: An Optimized Framework for Phase Unwrapping. Remote Sens. 2022, 14, 4956.
29. Ball, J.E.; Anderson, D.T.; Chan, C.S. Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools, and Challenges for the Community. J. Appl. Remote Sens. 2017, 11, 042609.
30. Zhu, X.; Montazeri, S.; Ali, M.; Hua, Y.; Wang, Y.; Mou, L.; Shi, Y.; Xu, F.; Bamler, R. Deep Learning Meets SAR: Concepts, Models, Pitfalls, and Perspectives. IEEE Geosci. Remote Sens. Mag. 2021, 9, 143–172.
31. Vijay Kumar, S.; Sun, X.; Wang, Z.; Goldsbury, R.; Cheng, I. A U-Net Approach for InSAR Phase Unwrapping and Denoising. Remote Sens. 2023, 15, 5081.
32. Wang, J.; Li, C.; Li, L.; Huang, Z.; Wang, C.; Zhang, H.; Zhang, Z. InSAR Time-Series Deformation Forecasting Surrounding Salt Lake Using Deep Transformer Models. Sci. Total Environ. 2023, 858, 159744.
33. Zhao, Z.; Qiao, K.; Liu, Y.; Chen, J.; Li, C. Geochemical Data Mining by Integrated Multivariate Component Data Analysis: The Heilongjiang Duobaoshan Area (China) Case Study. Minerals 2022, 12, 1035.
34. Torres, R.; Snoeij, P.; Geudtner, D.; Bibby, D.; Davisson, M.; Attema, E.; Potin, P.; Rommen, B.; Floury, N.; Brown, M.; et al. GMES Sentinel-1 mission. Remote Sens. Environ. 2012, 120, 9–24.
35. Ho Tong Minh, D.; Hanssen, R.; Rocca, F. Radar interferometry: 20 years of development in time series techniques and future perspectives. Remote Sens. 2020, 12, 1364.
36. Just, D.; Bamler, R. Phase Statistics of Interferograms with Applications to Synthetic Aperture Radar. Appl. Opt. 1994, 33, 4361–4368.
37. Emardson, T.R.; Simons, M.; Webb, F.H. Neutral atmospheric delay in interferometric synthetic aperture radar applications: Statistical description and mitigation. J. Geophys. Res. 2003, 108, 2231.
38. Chen, C.; Zebker, H. Phase unwrapping for large SAR interferograms: Statistical segmentation and generalized network models. IEEE Trans. Geosci. Remote Sens. 2002, 40, 1709–1719.
39. Cui, H.-Z.; Tong, B.; Wang, T.; Dou, J.; Ji, J. A hybrid data-driven approach for rainfall-induced landslide susceptibility mapping: Physically-based probabilistic model with convolutional neural network. J. Rock Mech. Geotech. Eng. 2025, 17, 4933–4951.
40. Musaib, A.; Aparna, V.; Divya, P.V. Integrating TRIGRS and RAMMS for the spatiotemporal prediction of rainfall induced landslides and landslide trajectory: A case study. Nat. Hazards 2026, 122, 97.

Figure 1. The study area. Blue lines represent the scope of the open-pit mining area.

Figure 2. Spatial distribution of deformation in the open-pit mine. (a) March 2019–March 2020, (b) March 2020–March 2021, (c) March 2021–March 2022.

Figure 3. Flowchart of the proposed method. Arrows indicate the direction of information flow, and colors distinguish different network components.

Figure 4. The network architecture of CED-LSTM. Different colors represent different functional components of the network, where the blue block denotes the encoder, the yellow block denotes the decoder, and other colors indicate input features and predicted outputs.

Figure 5. Statistical consistency verification between the open-pit mine observations and the synthetic datasets. The left panel shows the histogram of deformation velocity. The middle panel shows the noise level histogram over the main density range. The right panel shows the histogram of phase unwrapping jump magnitude consistency, where the dashed line denotes the real tail threshold (P95).

Figure 6. Examples of synthetic InSAR time series covering six representative deformation modes.

Figure 7. Performance on the synthetic validation dataset. The upper panel shows the relationship between coherence and denoising gain, where color intensity indicates sample density, the black solid line represents the median within each coherence bin, and the gray curves denote the interquartile range. The lower panel presents three representative time series examples, where the blue solid line indicates the ground truth, the orange dashed line denotes the noisy observation, and the green solid line represents the CED-LSTM denoised prediction.

Figure 8. Representative denoising examples on the validation set. Before denoising denotes the noisy input observation and after denoising denotes the CED-LSTM prediction.

Figure 9. Long-sequence error characterization on the synthetic validation set. The left panel shows the histogram of per-sequence root mean square error (RMSE) across the validation dataset. The right panel shows the error versus time analysis under fully autoregressive decoding. The shaded region represents the 25th–75th percentile range.

Figure 10. Spatial distribution of cumulative deformation in the open-pit mine after denoising. (a) March 2019–March 2020, (b) March 2020–March 2021, (c) March 2021–March 2022.

Figure 11. Spatial distribution of transient abrupt change magnitude. (a) March 2019–March 2020, (b) March 2020–March 2021, (c) March 2021–March 2022.

Figure 12. Deformation-level classification results using the quantile-based method. Colors indicate Level I to Level IV. The classification is produced by fusing long-term trends, transient change, and spatial gradient indicators and then applying scene-wide quantile thresholds. (a) March 2019–March 2020, (b) March 2020–March 2021, (c) March 2021–March 2022.

Figure 13. Deformation-level classification results using the Jenks natural breaks method. Colors indicate Level I to Level IV. The classification is produced by fusing long-term trend, transient change, and spatial gradient indicators and then applying the Jenks natural breaks method. (a) March 2019–March 2020, (b) March 2020–March 2021, (c) March 2021–March 2022.

Figure 14. Time series analysis of representative points under different deformation levels. Points are selected to be typical of each level and to have sufficient valid observations.

Figure 15. Histogram of the absolute differences in cumulative deformation before and after denoising. For each year, the absolute difference is computed between the cumulative deformation derived from the original time series and that derived from the denoised time series. The unit is millimeters, and statistics are computed over all valid pixels in the scene.

Table 1. Main parameters of the Sentinel-1 data.

Parameter	Value
Flight direction	Ascending
Beam mode	IW
Polarization	VV
Wave band	C
Wavelength/cm	5.6
Number of images	92
Monitored period	March 2019–March 2022

Table 2. Matched quantile alignment between real and synthetic domains for key calibration statistics.

Metric	Domain	Q5	Q50	Q95	Q99
$|\mathrm{Velocity}|$/(mm/yr)	Real	1.1	11.0	35.9	108.8
$|\mathrm{Velocity}|$/(mm/yr)	Synthetic	0.7	21.6	72.9	107.5
$|\mathrm{Velocity}|$/(mm/yr)	Offset	+0.4	−10.5	−37.0	+1.2
Noise $\sigma$ from 2nd diff/(mm)	Real	1.0	2.0	4.4	5.9
Noise $\sigma$ from 2nd diff/(mm)	Synthetic	1.5	2.1	5.6	7.2
Noise $\sigma$ from 2nd diff/(mm)	Offset	−0.5	−0.1	−1.3	−1.2

Table 3. Ablation results on the synthetic validation set.

Variant	RMSE/mm	MAE/mm	F1
Baseline LSTM	2.3	1.9	0.86
CED-LSTM without adaptive loss function	2.7	2.1	0.29
CED-LSTM	2.2	1.8	0.86

Table 4. Ablation of the velocity consistency weight on the synthetic validation set.

$λ_{vel}$	RMSE/mm	MAE/mm	F1
0	2.8	2.2	0.24
0.001	2.7	2.1	0.36
0.01	2.4	1.9	0.66
0.1	2.2	1.8	0.86
0.3	2.5	2.1	0.87

Table 5. Interannual comparison of scene-level score distribution statistics.

Monitored Period	Mean (Score)	P95 (Score)	Ratio (Score ≥ 0.6)
March 2019–March 2020	0.3353	0.7405	0.1200
March 2020–March 2021	0.3667	0.7999	0.1412
March 2021–March 2022	0.3282	0.8019	0.1358

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, Y.; Kong, X.; Bai, Z.; Li, Y.; Lu, Y.; Tang, W.; Lin, Y.; Shen, W.; Cai, G. CED-LSTM: A Coherence-Conditioned Encoder–Decoder Network for Robust InSAR Time-Series Deformation Extraction in Open-Pit Mines. Remote Sens. 2026, 18, 984. https://doi.org/10.3390/rs18070984

